Issue
Extending this question and answer, I'd like some help exploring ways to make using source to bring a config file into a Bash script "safer." I say "safer" because I recognize it may be impossible to do this with 100% safety.
I want to use a config file to set variables and arrays, with some comments throughout. Everything else should be disallowed.
The above Q&A suggested starting with a regex that checks each line for the things we want, rather than the things we don't want, before passing the file to source.
For example, the regex could be:
(^\s*#|^\s*$|^\s*[a-z_][^[:space:]]*=[^;&\(\`]*$|[a-z_][^[:space:]]*\+?=\([^;&\(\`]*\)$)
But I'm looking for help with either refactoring that regex or considering other ways to get what we're after in the Bash script below, especially since I wonder whether this approach is futile in the first place.
Example
This is what the desired config file would look like:
#!/bin/bash
disks=([0-UUID]=1234567890123 [0-MountPoint]='/some/path/')
disks+=([1-UUID]=4567890123456 [1-MountPoint]='/some/other/path')
# ...
someNumber=1
rsyncExclude=('.*/' '/dev/' '/proc/' '/sys/' '/tmp/' '/mnt/' '/media/' '/lost+found' '.Trash-*/' '[$]RECYCLE.BIN/' '/System Volume Information/' 'pagefile.sys' '/temp/' '/Temp/' '/Adobe/')
remote='[email protected]'
# there should be nothing in the config more complicated than above
And this is a simplified version of the bash script it will go into, using the example from @Erman in the Q/A linked to above, to do the checking:
#!/bin/bash
configFile='/blah/blah/config.file'
if [[ -f "${configFile}" ]]; then
    # check if the config file contains any commands because that is unexpected and unsafe
    # (despite its name, this pattern lists the line formats we accept;
    # `grep -v` below flags any line that matches none of them)
    disallowedSyntax="(^\s*#|^\s*$|^\s*[a-z_][^[:space:]]*=[^;&\(\`]*$|[a-z_][^[:space:]]*\+?=\([^;&\(\`]*\)$)"
    if grep -E -q -iv "${disallowedSyntax}" "${configFile}"; then
        printf "%s\n" 'The configuration file is not safe!' >&2 # print to STDERR
        exit 1
    else
        # config file might be okay
        if result=$( bash -n "${configFile}" 2>&1 ); then
            # set up the 'disks' associative array first and then import
            declare -A disks
            # only pass assignment lines (name= or name+=) to source
            source <(awk '/^\s*\w+\+?=/' "${configFile}")
            # ...
        else
            # config has syntax error
            printf '%s\n' 'The configuration file has a syntax error.' >&2
            exit 1
        fi
    fi
else
    # config file doesn't exist?
    printf '%s\n' "The configuration file doesn't exist." >&2
    exit 1
fi
I imagine below is ideally what we want to be allowed and disallowed as a starting point?
Allowed
# whole numbers only
var=1
var=123
# quoted stuff
var='foo bar'
var="foo bar"
# arrays
var=('foo' 'bar')
var=("foo" "bar")
var=([0-foo]=1 [0-bar]='blah' ...
var+=(...
# vars with underscores, same format as above
foo_bar=1
...
foo_bar+=(...
# and that's it?
Not allowed*
* Not an exhaustive list (and I'm certain I'm missing things), but the idea is to at least disallow anything not quoted (unless it's a number), and also anything else that would allow unleash_virus to be run:
var=notquoted
...
var=notquoted unleash_virus
var=`unleash_virus`
...
var='foo bar' | unleash_virus
...
var="foo bar"; unleash_virus
var="foo bar" && unleash_virus
var="foo bar $(unleash_virus)"
...
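The allowed and not-allowed lists above could be turned into a stricter per-line check. The following is only a sketch under assumptions: the helper name is_safe_line and the pattern variables are hypothetical, double-quoted values are assumed to contain no $, backtick, or backslash, and array keys are assumed to use only letters, digits, underscores, and hyphens.

```bash
#!/bin/bash
# Building blocks for the accepted value formats (POSIX ERE, used with grep -E).
name='[A-Za-z_][A-Za-z0-9_]*'
num='[0-9]+'               # whole numbers only
sq="'[^']*'"               # single-quoted string
dq='"[^"$`\\]*"'           # double-quoted string without $, backtick or backslash
value="(${num}|${sq}|${dq})"
key='\[[A-Za-z0-9_-]+]='   # optional [key]= prefix inside an array

skip_re='^[[:space:]]*(#.*)?$'   # blank line or comment
scalar_re="^[[:space:]]*${name}=${value}[[:space:]]*\$"
array_re="^[[:space:]]*${name}\\+?=\\(([[:space:]]*(${key})?${value})*[[:space:]]*\\)[[:space:]]*\$"

# Hypothetical helper: succeeds only if the WHOLE line matches an accepted format.
is_safe_line() {
    printf '%s\n' "$1" | grep -Eq -e "$skip_re" -e "$scalar_re" -e "$array_re"
}

# Try a few of the lines from the lists above.
for line in \
    "var=123" \
    "var='foo bar'" \
    "var=(\"foo\" \"bar\")" \
    "var=notquoted" \
    'var="foo bar"; unleash_virus' \
    'var="foo bar $(unleash_virus)"'
do
    if is_safe_line "$line"; then
        printf 'allowed:  %s\n' "$line"
    else
        printf 'rejected: %s\n' "$line"
    fi
done
```

Because every pattern is anchored at both ends, trailing text after a valid assignment makes the whole line fail instead of being ignored. The check does not restrict which variable names a config may set.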
Solution
Here's a start, thanks to @SasaKanjuh.
Instead of checking for disallowed syntax, we could use awk to pass only the parts of the config file that match the formatting we expect to eval, and nothing else.
For example, we expect that variables must have some kind of quoting (unless they solely contain a number); arrays start and end with () as usual; and everything else should be ignored...
Here's the awk line that does this (it needs GNU awk, because gensub is a gawk extension):
awk '/^\s*\w+\+?=(\(|[0-9]+$|["'\''][^0-9]+)/ && !/(\$\(|&&|;|\||`)/ { print gensub("(.*[\"'\''\\)]).*", "\\1", 1) }' ./example.conf
- the first part captures a line starting with a variable name, up to =
- then, after the = sign, it looks for (, a numerical value, or ' or " followed by a string
- the second part excludes lines containing $(), &&, ;, | or a backtick
- and gensub captures everything up to and including the last occurrence of ' or " or ), ignoring everything after.
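To make the three steps easier to experiment with one line at a time, here is a rough port of the same logic to grep and sed. It is a sketch, not the awk line itself: the function name filter_line and the variable names are hypothetical, and \s and \w are replaced by their POSIX class equivalents.

```bash
#!/bin/bash
# Step 1 pattern: name= or name+= followed by "(", a bare number, or a quoted string.
start_re='^[[:space:]]*[[:alnum:]_]+\+?=(\(|[0-9]+$|["'\''][^0-9]+)'
# Step 2 pattern: tokens that disqualify the whole line.
bad_re='(\$\(|&&|;|\||`)'

# Hypothetical helper: prints the cleaned-up line, or fails if the line is dropped.
filter_line() {
    # Step 1: the line must start like an assignment we expect.
    printf '%s\n' "$1" | grep -Eq "$start_re" || return 1
    # Step 2: reject lines containing $(, &&, ;, | or a backtick.
    if printf '%s\n' "$1" | grep -Eq "$bad_re"; then
        return 1
    fi
    # Step 3: keep everything up to the last quote or closing parenthesis;
    # lines without either (plain numbers) pass through unchanged.
    printf '%s\n' "$1" | sed -E 's/^(.*["'\'')]).*$/\1/'
}

for line in \
    "var=1" \
    "var='foo bar' trailing junk" \
    'var="foo bar"; unleash_virus' \
    "var=notquoted"
do
    if cleaned=$(filter_line "$line"); then
        printf 'kept:    %s\n' "$cleaned"
    else
        printf 'dropped: %s\n' "$line"
    fi
done
```

Note the design choice this makes visible: text after the last quote or parenthesis is silently trimmed rather than causing the line to be rejected.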
#!/bin/bash
configFile='./example.conf'
if [[ -f "${configFile}" ]]; then
    # config file exists, check if it has OK bash syntax
    if result=$( bash -n "${configFile}" 2>&1 ); then
        # seems parsable, import the config file
        # filter the contents using `awk` first so we're only accepting vars formatted like this:
        #   var=1
        #   var='foo bar'
        #   var="foo bar"
        #   var=('array' 'etc')
        #   var+=('and' "so on")
        # and everything else should be ignored:
        #   var=unquoted
        #   var='foo bar' | unleash_virus
        #   var='foo bar'; unleash_virus
        #   var='foo' && unleash_virus
        #   var=$(unleash_virus)
        #   var="$(unleash_virus)"
        #   ...etc
        # awk exits 0 even when nothing matches, so also check that the output is non-empty
        if config=$(awk '/^\s*\w+\+?=(\(|[0-9]+$|["'\''][^0-9]+)/ && !/(\$\(|&&|;|\||`)/ { print gensub("(.*[\"'\''\\)]).*", "\\1", 1) }' "${configFile}") && [[ -n "${config}" ]]; then
            # something matched
            # now actually insert the config data into this session by passing it to `eval`
            eval "${config}"
        else
            # no matches from awk
            echo "No config content to work with." >&2
            exit 1
        fi
    else
        # config file didn't pass the `bash -n` test
        echo "Config contains invalid syntax." >&2
        exit 1
    fi
else
    # config file doesn't exist or isn't a file
    echo "There is no config file." >&2
    exit 1
fi
Answered By - nooblag