Issue
I'm trying to print out a substring of the name of each PDF file in a directory, but I can't get sed to work. The regex is correct, but sed gives an error when I use \1:
for old in ./*.pdf; do
new=$(echo $old | sed -e 's/(\.\/)?\d+_(\w\w\-\d+).+/\1/')
echo $new
done
I'm using sed (GNU sed) 4.4
The output is:
sed: -e expression #1, char 32: invalid reference \1 on `s' command's RHS
for each file in the directory...
Thanks!
Solution
You may use
sed -E 's/(\.\/)?[0-9]+_[A-Z][A-Z]-[0-9]+.+/\1/'
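Note that sed does not support PCRE regex syntax, so \d does not match a digit here (GNU sed happens to accept \w as an extension, but it is not portable). To match any letter, you may use the [:alpha:] POSIX character class, or, if you wish to match only uppercase letters, [:upper:].
Instead of \d, use [0-9] or [:digit:]. POSIX character classes must be written inside a bracket expression, for example [[:digit:]].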
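As a rough illustration of the point about character classes, the sketch below runs the same substitution twice, once with explicit ranges and once with POSIX classes. The sample file name is an assumption, since the question does not show the real names.

```shell
# Hypothetical sample name; the real file names are not shown in the question.
sample='123_AB-456_report.pdf'

# Explicit ranges instead of \d and \w.
with_ranges=$(printf '%s\n' "$sample" |
  sed -E 's/^[0-9]+_([A-Z][A-Z]-[0-9]+).*/\1/')

# The same pattern with POSIX classes; note the doubled brackets.
with_classes=$(printf '%s\n' "$sample" |
  sed -E 's/^[[:digit:]]+_([[:upper:]]{2}-[[:digit:]]+).*/\1/')

printf '%s\n%s\n' "$with_ranges" "$with_classes"
```

Both commands should print AB-456 for this sample name.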
In a POSIX BRE pattern, ( and ) denote literal parentheses. That is why you got an error saying you cannot refer to a capturing group: none was defined in the pattern. To make parentheses create a group in a POSIX BRE pattern, you need to escape them as \( and \). If you use a POSIX ERE pattern instead (sed with the -r or -E option), you may use them unescaped.
The same goes for the + quantifier: in a BRE pattern it must be escaped (\+, which is a GNU extension), while in an ERE pattern it is fine to use it unescaped.
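To make the BRE/ERE difference concrete, here is a small sketch that writes one pattern three ways. The sample file name is again an assumption. The last form uses the \{1,\} interval, which is one way to avoid relying on the GNU-only \+ in a BRE.

```shell
sample='123_AB-456_report.pdf'   # hypothetical file name

# BRE (sed default): groups are \( \), and \+ is a GNU extension.
bre=$(printf '%s\n' "$sample" |
  sed 's/^[0-9]\+_\([A-Z][A-Z]-[0-9]\+\).*/\1/')

# ERE (sed -E): groups and + are written unescaped.
ere=$(printf '%s\n' "$sample" |
  sed -E 's/^[0-9]+_([A-Z][A-Z]-[0-9]+).*/\1/')

# BRE without \+: the interval \{1,\} means "one or more".
bre_interval=$(printf '%s\n' "$sample" |
  sed 's/^[0-9]\{1,\}_\([A-Z][A-Z]-[0-9]\{1,\}\).*/\1/')

printf '%s\n' "$bre" "$ere" "$bre_interval"
```

All three variants should print the same captured text.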
Besides, you do not need a second capturing group, since you are not using \2 in the replacement. Keep in mind that \1 always refers to the first group, which in this pattern is the optional ./ prefix.
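Putting it together, a minimal sketch of the whole loop might look like the following, assuming the goal is to print the XX-123 part of each name. This is a variation on the command above, not the answer's exact command: it strips the leading ./ with shell parameter expansion so that a single group around the wanted part is enough. The helper name extract_code is hypothetical.

```shell
# extract_code is a hypothetical helper: it prints the "XX-123" part of a
# name shaped like ./123_AB-456_report.pdf (an assumed naming scheme).
extract_code() {
  # ${1#./} drops a leading "./" if present, so the pattern needs one group.
  printf '%s\n' "${1#./}" |
    sed -E 's/^[0-9]+_([A-Z][A-Z]-[0-9]+).+/\1/'
}

for old in ./*.pdf; do
  [ -e "$old" ] || continue   # skip the literal pattern when no PDF exists
  new=$(extract_code "$old")
  echo "$new"
done
```

Quoting "$old" and "$new" keeps names containing spaces intact. Names that do not match the pattern are printed unchanged, because sed leaves non-matching lines as they are.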
Answered By - Wiktor Stribiżew