Friday, April 15, 2022

[SOLVED] sed output first match only between brackets

Issue

Using sed, I would like to extract the first match between square brackets. I couldn't come up with a matching regex, because sed's quantifiers are greedy. For instance, given the regex \[.*\], sed matches everything between the first opening bracket and the last closing bracket, which is not what I am after (I would appreciate your help on this).

Until I can come up with a proper regex, I am assuming that a space always follows the closing bracket, which gives me a regex that lets me continue my work: \[[^ ]*\].

I have tried it with grep, e.g.

$ echo '++  *+   ++ + [SPAM] foo(): z.y.o ## [x.y.z]----- ' | grep -oE '\[[^ ]*\]'
[SPAM]
[x.y.z]

I would like to use the regex in sed (not in grep) and output only the first match (i.e. [SPAM]). I have tried the following, but neither attempt worked:

$ echo '++  *+   ++ + [SPAM] foo(): z.y.o ## [x.y.z]----- ' | sed 's/\[[^ ]*\]/\1/'
sed: 1: "s/\[[^ ]*\]/\1/": \1 not defined in the RE

$ echo '++  *+   ++ + [SPAM] foo(): z.y.o ## [x.y.z]----- ' | sed 's/\(\[[^ ]*\]\)/\1/'
++  *+   ++ + [SPAM] foo(): z.y.o ## [x.y.z]-----

I would appreciate your help with:

  1. constructing a regex that matches the text between each pair of opening and closing square brackets (see the grep example above)
  2. using that regex in sed and outputting only the first occurrence of the match

Solution

You can use this sed:

s='++  *+   ++ + [SPAM] foo(): z.y.o ## [x.y.z]----- '
sed -E 's/[^[]*(\[[^]]*\]).*/\1/' <<< "$s"

[SPAM]

Here:

  • [^[]* matches 0 or more of any non-[ character
  • (\[[^]]*\]) matches a [...] substring and captures it in group #1
  • .* matches the rest of the string up to the end
  • \1 in the substitution puts the value captured in group #1 back in the output
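
One thing to keep in mind: when a line contains no brackets at all, the substitution does not apply and sed prints that line unchanged. If such lines should be dropped, a possible variation (a sketch, not part of the original answer) is to combine -n with the p flag. The second input line and the variable name first are made up for illustration:

```shell
s='++  *+   ++ + [SPAM] foo(): z.y.o ## [x.y.z]----- '

# -n turns off automatic printing; the trailing p prints a line only
# when the substitution succeeded, so the bracket-free line is dropped.
# 'no brackets here' is a hypothetical extra input line.
first=$(printf '%s\n' "$s" 'no brackets here' |
    sed -nE 's/[^[]*(\[[^]]*\]).*/\1/p')

printf '%s\n' "$first"
```

With the sample string this should print only [SPAM]. Using printf with a pipe instead of <<< is just a choice to keep the sketch independent of bash.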

An awk solution works as well:

awk 'match($0, /\[[^]]*\]/){print substr($0, RSTART, RLENGTH)}' <<< "$s"

[SPAM]
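
The awk command relies on match(), which returns the position of the leftmost match (0 if there is none) and sets RSTART and RLENGTH to that match's start and length, so substr() can cut it out. A slightly expanded sketch of the same idea, with a made-up bracket-free line and a "(no match)" marker added purely for illustration:

```shell
s='++  *+   ++ + [SPAM] foo(): z.y.o ## [x.y.z]----- '

# match() returns 0 when nothing matches, so the else branch handles
# lines without brackets. 'no brackets here' and the "(no match)"
# marker are hypothetical additions, not part of the original answer.
out=$(printf '%s\n' "$s" 'no brackets here' |
    awk '{
        if (match($0, /\[[^]]*\]/))
            print substr($0, RSTART, RLENGTH)
        else
            print "(no match)"
    }')

printf '%s\n' "$out"
```

Because match() always reports the leftmost match, only the first bracketed substring of each line is printed.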


Answered By - anubhava
Answer Checked By - Pedro (WPSolving Volunteer)