Tuesday, July 26, 2022

[SOLVED] Need to exclude and match regex in bash

Issue

I am new to bash and am trying to write a script that searches the codebase for some specific words. Because there are a lot of false positives, I also need to maintain an exclude_pattern list so that anything matching it is ignored. Currently my script returns the correct matches, and the relevant line looks like this:

output=$(find $sourceDir -path "*/.git" -prune -o -type f \( -name "*.cpp" -o -name "*.h" \) -exec grep -E -H -i -R --color=always "$matching_regex" {} \; )

Now I am unable to take this output and run an exclude pattern on it. I tried something like this, but it did not work:

while IFS= read -r line
do
  foundFinal=$(grep -v "$exclude_matches" "$line")
done <<< "$output"

Maybe I do not need to do the exclude step separately and could do both the matching and the excluding in the first command, but so far I have been unsuccessful. It would be great to get any feedback or examples showing what I might be missing or doing incorrectly. As already stated, I am a newbie with bash, so if grep is not the right command for my use case, please do not hesitate to comment.


Solution

output=$(
    find "$sourceDir" \
        -name .git -prune \
      -o \
        -type f \( -name '*.cpp' -o -name '*.h' \) \
        -exec grep -E -H -i -- "$matching_regex" {} +
)
foundFinal=$(
    grep -E -v -- "$exclude_matches" <<<"$output"
)
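
The loop in the question fails because grep treats its last argument as a file name, so each line of $output was being opened as a file rather than searched as text. A minimal sketch of the difference, using invented sample data (the names sample, wrong and kept are illustrative only, and a pipe stands in for the bash here-string):

```shell
# Hypothetical "file:line" results, shaped like the output of grep -H.
sample='src/a.cpp:int password = 0;
src/b.h:// password_hint is fine'
exclude_matches='password_hint'

# Wrong: the text is passed as a FILE NAME, so grep cannot open it.
wrong=$(grep -E -v -- "$exclude_matches" "$sample" 2>/dev/null)
wrong_status=$?

# Right: the text arrives on standard input.
# In bash, <<<"$sample" is shorthand for this printf pipe.
kept=$(printf '%s\n' "$sample" | grep -E -v -- "$exclude_matches")

printf 'status of file-name form: %s\n' "$wrong_status"
printf 'kept: %s\n' "$kept"
```

The file-name form exits with grep's error status, while the stdin form keeps only the line that does not match the exclude pattern.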

Or more efficiently, if you don't need $output, just pipe the two together:

foundFinal=$(
    find "$sourceDir" \
        -name .git -prune \
      -o \
        -type f \( -name '*.cpp' -o -name '*.h' \) \
        -exec grep -E -H -i -- "$matching_regex" {} + \
    | grep -E -v -- "$exclude_matches"
)

  • I simplified the .git check (-name .git -prune instead of -path "*/.git" -prune)
  • I replaced \; with + to reduce the number of invocations of grep
  • I removed -R (which has no effect anyway, since find only passes regular files to grep)
  • I removed --color=always, which could interfere with the second grep (the escape codes become part of the text being matched)
  • I added -E to the second grep to match the first one
  • I added -- to protect against regexes that start with a hyphen
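
To try the points in the list above on a small scale, here is a self-contained sketch. The directory layout, file contents and both patterns are invented for the demonstration; only the find/grep pipeline itself comes from the answer.

```shell
# Build a throwaway source tree (all names and contents are made up).
sourceDir=$(mktemp -d)
mkdir -p "$sourceDir/src" "$sourceDir/.git"
printf 'int secret_key = 1;\nint x = --secret;\n' > "$sourceDir/src/a.cpp"
printf '// secret_test only\n' > "$sourceDir/src/b.h"
printf 'secret in git metadata\n' > "$sourceDir/.git/c.h"

matching_regex='secret'
# Starts with a hyphen on purpose: without "--" grep would parse it as an option.
exclude_matches='--secret|secret_test'

foundFinal=$(
    find "$sourceDir" \
        -name .git -prune \
      -o \
        -type f \( -name '*.cpp' -o -name '*.h' \) \
        -exec grep -E -H -i -- "$matching_regex" {} + \
    | grep -E -v -- "$exclude_matches"
)

printf '%s\n' "$foundFinal"
rm -rf "$sourceDir"
```

Only the secret_key line from src/a.cpp should survive: the file under .git is pruned, and the other two matches are removed by the exclude pattern.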

If you want to colourize for display, you can re-run the grep on the (presumably not too long) result:

grep --colour=auto -E -i -- "$matching_regex" <<<"$foundFinal"
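
Since the question mentions maintaining a list of exclude patterns, one option not covered above is to keep them in a file, one pattern per line, and hand that file to the second grep with -f. A rough sketch, in which the pattern file, its contents and the sample matches are all hypothetical:

```shell
# Hypothetical exclude list: one extended regex per line.
excludeFile=$(mktemp)
printf '%s\n' 'secret_test' 'password_hint' > "$excludeFile"

# Stand-in for the output of the find/grep stage.
matches='src/a.cpp:int secret_key = 1;
src/b.h:// secret_test only
src/c.cpp:// password_hint is fine'

# -f reads the patterns from the file; -v drops lines matching any of them.
foundFinal=$(printf '%s\n' "$matches" | grep -E -v -f "$excludeFile")

printf '%s\n' "$foundFinal"
rm -f "$excludeFile"
```

Take care not to leave an empty line in the pattern file: an empty pattern matches every line, so -v would then discard everything.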


Answered By - jhnc
Answer Checked By - Dawn Plyler (WPSolving Volunteer)