Wednesday, December 29, 2021

[SOLVED] Filter grep results down to paths to existing files only

Issue

I'm using git diff and grep to give me a list of all files in a specific folder that were changed in the last commit:

paths=$(git diff --name-only HEAD HEAD~1 | grep "desired_folder/")

Now I need to filter the list down to strings that actually point to an existing file and drop the ones that don't. How can I do this?

The --diff-filter flag is not useful for me, because I need to get the change list on one branch and then filter only to files that exist on a different branch.

Edit - example behaviour:
Input string:

"file1.txt file2.txt folder/file3.txt"

Current folder tree:

|-- file1.txt
|-- folder/
    |-- file3.txt

Expected output:

# file2.txt was removed because it is not a
# path to an existing file
"file1.txt folder/file3.txt"

Solution

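You can loop through the results of git diff with a while loop. The syntax is a bit unusual: feeding the loop from a process substitution instead of a pipe keeps the loop body out of a subshell, so variables set inside it are still available afterwards: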

while read; do
  :
done < <(cmd)
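
To illustrate the subshell point, here is a minimal sketch that counts lines two ways. It assumes bash with default options (lastpipe off); printf stands in for any command that prints lines, and the variable names are made up for the demonstration.

```shell
# Pipe version: bash runs the loop in a subshell, so the counter
# incremented inside it is gone once the pipeline finishes.
piped_count=0
printf '%s\n' a b c | while IFS= read -r line; do
  piped_count=$((piped_count + 1))
done

# Process substitution version: the loop runs in the current shell,
# so the counter keeps its value after the loop.
procsub_count=0
while IFS= read -r line; do
  procsub_count=$((procsub_count + 1))
done < <(printf '%s\n' a b c)

# With default bash options this prints: pipe: 0, process substitution: 3
echo "pipe: $piped_count, process substitution: $procsub_count"
```

If the loop only prints its results and never sets variables you need later, a plain pipe works just as well.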

In your case:

while IFS= read -r file # -r keeps backslashes in paths intact
do
  [ -f "$file" ] || continue # skip paths that are not existing regular files
  echo "$file" # file exists and is a regular file
done < <(git diff --name-only HEAD HEAD~1)

You can add your grep after the git diff; just note that you'll need to keep it inside the process substitution:

< <(git diff ... | grep ...)

The while loop will read each line from the output of git/grep and store the line in the variable file.
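
As a rough sketch of how the loop could produce the space-separated string from the question, the same test can be wrapped in a function that reads paths on stdin. The name filter_existing and the throwaway directory are assumptions made for this example, not part of the original answer.

```shell
# filter_existing is a hypothetical helper name.
# It reads one path per line on stdin and prints only existing regular files.
filter_existing() {
  local file
  while IFS= read -r file; do
    [ -f "$file" ] || continue   # drop anything that is not an existing regular file
    printf '%s\n' "$file"
  done
}

# Throwaway directory mirroring the folder tree from the question.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/folder"
touch "$demo_dir/file1.txt" "$demo_dir/folder/file3.txt"

# Run the filter from inside demo_dir; paste joins the surviving lines with spaces.
result=$(cd "$demo_dir" && printf '%s\n' file1.txt file2.txt folder/file3.txt |
  filter_existing | paste -sd ' ' -)
echo "$result"   # prints: file1.txt folder/file3.txt

rm -rf "$demo_dir"
```

With real data the input would come from git instead of printf, for example: git diff --name-only HEAD HEAD~1 | grep "desired_folder/" | filter_existing. Joining with spaces only round-trips safely when no path contains a space.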

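Note that git diff --name-only shows paths relative to the root of the git repo, while [ -f ] resolves them against the current working directory. If you run this from a subdirectory, you might want to add --relative to make the paths relative to the current working directory; be aware that it also limits the output to changes under that directory.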



Answered By - Andreas Louv