Issue
What is the best way to extract lines from a very large gz file that match multiple strings in a second file?
I've tried the following, which works for a single string and its surrounding lines:
gunzip -c /myfolder/large_file.gz | grep -B 50 "33754548" > /myfolder/specific_linesfrom_large_files.txt
However, the other strings I need are not always within those 50 lines, so I tried:
gunzip -c /myfolder/large_file.gz | grep -F /myfolder/multiple_strings.txt > /myfolder/specific_linesfrom_large_files.txt
That didn't work. Any suggestions?
For example, the multiple_strings.txt file might contain:
16804029
42061608
42069963
42072123
177479064
177420374
Solution
Use zgrep to search inside compressed files. There are similar commands for other compression formats, such as bzgrep (for bzip2 files) and xzgrep (for xz files).
zgrep -f match_strings.txt file.gz
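-f (lowercase) is the flag for reading the patterns from a specified file, one pattern per line. The attempt in the question used -F (uppercase), which only tells grep to treat the pattern as a fixed string, so the path /myfolder/multiple_strings.txt was itself taken as the text to search for.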
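Since the patterns here are plain numbers, it may also help to combine -f with -F (fixed strings, no regex interpretation) and -w (whole-word matches only), so that 42061608 does not also match inside a longer number such as 142061608. Below is a minimal sketch, assuming a grep and zgrep that support these flags (GNU and BSD versions do). The sample data, the file names sample.gz and patterns.txt, and the variable names are made up for illustration.

```shell
# Build a tiny gzip file and a pattern file in a temporary directory
# (hypothetical data, only for demonstration).
workdir=$(mktemp -d)
printf '%s\n' 'id 16804029 alpha' 'id 99999999 beta' \
              'id 42061608 gamma' 'id 142061608 delta' | gzip > "$workdir/sample.gz"
printf '%s\n' 16804029 42061608 > "$workdir/patterns.txt"

# -F: treat patterns as fixed strings
# -w: match whole words only
# -f: read the patterns from the given file
matches=$(zgrep -F -w -f "$workdir/patterns.txt" "$workdir/sample.gz")
printf '%s\n' "$matches"

# The same search written as the pipeline from the question.
piped=$(gunzip -c "$workdir/sample.gz" | grep -F -w -f "$workdir/patterns.txt")

# Clean up the temporary files.
rm -r "$workdir"
```

With the real files, the equivalent would be zgrep -F -w -f /myfolder/multiple_strings.txt /myfolder/large_file.gz > /myfolder/specific_linesfrom_large_files.txt. Drop -w if substring matches are wanted.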
Answered By - thanasisp Answer Checked By - Cary Denson (WPSolving Admin)