Issue
I am having difficulty using AWK to search a huge CSV file (call it file1). Fortunately, I have a list file (call it file2), so I only need to pull out the rows whose identifiers appear in that index list. However, file1 is not a typical CSV file. It looks like this:
ID1, AC000112;AC000634;B0087;P01116;,ID1_name
ID2, AC000801;,ID2_name
ID3, P01723;F08734;,ID3_name
ID4, AC0014;AC0114;P01112;,ID4_name
...
IDn, AC0006;,IDn_name
IDm, Ac8007; P01167;,IDm_name
The index file, file2, looks like this:
AC000112
AC000801
P01112
P01167
The desired output is:
ID1, AC000112;AC000634;B0087;P01116;,ID1_name
ID2, AC000801;,ID2_name
ID4, AC0014;AC0114;P01112;,ID4_name
IDm, Ac8007; P01167;,IDm_name
If I use
awk -F, 'NR==FNR{a[$1]; next} ($2 in a)' file2 file1
I get nothing. If I add ";" at the end of each line in file2, I only get
ID2, AC000801;,ID2_name
and changing the condition to $2 ~ a[$1] did not work either.
So, I was wondering how to change this command to get the desired result. Thanks!
Solution
If you aren't restricted to awk, I would use grep for this task:
grep -Fwf file2 file1
-f file2: use each line of file2 as a search string.
-w: match whole words only (so that the pattern P01167 won't match P011670). Any character other than a letter, digit, or underscore delimits a word (so P01167;, will match).
-F: fixed strings. Match each string exactly, so that regex characters have no special meaning.
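If you do need to stay with awk, here is one possible sketch. The original attempt returns nothing because ($2 in a) tests the whole second field (for example " AC000112;AC000634;B0087;P01116;") as a single array key, and that never equals a single ID from file2. The version below splits the second field on ";" and tests each piece instead. The scratch directory and the names wanted, acc, and result are only illustrative, and it assumes file2 holds one ID per line with no trailing blanks or carriage returns.

```shell
#!/bin/sh
# Recreate the sample data from the question in a scratch directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/file1" <<'EOF'
ID1, AC000112;AC000634;B0087;P01116;,ID1_name
ID2, AC000801;,ID2_name
ID3, P01723;F08734;,ID3_name
ID4, AC0014;AC0114;P01112;,ID4_name
IDn, AC0006;,IDn_name
IDm, Ac8007; P01167;,IDm_name
EOF
cat > "$tmpdir/file2" <<'EOF'
AC000112
AC000801
P01112
P01167
EOF

result=$(awk -F, '
    # First file (file2): remember every non-empty ID as an array key.
    NR == FNR { if ($1 != "") wanted[$1]; next }
    # Second file (file1): break field 2 on ";" and test each piece.
    {
        n = split($2, acc, ";")
        for (i = 1; i <= n; i++) {
            gsub(/^[ \t]+|[ \t]+$/, "", acc[i])    # trim blanks around the ID
            if (acc[i] != "" && (acc[i] in wanted)) { print; next }
        }
    }
' "$tmpdir/file2" "$tmpdir/file1")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

On the sample data this prints the ID1, ID2, ID4, and IDm rows. Unlike grep -w, which matches a whole word anywhere on the line, this sketch only inspects the second comma-separated field, so an ID that happens to appear in the name column is not a match. Both approaches are case-sensitive.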
Answered By - dan Answer Checked By - Timothy Miller (WPSolving Admin)