Issue
I want to search a second file for each entry in a first file, and print every matching line together with the line that follows it. The entries in the first file appear in the second file on the lines starting with >, in the GN field. For each match I want to print the header line (starting with >) followed by the next line, which holds the amino acid sequence (a string of capital letters starting with "M").
File 1:
thrB
yaaX
thrC
dnaK
dnaJ
File 2:
>sp|B1XBC8|KHSE_ECODH Homoserine kinase OS=Escherichia coli (strain K12 / DH10B) OX=316385 GN=thrB PE=3 SV=1
MVKVYAPASSANMSVGFDVLGAAVTPVDGALLGDVVTVEAAETFSLNNLGRFADKLPSEP
>sp|P0AD61|KPYK1_ECOLI Pyruvate kinase I OS=Escherichia coli (strain K12) OX=83333 GN=pykF PE=1 SV=1
MKKTKIVCTIGPKTESEEMLAKMLDAGMNVMRLNFSHGDYAEHGQRIQNLRNVMSKTGKT
>sp|P75616|YAAX_ECOLI Uncharacterized protein YaaX OS=Escherichia coli (strain K12) OX=83333 GN=yaaX PE=3 SV=1
MKKMQSIVLALSLVLVAPMAAQAAEITLVPSVKLQIGDRDNRGYYWDGGHWRDHGWWKQH
I am expecting this output:
>sp|B1XBC8|KHSE_ECODH Homoserine kinase OS=Escherichia coli (strain K12 / DH10B) OX=316385 GN=thrB PE=3 SV=1
MVKVYAPASSANMSVGFDVLGAAVTPVDGALLGDVVTVEAAETFSLNNLGRFADKLPSEP
>sp|P75616|YAAX_ECOLI Uncharacterized protein YaaX OS=Escherichia coli (strain K12) OX=83333 GN=yaaX PE=3 SV=1
MKKMQSIVLALSLVLVAPMAAQAAEITLVPSVKLQIGDRDNRGYYWDGGHWRDHGWWKQH
So far I have tried:
grep -F -f file1 file2
which only prints the lines where a match is found.
With awk I have only written:
awk 'NR==FNR{a[$1]++;next}{}' file1 file2
I can print the matching line but I don't know how to print the line after that (starting with "M").
Can anyone help me in getting through this?
I would be really grateful for your help.
Also, what if my second file has multiple matches of the string in file 1 and I want to print all such occurrences?
Thanks in Advance
Solution
If you have GNU grep:
grep --no-group-separator -A1 -Ff file1 file2
-A1 tells grep to print each matching line as well as the line after it. By default, the output groups are separated by a line containing --, so use --no-group-separator if you wish to avoid that line.
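Where GNU grep is not available, or where the gene name should be compared only against the GN= field and not matched anywhere in the line, awk is one possible alternative. The following is a minimal sketch, not a definitive implementation. It assumes each header has a whitespace-delimited GN=name token and that the sequence sits on the single line after the header, as in the sample above. The names workdir, genes, n, and result are illustrative only. Every header whose GN value is listed in file1 is printed, so repeated occurrences in file2 are all reported.

```shell
#!/bin/sh
# Work in a throwaway directory (hypothetical location) so no real files are touched.
workdir=$(mktemp -d)

# Sample inputs copied from the question.
cat > "$workdir/file1" <<'EOF'
thrB
yaaX
thrC
dnaK
dnaJ
EOF

cat > "$workdir/file2" <<'EOF'
>sp|B1XBC8|KHSE_ECODH Homoserine kinase OS=Escherichia coli (strain K12 / DH10B) OX=316385 GN=thrB PE=3 SV=1
MVKVYAPASSANMSVGFDVLGAAVTPVDGALLGDVVTVEAAETFSLNNLGRFADKLPSEP
>sp|P0AD61|KPYK1_ECOLI Pyruvate kinase I OS=Escherichia coli (strain K12) OX=83333 GN=pykF PE=1 SV=1
MKKTKIVCTIGPKTESEEMLAKMLDAGMNVMRLNFSHGDYAEHGQRIQNLRNVMSKTGKT
>sp|P75616|YAAX_ECOLI Uncharacterized protein YaaX OS=Escherichia coli (strain K12) OX=83333 GN=yaaX PE=3 SV=1
MKKMQSIVLALSLVLVAPMAAQAAEITLVPSVKLQIGDRDNRGYYWDGGHWRDHGWWKQH
EOF

result=$(awk '
# First file: remember every non-empty gene name.
NR == FNR { if (NF) genes[$1]; next }

# Second file, header line: look for a GN=... token whose value is in the list.
# n counts how many lines are still to be printed (header plus one line).
/^>/ {
    n = 0
    for (i = 1; i <= NF; i++)
        if (($i ~ /^GN=/) && (substr($i, 4) in genes))
            n = 2
}

# Print while lines remain to be printed for the current match.
n > 0 { print; n-- }
' "$workdir/file1" "$workdir/file2")

printf '%s\n' "$result"
rm -rf "$workdir"
```

With the sample data this prints the thrB and yaaX headers, each followed by its sequence line, and skips the pykF record. Unlike grep -F, the comparison is an exact match on the GN value, so a name such as thrB would not also select a header containing thrBC.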
Answered By - Sundeep Answer Checked By - Cary Denson (WPSolving Admin)