Issue
I have a bunch of text files, all with the same structure, and I need to extract a specific part of a specific line.
I can easily extract the line with awk:
awk 'NR==23' blast_out.txt
CP046310.1 Lactobacillus jensenii strain FDAARGOS_749 chromosome,... 787 0.0
But I don't want the whole line, rather just the part between the first space on the left (after CP046310.1) and the double space on the right (before 787). The final output should be:
Lactobacillus jensenii strain FDAARGOS_749 chromosome,...
I tried several combinations of awk and grep but cannot find one that extracts this specific pattern.
Solution
1st solution: With your shown samples, please try the following awk code. In short, it empties the first field, the second-to-last field, and the last field, then strips the leading and trailing spaces left behind, and prints the line.
awk '{$1=$NF=$(NF-1)="";gsub(/^ +| +$/,"")} 1' Input_file
Or, to run it only on the 23rd line, change it to:
awk 'FNR==23{$1=$NF=$(NF-1)="";gsub(/^ +| +$/,"");print;exit}' Input_file
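To see what the first solution does, here is a minimal sketch that feeds it the sample line from the question through a pipe instead of a file. The variable names line and result are only for illustration, and the two spaces before 787 are an assumption based on the asker's description.

```shell
# Sample line from the question (double space before 787 is assumed)
line='CP046310.1 Lactobacillus jensenii strain FDAARGOS_749 chromosome,...  787 0.0'

# Empty the first field and the last two fields, then trim the
# spaces that remain at the start and end of the rebuilt line
result=$(printf '%s\n' "$line" | awk '{$1=$NF=$(NF-1)="";gsub(/^ +| +$/,"")} 1')

printf '%s\n' "$result"
```

Note that assigning to a field makes awk rebuild the line with a single space (the default OFS) between fields, so any run of several spaces inside the kept text comes out as one space.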
2nd solution: Loop over the fields and print only the ones needed, i.e. the second field through the third-to-last.
awk '{for(i=2;i<(NF-1);i++){printf("%s%s",$i,i==(NF-2)?ORS:OFS)}}' Input_file
Or, for the 23rd line only, try the following:
awk 'FNR==23{for(i=2;i<(NF-1);i++){printf("%s%s",$i,i==(NF-2)?ORS:OFS)};exit}' Input_file
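Since the question mentions a bunch of files with the same structure, the second solution could be wrapped in a small shell function that runs it once per file. This is only a sketch: extract_desc is a hypothetical helper name, the awk variable n and the file name prefix in the output are additions for illustration, and the NF>3 guard is there so that a short line does not leave a dangling file name.

```shell
# extract_desc: hypothetical helper, not part of the original answer.
# For each file given, print "<file><TAB><fields 2..NF-2 of line 23>".
extract_desc() {
    for f in "$@"; do
        awk -v n=23 'FNR==n && NF>3 {
            printf("%s\t", FILENAME)              # which file the text came from
            for(i=2;i<(NF-1);i++)                 # fields 2 through NF-2
                printf("%s%s",$i,i==(NF-2)?ORS:OFS)
            exit                                  # stop reading this file early
        }' "$f"
    done
}
```

It would be called as extract_desc *.txt, adjusting the glob to match the actual file names.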
Answered By - RavinderSingh13 Answer Checked By - Cary Denson (WPSolving Admin)