Sunday, June 5, 2022

[SOLVED] Pipe awk and grep to save a particular field of a file

Issue

What I want to achieve:

  • grep: extract lines with the contig number and length
  • awk: remove "length:" from column 2
  • sort: sort by length (in descending order)

Current code

grep "length:" test_reads.fa.contigs.vcake_output | awk -F:'{print $2}' |sort -g -r > contig.txt

Example content of test_reads.fa.contigs.vcake_output:

>Contig_11 length:42
ACTCTGAGTGATCTTGGCGTAATAGGCCTGCTTAATGATCGT
>Contig_0 length:99995
ATTTATGCCGTTGGCCACGAATTCAGAATCATATTA

Expected output

>Contig_0 99995
>Contig_11 42

Solution

With your shown samples, please try the following awk + sort solution.

awk -F'[: ]' '/^>/{print $1,$3}' Input_file | sort -nrk2

Explanation: awk reads Input_file with the field separator set to either a colon or a space, so a header line such as ">Contig_11 length:42" splits into three fields: ">Contig_11", "length", and "42". For every line that starts with ">", awk prints the 1st and 3rd fields, which drops the "length:" label. That output is piped (as standard input) to sort, which orders the lines numerically (-n) in descending order (-r) on the 2nd field (-k2) to give the required output.
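As a quick check, here is a minimal sketch that runs the same command end to end on the sample from the question. It assumes the file names used in the question (test_reads.fa.contigs.vcake_output for input, contig.txt for output) and recreates the input with a here-document purely for illustration; with real data you would skip that step.

```shell
# Recreate the sample input shown in the question (illustration only).
cat > test_reads.fa.contigs.vcake_output <<'EOF'
>Contig_11 length:42
ACTCTGAGTGATCTTGGCGTAATAGGCCTGCTTAATGATCGT
>Contig_0 length:99995
ATTTATGCCGTTGGCCACGAATTCAGAATCATATTA
EOF

# Split each line on ':' or space, keep only header lines (those starting
# with '>'), and print the contig name ($1) and the length ($3).
# Then sort numerically (-n), in reverse (-r), on the 2nd field (-k2).
awk -F'[: ]' '/^>/{print $1,$3}' test_reads.fa.contigs.vcake_output \
  | sort -nrk2 > contig.txt

# Show the result.
cat contig.txt
```

Note that the command in the question has no space between -F: and the quoted program, so the shell joins them into a single argument and awk receives no program at all. Even with the space added, printing only $2 would output the length without the contig name, which is why the solution prints two fields instead.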



Answered By - RavinderSingh13
Answer Checked By - Gilberto Lyons (WPSolving Admin)