Issue
I was using the less command to browse a very large text log file (15 GB) and was trying to search for a multiline pattern, but after some investigation I found that less can only search for single-line patterns.
Is there a way to use grep or other commands to return the line number of a multiline pattern?
The format of the log is something like this, repeated hundreds of thousands of times:
Packet A
op_3b : 001
ctrl_2b : 01
ini_count : 5

Packet F
op_3b : 101
ctrl_2b : 00
ini_count : 4

Packet X
op_3b : 010
ctrl_2b : 11
ini_count : 98

Packet CA
op_3b : 100
ctrl_2b : 01
ini_count : 5

Packet LP
op_3b : 001
ctrl_2b : 00
ini_count : 0

Packet ZZ
op_3b : 111
ctrl_2b : 01
ini_count : 545

Packet QEA
op_3b : 111
ctrl_2b : 11
ini_count : 0
What I am trying to do is have grep or some other command return the line number at which this three-line pattern starts:
op_3b : 001
ctrl_2b : 00
ini_count : 0
Solution
Suppose that the pattern is in a file named pattern, like this:
$ cat pattern
op_3b : 001
ctrl_2b : 00
ini_count : 0
Then, try:
$ awk '$0 ~ pat' RS= pat="$(cat pattern)" logfile
Packet LP
op_3b : 001
ctrl_2b : 00
ini_count : 0
How it works
RS=
This sets the record separator RS to an empty string. This tells awk to use an empty line as the record separator.
pat="$(cat pattern)"
This tells awk to create an awk variable pat which contains the contents of the file pattern. If your shell is bash, then a slightly more efficient form of this command would be pat="$(<pattern)". (Don't use this unless you are sure that your shell is bash.)
$0 ~ pat
This tells awk to print any record that matches the pattern. $0 is the contents of the current record. ~ tells awk to do a match between the text in $0 and the regular expression in pat.
(If the contents of pattern had any regex-active characters, we would need to escape them. Your current example does not have any, so this is not a problem.)
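If the pattern might contain regex-active characters, one option is to avoid regular expressions altogether. The following is a minimal sketch, not part of the original answer: it uses awk's index() function for a literal substring test and passes the pattern through the environment (ENVIRON), so awk does not process backslash escapes in it. The scratch directory, the sample records, and the tag field are made up for illustration.

```shell
# Scratch directory so no real files are touched.
tmpdir=$(mktemp -d)

# Hypothetical sample: two blank-line-separated records and a pattern
# that contains regex-active characters ("[", ".", "]").
printf '%s\n' 'Packet A' 'tag : [a.b]' 'ini_count : 0' '' \
              'Packet B' 'tag : a'     'ini_count : 0' > "$tmpdir/logfile"
printf '%s\n' 'tag : [a.b]' 'ini_count : 0' > "$tmpdir/pattern"

# Regex match: "[a.b]" is read as a bracket expression, so the wrong
# record (Packet B) is selected.
regex_hits=$(awk '$0 ~ pat' RS= pat="$(cat "$tmpdir/pattern")" "$tmpdir/logfile")

# Literal match: index() returns a nonzero position only when the exact
# text of the pattern occurs inside the record.
literal_hits=$(pat="$(cat "$tmpdir/pattern")" \
    awk 'index($0, ENVIRON["pat"])' RS= "$tmpdir/logfile")

printf 'regex match:\n%s\n\nliteral match:\n%s\n' "$regex_hits" "$literal_hits"
```

With this sample, the regex form selects the Packet B record and the literal form selects the Packet A record, which is the one that actually contains the pattern text.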
Alternative style
Some people prefer a different style for defining awk variables:
$ awk -v RS= -v pat="$(cat pattern)" '$0 ~ pat' logfile
Packet LP
op_3b : 001
ctrl_2b : 00
ini_count : 0
This works the same.
Displaying line numbers
$ awk -F'\n' '$0 ~ pat{print "Line Number="n+1; print "Packet" $0} {n=n+NF-1}' RS='Packet' pat="$(cat pattern)" logfile
Line Number=20
Packet LP
op_3b : 001
ctrl_2b : 00
ini_count : 0
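If what you need is the number of the line on which the pattern's first line occurs, as the question asks, a different approach is to read the log one line at a time and compare a sliding window of lines against the pattern file. The following is only a sketch under stated assumptions, not the original answer's method: it assumes the pattern lines must match whole log lines exactly (so ini_count : 0 would not match ini_count : 05), and the miniature log is made up for illustration. It does not depend on blank lines between packets and keeps only a few lines in memory, which may matter for a 15 GB file.

```shell
# Scratch directory so no real files are touched.
tmpdir=$(mktemp -d)

# Hypothetical miniature log in the same format as the question.
printf '%s\n' \
    'Packet A'  'op_3b : 001' 'ctrl_2b : 01' 'ini_count : 5'   '' \
    'Packet LP' 'op_3b : 001' 'ctrl_2b : 00' 'ini_count : 0'   '' \
    'Packet ZZ' 'op_3b : 111' 'ctrl_2b : 01' 'ini_count : 545' > "$tmpdir/logfile"
printf '%s\n' 'op_3b : 001' 'ctrl_2b : 00' 'ini_count : 0' > "$tmpdir/pattern"

line_numbers=$(awk '
    # First file (the pattern): remember each line and count them in n.
    NR == FNR { want[++n] = $0; next }
    {
        # Slide a window that holds the last n lines of the log.
        for (i = 1; i < n; i++) win[i] = win[i + 1]
        win[n] = $0
        if (FNR < n) next
        # Compare as strings; appending "" prevents numeric comparison.
        for (i = 1; i <= n; i++)
            if ((win[i] "") != (want[i] "")) next
        # All n lines matched: report the line where the window starts.
        print FNR - n + 1
    }
' "$tmpdir/pattern" "$tmpdir/logfile")

echo "$line_numbers"
```

For this sample the script prints 7, the line holding op_3b : 001 in the Packet LP block. Subtract 1 from the result if you want the line of the Packet header instead.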
Answered By - John1024