Issue
I have multiple log files in a directory and am trying to extract just the timestamp and one part of each log line, namely the value of the fulltext query parameter. Each query parameter in a request is separated by an ampersand (&), as shown below.
Input
30/Mar/2022:00:27:36 +0000 [59823] -> GET /libs/granite/omnisearch?p.guessTotal=1000&fulltext=798&savedSearches%40Delete=&
31/Mar/2022:00:27:36 +0000 [59823] -> GET /libs/granite/omnisearch?p.guessTotal=1000&fulltext=Dyson+V7&savedSearches%40Delete=&
Intended Output
30/Mar/2022:00:27:36 -> 798
31/Mar/2022:00:27:36 -> Dyson+V7
I've got this command to recursively search over all the files in the directory.
grep -rn "/libs/granite/omnisearch" ~/Downloads/ReqLogs/ > output.txt
This prints the entire log line, prefixed with the file path and line number, like so:
/Users/****/Downloads/ReqLogs/logfile1_2022-03-31.log:6020:31/Mar/2022:00:27:36 +0000 [59823] -> GET /libs/granite/omnisearch?p.guessTotal=1000&fulltext=798&savedSearches%4
How do I manipulate this to achieve the intended output?
Solution
grep can return the whole line or the string which matched. For extracting a different piece of data from the matching lines, turn to sed or Awk.
awk -v search="/libs/granite/omnisearch" '$0 ~ search { s = $0; sub(/.*fulltext=/, "", s); sub(/&.*/, "", s); print $1 " -> " s }' ~/Downloads/ReqLogs/*
or
sed -n '\%/libs/granite/omnisearch%s/ .*fulltext=\([^&]*\)&.*/ -> \1/p' ~/Downloads/ReqLogs/*
The sed version is more succinct, but also somewhat more oblique.
\%...% uses the alternate delimiter % so that we can use literal slashes in our search expression.
The s/ .../ -> \1/p then says to replace everything on the matching lines from the first space onward, capturing anything between fulltext= and &, with " -> " followed by the captured substring, and then print the resulting line.
The -n flag turns off the default printing action, so that we only print the lines where the search expression matched and the substitution succeeded.
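As a quick sanity check, here is a small sketch that feeds the two sample lines from the question to both commands through a pipe instead of reading the log files. The variable names sample, awk_out and sed_out are only for illustration, and both commands are written here to emit the " -> " separator so that the result matches the intended output in the question.

```shell
# Two sample lines copied from the question.
sample='30/Mar/2022:00:27:36 +0000 [59823] -> GET /libs/granite/omnisearch?p.guessTotal=1000&fulltext=798&savedSearches%40Delete=&
31/Mar/2022:00:27:36 +0000 [59823] -> GET /libs/granite/omnisearch?p.guessTotal=1000&fulltext=Dyson+V7&savedSearches%40Delete=&'

# Awk: keep field 1 (the timestamp) and strip everything around the fulltext value.
awk_out=$(printf '%s\n' "$sample" |
  awk -v search="/libs/granite/omnisearch" '
    $0 ~ search {
      s = $0
      sub(/.*fulltext=/, "", s)   # drop everything up to and including "fulltext="
      sub(/&.*/, "", s)           # drop the next "&" and everything after it
      print $1 " -> " s
    }')

# sed: the same result with a single substitution.
sed_out=$(printf '%s\n' "$sample" |
  sed -n '\%/libs/granite/omnisearch%s/ .*fulltext=\([^&]*\)&.*/ -> \1/p')

printf '%s\n' "$awk_out"
printf '%s\n' "$sed_out"
```

Both commands should print the two lines from the intended output. Note that the sed expression requires an & after the fulltext value, so a request where fulltext is the last parameter with no trailing & would be skipped by sed, while the Awk version would still print it.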
The wildcard ~/Downloads/ReqLogs/* matches all files in that directory; if you really need to traverse subdirectories, too, perhaps add find to the mix.
find ~/Downloads/ReqLogs -type f -exec sed -n '\%/libs/granite/omnisearch%s/ .*fulltext=\([^&]*\)&.*/ -> \1/p' {} +
or similarly with the Awk command after -exec. The placeholder {} tells find where to add the name of the found file(s), and + says to pass as many as possible in one go, rather than running a separate -exec for each found file. (If you want that, use \; instead of +.)
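For the Awk variant, a minimal sketch might look like the following. It builds a throwaway directory with one nested subdirectory to stand in for ~/Downloads/ReqLogs; the file names a.log and archive/b.log and the variable names are hypothetical, and the output is piped through sort only because find does not guarantee any particular file order.

```shell
# Throwaway directory tree standing in for ~/Downloads/ReqLogs (hypothetical layout).
logdir=$(mktemp -d)
mkdir -p "$logdir/archive"
printf '%s\n' '30/Mar/2022:00:27:36 +0000 [59823] -> GET /libs/granite/omnisearch?p.guessTotal=1000&fulltext=798&savedSearches%40Delete=&' > "$logdir/a.log"
printf '%s\n' '31/Mar/2022:00:27:36 +0000 [59823] -> GET /libs/granite/omnisearch?p.guessTotal=1000&fulltext=Dyson+V7&savedSearches%40Delete=&' > "$logdir/archive/b.log"

# "{} +" hands all found files to a single awk process, including the one
# in the subdirectory, which a plain "$logdir"/* wildcard would not reach.
found=$(find "$logdir" -type f -exec awk -v search="/libs/granite/omnisearch" \
  '$0 ~ search { s = $0; sub(/.*fulltext=/, "", s); sub(/&.*/, "", s); print $1 " -> " s }' {} + | sort)

printf '%s\n' "$found"

# Clean up the temporary tree.
rm -rf "$logdir"
```

With these two files, the sketch should print one line per log entry in the intended "timestamp -> value" form.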
Answered By - tripleee Answer Checked By - Katrina (WPSolving Volunteer)