Issue
I have the command below, which prints the hit count, the host IP (local server/load balancer), and the external IP (the one causing the hit). I would also like to print the User Agent alongside this information. How can this be achieved?
cat access.log | sed -e 's/^\([[:digit:]\.]*\).*"\(.*\)"$/\1 \2/' | sort -n | uniq -c | sort -nr | head -20
What I get is below...
Hits, Host IP, External IP
What I'd like if possible...
Hits, IP (host example), External IP (causing the hit), User Agent
10000 192.168.1.1 148.285.xx.xx Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.4 (KHTML, like Gecko) Chrome/98 Safari/537.4
Attached below is an excerpt from the log
192.168.xxx.x - - [10/Jun/2019:12:40:15 +0100] "GET /company-publications/152005 HTTP/1.1" 200 55848 "google.com" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.12) Gecko/20080219 Firefox/2.0.0.12 Navigator/9.0.0.6" "xx.xx.xx.xx"
Solution
If GNU AWK (gawk) is available, please try the following:
awk -v FPAT='(\"[^"]+\")|(\\[[^]]+])|([^ ]+)' '
{ gsub("\"", "", $9); gsub("\"", "", $10); print $1 " " $10 " " $9 }
' access.log | sort -n | uniq -c | sort -nr | head -20
- The value of FPAT represents a regex of each field in access.log. That is: "string surrounded by double quotes", "string surrounded by square brackets", or "string separated by whitespace".
- Then you can split each line of access.log into fields: $1 for the host IP, $10 for the external IP, and $9 for the user agent.
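If gawk is not available, roughly the same three columns can be pulled out with any POSIX awk by splitting on the double-quote character instead of using FPAT. This is only a sketch, not part of the original answer. It assumes every line follows the format of the excerpt in the question, with the request, referer, user agent, and external IP each wrapped in double quotes and no escaped quotes inside them. The function name extract_fields is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints "host_ip external_ip user_agent" for each log line.
# Assumes the quoted-field layout of the excerpt in the question.
extract_fields() {
    awk -F'"' '{
        split($1, head, " ")    # head[1] is the leading host IP
        print head[1], $8, $6   # $8 = external IP, $6 = user agent
    }' "$@"
}

# Sample line copied from the excerpt in the question.
sample='192.168.xxx.x - - [10/Jun/2019:12:40:15 +0100] "GET /company-publications/152005 HTTP/1.1" 200 55848 "google.com" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.12) Gecko/20080219 Firefox/2.0.0.12 Navigator/9.0.0.6" "xx.xx.xx.xx"'
printf '%s\n' "$sample" | extract_fields
```

To get the hit counts, the function's output can be fed into the same pipeline as before: extract_fields access.log | sort -n | uniq -c | sort -nr | head -20. A user agent or request that itself contains a double quote would shift the field numbers, which is where the FPAT approach is more robust.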
Answered By - tshiono