Monday, April 4, 2022

[SOLVED] Grep only last line with latest datetimes in a text file

Issue

I have a log file on a Linux system (Red Hat) that records database events. The file looks like this:

2021-08-04 09:35:00.212 +03 [100] FATAL: password fail for x
2021-08-04 09:35:20.276 +03 [101] FATAL: password fail for x
2021-08-04 09:36:05.223 +03 [104] FATAL: password fail for x
2021-08-04 09:36:20.823 +03 [305] FATAL: password fail for y
2021-08-04 09:37:00.299 +03 [322] FATAL: password fail for y
2021-08-04 09:37:50.350 +03 [328] FATAL: password fail for y
2021-08-04 09:38:20.822 +03 [340] FATAL: password fail for z
2021-08-04 09:38:22.500 +03 [370] FATAL: password fail for z
2021-08-04 09:38:50.210 +03 [420] FATAL: password fail for z
2021-08-04 09:39:01.372 +03 [423] FATAL: password fail for z

I want to get only the line with the latest datetime for each user (x, y, z). The result should look like this:

  2021-08-04 09:36:05.223 +03 [104] FATAL: password fail for x
  2021-08-04 09:37:50.350 +03 [328] FATAL: password fail for y
  2021-08-04 09:39:01.372 +03 [423] FATAL: password fail for z

Solution

We can use awk to get the lines that have a unique value in the last column (see "print unique lines based on field").
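
As a small illustration of that awk idiom on its own, here is a sketch with made-up input. The function name first_per_key and the array name seen are just placeholders I picked; any names would do.

```shell
# Hypothetical helper: print a line only the first time its first field is seen.
# seen[$1]++ evaluates to 0 (false) on the first occurrence and then increments,
# so !seen[$1]++ is true exactly once per distinct key.
first_per_key() {
  awk '!seen[$1]++'
}

# Made-up sample input: the key "a" appears twice.
printf 'a 1\nb 2\na 3\n' | first_per_key
# prints:
# a 1
# b 2
```

Note that this keeps the first occurrence of each key, which is why the order of the input matters.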


To ensure those are the latest (by datetime), I'd assume the following:

  • The file is always sorted from old to new

Therefore, if we:

  • Reverse the file (to go from new -> old)
  • Get the unique user rows
  • Reverse it again (to go from old -> new)

we get the last failed attempt for each user:

tac log.txt | awk -F" " '!_[$9]++' | tac
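
The same pipeline, spelled out stage by stage, might look like the sketch below. The wrapper name last_fail_per_user is hypothetical and only there to hang the comments on. It assumes GNU tac is available and that the user name is always field 9, as in the sample lines.

```shell
# Hypothetical wrapper around the one-liner above; takes the log file as $1.
last_fail_per_user() {
  tac "$1" |              # newest line first
    awk '!seen[$9]++' |   # keep a line only the first time its user (field 9) appears
    tac                   # restore oldest-to-newest order
}

# Usage (assuming the file is called log.txt):
# last_fail_per_user log.txt
```

Because awk keeps the first occurrence it sees, reversing the file first is what turns "first per user" into "last per user".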

Example on my local machine:

$
$ cat log.txt
2021-08-04 09:35:00.212 +03 [100] FATAL: password fail for x
2021-08-04 09:35:20.276 +03 [101] FATAL: password fail for x
2021-08-04 09:36:05.223 +03 [104] FATAL: password fail for x
2021-08-04 09:36:20.823 +03 [305] FATAL: password fail for y
2021-08-04 09:37:00.299 +03 [322] FATAL: password fail for y
2021-08-04 09:37:50.350 +03 [328] FATAL: password fail for y
2021-08-04 09:38:20.822 +03 [340] FATAL: password fail for z
2021-08-04 09:38:22.500 +03 [370] FATAL: password fail for z
2021-08-04 09:38:50.210 +03 [420] FATAL: password fail for z
2021-08-04 09:39:01.372 +03 [423] FATAL: password fail for z
$
$ tac log.txt | awk -F" " '!_[$9]++' | tac
2021-08-04 09:36:05.223 +03 [104] FATAL: password fail for x
2021-08-04 09:37:50.350 +03 [328] FATAL: password fail for y
2021-08-04 09:39:01.372 +03 [423] FATAL: password fail for z
$
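
If the assumption that the file is sorted from old to new does not hold, one option is to sort it first. This is only a sketch: the function name latest_per_user_sorted is made up, and it assumes every line starts with a fixed-width date and time in the same UTC offset, so that a plain byte-wise sort on the first two fields is also a chronological sort.

```shell
# Hypothetical variant for unsorted input; takes the log file as $1.
# Assumption: all timestamps share one UTC offset and the same fixed-width format.
latest_per_user_sorted() {
  LC_ALL=C sort -k1,1 -k2,2 "$1" |   # order by date (field 1), then time (field 2)
    tac |                            # newest line first
    awk '!seen[$NF]++' |             # first line per user; $NF is the last field
    tac                              # back to oldest-to-newest
}

# Usage (assuming the file is called log.txt):
# latest_per_user_sorted log.txt
```

Here $NF is used instead of $9 so the user name is picked up as the last field even if the number of fields before it changes; that is a design choice of this sketch, not part of the original command.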


Answered By - 0stone0
Answer Checked By - Mary Flores (WPSolving Volunteer)