Issue
I'm trying to sort certificates by date, leaving only the latest one in a separate file.
Here's an example pki_certs.res input file for an example host, with the unsorted list of its past certificates that I need to sort:
And here's the script to sort and pop the last one out:
cat "${_file}" | sort -k10,10 | sed -e 's/Not After : //' -e 's/GMT/GMT;/' | grep "${_domain}" | \
while read line; do
_first=`echo $line | cut -d';' -f1`
_second=`echo $line | cut -d';' -f2-`
_date=`date -d "${_first}" +%Y%m%d%H%M`
echo "$_date $_second"
done |sort -k 3,3 -k 1,1r | awk "{if (i[\$3] < \$1) i[\$3]=\$1} END{for(x in i){ print x\" \"i[x] }}" | \
sed -e 's/CN=//g' | sort -k 2,2 > pki_certs.final.sorted
The trouble is that the sort leaves the second-latest certificate, rather than the most recent one, in the pki_certs.final.sorted file.
Expected output:
Apr 7 20:09:26 2023
But instead of that output I get this:
Apr 12 18:12:02 2022
Any thoughts on what I'm missing, please?
Solution
Apply the DSU (Decorate/Sort/Undecorate) idiom, using any version of the mandatory Unix tools awk, sort, cut, and head, to output the whole line:
$ awk '{printf "%04d%02d%02d%s\t%s\n", $7, (index("JanFebMarAprMayJunJulAugSepOctNovDec",$4)+2)/3, $5, $6, $0}' file |
sort -r | head -1 | cut -f2-
pki_certs.res:Not After : Apr 7 20:09:26 2023 GMT DNS:MDVARTREPO01.cpp.nonlive
or just the date:
$ awk '{printf "%04d%02d%02d%s\t%s\n", $7, (index("JanFebMarAprMayJunJulAugSepOctNovDec",$4)+2)/3, $5, $6, $4" "$5" "$6" "$7}' file |
sort -r | head -1 | cut -f2-
Apr 7 20:09:26 2023
The awk command prepends a sortable version of the date and time to each line, sort -r orders the lines by that timestamp with the newest first, head -1 keeps only the top line, and cut removes the string that awk added. The intermediate output from each step shows how it works:
$ awk '{printf "%04d%02d%02d%s\t%s\n", $7, (index("JanFebMarAprMayJunJulAugSepOctNovDec",$4)+2)/3, $5, $6, $4" "$5" "$6" "$7}' file
2023040720:09:26 Apr 7 20:09:26 2023
2020050712:05:44 May 7 12:05:44 2020
2021040817:06:54 Apr 8 17:06:54 2021
2020050711:58:19 May 7 11:58:19 2020
2021040917:42:27 Apr 9 17:42:27 2021
2021041709:09:35 Apr 17 09:09:35 2021
2021040917:02:43 Apr 9 17:02:43 2021
2022041218:12:02 Apr 12 18:12:02 2022
$ awk '{printf "%04d%02d%02d%s\t%s\n", $7, (index("JanFebMarAprMayJunJulAugSepOctNovDec",$4)+2)/3, $5, $6, $4" "$5" "$6" "$7}' file | sort -r
2023040720:09:26 Apr 7 20:09:26 2023
2022041218:12:02 Apr 12 18:12:02 2022
2021041709:09:35 Apr 17 09:09:35 2021
2021040917:42:27 Apr 9 17:42:27 2021
2021040917:02:43 Apr 9 17:02:43 2021
2021040817:06:54 Apr 8 17:06:54 2021
2020050712:05:44 May 7 12:05:44 2020
2020050711:58:19 May 7 11:58:19 2020
$ awk '{printf "%04d%02d%02d%s\t%s\n", $7, (index("JanFebMarAprMayJunJulAugSepOctNovDec",$4)+2)/3, $5, $6, $4" "$5" "$6" "$7}' file | sort -r | head -1
2023040720:09:26 Apr 7 20:09:26 2023
$ awk '{printf "%04d%02d%02d%s\t%s\n", $7, (index("JanFebMarAprMayJunJulAugSepOctNovDec",$4)+2)/3, $5, $6, $4" "$5" "$6" "$7}' file | sort -r | head -1 | cut -f2-
Apr 7 20:09:26 2023
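For anyone who wants to try the idiom without the original file, here is a minimal self-contained sketch of the same pipeline. The function name latest_date is made up for illustration, and the sample input is an assumption: the dates are taken from the output above and the line layout is copied from the one full line shown earlier, not from the asker's real pki_certs.res.

```shell
#!/bin/sh
# latest_date (hypothetical helper): reads "Not After" lines on stdin and
# prints the newest date. Assumes the month is field 4, the day field 5,
# the time field 6 and the year field 7, as in the line shown earlier.
latest_date() {
  # Decorate with a YYYYMMDDhh:mm:ss key, sort newest first, keep the
  # top line, then undecorate by cutting the key off again.
  awk '{
    # index() gives 1 for Jan, 4 for Feb, ... 34 for Dec,
    # so (index + 2) / 3 is the month number 1..12.
    month = (index("JanFebMarAprMayJunJulAugSepOctNovDec", $4) + 2) / 3
    printf "%04d%02d%02d%s\t%s %s %s %s\n", $7, month, $5, $6, $4, $5, $6, $7
  }' | sort -r | head -n 1 | cut -f2-
}

# Assumed sample input, deliberately out of order.
latest=$(latest_date <<'EOF'
pki_certs.res:Not After : May 7 12:05:44 2020 GMT DNS:MDVARTREPO01.cpp.nonlive
pki_certs.res:Not After : Apr 12 18:12:02 2022 GMT DNS:MDVARTREPO01.cpp.nonlive
pki_certs.res:Not After : Apr 7 20:09:26 2023 GMT DNS:MDVARTREPO01.cpp.nonlive
pki_certs.res:Not After : Apr 17 09:09:35 2021 GMT DNS:MDVARTREPO01.cpp.nonlive
EOF
)
echo "$latest"
```

Because the key is zero-padded and ordered year, month, day, time, a plain text sort is enough; no numeric or month-aware sort options are needed.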
By the way, in your code where you have:
awk "{if (i[\$3] < \$1) i[\$3]=\$1} END{for(x in i){ print x\" \"i[x] }}"
you're having to escape all those symbols because you're using the wrong quotes, which invites the shell to interpret the script before awk sees it. Don't do that; use single quotes, as you always should unless you have a specific reason not to:
awk '{if (i[$3] < $1) i[$3]=$1} END{for(x in i){ print x" "i[x] }}'
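If the reason for reaching for double quotes is to get a shell variable into the script, one common alternative is to keep the script single-quoted and pass the value with awk's -v option. The sketch below assumes a value for the question's ${_domain} and two made-up input lines; the variable name domain inside awk is also just illustrative.

```shell
#!/bin/sh
# Assumed value standing in for ${_domain} from the question.
_domain='MDVARTREPO01.cpp.nonlive'

# The awk script stays in single quotes, so the shell never touches $0 or $1.
# The shell variable is handed over with -v and used as a plain awk variable.
# index() does a literal substring match, so the dots are not treated as
# regex metacharacters.
match=$(printf '%s\n' 'x DNS:OTHER.cpp.nonlive' "y DNS:${_domain}" |
  awk -v domain="$_domain" 'index($0, domain) { print $1 }')
echo "$match"
```

One caveat: awk processes backslash escape sequences in -v values, so a value containing backslashes may need a different mechanism such as the environment (ENVIRON).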
Answered By - Ed Morton Answer Checked By - Cary Denson (WPSolving Admin)