Issue
We have the following example list in /tmp/file.
The first field is the date, and the second field is the folder path.
Our goal is to get only the list of folders that are older than 30 days from the current day.
We tried bash with sort and awk/sed, but without success.
more /tmp/file
2023-09-14 /user/hdfs/.sparkStaging/application_1693406696051_0046
2023-09-14 /user/hdfs/.sparkStaging/application_1693406696051_0049
2023-09-14 /user/hdfs/.sparkStaging/application_1693406696051_0051
2023-09-14 /user/hdfs/.sparkStaging/application_1693406696051_0063
2023-09-14 /user/hdfs/.sparkStaging/application_1693406696051_0064
2023-09-14 /user/hdfs/.sparkStaging/application_1693406696051_0065
2023-09-18 /user/hdfs/.sparkStaging/application_1693406696051_0107
2023-09-18 /user/hdfs/.sparkStaging/application_1693406696051_0108
2023-09-18 /user/hdfs/.sparkStaging/application_1693406696051_0109
2023-09-19 /user/hdfs/.sparkStaging/application_1693406696051_0142
2023-09-19 /user/hdfs/.sparkStaging/application_1693406696051_0143
2023-10-11 /user/hdfs/.sparkStaging/application_1697021506895_0001
2023-10-11 /user/hdfs/.sparkStaging/application_1697021506895_0002
2023-10-11 /user/hdfs/.sparkStaging/application_1697021506895_0017
2023-10-12 /user/hdfs/.sparkStaging/application_1697021506895_0048
2023-10-12 /user/hdfs/.sparkStaging/application_1697021506895_0054
2023-10-12 /user/hdfs/.sparkStaging/application_1697021506895_0062
2023-10-12 /user/hdfs/.sparkStaging/application_1697021506895_0063
2023-10-15 /user/hdfs/.sparkStaging/application_1697021506895_0077
2023-10-15 /user/hdfs/.sparkStaging/application_1697021506895_0081
2023-10-15 /user/hdfs/.sparkStaging/application_1697021506895_0090
2023-10-16 /user/hdfs/.sparkStaging/application_1697021506895_0170
2023-10-16 /user/hdfs/.sparkStaging/application_1697021506895_0171
2023-10-19 /user/hdfs/.sparkStaging/application_1697021506895_0422
2023-10-19 /user/hdfs/.sparkStaging/application_1697021506895_0426
Solution
I wouldn't normally post an answer to a question that shows neither the OP's attempt in code nor the expected output, but since there are other answers already...
Using GNU date plus any awk:
$ awk -v d="$(date -d '-30 days' +'%F')" '$1 < d' file
2023-09-14 /user/hdfs/.sparkStaging/application_1693406696051_0046
2023-09-14 /user/hdfs/.sparkStaging/application_1693406696051_0049
2023-09-14 /user/hdfs/.sparkStaging/application_1693406696051_0051
2023-09-14 /user/hdfs/.sparkStaging/application_1693406696051_0063
2023-09-14 /user/hdfs/.sparkStaging/application_1693406696051_0064
2023-09-14 /user/hdfs/.sparkStaging/application_1693406696051_0065
2023-09-18 /user/hdfs/.sparkStaging/application_1693406696051_0107
2023-09-18 /user/hdfs/.sparkStaging/application_1693406696051_0108
2023-09-18 /user/hdfs/.sparkStaging/application_1693406696051_0109
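The comparison above works because dates in YYYY-MM-DD form sort lexicographically in chronological order, so awk's plain string comparison is enough. Since the question asks for only the folders, the following is a minimal sketch that prints just the second field. The three-line sample file and the fixed cutoff are assumptions made so the result is reproducible; in real use the cutoff would come from `date -d '-30 days' +%F` as shown above.

```shell
# Hypothetical sample data standing in for /tmp/file.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
2023-09-14 /user/hdfs/.sparkStaging/application_1693406696051_0046
2023-09-19 /user/hdfs/.sparkStaging/application_1693406696051_0142
2023-10-19 /user/hdfs/.sparkStaging/application_1697021506895_0426
EOF

# Fixed cutoff for a reproducible demo (an assumption, not from the original).
# In real use with GNU date: cutoff=$(date -d '-30 days' +%F)
cutoff='2023-09-19'

# Zero-padded YYYY-MM-DD strings compare correctly as plain strings.
# Print only the folder path (field 2) for rows strictly older than the cutoff.
old_dirs=$(awk -v d="$cutoff" '$1 < d { print $2 }' "$tmpfile")
printf '%s\n' "$old_dirs"

rm -f "$tmpfile"
```

Note that `$2` assumes the paths contain no whitespace, which holds for the sample list. A row dated exactly on the cutoff is not printed, because the comparison is strict.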
Or, using GNU awk alone (for the time functions systime() and strftime()):
$ awk 'BEGIN{d=strftime("%F",systime()-30*24*60*60)} $1 < d' file
2023-09-14 /user/hdfs/.sparkStaging/application_1693406696051_0046
2023-09-14 /user/hdfs/.sparkStaging/application_1693406696051_0049
2023-09-14 /user/hdfs/.sparkStaging/application_1693406696051_0051
2023-09-14 /user/hdfs/.sparkStaging/application_1693406696051_0063
2023-09-14 /user/hdfs/.sparkStaging/application_1693406696051_0064
2023-09-14 /user/hdfs/.sparkStaging/application_1693406696051_0065
2023-09-18 /user/hdfs/.sparkStaging/application_1693406696051_0107
2023-09-18 /user/hdfs/.sparkStaging/application_1693406696051_0108
2023-09-18 /user/hdfs/.sparkStaging/application_1693406696051_0109
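To make the arithmetic in that BEGIN block concrete, here is a small sketch of the same "now minus 30*24*60*60 seconds" calculation done in the shell. It assumes GNU date (for the `-d "@epoch"` syntax) and uses a hypothetical fixed value of "now" in UTC so the result is reproducible; the gawk version uses the real current time and formats it in the local time zone.

```shell
# Hypothetical fixed "now": 2023-10-19 00:00:00 UTC, as seconds since the epoch.
now=1697673600

# 30 days expressed in seconds, the same arithmetic as in the gawk BEGIN block.
cutoff_epoch=$(( now - 30*24*60*60 ))

# Format the cutoff as YYYY-MM-DD in UTC (GNU date accepts "@epoch" with -d).
cutoff=$(date -u -d "@$cutoff_epoch" +%F)
echo "$cutoff"
```

With that fixed value of "now" the cutoff is 2023-09-19, which matches the listed output: rows dated 2023-09-19 are not printed, since the comparison `$1 < d` is strict.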
Answered By - Ed Morton
Answer Checked By - Katrina (WPSolving Volunteer)