Saturday, December 4, 2021

[SOLVED] One-liner to list old versions of files in a directory

December 04, 2021 awk, bash, linux, sed

Issue

I'm trying to work out a Linux one-liner that lists only the files in a directory that are earlier-version duplicates of other files there, e.g.:

filenames:

foo-bar-foo-1.3.42.jar
foo-bar-foo-1.2.21.jar
foo-2.3-foo-bar-3.1.2.jar
foo-2.3-foo-bar-3.2.4.jar
bar-foo-1.24.jar
bar-foo-2.0.jar
foobar-foobar-3.4.1.jar
barfoo-barfoo-1.2.1.jar

expected output:

foo-bar-foo-1.2.21.jar
foo-2.3-foo-bar-3.1.2.jar
bar-foo-1.24.jar

This is similar to the question https://unix.stackexchange.com/questions/185193/remove-the-low-version-number-of-file , but that one relies on being able to set a field separator on the first dash, and my file names have at least two dashes. I've had some limited success trying to tweak it like this:

ls -vr *.jar | awk -F-[0-9]+.[0-9]+.[0-9]+ '$1 == name{system ("ls \""$0"\"")}{name=$1}'

However, it misses files with only two numbers in the version.

Using this instead gets caught up on files like foo-2.3-foo-bar-3.1.2.jar:

ls -vr *.jar | awk -F-[0-9]+.[0-9]+ '$1 == name{system ("ls \""$0"\"")}{name=$1}'
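
A quick way to see what goes wrong with both separators is to print the first field awk produces for a single name. This is only a diagnostic sketch: the helper name first_field is made up for illustration, and the separators are the same regexes as in the two attempts, just quoted.

```shell
# Hypothetical helper: print the first field awk sees for a given
# field-separator regex ($1 here) applied to one file name ($2 here).
first_field() { printf '%s\n' "$2" | awk -F"$1" '{ print $1 }'; }

# Three-number separator: it never matches a two-number version,
# so the first field is the whole file name and no prefix is extracted.
first_field '-[0-9]+.[0-9]+.[0-9]+' bar-foo-1.24.jar

# Two-number separator: the embedded "-2.3" matches before the real
# version does, so the first field is cut short to "foo".
first_field '-[0-9]+.[0-9]+' foo-2.3-foo-bar-3.1.2.jar
```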

I can also use gsub to get a variable that contains everything but the version number, but I can't figure out how to use it to ultimately get my expected results.

ls -vr *.jar | awk -F- '{gsub("-"$NF,"",$x)}{print $x}'

I am open to not using awk if there's a better solution (I'm not terribly familiar with it). I'm working on RHEL in bash with sed also available. However, it must be a one-liner that can be used directly on the command line.

Solution

GNU sort has a --version-sort option (-V for short), which is the hero here: it puts each older version directly before the newer version of the same file.
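
To see why version sort matters, here is a small sketch comparing byte-wise and version ordering. The names a-1.9.jar and a-1.10.jar are made up for illustration, and -V assumes GNU sort from coreutils.

```shell
# Hypothetical names, chosen so that byte-wise and version order disagree.
names() { printf '%s\n' a-1.10.jar a-1.9.jar; }

# Byte-wise order compares character by character, so "1.1..." sorts
# before "1.9..." and version 1.10 ends up ahead of 1.9.
names | LC_ALL=C sort

# Version order compares the numeric components as numbers,
# so 1.9 sorts before 1.10.
names | sort -V
```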

#!/bin/bash

# let awk remember the previous file prefix (p1) and previous file name (f1)
# if the current prefix (p2) matches the previous prefix (p1), then 
# print the previous filename (f1)
awk '{
    # remember the previous values
    p1=p2
    f1=f2
    # save the current filename
    f2=$0
    # strip the version and extension
    sub(/[0-9.]+\.[a-z]+$/, "")
    # save as the current prefix
    p2=$0
    if (p1 == p2) {
        # print the previous filename if this prefix is the same as the previous
        print f1
    }
}' <(sort --version-sort <(for f in *.jar; do echo "$f"; done))

And now for the one-liner :)

awk '{p1=p2; f1=f2; f2=$0; sub(/[0-9.]+\.[a-z]+$/, ""); p2=$0; if (p1==p2) {print f1}}' <(sort -V <(for f in *.jar; do echo "$f"; done))

Results:

bar-foo-1.24.jar
foo-2.3-foo-bar-3.1.2.jar
foo-bar-foo-1.2.21.jar
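
To check the approach without creating any files, the same logic can be wrapped in a function that reads names from standard input. This is a sketch rather than the answer's exact code: the names list_old_versions and sample_names are made up for illustration, the dot before the extension is matched literally, and GNU sort is assumed for -V.

```shell
# Hypothetical helper: read file names on stdin and print each one that is
# immediately followed (in version order) by a file with the same prefix.
list_old_versions() {
    sort -V | awk '{
        prev_file = file; prev_prefix = prefix
        file = $0
        sub(/[0-9.]+\.[a-z]+$/, "")   # strip the trailing version and extension
        prefix = $0
        if (prev_prefix == prefix) print prev_file
    }'
}

# The sample names from the question.
sample_names() {
    printf '%s\n' \
        foo-bar-foo-1.3.42.jar foo-bar-foo-1.2.21.jar \
        foo-2.3-foo-bar-3.1.2.jar foo-2.3-foo-bar-3.2.4.jar \
        bar-foo-1.24.jar bar-foo-2.0.jar \
        foobar-foobar-3.4.1.jar barfoo-barfoo-1.2.1.jar
}

sample_names | list_old_versions
```

On the sample names this should list the same three files as the results above. In a real directory, printf '%s\n' *.jar | list_old_versions feeds it the actual file names.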

Answered By - Cole Tierney

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
