Issue
I have strings of the form
A-XXX
A-YYY
B-NNN
A-ZZZ
B-MMM
C-DDD
I want to keep only the first line for each distinct string before the hyphen. For the input above, the expected output would be:
A-XXX
B-NNN
C-DDD
How can I do this with bash tools? I tried uniq, but I can't set the "similarity-pattern" there.
Solution
Will this suffice?
$ cat uwe
A-XXX
A-YYY
B-NNN
A-ZZZ
B-MMM
C-DDD
$ awk -F'-' '!a[$1]{print $0;a[$1]++}' uwe
A-XXX
B-NNN
C-DDD
EDIT:
One can actually shorten that to the slightly more cryptic:
$ awk -F'-' '!a[$1]++' uwe
A-XXX
B-NNN
C-DDD
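We tell awk that - is the field separator. The expression !a[$1]++ is used as a pattern: it is true only the first time a given first field is seen, because a[$1] is still unset (and so evaluates to 0) at that point. With no action given, awk's default action is to print the line. The post-increment then marks that value as seen, so later lines with the same prefix are skipped.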
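If the one-liner reads as too terse, the same logic can be spelled out step by step. This is only a sketch: the function name first_per_prefix and the array name seen are made up for illustration and are not part of the answer above.

```shell
# Hypothetical helper: print the first line seen for each distinct prefix
# (the text before the first "-"). Reads file arguments or standard input.
first_per_prefix() {
    awk -F'-' '
        # $1 is the prefix; act only if it has not been recorded yet.
        !($1 in seen) {
            print          # no arguments: prints the whole line ($0)
            seen[$1] = 1   # remember the prefix so later lines are skipped
        }
    ' "$@"
}

# Same sample data as above, fed through standard input.
printf '%s\n' A-XXX A-YYY B-NNN A-ZZZ B-MMM C-DDD | first_per_prefix
```

Testing membership with "in" instead of !seen[$1]++ avoids relying on an unset array element evaluating to 0, at the cost of a few extra lines. Either way the input order is preserved and the file does not need to be sorted.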
Answered By - tink