Issue
I'm trying to make a nice Gource video of our software development project. Using Gource, I can generate a combined git log of all repos with:
first gource --output-custom-log ../logs/repo1.txt
then
cat *.txt | sort -n > combined.txt
This generates a combined.txt file, which is a pipe-delimited file like:
1551272464|John|A|repo1/file1.txt
1551272464|john_doe|A|repo1/folder/file9.py
1551272464|Doe, John|A|repo2/filex.py
So it's: EPOCH|Committer name|A or D or C|committed file
The actual problem I want to solve is that my developers have used different git clients with different committer names, so I'd like to replace all of their names with a single version. I don't mind running a separate sed per case. So: find "John", "john_doe" and "Doe, John" and replace each of them with "John Doe". It needs to work on my MacBook.
So I tried sed -i -r "s/John/user_john/g" combined.txt
but the problem here is that it finds "John" and "Doe, John" and replaces just the "John" part, so I need to do a fuzzy search and replace the whole column.
Who can help me get the correct regex?
Solution
A regex would almost certainly be the wrong approach for this: you'd get false matches unless you were extremely careful, and it's inefficient.
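To illustrate the false-match risk, here is a minimal sketch. The sample line is made up for illustration (in particular the repo2/John/ directory is hypothetical and not from the question's data); it shows an unanchored substitution rewriting both part of another name and part of a file path:

```shell
# Hypothetical sample line in the question's pipe-delimited format.
line='1551272464|Doe, John|A|repo2/John/filex.py'

# Unanchored substitution: replaces every "John", wherever it occurs.
result=$(printf '%s\n' "$line" | sed 's/John/John Doe/g')

printf '%s\n' "$result"
# prints: 1551272464|Doe, John Doe|A|repo2/John Doe/filex.py
```

Neither the committer field nor the path ends up as intended, which is why comparing the whole second field against a list of known names is safer.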
Just create an aliases file containing a line for each name you want in your output, followed by all the names that should be mapped to it. Then you can change them all clearly, simply, robustly, portably, and efficiently in one call to awk:
$ cat tst.awk
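BEGIN { FS="[|]" ; OFS="|" }

# First file (aliases): map every alternative name to the canonical name in field 1.
NR==FNR {
    for (i=2; i<=NF; i++) {
        alias[$i] = $1
    }
    next
}

# Remaining files: replace the committer field when it exactly matches a known alias.
$2 in alias { $2 = alias[$2] }
{ print }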
.
$ cat aliases
John Doe|John|john_doe|Doe, John
Susan Barker|Susie B|Barker, Susan
.
$ cat file
1551272464|John|A|repo1/file1.txt
1551272464|Susie B|A|repo2/filex.py
1551272464|john_doe|A|repo1/folder/file9.py
1551272464|Doe, John|A|repo2/filex.py
1551272464|Barker, Susan|A|repo2/filex.py
.
$ awk -f tst.awk aliases file
1551272464|John Doe|A|repo1/file1.txt
1551272464|Susan Barker|A|repo2/filex.py
1551272464|John Doe|A|repo1/folder/file9.py
1551272464|John Doe|A|repo2/filex.py
1551272464|Susan Barker|A|repo2/filex.py
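As a rough sketch of how this might be applied to the combined.txt from the question: list the distinct committer names first to see what the aliases file needs to cover, then write awk's output to a temporary file and move it over the original. The scratch directory, the combined.tmp file name, and the sample data below are assumptions for illustration only.

```shell
# Work in a scratch directory so no real files are overwritten (illustrative setup).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The awk script from the answer, written out so this sketch is self-contained.
cat > tst.awk <<'EOF'
BEGIN { FS="[|]" ; OFS="|" }
NR==FNR {
    for (i=2; i<=NF; i++) {
        alias[$i] = $1
    }
    next
}
$2 in alias { $2 = alias[$2] }
{ print }
EOF

# Sample aliases and log data in the question's format.
cat > aliases <<'EOF'
John Doe|John|john_doe|Doe, John
EOF

cat > combined.txt <<'EOF'
1551272464|John|A|repo1/file1.txt
1551272464|john_doe|A|repo1/folder/file9.py
1551272464|Doe, John|A|repo2/filex.py
EOF

# Step 1: list the distinct committer names (field 2) that need an alias entry.
names_before=$(cut -d'|' -f2 combined.txt | sort -u)
printf '%s\n' "$names_before"

# Step 2: rewrite via a temporary file; replace the original only if awk succeeded.
awk -f tst.awk aliases combined.txt > combined.tmp && mv combined.tmp combined.txt

# Step 3: check which names remain after the rewrite.
names_after=$(cut -d'|' -f2 combined.txt | sort -u)
printf '%s\n' "$names_after"
```

Going through a temporary file avoids relying on any in-place editing option, so the same commands should behave the same with the awk shipped on macOS and with other awks.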
Answered By - Ed Morton