Issue
I have a large text file with lines like:
01 81118 9164.47 0/0:6,0:6:18:.:.:0,18,172:. 0/0:2,0:2:6:.:.:0,6,74:. 0/1:4,5:9:81:.:.:148,0,81:.
What I need is to keep just the first three characters of all the columns containing a colon, i.e.
01 81118 9164.47 0/0 0/0 0/1
The number of characters after the first three can vary. I started by removing everything after a colon, but that removes the entire rest of the line rather than working per word:
sed 's/:.*//g' file.txt
Alternatively, I've been trying to bring in the word boundary (\b) and repeatedly strip whatever follows each colon:
sed 's/\b:[^ ]//g' file.txt | sed 's/\b:[^ ]//g'
But this is not a good way to go about it. What's the best approach?
Solution
Using GNU sed with extended regular expressions (-E), a one-liner could be:
sed -E 's/(\S{3})\S*:\S*/\1/g' file
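\S matches any non-whitespace character (a GNU extension). (\S{3}) captures the first three characters of a word, \S*:\S* consumes the rest of that word only if it contains a colon, and \1 puts the three captured characters back. Words without a colon are left untouched because \S cannot match across whitespace.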
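Since \S is GNU-specific, here is a sketch of the same substitution written with the POSIX character class [^[:space:]], which should also work in sed implementations that support -E but not \S. The sample line is copied from the question; the variable names line and result are only for illustration.

```shell
# Sample line from the question (stored in a variable for illustration).
line='01 81118 9164.47 0/0:6,0:6:18:.:.:0,18,172:. 0/0:2,0:2:6:.:.:0,6,74:. 0/1:4,5:9:81:.:.:148,0,81:.'

# Same idea as the \S version: capture the first three non-whitespace
# characters of a word, then consume the rest of the word only if it
# contains a colon, and keep just the captured part.
result=$(printf '%s\n' "$line" |
  sed -E 's/([^[:space:]]{3})[^[:space:]]*:[^[:space:]]*/\1/g')

printf '%s\n' "$result"
```

To process a file instead of a single line, the same sed expression can be given the file name as its argument, as in the one-liner above.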
Answered By - M. Nejat Aydin Answer Checked By - Terry (WPSolving Volunteer)