Tuesday, October 25, 2022

[SOLVED] run regex against 3rd column only

Issue

Using this regular expression, I'm matching a string of digits that starts with 9, is followed by 4, 5, 6, 7 or 9, and then by 6 more digits. The digits may be separated by spaces.

9*[45679]( *[0-9]){6}

I have a file named content.txt containing 3 columns. The first column is a date, the second is a time, and the third contains arbitrary text and numbers, including spaces.

20/10/2022 19:00 test 1 99 435 18 1 more text
20/10/2022 20:00 test 2 97 123 1 81 more text2
20/10/2022 21:00 test 3 96 4 3 5567 more text3
20/10/2022 22:00 test 4 99 43 5181 more text4

Using my regular expression, I want to replace the third column with only the text the expression matches, with its spaces removed, so the result should be:

20/10/2022 19:00 99435181
20/10/2022 20:00 97123181
20/10/2022 21:00 96435567
20/10/2022 22:00 99435181

The field separator is a space.


Solution

With GNU sed, assuming the field separator is a single space (file stands for your input file, here content.txt):

sed -E 's/^(.{16}).*( 9[45679]( *[0-9]){6}).*/\1\2/; s/ //g3' file

Output:

20/10/2022 19:00 99435181
20/10/2022 20:00 97123181
20/10/2022 21:00 96435567
20/10/2022 22:00 99435181
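
To see what each of the two substitutions does, the one-liner above can be split into two separate sed calls. This is only a sketch: the variable names input, step1 and result are made up for illustration, the sample data is copied from the question, and it assumes GNU sed, since combining a number with g (as in g3) is a GNU extension.

```shell
# Sample input copied from the question (held in a variable instead of content.txt).
input=$(cat <<'EOF'
20/10/2022 19:00 test 1 99 435 18 1 more text
20/10/2022 20:00 test 2 97 123 1 81 more text2
20/10/2022 21:00 test 3 96 4 3 5567 more text3
20/10/2022 22:00 test 4 99 43 5181 more text4
EOF
)

# Step 1: keep the first 16 characters (date, space, time) as group 1 and the
# matched number, still containing its inner spaces, as group 2.
# Everything else on the line is dropped.
step1=$(printf '%s\n' "$input" |
  sed -E 's/^(.{16}).*( 9[45679]( *[0-9]){6}).*/\1\2/')

# Step 2: delete the third space and every later one. The first two spaces,
# which separate the date, time and number fields, are left alone.
result=$(printf '%s\n' "$step1" | sed 's/ //g3')

printf '%s\n' "$result"
```

After step 1 the first line reads "20/10/2022 19:00 99 435 18 1"; step 2 then squeezes the number together, so the script should print the same four lines as the output shown above.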

See: man sed



Answered By - Cyrus
Answer Checked By - Senaida (WPSolving Volunteer)