Issue
I have a csv file like this:
# 2022 5 2 8 1 24.8-17.1800 -66.3260 3.6 0.2 0.0 0.0 0.0 2
SOD6 2.20 1.00 P
SOD6 3.98 1.00 S
SOD5 3.21 1.00 P
SOD5 5.79 1.00 S
SOD0 4.07 1.00 P
SOD0 7.10 1.00 S
SOD3 6.47 1.00 P
SOD3 11.20 1.00 S
# 2022 5 3 0 10 16.8-17.3820 -65.6330 28.0 0.7 0.0 0.0 0.3 3
SOD2 6.24 1.00 P
SOD2 10.49 1.00 S
SOD9 7.66 1.00 P
SOD9 12.75 1.00 S
SOD1 10.34 1.00 P
SOD3 11.42 1.00 P
SOD3 21.11 1.00 S
# 2022 5 3 11 28 10.8-17.7600 -65.9840 6.6 0.7 0.0 0.0 0.1 4
SOD3 6.55 1.00 P
SOD2 6.89 1.00 P
SOD2 11.70 1.00 S
SOD9 8.82 1.00 P
SOD1 10.04 1.00 P
SOD1 17.60 1.00 S
I was trying to add a blank space at the 24th position of each header. This is the header:
# 2022 5 2 8 1 24.8-17.1800 -66.3260 3.6 0.2 0.0 0.0 0.0 2
so the header will look like:
# 2022 5 2 8 1 24.8 -17.1800 -66.3260 3.6 0.2 0.0 0.0 0.0 2
I tried the following code:
# To read the headers and to add a space on 24th place
# of each header, where 'phase.dat' is the csv file
grep '# 2022' phase.dat | sed 's/ ./&\s /24'
But it did not add the space at the desired position. Does anyone have an idea what I did wrong?
Stay safe and best regards, Tonino
Solution
Something like this.
sed 's/^\(# 2022[^-]*\)\(.*\)$/\1 \2/' phase.dat
If the headers are all you actually want to extract and edit, add -n and the p flag so that only the lines where the substitution happened are printed:
sed -n 's/^\(# 2022[^-]*\)\(.*\)$/\1 \2/p' phase.dat
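To see what the two commands do, here is a minimal sketch that runs them on a few lines copied from the question. The temporary directory and the variable names (tmpdir, fixed, headers) are only there for illustration; on the real data you would run sed on phase.dat directly.

```shell
# Build a small sample file (two headers, a few phase lines) in a temp directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/phase.dat" <<'EOF'
# 2022 5 2 8 1 24.8-17.1800 -66.3260 3.6 0.2 0.0 0.0 0.0 2
SOD6 2.20 1.00 P
SOD6 3.98 1.00 S
# 2022 5 3 0 10 16.8-17.3820 -65.6330 28.0 0.7 0.0 0.0 0.3 3
SOD2 6.24 1.00 P
EOF

# Whole file: headers get the extra space, all other lines pass through unchanged.
fixed=$(sed 's/^\(# 2022[^-]*\)\(.*\)$/\1 \2/' "$tmpdir/phase.dat")

# Headers only: -n suppresses the default output, p prints the substituted lines.
headers=$(sed -n 's/^\(# 2022[^-]*\)\(.*\)$/\1 \2/p' "$tmpdir/phase.dat")

printf '%s\n' "$headers"
# prints:
# # 2022 5 2 8 1 24.8 -17.1800 -66.3260 3.6 0.2 0.0 0.0 0.0 2
# # 2022 5 3 0 10 16.8 -17.3820 -65.6330 28.0 0.7 0.0 0.0 0.3 3

# Clean up the sample file.
rm -r "$tmpdir"
```

Note that this relies on the first - in each header being the sign of the latitude, which holds for the headers shown in the question.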
A quick breakdown of the sed code:
- ^ is what regex calls an anchor. It matches at the beginning of the line.
- \( \) delimit a capture group. Because sed uses BRE (Basic Regular Expressions) by default, each parenthesis must be escaped with a preceding \
- [ ] is what they call a bracket expression. A ^ as its first character negates it, so [^-] matches any single character EXCEPT a -
- * right after the [ ] is a quantifier. It means zero or more of the preceding item.
- So the first capture group matches # 2022 at the beginning of the line and everything after it up to, but not including, the first -
- \(.*\) is the second capture group. .* means zero or more of any character, so it captures the rest of the line, starting at the -
- $ is also an anchor. It matches at the end of the line.
- \1 and \2 refer to capture groups one and two, which is whatever matched inside each \( \). The replacement \1 \2 writes them back with a space in between.
- See man 7 regex for details.
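To make the breakdown concrete, the sketch below prints each capture group on its own for the first header from the question. The variable names (header, group1, group2) and the square brackets around the output are just for illustration.

```shell
# One header line from the question.
header='# 2022 5 2 8 1 24.8-17.1800 -66.3260 3.6 0.2 0.0 0.0 0.0 2'

# Same pattern as above, but keep only one capture group in the replacement.
group1=$(printf '%s\n' "$header" | sed 's/^\(# 2022[^-]*\)\(.*\)$/\1/')
group2=$(printf '%s\n' "$header" | sed 's/^\(# 2022[^-]*\)\(.*\)$/\2/')

# Brackets make the exact start and end of each group visible.
printf 'group 1: [%s]\n' "$group1"
printf 'group 2: [%s]\n' "$group2"
# prints:
# group 1: [# 2022 5 2 8 1 24.8]
# group 2: [-17.1800 -66.3260 3.6 0.2 0.0 0.0 0.0 2]
```

Group 1 stops right before the first - and group 2 starts at it, so writing \1 \2 puts the new space exactly between 24.8 and -17.1800.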
Answered By - Jetchisel Answer Checked By - Marilyn (WPSolving Volunteer)