Friday, December 17, 2021

[SOLVED] Replace a pattern range line by line with sed

Labels: awk, bsd, gnu, linux, sed

Issue

I need to replace the beginning of each line within a pattern range, where the range is selected by its ID. I am using sed, but other languages are welcome!

This is the sample text:

...
==OPEN==
  data: blabla
  id: class1
  moredata: blabla
==CLOSE==

==OPEN==
  id: class2
  boringdata: blabla
==CLOSE==

...extra info

==OPEN==
  id: class8
  data: ...
==CLOSE==

...more info

==OPEN==
  data: ...
  boringdata: ...
  id: class10
==CLOSE==
...

If I were to comment out the block with id class8, the expected output would be:

...
==OPEN==
  data: blabla
  id: class1
  moredata: blabla
==CLOSE==

==OPEN==
  id: class2
  boringdata: blabla
==CLOSE==

...extra info

// ==OPEN==
//   id: class8
//   data: ...
// ==CLOSE==

...more info

==OPEN==
  data: ...
  boringdata: ...
  id: class10
==CLOSE==
...

The closest I have gotten is this, but it forces me to rewrite the entire range, which is not practical:

sed -e '/==OPEN==/ {:loop; N; /==CLOSE==/! b loop; /id: class8/ {s/.*/NEEDS REWRITE/}}' /example

If I tell it to rewrite the beginning (^), it rewrites only the first line of the range. I think this is because it treats the entire range as one line.

Solution

From the question: "The closest I have gotten is this, but it forces me to rewrite the entire range, which is not practical:"

sed -e '/==OPEN==/ {:loop; N; /==CLOSE==/! b loop; /id: class8/ {s/.*/NEEDS REWRITE/}}' /example

That's actually not bad.

From the question: "If I tell it to rewrite the beginning (^), it rewrites only the first line of the range. I think this is because it treats the entire range as one line."

Yes, with POSIX sed, and with GNU sed by default, the ^ matches the beginning of the pattern space only. However, you can match the newlines themselves:
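This anchoring behaviour can be seen in isolation. The following is a minimal sketch with made-up two-line input (the variable name first_only is only for illustration): N joins both lines into one pattern space, and the substitution at ^ then applies once, even with the g flag.

```shell
# N appends the second input line to the pattern space, separated by a newline.
# Without the GNU m flag, ^ matches only at the very start of that pattern space.
first_only=$(printf 'first\nsecond\n' | sed 'N; s,^,// ,g')
printf '%s\n' "$first_only"
```

Only the line "first" gets the "// " prefix; "second" is printed unchanged.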

sed -e '/==OPEN==/ {:loop; N; /==CLOSE==/! b loop; /id: class8/ {s,\(^\|\n\),\1// ,g}}' \
  /example

Note in particular:

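The text being replaced is expressed as the group \(^\|\n\), meaning either the zero-length substring at the beginning of the pattern space or a newline. Be aware that the \| alternation operator is itself not specified by POSIX for basic regular expressions; GNU sed accepts it, but other sed implementations may not.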
The matched text is echoed back into the replacement via \1. That has no visible effect when it is the ^ alternative that matches, but it avoids eliminating newlines when the other alternative matches.
The comma (,) is used as pattern delimiter, so that the slashes (/) in the replacement text don't need to be escaped.
The g flag is used to cause all appearances of the pattern to be replaced, not just the first.
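For a sed that lacks \|, the same effect can be sketched with two substitutions instead of one: the first prefixes every line after an embedded newline, the second prefixes the start of the pattern space. This is an illustration rather than the answer's own command. The temporary file stands in for /example, the names sample and result are hypothetical, and the script is split across several -e fragments on the assumption that some implementations do not accept a semicolon after a label.

```shell
# Hypothetical sample file standing in for /example.
sample=$(mktemp)
cat > "$sample" <<'EOF'
==OPEN==
  id: class2
  boringdata: blabla
==CLOSE==

==OPEN==
  id: class8
  data: ...
==CLOSE==
EOF

# Collect each ==OPEN== .. ==CLOSE== block into the pattern space, then,
# if it contains the wanted id, prefix every line with "// ":
#   s,\n,&// ,g  re-emits each embedded newline (&) followed by the prefix
#   s,^,// ,     prefixes the first line of the block
result=$(sed -e '/==OPEN==/ {' -e ':loop' -e 'N' -e '/==CLOSE==/! b loop' \
  -e '/id: class8/ {' -e 's,\n,&// ,g' -e 's,^,// ,' -e '}' -e '}' "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

The class2 block is printed untouched and all four lines of the class8 block are commented out. Note that /id: class8/ is an unanchored match, so it would also select an id such as class80; likewise, selecting class1 this way would also match class10 from the sample text.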

If you are willing to rely on GNU extensions, then you can do it a bit more simply:

sed -e '/==OPEN==/ {:loop; N; /==CLOSE==/! b loop; /id: class8/ {s,^,// ,gm}}' \
  /example

With GNU sed, the m flag in the s command causes ^ to match both at the beginning of the pattern space and immediately after each newline within. This flag is not specified by POSIX.
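The effect of the m flag can be checked on its own with a small sketch. It assumes GNU sed is the sed on the PATH, and the three-line input and the name gnu_out are made up for illustration.

```shell
# Requires GNU sed: the m flag of the s command is a GNU extension.
# N; N pulls all three input lines into one pattern space.
gnu_out=$(printf 'first\nsecond\nthird\n' | sed 'N; N; s,^,// ,gm')
printf '%s\n' "$gnu_out"
```

Here every line receives the "// " prefix, because ^ now also matches right after each embedded newline.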

Answered By - John Bollinger

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
