Issue
I'm trying to clean up the syntax of a pseudo-JSON file. The file is too large to open in a text editor (20 GB), so I have to do all of this via the command line (running Arch Linux). The one thing I cannot figure out is how to replace newline characters in sed (GNU sed v. 4.8).
Specifically I have data of the form:
{
"id" : 1,
"value" : 2
}
{
"id" : 2,
"value" : 4
}
And I need to put a comma after each closing curly bracket (but not the last one). So I want the output to look like:
{
"id" : 1,
"value" : 2
},
{
"id" : 2,
"value" : 4
}
Ideally I'd just do this in sed, but from what I've read, sed flattens the text first, so it's not clear how to replace newline characters.
I'd like to run something like sed 's/}\n{/},\n{/g' test.json, but this doesn't work (nor does using \\n in place of \n).
I've also tried awk, but ran into a similar issue: I couldn't replace the combination of a hard return with brackets. I can get tr to replace the hard returns, but not the combination of characters.
Any thoughts on how to solve this?
Solution
Yeah, by default sed works line by line. You cannot match across multiple lines unless you use features that bring multiple lines into the pattern space. Here's one way to do it, provided the input strictly follows the sample shown:
sed '/}$/{N; s/}\n{/},\n{/}' ip.txt
- /}$/ matches } at the end of a line
- {} allows you to group commands to be executed for a particular address
- N will add the next line to the pattern space
- s/}\n{/},\n{/ performs the required substitution
- Use the -i option for in-place editing
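To see the pieces above working together, here is a small sketch that feeds the two-record sample from the question through the command. It assumes GNU sed (as in the question); the variable names input and output are only for illustration.

```shell
# Illustrative sample: the two records from the question.
input='{
"id" : 1,
"value" : 2
}
{
"id" : 2,
"value" : 4
}'

# On a line ending with }, append the next line (N) and, if that line
# is {, insert a comma after the closing bracket.
# Assumes GNU sed: \n in the replacement and N on the last line
# behave differently in some other sed implementations.
output=$(printf '%s\n' "$input" | sed '/}$/{N; s/}\n{/},\n{/}')
printf '%s\n' "$output"
```

Only the first } gets a comma; the final } has no following line, so GNU sed prints it unchanged and exits.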
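This solution can fail for a sequence like the one shown below, because N pulls the second } into the pattern space together with the first, so the second } is never checked against the { that follows it. I assume two lines ending with } will not occur in a row.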
}
}
{
abc
}
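Use sed '/}$/{N; s/}\n{/},\n{/; P; D}' if the above sequence can occur. Here P prints the pattern space up to the first newline, and D deletes that part and restarts the cycle with the remainder instead of reading a new line.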
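As a rough check of the difference, this sketch runs both commands on the problematic sequence. It again assumes GNU sed, and the variable names (input, plain, with_pd) are made up for the example.

```shell
# Illustrative input: two consecutive lines ending with }.
input='}
}
{
abc
}'

# Without P; D: the second } is consumed by N, so no comma is added.
plain=$(printf '%s\n' "$input" | sed '/}$/{N; s/}\n{/},\n{/}')

# With P; D: only the first line is printed and removed, and the
# remaining line is re-examined at the start of the next cycle.
with_pd=$(printf '%s\n' "$input" | sed '/}$/{N; s/}\n{/},\n{/; P; D}')

printf 'without P; D:\n%s\n\n' "$plain"
printf 'with P; D:\n%s\n' "$with_pd"
```

The first command leaves the input unchanged, while the second places a comma after the second }.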
Answered By - Sundeep