Issue
I have some delimited files with improperly placed newline characters in the middle of fields (not at line ends); they appear as ^M in Vim. They originate from freebcp (on CentOS 6) exports of an MSSQL database. Dumping the data in hex shows \r\n patterns:
$ xxd test.txt | grep 0d0a
0000190: 3932 3139 322d 3239 3836 0d0a 0d0a 7c43
I can remove them with awk, but am unable to do the same with sed.
This works in awk, removing the line breaks completely:
awk 'gsub(/\r/,""){printf "%s", $0; next} {print}'
But this in sed does not, leaving line feeds in place:
sed -i 's/\r//g'
and this appears to have no effect at all:
sed -i 's/\r\n//g'
Using ^M in the sed expression (ctrl+v, ctrl+m) also does not seem to work.
For this sort of task, sed is easier to grok, but I am working on learning more about both. Am I using sed improperly, or is there a limitation?
Solution
I believe some versions of sed will not recognize \r as a character. However, you can use a bash feature to work around that limitation:
echo "$string" | sed $'s/\r//'
Here, you let bash replace '\r' with the actual carriage return character inside the $'...' construct before passing that to sed as its command. (Assuming you use bash; other shells should have a similar construct.)
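As a minimal sketch of this workaround, assuming bash and a sed that accepts \n in a regular expression (GNU sed does): sed reads its input one line at a time and strips the trailing line feed before applying the script, which is why s/\r\n//g never matches and s/\r//g leaves the line feeds behind. To drop the whole \r\n pair, the next line has to be pulled into the pattern space with N first. The function names strip_cr and join_crlf and the sample data are hypothetical, chosen only for illustration.

```shell
#!/usr/bin/env bash

# Hypothetical helper: delete every carriage return, keep the line feeds.
strip_cr() {
  # bash expands \r inside $'...' to a literal CR before sed sees the script.
  sed $'s/\r//g'
}

# Hypothetical helper: delete each CR+LF pair, joining the broken lines.
join_crlf() {
  local cr=$'\r'
  # While the line ends in CR: append the next line (N), delete the
  # CR+LF pair that now sits inside the pattern space, and test again.
  sed -e ':a' -e "/${cr}\$/{" -e 'N' -e "s/${cr}\\n//" -e 'ba' -e '}'
}

# Sample data (made up): a field split by two stray CR+LF pairs.
printf '92192-2986\r\n\r\n|C\nnext row\n' | join_crlf
```

On the sample input, join_crlf should print 92192-2986|C on one line, followed by next row. A carriage return on the very last line of the input has no following line to join with, so this sketch does not handle that case.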
Answered By - chepner Answer Checked By - Terry (WPSolving Volunteer)