Monday, April 11, 2022

[SOLVED] Replace new line character between double quotes with space

Issue

I want to read data row by row and, wherever I find an opening double quote, replace every newline character with a space until the closing double quote is encountered. For example:

090033ec82b13639,CPDM Initiated,Logistical,"There corrected.",Gul Y Serbest,Urology
090033ec82ae0c07,Initiated,NA,"To   local testing
Rohit  3 to 4.",Julienne B Orr,Oncology
090033ec82b35fd0,Externally Initiated,NA,regulatory agency requests,Kenneth A Lord,Oncology

In the data above, the second row opens a double quote that is only closed on the third line, so these two lines need to be merged with a single space, as below:

090033ec82b13639,CPDM Initiated,Logistical,"There corrected.",Gul Y Serbest,Urology
090033ec82ae0c07,Initiated,NA,"To   local testing Rohit  3 to 4.",Julienne B Orr,Oncology
090033ec82b35fd0,Externally Initiated,NA,regulatory agency requests,Kenneth A Lord,Oncology

Solution

You can use this GNU awk (gawk) one-liner:

awk -v RS='"[^"]*"' -v ORS= '{gsub(/\n/, " ", RT); print $0  RT}' file
090033ec82b13639,CPDM Initiated,Logistical,"There corrected.",Gul Y Serbest,Urology
090033ec82ae0c07,Initiated,NA,"To   local testing Rohit  3 to 4.",Julienne B Orr,Oncology
090033ec82b35fd0,Externally Initiated,NA,regulatory agency requests,Kenneth A Lord,Oncology
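  • -v RS='"[^"]*"' - sets the input record separator to a regex that matches a complete double-quoted string
  • -v ORS= - sets the output record separator to the empty string, so nothing extra is printed between records
  • gsub(/\n/, " ", RT) - replaces every newline with a space in RT, the text that matched the input record separator (RT is specific to gawk)
  • print $0 RT - prints the text before the quoted string followed by the modified quoted string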
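
Because RT is only available in gawk, here is a rough line-based sketch for other awk implementations. It is not part of the original answer. It assumes double quotes are balanced across the file and are only used to delimit fields (a doubled "" escape inside a field keeps the count even), and the function name join_quoted_lines is purely illustrative.

```shell
# Hypothetical helper: join physical lines while a double quote is still open.
join_quoted_lines() {
  awk '
  {
    quotes = gsub(/"/, "&")              # count the double quotes on this line
    buf = (inq ? buf " " : "") $0        # continue the pending row or start a new one
    if (quotes % 2 == 1) inq = !inq      # an odd count opens or closes a quoted field
    if (!inq) print buf                  # row is complete once no quote is open
  }
  END { if (inq) print buf }             # unterminated quote: print what was collected
  ' "$@"
}
# usage: join_quoted_lines file
```

Unlike the RS-based one-liner, this reads the input one line at a time and only buffers the row that is currently being joined.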

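And here is a perl one-liner. It joins the first line break inside each quoted string, which is enough when a field spans two lines as in the sample: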

perl -0pe 's/"[^\n"]*"(*SKIP)(*F)|("[^"\n]*)\n([^"]*")/$1 $2/g' file
090033ec82b13639,CPDM Initiated,Logistical,"There corrected.",Gul Y Serbest,Urology
090033ec82ae0c07,Initiated,NA,"To   local testing Rohit  3 to 4.",Julienne B Orr,Oncology
090033ec82b35fd0,Externally Initiated,NA,regulatory agency requests,Kenneth A Lord,Oncology


Answered By - anubhava
Answer Checked By - Terry (WPSolving Volunteer)