Issue
I'm looking for a sed-based solution to search and replace strings in the last column of a CSV file, where the search patterns come from an array. The script below also matches the 3rd and 4th columns, which causes mismatches in the output.
I need help telling sed to look only at the last column.
file1.txt
QCQP
TXTT
QCQT
YYTH
file2.txt
TTYY
JPEK
QCQC
TTYE
Original output.csv
[Input]
String1
[Data]
ID,Name,Class,Context,Code
1,jack,6,QCQT,QCQP
2,john,5,QCQP,TXTT
3,jake,3,TTXX,QCQT
4,jone,3,TXTT,YYTH
Below is the script I used for this setup, but its sed command replaces every occurrence instead of looking only at the last comma-separated column.
#!/bin/bash
filein=file1.txt
fileout=file2.txt
pre=$(cat $filein)
post=$(cat $fileout)
prear=($pre)
postar=($post)
typeset -p prear postar
for (( i=0; i<${#prear[@]}; ++i )); do
sed -i -e 's/'"${prear[$i]}"'/'"${postar[$i]}"'/g' output.csv
done
Expected result
output.csv
[Input]
String1
[Data]
ID,Name,Class,Context,Code
1,jack,6,QCQT,TTYY
2,john,5,QCQP,JPEK
3,jake,3,TTXX,QCQC
4,jone,3,TXTT,TTYE
Using awk I was able to handle a similar case, but the command below works only with a single variable, and not with the comma separator; with an array it fails.
awk -F "," '{gsub(c,d,$(NF)); print}' c=$a d=$b file.txt
In addition, if I use awk or gawk for this, I need to pass the file names in as variables, because the input files (file1.txt, file2.txt) and the output .csv file will not always have the same names. I actually accept them as the first, second, and third arguments of the script and then read the contents from those variables.
For example, users can choose any file name as input. I'm not sure how to pass the arrays to awk/gawk:
#!/bin/bash
input1=$1
input2=$2
Output=$3
inp1=$(cat $input1)
inp2=$(cat $input2)
out=$(cat $Output)
inp1ar=($inp1)
inp2ar=($inp2)
outar=($out)
I would like to be able to pass the array variables so that gawk reads their contents:
gawk -i inplace '
.. some condition ..
' {inp1ar} {inp2ar} {outar}
Please advise
Thanks Jay
Solution
I'd use awk for this. With GNU awk:
gawk '
BEGIN {FS = OFS = ","}
ARGIND == 1 {f1[FNR] = $1; next}
ARGIND == 2 {map[f1[FNR]] = $1; next}
$NF in map {$NF = map[$NF]}
{print}
' file1.txt file2.txt original.csv
ID,Name,Class,Context,Code
1,jack,6,QCQT,TTYY
2,john,5,QCQP,JPEK
3,jake,3,TTXX,QCQC
4,jone,3,TXTT,TTYE
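If gawk is not available, the same mapping can be sketched with any POSIX awk. ARGIND is a gawk extension, so this variant (an alternative I am assuming fits your setup, not the gawk program above) joins the two lists with paste and uses the NR == FNR idiom to load the pairs before the csv is read. The scratch directory and the sample data are recreated here only so the snippet runs on its own, and it assumes the codes themselves contain no commas.

```shell
#!/bin/bash
# Recreate the sample data in a scratch directory (illustration only).
workdir=$(mktemp -d)
printf '%s\n' QCQP TXTT QCQT YYTH > "$workdir/file1.txt"
printf '%s\n' TTYY JPEK QCQC TTYE > "$workdir/file2.txt"
printf '%s\n' \
    'ID,Name,Class,Context,Code' \
    '1,jack,6,QCQT,QCQP' \
    '2,john,5,QCQP,TXTT' \
    '3,jake,3,TTXX,QCQT' \
    '4,jone,3,TXTT,YYTH' > "$workdir/original.csv"

# paste emits "old,new" pairs on stdin ("-"). NR == FNR is true only while
# awk reads that first input, so the pairs land in map[] before the csv.
result=$(paste -d, "$workdir/file1.txt" "$workdir/file2.txt" |
    awk '
        BEGIN {FS = OFS = ","}
        NR == FNR {map[$1] = $2; next}
        $NF in map {$NF = map[$NF]}   # touch only the last field
        {print}
    ' - "$workdir/original.csv")

printf '%s\n' "$result"
```

Because only $NF is assigned, a code that also appears in the Context column (QCQT in row 1, for instance) is left alone there.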
But with sed, you can dynamically build up a sed program from file1.txt and file2.txt, and then apply that to the original csv:
sed "$(paste -d " " file1.txt file2.txt | sed 's/^/s:,/; s/ /$:,/; s/$/:/')" original.csv
Execute that piece-by-piece to see how it all fits together.
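For reference, here is roughly what the piece-by-piece run looks like. The scratch directory and the variable names (pairs, program, row) are only for illustration; the sed expressions are the ones from the one-liner.

```shell
#!/bin/bash
# Recreate the two code lists in a scratch directory (illustration only).
workdir=$(mktemp -d)
printf '%s\n' QCQP TXTT QCQT YYTH > "$workdir/file1.txt"
printf '%s\n' TTYY JPEK QCQC TTYE > "$workdir/file2.txt"

# Step 1: pair each old code with its replacement, separated by a space.
pairs=$(paste -d " " "$workdir/file1.txt" "$workdir/file2.txt")
printf '%s\n' "$pairs"
# first line: QCQP TTYY

# Step 2: turn every pair into an s:,OLD$:,NEW: command. The leading comma
# and the trailing $ pin the match to the last comma-separated field.
program=$(printf '%s\n' "$pairs" | sed 's/^/s:,/; s/ /$:,/; s/$/:/')
printf '%s\n' "$program"
# first line: s:,QCQP$:,TTYY:

# Step 3: run the generated program against one sample row.
row=$(printf '%s\n' '2,john,5,QCQP,TXTT' | sed "$program")
printf '%s\n' "$row"
# 2,john,5,QCQP,JPEK
```

Note that this relies on the codes containing no colons and no regex metacharacters, since they are pasted into the sed program verbatim.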
To accommodate the updated csv file with "prefix" lines: (untested)
gawk '
BEGIN {FS = OFS = ","}
ARGIND == 1 {f1[FNR] = $1; next}
ARGIND == 2 {map[f1[FNR]] = $1; next}
BEGINFILE {start = 0; header = 1}
start {if (header) {header = 0} else if ($NF in map) {$NF = map[$NF]}}
{print}
$1 == "[Data]" {start = 1}
' file1.txt file2.txt original.csv
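The sed approach can be adapted to the prefix lines in a similar way. This is only a sketch: it assumes the [Data] marker sits on a line of its own somewhere after line 1, and it uses a bare b (branch to end of script) to skip every line up to and including that marker. Unlike the gawk version it does not treat the column header specially; the header simply never matches any of the generated patterns in the sample data.

```shell
#!/bin/bash
# Recreate the sample data, including the prefix lines (illustration only).
workdir=$(mktemp -d)
printf '%s\n' QCQP TXTT QCQT YYTH > "$workdir/file1.txt"
printf '%s\n' TTYY JPEK QCQC TTYE > "$workdir/file2.txt"
printf '%s\n' \
    '[Input]' \
    'String1' \
    '[Data]' \
    'ID,Name,Class,Context,Code' \
    '1,jack,6,QCQT,QCQP' \
    '2,john,5,QCQP,TXTT' \
    '3,jake,3,TTXX,QCQT' \
    '4,jone,3,TXTT,YYTH' > "$workdir/output.csv"

# Build one "s:,OLD$:,NEW:" command per pair, as in the one-liner.
program=$(paste -d " " "$workdir/file1.txt" "$workdir/file2.txt" |
    sed 's/^/s:,/; s/ /$:,/; s/$/:/')

# The first expression ends the script early for every line from line 1
# through the [Data] marker, so the substitutions only see later lines.
remapped=$(sed -e '1,/^\[Data]$/b' -e "$program" "$workdir/output.csv")
printf '%s\n' "$remapped"
```

In the sample the prefix lines contain no commas, so they would pass through unchanged even without the guard; it starts to matter once a prefix line happens to end in a comma followed by a mapped code.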
Given the skeleton of a script you have in your recent edit:
First, it is crucial to quote your variable expansions: cat "$input1". Failure to do that will result in the "I'm falling as argument or some other variable name" that you report.
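A small illustration of what goes wrong without the quotes. The file name with a space in it is a made-up example, and the default IFS is assumed:

```shell
#!/bin/bash
workdir=$(mktemp -d)
input1="$workdir/my codes.txt"      # hypothetical name containing a space
printf '%s\n' QCQP TXTT > "$input1"

# Unquoted: the shell splits the value at the space, so cat is handed two
# names that do not exist and fails.
unquoted_status=0
cat $input1 > /dev/null 2>&1 || unquoted_status=$?

# Quoted: the value reaches cat as a single argument.
quoted_content=$(cat "$input1")

printf 'unquoted exit status: %s\n' "$unquoted_status"
printf '%s\n' "$quoted_content"
```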
Next, there's no need to read the contents of the files in the bash part of the script: awk will do that.
#!/bin/bash
input1="$1"
input2="$2"
Output="$3"
gawk -i inplace '.. some condition ..' "$input1" "$input2" "$Output"
See how the variables are all (double) quoted everywhere?
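One caveat with -i inplace in this layout: as far as I understand the extension, it rewrites every file operand, and the two mapping files print nothing (their rules end in next), so they would likely end up empty. The sketch below sidesteps that by writing to a temporary file and copying it back over the csv only. The function name remap_last_column and the fileno counter are hypothetical, plain awk is used instead of gawk so that no GNU extensions are needed, and it assumes neither mapping file is empty.

```shell
#!/bin/bash
# Hypothetical helper: remap_last_column FROM_FILE TO_FILE CSV_FILE
remap_last_column() {
    local from="$1" to="$2" csv="$3" tmp
    tmp=$(mktemp) || return 1
    # FNR == 1 marks the start of each file; fileno is a portable stand-in
    # for gawk's ARGIND (assumes no input file is empty).
    awk '
        BEGIN {FS = OFS = ","}
        FNR == 1 {fileno++}
        fileno == 1 {f1[FNR] = $1; next}
        fileno == 2 {map[f1[FNR]] = $1; next}
        start {if (header) {header = 0} else if ($NF in map) {$NF = map[$NF]}}
        {print}
        $1 == "[Data]" {start = 1; header = 1}
    ' "$from" "$to" "$csv" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Copy back instead of mv so the csv keeps its permissions.
    cat "$tmp" > "$csv" && rm -f "$tmp"
}

# Demo on throwaway copies of the sample data. In the real script the call
# would be: remap_last_column "$1" "$2" "$3"
demo=$(mktemp -d)
printf '%s\n' QCQP TXTT QCQT YYTH > "$demo/file1.txt"
printf '%s\n' TTYY JPEK QCQC TTYE > "$demo/file2.txt"
printf '%s\n' \
    '[Input]' \
    'String1' \
    '[Data]' \
    'ID,Name,Class,Context,Code' \
    '1,jack,6,QCQT,QCQP' \
    '2,john,5,QCQP,TXTT' \
    '3,jake,3,TTXX,QCQT' \
    '4,jone,3,TXTT,YYTH' > "$demo/output.csv"

remap_last_column "$demo/file1.txt" "$demo/file2.txt" "$demo/output.csv"
cat "$demo/output.csv"
```

The mapping files are only read, so they stay intact, and the csv is replaced only if awk exits successfully.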
Answered By - glenn jackman Answer Checked By - Cary Denson (WPSolving Admin)