Issue
I am new to shell scripting and want to use awk to parse a file. I have a log file, log.txt, which looks something like this, where each record name (A, B, C) appears twice:
A,10.10.250.2,Compliant
A,10.10.250.2,Compliant
B,10.10.250.3,NonCompliant
B,10.10.250.3,Compliant
C,10.10.250.4,NonCompliant
C,10.10.250.4,NonCompliant
I want to merge the two records that share a record name, something like this:
A,10.10.250.2,Compliant,NA,Compliant
B,10.10.250.3,NonCompliant,Yes,Compliant
C,10.10.250.4,NonCompliant,No,NonCompliant
The 4th column is "NA" when both last-column values are Compliant, "Yes" when the first is NonCompliant and the second is Compliant, and "No" when both are NonCompliant. The 5th column is the last value of the second record.
I am trying something like this, which is not correct, and I need some help:
awk -F "," '{if ($1 == $1) print NR}' log.txt
Solution
How I have done it (and some awk details):
- awk reads one line at a time and processes it.
- To compare something with the next line, you can force your script to read the next input line with getline.
- if (condition) { something } else { something else } is used to check conditionals.
- && performs "and" between two conditions.
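To see what getline does to the fields, here is a minimal sketch on two made-up input lines. The sample data and the variable name "first" are only for illustration and are not part of the script below.

```shell
# Two made-up lines: after getline, $1 belongs to the second line,
# so the first line's value must be saved beforehand.
out=$(printf 'a,1\nb,2\n' | awk -F, '{ first = $1; getline; print first " then " $1 }')
echo "$out"
```

This should print "a then b": the saved value comes from the first line, while $1 now comes from the line that getline just read.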
That being said, create a file named "script.awk":
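BEGIN { FS="," }
{
    # save the first record's status before getline replaces the fields
    compornot = $3
    # read the second record of the pair; $1, $2, $3 now refer to it
    getline
    if (compornot == "Compliant" && $3 == "Compliant") {
        value = "NA"
    }
    else if (compornot == "NonCompliant" && $3 == "NonCompliant") {
        value = "No"
    }
    else {
        # the two statuses differ
        value = "Yes"
    }
    print $1 "," $2 "," compornot "," value "," $3
}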
- The BEGIN block sets the field separator to a comma.
- Then awk reads the first line.
- It keeps the third field in the variable "compornot". The 3rd field is "Compliant" or "NonCompliant".
- It reads the next input line using getline. From that point on, the numbered fields refer to that next input line.
- if checks whether the previous line's third field is "Compliant" or "NonCompliant", and checks the current line's third field for the same values.
- Depending on the values of the third fields, the variable "value" is set to "Yes", "No" or "NA".
- Finally, it prints the output. Remember that $1, $2 and $3 refer to the second line of input here, not the first anymore!
- And the process starts again with the 3rd line of input.
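One thing the walkthrough assumes is that every record really has a partner line. getline returns 1 when it reads a line and 0 at end of input, so a hedged variant can check that return value. The sketch below uses made-up data with an odd number of lines, and the "unpaired" marker is purely hypothetical.

```shell
# Three made-up lines: the last record (B) has no partner.
out=$(printf 'A,x,Compliant\nA,x,Compliant\nB,y,NonCompliant\n' | awk -F, '
{
    first = $3
    # getline returns 0 at end of input and leaves the fields unchanged
    if ((getline) <= 0) {
        print $1 ",unpaired"
        exit
    }
    print $1 "," first "," $3
}')
echo "$out"
```

Without the check, a trailing unpaired line would be compared with itself, because a failed getline leaves the current fields as they were.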
The sequence of processed lines is:
- first line, getline, second line
- third line, getline, fourth line
- fifth line, getline, sixth line
- ...
So only the odd-numbered lines are processed by the main block, and the even lines are explicitly read via getline.
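The pairing can be checked with awk's built-in NR (the current line number). This is a small sketch on six made-up lines; the variable name "before" is only for illustration.

```shell
# Print the line number seen by the main block and the one reached by getline.
out=$(printf '1\n2\n3\n4\n5\n6\n' | awk '{ before = NR; getline; print before "," NR }')
echo "$out"
```

Each output row should pair an odd line number with the following even one: 1,2 then 3,4 then 5,6.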
To use it, the command is:
awk -f script.awk log.txt
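For a full run, the sketch below writes the question's sample data and the script above into a scratch directory and runs the command there. The scratch directory and the variable names "dir" and "out" are assumptions made for the demo.

```shell
# Work in a scratch directory so no existing file is overwritten.
dir=$(mktemp -d)

# Sample data from the question.
cat > "$dir/log.txt" <<'EOF'
A,10.10.250.2,Compliant
A,10.10.250.2,Compliant
B,10.10.250.3,NonCompliant
B,10.10.250.3,Compliant
C,10.10.250.4,NonCompliant
C,10.10.250.4,NonCompliant
EOF

# The script from the answer.
cat > "$dir/script.awk" <<'EOF'
BEGIN { FS="," }
{
    compornot = $3
    getline
    if (compornot == "Compliant" && $3 == "Compliant") {
        value = "NA"
    }
    else if (compornot == "NonCompliant" && $3 == "NonCompliant") {
        value = "No"
    }
    else {
        value = "Yes"
    }
    print $1 "," $2 "," compornot "," value "," $3
}
EOF

out=$(awk -f "$dir/script.awk" "$dir/log.txt")
echo "$out"
```

With this data the run should print one merged line per record name:
A,10.10.250.2,Compliant,NA,Compliant
B,10.10.250.3,NonCompliant,Yes,Compliant
C,10.10.250.4,NonCompliant,No,NonCompliant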
Answered By - Nic3500 Answer Checked By - Pedro (WPSolving Volunteer)