Issue
I need to write a script that reads two files and prints the lines they have in common. Both files have the same number of lines, and each line contains only one word.
File 1:
Blue
Red
Orange
Green
Yellow
Blue
File 2:
Blue
Green
Red
Purple
Yellow
Blue
Expected output:
Blue
Yellow
Blue
In this example, Red and Green appear in both files, but they are not on the same line in each file, so they are ignored.
I have tried awk, grep and comm but couldn't get them to work.
I'm looking for the solution that takes the least time to process.
Solution
Using awk:
awk 'NR == FNR { lines[NR] = $0 } NR != FNR && lines[FNR] == $0 { print }' file1 file2
Explanation:
- When reading the first file (NR == FNR), build a mapping of line number to line content
- When reading the second file (NR != FNR), if the current line matches the line stored in the mapping under the same line number (FNR), print it
This reads both files exactly once, and uses roughly as much memory as the size of the first file.
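If keeping the first file in memory is a concern, a possible alternative is to pair the lines up with paste and compare the two columns in awk. This is only a sketch, not part of the original answer: it assumes, as the question states, that both files have the same number of lines and that no line contains a tab. The function name common_lines is made up for illustration.

```shell
#!/bin/sh
# Sketch: print lines that are identical at the same position in two files.
# common_lines is a hypothetical helper name, not from the original answer.
# Assumes equal line counts and no tab characters inside a line.
common_lines() {
    # paste joins line N of "$1" and line N of "$2" with a tab.
    # awk splits on that tab; appending "" forces a string comparison,
    # so numeric-looking lines such as 1 and 1.0 are not treated as equal.
    paste "$1" "$2" | awk -F '\t' '($1 "") == ($2 "") { print $1 }'
}

# Usage:
#   common_lines file1 file2
```

With the two sample files from the question, this prints Blue, Yellow and Blue, one per line. Both files are still read exactly once, but only one pair of lines is held at a time.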
Answered By - janos Answer Checked By - Timothy Miller (WPSolving Admin)