Issue
I have searched but found nothing so far. I want to list a directory, extract the unique items from each file, and then use each file and item pair to count how often the item occurs in that file.
First, list the directory:
ls
- file1.txt
- file2.txt
- file3.txt etc...
Second, extract the unique values from each file:
cat $file | awk '{print $8}' | sort | uniq
which should output numbers
- 83886096
- 1040187393
- 201326673 etc...
Third, use each unique number to grep the file it came from and count how many times it occurs:
cat $file | grep $output | wc -l
And somehow get a nice output with $file $output $count on each line.
Thank you ahead of time
I am assuming I will have to do something of this nature, but more complicated (since I can't get it to work):
FILE="$(ls -1)"
ls > list.txt
input=list.txt
while read line
do
OUTPUT=cat ${FILE} | awk '{print $8}' | sort | uniq
cat ${FILE} | grep ${OUTPUT} | wc -l
done < "$input"
When I run it, it seems to kind of work. I get the following output:
grep: 0652-033 Cannot open 83886096.
0
grep: 0652-033 Cannot open 83886096.
0
So it found the files and read them but could not do the count
Solution
Do not parse the output of ls. Instead, just loop through the files. This way you also avoid using intermediate files:
for file in *;
do
# things with "$file"
done
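As a minimal sketch of that loop in action: the script below builds a scratch directory with invented file names (including one with a space, which is exactly the kind of name that breaks ls parsing) and skips anything that is not a regular file. The directory layout and the names are assumptions for illustration only, and mktemp -d is assumed to be available on your system.

```shell
#!/bin/sh
# Scratch directory with made-up names, purely for illustration.
dir=$(mktemp -d)
printf 'a\n' > "$dir/file1.txt"
printf 'b\n' > "$dir/file 2.txt"   # a space in the name is handled fine
mkdir "$dir/subdir"                # not a regular file, so it is skipped

count=0
for file in "$dir"/*
do
    # Skip directories, and the literal pattern if nothing matched.
    [ -f "$file" ] || continue
    count=$((count + 1))
    printf 'regular file: %s\n' "${file##*/}"
done
echo "total: $count"

rm -rf "$dir"
```

Quoting "$file" everywhere is what keeps names with spaces in one piece.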
Then, you are saying:
OUTPUT=cat ${FILE} | awk '{print $8}' | sort | uniq
To start, storing the output of a command in a variable requires the syntax var=$(command). Otherwise, when you say var=command1 command2..., the shell runs command2 with var set to the literal string command1 in its environment, and nothing is captured. Then, cat file | awk '...' is equivalent to awk '...' file, so you can directly say OUTPUT=$(awk '{print $8}' "$file" | sort | uniq). awk can do all of this alone, but we will address this later.
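To make the difference concrete, here is a small sketch. The variable names greeting, marker and seen are made up for the demonstration. It shows that var=$(command) captures output, while var=word command only passes a temporary environment variable to that one command.

```shell
#!/bin/sh
# Command substitution: the variable receives the command's output.
greeting=$(echo hello)
echo "captured: $greeting"

# Without $( ), "marker=echo" is only a temporary environment assignment
# for the command that follows (here a child sh). The child sees the
# literal string "echo"; the current shell's variable stays unset.
unset marker
seen=$(marker=echo sh -c 'printf %s "$marker"')
echo "child saw: $seen"
echo "parent has: ${marker:-<unset>}"
```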
cat ${FILE} | grep ${OUTPUT} | wc -l
The same applies to cat here. Also, grep -c counts the matching lines by itself, so you can just say:
grep -c "$OUTPUT" "$file"
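One thing to keep in mind is that a plain grep pattern matches substrings, so a value can be counted on lines where it is only part of a longer number. A hedged sketch with invented sample data: -F treats the value as a fixed string, and -w requires a whole-word match (-w is widely supported but, as far as I know, not required by POSIX, so check your grep).

```shell
#!/bin/sh
# Invented sample: the second line holds a longer number that merely
# contains the value being counted.
sample=$(mktemp)
cat > "$sample" <<'EOF'
a b c d e f g 83886096 i
a b c d e f g 183886096 i
a b c d e f g 83886096 i
EOF

loose=$(grep -c 83886096 "$sample")          # substring match
strict=$(grep -c -w -F 83886096 "$sample")   # whole word, fixed string
echo "loose=$loose strict=$strict"

rm -f "$sample"
```

Note that -w still matches the value in any column, not only the 8th one; for a strict per-field count, awk is the better tool.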
All together, it would be:
for file in *;
do
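    # unique values of the 8th field in this file
    OUTPUT=$(awk '{print $8}' "$file" | sort | uniq)
    # unquoted on purpose: word splitting yields one value per iteration
    for value in $OUTPUT
    do
        # print the file, the value, and the number of lines containing it
        echo "$file $value $(grep -c -- "$value" "$file")"
    done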
done
But in fact awk alone can do it:
awk '{count[$8]++} ENDFILE {print FILENAME; for (f in count) print f, count[f]; delete count}' *
This loops through all the files in the current directory and counts the number of times a given 8th field appears in each one. Then it prints a summary for every file.
Note this is GNU awk specific since it uses ENDFILE.
See some sample input/output:
$ tail f*
==> f1 <==
field1 field2 field3 field4 field5 field6 field7 xfield8 field9
field1 field2 field3 field4 field5 field6 field7 yfield8 field9
field1 field2 field3 field4 field5 field6 field7 yfield8 field9
field1 field2 field3 field4 field5 field6 field7 zfield8 field9
==> f2 <==
field1 field2 field3 field4 field5 field6 field7 xfield8 field9
field1 field2 field3 field4 field5 field6 field7 yfield8 field9
field1 field2 field3 field4 field5 field6 field7 zfield8 field9
field1 field2 field3 field4 field5 field6 field7 zfield8 field9
==> f3 <==
field1 field2 field3 field4 field5 field6 field7 xfield8 field9
field1 field2 field3 field4 field5 field6 field7 xfield8 field9
field1 field2 field3 field4 field5 field6 field7 xfield8 field9
field1 field2 field3 field4 field5 field6 field7 yfield8 field9
field1 field2 field3 field4 field5 field6 field7 yfield8 field9
field1 field2 field3 field4 field5 field6 field7 zfield8 field9
$ awk '{count[$8]++} ENDFILE {print FILENAME; for (f in count) print f, count[f]; delete count}' f*
f1
xfield8 1
yfield8 2
zfield8 1
f2
xfield8 1
yfield8 1
zfield8 2
f3
xfield8 3
yfield8 2
zfield8 1
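Since ENDFILE is a gawk extension, here is a sketch for other awk implementations (the grep error format in the question looks like AIX, where gawk may not be installed, though that is only a guess). Instead of resetting the array per file, it keys the counter on the file name plus the 8th field and prints everything in END, one "file value count" line each. The sample data is a shortened, made-up version of the f1/f2 files above.

```shell
#!/bin/sh
# Shortened sample data modeled on the f1/f2 files shown above.
dir=$(mktemp -d)
printf '%s\n' '1 2 3 4 5 6 7 x 9' '1 2 3 4 5 6 7 y 9' '1 2 3 4 5 6 7 y 9' > "$dir/f1"
printf '%s\n' '1 2 3 4 5 6 7 x 9' '1 2 3 4 5 6 7 z 9' > "$dir/f2"

# Key the counter on (file name, 8th field) so no per-file reset is needed.
# for-in order is unspecified in awk, hence the final sort.
result=$(cd "$dir" && awk '
    { count[FILENAME SUBSEP $8]++ }
    END {
        for (key in count) {
            split(key, part, SUBSEP)
            print part[1], part[2], count[key]
        }
    }
' f1 f2 | LC_ALL=C sort)
echo "$result"

rm -rf "$dir"
```

The trade-off is that all counts stay in memory until the last file has been read, which should only matter for very large inputs.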
Answered By - fedorqui 'SO stop harming'