Issue
I'm new to shell scripting (the command line).
I usually type only single-line commands, but today I got different results from the command-line sort and from sorting in a text editor.
In short, I want to know why the command-line "sort" behaves differently from vim's ":sort".
Question and details of my situation:
I have a sample log (text) file like the one below.
// log.txt
2021-04-12 10:00:00 [USER1000] login
2021-04-12 10:01:00 [USER1100] login
2021-04-12 10:02:00 [USER1010] login
2021-04-12 10:03:00 [USER1000] logout
2021-04-12 10:04:00 [USER1000] login
2021-04-12 10:05:00 [USER2000] login
2021-04-12 10:06:00 [USER1000] logout
2021-04-12 10:07:00 [USER1100] logout
2021-04-12 10:08:00 [USER1000] login
...
I want to know who logged in, and how many times, in one day.
So I use cat, grep, sort, and uniq for this.
cat log.txt | grep "login" | grep -o "\[USER....\]" | sort | uniq -c | sort > login.txt
I thought it would return the right result, but the order was different from what I expected.
The steps below show what I expected.
- 1st, cat log.txt prints every line.
2021-04-12 10:00:00 [USER1000] login
2021-04-12 10:01:00 [USER1100] login
2021-04-12 10:02:00 [USER1010] login
2021-04-12 10:03:00 [USER1000] logout
2021-04-12 10:04:00 [USER1000] login
2021-04-12 10:05:00 [USER2000] login
2021-04-12 10:06:00 [USER1000] logout
2021-04-12 10:07:00 [USER1100] logout
2021-04-12 10:08:00 [USER1000] login
- 2nd, grep "login" keeps only the "login" lines.
2021-04-12 10:00:00 [USER1000] login
2021-04-12 10:01:00 [USER1100] login
2021-04-12 10:02:00 [USER1010] login
2021-04-12 10:04:00 [USER1000] login
2021-04-12 10:05:00 [USER2000] login
2021-04-12 10:08:00 [USER1000] login
- 3rd, to group per user, grep -o extracts only the user tag.
[USER1000]
[USER1100]
[USER1010]
[USER1000]
[USER2000]
[USER1000]
- 4th, to prepare for uniq -c, sort orders all of the "login" entries.
[USER1000]
[USER1000]
[USER1000]
[USER1010]
[USER1100]
[USER2000]
- 5th, uniq -c counts each group of identical lines.
3 [USER1000]
1 [USER1010]
1 [USER1100]
1 [USER2000]
- 6th, sort once more to find out who logged in the most.
For this step I attach real-case output, which is not related to the sample above.
1 [USER1000]
11 [USER1001]
2 [USER1002]
237 [USER1003]
4 [USER1005]
It looks like it is sorted as text, not as numbers.
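The difference between the two orderings can be reproduced with a small sketch. It reuses the five counts shown above; LC_ALL=C is an addition of mine so that the text comparison is byte by byte and gives the same result on any machine.

```shell
# Counts copied from the real-case output above.
counts='1 [USER1000]
11 [USER1001]
2 [USER1002]
237 [USER1003]
4 [USER1005]'

# Plain sort compares lines as text, character by character,
# so "11" lands before "2" because '1' sorts before '2'.
# LC_ALL=C (an assumption for reproducibility) forces byte-wise comparison.
text_sorted=$(printf '%s\n' "$counts" | LC_ALL=C sort)

# sort -n reads the leading number and compares it by value.
numeric_sorted=$(printf '%s\n' "$counts" | sort -n)

printf 'text order:\n%s\n\nnumeric order:\n%s\n' "$text_sorted" "$numeric_sorted"
```

The text order stays 1, 11, 2, 237, 4, while the numeric order becomes 1, 2, 4, 11, 237.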
So I changed the bash command:
# cat log.txt | grep "login" | grep -o "\[USER....\]" | sort | uniq -c | sort > login.txt
cat log.txt | grep "login" | grep -o "\[USER....\]" | sort | uniq -c > login.txt
vim login.txt
# in vim, :sort returns the result I want (sorted by number)
That works, but I just want to know why the results differ.
Can I solve this with the command-line sort too?
I am adding a picture of my test code because of the comments.
When I wrote this question, I got the right result with vim's :sort command.
All test results are below.
sort -n gives me the result I want.
Solution
In short, I want to know why the command-line "sort" behaves differently from vim's ":sort".
Vim's :sort command relies on the sort function of a library used by vim. It probably behaves like a numerical sort by default in your setup, as the :help sort text suggests:
The details about sorting depend on the library function used. There
is no guarantee that sorting obeys the current locale. You will have
to try it out. Vim does do a "stable" sort.
The sorting can be interrupted, but if you interrupt it too late in
the process you may end up with duplicated lines. This also depends
on the system library function used.
From within vim, you can run the OS sort command with :%!sort to get the same sort order as the OS command.
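One possible reason the two plain sorts disagree, offered as an assumption rather than something confirmed for this setup: uniq -c usually right-aligns its counts with leading spaces, and a byte-wise comparison puts a space before any digit. The sketch below pads the counts to width 7 (the width is my assumption) and sorts them byte by byte with LC_ALL=C, without -n.

```shell
# Counts right-aligned with leading spaces, in the style uniq -c commonly uses.
# The width of 7 is an assumption; implementations differ.
padded=$(printf '%7d %s\n' 11 '[USER1001]' 2 '[USER1002]' 237 '[USER1003]' 1 '[USER1000]')

# Byte-wise comparison: a space sorts before every digit, so shorter
# numbers (more padding) come first and the result looks numeric.
bytewise=$(printf '%s\n' "$padded" | LC_ALL=C sort)
printf '%s\n' "$bytewise"

# The same input under the current locale; the order may or may not match,
# depending on how that locale's collation treats whitespace.
printf '%s\n' "$padded" | sort
```

If the two outputs differ on your machine, the locale's collation rules are the likely cause, and sort -n avoids the question entirely.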
To sort numerically with the OS command, use the -n option:
cat log.txt | grep "login" | grep -o "\[USER....\]" | sort -n | uniq -c | sort -n
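As a minimal end-to-end sketch, the pipeline can be tried on the sample lines from the question. The temporary file from mktemp and the choice of sort -rn (numeric, highest count first) are my assumptions; the first sort only needs to group identical lines, so it is left as a plain sort here.

```shell
# Sample data copied from the log.txt excerpt in the question.
log=$(mktemp)
cat > "$log" <<'EOF'
2021-04-12 10:00:00 [USER1000] login
2021-04-12 10:01:00 [USER1100] login
2021-04-12 10:02:00 [USER1010] login
2021-04-12 10:03:00 [USER1000] logout
2021-04-12 10:04:00 [USER1000] login
2021-04-12 10:05:00 [USER2000] login
2021-04-12 10:06:00 [USER1000] logout
2021-04-12 10:07:00 [USER1100] logout
2021-04-12 10:08:00 [USER1000] login
EOF

# grep reads the file directly, so the leading cat is optional.
# sort groups identical user tags, uniq -c counts each group,
# and sort -rn orders the counts numerically, largest first.
result=$(grep "login" "$log" | grep -o "\[USER....\]" | sort | uniq -c | sort -rn)
printf '%s\n' "$result"

rm -f "$log"
```

With this sample, [USER1000] comes first with a count of 3, followed by the three users who logged in once.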
Answered By - Zilog80