Re: [Question] Querying access.log with awk - DarkKiller - PTT (批踢踢实业坊)


Poster: DarkKiller (System hacked) 2019-09-20 17:33:10

※ Quoting angle065 (Fu):
: Hi everyone, I have a question. I want to list the unique IPs in access.log directly,
: and I found that this command does it:
: awk '{tmp[$1]} END {for (i in tmp) print i}' access.log
: There is one part I don't quite understand and would like to ask about.
That is a bit of a shortcut. If it is hard to read, you can write:
awk '{tmp[$1] = 1} END {for (i in tmp) print i}' access.log
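That should be easier to follow. For comparison, run:
awk '{tmp[$1]} END {for (i in tmp) print tmp[i]}' access.log
and you will see nothing but blank lines: merely referencing tmp[$1] creates the element, but its value is empty. With the tmp[$1] = 1 version, the same loop prints all 1s.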
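If you want to try this without a real log, here is a small sketch on a made-up three-line sample (the IPs, the log lines, and the temp file are assumptions for illustration only). It prints the stored values in brackets so the empty ones are visible.

```shell
# Hypothetical sample log: two distinct client IPs, one of them repeated.
log=$(mktemp)
printf '%s\n' \
  '10.0.0.1 - - "GET / HTTP/1.1" 200' \
  '10.0.0.2 - - "GET /a HTTP/1.1" 200' \
  '10.0.0.1 - - "GET /b HTTP/1.1" 404' > "$log"

# Referencing tmp[$1] alone creates the key; its value is the empty string.
ref_values=$(awk '{tmp[$1]} END {for (i in tmp) print "[" tmp[i] "]"}' "$log")

# Assigning explicitly stores 1 under each key.
set_values=$(awk '{tmp[$1] = 1} END {for (i in tmp) print "[" tmp[i] "]"}' "$log")

# Either way the keys are the distinct IPs (for-in order is unspecified, so sort).
keys=$(awk '{tmp[$1]} END {for (i in tmp) print i}' "$log" | sort)

echo "$ref_values"
echo "$set_values"
echo "$keys"
rm -f "$log"
```

Both versions give the same set of keys, which is why the shortcut works for listing unique IPs.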
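: As for the {tmp[$1]} part, my understanding is that it writes the first field of every line
: into the array variable tmp, and then the for loop prints the distinct values.
: It does indeed give me all the unique IPs.
: Is this understanding correct?
: Also, which language's way of writing to an array is tmp[$1]?
: I know a little PHP and JS, where this form is usually used to read a value from an array/object by index;
: it shouldn't be writing to the array.
Implementations differ, but because of the POSIX standard, every implementation has to provide the features the standard defines: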
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html
The GNU awk manual puts it this way:
All arrays in AWK are associative, i.e., indexed by string values.
while the awk manual on FreeBSD says:
Array subscripts may be any string, not necessarily numeric; this allows
for a form of associative memory. Multiple subscripts such as [i,j,k]
are permitted; the constituents are concatenated, separated by the value
of SUBSEP (see the section on variables below).
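Other implementations describe it in their own way; checking the man page or searching online usually turns it up.
As an aside, for your problem my own habit is one of: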
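As a rough illustration of the SUBSEP behavior quoted above, the sketch below counts (IP, status) pairs with a two-part subscript and then splits the key back apart. The sample lines and the field positions ($1 as the IP, the last field as the status) are assumptions, not the layout of any particular access.log.

```shell
# Count (IP, status) pairs; awk joins the two subscripts with SUBSEP.
# The input lines here are made up for illustration.
pairs=$(printf '%s\n' \
  '10.0.0.1 GET / 200' \
  '10.0.0.1 GET /x 404' \
  '10.0.0.1 GET /y 200' |
awk '{n[$1, $NF]++}
     END {
       for (k in n) {
         split(k, part, SUBSEP)      # recover the two components of the key
         print part[1], part[2], n[k]
       }
     }' | sort)
echo "$pairs"
```

The trailing sort is only there because for-in iteration order is not specified.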
cut -d ' ' -f 1 access.log | sort -u
awk '{print $1}' access.log | sort -u
If you only want to see which IPs make the most requests:
awk '{print $1}' access.log | sort | uniq -c | sort -n | tail
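The sort | uniq -c step can also be done inside awk with the same associative array idea, which avoids sorting the whole log first. This is only a sketch: top_ips is a hypothetical helper name, and its output is not padded the way uniq -c pads its counts.

```shell
# top_ips FILE: count requests per IP inside awk, then show the ten largest.
# top_ips is a hypothetical helper, not something from the original post.
top_ips() {
  awk '{count[$1]++} END {for (ip in count) print count[ip], ip}' "$1" |
    sort -n | tail
}
```

Usage would be something like top_ips access.log; the biggest counts end up at the bottom, as with the pipeline above.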
And to see which bastard is hammering the server right now:
while true; do clear; date; tail -n 10000 access.log | awk '{print $1}' | sort | uniq -c | sort -n | tail; sleep 1; done
Just use whichever commands you are comfortable with; there are plenty of ways to do this...

Author: hijkxyzuw (i,j,k) ×(x,y,z) 2019-09-21 14:22:00

I was going to suggest watch, but that pipeline is too long, and it has quotes in it too.
