grok {
match => { "message" => '"%{IPORHOST:clientip}"\s"(?:%{IPORHOST:http_x_forwarded_for}|-)"\s"%{USER:auth}"\s"%{HTTPDATE:timestamp}"\s"%{WORD:verb} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion}"\s"%{NUMBER:request_time}"\s"%{INT:status}"\s"%{INT:body_bytes_sent}"\s"(?:%{HOSTPORT:upstream}|-)"\s"(?:%{NUMBER:upstream_response_time}|-)"\s"%{DATA:referer}"\s"%{DATA:agent}"' }
}
Grok configuration
log_format log_mvw '"$nuser_count"\t"$user_id"\t"$remote_addr"\t"$http_x_forwarded_for"\t"$remote_user"\t"$time_local"\t"$request"\t"$request_time"\t"$status"\t"$body_bytes_sent"\t"$upstream_addr"\t"$upstream_response_time"\t"$http_referer"\t"$http_user_agent"';
nginx log format; the first two fields are custom ones I defined myself.
3 replies
yj7778826 - long-suffering junior ops engineer
yang4210
08-Feb-2020 22:38:09.525 client @0x7f8d340563e0 192.168.1.48#56452 (www.baidu.com): query: www.baidu.com IN A + (192.168.1.55)
(?<date_>.*) client @[^ ]* (?<ip1>.*)#\d* \((?<domain>.*)\): query: ([^ ]*) [A-Z]* [A-Z]* \+ \((?<host>.*)\)
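Strip out the parts marked in red (presumably the ?<name> group labels) and what remains is a standard Python regex. The most common way to build a regex in Python is to replace each variable part of the line with .* and leave the fixed text as it is.
\d matches a digit
[A-Z] matches an uppercase letter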
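To try the pattern above outside Logstash, here is a minimal Python sketch run against the sample log line. One assumption to flag: Python's re module writes named groups as (?P<name>...), so the (?<name>...) labels are rewritten in that form rather than dropped. The function name parse_query_log is made up for illustration.

```python
import re

# Sample query log line quoted in the reply above.
LINE = (
    "08-Feb-2020 22:38:09.525 client @0x7f8d340563e0 192.168.1.48#56452 "
    "(www.baidu.com): query: www.baidu.com IN A + (192.168.1.55)"
)

# Same regex as in the reply, with (?<name>...) rewritten as (?P<name>...),
# which is the named-group syntax Python's re module accepts.
PATTERN = re.compile(
    r"(?P<date_>.*) client @[^ ]* (?P<ip1>.*)#\d* "
    r"\((?P<domain>.*)\): query: ([^ ]*) [A-Z]* [A-Z]* \+ \((?P<host>.*)\)"
)


def parse_query_log(line):
    """Return the named fields as a dict, or None if the line does not match."""
    m = PATTERN.match(line)
    return m.groupdict() if m else None


if __name__ == "__main__":
    print(parse_query_log(LINE))
```

The fixed text (" client @", "): query: " and so on) is what anchors each .* here, so the greedy wildcards still stop in the right place on this line.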
For example, the standard regex for an IP address:
^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$
But there is no need to write anything that complicated here.
At most, \d+\.\d+\.\d+\.\d+ is enough, which simply means: digits.digits.digits.digits
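To make the trade-off concrete, here is a small Python sketch comparing the two patterns. It assumes the strict pattern is meant to start with ^(, and it adds ^ and $ to the loose one so both are tested against a whole string. The helper names are made up for illustration.

```python
import re

# One octet in the range 0-255, as in the strict pattern.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"

# Strict form: four range-checked octets separated by dots.
STRICT_IP = re.compile(r"^(?:" + OCTET + r"\.){3}" + OCTET + r"$")

# Loose form: four runs of digits separated by dots, no range check.
LOOSE_IP = re.compile(r"^\d+\.\d+\.\d+\.\d+$")


def is_strict_ip(text):
    """True if text is a dotted quad with every octet in 0-255."""
    return STRICT_IP.match(text) is not None


def is_loose_ip(text):
    """True if text merely looks like digits.digits.digits.digits."""
    return LOOSE_IP.match(text) is not None


if __name__ == "__main__":
    for candidate in ("192.168.1.48", "999.1.1.1", "a.b.c.d"):
        print(candidate, is_strict_ip(candidate), is_loose_ip(candidate))
```

The loose form also accepts impossible values such as 999.1.1.1. That is usually fine when the input is a server's own log, where the field is already known to hold an address.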
dyx - born in the 2020s, looking for a mentor