
Logstash string conversion

Anonymous | Posted on June 7, 2018 | Views: 11164

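This part of my Logstash setup still needs tuning:
request_time, upstream_response_time and similar fields need to be float or double,
and status needs to be int. At the moment they all default to string,
so Kibana cannot show the status distribution, latency distribution and so on with a histogram or an average.

The logs are plain text, and the Logstash pattern currently looks like this: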
"message" =>"%{IPV4:remote_addr} - (%{USERNAME:user}|-) \[%{HTTPDATE:nginx_timestamp}\] %{NUMBER:http_status_code} %{BASE10NUM:bytes_sent:int} %{BASE16FLOAT:request_time} (%{BASE16FLOAT:upstream_response_time}|-) %{WORD:scheme} %{HOSTNAME:domain} \"%{WORD:verb} (?<request>\S+) HTTP/%{NUMBER:httpversion}\" \"(?<http_referer>\S+)\" \"(?<http_user_agent>(\S+\s+)*\S+)\" \"(?<http_x_forwarded_for>(\S+\s+)*\S+)\" \"(?<http_pinpoint_traceid>\S+)\""
 
 

auroracxy

Upvoted by: fantaigan duffy caogx

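Here is a config I already have in use; adapt it as needed.
On the nginx side, log JSON directly, which saves a lot of trouble. Note that it includes request_body, so the Logstash config further down has some escaping rules; if you don't need request_body, just delete it.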
    log_format json '{"@timestamp":"$time_iso8601",'
                    '"server_addr":"$server_addr",'
                    '"remote_addr":"$remote_addr",'
                    '"cookie_JSESSIONID":"$cookie_JSESSIONID",'
                    '"body_bytes_sent":$body_bytes_sent,'
                    '"request_uri":"$request_uri",'
                    '"request_method":"$request_method",'
                    '"server_protocol":"$server_protocol",'
                    '"scheme":"$scheme",'
                    '"request_time":$request_time,'
                    '"upstream_response_time":"$upstream_response_time",'
                    '"upstream_addr":"$upstream_addr",'
                    '"host":"$host",'
                    '"hostname":"$hostname",'
                    '"http_host":"$http_host",'
                    '"uri":"$uri",'
                    '"http_x_forwarded_for":"$http_x_forwarded_for",'
                    '"http_referer":"$http_referer",'
                    '"http_user_agent":"$http_user_agent",'
                    '"X-Forwarded-Proto":"$http_x_forwarded_proto",'
                    '"request_body":"$request_body",'
                    '"status":"$status"}';

access_log /var/log/nginx/access.json.log json;

 
Filebeat configuration:
filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/nginx/access.json.log
  document_type: nginx_access_log
  ignore_older: 0
  tail_files: true
  symlinks: true
  close_removed: true
  clean_removed: true

output.logstash:
  # Boolean flag to enable or disable the output module.
  enabled: true

  # The Logstash hosts
  hosts: ["log-collection.internal.xxx.com:5517"]




Logstash configuration:
input {
  beats {
    port => 5517
  }
}
filter {
  if [type] == "nginx_access_log" {
    # Escape the \x sequences nginx writes so the line parses as JSON
    mutate {
      gsub => ["message", "\\x", "\\\x"]
    }
    json {
      source => "message"
    }
    mutate {
      remove_field => [ "message" ]
    }
    # Drop HEAD requests and traffic from selected IPs (x.x.x.x are placeholders)
    if "HEAD" in [request_method] or "x.x.x.x" in [remote_addr] or "x.x.x.x" in [http_x_forwarded_for] {
      drop {}
    }

    useragent {
      source => "http_user_agent"
      target => "ua"
    }

    # nginx logs "-" when there is no upstream response time; store 0 instead
    if "-" in [upstream_response_time] {
      mutate {
        replace => {
          "upstream_response_time" => "0"
        }
      }
    }

    mutate {
      convert => [ "upstream_response_time", "float" ]
    }

    # With several addresses in X-Forwarded-For, use the first one for GeoIP
    if "," in [http_x_forwarded_for] {
      mutate {
        add_field => { "[@metadata][user_ip_list]" => "%{http_x_forwarded_for}" }
      }
      mutate {
        split => { "[@metadata][user_ip_list]" => ", " }
        add_field => { "[@metadata][user_ip]" => "%{[@metadata][user_ip_list][0]}" }
      }
      geoip {
        source => "[@metadata][user_ip]"
        database => "/logstash/config/GeoLite2-City.mmdb"
        target => "geoip"
      }
    } else {
      geoip {
        source => "http_x_forwarded_for"
        database => "/logstash/config/GeoLite2-City.mmdb"
        target => "geoip"
      }
    }
    # Restore quotes and newlines in request_body
    mutate {
      gsub => [
        "request_body", "\\x22", '"'
      ]
      gsub => [
        "request_body", "\\x0A", "\n"
      ]
    }
  }
}

output {
  if [type] == "nginx_access_log" {
    elasticsearch {
      hosts => ["",""]
      index => "nginxlog-%{+YYYY}"
      template => "/home/test/template_nginxlog.json"
      template_name => "nginxlog"
      template_overwrite => true
    }
  }
}

Finally, the template:
 
 

zqc0512 - andy zhou

Upvoted by:

You can define the format yourself.
Look at your log format first; if it doesn't match, modify the filter.
Field types can be added in the mapping.
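
As a sketch of what modifying the filter could look like: grok accepts an optional `:int` or `:float` suffix on a capture, so the numeric fields can be typed where they are parsed. The fragment below reuses field names from the pattern in the question and covers only the numeric part of the line; it is an illustration under that assumption, not a tested replacement for the full pattern.

```
filter {
  grok {
    match => {
      # Only the numeric columns are shown; in practice the :int / :float
      # suffixes would be added to the corresponding captures in the full pattern.
      "message" => "%{NUMBER:http_status_code:int} %{NUMBER:bytes_sent:int} %{NUMBER:request_time:float} (%{NUMBER:upstream_response_time:float}|-)"
    }
  }
}
```

Keep in mind that an existing index where these fields are already mapped as strings keeps that mapping, so the new types only show up in a newly created index.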

fantaigan - JAVA

Upvoted by:

filter {
  mutate{
    convert => ["request_time","float"]
  }  
}
 
How about adding a filter?
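
Expanding on that idea, here is a minimal sketch using the hash form of `convert`, which takes several fields at once. The field names are taken from the grok pattern in the question and are an assumption about what your events contain; adjust them as needed.

```
filter {
  mutate {
    # Fields that are missing from an event are simply skipped.
    convert => {
      "request_time"           => "float"
      "upstream_response_time" => "float"
      "http_status_code"       => "integer"
    }
  }
}
```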

caogx - IT

Upvoted by:

Thanks!
