Converting data types with convert_datatype after splitting logs with the Logstash dissect plugin

Logstash | Author: haohao | Posted on 2017-08-06 | Views: 7482

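Hi everyone,
    I am using Logstash's dissect plugin to split nginx log lines into fields, and then convert_datatype to convert the field data types. The problem is that request_time and upstream_response_time are not always floating-point values; sometimes the log contains the character '-' instead. When convert_datatype tries to convert these two fields to float, the conversion fails and the fields keep their original string value, '-'. What I want is for these two fields to get a default numeric value, 0.00, when the conversion fails because the value is '-', rather than the string '-'.
Below are the warnings Logstash logs when it runs (the configuration file and a sample log line follow):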
[2017-08-05T22:28:29,551][WARN ][logstash.filters.dissect ] Dissector datatype conversion, value cannot be coerced, key: request_time, value: -
[2017-08-05T22:28:29,551][WARN ][logstash.filters.dissect ] Dissector datatype conversion, value cannot be coerced, key: upstream_response_time, value: -

This is my nginx log format:
$time_local|server_ip|$request | $status | $remote_user | $remote_addr | $http_user_agent | $http_referer | $host | $bytes_sent|$request_time|$upstream_response_time|$upstream_addr|$connection|$connection_requests|$uuid

Sample log line (note that the request_time and upstream_response_time fields are '-', not numeric values):
05/Aug/2017:22:22:33 -0700|54.153.101.30|GET /listing/detail/1864252/MO/St-Louis/www HTTP/1.1|200|-|5.255.250.132|Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)|-|env1-2.chime.me|76627|-|-|127.0.0.1:10300|4483145|2|38873227-aa0e-412b-b93f-13a7d0e26fb7

 
Logstash configuration file:
input {
  beats {
    port => 5044
  }
}

filter {
  ruby {
    code => "
      event.timestamp.time.localtime
      tstamp = event.get('@timestamp').to_i
      Time.at(tstamp).strftime('%Y-%m-%d')
    "
  }

  dissect {
    mapping => {
      "message" => "%{time_local}|%{server_ip}|%{request}|%{status_code}|%{remote_user}|%{remote_addr}|%{http_user_agent}|%{http_referer}|%{host}|%{bytes_sent}|%{request_time}|%{upstream_response_time}|%{upstream_addr}|%{connection}|%{connection_requests}|%{uuid}"
    }
    convert_datatype => {
      status_code => "int"
      bytes_sent => "int"
      request_time => "float"
      upstream_response_time => "float"
    }
  }
}

output {
  if [business] == "nginx" and [type] == "access" {
    stdout { codec => rubydebug }
  }
}


Running Logstash with the configuration above produces the following result:
{
    "request" => "GET /listing/detail/1864252/MO/St-Louis/www HTTP/1.1",
    "status_code" => 200,
    "upstream_addr" => "127.0.0.1:10300",
    "connection_requests" => "2",
    "source" => "/home/ec2-user/nginx/logs/access.log",
    "type" => "access",
    "uuid" => "38873227-aa0e-412b-b93f-13a7d0e26fb7",
    "http_user_agent" => "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)",
    "remote_user" => "-",
    "request_time" => "-",
    "@version" => "1",
    "beat" => {
        "hostname" => "awsuw7-50.opi.com",
        "name" => "awsuw7-50.opi.com",
        "version" => "5.5.1"
    },
    "host" => "env1-2.chime.me",
    "server_ip" => "54.153.101.30",
    "connection" => "4483145",
    "remote_addr" => "5.255.250.132",
    "offset" => 270,
    "business" => "nginx",
    "input_type" => "log",
    "time_local" => "05/Aug/2017:22:22:33 -0700",
    "message" => "05/Aug/2017:22:22:33 -0700|54.153.101.30|GET /listing/detail/1864252/MO/St-Louis/www HTTP/1.1|200|-|5.255.250.132|Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)|-|env1-2.chime.me|76627|-|-|127.0.0.1:10300|4483145|2|38873227-aa0e-412b-b93f-13a7d0e26fb7",
    "bytes_sent" => 76627,
    "tags" => [
        [0] "beats_input_codec_plain_applied",
        [1] "_dataconversionuncoercible_request_time_float",
        [2] "_dataconversionuncoercible_upstream_response_time_float"
    ],
    "@timestamp" => 2017-08-06T05:28:24.508Z,
    "http_referer" => "-",
    "upstream_response_time" => "-"
}


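As you can see, request_time and upstream_response_time still hold the original '-' string. That is not what I want: when the conversion fails I would like a default numeric value such as 0.00, not a string.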
 
xinfanwang

Do the conversion with a filter.
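
The reply does not say which filter is meant. One possible reading, sketched here as an assumption rather than as the answerer's actual suggestion, is a ruby filter placed after the dissect block that tries to parse each field as a float and falls back to 0.0 when that fails. The field names come from the question; the fallback value 0.0 is the default the asker wanted.

```conf
filter {
  # Assumed to run after the dissect block from the question.
  ruby {
    code => "
      ['request_time', 'upstream_response_time'].each do |field|
        begin
          # Float() raises ArgumentError for '-' and TypeError for nil
          event.set(field, Float(event.get(field)))
        rescue ArgumentError, TypeError
          # fall back to the default value asked for in the question
          event.set(field, 0.0)
        end
      end
    "
  }
}
```

If the two fields are also left in dissect's convert_datatype, the warnings and the _dataconversionuncoercible_* tags would presumably still appear, so it may make sense to remove them from convert_datatype when using this approach.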

kennywu76 - Wood

You can use the gsub option of the mutate plugin to do the conversion:
https://www.elastic.co/guide/e ... -gsub
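
A minimal sketch of how this suggestion might look, assuming the pattern "^-$" and the replacement "0" (neither is given in the reply). The two time fields are dropped from dissect's convert_datatype, gsub rewrites a lone '-' to "0", and a second mutate block converts the fields to float. Two separate mutate blocks are used because, as far as I know, convert runs before gsub inside a single mutate block.

```conf
filter {
  dissect {
    mapping => {
      "message" => "%{time_local}|%{server_ip}|%{request}|%{status_code}|%{remote_user}|%{remote_addr}|%{http_user_agent}|%{http_referer}|%{host}|%{bytes_sent}|%{request_time}|%{upstream_response_time}|%{upstream_addr}|%{connection}|%{connection_requests}|%{uuid}"
    }
    # Only the fields that are always numeric are converted here.
    convert_datatype => {
      status_code => "int"
      bytes_sent => "int"
    }
  }

  # Replace a value that is exactly '-' with "0" (assumed default).
  mutate {
    gsub => [
      "request_time", "^-$", "0",
      "upstream_response_time", "^-$", "0"
    ]
  }

  # Convert in a separate block so it runs after the gsub above.
  mutate {
    convert => {
      "request_time" => "float"
      "upstream_response_time" => "float"
    }
  }
}
```

With the sample log line from the question, both fields should then come out as the float 0.0 instead of the string '-', and the uncoercible-conversion tags should no longer be added by dissect.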
