使用 shuf 来打乱一个文件中的行或是选择文件中一个随机的行。

logstash.inputs.jdbc抽取mysql的2000万行表到es时中途报错 “IllegalArgumentException: HOUR_OF_DAY: 2 -> 3>”

Logstash | 作者 kasonbin | 发布于2018年07月24日 | 阅读数:6308

每次都是到700万+的范围时报错,然后后又从0行开始遍历,导致700万+后面的数据一直导不进es,各种百度谷歌没查到相关资料,有否同学碰到过的给点思路。
 
logstash日志错误信息:
 
tail -f logs/logstash-plain.log
[2018-07-24T01:05:15,517][WARN ][logstash.inputs.jdbc     ] Exception when executing JDBC query {:exception=>#<Sequel::DatabaseError: Java::JavaLang::IllegalArgumentException: HOUR_OF_DAY: 2 -> 3>}
 
[2018-07-24T00:11:13,161][INFO ][logstash.inputs.jdbc     ] (99.928124s) SELECT * FROM (select * from IM_RY_MESSAGE_RECORD where ID > 0) AS `t1` LIMIT 1000000 OFFSET 3000000
[2018-07-24T00:22:38,118][INFO ][logstash.inputs.jdbc ] (101.367807s) SELECT * FROM (select * from IM_RY_MESSAGE_RECORD where ID > 0) AS `t1` LIMIT 1000000 OFFSET 4000000
[2018-07-24T00:33:59,108][INFO ][logstash.inputs.jdbc ] (98.171627s) SELECT * FROM (select * from IM_RY_MESSAGE_RECORD where ID > 0) AS `t1` LIMIT 1000000 OFFSET 5000000
[2018-07-24T00:45:20,151][INFO ][logstash.inputs.jdbc ] (101.837825s) SELECT * FROM (select * from IM_RY_MESSAGE_RECORD where ID > 0) AS `t1` LIMIT 1000000 OFFSET 6000000
[2018-07-24T00:56:49,441][INFO ][logstash.inputs.jdbc ] (101.421489s) SELECT * FROM (select * from IM_RY_MESSAGE_RECORD where ID > 0) AS `t1` LIMIT 1000000 OFFSET 7000000
[2018-07-24T01:05:15,517][WARN ][logstash.inputs.jdbc ] Exception when executing JDBC query {:exception=>#<Sequel::DatabaseError: Java::JavaLang::IllegalArgumentException: HOUR_OF_DAY: 2 -> 3>}
[2018-07-24T01:05:15,534][INFO ][logstash.inputs.jdbc ] (0.000272s) SELECT version()
[2018-07-24T01:07:00,200][INFO ][logstash.inputs.jdbc ] (104.663879s) SELECT count(*) AS `count` FROM (select * from IM_RY_MESSAGE_RECORD where ID > 0) AS `t1` LIMIT 1
[2018-07-24T01:08:38,753][INFO ][logstash.inputs.jdbc ] (98.551111s) SELECT * FROM (select * from IM_RY_MESSAGE_RECORD where ID > 0) AS `t1` LIMIT 1000000 OFFSET 0
[2018-07-24T01:20:02,213][INFO ][logstash.inputs.jdbc ] (99.033096s) SELECT * FROM (select * from IM_RY_MESSAGE_RECORD where ID > 0) AS `t1` LIMIT 1000000 OFFSET 1000000
[2018-07-24T01:31:23,333][INFO ][logstash.inputs.jdbc ] (99.236364s) SELECT * FROM (select * from IM_RY_MESSAGE_RECORD where ID > 0) AS `t1` LIMIT 1000000 OFFSET 2000000
[2018-07-24T01:42:49,212][INFO ][logstash.inputs.jdbc ] (100.754219s) SELECT * FROM (select * from IM_RY_MESSAGE_RECORD where ID > 0) AS `t1` LIMIT 1000000 OFFSET 3000000
[2018-07-24T01:54:13,668][INFO ][logstash.inputs.jdbc ] (101.010461s) SELECT * FROM (select * from IM_RY_MESSAGE_RECORD where ID > 0) AS `t1` LIMIT 1000000 OFFSET 4000000
[2018-07-24T02:05:33,344][INFO ][logstash.inputs.jdbc ] (98.403401s) SELECT * FROM (select * from IM_RY_MESSAGE_RECORD where ID > 0) AS `t1` LIMIT 1000000 OFFSET 5000000
[2018-07-24T02:16:58,504][INFO ][logstash.inputs.jdbc ] (102.378706s) SELECT * FROM (select * from IM_RY_MESSAGE_RECORD where ID > 0) AS `t1` LIMIT 1000000 OFFSET 6000000
[2018-07-24T02:28:25,252][INFO ][logstash.inputs.jdbc ] (101.759503s) SELECT * FROM (select * from IM_RY_MESSAGE_RECORD where ID > 0) AS `t1` LIMIT 1000000 OFFSET 7000000
[2018-07-24T02:36:56,152][WARN ][logstash.inputs.jdbc ] Exception when executing JDBC query {:exception=>#<Sequel::DatabaseError: Java::JavaLang::IllegalArgumentException: HOUR_OF_DAY: 2 -> 3>}
[2018-07-24T02:36:56,169][INFO ][logstash.inputs.jdbc ] (0.000284s) SELECT version()
[2018-07-24T02:38:40,037][INFO ][logstash.inputs.jdbc ] (103.866966s) SELECT count(*) AS `count` FROM (select * from IM_RY_MESSAGE_RECORD where ID > 0) AS `t1` LIMIT 1
[2018-07-24T02:40:18,775][INFO ][logstash.inputs.jdbc ] (98.734833s) SELECT * FROM (select * from IM_RY_MESSAGE_RECORD where ID > 0) AS `t1` LIMIT 1000000 OFFSET 0
[2018-07-24T02:51:43,683][INFO ][logstash.inputs.jdbc ] (99.355079s) SELECT * FROM (select * from IM_RY_MESSAGE_RECORD where ID > 0) AS `t1` LIMIT 1000000 OFFSET 1000000
[2018-07-24T03:03:07,538][INFO ][logstash.inputs.jdbc ] (98.969696s) SELECT * FROM (select * from IM_RY_MESSAGE_RECORD where ID > 0) AS `t1` LIMIT 1000000 OFFSET 2000000

 
 
原表数据量:

表行.png

 
mysql.conf配置文件:

mysqlconf.png

 
只能导入7866044,多一行都不行。  #理论上最终应该要导入相同于表记录的行数25905214
POST twitter/doc/_search?pretty
{
  "query": {
    "bool": {
      "must": { "match_all": {} }
    }
  },
  "sort": [
    {
      "id": {
        "order": "desc"
      }
    }
  ]
}

data.png

 
 
 
已邀请:

questiny

赞同来自:

我是在读取的时候出现这个错误IllegalArgumentException: HOUR_OF_DAY: 2 -> 3
是本地连接的线上数据库,服务器是linux系统,本地是苹果系统
如果换成本地linux系统就不会出错,猜测是时区的问题

zqc0512 - andy zhou

赞同来自:

条数最后对应上,是否有特殊的表中的数据处理不了。

kasonbin - 脑子有点儿懵懵的

赞同来自:

这个问题真的有点儿见鬼了,起初怀疑是这一条之后的数据问题本身问题,细看也没看出问题。
 
时区不会是这个问题,我是在一台服务器上测试的。
 
还未解决该问题,估计可能得自己写数据同步中间件了。

zqc0512 - andy zhou

赞同来自:

JDBC好像可以取一部分吧?
你手动把后面的数据搞出来看看,不行把这个DB拆分下,看看是不是表大了。

qw8613243

赞同来自:

可以监控mysql.log日志进行同步数据呢

medcl - 今晚打老虎。

赞同来自:

先查一下 mysql 里面具体的 timezone
mysql> show variables like "%time_zone%";
+------------------+--------+
| Variable_name | Value |
+------------------+--------+
| system_time_zone | CST |
| time_zone | SYSTEM |
然后把 Logstash 的 JDBC 链接字符串改成下面的格式试试(替换 serverTimezone 为上面查询到的,如 CST):
jdbc:mysql://<ip>/<db>?useSSL=false&&serverTimezone=查询的结果&useLegacyDatetimeCode=false

lastname

赞同来自:

请问楼主这个问题解决了吗?我也碰到这个问题,连接数据库时加了时区为UTC或CST,又报错“IllegalArgumentException: MONTH>”
加个QQ探讨下:279861972

huangjy

赞同来自:

楼主解决了么

要回复问题请先登录注册