
ES GC brought the cluster down

Elasticsearch | Author: Jason | Posted on 2018-01-06 | Views: 3031

[2018-01-05 17:03:25,327][INFO ][monitor.jvm              ] [node75] [gc][old][1473324][1146] duration [8.7s], collections [1]/[9.5s], total [8.7s]/[3.5m], memory [7.3gb]->[7.3gb]/[7.9gb], all_pools {[young] [1.8mb]->[4.6mb]/[532.5mb]}{[survivor] [0b]->[0b]/[66.5mb]}{[old] [7.3gb]->[7.3gb]/[7.3gb]}
[2018-01-05 17:03:35,684][INFO ][monitor.jvm              ] [node75] [gc][old][1473326][1147] duration [9.1s], collections [1]/[9.3s], total [9.1s]/[3.7m], memory [7.9gb]->[7.3gb]/[7.9gb], all_pools {[young] [532.5mb]->[1.7mb]/[532.5mb]}{[survivor] [43.6mb]->[0b]/[66.5mb]}{[old] [7.3gb]->[7.3gb]/[7.3gb]}
[2018-01-05 17:03:45,333][INFO ][monitor.jvm              ] [node75] [gc][old][1473327][1148] duration [8.9s], collections [1]/[9.6s], total [8.9s]/[3.9m], memory [7.3gb]->[7.3gb]/[7.9gb], all_pools {[young] [1.7mb]->[37.9mb]/[532.5mb]}{[survivor] [0b]->[0b]/[66.5mb]}{[old] [7.3gb]->[7.3gb]/[7.3gb]}
[2018-01-05 17:03:53,501][INFO ][monitor.jvm              ] [node75] [gc][old][1473328][1149] duration [7.2s], collections [1]/[8.1s], total [7.2s]/[4m], memory [7.3gb]->[7.3gb]/[7.9gb], all_pools {[young] [37.9mb]->[43.6mb]/[532.5mb]}{[survivor] [0b]->[0b]/[66.5mb]}{[old] [7.3gb]->[7.3gb]/[7.3gb]}
[2018-01-05 17:04:03,309][INFO ][monitor.jvm              ] [node75] [gc][old][1473329][1150] duration [9s], collections [1]/[9.8s], total [9s]/[4.1m], memory [7.3gb]->[7.4gb]/[7.9gb], all_pools {[young] [43.6mb]->[55.3mb]/[532.5mb]}{[survivor] [0b]->[0b]/[66.5mb]}{[old] [7.3gb]->[7.3gb]/[7.3gb]}
[2018-01-05 17:04:11,193][INFO ][monitor.jvm              ] [node75] [gc][old][1473330][1151] duration [7s], collections [1]/[7.8s], total [7s]/[4.2m], memory [7.4gb]->[7.4gb]/[7.9gb], all_pools {[young] [55.3mb]->[77.6mb]/[532.5mb]}{[survivor] [0b]->[0b]/[66.5mb]}{[old] [7.3gb]->[7.3gb]/[7.3gb]}
[2018-01-05 17:04:21,148][INFO ][monitor.jvm              ] [node75] [gc][old][1473331][1152] duration [9.1s], collections [1]/[9.9s], total [9.1s]/[4.4m], memory [7.4gb]->[7.4gb]/[7.9gb], all_pools {[young] [77.6mb]->[103.3mb]/[532.5mb]}{[survivor] [0b]->[0b]/[66.5mb]}{[old] [7.3gb]->[7.3gb]/[7.3gb]}
[2018-01-05 17:04:30,008][INFO ][monitor.jvm              ] [node75] [gc][old][1473332][1153] duration [8.2s], collections [1]/[8.8s], total [8.2s]/[4.5m], memory [7.4gb]->[7.4gb]/[7.9gb], all_pools {[young] [103.3mb]->[73.8mb]/[532.5mb]}{[survivor] [0b]->[0b]/[66.5mb]}{[old] [7.3gb]->[7.3gb]/[7.3gb]}
[2018-01-05 17:04:40,467][INFO ][monitor.jvm              ] [node75] [gc][old][1473333][1154] duration [9.5s], collections [1]/[10.4s], total [9.5s]/[4.7m], memory [7.4gb]->[7.4gb]/[7.9gb], all_pools {[young] [73.8mb]->[110.7mb]/[532.5mb]}{[survivor] [0b]->[0b]/[66.5mb]}{[old] [7.3gb]->[7.3gb]/[7.3gb]}
[2018-01-05 17:04:48,037][INFO ][monitor.jvm              ] [node75] [gc][old][1473334][1155] duration [6.7s], collections [1]/[7.5s], total [6.7s]/[4.8m], memory [7.4gb]->[7.4gb]/[7.9gb], all_pools {[young] [110.7mb]->[99mb]/[532.5mb]}{[survivor] [0b]->[0b]/[66.5mb]}{[old] [7.3gb]->[7.3gb]/[7.3gb]}
[2018-01-05 17:04:57,794][INFO ][monitor.jvm              ] [node75] [gc][old][1473335][1156] duration [8.7s], collections [1]/[9.7s], total [8.7s]/[4.9m], memory [7.4gb]->[7.4gb]/[7.9gb], all_pools {[young] [99mb]->[147.9mb]/[532.5mb]}{[survivor] [0b]->[0b]/[66.5mb]}{[old] [7.3gb]->[7.3gb]/[7.3gb]}
[2018-01-05 17:05:05,426][INFO ][monitor.jvm              ] [node75] [gc][old][1473336][1157] duration [6.8s], collections [1]/[7.6s], total [6.8s]/[5.1m], memory [7.4gb]->[7.5gb]/[7.9gb], all_pools {[young] [147.9mb]->[161.6mb]/[532.5mb]}{[survivor] [0b]->[0b]/[66.5mb]}{[old] [7.3gb]->[7.3gb]/[7.3gb]}
[2018-01-05 17:05:14,470][INFO ][monitor.jvm              ] [node75] [gc][old][1473337][1158] duration [8.2s], collections [1]/[9s], total [8.2s]/[5.2m], memory [7.5gb]->[7.4gb]/[7.9gb], all_pools {[young] [161.6mb]->[146.1mb]/[532.5mb]}{[survivor] [0b]->[0b]/[66.5mb]}{[old] [7.3gb]->[7.3gb]/[7.3gb]}
[2018-01-05 17:05:22,170][INFO ][monitor.jvm              ] [node75] [gc][old][1473338][1159] duration [6.9s], collections [1]/[7.7s], total [6.9s]/[5.3m], memory [7.4gb]->[7.5gb]/[7.9gb], all_pools {[young] [146.1mb]->[168.2mb]/[532.5mb]}{[survivor] [0b]->[0b]/[66.5mb]}{[old] [7.3gb]->[7.3gb]/[7.3gb]}
 
Configuration of the 3 machines: 8 CPU cores, 16 GB RAM (8 GB allocated to ES)
Index size: 100 GB
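
In the `[gc][old]` lines above, the old pool reads `[7.3gb]->[7.3gb]/[7.3gb]` on every collection, with pauses of roughly 7 to 9 seconds each. The following is a minimal sketch, not an official tool, of how such `monitor.jvm` lines could be parsed to flag old-generation collections that free almost nothing. The function names and the 5% threshold are assumptions made for illustration, and the regex only targets the log format shown in this post.

```python
import re

# Pulls the pause duration and the {[old] [before]->[after]/[max]} pool
# section out of a monitor.jvm "[gc][old]" line in the format pasted above.
OLD_GC = re.compile(
    r"\[gc\]\[old\].*?duration \[([\d.]+)(ms|s|m)\].*?"
    r"\{\[old\] \[([\d.]+)(b|kb|mb|gb)\]->\[([\d.]+)(b|kb|mb|gb)\]"
    r"/\[([\d.]+)(b|kb|mb|gb)\]\}"
)

UNIT_BYTES = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}
UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0}


def to_bytes(value, unit):
    """Convert a logged size such as ("7.3", "gb") to bytes."""
    return float(value) * UNIT_BYTES[unit]


def parse_old_gc(line):
    """Return pause and old-pool sizes for one log line, or None if no match."""
    m = OLD_GC.search(line)
    if m is None:
        return None
    dur, dur_u, before, before_u, after, after_u, cap, cap_u = m.groups()
    return {
        "pause_s": float(dur) * UNIT_SECONDS[dur_u],
        "old_before": to_bytes(before, before_u),
        "old_after": to_bytes(after, after_u),
        "old_max": to_bytes(cap, cap_u),
    }


def is_ineffective(gc, min_freed_ratio=0.05):
    """True when the collection freed less than min_freed_ratio of the old pool.

    The 5% default is an arbitrary illustrative threshold, not an ES setting.
    """
    freed = gc["old_before"] - gc["old_after"]
    return freed < min_freed_ratio * gc["old_max"]


# First line of the log pasted in this post.
SAMPLE = (
    "[2018-01-05 17:03:25,327][INFO ][monitor.jvm              ] [node75] "
    "[gc][old][1473324][1146] duration [8.7s], collections [1]/[9.5s], "
    "total [8.7s]/[3.5m], memory [7.3gb]->[7.3gb]/[7.9gb], all_pools "
    "{[young] [1.8mb]->[4.6mb]/[532.5mb]}{[survivor] [0b]->[0b]/[66.5mb]}"
    "{[old] [7.3gb]->[7.3gb]/[7.3gb]}"
)
```

Running `parse_old_gc` over each line of the node log and counting how many results satisfy `is_ineffective` gives a quick measure of how long the node has been stuck in back-to-back full collections.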
 
 

medcl - Hunting tigers tonight.


GC problems can have many causes.
Which ES version are you running?
What do the cluster load and the query workload look like?
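
One possible way to collect the details asked for here is through the REST API: `GET /` reports the version under `version.number`, and `GET /_nodes/stats/jvm` reports heap usage and GC counters per node. The sketch below is only an illustration under those assumptions. The helper names are hypothetical, and the base URL has to be replaced with the address of a real node.

```python
import json
from urllib.request import urlopen


def fetch_json(url, timeout=10):
    """GET a URL and decode the JSON body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def summarize_jvm(nodes_stats):
    """Reduce a /_nodes/stats/jvm response to one row per node.

    Missing fields come back as None (or 0.0 seconds for the GC time)
    instead of raising, since field availability can differ by version.
    """
    rows = []
    for node in nodes_stats.get("nodes", {}).values():
        jvm = node.get("jvm", {})
        old = jvm.get("gc", {}).get("collectors", {}).get("old", {})
        rows.append({
            "name": node.get("name"),
            "heap_used_percent": jvm.get("mem", {}).get("heap_used_percent"),
            "old_gc_count": old.get("collection_count"),
            "old_gc_seconds": old.get("collection_time_in_millis", 0) / 1000.0,
        })
    return rows
```

With a base address such as `http://localhost:9200` (an assumed value), `fetch_json(base + "/")["version"]["number"]` returns the version string, and `summarize_jvm(fetch_json(base + "/_nodes/stats/jvm"))` returns the per-node rows, which can be pasted into the thread together with a description of the query load.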

etoak


[2020-04-25T05:54:28,196][INFO ][o.e.m.j.JvmGcMonitorService] [node2] [gc][10502237] overhead, spent [363ms] collecting in the last [1s]
[2020-04-25T06:00:27,517][DEBUG][o.e.a.a.c.n.s.TransportNodesStatsAction] [node2] failed to execute on node [SEviXhshRm2D4gxAvIAGNg]
org.elasticsearch.transport.ReceiveTimeoutTransportException: [node2][192.168.1.88:9305][cluster:monitor/nodes/stats[n]] request_id [42995968] timed out after [15000ms]
    at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:869) [?:?]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:458) [?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_131]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
[2020-04-25T06:00:27,522][WARN ][o.e.a.a.c.n.s. 
org.elasticsearch.action.FailedNodeException: Failed node [SEviXhshRm2D4gxAvIAGNg]
    at org.elasticsearch.action.support.nodes.TransportNodesAction$AsyncAction.onFailure(TransportNodesAction.java:247) [?:?]
    at org.elasticsearch.action.support.nodes.TransportNodesAction$AsyncAction.access$1(TransportNodesAction.java:241) [?:?]
    at org.elasticsearch.action.support.nodes.TransportNodesAction$AsyncAction$1.handleException(TransportNodesAction.java:219) [?:?]
    at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:984) [?:?]
    at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:868) [?:?]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:458) [?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_131]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
Caused by: org.elasticsearch.transport.ReceiveTimeoutTransportException: [node2][192.168.1.88:9305][cluster:monitor/nodes/stats[n]] request_id [42995968] timed out after [15000ms]
    at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:869) ~[?:?]
    ... 4 more
[2020-04-25T06:00:27,524][WARN ][o.e.t.TransportService   ]
 

medcl - Hunting tigers tonight.


This is not enough information to troubleshoot the problem.
