Use dmesg to look at information or problems related to hardware or drivers.

7.7.0: master node shut itself down and restarted, deployed on K8S

Elasticsearch | Author: a593700624 | Posted 2021-02-25 | Views: 1658

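The master node shut itself down and restarted. I checked the logs and found nothing at ERROR level. What I did see is that shortly before the shutdown there was a stretch of disconnected messages, as shown below: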


"stacktrace": ["org.elasticsearch.transport.NodeDisconnectedException: [es-data-cold-b-14][indices:admin/forcemerge[n]] disconnected"] }
{"type": "server", "timestamp": "2021-02-24T23:28:14,519Z", "level": "DEBUG", "component": "o.e.a.a.i.f.TransportForceMergeAction", "cluster.name": "xx", "node.name": "es-master-2", "message": "failed to execute [indices:admin/forcemerge] on node [ur8xRrOzSoavigX0A6f2PQ]", "cluster.uuid": "He8kd2vAR1mc6Zl0Tl6XXw", "node.id": "KpA0WsB6SJqi0eEDbEpxdQ"
 
There were a dozen or so log entries of this kind, involving 6 disconnected nodes.
 
{"type": "server", "timestamp": "2021-02-24T23:28:14,909Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "xx", "node.name": "es-master-2", "message": "stopped", "cluster.uuid": "He8kd2vAR1mc6Zl0Tl6XXw", "node.id": "KpA0WsB6SJqi0eEDbEpxdQ"  }
{"type": "server", "timestamp": "2021-02-24T23:28:14,910Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "xx", "node.name": "es-master-2", "message": "closing ...", "cluster.uuid": "He8kd2vAR1mc6Zl0Tl6XXw", "node.id": "KpA0WsB6SJqi0eEDbEpxdQ"  }
{"type": "server", "timestamp": "2021-02-24T23:28:14,929Z", "level": "INFO", "component": "o.e.n.Node", "cluster.name": "xx", "node.name": "es-master-2", "message": "closed", "cluster.uuid": "He8kd2vAR1mc6Zl0Tl6XXw", "node.id": "KpA0WsB6SJqi0eEDbEpxdQ"  }
 
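After that the node simply shut down and restarted. Judging by the logs, the whole process looks spontaneous rather than caused by any error.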


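So this is rather strange, and I'd like to ask the experts: when several `indices:admin/forcemerge[n]` disconnected failures occur between the active master and other nodes, does the master decide on its own that something is wrong with it, shut down and restart, and let a new leader be elected?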
 
The service was not noticeably affected.

Charele - Cisco4321


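That is just a false impression. What you are seeing actually is an error; it is simply wrapped inside a DEBUG-level message.

The forceMerge action failed because the nodes were disconnected.
What you need to figure out is why the nodes got disconnected from each other, even if the disconnection may have been only temporary.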
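
As a starting point for that investigation, it may help to tally which nodes show up in the disconnect messages. Below is a minimal Python sketch, assuming the log text contains the exception message in the same form as the excerpt in the question. The function name `count_disconnects` and the sample list are made up for illustration and are not part of Elasticsearch.

```python
import re
from collections import Counter

# Matches the exception text seen in the question's log excerpt, e.g.
# NodeDisconnectedException: [es-data-cold-b-14][indices:admin/forcemerge[n]] disconnected
# Assumption: other log lines use the same "[node][action] disconnected" layout.
DISCONNECT_RE = re.compile(
    r"NodeDisconnectedException: \[(?P<node>[^\]]+)\]\[(?P<action>.+?)\] disconnected"
)


def count_disconnects(lines):
    """Count NodeDisconnectedException occurrences per node name.

    Works on raw text rather than parsed JSON, so it also copes with
    log entries that were cut or wrapped when pasted.
    """
    counts = Counter()
    for line in lines:
        for match in DISCONNECT_RE.finditer(line):
            counts[match.group("node")] += 1
    return counts


# Hypothetical sample: one line copied from the question above.
sample = [
    '"stacktrace": ["org.elasticsearch.transport.NodeDisconnectedException: '
    '[es-data-cold-b-14][indices:admin/forcemerge[n]] disconnected"] }',
]

for node, n in count_disconnects(sample).most_common():
    print(node, n)
```

To run it against a real log, pass an open file object instead of `sample`. If the same few nodes dominate the tally, their own logs around the same timestamps are probably the next place to look.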
