After an ES cluster master goes down and restarts, it cannot rejoin the original cluster

Elasticsearch | Author: ZChao_smile | Posted 2019-06-12 | Views: 152

Environment:
One host, multiple instances:
Primary master configuration:
# ======================== Elasticsearch Configuration =========================
#
# ---------------------------------- Cluster -----------------------------------
cluster.name: hotel
# ------------------------------------ Node ------------------------------------
node.name: hotel-master
node.master: true
node.data: false
# ----------------------------------- Paths ------------------------------------
path.data: /home/es/data/hotel-master
path.logs: /home/es/logs/hotel-master
# ----------------------------------- Memory -----------------------------------

# ---------------------------------- Network -----------------------------------
network.host: 192.168.40.128
http.port: 9200
# --------------------------------- Discovery ----------------------------------
discovery.zen.ping.unicast.hosts: ["192.168.40.128"]
discovery.zen.ping_timeout: 10s
# ---------------------------------- Gateway -----------------------------------

# ---------------------------------- Various -----------------------------------
http.cors.enabled: true
http.cors.allow-origin: "*"

Follower masters: the two are configured identically except for the settings that must differ
# ---------------------------------- Cluster -----------------------------------
cluster.name: hotel
# ------------------------------------ Node ------------------------------------
node.name: hotel-master-1
node.master: true
node.data: false
# ----------------------------------- Paths ------------------------------------
path.data: /home/es/data/hotel-master-1
path.logs: /home/es/logs/hotel-master-1
# ----------------------------------- Memory -----------------------------------

# ---------------------------------- Network -----------------------------------
network.host: 192.168.40.128
http.port: 9201
# --------------------------------- Discovery ----------------------------------
discovery.zen.ping.unicast.hosts: ["192.168.40.128"]
discovery.zen.ping_timeout: 10s
discovery.zen.minimum_master_nodes: 2

# ---------------------------------- Gateway -----------------------------------

# ---------------------------------- Various -----------------------------------

http.cors.enabled: true
http.cors.allow-origin: "*"

-----------------------------------
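There are three so-called "master" nodes: one "primary master" and two "follower masters". When the primary master went down, one of the follower masters was elected master. Later the original primary master came back up, but it could not join the cluster; instead it formed a separate cluster on its own. Is this split brain? What settings are needed so that the recovered master rejoins the existing cluster?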
-----------------------------------
Any guidance would be appreciated, thanks.

Reilee - DevOps in Japan

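discovery.zen.ping.unicast.hosts: ["192.168.40.128"]
My guess is that the cause is the missing port.
With no port given, discovery only probes the default transport port on that host (9300 unless transport.tcp.port is set; note that this is the transport port, not the HTTP port 9200). Initially all three instances find the master on that port and form a cluster. When the master goes down, hotel-master-1 becomes the elected master. When the original master comes back, it can only discover itself, and once the election times out it elects itself master of a new cluster, which is a split brain.
I suggest listing the other master-eligible instances with their transport ports, for example (assuming the instances bind 9300 and 9301; check the startup logs for the actual ports):
discovery.zen.ping.unicast.hosts: ["192.168.40.128:9300","192.168.40.128:9301"]
and then testing again.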
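
As a rough sketch of the suggestion above, the discovery section of each master-eligible node might look like the fragment below. The `transport.tcp.port` values are assumptions: the posted configs do not set them, so each instance takes the first free port from 9300 upward in startup order, and pinning them keeps the host list (which takes transport ports, not HTTP ports) valid across restarts. Setting `discovery.zen.minimum_master_nodes: 2` on all three nodes, including hotel-master, whose posted config omits it, should also stop a lone restarted node from electing itself, since the quorum for three master-eligible nodes is 3 / 2 + 1 = 2.

```yaml
# hotel-master (elasticsearch.yml), discovery-related settings only
transport.tcp.port: 9300                 # assumed value; pinned so the host list below stays valid
discovery.zen.ping.unicast.hosts: ["192.168.40.128:9300", "192.168.40.128:9301", "192.168.40.128:9302"]
discovery.zen.ping_timeout: 10s
discovery.zen.minimum_master_nodes: 2    # quorum for 3 master-eligible nodes

# hotel-master-1: same settings, but transport.tcp.port: 9301 (assumed)
# third master-eligible node: same settings, but transport.tcp.port: 9302 (assumed)
```

After restarting the nodes one at a time, a request to `_cat/nodes?v` on any HTTP port (9200 or 9201 here) should list all instances in a single cluster with exactly one elected master.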
