ES node load is extremely high while CPU and other utilization is essentially 0, and the node does not recover.

Elasticsearch | Author: famoss | Posted 2017-06-15 | Views: 11652

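Machines in our ES cluster frequently hit this: the load average suddenly jumps to about 80 (on a 32-core CPU), yet CPU utilization drops to 0, and I/O and the other utilization metrics drop to 0 at the same time. The node cannot recover from this state on its own; only a restart fixes it. (The spike is sudden, and a little later the node drops out of the cluster automatically.)
Articles I found online blame a very high number of network requests, which made me wonder whether a single search request touching too many shards could be the trigger. After all, each shard gets its own request, and the request thread pool is left at the default of 1000.
How can I get the system to recover automatically? Is there a Linux kernel parameter I can set, or some other way to avoid this?
According to Zabbix monitoring, traffic did not rise suddenly, and disk I/O ops did not spike either.
Also, at this point `kill -9` cannot kill the ES process; the original process turns into a zombie, and the load never comes down.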
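
For context: the Linux load average counts tasks in uninterruptible sleep (state `D`) as well as runnable ones. That is one way load can sit near 80 while CPU utilization reads 0, and a task in that state does not act on `kill -9` until the wait it is blocked in completes. A minimal sketch, assuming a Linux `/proc` filesystem, that lists such tasks; the function names are illustrative and not part of any tool mentioned in this thread:

```python
import glob


def parse_state(stat_line):
    """Return (comm, state) from the contents of a /proc/<pid>/stat file.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the LAST ')' instead of on spaces.
    """
    left, _, rest = stat_line.rpartition(")")
    comm = left.split("(", 1)[1]
    state = rest.split()[0]
    return comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """List (tid, comm) for every thread currently in state D."""
    blocked = []
    # Walk threads, not just processes: a JVM has many threads per PID.
    for path in glob.glob(f"{proc_root}/[0-9]*/task/[0-9]*/stat"):
        try:
            with open(path) as fh:
                comm, state = parse_state(fh.read())
        except (OSError, IndexError):
            continue  # the thread exited (or the line was malformed) mid-scan
        if state == "D":
            tid = int(path.split("/")[-2])
            blocked.append((tid, comm))
    return blocked
```

Calling `uninterruptible_tasks()` on an affected node and comparing the length of the result with the load average would show whether blocked tasks account for the load.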
 
When the node hung, dmesg printed the following log:
 
[Wed Jun 14 18:03:48 2017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Wed Jun 14 18:03:48 2017] kthreadd D ffffffffffffffff 0 2 0 0x00000000
[Wed Jun 14 18:03:48 2017] ffff881fd2a234e0 0000000000000046 ffff881fd2998b80 ffff881fd2a23fd8
[Wed Jun 14 18:03:48 2017] ffff881fd2a23fd8 ffff881fd2a23fd8 ffff881fd2998b80 ffff881fd2a23628
[Wed Jun 14 18:03:48 2017] ffff881fd2a23630 7fffffffffffffff ffff881fd2998b80 ffffffffffffffff
[Wed Jun 14 18:03:48 2017] Call Trace:
[Wed Jun 14 18:03:48 2017] [<ffffffff8163a909>] schedule+0x29/0x70
[Wed Jun 14 18:03:48 2017] [<ffffffff816385f9>] schedule_timeout+0x209/0x2d0
[Wed Jun 14 18:03:48 2017] [<ffffffff810b5dc9>] ? ttwu_do_wakeup+0x19/0xd0
[Wed Jun 14 18:03:48 2017] [<ffffffff810b5f5d>] ? ttwu_do_activate.constprop.84+0x5d/0x70
[Wed Jun 14 18:03:48 2017] [<ffffffff810b8a66>] ? try_to_wake_up+0x1b6/0x300
[Wed Jun 14 18:03:48 2017] [<ffffffff8163acd6>] wait_for_completion+0x116/0x170
[Wed Jun 14 18:03:48 2017] [<ffffffff810b8c10>] ? wake_up_state+0x20/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff8109e7ac>] flush_work+0xfc/0x1c0
[Wed Jun 14 18:03:48 2017] [<ffffffff8109a7e0>] ? move_linked_works+0x90/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c643a>] xlog_cil_force_lsn+0x8a/0x210 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c4a7e>] _xfs_log_force_lsn+0x6e/0x2f0 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff81631ada>] ? __slab_free+0x10e/0x277
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c4d2e>] xfs_log_force_lsn+0x2e/0x90 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02b6fc9>] ? xfs_iunpin_wait+0x19/0x20 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02b34b7>] __xfs_iunpin_wait+0xa7/0x150 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff810a6b60>] ? wake_atomic_t_function+0x40/0x40
[Wed Jun 14 18:03:48 2017] [<ffffffffa02b6fc9>] xfs_iunpin_wait+0x19/0x20 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02ab84c>] xfs_reclaim_inode+0x8c/0x350 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02abd77>] xfs_reclaim_inodes_ag+0x267/0x390 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02ac923>] xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02bb895>] xfs_fs_free_cached_objects+0x15/0x20 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff811e0c28>] prune_super+0xe8/0x170
[Wed Jun 14 18:03:48 2017] [<ffffffff8117c4c5>] shrink_slab+0x165/0x300
[Wed Jun 14 18:03:48 2017] [<ffffffff811d5e51>] ? vmpressure+0x21/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffff8117f642>] do_try_to_free_pages+0x3c2/0x4e0
[Wed Jun 14 18:03:48 2017] [<ffffffff8117f85c>] try_to_free_pages+0xfc/0x180
[Wed Jun 14 18:03:48 2017] [<ffffffff8117355d>] __alloc_pages_nodemask+0x7fd/0xb90
[Wed Jun 14 18:03:48 2017] [<ffffffff81078d73>] copy_process.part.25+0x163/0x1610
[Wed Jun 14 18:03:48 2017] [<ffffffff8101cd69>] ? sched_clock+0x9/0x10
[Wed Jun 14 18:03:48 2017] [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[Wed Jun 14 18:03:48 2017] [<ffffffff8107a401>] do_fork+0xe1/0x320
[Wed Jun 14 18:03:48 2017] [<ffffffff8107a666>] kernel_thread+0x26/0x30
[Wed Jun 14 18:03:48 2017] [<ffffffff810a65f2>] kthreadd+0x2b2/0x2f0
[Wed Jun 14 18:03:48 2017] [<ffffffff810a6340>] ? kthread_create_on_cpu+0x60/0x60
[Wed Jun 14 18:03:48 2017] [<ffffffff81645858>] ret_from_fork+0x58/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffff810a6340>] ? kthread_create_on_cpu+0x60/0x60
[Wed Jun 14 18:03:48 2017] INFO: task kswapd0:244 blocked for more than 120 seconds.
[Wed Jun 14 18:03:48 2017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Wed Jun 14 18:03:48 2017] kswapd0 D ffffffffffffffff 0 244 2 0x00000000
[Wed Jun 14 18:03:48 2017] ffff881fcda7f6f0 0000000000000046 ffff881fcd9ba280 ffff881fcda7ffd8
[Wed Jun 14 18:03:48 2017] ffff881fcda7ffd8 ffff881fcda7ffd8 ffff881fcd9ba280 ffff881fcda7f838
[Wed Jun 14 18:03:48 2017] ffff881fcda7f840 7fffffffffffffff ffff881fcd9ba280 ffffffffffffffff
[Wed Jun 14 18:03:48 2017] Call Trace:
[Wed Jun 14 18:03:48 2017] [<ffffffff8163a909>] schedule+0x29/0x70
[Wed Jun 14 18:03:48 2017] [<ffffffff816385f9>] schedule_timeout+0x209/0x2d0
[Wed Jun 14 18:03:48 2017] [<ffffffff810b5dc9>] ? ttwu_do_wakeup+0x19/0xd0
[Wed Jun 14 18:03:48 2017] [<ffffffff810b5f5d>] ? ttwu_do_activate.constprop.84+0x5d/0x70
[Wed Jun 14 18:03:48 2017] [<ffffffff810b8a66>] ? try_to_wake_up+0x1b6/0x300
[Wed Jun 14 18:03:48 2017] [<ffffffff8163acd6>] wait_for_completion+0x116/0x170
[Wed Jun 14 18:03:48 2017] [<ffffffff810b8c10>] ? wake_up_state+0x20/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff8109e7ac>] flush_work+0xfc/0x1c0
[Wed Jun 14 18:03:48 2017] [<ffffffff8109a7e0>] ? move_linked_works+0x90/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c643a>] xlog_cil_force_lsn+0x8a/0x210 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff811717eb>] ? free_pcppages_bulk+0x34b/0x3a0
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c4a7e>] _xfs_log_force_lsn+0x6e/0x2f0 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff810bd008>] ? __enqueue_entity+0x78/0x80
[Wed Jun 14 18:03:48 2017] [<ffffffff810c34b7>] ? enqueue_entity+0x237/0x890
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c4d2e>] xfs_log_force_lsn+0x2e/0x90 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02b6fc9>] ? xfs_iunpin_wait+0x19/0x20 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02b34b7>] __xfs_iunpin_wait+0xa7/0x150 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff810a6b60>] ? wake_atomic_t_function+0x40/0x40
[Wed Jun 14 18:03:48 2017] [<ffffffffa02b6fc9>] xfs_iunpin_wait+0x19/0x20 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02ab84c>] xfs_reclaim_inode+0x8c/0x350 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02abd77>] xfs_reclaim_inodes_ag+0x267/0x390 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff8112483d>] ? call_rcu_sched+0x1d/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff811f4d3d>] ? d_free+0x4d/0x70
[Wed Jun 14 18:03:48 2017] [<ffffffff811f6e80>] ? shrink_dentry_list+0x250/0x480
[Wed Jun 14 18:03:48 2017] [<ffffffffa02ac923>] xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02bb895>] xfs_fs_free_cached_objects+0x15/0x20 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff811e0c28>] prune_super+0xe8/0x170
[Wed Jun 14 18:03:48 2017] [<ffffffff8117c4c5>] shrink_slab+0x165/0x300
[Wed Jun 14 18:03:48 2017] [<ffffffff811d5eb7>] ? vmpressure+0x87/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffff81180131>] balance_pgdat+0x4b1/0x5e0
[Wed Jun 14 18:03:48 2017] [<ffffffff811803d3>] kswapd+0x173/0x450
[Wed Jun 14 18:03:48 2017] [<ffffffff810a6ae0>] ? wake_up_atomic_t+0x30/0x30
[Wed Jun 14 18:03:48 2017] [<ffffffff81180260>] ? balance_pgdat+0x5e0/0x5e0
[Wed Jun 14 18:03:48 2017] [<ffffffff810a5aef>] kthread+0xcf/0xe0
[Wed Jun 14 18:03:48 2017] [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[Wed Jun 14 18:03:48 2017] [<ffffffff81645858>] ret_from_fork+0x58/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[Wed Jun 14 18:03:48 2017] INFO: task xfsaild/sdb1:781 blocked for more than 120 seconds.
[Wed Jun 14 18:03:48 2017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Wed Jun 14 18:03:48 2017] xfsaild/sdb1 D ffffffffffffffff 0 781 2 0x00000000
[Wed Jun 14 18:03:48 2017] ffff881fcf1afb20 0000000000000046 ffff881fcfbe7300 ffff881fcf1affd8
[Wed Jun 14 18:03:48 2017] ffff881fcf1affd8 ffff881fcf1affd8 ffff881fcfbe7300 ffff881fcf1afc68
[Wed Jun 14 18:03:48 2017] ffff881fcf1afc70 7fffffffffffffff ffff881fcfbe7300 ffffffffffffffff
[Wed Jun 14 18:03:48 2017] Call Trace:
[Wed Jun 14 18:03:48 2017] [<ffffffff8163a909>] schedule+0x29/0x70
[Wed Jun 14 18:03:48 2017] [<ffffffff816385f9>] schedule_timeout+0x209/0x2d0
[Wed Jun 14 18:03:48 2017] [<ffffffff810b5dc9>] ? ttwu_do_wakeup+0x19/0xd0
[Wed Jun 14 18:03:48 2017] [<ffffffff810b5f5d>] ? ttwu_do_activate.constprop.84+0x5d/0x70
[Wed Jun 14 18:03:48 2017] [<ffffffff810b8a66>] ? try_to_wake_up+0x1b6/0x300
[Wed Jun 14 18:03:48 2017] [<ffffffff8163acd6>] wait_for_completion+0x116/0x170
[Wed Jun 14 18:03:48 2017] [<ffffffff810b8c10>] ? wake_up_state+0x20/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff8109e7ac>] flush_work+0xfc/0x1c0
[Wed Jun 14 18:03:48 2017] [<ffffffff8109a7e0>] ? move_linked_works+0x90/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c643a>] xlog_cil_force_lsn+0x8a/0x210 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff8108d6fe>] ? try_to_del_timer_sync+0x5e/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c4720>] _xfs_log_force+0x70/0x290 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff8108bf50>] ? internal_add_timer+0x70/0x70
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c4966>] xfs_log_force+0x26/0x80 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02cf470>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02cf5c1>] xfsaild+0x151/0x5e0 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02cf470>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff810a5aef>] kthread+0xcf/0xe0
[Wed Jun 14 18:03:48 2017] [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[Wed Jun 14 18:03:48 2017] [<ffffffff81645858>] ret_from_fork+0x58/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[Wed Jun 14 18:03:48 2017] INFO: task zabbix_agentd:1594 blocked for more than 120 seconds.
[Wed Jun 14 18:03:48 2017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Wed Jun 14 18:03:48 2017] zabbix_agentd D ffffffffffffffff 0 1594 1572 0x00000084
[Wed Jun 14 18:03:48 2017] ffff881fcf91f5a0 0000000000000086 ffff881fccdaf300 ffff881fcf91ffd8
[Wed Jun 14 18:03:48 2017] ffff881fcf91ffd8 ffff881fcf91ffd8 ffff881fccdaf300 ffff881fcf91f6e8
[Wed Jun 14 18:03:48 2017] ffff881fcf91f6f0 7fffffffffffffff ffff881fccdaf300 ffffffffffffffff
[Wed Jun 14 18:03:48 2017] Call Trace:
[Wed Jun 14 18:03:48 2017] [<ffffffff8163a909>] schedule+0x29/0x70
[Wed Jun 14 18:03:48 2017] [<ffffffff816385f9>] schedule_timeout+0x209/0x2d0
[Wed Jun 14 18:03:48 2017] [<ffffffff810c4a74>] ? find_busiest_group+0x114/0x910
[Wed Jun 14 18:03:48 2017] [<ffffffff8101cd15>] ? native_sched_clock+0x35/0x80
[Wed Jun 14 18:03:48 2017] [<ffffffff8101cd69>] ? sched_clock+0x9/0x10
[Wed Jun 14 18:03:48 2017] [<ffffffff810bb685>] ? sched_clock_cpu+0x85/0xc0
[Wed Jun 14 18:03:48 2017] [<ffffffff810b8a66>] ? try_to_wake_up+0x1b6/0x300
[Wed Jun 14 18:03:48 2017] [<ffffffff8163acd6>] wait_for_completion+0x116/0x170
[Wed Jun 14 18:03:48 2017] [<ffffffff810b8c10>] ? wake_up_state+0x20/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff8109e7ac>] flush_work+0xfc/0x1c0
[Wed Jun 14 18:03:48 2017] [<ffffffff8109a7e0>] ? move_linked_works+0x90/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c643a>] xlog_cil_force_lsn+0x8a/0x210 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff811717eb>] ? free_pcppages_bulk+0x34b/0x3a0
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c4a7e>] _xfs_log_force_lsn+0x6e/0x2f0 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c4d2e>] xfs_log_force_lsn+0x2e/0x90 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02b6fc9>] ? xfs_iunpin_wait+0x19/0x20 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02b34b7>] __xfs_iunpin_wait+0xa7/0x150 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff810a6b60>] ? wake_atomic_t_function+0x40/0x40
[Wed Jun 14 18:03:48 2017] [<ffffffffa02b6fc9>] xfs_iunpin_wait+0x19/0x20 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02ab84c>] xfs_reclaim_inode+0x8c/0x350 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02abd77>] xfs_reclaim_inodes_ag+0x267/0x390 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff8112483d>] ? call_rcu_sched+0x1d/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff811f4d3d>] ? d_free+0x4d/0x70
[Wed Jun 14 18:03:48 2017] [<ffffffff811f6e80>] ? shrink_dentry_list+0x250/0x480
[Wed Jun 14 18:03:48 2017] [<ffffffffa02ac923>] xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02bb895>] xfs_fs_free_cached_objects+0x15/0x20 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff811e0c28>] prune_super+0xe8/0x170
[Wed Jun 14 18:03:48 2017] [<ffffffff8117c4c5>] shrink_slab+0x165/0x300
[Wed Jun 14 18:03:48 2017] [<ffffffff811d5e51>] ? vmpressure+0x21/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffff8117f642>] do_try_to_free_pages+0x3c2/0x4e0
[Wed Jun 14 18:03:48 2017] [<ffffffff8117f85c>] try_to_free_pages+0xfc/0x180
[Wed Jun 14 18:03:48 2017] [<ffffffff8117355d>] __alloc_pages_nodemask+0x7fd/0xb90
[Wed Jun 14 18:03:48 2017] [<ffffffff81078d73>] copy_process.part.25+0x163/0x1610
[Wed Jun 14 18:03:48 2017] [<ffffffff81285ea6>] ? security_file_alloc+0x16/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff811e07de>] ? alloc_file+0x1e/0xf0
[Wed Jun 14 18:03:48 2017] [<ffffffff8107a401>] do_fork+0xe1/0x320
[Wed Jun 14 18:03:48 2017] [<ffffffff811fcad7>] ? __fd_install+0x47/0x60
[Wed Jun 14 18:03:48 2017] [<ffffffff8107a6c6>] SyS_clone+0x16/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff81645c59>] stub_clone+0x69/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffff81645909>] ? system_call_fastpath+0x16/0x1b
[Wed Jun 14 18:03:48 2017] INFO: task zabbix_agentd:1595 blocked for more than 120 seconds.
[Wed Jun 14 18:03:48 2017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Wed Jun 14 18:03:48 2017] zabbix_agentd D ffffffffffffffff 0 1595 1572 0x00000084
[Wed Jun 14 18:03:48 2017] ffff881fceb935a0 0000000000000086 ffff881fccdadc00 ffff881fceb93fd8
[Wed Jun 14 18:03:48 2017] ffff881fceb93fd8 ffff881fceb93fd8 ffff881fccdadc00 ffff881fceb936e8
[Wed Jun 14 18:03:48 2017] ffff881fceb936f0 7fffffffffffffff ffff881fccdadc00 ffffffffffffffff
[Wed Jun 14 18:03:48 2017] Call Trace:
[Wed Jun 14 18:03:48 2017] [<ffffffff8163a909>] schedule+0x29/0x70
[Wed Jun 14 18:03:48 2017] [<ffffffff816385f9>] schedule_timeout+0x209/0x2d0
[Wed Jun 14 18:03:48 2017] [<ffffffff8101cd15>] ? native_sched_clock+0x35/0x80
[Wed Jun 14 18:03:48 2017] [<ffffffff8101cd69>] ? sched_clock+0x9/0x10
[Wed Jun 14 18:03:48 2017] [<ffffffff810bb685>] ? sched_clock_cpu+0x85/0xc0
[Wed Jun 14 18:03:48 2017] [<ffffffff810b8a66>] ? try_to_wake_up+0x1b6/0x300
[Wed Jun 14 18:03:48 2017] [<ffffffff8163acd6>] wait_for_completion+0x116/0x170
[Wed Jun 14 18:03:48 2017] [<ffffffff810b8c10>] ? wake_up_state+0x20/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff8109e7ac>] flush_work+0xfc/0x1c0
[Wed Jun 14 18:03:48 2017] [<ffffffff8109a7e0>] ? move_linked_works+0x90/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c643a>] xlog_cil_force_lsn+0x8a/0x210 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c4a7e>] _xfs_log_force_lsn+0x6e/0x2f0 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff81631ada>] ? __slab_free+0x10e/0x277
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c4d2e>] xfs_log_force_lsn+0x2e/0x90 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02b6fc9>] ? xfs_iunpin_wait+0x19/0x20 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02b34b7>] __xfs_iunpin_wait+0xa7/0x150 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff810a6b60>] ? wake_atomic_t_function+0x40/0x40
[Wed Jun 14 18:03:48 2017] [<ffffffffa02b6fc9>] xfs_iunpin_wait+0x19/0x20 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02ab84c>] xfs_reclaim_inode+0x8c/0x350 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02abd77>] xfs_reclaim_inodes_ag+0x267/0x390 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02ac923>] xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02bb895>] xfs_fs_free_cached_objects+0x15/0x20 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff811e0c28>] prune_super+0xe8/0x170
[Wed Jun 14 18:03:48 2017] [<ffffffff8117c4c5>] shrink_slab+0x165/0x300
[Wed Jun 14 18:03:48 2017] [<ffffffff811d5e51>] ? vmpressure+0x21/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffff8117f642>] do_try_to_free_pages+0x3c2/0x4e0
[Wed Jun 14 18:03:48 2017] [<ffffffff8117f85c>] try_to_free_pages+0xfc/0x180
[Wed Jun 14 18:03:48 2017] [<ffffffff8117355d>] __alloc_pages_nodemask+0x7fd/0xb90
[Wed Jun 14 18:03:48 2017] [<ffffffff81078d73>] copy_process.part.25+0x163/0x1610
[Wed Jun 14 18:03:48 2017] [<ffffffff81285ea6>] ? security_file_alloc+0x16/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff811e07de>] ? alloc_file+0x1e/0xf0
[Wed Jun 14 18:03:48 2017] [<ffffffff8107a401>] do_fork+0xe1/0x320
[Wed Jun 14 18:03:48 2017] [<ffffffff811fcad7>] ? __fd_install+0x47/0x60
[Wed Jun 14 18:03:48 2017] [<ffffffff8107a6c6>] SyS_clone+0x16/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff81645c59>] stub_clone+0x69/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffff81645909>] ? system_call_fastpath+0x16/0x1b
[Wed Jun 14 18:03:48 2017] INFO: task zabbix_agentd:1598 blocked for more than 120 seconds.
[Wed Jun 14 18:03:48 2017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Wed Jun 14 18:03:48 2017] zabbix_agentd D ffffffffffffffff 0 1598 1572 0x00000084
[Wed Jun 14 18:03:48 2017] ffff881fbec535a0 0000000000000086 ffff881fbec21700 ffff881fbec53fd8
[Wed Jun 14 18:03:48 2017] ffff881fbec53fd8 ffff881fbec53fd8 ffff881fbec21700 ffff881fbec536e8
[Wed Jun 14 18:03:48 2017] ffff881fbec536f0 7fffffffffffffff ffff881fbec21700 ffffffffffffffff
[Wed Jun 14 18:03:48 2017] Call Trace:
[Wed Jun 14 18:03:48 2017] [<ffffffff8163a909>] schedule+0x29/0x70
[Wed Jun 14 18:03:48 2017] [<ffffffff816385f9>] schedule_timeout+0x209/0x2d0
[Wed Jun 14 18:03:48 2017] [<ffffffff810c4a74>] ? find_busiest_group+0x114/0x910
[Wed Jun 14 18:03:48 2017] [<ffffffff8101cd15>] ? native_sched_clock+0x35/0x80
[Wed Jun 14 18:03:48 2017] [<ffffffff8101cd69>] ? sched_clock+0x9/0x10
[Wed Jun 14 18:03:48 2017] [<ffffffff810bb685>] ? sched_clock_cpu+0x85/0xc0
[Wed Jun 14 18:03:48 2017] [<ffffffff810b8a66>] ? try_to_wake_up+0x1b6/0x300
[Wed Jun 14 18:03:48 2017] [<ffffffff8163acd6>] wait_for_completion+0x116/0x170
[Wed Jun 14 18:03:48 2017] [<ffffffff810b8c10>] ? wake_up_state+0x20/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff8109e7ac>] flush_work+0xfc/0x1c0
[Wed Jun 14 18:03:48 2017] [<ffffffff8109a7e0>] ? move_linked_works+0x90/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c643a>] xlog_cil_force_lsn+0x8a/0x210 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff811717eb>] ? free_pcppages_bulk+0x34b/0x3a0
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c4a7e>] _xfs_log_force_lsn+0x6e/0x2f0 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c4d2e>] xfs_log_force_lsn+0x2e/0x90 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02b6fc9>] ? xfs_iunpin_wait+0x19/0x20 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02b34b7>] __xfs_iunpin_wait+0xa7/0x150 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff810a6b60>] ? wake_atomic_t_function+0x40/0x40
[Wed Jun 14 18:03:48 2017] [<ffffffffa02b6fc9>] xfs_iunpin_wait+0x19/0x20 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02ab84c>] xfs_reclaim_inode+0x8c/0x350 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02abd77>] xfs_reclaim_inodes_ag+0x267/0x390 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff8112483d>] ? call_rcu_sched+0x1d/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff811f4d3d>] ? d_free+0x4d/0x70
[Wed Jun 14 18:03:48 2017] [<ffffffff811f6e80>] ? shrink_dentry_list+0x250/0x480
[Wed Jun 14 18:03:48 2017] [<ffffffffa02ac923>] xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02bb895>] xfs_fs_free_cached_objects+0x15/0x20 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff811e0c28>] prune_super+0xe8/0x170
[Wed Jun 14 18:03:48 2017] [<ffffffff8117c4c5>] shrink_slab+0x165/0x300
[Wed Jun 14 18:03:48 2017] [<ffffffff811d5e51>] ? vmpressure+0x21/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffff8117f642>] do_try_to_free_pages+0x3c2/0x4e0
[Wed Jun 14 18:03:48 2017] [<ffffffff8117f85c>] try_to_free_pages+0xfc/0x180
[Wed Jun 14 18:03:48 2017] [<ffffffff8117355d>] __alloc_pages_nodemask+0x7fd/0xb90
[Wed Jun 14 18:03:48 2017] [<ffffffff81078d73>] copy_process.part.25+0x163/0x1610
[Wed Jun 14 18:03:48 2017] [<ffffffff81285ea6>] ? security_file_alloc+0x16/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff811e07de>] ? alloc_file+0x1e/0xf0
[Wed Jun 14 18:03:48 2017] [<ffffffff8107a401>] do_fork+0xe1/0x320
[Wed Jun 14 18:03:48 2017] [<ffffffff811fcad7>] ? __fd_install+0x47/0x60
[Wed Jun 14 18:03:48 2017] [<ffffffff8107a6c6>] SyS_clone+0x16/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff81645c59>] stub_clone+0x69/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffff81645909>] ? system_call_fastpath+0x16/0x1b
[Wed Jun 14 18:03:48 2017] INFO: task zabbix_agentd:1601 blocked for more than 120 seconds.
[Wed Jun 14 18:03:48 2017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Wed Jun 14 18:03:48 2017] zabbix_agentd D ffffffffffffffff 0 1601 1572 0x00000084
[Wed Jun 14 18:03:48 2017] ffff881fcef035a0 0000000000000082 ffff881fbec23980 ffff881fcef03fd8
[Wed Jun 14 18:03:48 2017] ffff881fcef03fd8 ffff881fcef03fd8 ffff881fbec23980 ffff881fcef036e8
[Wed Jun 14 18:03:48 2017] ffff881fcef036f0 7fffffffffffffff ffff881fbec23980 ffffffffffffffff
[Wed Jun 14 18:03:48 2017] Call Trace:
[Wed Jun 14 18:03:48 2017] [<ffffffff8163a909>] schedule+0x29/0x70
[Wed Jun 14 18:03:48 2017] [<ffffffff816385f9>] schedule_timeout+0x209/0x2d0
[Wed Jun 14 18:03:48 2017] [<ffffffff8101cd15>] ? native_sched_clock+0x35/0x80
[Wed Jun 14 18:03:48 2017] [<ffffffff8101cd69>] ? sched_clock+0x9/0x10
[Wed Jun 14 18:03:48 2017] [<ffffffff810bb685>] ? sched_clock_cpu+0x85/0xc0
[Wed Jun 14 18:03:48 2017] [<ffffffff810b8a66>] ? try_to_wake_up+0x1b6/0x300
[Wed Jun 14 18:03:48 2017] [<ffffffff8163acd6>] wait_for_completion+0x116/0x170
[Wed Jun 14 18:03:48 2017] [<ffffffff810b8c10>] ? wake_up_state+0x20/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff8109e7ac>] flush_work+0xfc/0x1c0
[Wed Jun 14 18:03:48 2017] [<ffffffff8109a7e0>] ? move_linked_works+0x90/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c643a>] xlog_cil_force_lsn+0x8a/0x210 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff811717eb>] ? free_pcppages_bulk+0x34b/0x3a0
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c4a7e>] _xfs_log_force_lsn+0x6e/0x2f0 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c4d2e>] xfs_log_force_lsn+0x2e/0x90 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02b6fc9>] ? xfs_iunpin_wait+0x19/0x20 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02b34b7>] __xfs_iunpin_wait+0xa7/0x150 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff810a6b60>] ? wake_atomic_t_function+0x40/0x40
[Wed Jun 14 18:03:48 2017] [<ffffffffa02b6fc9>] xfs_iunpin_wait+0x19/0x20 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02ab84c>] xfs_reclaim_inode+0x8c/0x350 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02abd77>] xfs_reclaim_inodes_ag+0x267/0x390 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02ac923>] xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02bb895>] xfs_fs_free_cached_objects+0x15/0x20 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff811e0c28>] prune_super+0xe8/0x170
[Wed Jun 14 18:03:48 2017] [<ffffffff8117c4c5>] shrink_slab+0x165/0x300
[Wed Jun 14 18:03:48 2017] [<ffffffff811d5e51>] ? vmpressure+0x21/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffff8117f642>] do_try_to_free_pages+0x3c2/0x4e0
[Wed Jun 14 18:03:48 2017] [<ffffffff8117f85c>] try_to_free_pages+0xfc/0x180
[Wed Jun 14 18:03:48 2017] [<ffffffff8117355d>] __alloc_pages_nodemask+0x7fd/0xb90
[Wed Jun 14 18:03:48 2017] [<ffffffff81078d73>] copy_process.part.25+0x163/0x1610
[Wed Jun 14 18:03:48 2017] [<ffffffff81285ea6>] ? security_file_alloc+0x16/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff811e07de>] ? alloc_file+0x1e/0xf0
[Wed Jun 14 18:03:48 2017] [<ffffffff8107a401>] do_fork+0xe1/0x320
[Wed Jun 14 18:03:48 2017] [<ffffffff811fcad7>] ? __fd_install+0x47/0x60
[Wed Jun 14 18:03:48 2017] [<ffffffff8107a6c6>] SyS_clone+0x16/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff81645c59>] stub_clone+0x69/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffff81645909>] ? system_call_fastpath+0x16/0x1b
[Wed Jun 14 18:03:48 2017] INFO: task zabbix_agentd:1609 blocked for more than 120 seconds.
[Wed Jun 14 18:03:48 2017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Wed Jun 14 18:03:48 2017] zabbix_agentd D ffffffffffffffff 0 1609 1572 0x00000084
[Wed Jun 14 18:03:48 2017] ffff881fbee335a0 0000000000000082 ffff881fbec27300 ffff881fbee33fd8
[Wed Jun 14 18:03:48 2017] ffff881fbee33fd8 ffff881fbee33fd8 ffff881fbec27300 ffff881fbee336e8
[Wed Jun 14 18:03:48 2017] ffff881fbee336f0 7fffffffffffffff ffff881fbec27300 ffffffffffffffff
[Wed Jun 14 18:03:48 2017] Call Trace:
[Wed Jun 14 18:03:48 2017] [<ffffffff8163a909>] schedule+0x29/0x70
[Wed Jun 14 18:03:48 2017] [<ffffffff816385f9>] schedule_timeout+0x209/0x2d0
[Wed Jun 14 18:03:48 2017] [<ffffffff8101cd15>] ? native_sched_clock+0x35/0x80
[Wed Jun 14 18:03:48 2017] [<ffffffff8101cd69>] ? sched_clock+0x9/0x10
[Wed Jun 14 18:03:48 2017] [<ffffffff810bb685>] ? sched_clock_cpu+0x85/0xc0
[Wed Jun 14 18:03:48 2017] [<ffffffff810b8a66>] ? try_to_wake_up+0x1b6/0x300
[Wed Jun 14 18:03:48 2017] [<ffffffff8163acd6>] wait_for_completion+0x116/0x170
[Wed Jun 14 18:03:48 2017] [<ffffffff810b8c10>] ? wake_up_state+0x20/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff8109e7ac>] flush_work+0xfc/0x1c0
[Wed Jun 14 18:03:48 2017] [<ffffffff8109a7e0>] ? move_linked_works+0x90/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c643a>] xlog_cil_force_lsn+0x8a/0x210 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff811717eb>] ? free_pcppages_bulk+0x34b/0x3a0
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c4a7e>] _xfs_log_force_lsn+0x6e/0x2f0 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c4d2e>] xfs_log_force_lsn+0x2e/0x90 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02b6fc9>] ? xfs_iunpin_wait+0x19/0x20 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02b34b7>] __xfs_iunpin_wait+0xa7/0x150 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff810a6b60>] ? wake_atomic_t_function+0x40/0x40
[Wed Jun 14 18:03:48 2017] [<ffffffffa02b6fc9>] xfs_iunpin_wait+0x19/0x20 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02ab84c>] xfs_reclaim_inode+0x8c/0x350 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02abd77>] xfs_reclaim_inodes_ag+0x267/0x390 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff8101360b>] ? __switch_to+0x17b/0x4b0
[Wed Jun 14 18:03:48 2017] [<ffffffff8112483d>] ? call_rcu_sched+0x1d/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff811f4d3d>] ? d_free+0x4d/0x70
[Wed Jun 14 18:03:48 2017] [<ffffffff811f6e80>] ? shrink_dentry_list+0x250/0x480
[Wed Jun 14 18:03:48 2017] [<ffffffffa02ac923>] xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02bb895>] xfs_fs_free_cached_objects+0x15/0x20 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff811e0c28>] prune_super+0xe8/0x170
[Wed Jun 14 18:03:48 2017] [<ffffffff8117c4c5>] shrink_slab+0x165/0x300
[Wed Jun 14 18:03:48 2017] [<ffffffff811d5e51>] ? vmpressure+0x21/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffff8117f642>] do_try_to_free_pages+0x3c2/0x4e0
[Wed Jun 14 18:03:48 2017] [<ffffffff8117f85c>] try_to_free_pages+0xfc/0x180
[Wed Jun 14 18:03:48 2017] [<ffffffff8117355d>] __alloc_pages_nodemask+0x7fd/0xb90
[Wed Jun 14 18:03:48 2017] [<ffffffff81078d73>] copy_process.part.25+0x163/0x1610
[Wed Jun 14 18:03:48 2017] [<ffffffff81285ea6>] ? security_file_alloc+0x16/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff811e07de>] ? alloc_file+0x1e/0xf0
[Wed Jun 14 18:03:48 2017] [<ffffffff8107a401>] do_fork+0xe1/0x320
[Wed Jun 14 18:03:48 2017] [<ffffffff811fcad7>] ? __fd_install+0x47/0x60
[Wed Jun 14 18:03:48 2017] [<ffffffff8107a6c6>] SyS_clone+0x16/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff81645c59>] stub_clone+0x69/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffff81645909>] ? system_call_fastpath+0x16/0x1b
[Wed Jun 14 18:03:48 2017] INFO: task java:9265 blocked for more than 120 seconds.
[Wed Jun 14 18:03:48 2017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Wed Jun 14 18:03:48 2017] java D ffffffffffffffff 0 9265 1 0x00000180
[Wed Jun 14 18:03:48 2017] ffff880f94f075a0 0000000000000086 ffff880fe98bdc00 ffff880f94f07fd8
[Wed Jun 14 18:03:48 2017] ffff880f94f07fd8 ffff880f94f07fd8 ffff880fe98bdc00 ffff880f94f076e8
[Wed Jun 14 18:03:48 2017] ffff880f94f076f0 7fffffffffffffff ffff880fe98bdc00 ffffffffffffffff
[Wed Jun 14 18:03:48 2017] Call Trace:
[Wed Jun 14 18:03:48 2017] [<ffffffff8163a909>] schedule+0x29/0x70
[Wed Jun 14 18:03:48 2017] [<ffffffff816385f9>] schedule_timeout+0x209/0x2d0
[Wed Jun 14 18:03:48 2017] [<ffffffff810b5dc9>] ? ttwu_do_wakeup+0x19/0xd0
[Wed Jun 14 18:03:48 2017] [<ffffffff810b5f5d>] ? ttwu_do_activate.constprop.84+0x5d/0x70
[Wed Jun 14 18:03:48 2017] [<ffffffff810b8a66>] ? try_to_wake_up+0x1b6/0x300
[Wed Jun 14 18:03:48 2017] [<ffffffff8163acd6>] wait_for_completion+0x116/0x170
[Wed Jun 14 18:03:48 2017] [<ffffffff810b8c10>] ? wake_up_state+0x20/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff8109e7ac>] flush_work+0xfc/0x1c0
[Wed Jun 14 18:03:48 2017] [<ffffffff8109a7e0>] ? move_linked_works+0x90/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c643a>] xlog_cil_force_lsn+0x8a/0x210 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa0295e47>] ? xfs_iext_bno_to_ext+0xa7/0x1a0 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff811717eb>] ? free_pcppages_bulk+0x34b/0x3a0
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c4a7e>] _xfs_log_force_lsn+0x6e/0x2f0 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff810bd008>] ? __enqueue_entity+0x78/0x80
[Wed Jun 14 18:03:48 2017] [<ffffffff810c34b7>] ? enqueue_entity+0x237/0x890
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c4d2e>] xfs_log_force_lsn+0x2e/0x90 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02b6fc9>] ? xfs_iunpin_wait+0x19/0x20 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02b34b7>] __xfs_iunpin_wait+0xa7/0x150 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff810a6b60>] ? wake_atomic_t_function+0x40/0x40
[Wed Jun 14 18:03:48 2017] [<ffffffffa02b6fc9>] xfs_iunpin_wait+0x19/0x20 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02ab84c>] xfs_reclaim_inode+0x8c/0x350 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02abd77>] xfs_reclaim_inodes_ag+0x267/0x390 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff810b8bd3>] ? wake_up_process+0x23/0x40
[Wed Jun 14 18:03:48 2017] [<ffffffffa02ac923>] xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02bb895>] xfs_fs_free_cached_objects+0x15/0x20 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff811e0c28>] prune_super+0xe8/0x170
[Wed Jun 14 18:03:48 2017] [<ffffffff8117c4c5>] shrink_slab+0x165/0x300
[Wed Jun 14 18:03:48 2017] [<ffffffff811d5e51>] ? vmpressure+0x21/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffff8117f642>] do_try_to_free_pages+0x3c2/0x4e0
[Wed Jun 14 18:03:48 2017] [<ffffffff8117f85c>] try_to_free_pages+0xfc/0x180
[Wed Jun 14 18:03:48 2017] [<ffffffff8117355d>] __alloc_pages_nodemask+0x7fd/0xb90
[Wed Jun 14 18:03:48 2017] [<ffffffff81078d73>] copy_process.part.25+0x163/0x1610
[Wed Jun 14 18:03:48 2017] [<ffffffff8107a401>] do_fork+0xe1/0x320
[Wed Jun 14 18:03:48 2017] [<ffffffff81122b33>] ? __secure_computing+0x73/0x240
[Wed Jun 14 18:03:48 2017] [<ffffffff8110b796>] ? __audit_syscall_exit+0x1e6/0x280
[Wed Jun 14 18:03:48 2017] [<ffffffff8107a6c6>] SyS_clone+0x16/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff81645c59>] stub_clone+0x69/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffff81645b12>] ? tracesys+0xdd/0xe2
[Wed Jun 14 18:03:48 2017] INFO: task java:28683 blocked for more than 120 seconds.
[Wed Jun 14 18:03:48 2017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Wed Jun 14 18:03:48 2017] java D ffffffffffffffff 0 28683 1 0x00000180
[Wed Jun 14 18:03:48 2017] ffff880f6d8ebc20 0000000000000086 ffff881fc8ffdc00 ffff880f6d8ebfd8
[Wed Jun 14 18:03:48 2017] ffff880f6d8ebfd8 ffff880f6d8ebfd8 ffff881fc8ffdc00 ffff880f6d8ebd68
[Wed Jun 14 18:03:48 2017] ffff880f6d8ebd70 7fffffffffffffff ffff881fc8ffdc00 ffffffffffffffff
[Wed Jun 14 18:03:48 2017] Call Trace:
[Wed Jun 14 18:03:48 2017] [<ffffffff8163a909>] schedule+0x29/0x70
[Wed Jun 14 18:03:48 2017] [<ffffffff816385f9>] schedule_timeout+0x209/0x2d0
[Wed Jun 14 18:03:48 2017] [<ffffffff8101cd15>] ? native_sched_clock+0x35/0x80
[Wed Jun 14 18:03:48 2017] [<ffffffff8101cd69>] ? sched_clock+0x9/0x10
[Wed Jun 14 18:03:48 2017] [<ffffffff810bb685>] ? sched_clock_cpu+0x85/0xc0
[Wed Jun 14 18:03:48 2017] [<ffffffff810b8a66>] ? try_to_wake_up+0x1b6/0x300
[Wed Jun 14 18:03:48 2017] [<ffffffff8163acd6>] wait_for_completion+0x116/0x170
[Wed Jun 14 18:03:48 2017] [<ffffffff810b8c10>] ? wake_up_state+0x20/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff8109e7ac>] flush_work+0xfc/0x1c0
[Wed Jun 14 18:03:48 2017] [<ffffffff8109a7e0>] ? move_linked_works+0x90/0x90
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c643a>] xlog_cil_force_lsn+0x8a/0x210 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffffa02c4a7e>] _xfs_log_force_lsn+0x6e/0x2f0 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff81639b12>] ? down_read+0x12/0x30
[Wed Jun 14 18:03:48 2017] [<ffffffffa02a74d0>] xfs_file_fsync+0x1b0/0x200 [xfs]
[Wed Jun 14 18:03:48 2017] [<ffffffff8120f975>] do_fsync+0x65/0xa0
[Wed Jun 14 18:03:48 2017] [<ffffffff8120fc63>] SyS_fdatasync+0x13/0x20
[Wed Jun 14 18:03:48 2017] [<ffffffff81645b12>] tracesys+0xdd/0xe2
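
Every call trace in the log above passes through `flush_work` and `xlog_cil_force_lsn` in the xfs module, which suggests the blocked tasks are all waiting on the same XFS log flush. A rough sketch of how a saved dmesg dump could be summarized to check this; the regular expressions are written against the line format shown in this post, and the function name is made up for illustration:

```python
import re
from collections import Counter

# Header printed by the kernel's hung-task detector, e.g.
# "INFO: task java:9265 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)
# A stack frame such as "[<ffffffffa02c643a>] xlog_cil_force_lsn+0x8a/0x210 [xfs]".
# Frames marked with "?" (unreliable) do not match and are skipped.
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] (?P<sym>[\w.]+)\+0x")


def summarize_hung_tasks(dmesg_text):
    """Return (tasks, frames).

    tasks  counts hung-task reports per command name.
    frames counts, per kernel symbol, how many traces contain it.
    Frames that appear before the first header are ignored.
    """
    tasks = Counter()
    frames = Counter()
    current = None  # symbols of the trace being read; None before any header
    for line in dmesg_text.splitlines():
        header = HUNG_RE.search(line)
        if header:
            if current is not None:
                frames.update(current)
            current = set()
            tasks[header.group("comm")] += 1
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            current.add(frame.group("sym"))
    if current is not None:
        frames.update(current)
    return tasks, frames
```

A symbol whose count in `frames` equals the total number of reports in `tasks` is shared by every trace, which makes it a natural starting point when searching for known kernel or filesystem issues.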
The log contains swap-related entries, even though I use `bootstrap.memory_lock: true` and `ulimit -l unlimited` to disable swapping.
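
Two points may help when reading those entries. `kswapd` is the kernel's background page-reclaim thread and also reclaims page cache and slab objects (the trace shows it in `shrink_slab`), so its presence does not by itself mean anything was swapped. And `bootstrap.memory_lock` locks the Elasticsearch process's own memory; it does not turn swap off system-wide. A small sketch, assuming Linux and a known ES process ID, for checking how much of a process is locked or swapped; the helper names are hypothetical:

```python
def parse_status_kb(status_text, fields=("VmLck", "VmSwap")):
    """Extract selected kB-valued fields from /proc/<pid>/status text."""
    values = {}
    for line in status_text.splitlines():
        name, _, rest = line.partition(":")
        if name in fields:
            # Lines look like "VmLck:\t 1024 kB"; keep the number only.
            values[name] = int(rest.split()[0])
    return values


def memory_lock_report(pid):
    """Read VmLck and VmSwap (in kB) for a running process."""
    with open(f"/proc/{pid}/status") as fh:
        return parse_status_kb(fh.read())
```

A `VmLck` value close to the JVM heap size together with `VmSwap` at 0 would indicate that the memory lock took effect for that process.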

kennywu76 - Wood

Upvoted by: famoss, AiToMaKoTo

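We have run into similar situations, in two scenarios. In the first, the kernel had the Linux NUMA optimization enabled by default, and turning it off fixed the problem. In the second, we were using XFS at the time, and switching back to ext4 fixed it. We never found the root cause of the second case; we switched back simply because things had always been stable on ext4.
 
First check whether turning off NUMA solves your problem; if it does not, then consider whether XFS is the issue.
 
How to turn off NUMA (more precisely, NUMA zone reclaim):
Add `vm.zone_reclaim_mode = 0` to /etc/sysctl.conf, then run `sysctl -p` to apply the setting.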
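
Before and after changing the setting, it can be useful to confirm what the kernel is actually using. A minimal sketch, assuming the standard Linux `/proc/sys/vm/zone_reclaim_mode` and `/sys/devices/system/node` paths; the bit meanings follow the kernel's sysctl documentation, and the function names are made up for this example:

```python
import glob

# Bit flags of vm.zone_reclaim_mode as described in the kernel sysctl docs.
FLAGS = {
    1: "zone reclaim on",
    2: "zone reclaim writes dirty pages out",
    4: "zone reclaim swaps pages",
}


def describe_zone_reclaim(mode):
    """Return the list of behaviours enabled by a zone_reclaim_mode value."""
    return [text for bit, text in FLAGS.items() if mode & bit]


def read_zone_reclaim_mode(path="/proc/sys/vm/zone_reclaim_mode"):
    """Read the current value; 0 means zone reclaim is off."""
    with open(path) as fh:
        return int(fh.read().strip())


def numa_node_count(sys_root="/sys/devices/system/node"):
    """Count NUMA nodes exposed by the kernel (1 on a non-NUMA machine)."""
    return len(glob.glob(f"{sys_root}/node[0-9]*"))
```

If `describe_zone_reclaim(read_zone_reclaim_mode())` returns an empty list, zone reclaim is already disabled and the first scenario above is less likely to be the cause.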

medcl - Hunting tigers tonight.

Which ES version is this? When the problem occurs, can the node still respond to requests?

jaxzhai

Has this problem been solved?