试试搜索一下吧

spark或java向es5.4.1定入数据,偶尔会出现几分钟写入数据很慢,需要一分钟都有可能

匿名 | 发布于2017年11月28日 | 阅读数:3463

1.环境和数据量      我有5个节点es5.4.1 jdk1.8_111 centos6.8_x64  每个节点分配30Gmem  数据量90多亿  3.3T,数据量每天还在增加,  存放数据的每个盘基本上只占用30%,最多一个盘占用将近45%。
      最近出现了几次spark向es写数据一次将近一分钟的情况(平时最慢3~4秒就写完了),这种现象能持续好几分钟,这个时间段用TransportClient java api 打印日志写一条数据耗时大几十秒的情况也很多。当时出现问题时的gc日志我打印了出来如下,这是其中一个节点的日志,麻烦分析一下原因所在:
 
----gc打印的日志(这个时间段不管用spark或TransportClient,写入批量或一条都会出现耗时大几十秒的情况)
2017-11-27T20:01:21.168+0800: 4138934.958: Total time for which application threads were stopped: 0.0743846 seconds, Stopping threads took: 0.0001456 seconds
2017-11-27T20:02:12.614+0800: 4138986.403: [GC pause (G1 Evacuation Pause) (young)
Desired survivor size 478150656 bytes, new threshold 15 (max 15)
- age   1:  148576368 bytes,  148576368 total
- age   2:   23176640 bytes,  171753008 total
- age   3:    2881360 bytes,  174634368 total
- age   4:    2440568 bytes,  177074936 total
- age   5:    3096184 bytes,  180171120 total
- age   6:    1738152 bytes,  181909272 total
- age   7:    2012808 bytes,  183922080 total
- age   8:    5055568 bytes,  188977648 total
, 0.0675751 secs]
   [Parallel Time: 52.8 ms, GC Workers: 23]
      [GC Worker Start (ms): Min: 4138986404.7, Avg: 4138986405.1, Max: 4138986405.3, Diff: 0.5]
      [Ext Root Scanning (ms): Min: 2.1, Avg: 2.4, Max: 3.9, Diff: 1.8, Sum: 54.2]
      [Update RS (ms): Min: 18.7, Avg: 20.4, Max: 21.1, Diff: 2.5, Sum: 468.5]
         [Processed Buffers: Min: 37, Avg: 44.6, Max: 55, Diff: 18, Sum: 1025]
      [Scan RS (ms): Min: 8.4, Avg: 9.1, Max: 9.3, Diff: 0.9, Sum: 208.6]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [Object Copy (ms): Min: 20.1, Avg: 20.4, Max: 20.6, Diff: 0.5, Sum: 468.6]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
         [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 23]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 2.6]
      [GC Worker Total (ms): Min: 52.0, Avg: 52.3, Max: 52.6, Diff: 0.6, Sum: 1202.7]
      [GC Worker End (ms): Min: 4138986457.3, Avg: 4138986457.3, Max: 4138986457.4, Diff: 0.2]
   [Code Root Fixup: 0.1 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 1.7 ms]
   [Other: 13.0 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 5.8 ms]
      [Ref Enq: 1.1 ms]
      [Redirty Cards: 1.4 ms]
      [Humongous Register: 0.5 ms]
      [Humongous Reclaim: 0.4 ms]
      [Free CSet: 2.4 ms]
   [Eden: 7544.0M(7008.0M)->0.0B(1280.0M) Survivors: 264.0M->256.0M Heap: 17.7G(30.0G)->10.1G(30.0G)]
 [Times: user=1.22 sys=0.00, real=0.06 secs] 
2017-11-27T20:02:12.682+0800: 4138986.471: Total time for which application threads were stopped: 0.1197028 seconds, Stopping threads took: 0.0475207 seconds
2017-11-27T20:02:23.125+0800: 4138996.915: [GC pause (G1 Evacuation Pause) (young)
Desired survivor size 100663296 bytes, new threshold 1 (max 15)
- age   1:  141409600 bytes,  141409600 total
- age   2:   19355672 bytes,  160765272 total
- age   3:   16002864 bytes,  176768136 total
- age   4:    2375016 bytes,  179143152 total
- age   5:    1663696 bytes,  180806848 total
- age   6:    3007056 bytes,  183813904 total
- age   7:    1707872 bytes,  185521776 total
- age   8:     871392 bytes,  186393168 total
- age   9:    4727024 bytes,  191120192 total
, 0.0671804 secs]
   [Parallel Time: 59.9 ms, GC Workers: 23]
      [GC Worker Start (ms): Min: 4138996915.8, Avg: 4138996916.2, Max: 4138996916.5, Diff: 0.7]
      [Ext Root Scanning (ms): Min: 1.9, Avg: 2.2, Max: 3.8, Diff: 1.9, Sum: 51.0]
      [Update RS (ms): Min: 15.5, Avg: 16.9, Max: 17.4, Diff: 1.9, Sum: 388.0]
         [Processed Buffers: Min: 24, Avg: 37.7, Max: 69, Diff: 45, Sum: 866]
      [Scan RS (ms): Min: 6.3, Avg: 6.9, Max: 7.1, Diff: 0.8, Sum: 157.7]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Object Copy (ms): Min: 33.0, Avg: 33.2, Max: 33.5, Diff: 0.5, Sum: 764.7]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.9]
         [Termination Attempts: Min: 1, Avg: 4.9, Max: 9, Diff: 8, Sum: 112]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 1.6]
      [GC Worker Total (ms): Min: 59.0, Avg: 59.3, Max: 59.7, Diff: 0.7, Sum: 1363.9]
      [GC Worker End (ms): Min: 4138996975.5, Avg: 4138996975.5, Max: 4138996975.5, Diff: 0.1]
   [Code Root Fixup: 0.1 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 0.9 ms]
   [Other: 6.3 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 3.0 ms]
      [Ref Enq: 0.4 ms]
      [Redirty Cards: 0.5 ms]
      [Humongous Register: 0.5 ms]
      [Humongous Reclaim: 0.2 ms]
      [Free CSet: 0.7 ms]
   [Eden: 1280.0M(1280.0M)->0.0B(1400.0M) Survivors: 256.0M->136.0M Heap: 11.4G(30.0G)->10.1G(30.0G)]
 [Times: user=1.15 sys=0.02, real=0.06 secs] 
2017-11-27T20:02:23.192+0800: 4138996.982: Total time for which application threads were stopped: 0.0727085 seconds, Stopping threads took: 0.0001894 seconds
2017-11-27T20:02:34.737+0800: 4139008.527: [GC pause (G1 Evacuation Pause) (young)
Desired survivor size 100663296 bytes, new threshold 15 (max 15)
- age   1:   70754784 bytes,   70754784 total
, 0.0483195 secs]
   [Parallel Time: 43.1 ms, GC Workers: 23]
      [GC Worker Start (ms): Min: 4139008528.0, Avg: 4139008528.3, Max: 4139008528.6, Diff: 0.6]
      [Ext Root Scanning (ms): Min: 1.8, Avg: 2.1, Max: 3.6, Diff: 1.8, Sum: 48.3]
      [Update RS (ms): Min: 28.6, Avg: 30.0, Max: 30.2, Diff: 1.6, Sum: 689.5]
         [Processed Buffers: Min: 32, Avg: 40.1, Max: 54, Diff: 22, Sum: 923]
      [Scan RS (ms): Min: 0.3, Avg: 0.4, Max: 0.5, Diff: 0.2, Sum: 9.6]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Object Copy (ms): Min: 10.0, Avg: 10.1, Max: 10.2, Diff: 0.2, Sum: 231.7]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
         [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 23]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 1.4]
      [GC Worker Total (ms): Min: 42.3, Avg: 42.6, Max: 43.0, Diff: 0.7, Sum: 980.7]
      [GC Worker End (ms): Min: 4139008570.9, Avg: 4139008570.9, Max: 4139008571.0, Diff: 0.1]
   [Code Root Fixup: 0.1 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 0.6 ms]
   [Other: 4.5 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 1.6 ms]
      [Ref Enq: 0.1 ms]
      [Redirty Cards: 0.5 ms]
      [Humongous Register: 0.5 ms]
      [Humongous Reclaim: 0.1 ms]
      [Free CSet: 0.4 ms]
   [Eden: 1400.0M(1400.0M)->0.0B(1960.0M) Survivors: 136.0M->144.0M Heap: 11.4G(30.0G)->10.1G(30.0G)]
 [Times: user=0.88 sys=0.01, real=0.04 secs] 
2017-11-27T20:02:34.786+0800: 4139008.576: Total time for which application threads were stopped: 0.0534397 seconds, Stopping threads took: 0.0001265 seconds
2017-11-27T20:02:48.884+0800: 4139022.674: [GC pause (G1 Evacuation Pause) (young)
Desired survivor size 138412032 bytes, new threshold 15 (max 15)
- age   1:   65406616 bytes,   65406616 total
- age   2:   33437272 bytes,   98843888 total
, 0.0426738 secs]
   [Parallel Time: 36.6 ms, GC Workers: 23]
      [GC Worker Start (ms): Min: 4139022674.9, Avg: 4139022675.1, Max: 4139022675.3, Diff: 0.3]
      [Ext Root Scanning (ms): Min: 1.6, Avg: 1.8, Max: 3.5, Diff: 1.9, Sum: 42.4]
      [Update RS (ms): Min: 13.0, Avg: 14.4, Max: 14.9, Diff: 1.9, Sum: 332.3]
         [Processed Buffers: Min: 20, Avg: 34.3, Max: 48, Diff: 28, Sum: 790]
      [Scan RS (ms): Min: 3.7, Avg: 4.2, Max: 4.4, Diff: 0.7, Sum: 97.6]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Object Copy (ms): Min: 15.5, Avg: 15.6, Max: 15.8, Diff: 0.3, Sum: 359.5]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.6]
         [Termination Attempts: Min: 1, Avg: 12.2, Max: 18, Diff: 17, Sum: 281]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.1, Sum: 1.9]
      [GC Worker Total (ms): Min: 36.1, Avg: 36.3, Max: 36.5, Diff: 0.3, Sum: 834.4]
      [GC Worker End (ms): Min: 4139022711.3, Avg: 4139022711.4, Max: 4139022711.4, Diff: 0.1]
   [Code Root Fixup: 0.1 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 0.9 ms]
   [Other: 5.1 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 2.2 ms]
      [Ref Enq: 0.2 ms]
      [Redirty Cards: 0.4 ms]
      [Humongous Register: 0.4 ms]
      [Humongous Reclaim: 0.2 ms]
      [Free CSet: 0.8 ms]
   [Eden: 1960.0M(1960.0M)->0.0B(2944.0M) Survivors: 144.0M->192.0M Heap: 12.1G(30.0G)->10.1G(30.0G)]

--------------正常往es添加查询数据时gc日志------------------------
2017-11-27T20:10:03.956+0800: 4128827.774: Total time for which application threads were stopped: 0.1576661 seconds, Stopping threads took: 0.0777386 seconds
2017-11-27T20:10:34.388+0800: 4128858.206: [GC pause (G1 Evacuation Pause) (young)
Desired survivor size 1019215872 bytes, new threshold 15 (max 15)
- age   1:  107202096 bytes,  107202096 total
- age   2:   20921384 bytes,  128123480 total
- age   3:   15149480 bytes,  143272960 total
- age   4:   15647120 bytes,  158920080 total
- age   5:   11337232 bytes,  170257312 total
- age   6:   19942624 bytes,  190199936 total
- age   7:    8578424 bytes,  198778360 total
- age   8:    8120592 bytes,  206898952 total
- age   9:    3762040 bytes,  210660992 total
- age  10:     538104 bytes,  211199096 total
- age  11:    2543400 bytes,  213742496 total
- age  12:    1056320 bytes,  214798816 total
- age  13:    1324584 bytes,  216123400 total
- age  14:    1740904 bytes,  217864304 total
- age  15:    2928800 bytes,  220793104 total
, 0.0735332 secs]
   [Parallel Time: 55.5 ms, GC Workers: 23]
      [GC Worker Start (ms): Min: 4128858207.9, Avg: 4128858208.1, Max: 4128858208.3, Diff: 0.4]
      [Ext Root Scanning (ms): Min: 1.6, Avg: 1.9, Max: 3.8, Diff: 2.1, Sum: 43.6]
      [Update RS (ms): Min: 16.0, Avg: 17.9, Max: 18.6, Diff: 2.6, Sum: 411.9]
         [Processed Buffers: Min: 33, Avg: 42.0, Max: 51, Diff: 18, Sum: 965]
      [Scan RS (ms): Min: 10.6, Avg: 11.1, Max: 11.3, Diff: 0.6, Sum: 255.5]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
      [Object Copy (ms): Min: 23.7, Avg: 23.9, Max: 24.1, Diff: 0.4, Sum: 549.4]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
         [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 23]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.2, Sum: 3.2]
      [GC Worker Total (ms): Min: 54.7, Avg: 55.0, Max: 55.2, Diff: 0.5, Sum: 1263.9]
      [GC Worker End (ms): Min: 4128858263.0, Avg: 4128858263.1, Max: 4128858263.2, Diff: 0.2]
   [Code Root Fixup: 0.1 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 2.1 ms]
   [Other: 15.8 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 6.1 ms]
      [Ref Enq: 1.3 ms]
      [Redirty Cards: 0.8 ms]
      [Humongous Register: 0.6 ms]
      [Humongous Reclaim: 0.3 ms]
      [Free CSet: 4.6 ms]
   [Eden: 14.9G(14.9G)->0.0B(14.8G) Survivors: 264.0M->304.0M Heap: 24.8G(30.0G)->10025.8M(30.0G)]
 [Times: user=1.30 sys=0.00, real=0.07 secs] 
2017-11-27T20:10:34.462+0800: 4128858.280: Total time for which application threads were stopped: 0.0854734 seconds, Stopping threads took: 0.0067541 seconds
已邀请:
匿名用户

匿名用户

赞同来自:

补充一下,当时查看集群和分片状态都是正常的。

要回复问题请先登录注册