
Why does TieredMergePolicy set skew to its best value when hitTooLarge is true?

Lucene | Author: emmning | Posted 2023-08-16 | Views: 1569

In TieredMergePolicy, why is skew set to its best value when hitTooLarge is true? I also don't understand what "cascade" means in the comment. The code is as follows:
// Roughly measure "skew" of the merge, i.e. how
// "balanced" the merge is (whether the segments are
// about the same size), which can range from
// 1.0/numSegsBeingMerged (good) to 1.0 (poor). Heavily
// lopsided merges (skew near 1.0) is no good; it means
// O(N^2) merge cost over time:
final double skew;
if (hitTooLarge) {
  // Pretend the merge has perfect skew; skew doesn't
  // matter in this case because this merge will not
  // "cascade" and so it cannot lead to N^2 merge cost
  // over time:
  final int mergeFactor = (int) Math.min(maxMergeAtOnce, segsPerTier);
  skew = 1.0 / mergeFactor;
} else {
  skew =
      ((double) floorSize(segmentsSizes.get(candidate.get(0)).sizeInBytes))
          / totAfterMergeBytesFloored;
}

Charele - Cisco4321


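hitTooLarge means that the total size of these segments came very close to exceeding the limit (but it will not actually exceed it).

"Pretend the merge has perfect skew;"
That is, the code pretends this merge is the best possible one and gives it a small skew, which in turn gives it a smaller score.

As a result, these segments are merged preferentially.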
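
To make the effect concrete, here is a minimal standalone sketch of the skew calculation from the quoted snippet. It is not the Lucene source: the class name SkewSketch, the simplified floorSize helper, and the sample sizes are all hypothetical, and it assumes the candidate list is sorted largest segment first (which is what the "1.0/numSegsBeingMerged (good) to 1.0 (poor)" range in the comment implies). On "cascade", my understanding is that a merge which hits the maximum merged segment size produces a segment too large to be picked for further merges, so the same bytes are not rewritten again and again; treat that reading as an assumption rather than a statement from the Lucene authors.

```java
import java.util.List;

public class SkewSketch {

    // Hypothetical stand-in for floorSize(): tiny segments are rounded
    // up to a floor so they do not distort the ratio.
    public static long floorSize(long bytes, long floorBytes) {
        return Math.max(floorBytes, bytes);
    }

    // Mirrors the branch structure of the quoted snippet.
    // sizesDesc is assumed to be sorted largest first.
    public static double skew(List<Long> sizesDesc, long floorBytes,
                              boolean hitTooLarge,
                              int maxMergeAtOnce, double segsPerTier) {
        if (hitTooLarge) {
            // Pretend the merge is perfectly balanced.
            int mergeFactor = (int) Math.min(maxMergeAtOnce, segsPerTier);
            return 1.0 / mergeFactor;
        }
        long totalFloored = 0;
        for (long size : sizesDesc) {
            totalFloored += floorSize(size, floorBytes);
        }
        // Share of the merged output contributed by the largest segment.
        return (double) floorSize(sizesDesc.get(0), floorBytes) / totalFloored;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long floor = 2 * mb;

        List<Long> balanced = List.of(100 * mb, 100 * mb, 100 * mb, 100 * mb);
        List<Long> lopsided = List.of(4000 * mb, 10 * mb, 10 * mb, 10 * mb);

        // Four equal segments: each contributes a quarter of the output.
        System.out.println("balanced           = " + skew(balanced, floor, false, 10, 10.0));
        // One huge segment plus small ones: skew is close to 1.0 (poor).
        System.out.println("lopsided           = " + skew(lopsided, floor, false, 10, 10.0));
        // Same lopsided candidate, but treated as hitTooLarge: skew is forced to 1/mergeFactor.
        System.out.println("lopsided, tooLarge = " + skew(lopsided, floor, true, 10, 10.0));
    }
}
```

With hitTooLarge set, even the lopsided candidate gets the same skew a perfectly balanced merge of mergeFactor segments would get, so skew no longer penalizes it when candidates are ranked.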
