## Elasticsearch Search

#### Matching problems with digits and letters

Elasticsearch • xinfanwang replied • 4 followers • 5 replies • 751 views • 2017-09-01 17:07

#### Query efficiency of the Elasticsearch Python API

Elasticsearch • 丶Dik1s asked • 1 follower • 0 replies • 118 views • 2017-09-01 10:50

#### Keeping data in ES consistent with the database

Elasticsearch • laoyang360 replied • 3 followers • 2 replies • 259 views • 2017-08-28 07:42

#### elasticsearch-hadoop: importing data from Hive into ES always hits a version conflict?

Elasticsearch • medcl replied • 2 followers • 1 reply • 296 views • 2017-08-09 17:30

#### Lucene 6: applications of the BKD tree index

Elasticsearch • keehang posted an article • 0 comments • 306 views • 2017-08-04 10:20

https://www.elastic.co/blog/lucene-points-6.0

Block k-d trees are a simple yet powerful data structure. At index time, they are built by recursively partitioning the full space of N-dimensional points to be indexed into smaller and smaller rectangular cells, splitting equally along the widest ranging dimension at each step of the recursion. However, unlike an ordinary k-d tree, a block k-d tree stops recursing once there are fewer than a pre-specified (1024 in our case, by default) number of points in the cell.

At that point, all points within that cell are written into one leaf block on disk and the starting file-pointer for that block is saved into an in-heap binary tree structure. In the 1D case, this is simply a full sort of all values, divided into adjacent leaf blocks. There are k-d tree variants that can support removing values, and rebalancing, but Lucene does not need these operations because of its write-once per-segment design.

At search time, the same recursion takes place, testing at each level whether the requested query shape intersects the left or right sub-tree of each dimensional split, and recursing if so. In the 1D case, the query shape is simply a numeric range, whereas in the 2D and 3D cases it is a geo-spatial shape (circle, ring, rectangle, polygon, cube, etc.).

Test set: 100 million simulated records

0," nnrIuS","raet","lnsr","inu ","saia",83.405273,73.302012,3991,24,"N"," usA","airport","rra i"

1,"omlritp","aaVe","y Mu","AaVV","NMc ",15.459643,-20.826241,2627,54,"a","eemo","airport","MaArp"

2,"kyaneMr","iasm","raAA"," tnt","inls",16.606066,38.663728,2761,53,"o","arIi","airport","uiron"
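To make the block k-d tree description above concrete, here is a minimal Python sketch of the build and range-search recursion. It is an in-memory illustration, not Lucene's implementation: the names `Node`, `build`, `range_search`, and `MAX_POINTS_IN_LEAF` are hypothetical, the leaf size is shrunk to 2 so the tiny example actually splits, and it assumes that the two floating-point columns of the sample rows are latitude and longitude.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, ...]
MAX_POINTS_IN_LEAF = 2  # tiny for illustration; the text cites 1024 as the default


@dataclass
class Node:
    leaf: Optional[List[Point]] = None  # points of a leaf block (None for inner nodes)
    dim: int = 0                        # dimension an inner node splits on
    split: float = 0.0                  # split value along that dimension
    left: Optional["Node"] = None       # points with value <= split
    right: Optional["Node"] = None      # points with value >= split


def build(points: List[Point]) -> Node:
    """Recursively partition points, stopping once a cell is small enough."""
    if len(points) <= MAX_POINTS_IN_LEAF:
        return Node(leaf=list(points))
    dims = len(points[0])
    # Split along the dimension with the widest value range.
    dim = max(
        range(dims),
        key=lambda d: max(p[d] for p in points) - min(p[d] for p in points),
    )
    ordered = sorted(points, key=lambda p: p[dim])
    mid = len(ordered) // 2  # split into two equally sized halves
    return Node(
        dim=dim,
        split=ordered[mid][dim],
        left=build(ordered[:mid]),
        right=build(ordered[mid:]),
    )


def range_search(node: Node, lo: Point, hi: Point) -> List[Point]:
    """Return all points p with lo[d] <= p[d] <= hi[d] in every dimension d."""
    if node.leaf is not None:
        return [
            p for p in node.leaf
            if all(l <= x <= h for l, x, h in zip(lo, p, hi))
        ]
    hits: List[Point] = []
    if lo[node.dim] <= node.split:  # query box overlaps the left half
        hits += range_search(node.left, lo, hi)
    if hi[node.dim] >= node.split:  # query box overlaps the right half
        hits += range_search(node.right, lo, hi)
    return hits


# Assumed (latitude, longitude) pairs taken from the three sample rows.
sample = [
    (83.405273, 73.302012),
    (15.459643, -20.826241),
    (16.606066, 38.663728),
]
tree = build(sample)
print(range_search(tree, (10.0, -30.0), (20.0, 40.0)))
# prints [(15.459643, -20.826241), (16.606066, 38.663728)]
```

Because each point lands in exactly one leaf, the search never reports a point twice, even when its value equals a split value and both sub-trees are visited.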

1. General Multidimensional Space Points

Search for points with exact given values.

Search for points that have one of the values in a given set.

Search for points within a given range.

Get the number of points that exactly match a given point.

Get the number of points within a given range. (Ranges are multidimensional; in 3D, they are boxes.)

Divide points into range buckets and get the count in each bucket. (A range bucket is a range with a label attached.)
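The counting operations listed above are easiest to see in the 1D case, where the index is effectively a sorted list of values. The following Python sketch uses binary search over such a list; the function names and the bucket labels are hypothetical, and the three values are assumed to come from one integer column of the sample rows. It illustrates the idea only and does not use the Lucene API.

```python
from bisect import bisect_left, bisect_right
from typing import Dict, Iterable, Sequence, Tuple


def count_range(sorted_values: Sequence[float], lo: float, hi: float) -> int:
    """Count values v with lo <= v <= hi using two binary searches."""
    return bisect_right(sorted_values, hi) - bisect_left(sorted_values, lo)


def count_exact(sorted_values: Sequence[float], value: float) -> int:
    """Count values equal to the given value."""
    return count_range(sorted_values, value, value)


def count_in_set(sorted_values: Sequence[float], candidates: Iterable[float]) -> int:
    """Count values equal to any member of the candidate set."""
    return sum(count_exact(sorted_values, c) for c in set(candidates))


def range_buckets(
    sorted_values: Sequence[float],
    buckets: Dict[str, Tuple[float, float]],
) -> Dict[str, int]:
    """Map each labelled (lo, hi) range to the number of values inside it."""
    return {
        label: count_range(sorted_values, lo, hi)
        for label, (lo, hi) in buckets.items()
    }


# Assumed values from one numeric column of the three sample rows.
values = sorted([3991.0, 2627.0, 2761.0])
print(count_range(values, 2000.0, 3000.0))   # prints 2
print(count_exact(values, 3991.0))           # prints 1
print(range_buckets(values, {"low": (0.0, 2999.0), "high": (3000.0, 9999.0)}))
# prints {'low': 2, 'high': 1}
```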

2. Locations on the planet surface. (Latitude, Longitude)

Find the set of airports closest to a given town.

Find the set of airports within a given radius from a particular town.

Find the set of airports inside a country. (Country can be given as a polygon)

Find the set of airports within a given range of latitudes and longitudes. This is a latitude/longitude box query. (For example: airports close to the equator.)

Find the set of airports close to a given path. (A path can be something like a road: find the airports that are less than 50km away from a given highway.)

Count the airports in each country by giving country maps as polygons.
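As a rough illustration of the radius, nearest-neighbour, and box queries listed above, here is a brute-force Python sketch. It assumes a spherical Earth with a mean radius of 6371 km and uses the haversine formula; the function names are hypothetical, the coordinates are the assumed (latitude, longitude) columns of the sample rows, and the box query ignores boxes that cross the date line. An index such as the BKD tree exists precisely to avoid scanning every point the way this sketch does.

```python
import math
from typing import List, Tuple

LatLon = Tuple[float, float]
EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(a: LatLon, b: LatLon) -> float:
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    # Clamp to 1.0 to guard against rounding error for antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def within_radius(points: List[LatLon], center: LatLon, radius_km: float) -> List[LatLon]:
    """Points whose distance from center is at most radius_km."""
    return [p for p in points if haversine_km(p, center) <= radius_km]


def k_nearest(points: List[LatLon], center: LatLon, k: int) -> List[LatLon]:
    """The k points closest to center, nearest first."""
    return sorted(points, key=lambda p: haversine_km(p, center))[:k]


def in_box(
    points: List[LatLon],
    min_lat: float, max_lat: float,
    min_lon: float, max_lon: float,
) -> List[LatLon]:
    """Points inside a latitude/longitude box (no date-line crossing)."""
    return [
        p for p in points
        if min_lat <= p[0] <= max_lat and min_lon <= p[1] <= max_lon
    ]


# Assumed (latitude, longitude) pairs taken from the three sample rows.
airports = [
    (83.405273, 73.302012),
    (15.459643, -20.826241),
    (16.606066, 38.663728),
]
town = (15.0, -20.0)  # hypothetical town location
print(k_nearest(airports, town, 1))             # prints [(15.459643, -20.826241)]
print(in_box(airports, -20.0, 20.0, -180.0, 180.0))
# prints [(15.459643, -20.826241), (16.606066, 38.663728)]
```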

Search results:

Loading Data is finished ----------------------------------------------------------------------

Index build time: 982ms

LatLon - Box Query Example------------------------------------------------------------------------------

search_LatLon_Box took: 69ms

LatLon - K Nearest------------------------------------------------------------------------------

search_LatLon_Nearest took: 108ms

DoublePoint 1D Point Exact------------------------------------------------------------------------------

search_Double_1D_Exact took: 10ms

DoublePoint 1D - Range------------------------------------------------------------------------------

search_Double_1D_range took: 8ms

DoublePoint 1D - Range Buckets -----------------------------------------------------------------------------

search_Double_1D_range_bucket took: 58ms

DoublePoint multi dimensional - Range------------------------------------------------------------------------------

search_Double_MiltiDimensional_Range took: 1ms

#### How to tell which search term matched a document returned by ES

Elasticsearch • colie replied • 2 followers • 2 replies • 270 views • 2017-08-03 16:07
