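Our requirement is to delete stale address links in real time (when I built the index, I stored each address in its own field). The index holds roughly 3.8 million documents. From what I have read, Elasticsearch's scroll API should let me pull out just that address field quickly so that I can delete the stale addresses. However, when I set search_type to scan for the scroll, I get a 400 error. If I drop the search_type parameter, the query returns data, but it is far too slow. Could anyone help me figure out what is going on? Thanks. The Elasticsearch version is 5.5.1. Below is my code snippet.
The 400 error: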
ERROR:root:TransportError(400, u'illegal_argument_exception')
Traceback (most recent call last):
File "C:/Users/hjyPC/Desktop/ecm+ydyp/linux/util\es_py.py", line 305, in scroll_search
**kwargs)
File "C:\Python27\lib\site-packages\elasticsearch\client\utils.py", line 69, in _wrapped
return func(*args, params=params, **kwargs)
File "C:\Python27\lib\site-packages\elasticsearch\client\__init__.py", line 531, in search
doc_type, '_search'), params=params, body=body)
File "C:\Python27\lib\site-packages\elasticsearch\transport.py", line 307, in perform_request
status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
File "C:\Python27\lib\site-packages\elasticsearch\connection\http_urllib3.py", line 93, in perform_request
self._raise_error(response.status, raw_data)
File "C:\Python27\lib\site-packages\elasticsearch\connection\base.py", line 105, in _raise_error
raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
RequestError: TransportError(400, u'illegal_argument_exception')
2 replies
kennywu76 - Wood
Upvoted by: qq402424088
The Python client actually provides a helper, helpers.scan (helpers.html#scan), that makes scroll much easier to write.
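As a minimal sketch of that helper for this use case: `search_type=scan` was deprecated in Elasticsearch 2.1 and removed in 5.0, which is why 5.5.1 rejects it with a 400, whereas `helpers.scan` sorts by `_doc` when `preserve_order` is left at `False`. The index name `my_index`, the field name `url`, and the functions `build_scan_query`, `iter_urls`, and `scan_urls` are illustrative assumptions, not taken from the original post.

```python
def build_scan_query(url_field):
    """Search body that matches every document but returns only one field."""
    return {
        "_source": [url_field],        # fetch just the stored address field
        "query": {"match_all": {}},
    }


def iter_urls(hits, url_field):
    """Yield (doc_id, url) pairs from an iterable of raw search hits."""
    for hit in hits:
        url = hit.get("_source", {}).get(url_field)
        if url is not None:            # skip documents without the field
            yield hit["_id"], url


def scan_urls(es, index, url_field, page_size=1000):
    """Scroll over the whole index with helpers.scan (no search_type needed)."""
    # Imported here so the pure helpers above work without the client installed.
    from elasticsearch import helpers

    hits = helpers.scan(
        es,
        index=index,
        query=build_scan_query(url_field),
        scroll="5m",                   # keep-alive for each scroll page
        size=page_size,                # hits per shard per page
        preserve_order=False,          # lets the helper sort by _doc
    )
    return iter_urls(hits, url_field)


# Hypothetical usage, assuming a node reachable on localhost:9200:
#   from elasticsearch import Elasticsearch
#   es = Elasticsearch(["localhost:9200"])
#   for doc_id, url in scan_urls(es, "my_index", "url"):
#       ...  # check the address and collect the ids to delete
```

Restricting `_source` to the single address field keeps each scroll page small, which is usually where most of the time goes on an index of this size.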
Alternatively, you can use the high-level Python API, Elasticsearch DSL, and its scan method (scan#pagination).
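A rough sketch of the Elasticsearch DSL variant, assuming the `elasticsearch_dsl` package is installed and that the address is stored in a field called `url`. The names `hit_to_pair` and `scan_urls_dsl` are hypothetical, chosen only for illustration.

```python
def hit_to_pair(hit, url_field):
    """Return (doc_id, url) for one DSL hit; url is None if the field is absent."""
    return hit.meta.id, getattr(hit, url_field, None)


def scan_urls_dsl(es, index, url_field):
    """Iterate over every document in the index via Search.scan()."""
    # Imported lazily so hit_to_pair can be used without elasticsearch_dsl installed.
    from elasticsearch_dsl import Search

    # With no explicit query, the search matches all documents.
    search = Search(using=es, index=index).source([url_field])
    # Search.scan() wraps the scroll helper and yields hits one by one.
    for hit in search.scan():
        yield hit_to_pair(hit, url_field)


# Hypothetical usage, assuming `es` is an elasticsearch.Elasticsearch client:
#   for doc_id, url in scan_urls_dsl(es, "my_index", "url"):
#       ...  # check the address and collect the ids to delete
```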
hufuman