Error when running update_by_query from Python

Elasticsearch | Author: manman | Posted: 2019-02-01 | Views: 420

from elasticsearch import Elasticsearch

client =Elasticsearch(hosts=["127.0.0.1"])

country_list = ['罗马尼亚','比利时','挪威','葡萄牙','摩洛哥','丹麦','白俄罗斯','阿尔及利亚','塞尔维亚','斯洛伐克','日本','伊朗',
'摩尔多瓦','新加坡','埃及','加拿大','中国台湾','韩国','奥地利','克罗地亚','科特迪瓦','斯里兰卡','芬兰','希腊','黎巴嫩',
'泰国','突尼斯','荷兰','哈萨克斯坦','哥伦比亚','墨西哥','中国香港','南非','爱沙尼亚','阿联酋','斯洛文尼亚','亚美尼亚','立陶宛',
'卢森堡','拉脱维亚','爱尔兰','保加利亚','阿塞拜疆','阿根廷','新西兰','马来西亚','秘鲁','摩纳哥','吉尔吉斯斯坦','孟加拉','印尼',
'塞浦路斯','澳大利亚',]

english_list = ['Romania','Belgium','Norway','Portugal','Morocco','Denmark','Belarus','Algeria','Serbia','Slovakia','Japan',
'Iran','Moldova','Singapore','Egypt','Canada','Taiwan, China','Korea','Austria','Croatia','Cote d\'Ivoire',
'Sri Lanka','Finland','Greece','Lebanon','Thailand','Tunisia','Netherlands','Kazakhstan','Colombia','Mexico',
'Hong Kong, China','South Africa','Estonia','United Arab Emirates','Slovenia','Armenia','Lithuania','Luxembourg',
'Latvia','Ireland','Bulgaria','Azerbaijan','Argentina','New Zealand','Malaysia','Peru','Monaco','Kyrgyzstan','Bengal',
'Indonesia','Cyprus','Australia',]

dict = dict(zip(country_list,english_list))

for country in country_list:
    client.update_by_query(
        index="english",
        request_timeout=60,
        body={
            "script": {
                "inline": "if (ctx._source.country ==" + country + ") {ctx._source.country= " + dict[country] + "}"
            }
        }
    )
    print(country + "Changed" + "*"*10)
Running this code fails immediately with the following error:
 
 
: RequestsDependencyWarning: urllib3 (1.24.1) or chardet (3.0.4) doesn't match a supported version!
RequestsDependencyWarning)
POST http://127.0.0.1:9200/english/_update_by_query [status:500 request:0.038s]
Traceback (most recent call last):
  File "/home/elliot/Developer/Python/update_by_query/replace_country.py", line 26, in <module>
    "inline": "if (ctx._source.country ==" + country + ") {ctx._source.country= " + dict[country] + "}"
  File "/home/elliot/anaconda3/lib/python3.6/site-packages/elasticsearch/client/utils.py", line 73, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/home/elliot/anaconda3/lib/python3.6/site-packages/elasticsearch/client/__init__.py", line 737, in update_by_query
    doc_type, '_update_by_query'), params=params, body=body)
  File "/home/elliot/anaconda3/lib/python3.6/site-packages/elasticsearch/transport.py", line 312, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
  File "/home/elliot/anaconda3/lib/python3.6/site-packages/elasticsearch/connection/http_urllib3.py", line 128, in perform_request
    self._raise_error(response.status, raw_data)
  File "/home/elliot/anaconda3/lib/python3.6/site-packages/elasticsearch/connection/base.py", line 125, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.TransportError: TransportError(500, 'script_exception', 'compile error')
This has left me quite confused.
I am using elasticsearch-dsl version 5.3.0.
 

rochy - rochy_he@jointsky


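I strongly recommend against updating the documents this way; it is extremely inefficient.

Instead, use scroll to iterate over all the documents in the index,
then use the update API to set each document's country field,
and finally submit the changes in batches with bulk.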
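
The scroll-then-bulk approach described above might look roughly like the sketch below. It assumes the `scan` and `bulk` helpers from `elasticsearch.helpers`, and that `country` holds a single string per document. The function names `build_update_actions` and `translate_field` are made up for illustration, and the code has not been run against a 5.3 cluster.

```python
def build_update_actions(hits, mapping, field="country"):
    """Yield one bulk 'update' action per hit whose field value has a translation."""
    for hit in hits:
        old_value = hit.get("_source", {}).get(field)
        # Assumption: the field holds a single string; anything else is skipped.
        if not isinstance(old_value, str) or old_value not in mapping:
            continue
        action = {
            "_op_type": "update",
            "_index": hit["_index"],
            "_id": hit["_id"],
            "doc": {field: mapping[old_value]},  # partial document for the update API
        }
        if "_type" in hit:  # present on 5.x hits; newer clusters no longer return it
            action["_type"] = hit["_type"]
        yield action


def translate_field(client, index, mapping, field="country"):
    """Scroll through the whole index and send the updates in bulk batches."""
    # Imported here so build_update_actions can be tried without the client installed.
    from elasticsearch.helpers import bulk, scan

    hits = scan(client, index=index)  # wraps the scroll API
    return bulk(client, build_update_actions(hits, mapping, field))
```

With the names from the question, the call would be something like `translate_field(client, "english", dict(zip(country_list, english_list)))`. This walks the index once rather than once per country.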

laoyang360 - [死磕Elasticsearch] Knowledge Planet: http://t.cn/RmwM3N9; WeChat official account: 铭毅天下; Blog: blog.csdn.net/laoyang360


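To debug: run the update DSL for a single Chinese-to-English pair on its own; that makes it much easier to see what is wrong with the DSL.

And yes, this approach is indeed very inefficient.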
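
As a small sketch of that single-step check, you can print the script string that the question's concatenation produces for one pair before sending anything to the cluster. `build_script` is a hypothetical helper that only reproduces the expression from the question.

```python
def build_script(country, english):
    # Same concatenation as in the question: the values are pasted in without quotes.
    return "if (ctx._source.country ==" + country + ") {ctx._source.country= " + english + "}"


script = build_script("日本", "Japan")
print(script)
# prints: if (ctx._source.country ==日本) {ctx._source.country= Japan}
```

In the printed script neither value is a quoted string literal, which is consistent with the `compile error` in the traceback.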

JackGe


Could it be that the string values are missing their single quotes?
client.update_by_query(
    index="english",
    request_timeout=60,
    body={
        "script": {
            "inline": "if (ctx._source.country == '" + country + "') {ctx._source.country= '" + dict[country] + "'}",
            "lang": "painless"
        }
    }
)
There is an example here: https://stackoverflow.com/ques ... query
q = {
    "script": {
        "inline": "ctx._source.Device='Test'",
        "lang": "painless"
    },
    "query": {
        "match": {
            "Device": "Boiler"
        }
    }
}

es.update_by_query(body=q, doc_type='AAA', index='testindex')
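
One caveat with adding quotes by concatenation: a value such as 'Cote d\'Ivoire' contains an apostrophe of its own and would end the single-quoted Painless string early. A possible alternative, sketched below, is to pass the values through the script's `params` so they never become part of the script source. `build_update_body` and the parameter names `old_name` and `new_name` are illustrative, and the `ctx.op = 'noop'` branch simply skips documents that do not match.

```python
# The script text stays constant; only the params change per country.
SCRIPT = (
    "if (ctx._source.country == params.old_name) "
    "{ ctx._source.country = params.new_name } "
    "else { ctx.op = 'noop' }"
)


def build_update_body(old_name, new_name):
    """Build an update_by_query body that passes both names as script params."""
    return {
        "script": {
            "inline": SCRIPT,  # newer Elasticsearch versions call this key "source"
            "lang": "painless",
            "params": {"old_name": old_name, "new_name": new_name},
        }
    }


body = build_update_body("科特迪瓦", "Cote d'Ivoire")
# Would then be sent as in the question, e.g.:
# client.update_by_query(index="english", request_timeout=60, body=body)
```

As in the Stack Overflow example, a "query" clause could be added to the body so that only matching documents are visited; which query type fits depends on how the `country` field is mapped.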
