Fixing eland_import_hub_model not running on Windows
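My original command was:
eland_import_hub_model --url https://elastic:[email protected]:9200 --hub-model-id sentence-transformers/clip-ViT-B-32-multilingual-v1 --task-type text_embedding --start --ca-certs app/conf/ca.crt
When I run this on Windows, it just pops up the "Open with" dialog. Opening the file shows it is a plain Python file with no extension, so Windows cannot execute it.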
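Today I wanted to use the eland_import_hub_model command to upload an ML model to the Elasticsearch instance on my local Windows machine, and found that the command could not do it. Someone from Japan reported the same problem on GitHub. I don't know whether his problem was ever solved, but I solved mine, haha.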
The issue reported by the Japanese user
For the sake of SEO, and so that others can find this post later, here is his issue as he described it:
Right after pip install eland, neither python -m eland nor eland_import_hub_model works on Windows.
C:\Users\YouheiSakurai>python -m eland
C:\Program Files\Python311\python.exe: No module named eland.__main__; 'eland' is a package and cannot be directly executed
C:\Users\YouheiSakurai>eland_import_hub_model
'eland_import_hub_model' is not recognized as an internal or external command,
operable program or batch file.
python -m eland doesn't work because there is no eland.__main__.
eland_import_hub_model doesn't work because scripts is used instead of console_scripts in setup.py.
As the above picture shows, eland_import_hub_model is not executable on Windows due to no file extension like .exe.
I will open a PR to solve this hassle on Windows.
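Given that diagnosis, one workaround that needs no change to eland is to hand the installed script to the Python interpreter explicitly, since the file is ordinary Python that only lacks an extension Windows knows how to launch. The following is a minimal sketch and not something taken from the issue: `find_script`, `build_command`, and `run_script` are hypothetical helper names, and it assumes pip placed the script in the directory reported by `sysconfig.get_path("scripts")` (a `--user` install uses a different directory).

```python
import subprocess
import sys
import sysconfig
from pathlib import Path


def find_script(name, scripts_dir=None):
    """Return the path of an installed script, or None if it is not there."""
    # Assumption: the script lives in the interpreter's default scripts directory.
    directory = Path(scripts_dir) if scripts_dir else Path(sysconfig.get_path("scripts"))
    candidate = directory / name
    return candidate if candidate.is_file() else None


def build_command(script, args):
    """Prefix the script with the current interpreter so no file association is needed."""
    return [sys.executable, str(script), *args]


def run_script(name, args):
    """Locate the script and run it with the current interpreter."""
    script = find_script(name)
    if script is None:
        raise FileNotFoundError(f"{name} not found in the scripts directory")
    return subprocess.run(build_command(script, args), check=True)
```

Calling `run_script("eland_import_hub_model", [...])` with the same arguments as the original command would then start the script through `python.exe` rather than relying on Windows to recognize the file.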
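My solution
The eland GitHub repository (https://github.com/elastic/eland) has a README worth consulting. The current release there is 8.16.1, though, while my Elasticsearch is 8.6.0, so I needed the README closest to 8.6.0. I looked through the tags on the right-hand side of the page and found no 8.6.0, so I took the nearest one, 8.7.0, downloaded the source zip, and checked how the 8.7.0 README uploads a model from Python code instead of the command line. I then adapted that example: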
import elasticsearch
from pathlib import Path
from eland.ml.pytorch import PyTorchModel
from eland.ml.pytorch.transformers import TransformerModel
# Load a Hugging Face transformers model directly from the model hub
tm = TransformerModel("sentence-transformers/clip-ViT-B-32-multilingual-v1", "text_embedding")
# Export the model in a TorchScript representation which Elasticsearch uses
tmp_path = "models"
Path(tmp_path).mkdir(parents=True, exist_ok=True)
model_path, config, vocab_path = tm.save(tmp_path)
ca_certs_path = "../app/conf/ca.crt"
# Import model into Elasticsearch
es = elasticsearch.Elasticsearch(
    "https://elastic:[email protected]:9200",
    ca_certs=ca_certs_path,
    verify_certs=True,
    timeout=300,  # 5 minute timeout
)
ptm = PyTorchModel(es, tm.elasticsearch_model_id())
ptm.import_model(model_path=model_path, config_path=None, vocab_path=vocab_path, config=config)
The upload succeeded.
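The original command also passed --start, which the script does not reproduce, so the imported model still has to be deployed before it can serve inference requests. Here is a hedged sketch of that step. It assumes the elasticsearch-py 8.x client methods `ml.start_trained_model_deployment` and `ml.get_trained_models_stats`, and a stats response that reports `deployment_stats.state`; `start_model` is a hypothetical helper name, and `es` stands for the client created in the script.

```python
def start_model(es, model_id, timeout="5m"):
    """Start a trained model deployment and return the state the stats API reports."""
    # Block until the deployment reports that it has started (or the timeout expires).
    es.ml.start_trained_model_deployment(
        model_id=model_id, wait_for="started", timeout=timeout
    )
    stats = es.ml.get_trained_models_stats(model_id=model_id)
    # Assumed response shape:
    # {"trained_model_stats": [{"deployment_stats": {"state": ...}}]}
    entries = stats["trained_model_stats"]
    if not entries:
        return None
    return entries[0].get("deployment_stats", {}).get("state")
```

With the names from the script, this would be called as `start_model(es, tm.elasticsearch_model_id())`.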