vLLM support and ONNX models
#5 by Napron
Thanks for a great model.
- When will vLLM support arrive?
- Can we serve the jina-vlm model with jina-serve?
- Are you going to share ONNX exports of the vision and language models?
Napron changed discussion title from "vLLM support" to "vLLM support and ONNX models"
Hey @Napron, vLLM support is coming very soon; we are working on it right now. For ONNX, do you mean separate models for vision and language?
That's great! Yes, I want to test inference for the vision and language models separately; it would be good to have quantized ONNX models too.
Thanks in advance.
I feel like they are very slow at adding vLLM support.
Hi @grozatech, it's taking a bit longer than expected, but I am back at it now. Will update this thread when it's ready, thanks!
Hi again, sorry for the delay, it should work now: https://huggingface.co/jinaai/jina-vlm#using-vllm
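For anyone else landing here, a minimal sketch of what serving this model with vLLM typically looks like, using vLLM's standard OpenAI-compatible server. This is an assumption based on vLLM's generic CLI, not taken from this thread; the linked model card's instructions take precedence.

```shell
# Sketch: launch an OpenAI-compatible server for the model (assumes vLLM is installed).
# --trust-remote-code is needed for models that ship custom modeling code on the Hub.
vllm serve jinaai/jina-vlm --trust-remote-code

# In another terminal, query the server on vLLM's default port (8000):
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "jinaai/jina-vlm", "messages": [{"role": "user", "content": "Describe this image."}]}'
```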
Hi @gmastparas, thank you for the update. Could you also provide pre-quantized AWQ or GPTQ weights for jina-vlm?