vLLM support and ONNX models

#5 opened by Napron

Thanks for a great model.

  1. When will vLLM support arrive?
  2. Can we serve the jina-VLM model with jina-serve?
  3. Are you going to share ONNX models for the vision and language parts separately?
Napron changed discussion title from "vLLM support" to "vLLM support and ONNX models"
Jina AI org

Hey @Napron, vLLM support is coming very soon, we are working on it right now. For ONNX, do you mean separate models for vision and language?

That's great! Yes, I would like to test inference for the vision and language models separately, and it would be good to have quantized ONNX models too.
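
To make the request concrete, the setup I have in mind is roughly the two-session onnxruntime sketch below; the file names, input names, and shapes are just placeholders, not the actual export:

```python
import numpy as np
import onnxruntime as ort

# Hypothetical file names; the real export may be packaged differently.
vision = ort.InferenceSession("vision_encoder.onnx")
language = ort.InferenceSession("language_model_quantized.onnx")

# Run the vision encoder on preprocessed pixel values (shape is illustrative).
pixel_values = np.random.rand(1, 3, 336, 336).astype(np.float32)
image_embeds = vision.run(None, {"pixel_values": pixel_values})[0]

# Feed token ids plus the image embeddings into the language model
# (input names are assumptions, only meant to show the two-stage flow).
input_ids = np.array([[1, 2, 3]], dtype=np.int64)
logits = language.run(None, {"input_ids": input_ids, "image_embeds": image_embeds})[0]
print(logits.shape)
```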

Thanks in advance.

Hi, thanks for such a great model. Are there any updates on vLLM serving?

I feel like they are quite slow at adding vLLM support.

Jina AI org

Hi @grozatech, it's taking a bit longer than expected, but I am back at it now. Will update this thread when it's ready, thanks ✌️

Jina AI org

Hi again, sorry for the delay, it should work now 👉 https://huggingface.co/jinaai/jina-vlm#using-vllm
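
For offline inference it should look roughly like the sketch below; the prompt template here is only a placeholder, the exact usage is in the README section linked above:

```python
from vllm import LLM, SamplingParams
from PIL import Image

# trust_remote_code is needed if the model ships custom code on the Hub.
llm = LLM(model="jinaai/jina-vlm", trust_remote_code=True)

image = Image.open("example.jpg")
# Placeholder prompt; the actual chat/image template is documented in the README.
prompt = "USER: <image>\nDescribe this image.\nASSISTANT:"

outputs = llm.generate(
    {"prompt": prompt, "multi_modal_data": {"image": image}},
    SamplingParams(max_tokens=128),
)
print(outputs[0].outputs[0].text)
```

The OpenAI-compatible server route (`vllm serve jinaai/jina-vlm --trust-remote-code`) should work as well.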

Hi @gmastparas, thank you for the update. Could you also provide pre-quantized AWQ or GPTQ weights for jina-vlm?
