Tags: Video-Text-to-Text · Transformers · Safetensors · MLX · English · smolvlm · image-text-to-text
## Use from the MLX library
```shell
# Install the Hub client (quote the extra so zsh does not expand the brackets)
pip install "huggingface_hub[hf_xet]"

# Download the model from the Hub
huggingface-cli download --local-dir SmolVLM2-500M-Video-Instruct-mlx mlx-community/SmolVLM2-500M-Video-Instruct-mlx
```
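The same download can be done from Python with `huggingface_hub.snapshot_download`; a minimal sketch (the `fetch_model` wrapper and default directory name are ours, not part of the model card):

```python
from huggingface_hub import snapshot_download

REPO_ID = "mlx-community/SmolVLM2-500M-Video-Instruct-mlx"


def fetch_model(local_dir: str = "SmolVLM2-500M-Video-Instruct-mlx") -> str:
    """Download the full repository snapshot from the Hub.

    Returns the local path of the downloaded snapshot.
    """
    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


if __name__ == "__main__":
    # Triggers the actual download (several hundred MB for this model).
    print(fetch_model())
```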

# mlx-community/SmolVLM2-500M-Video-Instruct-mlx

This model was converted to MLX format from HuggingFaceTB/SmolVLM2-500M-Video-Instruct using mlx-vlm version 0.1.13. Refer to the original model card for more details on the model.

## Use with mlx-vlm

```shell
pip install -U mlx-vlm

python -m mlx_vlm.generate \
  --model mlx-community/SmolVLM2-500M-Video-Instruct-mlx \
  --image https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg \
  --prompt "Can you describe this image?"
```
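The CLI call above can also be made from Python. The sketch below follows the pattern shown in the mlx-vlm README (`load`, `load_config`, `apply_chat_template`, `generate`); the `describe_image` wrapper is ours, and the imports are deferred inside the function because mlx-vlm only runs on Apple Silicon:

```python
def describe_image(image_url: str, prompt: str = "Can you describe this image?"):
    """Run SmolVLM2 on a single image via the mlx-vlm Python API.

    Imports are deferred so this module can be loaded on machines
    without MLX installed; the call itself requires Apple Silicon.
    """
    from mlx_vlm import load, generate
    from mlx_vlm.prompt_utils import apply_chat_template
    from mlx_vlm.utils import load_config

    model_path = "mlx-community/SmolVLM2-500M-Video-Instruct-mlx"
    model, processor = load(model_path)
    config = load_config(model_path)

    # Wrap the raw prompt in the model's chat template before generating.
    formatted = apply_chat_template(processor, config, prompt, num_images=1)
    return generate(model, processor, formatted, [image_url], verbose=False)


if __name__ == "__main__":
    url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg"
    print(describe_image(url))
```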