See the instructions in the fp32 repository: https://huggingface.co/aless2212/Mistral-7B-Instruct-v0.2-onnx-fp32

The fp32 model can probably also be used on CPU. The fp16 model is GPU-only. All conversions were done on an RX 7600 8GB with my custom builds of PyTorch and ONNX Runtime.
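As a rough illustration of the CPU/GPU note, here is a minimal sketch of choosing ONNX Runtime execution providers by precision. The helper `pick_providers` and the order of GPU providers are assumptions for illustration only; they are not taken from the linked instructions, and a custom ONNX Runtime build may expose different providers.

```python
# Hypothetical helper (not part of this repository): choose execution
# providers for a model depending on its precision.

# Assumed preference order of GPU providers; adjust to your own build.
GPU_PROVIDERS = (
    "ROCMExecutionProvider",
    "CUDAExecutionProvider",
    "DmlExecutionProvider",
)


def pick_providers(precision, available):
    """Return the providers to try, in order, for 'fp16' or 'fp32'."""
    gpu = [p for p in GPU_PROVIDERS if p in available]

    if precision == "fp16":
        # The fp16 model is GPU-only, so there is no CPU fallback here.
        if not gpu:
            raise RuntimeError("the fp16 model needs a GPU execution provider")
        return gpu

    if precision == "fp32":
        # The fp32 model can probably run on CPU, so keep it as a fallback.
        providers = list(gpu)
        if "CPUExecutionProvider" in available:
            providers.append("CPUExecutionProvider")
        if not providers:
            raise RuntimeError("no usable execution provider found")
        return providers

    raise ValueError(f"unknown precision: {precision!r}")
```

The `available` list can be obtained from `onnxruntime.get_available_providers()`, and the result can be passed as the `providers=` argument of `onnxruntime.InferenceSession`.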
