When will an NVFP4 version be available?
Is it possible to release NVFP4 versions of both models? Currently we only have FP8.
Not planned, however llm-compressor is a cool library for doing quantization yourself if you'd like to try!
Shouldn't a quality quantization be done from the BF16 version, which Mistral hasn't released, rather than from FP8? I guess there is no other way if only the FP8 model is released. Thanks for the answer. :)
I was also thinking of fine-tuning the model, but the script asks for 'float16', 'bfloat16', or 'float32' weights, so I can't use this model as-is. Other Mistral models are released in BF16; why not this one?
"LLM Compressor does support NVFP4 quantization, but it does not support taking an already‑FP8 model and directly turning it into NVFP4; you need to start from an unquantized (FP16/BF16/FP32) checkpoint."
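For reference, a oneshot NVFP4 run with llm-compressor looks roughly like the following. This is a hedged sketch patterned on the library's published FP4 examples, not a tested recipe: the model path is a placeholder, and the exact scheme name and `ignore` list should be checked against the current llm-compressor documentation.

```python
# Sketch: NVFP4 quantization with llm-compressor, starting from an
# UNQUANTIZED (BF16) checkpoint -- it cannot start from the FP8 release.
from transformers import AutoModelForCausalLM

from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

MODEL_ID = "path/to/unquantized-bf16-checkpoint"  # placeholder path

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="bfloat16")

# Quantize all Linear layers to NVFP4, keeping the LM head in high precision.
recipe = QuantizationModifier(
    targets="Linear", scheme="NVFP4", ignore=["lm_head"]
)

oneshot(model=model, recipe=recipe)
model.save_pretrained(MODEL_ID + "-NVFP4", save_compressed=True)
```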
Hey! To get BF16, one way would be to simply descale the weights, i.e. multiply each quantized weight tensor by its stored scale :)
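The descale step is just `W ≈ W_q * scale`: FP8 checkpoints store the quantized values plus a scale (per tensor, channel, or block), so recovering a higher-precision weight is a single multiply. Here is a toy illustration in plain Python, with a coarse rounding grid standing in for a real float8 cast; the function names are illustrative, not the actual checkpoint keys.

```python
FP8_E4M3_MAX = 448.0  # largest finite value representable in float8_e4m3

def quantize_fp8_like(weights):
    """Scale weights into the FP8 range and round coarsely (a stand-in for
    the real float8 cast). Returns (quantized values, scale)."""
    amax = max(abs(w) for w in weights)
    scale = amax / FP8_E4M3_MAX
    quantized = [round(w / scale, 1) for w in weights]  # coarse rounding
    return quantized, scale

def descale_to_bf16_like(quantized, scale):
    """The 'descale' step: multiply the stored quantized values by the
    stored scale to recover (approximately) the original weights."""
    return [q * scale for q in quantized]

weights = [0.5, -1.25, 2.0, -0.031]
q, s = quantize_fp8_like(weights)
recovered = descale_to_bf16_like(q, s)
# `recovered` matches `weights` up to the rounding error of the fake grid.
```

Note the caveat raised below in the thread: the recovered tensor is only approximately the original, since the FP8 cast itself is lossy.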
Oh, and to answer the second part of the question: we don't release BF16 checkpoints anymore, to limit the number of checkpoints we publish, make it easier to identify our models, and avoid cluttering our org with duplicates. For all inference cases it is strictly better to use FP8, since our models are natively trained to handle this format, so it is a free memory gain.
As per my comment above, it is quite easy to recover a BF16 model from an FP8 one, so this shouldn't be a blocker!
So you are not willing to publish the BF16, and not willing to make an NVFP4 from the original BF16 for your users, so users have to work around it by casting the FP8 back to a non-original BF16 in order to produce an NVFP4. I am afraid the BF16 recovered from FP8 would not be identical to your private BF16, since FP8 quantization is lossy, and that would degrade the NVFP4 quality.