calcuis/lumina-gguf · Gemma 2 2b quantized doesn't work

Feb 8, 2025

Good day.
Is there a way to make the model work with a quantized Gemma 2? When I run it, it just says: "ClipLoaderGGUF Unknown architecture: 'gemma2'"
I tried running a finetuned version with a Q_8 quant.
Lumina itself was Q_8 too.

Owner Feb 8, 2025

Have you upgraded ComfyUI as well as the node to the latest version? Code support was only just added.

Owner Feb 8, 2025

you could try the gemma-2-2b-fp16 version here; tested, works fine

Feb 8, 2025

Tried upgrading, still doesn't work. I can (try to) paste the log here.

Owner Feb 8, 2025

Oh, I see; did you quantize the gemma-2-2b-fp16 to fp8, GGUF, or something else? It won't work in that case; we're still sorting that out right now. Use the fp16 safetensors; that should work.

Feb 8, 2025

No, I downloaded an already-quantized finetune by bartowski (abliterated, Q_8). Should I send a log?

Owner Feb 8, 2025

We never tested the one (Q_8) you mentioned, and we don't think it actually works. Do you have a link/source? The log won't help.

Feb 8, 2025

Alright. Thanks for the help.

Owner Feb 8, 2025

That file was made for text generation, which is a different format; it works with llama.cpp-related connector(s) but doesn't work as a text encoder for an image model.
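For anyone who wants to check what a given GGUF file declares before loading it: the "Unknown architecture" message refers to the `general.architecture` metadata key stored in the file header. Below is a minimal sketch of reading that key with only the Python standard library. It assumes a little-endian GGUF file of version 2 or later; the function name `read_gguf_architecture` is made up for this example and is not part of ComfyUI or the GGUF node.

```python
import io
import struct

# GGUF metadata value type ids mapped to struct formats for fixed-size scalars.
_SCALARS = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
            6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d"}
_STRING, _ARRAY = 8, 9


def _read(stream, fmt):
    """Read one fixed-size value described by a struct format."""
    size = struct.calcsize(fmt)
    data = stream.read(size)
    if len(data) != size:
        raise ValueError("unexpected end of file")
    return struct.unpack(fmt, data)[0]


def _read_string(stream):
    """GGUF strings are a uint64 byte length followed by UTF-8 bytes."""
    length = _read(stream, "<Q")
    data = stream.read(length)
    if len(data) != length:
        raise ValueError("unexpected end of file")
    return data.decode("utf-8")


def _read_value(stream, value_type):
    """Read a metadata value; arrays are parsed so they can be skipped over."""
    if value_type in _SCALARS:
        return _read(stream, _SCALARS[value_type])
    if value_type == _STRING:
        return _read_string(stream)
    if value_type == _ARRAY:
        element_type = _read(stream, "<I")
        count = _read(stream, "<Q")
        return [_read_value(stream, element_type) for _ in range(count)]
    raise ValueError(f"unknown metadata value type: {value_type}")


def read_gguf_architecture(stream):
    """Return the 'general.architecture' string, or None if the key is absent."""
    if stream.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    version = _read(stream, "<I")
    if version < 2:
        raise ValueError("GGUF v1 uses 32-bit lengths and is not handled here")
    _read(stream, "<Q")                # tensor count, not needed here
    kv_count = _read(stream, "<Q")     # number of metadata key/value pairs
    for _ in range(kv_count):
        key = _read_string(stream)
        value = _read_value(stream, _read(stream, "<I"))
        if key == "general.architecture":
            return value
    return None
```

To use it, open the file in binary mode and pass the file object, for example `read_gguf_architecture(open(path, "rb"))` with `path` pointing at your own download. A text-generation quant of Gemma 2 would be expected to report `gemma2`, which is the string the loader complained about.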

Feb 9, 2025

Same issue. I have it working fine with the fp16 safetensors file, but I'd like to be able to use it with a gguf. I don't see why that wouldn't be possible.

Owner Feb 9, 2025

it's possible but takes time to re-format it

Feb 10, 2025

It would be awesome to see how this version might work with it:
https://huggingface.co/bartowski/gemma-2-2b-it-abliterated-GGUF

I'd guess the weights are similar enough that it should be fine. But if 9b could also work, that would be great to see, although it seems like that might not be possible.

razvanab

Feb 10, 2025

I will do some tests to see if this will work. I wonder if this will make the model generate more NSFW images.

razvanab

Feb 10, 2025

Nope. It doesn't work. Unfortunately.

Owner Feb 10, 2025

Code support added, but you might need to wait for the upper-level upgrade: gemma2 is not a typical model, and the prompt will keep looping inside.

Mescalamba

Feb 12, 2025

I will do some tests to see if this will work. I wonder if this will make the model generate more NSFW images.

In theory it could. The issue is that to make custom Gemma 2 models work, you need to set specific LLM settings (meaning you need to be able to set temperature, top_k and so on), because otherwise those models are usually useless. There are models that work "as they are", but it's pretty rare. A regular abliterated, unfixed model is mostly just lobotomized and impossible to work with.

Owner Feb 12, 2025

gemma2 was a headache before, even for text generation; the tokenizer is very easy to break. It's not really worth a huge effort to recode the tokenizer; I guess it's not very common as an encoder for image/video generation.

Feb 13, 2025

So the gemma that's used in Lumina has a custom tokenizer?

I was actually wondering about temp. Would there be any benefit to having a slider in Comfy to adjust the temp during image generation? If it's left at a default, I'd presume it's the highest value, to get the most diverse images. But since LLMs are non-deterministic, and people expect the same image seed to generate the same image, maybe the temp is set to 0? Would be cool to be able to play with that.
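For context on the temperature question, here is a minimal sketch of what temperature does in text generation: it divides the logits before the softmax that produces next-token probabilities. The function name and the example logits are made up for illustration. As far as I understand, a text encoder for an image model passes hidden states on and never samples a token, so there may be nothing for a temperature slider to act on; treat that as an assumption about how Lumina uses Gemma, not something confirmed in this thread.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities, scaled by a sampling temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; 0 is usually treated as argmax")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                          # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three candidate tokens.
example_logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(example_logits, 0.5)   # low temp: mass concentrates on the top token
flat = softmax_with_temperature(example_logits, 2.0)    # high temp: distribution flattens
```

Low temperatures push the distribution toward the single most likely token, which is why a temperature near 0 makes generation close to deterministic.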

Owner Feb 13, 2025

The upper side has transforming code that makes the tensors decoded from GGUF not work; wait for their new update or use the safetensors for the time being. 5GB isn't considered large, though I understand that a text encoder being a lot larger than the model itself sounds funny.
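To illustrate what "decoding tensors from GGUF" involves for an 8-bit quant, here is a rough sketch of dequantizing llama.cpp's Q8_0 block format in plain Python. It assumes the "Q_8" files mentioned in this thread are Q8_0, where each block stores one fp16 scale followed by 32 signed 8-bit values. The function name `dequantize_q8_0` is made up for this example, and this is not the node's actual code.

```python
import struct

Q8_0_BLOCK_VALUES = 32                      # weights per block
Q8_0_BLOCK_BYTES = 2 + Q8_0_BLOCK_VALUES    # fp16 scale + 32 int8 values = 34 bytes


def dequantize_q8_0(data):
    """Convert raw Q8_0 bytes back to a flat list of floats (scale * int8)."""
    if len(data) % Q8_0_BLOCK_BYTES != 0:
        raise ValueError("data length is not a multiple of the Q8_0 block size")
    values = []
    for offset in range(0, len(data), Q8_0_BLOCK_BYTES):
        # "<e32b": little-endian half-precision scale, then 32 signed bytes.
        scale, *quants = struct.unpack_from("<e32b", data, offset)
        values.extend(scale * q for q in quants)
    return values
```

The recovered floats only approximate the original weights. After this step the loader still has to hand the tensors over under the names and layout the encoder code expects, which appears to be the part the owner describes as breaking.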

Owner Feb 13, 2025

You folks can convert it yourselves and test it: upgrade your node and you will find a new convertor (zero); any form of safetensors can be converted by it.

Apr 3, 2025

Has anyone done any work on this, or is it a dead model? It seems it hasn't really taken off like others. But I would think that with an LLM using its latent space to guide the generation, it could maybe be similar to OpenAI's image generation, maybe with Gemma 3 or something.