Reference: *Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time* (arXiv:2203.05482)
This is a merge of Broken-Tutu-24B-Unslop-v2.0 and Huihui-Mistral-Small-3.2-24B-Instruct-2506-abliterated created using mergekit. It mellows out some of the biases of Broken Tutu and steers it back towards baseline Mistral Small 3.2 24B. Note that the resulting model is still censored by default: like Broken Tutu, it requires an appropriate system prompt or jailbreak to produce unrestricted responses.
GGUF quants can be found at https://huggingface.co/concedo/CabbageSoup-24B-GGUF
This model was merged using the linear merge method, with Broken-Tutu-24B-Unslop-v2.0 as the base model.
The following models were included in the merge:

- Huihui-Mistral-Small-3.2-24B-Ablit-Novision
The following YAML configuration was used to produce this model:

```yaml
base_model: Broken-Tutu-24B-Unslop-v2.0
dtype: float32
merge_method: linear
modules:
  default:
    slices:
      - sources:
          - layer_range: [0, 40]
            model: Broken-Tutu-24B-Unslop-v2.0
            parameters:
              weight: 0.9
          - layer_range: [0, 40]
            model: Huihui-Mistral-Small-3.2-24B-Ablit-Novision
            parameters:
              weight: 0.1
```
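Conceptually, a linear merge like the one configured above is just a per-tensor weighted average of the two checkpoints' parameters (0.9 of Broken Tutu plus 0.1 of the abliterated model, applied to every tensor in layers 0–40). A minimal sketch with NumPy, using hypothetical tensor names and plain arrays in place of real model state dicts:

```python
import numpy as np

def linear_merge(tensors_a, tensors_b, w_a=0.9, w_b=0.1):
    """Weighted average of two matching state dicts (tensor name -> array),
    mirroring mergekit's linear method with the weights from the YAML above."""
    assert tensors_a.keys() == tensors_b.keys(), "checkpoints must share tensor names"
    return {name: w_a * tensors_a[name] + w_b * tensors_b[name]
            for name in tensors_a}

# Toy stand-ins for one layer's weights from each model (hypothetical names).
a = {"layers.0.weight": np.ones((2, 2))}    # plays the role of Broken Tutu
b = {"layers.0.weight": np.zeros((2, 2))}   # plays the role of the abliterated model
merged = linear_merge(a, b)
# merged["layers.0.weight"] is an array filled with 0.9
```

Because the merge only averages weights, the result has the same architecture and inference cost as either parent; the 0.9/0.1 split is why the output stays close to Broken Tutu while being nudged back towards baseline behavior.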