DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling (arXiv:2406.11617)
⚠️ Note: This model requires the Mistral Tekken chat template.
This model was merged with the karcher method, with the iteration limit set to 500 (max_iter: 500).
base_model: B:/12B/models--mistralai--Mistral-Nemo-Instruct-2407
models:
- model: B:/12B/models--aixonlab--Aether-12b
- model: B:/12B/models--aixonlab--Zinakha-12b
- model: B:/12B/models--allura-org--Bigger-Body-12b
- model: B:/12B/models--allura-org--MN-12b-RP-Ink
- model: B:/12B/models--allura-org--remnant-mn-12b
- model: B:/12B/models--anthracite-org--magnum-v4-12b
- model: B:/12B/models--ArliAI--Mistral-Nemo-12B-ArliAI-RPMax-v1.2
- model: B:/12B/models--Babsie--Opulus-12B-v3
- model: B:/12B/models--BeaverAI--mistral-doryV2-12b
- model: B:/12B/models--crestf411--nemo-sunfall-v0.6.1
- model: B:/12B/models--EpistemeAI2--Fireball-Mistral-Nemo-12B-Philos
- model: B:/12B/models--EpistemeAI--Mistral-Nemo-Instruct-12B-Philosophy-Math
- model: B:/12B/models--Fizzarolli--MN-12b-Rosier-v1
- model: B:/12B/models--HumanLLMs--Human-Like-Mistral-Nemo-Instruct-2407
- model: B:/12B/models--IIEleven11--Kalypso
- model: B:/12B/models--intervitens--mini-magnum-12b-v1.1
- model: B:/12B/models--jtatman--mistral_nemo_12b_reasoning_psychology_lora
- model: B:/12B/models--KOOWEEYUS--BlackSheep-RP-12B
- model: B:/12B/models--Lambent--Arsenic-Shahrazad-12B-v2
- model: B:/12B/models--Lambent--Arsenic-Shahrazad-12B-v3
- model: B:/12B/models--Lambent--arsenic-nemo-unleashed-12B
- model: B:/12B/models--Lambent--Gilded-Arsenic-12B
- model: B:/12B/models--mistralai--Mistral-Nemo-Instruct-2407
- model: B:/12B/models--nbeerbower--Lyra-Gutenberg-mistral-nemo-12B
- model: B:/12B/models--nbeerbower--Lyra4-Gutenberg-12B
- model: B:/12B/models--nbeerbower--mistral-nemo-bophades-12B
- model: B:/12B/models--nbeerbower--mistral-nemo-gutenberg-12B-v3
- model: B:/12B/models--nbeerbower--mistral-nemo-gutenberg-12B-v4
- model: B:/12B/models--nbeerbower--Mistral-Nemo-Gutenberg-Doppel-12B
- model: B:/12B/models--nbeerbower--Mistral-Nemo-Gutenberg-Encore-12B
- model: B:/12B/models--nbeerbower--Mistral-Nemo-Gutenberg-Vitus-12B
- model: B:/12B/models--nbeerbower--mistral-nemo-wissenschaft-12B
- model: B:/12B/models--NeverSleepHistorical--lumi-nemo-e2.0
- model: B:/12B/models--NeverSleep--Lumimaid-v0.2-12B
- model: B:/12B/models--nothingiisreal--Celeste-12B-V1.6
- model: B:/12B/models--nothingiisreal--MN-12B-Celeste-V1.9
- model: B:/12B/models--PocketDoc--Dans-DangerousWinds-V1.1.0-12b
- model: B:/12B/models--ReadyArt--Dark-Nexus-12B-v2.0
- model: B:/12B/models--ReadyArt--Forgotten-Safeword-12B-v4.0
- model: B:/12B/models--ReadyArt--Omega-Darker_The-Final-Directive-12B
- model: B:/12B/models--romaingrx--red-teamer-mistral-nemo
- model: B:/12B/models--Sao10K--MN-12B-Lyra-v1
- model: B:/12B/models--Sao10K--MN-12B-Lyra-v4
- model: B:/12B/models--shisa-ai--shisa-v2-mistral-nemo-12b
- model: B:/12B/models--sleepdeprived3--Christian-Bible-Expert-v2.0-12B
- model: B:/12B/models--SuperbEmphasis--MN-12b-RP-Ink-RP-Longform
- model: B:/12B/models--SuperbEmphasis--Omega-Darker_The-Final-Directive-Longform-Stage2-ERP-12B-v0.2
- model: B:/12B/models--TheDrummer--Rivermind-12B-v1
- model: B:/12B/models--TheDrummer--Rocinante-12B-v1
- model: B:/12B/models--TheDrummer--Rocinante-X-12B-v1
- model: B:/12B/models--Trappu--Nemo-Picaro-12B
- model: B:/12B/models--Undi95--LocalC-12B-e2.0
- model: B:/12B/models--VAGOsolutions--SauerkrautLM-Nemo-12b-Instruct
merge_method: karcher
parameters:
  max_iter: 500
  tol: 1.0e-9
dtype: float32
out_dtype: bfloat16
tokenizer:
  source: union
chat_template: auto
name: 🦑 Kraken-Karcher-12B-v1
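To make the roles of `max_iter` and `tol` in the config above more concrete, here is a minimal sketch of a Karcher (Riemannian) mean of unit vectors on a hypersphere. It is an illustration under stated assumptions, not mergekit's actual implementation: the function name `karcher_mean`, the equal weighting, and the choice of the normalized arithmetic mean as the starting point are all hypothetical. The sketch also assumes the inputs are not antipodal and do not cancel out.

```python
import numpy as np


def karcher_mean(vectors, max_iter=500, tol=1e-9):
    """Equal-weight Karcher mean of vectors projected onto the unit sphere.

    Hypothetical helper for illustration only. Assumes no pair of inputs
    is (nearly) antipodal and that the arithmetic mean is non-zero.
    """
    units = [np.asarray(v, dtype=np.float64) for v in vectors]
    units = [u / np.linalg.norm(u) for u in units]

    # Start from the normalized arithmetic mean.
    mean = np.mean(units, axis=0)
    mean /= np.linalg.norm(mean)

    for _ in range(max_iter):
        # Log map: lift every point into the tangent space at `mean`
        # and average the resulting tangent vectors.
        tangent = np.zeros_like(mean)
        for u in units:
            cos_theta = np.clip(np.dot(mean, u), -1.0, 1.0)
            theta = np.arccos(cos_theta)
            if theta < 1e-12:
                continue  # point coincides with the current estimate
            tangent += (theta / np.sin(theta)) * (u - cos_theta * mean)
        tangent /= len(units)

        step = np.linalg.norm(tangent)
        if step < tol:
            break  # converged: the averaged tangent vector is ~zero

        # Exp map: move along the geodesic in the averaged direction.
        mean = np.cos(step) * mean + np.sin(step) * (tangent / step)
        mean /= np.linalg.norm(mean)  # guard against numerical drift

    return mean


if __name__ == "__main__":
    angles = [0.0, 0.2, 0.7]
    points = [np.array([np.cos(a), np.sin(a)]) for a in angles]
    print(karcher_mean(points))
```

In this sketch the loop ends either when the averaged tangent vector is shorter than `tol` or when `max_iter` is reached, so 500 acts as an upper bound on the number of iterations rather than a fixed count.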
ID | Status | Delta Norm | Orig Size | Model Name
----------------------------------------------------------------------------------------------------
#1 | OK | 1.6926 | 74448896 | models--aixonlab--Aether-12b
#2 | OK | 2.4361 | 74448896 | models--aixonlab--Zinakha-12b
#3 | OK | 0.0407 | 74448896 | models--allura-org--Bigger-Body-12b
#4 | OK | 1.6611 | 74448896 | models--allura-org--MN-12b-RP-Ink
#5 | OK | 0.0866 | 74449920 | models--allura-org--remnant-mn-12b
#6 | OK | 1.5070 | 74448896 | models--anthracite-org--magnum-v4-12b
#7 | OK | 0.7476 | 74448896 | models--ArliAI--Mistral-Nemo-12B-ArliAI-RPMax-v1.2
#8 | OK | 4.6310 | 74448896 | models--Babsie--Opulus-12B-v3
#9 | HIGH MAG | 5.2080 | 74448896 | models--BeaverAI--mistral-doryV2-12b
#10 | OK | 0.3196 | 74448896 | models--crestf411--nemo-sunfall-v0.6.1
#11 | HIGH MAG | 5.7044 | 74448896 | models--EpistemeAI2--Fireball-Mistral-Nemo-12B-Philos
#12 | OK | 2.3099 | 74448896 | models--EpistemeAI--Mistral-Nemo-Instruct-12B-Philosophy-Math
#13 | HIGH MAG | 5.2074 | 74448896 | models--Fizzarolli--MN-12b-Rosier-v1
#14 | OK | 0.0452 | 74448896 | models--HumanLLMs--Human-Like-Mistral-Nemo-Instruct-2407
#15 | OK | 1.6716 | 74448896 | models--IIEleven11--Kalypso
#16 | HIGH MAG | 5.1134 | 74449408 | models--intervitens--mini-magnum-12b-v1.1
#17 | OK | 2.0833 | 74448896 | models--jtatman--mistral_nemo_12b_reasoning_psychology_lora
#18 | OK | 2.9633 | 74448896 | models--KOOWEEYUS--BlackSheep-RP-12B
#19 | OK | 2.8456 | 74448896 | models--Lambent--Arsenic-Shahrazad-12B-v2
#20 | OK | 2.8456 | 74448896 | models--Lambent--Arsenic-Shahrazad-12B-v3
#21 | OK | 2.8456 | 74448896 | models--Lambent--arsenic-nemo-unleashed-12B
#22 | OK | 2.8461 | 74448896 | models--Lambent--Gilded-Arsenic-12B
#23 | OK | 0.0000 | 74448896 | models--mistralai--Mistral-Nemo-Instruct-2407
#24 | OK | 3.8395 | 74448896 | models--nbeerbower--Lyra-Gutenberg-mistral-nemo-12B
#25 | OK | 0.4548 | 74448896 | models--nbeerbower--Lyra4-Gutenberg-12B
#26 | OK | 0.0451 | 74448896 | models--nbeerbower--mistral-nemo-bophades-12B
#27 | HIGH MAG | 5.1134 | 74449408 | models--nbeerbower--mistral-nemo-gutenberg-12B-v3
#28 | OK | 1.9967 | 74448896 | models--nbeerbower--mistral-nemo-gutenberg-12B-v4
#29 | OK | 0.0272 | 74448896 | models--nbeerbower--Mistral-Nemo-Gutenberg-Doppel-12B
#30 | OK | 0.0350 | 74448896 | models--nbeerbower--Mistral-Nemo-Gutenberg-Encore-12B
#31 | OK | 0.0501 | 74448896 | models--nbeerbower--Mistral-Nemo-Gutenberg-Vitus-12B
#32 | OK | 0.0353 | 74448896 | models--nbeerbower--mistral-nemo-wissenschaft-12B
#33 | HIGH MAG | 5.1626 | 74449408 | models--NeverSleepHistorical--lumi-nemo-e2.0
#34 | HIGH MAG | 5.1647 | 74449408 | models--NeverSleep--Lumimaid-v0.2-12B
#35 | OK | 0.0208 | 74448896 | models--nothingiisreal--Celeste-12B-V1.6
#36 | OK | 4.1274 | 74448896 | models--nothingiisreal--MN-12B-Celeste-V1.9
#37 | HIGH MAG | 5.4769 | 74448896 | models--PocketDoc--Dans-DangerousWinds-V1.1.0-12b
#38 | OK | 2.4160 | 74448896 | models--ReadyArt--Dark-Nexus-12B-v2.0
#39 | OK | 2.4154 | 74448896 | models--ReadyArt--Forgotten-Safeword-12B-v4.0
#40 | OK | 0.0281 | 74448896 | models--ReadyArt--Omega-Darker_The-Final-Directive-12B
#41 | OK | 0.0000 | 74448896 | models--romaingrx--red-teamer-mistral-nemo
#42 | OK | 3.8395 | 74448896 | models--Sao10K--MN-12B-Lyra-v1
#43 | OK | 0.4543 | 74448896 | models--Sao10K--MN-12B-Lyra-v4
#44 | OK | 2.9147 | 74448896 | models--shisa-ai--shisa-v2-mistral-nemo-12b
#45 | OK | 0.0123 | 74448896 | models--sleepdeprived3--Christian-Bible-Expert-v2.0-12B
#46 | OK | 1.6650 | 74448896 | models--SuperbEmphasis--MN-12b-RP-Ink-RP-Longform
#47 | OK | 0.1050 | 74448896 | models--SuperbEmphasis--Omega-Darker_The-Final-Directive-Longform-Stage2-ERP-12B-v0.2
#48 | OK | 1.6959 | 74448896 | models--TheDrummer--Rivermind-12B-v1
#49 | OK | 1.9967 | 74448896 | models--TheDrummer--Rocinante-12B-v1
#50 | OK | 2.2716 | 74448896 | models--TheDrummer--Rocinante-X-12B-v1
#51 | OK | 4.4899 | 74448896 | models--Trappu--Nemo-Picaro-12B
#52 | HIGH MAG | 5.1950 | 74448896 | models--Undi95--LocalC-12B-e2.0
#53 | OK | 1.4549 | 74448896 | models--VAGOsolutions--SauerkrautLM-Nemo-12b-Instruct
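The table does not say how Delta Norm and Status are computed. One plausible reading, sketched below, is that Delta Norm is the L2 norm of the difference between a model's tensor and the corresponding base-model tensor, and that HIGH MAG marks norms above a cutoff. Everything in the sketch is an assumption: the helper name `audit_delta` is hypothetical, and the 5.0 threshold is only a guess that fits the table, where 4.6310 is the largest value marked OK and 5.1134 is the smallest marked HIGH MAG.

```python
import numpy as np

# Hypothetical cutoff: the table only shows it lies between 4.6310 and 5.1134.
HIGH_MAG_THRESHOLD = 5.0


def audit_delta(base, tuned, threshold=HIGH_MAG_THRESHOLD):
    """Return (status, delta_norm) for one tensor pair.

    Hypothetical helper: measures how far `tuned` has moved from `base`
    using the L2 (Frobenius) norm of the difference.
    """
    delta = np.asarray(tuned, dtype=np.float32) - np.asarray(base, dtype=np.float32)
    norm = float(np.linalg.norm(delta))
    status = "HIGH MAG" if norm > threshold else "OK"
    return status, norm


if __name__ == "__main__":
    base = np.zeros(4, dtype=np.float32)
    print(audit_delta(base, base))                      # identical weights
    print(audit_delta(base, np.array([6.0, 8.0, 0, 0])))  # large delta
```

Under this reading, the 0.0000 entry for mistralai--Mistral-Nemo-Instruct-2407 (row #23) is what one would expect when the base model is compared against itself.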