anthonym21/Mistral-7B-v0.3-CoDA-GQA-L
Tags: Text Generation, Transformers, Safetensors, PyTorch, English, mistral, attention, differential-attention, bounded-memory, kv-cache, landmark, coda-gqa-l, text-generation-inference
License: apache-2.0
Branch: main
Total size: 17.2 GB
1 contributor
History: 11 commits
Latest commit: c842ab6 (verified) by anthonym21, 4 days ago
Phase 2 bounded checkpoint (400+600 steps, medium config)
File                        Size       Xet  Updated      Last commit message
.gitattributes              1.57 kB    -    10 days ago  Fix tokenizer: upload complete tokenizer from mistralai/Mistral-7B-v0.3
README.md                   11.6 kB    -    10 days ago  Add model card from paper/hf_model_card.md
coda_adapters.pt            2.69 GB    xet  4 days ago   Phase 2 bounded checkpoint (400+600 steps, medium config)
config.json                 689 Bytes  -    10 days ago  Mistral 7B v0.3 + CoDA-GQA-L: two-phase trained (unbounded + bounded)
generation_config.json      110 Bytes  -    10 days ago  Mistral 7B v0.3 + CoDA-GQA-L: two-phase trained (unbounded + bounded)
model.safetensors           14.5 GB    xet  10 days ago  Mistral 7B v0.3 + CoDA-GQA-L: two-phase trained (unbounded + bounded)
special_tokens_map.json     414 Bytes  -    10 days ago  Fix tokenizer: upload complete tokenizer from mistralai/Mistral-7B-v0.3
tokenizer.json              1.96 MB    -    10 days ago  Fix tokenizer: upload complete tokenizer from mistralai/Mistral-7B-v0.3
tokenizer.model             587 kB     xet  10 days ago  Fix tokenizer: upload complete tokenizer from mistralai/Mistral-7B-v0.3
tokenizer.model.v3          587 kB     xet  10 days ago  Fix tokenizer: upload complete tokenizer from mistralai/Mistral-7B-v0.3
tokenizer_config.json       137 kB     -    10 days ago  Fix tokenizer: upload complete tokenizer from mistralai/Mistral-7B-v0.3
training_config.json        787 Bytes  -    10 days ago  Mistral 7B v0.3 + CoDA-GQA-L: two-phase trained (unbounded + bounded)
training_training_log.json  30.7 kB    -    10 days ago  Mistral 7B v0.3 + CoDA-GQA-L: two-phase trained (unbounded + bounded)