EAGLE-Vicuna-13B-v1.3
This repository provides an EAGLE draft model trained for lmsys/vicuna-13b-v1.3, enabling fast inference through speculative decoding.
Model Details
- Base model: lmsys/vicuna-13b-v1.3
- Method: EAGLE, a speculative decoding method that drafts tokens with a lightweight single-decoder-layer head conditioned on the base model's hidden states
- Training data: ShareGPT conversations, processed with eagle.ge_data (see Data Generation below)
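A minimal inference sketch, assuming the SafeAILab/EAGLE repository's EaModel API; the class path, argument names, and prompt template may differ across EAGLE versions:

```python
import torch
from eagle.model.ea_model import EaModel  # from the SafeAILab/EAGLE package (assumed installed)

# Load the base model and this draft model together for speculative decoding.
model = EaModel.from_pretrained(
    base_model_path="lmsys/vicuna-13b-v1.3",
    ea_model_path="Gavin1104/eagle-vicuna-13b-v1.3",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
    device_map="auto",
)
model.eval()

# Simple Vicuna-style prompt; in practice the FastChat conversation template is used.
prompt = "USER: What is speculative decoding? ASSISTANT:"
input_ids = model.tokenizer([prompt], return_tensors="pt").input_ids.cuda()
output_ids = model.eagenerate(input_ids, temperature=0.0, max_new_tokens=256)
print(model.tokenizer.decode(output_ids[0], skip_special_tokens=True))
```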
Model Configuration
base_model: lmsys/vicuna-13b-v1.3
EAGLE draft model architecture
Model(
  (embed_tokens): Embedding(32000, 4096, padding_idx=0)
  (layers): ModuleList(
    (0): LlamaDecoderLayer(
      (self_attn): LlamaAttention(
        (q_proj): Linear(in_features=4096, out_features=4096, bias=False)
        (k_proj): Linear(in_features=4096, out_features=4096, bias=False)
        (v_proj): Linear(in_features=4096, out_features=4096, bias=False)
        (o_proj): Linear(in_features=4096, out_features=4096, bias=False)
        (rotary_emb): LlamaRotaryEmbedding()
      )
      (mlp): LlamaMLP(
        (gate_proj): Linear(in_features=4096, out_features=11008, bias=False)
        (up_proj): Linear(in_features=4096, out_features=11008, bias=False)
        (down_proj): Linear(in_features=11008, out_features=4096, bias=False)
        (act_fn): SiLU()
      )
      (post_attention_layernorm): LlamaRMSNorm()
    )
  )
  (fc): Linear(in_features=8192, out_features=4096, bias=True)
  (act): SiLU()
)
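The fc layer's in_features of 8192 reflects how EAGLE forms the draft input: the embedding of the current token is concatenated with the base model's last hidden state and projected back to the hidden size. A schematic sketch of that step (not the actual EAGLE code; dimensions follow the printout above):

```python
import torch
import torch.nn as nn

hidden_size = 4096  # per the printout above

embed_tokens = nn.Embedding(32000, hidden_size, padding_idx=0)
fc = nn.Linear(2 * hidden_size, hidden_size, bias=True)  # 8192 -> 4096

token_ids = torch.randint(0, 32000, (1, 8))   # draft input tokens
base_hidden = torch.randn(1, 8, hidden_size)  # hidden states from the base model

x = torch.cat([embed_tokens(token_ids), base_hidden], dim=-1)  # (1, 8, 8192)
x = fc(x)                                                      # (1, 8, 4096)
# x then passes through the single LlamaDecoderLayer (and the printed SiLU),
# and the base model's LM head scores the drafted tokens.
```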
vicuna-13B-config.json
The draft-model configuration (num_hidden_layers is 1 because the draft head has a single decoder layer):
{
  "architectures": [
    "LlamaForCausalLM"
  ],
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 5120,
  "initializer_range": 0.02,
  "intermediate_size": 13824,
  "max_position_embeddings": 2048,
  "model_type": "llama",
  "num_attention_heads": 40,
  "num_hidden_layers": 1,
  "pad_token_id": 0,
  "rms_norm_eps": 1e-06,
  "tie_word_embeddings": false,
  "torch_dtype": "float16",
  "transformers_version": "4.28.1",
  "use_cache": true,
  "vocab_size": 32000
}
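The config can be sanity-checked locally before upload; the checkpoint path below is illustrative:

```python
from transformers import LlamaConfig

# Load the draft-model config from the local checkpoint directory (path is illustrative).
cfg = LlamaConfig.from_pretrained("checkpoints/eagle-vicuna-13B")
print(cfg.hidden_size, cfg.intermediate_size, cfg.num_hidden_layers)  # 5120 13824 1
```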
Model Training
Data Generation
Build the EAGLE training data from ShareGPT by running the base model and saving its hidden states:
python -m eagle.ge_data.allocation --outdir ../data
Training
Train the single-layer draft head with bf16 mixed precision via accelerate:
accelerate launch -m --mixed_precision=bf16 eagle.train.main --tmpdir eagle/data/sharegpt_0_67999_mufp16 --cpdir eagle/checkpoint --configpath eagle/train/vicuna_13B_config.json
Model Upload
from huggingface_hub import HfApi

api = HfApi()
# Upload only the modified README.md file
api.upload_file(
    path_or_fileobj="checkpoints/eagle-vicuna-13B/README.md",  # local path of the edited README
    path_in_repo="README.md",                                  # target path in the repo (root directory)
    repo_id="Gavin1104/eagle-vicuna-13b-v1.3",
    repo_type="model"
)
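To push the entire checkpoint directory instead of a single file, huggingface_hub's upload_folder can be used; the local path below is illustrative:

```python
from huggingface_hub import HfApi

api = HfApi()
# Upload the whole checkpoint directory to the model repo (local path is illustrative).
api.upload_folder(
    folder_path="checkpoints/eagle-vicuna-13B",
    repo_id="Gavin1104/eagle-vicuna-13b-v1.3",
    repo_type="model",
)
```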