---
license: apache-2.0
datasets:
- shibing624/alpaca-zh
- hiyouga/DPO-En-Zh-20k
language:
- zh
- en
metrics:
- accuracy
base_model: Qwen/Qwen2.5-1.5B
pipeline_tag: text-generation
library_name: transformers
tags:
- qwen
- dpo
- sft
- alignment
- ollama
---

A lightweight model fine-tuned from Qwen2.5-1.5B via SFT and DPO alignment. Enjoy!

| Phase | Metric | Value |
| :--- | :--- | :--- |
| SFT | Final Loss | 1.65 |
| DPO | Accuracies | 70.4% |
| DPO | Margins | 1.022 |
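
The card does not say how the DPO numbers were computed. Assuming they follow the convention used by common DPO trainers such as TRL (which logs `rewards/accuracies` and `rewards/margins`), "Accuracies" would be the fraction of preference pairs where the chosen response gets a higher implicit reward than the rejected one, and "Margins" would be the mean difference between the two rewards. The sketch below illustrates that reading in plain Python; the function name `dpo_reward_metrics` and the sample values are hypothetical and are not taken from this model's training run.

```python
from statistics import mean


def dpo_reward_metrics(chosen_rewards, rejected_rewards):
    """Return (accuracy, margin) for paired implicit rewards.

    Assumed convention (not confirmed by the model card):
      accuracy = share of pairs with chosen reward > rejected reward
      margin   = mean of (chosen reward - rejected reward)
    """
    if len(chosen_rewards) != len(rejected_rewards) or not chosen_rewards:
        raise ValueError("need two non-empty lists of equal length")

    # Per-pair reward difference: positive means the chosen answer won.
    diffs = [c - r for c, r in zip(chosen_rewards, rejected_rewards)]

    accuracy = mean(1.0 if d > 0 else 0.0 for d in diffs)
    margin = mean(diffs)
    return accuracy, margin


# Made-up illustrative rewards for four preference pairs.
chosen = [1.5, 0.25, 2.0, -0.5]
rejected = [0.5, 0.75, 0.0, -1.0]
accuracy, margin = dpo_reward_metrics(chosen, rejected)
print(f"accuracy={accuracy:.3f} margin={margin:.3f}")
```

Under this reading, an accuracy of 70.4% would mean the model prefers the chosen response in roughly seven of every ten pairs, and a margin of 1.022 would be the average reward gap between the chosen and rejected responses.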