---
license: apache-2.0
datasets:
  - shibing624/alpaca-zh
  - hiyouga/DPO-En-Zh-20k
language:
  - zh
  - en
metrics:
  - accuracy
base_model: Qwen/Qwen2.5-1.5B
pipeline_tag: text-generation
library_name: transformers
tags:
  - qwen
  - dpo
  - sft
  - alignment
  - ollama
---

A lightweight text-generation model fine-tuned from Qwen/Qwen2.5-1.5B using supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) alignment. Enjoy!

| Phase | Metric | Value |
|-------|--------|-------|
| SFT | Final Loss | 1.65 |
| DPO | Accuracies | 70.4% |
| DPO | Margins | 1.022 |
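
This card does not say how the DPO numbers were computed. A minimal sketch, assuming they follow the common convention (for example the `rewards/accuracies` and `rewards/margins` values logged during DPO training): accuracy is the fraction of preference pairs where the chosen response gets a higher implicit reward than the rejected one, and margin is the mean reward difference. The function name `dpo_reward_stats` and the reward values are hypothetical and only illustrate the arithmetic.

```python
from statistics import mean


def dpo_reward_stats(chosen_rewards, rejected_rewards):
    """Return (accuracy, margin) for paired implicit rewards.

    accuracy: share of pairs where chosen reward > rejected reward.
    margin:   mean of (chosen reward - rejected reward).
    """
    if len(chosen_rewards) != len(rejected_rewards) or not chosen_rewards:
        raise ValueError("need two non-empty lists of equal length")
    diffs = [c - r for c, r in zip(chosen_rewards, rejected_rewards)]
    accuracy = mean(1.0 if d > 0 else 0.0 for d in diffs)
    margin = mean(diffs)
    return accuracy, margin


# Made-up rewards for four preference pairs (not from this model).
chosen = [1.5, 0.25, 2.0, -0.5]
rejected = [0.5, 0.75, 0.0, -1.0]
acc, margin = dpo_reward_stats(chosen, rejected)
print(f"accuracy={acc:.2f} margin={margin:.3f}")  # accuracy=0.75 margin=0.750
```

Under this reading, an accuracy of 70.4% would mean the model ranks the preferred response above the rejected one in roughly seven of every ten pairs.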