How to use YLX1965/medical-model with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="YLX1965/medical-model")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("YLX1965/medical-model", dtype="auto")
How to use YLX1965/medical-model with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="YLX1965/medical-model",
    filename="unsloth.Q8_0.gguf",
)

llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "My name is Clara and I live in Berkeley. What is my name?"},
    ]
)
How to use YLX1965/medical-model with llama.cpp:
brew install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf YLX1965/medical-model:Q8_0

# Run inference directly in the terminal:
llama-cli -hf YLX1965/medical-model:Q8_0

winget install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf YLX1965/medical-model:Q8_0

# Run inference directly in the terminal:
llama-cli -hf YLX1965/medical-model:Q8_0

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases

# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf YLX1965/medical-model:Q8_0

# Run inference directly in the terminal:
./llama-cli -hf YLX1965/medical-model:Q8_0

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli

# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf YLX1965/medical-model:Q8_0

# Run inference directly in the terminal:
./build/bin/llama-cli -hf YLX1965/medical-model:Q8_0
How to use YLX1965/medical-model with Ollama:
ollama run hf.co/YLX1965/medical-model:Q8_0
How to use YLX1965/medical-model with Unsloth Studio:
curl -fsSL https://unsloth.ai/install.sh | sh

# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for YLX1965/medical-model to start chatting

irm https://unsloth.ai/install.ps1 | iex

# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for YLX1965/medical-model to start chatting

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for YLX1965/medical-model to start chatting
How to use YLX1965/medical-model with Docker Model Runner:
docker model run hf.co/YLX1965/medical-model:Q8_0
How to use YLX1965/medical-model with Lemonade:
# Download Lemonade from https://lemonade-server.ai/
lemonade pull YLX1965/medical-model:Q8_0
lemonade run user.medical-model-Q8_0
lemonade list
This model is a fine-tuned version of unsloth/DeepSeek-R1-Distill-Llama-8B on the FreedomIntelligence/medical-o1-reasoning-SFT dataset (a Chinese medical question-answering dataset). It is designed for clinical reasoning and answering medical questions. The model is saved in GGUF format with Q8_0 quantization.
This model is intended for research and educational purposes in medical AI. It can be used to answer medical questions, assist with clinical reasoning, and explore the capabilities of LLMs in healthcare. Important: this model is not a substitute for professional medical advice. Always consult a qualified healthcare provider with any health concerns.
The model was fine-tuned using LoRA (Low-Rank Adaptation) with the following parameters:
Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj; optimizer: adamw_8bit. Training was performed with Unsloth to improve efficiency and reduce VRAM usage; gradient checkpointing was also enabled.
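The training setup above can be collected into a single configuration for reference; a minimal sketch in plain Python (the rank, alpha, and dropout values are assumptions, as the card does not state them):

```python
# LoRA fine-tuning configuration described in the card, as a plain dict.
# r, lora_alpha, and lora_dropout are illustrative assumptions.
lora_config = {
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "r": 16,             # assumed LoRA rank
    "lora_alpha": 16,    # assumed scaling factor
    "lora_dropout": 0.0, # assumed dropout
    "optimizer": "adamw_8bit",
    "use_gradient_checkpointing": True,
}

# Seven projection matrices receive low-rank adapters.
print(len(lora_config["target_modules"]))
```

The same fields map directly onto Unsloth's `get_peft_model` arguments if you want to reproduce the fine-tune.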
You can use this model via Ollama:
ollama run Anita2023/medical-model:q8_0
from huggingface_hub import hf_hub_download
# Download the model
hf_hub_download(repo_id="Anita2023/medical-model", filename="medical-model.gguf")
Below is an instruction that describes a task, paired with an input that provides further context.
Write a response that appropriately completes the request.
Before answering, think carefully about the question and create a step-by-step chain of thought to ensure a logical and accurate response.
### Instruction:
You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning.
Please answer the following medical question.
### Question:
A patient with acute appendicitis has been symptomatic for 5 days; the abdominal pain has eased slightly but fever persists, and a tender mass is found in the right lower quadrant on physical examination. How should this be managed?
### Response:
<think>
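The template above can be filled in programmatically before being passed to the model; a minimal sketch (the `PROMPT_TEMPLATE` name and example question are illustrative, the wording follows the card):

```python
# Assemble the card's prompt template with a user question substituted in.
PROMPT_TEMPLATE = """Below is an instruction that describes a task, paired with an input that provides further context.
Write a response that appropriately completes the request.
Before answering, think carefully about the question and create a step-by-step chain of thought to ensure a logical and accurate response.

### Instruction:
You are a medical expert with advanced knowledge in clinical reasoning, diagnostics, and treatment planning.
Please answer the following medical question.

### Question:
{question}

### Response:
<think>"""

# The prompt deliberately ends with the open <think> tag so the model
# continues with its chain of thought before the final answer.
prompt = PROMPT_TEMPLATE.format(
    question="What are the common causes of acute appendicitis?"
)
print(prompt)
```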
@misc{medical-model,
author = {Anita2023},
title = {Medical Model: Fine-tuned DeepSeek-R1-Distill-Llama-8B for Medical Question Answering},
year = {2024},
publisher = {Hugging Face},
journal = {Hugging Face Model Hub},
howpublished = {\url{https://huggingface.co/Anita2023/medical-model}},
}
Limitations
The model's performance is limited by the size and quality of the training data.
It may not answer all medical questions accurately.
It was trained primarily on Chinese medical text.
The model may exhibit biases present in the training data.