QuantFactory/Llama-3.2-Taiwan-Legal-3B-Instruct-GGUF
This is a quantized version of lianghsun/Llama-3.2-Taiwan-Legal-3B-Instruct, created using llama.cpp.
Original Model Card
Model Card for Model lianghsun/Llama-3.2-Taiwan-Legal-3B-Instruct
Fine-tuned from meta-llama/Llama-3.2-3B-Instruct on datasets of Republic of China (Taiwan) statutes, court judgments, and related material.
Model Update History
| Update Date | Model Version | Key Changes |
|---|---|---|
| 2024-10-17 | v1.1.0 | Experimental fine-tuning on v1.0.0 with added legal code data from the Republic of China (Taiwan) |
| 2024-10-10 | v1.0.0 | Full model training completed, but missing legal code data for the Republic of China (Taiwan) |
| 2024-09-27 | v0.1.0 | Model v0.1.0 released, but training was interrupted after 3 epochs due to lack of compute resources |
Model Details
Model Description
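Based on meta-llama/Llama-3.2-3B-Instruct, this model was fine-tuned on datasets of statutes and related court judgments from the Republic of China (Taiwan) to strengthen its domain knowledge and practical ability in law. The datasets cover the structure of statutes, the format of judgments, and the legal language and terminology commonly used in court, and they include some legal data science tasks, so that the model can understand and handle questions about Taiwan's legal system more accurately. The fine-tuning is intended to make the model more useful to legal professionals and to yield more precise responses and suggestions within Taiwan's legal framework.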
- Developed by: Huang Liang Hsun
- Model type: LlamaForCausalLM
- Language(s) (NLP): Primarily Traditional Chinese (zh-tw); fine-tuned on the legal terminology and court judgments of the Republic of China (Taiwan).
- License: llama3.2
- Finetuned from model: meta-llama/Llama-3.2-3B-Instruct
Model Sources
- Repository: lianghsun/Llama-3.2-Taiwan-Legal-3B-Instruct
- Demo: (WIP)
Uses
Direct Use
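The model can be used directly to understand and generate legal text in Traditional Chinese, and suits applications that handle questions about Taiwanese law. With its default instruction and response behavior it can provide legal information, clarify statutory provisions, and generate professionally worded legal responses. Direct uses include, but are not limited to, legal information lookup, legal text summarization, and basic dialogue about statutes.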
Downstream Use
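With further fine-tuning, the model can be applied to more specific legal tasks such as automated judgment analysis, legal named entity recognition (NER), statute number conversion, and support for legal compliance review. It can be integrated seamlessly into legal data science applications or LegalTech systems to help legal professionals and businesses work more efficiently.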
Out-of-Scope Use
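The model is not suited to generation tasks outside the legal domain, and it should not be used to give legal advice that may be misleading or wrong, especially without professional review. Do not use the model for unauthorized or illegal purposes, such as generating contentious or biased legal advice.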
Bias, Risks, and Limitations
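When generating statute or judgment content, the model may produce provisions or judgments that are fabricated or do not exist; this is an inherent limitation of the model. Users should check generated content carefully and must not treat model output as a legal basis. In practice, compare the output against reliable legal opinions and sources to confirm that it is accurate, lawful, and applicable.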
Recommendations
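Although this model has been fine-tuned on legal text, the limited volume of that text and the fact that the base model is a small language model (SLM) cap its capabilities. Users should be aware of the following risks and limitations:
Bias risk: The model may reflect latent biases in its training data. Because legal text is highly specific, the model may be more familiar with certain regulations, provisions, or decided cases and weaker in other areas. Its output may be biased in particular when it handles uncommon legal questions or new regulations it was not trained on.
Technical limitations: Although the model can handle most legal text, it may fail to give a precise answer for provisions that are structurally very complex or ambiguously worded. Users should not rely entirely on the model's output, especially in legal decision-making, where additional professional review is recommended.
Legal liability: The model is not a professional legal adviser, so its responses should not be treated as correct legal advice. Users should apply the model with sound judgment and in a professional context, and should avoid over-relying on it for critical decisions.
Misuse risk: Using the model improperly to give wrong or misleading legal advice may harm individuals or businesses. Users should apply the model to compliance or legal tasks with care, and should keep reviewing and correcting its output.
To reduce these risks, users should double-check model output before acting on it, especially where legal decisions are involved. At this stage the model is provided for research on large language models in the legal technology field and is not a substitute for the advice of legal professionals.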
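The secondary check recommended above can be partly automated. The following is a minimal sketch, not part of the original model card: it pulls article numbers written with Arabic numerals (for example 第 184 條) out of a model response and flags any that are missing from a trusted list. The function names and the `known_articles` set are hypothetical, and a real check would also need to handle Chinese numerals, statute names, and judgment case numbers.

```python
import re

# Matches article citations such as "第184條" or "第 184 條" (Arabic numerals only).
ARTICLE_PATTERN = re.compile(r"第\s*(\d+)\s*條")


def extract_article_numbers(text: str) -> list[int]:
    """Return the article numbers cited in `text`, in order of appearance."""
    return [int(number) for number in ARTICLE_PATTERN.findall(text)]


def find_unverified_articles(text: str, known_articles: set[int]) -> list[int]:
    """Return cited article numbers that are absent from the trusted set."""
    return [n for n in extract_article_numbers(text) if n not in known_articles]


# Hypothetical model output and a hypothetical trusted set for one statute.
sample_response = "依民法第 184 條及第 99999 條,行為人應負損害賠償責任。"
trusted_articles = {184, 185}
print(find_unverified_articles(sample_response, trusted_articles))
```

Any number the sketch reports still needs a human to look it up in an authoritative source; an empty result only means every cited number exists in the list, not that the model quoted the provision correctly.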
How to Get Started with the Model
Using vLLM
To serve this model with the vLLM Docker image, run:
docker run --runtime nvidia --gpus all \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HUGGING_FACE_HUB_TOKEN=<secret>" \
-p 8000:8000 \
--ipc=host \
vllm/vllm-openai:latest \
--model lianghsun/Llama-3.2-Taiwan-Legal-3B-Instruct
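Once the container is running, the server should accept OpenAI-compatible chat requests on port 8000 (the port published by `-p 8000:8000`). The sketch below uses only the Python standard library. The helper names, the `max_tokens` default, and the `localhost` URL are assumptions for illustration, not part of the original card.

```python
import json
import urllib.request

MODEL_ID = "lianghsun/Llama-3.2-Taiwan-Legal-3B-Instruct"
# Assumes the container above is running on the same machine.
CHAT_URL = "http://localhost:8000/v1/chat/completions"


def build_chat_payload(prompt: str, max_tokens: int = 512) -> dict:
    """Build an OpenAI-style chat completion request body."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def ask(prompt: str, url: str = CHAT_URL) -> str:
    """Send the prompt to the server and return the assistant's reply text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_chat_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example call (requires the vLLM server to be up), asking in Traditional Chinese:
# print(ask("請簡要說明民法第 184 條的內容。"))
```

Keeping the payload builder separate from the network call makes it easy to inspect or log the request before sending it.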
Training Details
Training Data (for v1.1.0)
- lianghsun/tw-legal-nlp
- lianghsun/tw-legal-synthetic-qa
- lianghsun/tw-law-article-qa
- lianghsun/tw-judgment-qa
- lianghsun/tw-bar-examination-2020-chat
- lianghsun/tw-emergency-medicine-bench
Training procedure
Preprocessing
None. We did not perform any additional pretraining on meta-llama/Llama-3.2-3B-Instruct or change its model architecture, and we use the tokenizer shipped with the base model.
Training hyperparameters (for v1.1.0)
The following hyperparameters were used during training:
- learning_rate: 0.0004378 (value at epoch 3.9)
- train_batch_size: 12
- eval_batch_size: Not specified
- seed: Not specified
- distributed_type: single-GPU
- num_devices: 1
- gradient_accumulation_steps: 512
- total_train_batch_size: 6144 (train_batch_size * gradient_accumulation_steps)
- optimizer: AdamW
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- num_epochs: 15
- grad_norm: 0.0899 (value at epoch 3.9)
- global_step: 645
Speeds, Sizes, Times (for v1.1.0)
- Duration: 92h 27m 40s
- Train runtime: 92h 27m 40s
- Train samples per second: Not directly available
- Train steps per second: Approximately 0.002 steps/s
- Total training FLOPs: Not directly provided
- Train loss: 0.0512 (at epoch 3.9)
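The derived figures above can be reproduced from the reported values. This is a small sketch of the arithmetic only; the samples-per-second estimate is not reported in the card and rests on two assumptions: that the 645 global steps span the full 92h 27m 40s runtime, and that every step consumed a full batch.

```python
# Values copied from the v1.1.0 figures listed above.
train_batch_size = 12
gradient_accumulation_steps = 512
num_devices = 1
global_step = 645
runtime_seconds = 92 * 3600 + 27 * 60 + 40  # 92h 27m 40s

# Effective batch size per optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Average optimizer steps per second over the whole run.
steps_per_second = global_step / runtime_seconds

# Rough throughput estimate under the assumptions stated in the lead-in.
samples_per_second = global_step * total_train_batch_size / runtime_seconds

print(total_train_batch_size)        # 6144
print(round(steps_per_second, 4))    # 0.0019
print(round(samples_per_second, 1))  # 11.9
```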
Evaluation
Testing Data, Factors & Metrics
Testing Data
Note: ..(WIP)..
Factors
Note: ..(WIP)..
Metrics
Note: ..(WIP)..
Results
Note: ..(WIP)..
Summary
Note: ..(WIP)..
Model Examination
Statute responses
Note: ..(WIP)..
Judgment content
Note: ..(WIP)..
Legal NLP tasks
Note: ..(WIP)..
Environmental Impact (for v1.1.0)
- Hardware Type: 1 x NVIDIA H100 NVL 80GB
- Hours used: 92h 27m 40s
- Cloud Provider: N/A
- Compute Region: N/A
- Carbon Emitted: N/A
Technical Specifications
Model Architecture and Objective
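The model is based on meta-llama/Llama-3.2-3B-Instruct and uses an autoregressive Transformer architecture for language modeling. Its main objective is to improve understanding and generation of Taiwanese legal text, particularly the specialized handling of judgments and statutes. Fine-tuned on a purpose-built collection of legal text, it can answer legal questions more precisely and offer related suggestions.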
Compute Infrastructure
Hardware (for v1.1.0)
- 1 x NVIDIA H100 NVL 80GB
Software
- Fine-tuning was performed with the hiyouga/LLaMA-Factory framework.
Citation
None.
Glossary
None.
More Information
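Compute
Although we have prepared many datasets covering the legal domain of the Republic of China (Taiwan), limited compute meant we could not train on all of them (indeed, we trained only on the legal text judged most fundamental, not the full collection), so the model has not yet reached its best performance. The current checkpoint therefore reflects those limited resources. If you are willing to sponsor compute, please contact me. I believe that fine-tuning on the additional legal corpora that are prepared but not yet used would bring the model to the best performance in the Traditional Chinese legal domain.
Ongoing updates
The model will be updated from time to time as further resources become available.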
Model Card Authors
Model Card Contact
Framework versions
- Transformers 4.45.2
- PyTorch 2.4.1+cu121
- Datasets 2.21.0
- Tokenizers 0.20.0