---
base_model: Qwen/Qwen2.5-7B-Instruct
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- trl
- gammacorpus
- zurich
- chat
- conversational
license: apache-2.0
language:
- en
datasets:
- rubenroy/GammaCorpus-v2-10k
pipeline_tag: text-generation
library_name: transformers
---
|  | |
| # Zurich 7B GammaCorpus v2-10k | |
| *A Qwen 2.5 model fine-tuned on the GammaCorpus dataset* | |
| ## Overview | |
| Zurich 7B GammaCorpus v2-10k is a fine-tune of Alibaba's **Qwen 2.5 7B Instruct** model. Zurich is designed to outperform other models that have a similar size while also showcasing [GammaCorpus v2-10k](https://huggingface.co/datasets/rubenroy/GammaCorpus-v2-10k). | |

## Model Details

- **Base Model:** [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)
- **Type:** Causal language model
- **Architecture:** Transformer with RoPE, SwiGLU, RMSNorm, and attention QKV bias
- **Number of Parameters:** 7.61B
- **Number of Parameters (Non-Embedding):** 6.53B
- **Number of Layers:** 28
- **Number of Attention Heads (GQA):** 28 for Q and 4 for KV
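
As a quick sanity check on the figures above, the short sketch below derives two numbers that the list implies but does not state: how many query heads share each KV head under grouped-query attention, and roughly how many parameters sit in the embedding tables. It uses only the values listed above; the variable names are illustrative.

```python
# Figures copied from the Model Details list above.
num_query_heads = 28            # attention heads for Q
num_kv_heads = 4                # attention heads shared for K and V
total_params_b = 7.61           # billions, all parameters
non_embedding_params_b = 6.53   # billions, excluding embeddings

# In grouped-query attention, each KV head serves a fixed group of query heads.
queries_per_kv_head = num_query_heads // num_kv_heads

# Parameters that live in the embedding tables (total minus non-embedding).
embedding_params_b = round(total_params_b - non_embedding_params_b, 2)

print(f"Query heads per KV head: {queries_per_kv_head}")
print(f"Embedding parameters: ~{embedding_params_b}B")
```

With 4 KV heads instead of 28, the key/value cache per token is 7 times smaller than it would be if every query head had its own KV head.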

## Training Details

Zurich-7B-GCv2-10k was fine-tuned on a single T4 GPU for ~20 minutes using the [Unsloth](https://unsloth.ai/) framework. It was trained for **60 Epochs**.

## Usage

### Requirements

We **strongly** recommend you use the latest version of the `transformers` package. You may install it via `pip` as follows:

```
pip install transformers
```

### Quickstart

Here is a code snippet with `apply_chat_template` to show you how to load the tokenizer and model and how to generate content:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "rubenroy/Zurich-7B-GCv2-10k"

# Load the model weights and tokenizer; dtype and device placement are chosen automatically.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "How tall is the Eiffel tower?"
messages = [
    {"role": "system", "content": "You are Zurich, an AI assistant built on the Qwen 2.5 7B model developed by Alibaba Cloud, and fine-tuned by Ruben Roy. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]

# Render the conversation into a single prompt string using the model's chat template.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)

# Drop the prompt tokens so that only the newly generated tokens are decoded.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
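
Two steps in the Quickstart are easy to gloss over: what `apply_chat_template` renders, and why the output is sliced before decoding. The sketch below illustrates both without downloading the model. It assumes the chat template follows the ChatML layout used by Qwen 2.5 (`<|im_start|>` and `<|im_end|>` markers); the real template ships with the tokenizer and handles more cases, so treat `render_chatml` and `strip_prompt` as hypothetical helpers for illustration only, and the token IDs as made up.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Approximate the ChatML text a Qwen 2.5 style chat template renders (assumption)."""
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues as the assistant.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


def strip_prompt(input_ids, output_ids):
    """Drop the echoed prompt tokens so only newly generated tokens remain."""
    return output_ids[len(input_ids):]


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "How tall is the Eiffel tower?"},
]
text = render_chatml(messages)
print(text)

# Toy token IDs (made up): generate() returns the prompt tokens followed by new ones.
prompt_ids = [101, 102, 103]
full_output_ids = [101, 102, 103, 7, 8, 9]
new_ids = strip_prompt(prompt_ids, full_output_ids)
print(new_ids)
```

Because `generate` returns the prompt followed by the completion, decoding without the slice would repeat the system and user messages in `response`.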

## About GammaCorpus

This model, like all Zurich models, is trained on GammaCorpus, a dataset on Hugging Face of structured and filtered multi-turn conversations.
GammaCorpus has four versions, each available in different sizes:

### GammaCorpus v1
- 10k UNFILTERED
- 50k UNFILTERED
- 70k UNFILTERED

Here is a link to the GCv1 dataset collection:<br>
https://huggingface.co/collections/rubenroy/gammacorpus-v1-67935e4e52a04215f15a7a60

### GammaCorpus v2
- **10k <-- This is the version of GammaCorpus v2 that the Zurich model you are using was trained on.**
- 50k
- 100k
- 500k
- 1m
- 5m

Here is a link to the GCv2 dataset collection:<br>
https://huggingface.co/collections/rubenroy/gammacorpus-v2-67935e895e1259c404a579df

### GammaCorpus CoT
- Math 170k

Here is a link to the GC-CoT dataset collection:<br>
https://huggingface.co/collections/rubenroy/gammacorpus-cot-6795bbc950b62b1ced41d14f

### GammaCorpus QA
- Fact 450k

Here is a link to the GC-QA dataset collection:<br>
https://huggingface.co/collections/rubenroy/gammacorpus-qa-679857017bb3855234c1d8c7

### The link to the full GammaCorpus dataset collection can be found [here](https://huggingface.co/collections/rubenroy/gammacorpus-67765abf607615a0eb6d61ac).

## Known Limitations

- **Bias:** We have tried our best to mitigate as much bias as we can, but please be aware that the model may still generate biased answers.

## Additional Information

### Licensing Information

The model is released under the **[Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0)**. Please refer to the license for usage rights and restrictions.