
Libraries

How to use bambisheng/UltraIF-8B-UltraComposer with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="bambisheng/UltraIF-8B-UltraComposer")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so it is visible when run as a script, not only in a notebook
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("bambisheng/UltraIF-8B-UltraComposer")
model = AutoModelForCausalLM.from_pretrained("bambisheng/UltraIF-8B-UltraComposer")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use bambisheng/UltraIF-8B-UltraComposer with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "bambisheng/UltraIF-8B-UltraComposer"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "bambisheng/UltraIF-8B-UltraComposer",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
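
The same request can also be sent from Python. The following is a minimal sketch using only the standard library, assuming the vLLM server started above is listening on `localhost:8000` and returns the usual OpenAI-compatible response shape. The helper names `build_chat_request` and `send_chat_request` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "bambisheng/UltraIF-8B-UltraComposer"


def build_chat_request(content: str, base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": content}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request: urllib.request.Request) -> str:
    """Send the request and return the text of the first choice.

    Assumes an OpenAI-compatible response body; requires a running server.
    """
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


request = build_chat_request("What is the capital of France?")
# Uncomment once the server is running:
# print(send_chat_request(request))
```

For the SGLang server, pass `base_url="http://localhost:30000"` instead.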

Use Docker

docker model run hf.co/bambisheng/UltraIF-8B-UltraComposer

SGLang

How to use bambisheng/UltraIF-8B-UltraComposer with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "bambisheng/UltraIF-8B-UltraComposer" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "bambisheng/UltraIF-8B-UltraComposer",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "bambisheng/UltraIF-8B-UltraComposer" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "bambisheng/UltraIF-8B-UltraComposer",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use bambisheng/UltraIF-8B-UltraComposer with Docker Model Runner:
```
docker model run hf.co/bambisheng/UltraIF-8B-UltraComposer
```

UltraIF-8B-UltraComposer

Links 🚀

UltraIF model series and data are available at 🤗 HuggingFace.

Also check out our 📚 Paper and 💻 Code.

Model Description

UltraIF-8B-UltraComposer is a specialized composer fine-tuned from Llama-3.1-8B-Instruct. It facilitates the synthesis of instructions collected in the wild with more complex and diverse constraints.

Introduction to UltraIF

UltraIF first constructs the UltraComposer by decomposing user instructions into simplified ones and constraints, along with corresponding evaluation questions. This specialized composer facilitates the synthesis of instructions with more complex and diverse constraints, while the evaluation questions ensure the correctness and reliability of the generated responses.

UltraIF then applies a Generate-then-Evaluate process. UltraComposer first incorporates constraints into instructions, and the responses generated for those instructions are then evaluated against the corresponding evaluation questions, covering various quality levels.
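
The control flow of Generate-then-Evaluate can be sketched as follows. This is an illustration under stated assumptions, not the official implementation: the three callables (`compose`, `respond`, `judge`) are hypothetical stand-ins for UltraComposer, a response generator, and an evaluator, and splitting responses into accepted and rejected lists is an assumption about how the evaluation result is used.

```python
from typing import Callable, List, Tuple


def generate_then_evaluate(
    instruction: str,
    compose: Callable[[str], Tuple[str, str]],
    respond: Callable[[str], List[str]],
    judge: Callable[[str, str, str], bool],
) -> Tuple[str, List[str], List[str]]:
    """Return (augmented instruction, accepted responses, rejected responses).

    compose: hypothetical wrapper around UltraComposer, returning the
             augmented instruction and its evaluation question.
    respond: hypothetical generator returning candidate responses.
    judge:   hypothetical evaluator answering the evaluation question.
    """
    # Generate: add a constraint to the instruction and sample responses.
    augmented, question = compose(instruction)
    accepted, rejected = [], []
    # Evaluate: check each response against the evaluation question.
    for response in respond(augmented):
        if judge(augmented, response, question):
            accepted.append(response)
        else:
            rejected.append(response)
    return augmented, accepted, rejected


# Toy stand-ins so the sketch runs without a model.
def toy_compose(instruction: str) -> Tuple[str, str]:
    return instruction + " Answer in one word.", "Is the response a single word?"


def toy_respond(instruction: str) -> List[str]:
    return ["Paris", "The capital is Paris"]


def toy_judge(instruction: str, response: str, question: str) -> bool:
    return len(response.split()) == 1


augmented, accepted, rejected = generate_then_evaluate(
    "What is the capital of France?", toy_compose, toy_respond, toy_judge
)
print(augmented, accepted, rejected)
```

In practice each callable would wrap a model call; the toy versions only exercise the loop.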

Usage

Format your input as follows:

[history]: {your_chat_history}
[initial query]: {your_query}

The output is organized in JSON format:

{"augmented query":.., "question":..}
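
A minimal sketch of building this input and reading the output in Python. The helper names are illustrative, and several details are assumptions rather than documented behavior: that an empty history is written as an empty string, that the formatted prompt is sent as a single user message, and that the generated text may contain extra text around the JSON object. The sample output string is invented only to exercise the parser.

```python
import json


def build_composer_prompt(query: str, history: str = "") -> str:
    """Render the two-line input format described above."""
    return f"[history]: {history}\n[initial query]: {query}"


def parse_composer_output(text: str) -> dict:
    """Extract the first JSON object found in the generated text."""
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found in model output")
    # raw_decode parses one JSON value and ignores any trailing text.
    obj, _ = json.JSONDecoder().raw_decode(text[start:])
    return obj


prompt = build_composer_prompt("Write a poem about autumn.")
# Assumption: the prompt is passed as one user turn to the chat template.
messages = [{"role": "user", "content": prompt}]

# Hypothetical model output, not produced by the real model.
sample_output = (
    '{"augmented query": "Write a poem about autumn in exactly four lines.", '
    '"question": "Does the poem have exactly four lines?"}'
)
result = parse_composer_output(sample_output)
print(result["augmented query"])
print(result["question"])
```

The `messages` list can be passed to the Transformers pipeline or to an OpenAI-compatible server in place of a free-form question.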

For more details, check out our official implementation for UltraComposer.

Reference

📑 If you find our projects helpful to your research, please consider citing:

@article{an2025ultraif,
  title={UltraIF: Advancing Instruction Following from the Wild},
  author={An, Kaikai and Sheng, Li and Cui, Ganqu and Si, Shuzheng and Ding, Ning and Cheng, Yu and Chang, Baobao},
  journal={arXiv preprint arXiv:2502.04153},
  year={2025}
}