Libraries

How to use xingshen/prompt4trust-cgpgenerator-1.5B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="xingshen/prompt4trust-cgpgenerator-1.5B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("xingshen/prompt4trust-cgpgenerator-1.5B")
model = AutoModelForCausalLM.from_pretrained("xingshen/prompt4trust-cgpgenerator-1.5B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use xingshen/prompt4trust-cgpgenerator-1.5B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "xingshen/prompt4trust-cgpgenerator-1.5B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "xingshen/prompt4trust-cgpgenerator-1.5B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
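
The same request can be sent from Python. The following is a minimal sketch using only the standard library, assuming the vLLM server started above is listening on `localhost:8000`. The helper names `build_chat_request` and `chat` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "xingshen/prompt4trust-cgpgenerator-1.5B"


def build_chat_request(prompt, model=MODEL_ID, base_url="http://localhost:8000"):
    """Build the same POST request that the curl command sends."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as response:
        body = json.load(response)
    # OpenAI-compatible servers place the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `chat("What is the capital of France?")` should return the reply as a string. Passing `base_url="http://localhost:30000"` would target a server on a different port.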

Use Docker

docker model run hf.co/xingshen/prompt4trust-cgpgenerator-1.5B

SGLang

How to use xingshen/prompt4trust-cgpgenerator-1.5B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "xingshen/prompt4trust-cgpgenerator-1.5B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "xingshen/prompt4trust-cgpgenerator-1.5B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "xingshen/prompt4trust-cgpgenerator-1.5B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "xingshen/prompt4trust-cgpgenerator-1.5B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Prompt4Trust

This repository contains the official implementation of the paper:

Prompt4Trust: A Reinforcement Learning Prompt Augmentation Framework for Clinically-Aligned Confidence Calibration in Multimodal Large Language Models
Anita Kriz*, Elizabeth Laura Janes*, Xing Shen*, Tal Arbel
*Equal contribution
IEEE/CVF International Conference on Computer Vision 2025 Workshop CVAMD
Paper (arXiv preprint)
Code (GitHub)

Overview

Multimodal large language models (MLLMs) show great potential for healthcare applications, but their clinical deployment is challenged by prompt sensitivity and overconfident incorrect responses. To improve trustworthiness in safety-critical settings, we introduce Prompt4Trust, the first reinforcement learning framework for prompt augmentation focused on confidence calibration in MLLMs. A lightweight LLM is trained to generate context-aware auxiliary prompts that guide a downstream MLLM to produce predictions with confidence scores that better reflect true accuracy. By prioritizing clinically meaningful calibration, Prompt4Trust enhances both reliability and task performance, achieving state-of-the-art results on the PMC-VQA benchmark while enabling efficient zero-shot generalization to larger MLLMs.
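
To make "confidence scores that better reflect true accuracy" concrete, here is a minimal sketch of expected calibration error (ECE), one common way to measure the gap between stated confidence and observed accuracy. It is illustrative only: the function name, the equal-width binning, and the default of 10 bins are assumptions, and the paper's exact evaluation protocol may differ.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: weighted mean of |accuracy - mean confidence| per bin.

    confidences: floats in [0, 1], one per prediction.
    correct: booleans, True where the prediction was right.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one prediction is required")

    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Map the confidence to a bin index; a confidence of 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, bool(ok)))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Weight each bin's gap by the fraction of predictions it holds.
        ece += (len(members) / total) * abs(accuracy - mean_conf)
    return ece
```

For example, a model that reports 0.9 confidence on four answers but gets only one right has an ECE of about 0.65, while a model that reports 0.75 and gets three of four right has an ECE of 0.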

Usage

This model (the Calibration Guidance Prompt Generator) is a fine-tuned version of Qwen2.5-1.5B-Instruct, so we refer users to Qwen's documentation for details on model loading and inference. We also recommend using vLLM for faster inference.

Example of loading the model with vLLM:

from vllm import LLM
cgp_generator = LLM(model="xingshen/prompt4trust-cgpgenerator-1.5B")
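
Building on the loading snippet above, the following is a hedged sketch of how the generator might be used end to end: format a question as chat messages, generate an auxiliary prompt, and append it to the question sent to the downstream MLLM. The instruction wording, the message layout, and the helper names are hypothetical placeholders, not the template used in the paper; see the GitHub repository for the actual prompts.

```python
MODEL_ID = "xingshen/prompt4trust-cgpgenerator-1.5B"


def build_generator_messages(question, options):
    """Format a multiple-choice question as chat messages (hypothetical template)."""
    lines = [f"Question: {question}"]
    lines += [f"{label}. {text}" for label, text in options.items()]
    lines.append(
        "Write a short auxiliary prompt that helps a model answer this "
        "question with a well-calibrated confidence score."
    )
    return [{"role": "user", "content": "\n".join(lines)}]


def augment_question(question, auxiliary_prompt):
    """Append the generated auxiliary prompt to the downstream question (assumed layout)."""
    return f"{question}\n\n{auxiliary_prompt.strip()}"


def generate_auxiliary_prompt(question, options, max_tokens=256):
    """Run the generator with vLLM and return its text output."""
    # Imported lazily so the helpers above work without vLLM installed.
    from vllm import LLM, SamplingParams

    cgp_generator = LLM(model=MODEL_ID)
    # Greedy decoding and the token limit are illustrative choices.
    params = SamplingParams(temperature=0.0, max_tokens=max_tokens)
    outputs = cgp_generator.chat(
        build_generator_messages(question, options), sampling_params=params
    )
    return outputs[0].outputs[0].text
```

In practice the `LLM` object would be created once and reused across questions, since loading the weights dominates the cost of a single call.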

Acknowledgments

This work was supported in part by the Natural Sciences and Engineering Research Council of Canada, in part by the Canadian Institute for Advanced Research (CIFAR) Artificial Intelligence Chairs Program, in part by the Mila - Quebec Artificial Intelligence Institute, in part by the compute resources provided by Mila (mila.quebec), in part by the Mila-Google Research Grant, in part by the Fonds de recherche du Québec, in part by the Canada First Research Excellence Fund, awarded to the Healthy Brains, Healthy Lives initiative at McGill University, and in part by the Department of Electrical and Computer Engineering at McGill University.

Contact

Please raise a GitHub issue here or email us at xing.shen@mail.mcgill.ca (with the subject line starting with "[Prompt4Trust]") if you have any questions or encounter any issues.