LlamaForecaster-8B
LlamaForecaster-8B is a specialized language model for open-ended forecasting of future events. It is post-trained from Llama-3.1-8B-Instruct using reinforcement learning on the OpenForesight dataset.
Performance on OpenForesight Test Set
Training Llama-3.1-8B-Instruct on an increasing number of samples from OpenForesight leads to continued improvements, to the point where it surpasses Qwen3-235B and DeepSeek v3 and almost matches R1. We call the final checkpoint (trained on the whole of OpenForesight) LlamaForecaster.
Model Description
LlamaForecaster-8B is trained to make calibrated predictions on open-ended questions about future events. It provides calibrated confidence estimates when asked, so please request probabilities explicitly in the prompt.
Training
This model was trained on the OpenForesight dataset, which contains over 52,000 forecasting questions generated from global news events. Training used GRPO with a joint reward function that combines accuracy and Brier score. Please check the paper for more details.
Base Model: Llama-3.1-8B-Instruct
Training Dataset: OpenForesight
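To make the reward described above more concrete, here is a minimal sketch of scoring a single answer with an accuracy term and a Brier term. The Brier score itself is standard (the squared error between the stated probability and the 0/1 outcome). The way the two terms are combined, and the names `brier_score` and `joint_reward`, are assumptions made for illustration and are not taken from the paper.

```python
def brier_score(probability: float, correct: bool) -> float:
    """Squared error between the stated probability and the 0/1 outcome (lower is better)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    outcome = 1.0 if correct else 0.0
    return (probability - outcome) ** 2


def joint_reward(probability: float, correct: bool) -> float:
    """Hypothetical combination: accuracy (0 or 1) minus the Brier penalty.

    Assumption for illustration only; see the paper for the reward actually used.
    """
    accuracy = 1.0 if correct else 0.0
    return accuracy - brier_score(probability, correct)


if __name__ == "__main__":
    # Compare a confident correct answer, a hesitant correct answer,
    # and a confident wrong answer.
    for p, ok in [(0.9, True), (0.5, True), (0.9, False)]:
        print(p, ok, round(joint_reward(p, ok), 3))
```

Under this assumed combination, a confident correct answer earns more than a hesitant correct one, and a confident wrong answer is penalized the most, which is the behavior a calibration-aware reward is meant to encourage.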
Usage
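from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "nikhilchandak/LlamaForecaster-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Template: "What is the likelihood that [future event] will occur by [date]?"
# Example (asks for probabilities explicitly):
prompt = "Who will become the next Prime Minister of India based on the general election to be held in 2029? Provide specific predictions with probabilities."

# Format the prompt with the model's chat template before generating.
messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)

# Limit the number of generated tokens, then decode only the newly generated part.
outputs = model.generate(**inputs, max_new_tokens=8192)
prediction = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(prediction)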
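Because the model gives confidence estimates only when prompted for them, it can help to build prompts with a small helper that always appends the explicit request. The sketch below fills the template from the usage snippet above; the function name `build_forecast_prompt` and the exact wording of the appended request are illustrative assumptions, not part of the model's API.

```python
def build_forecast_prompt(event: str, date: str) -> str:
    """Fill the card's question template and explicitly ask for probabilities."""
    question = f"What is the likelihood that {event} will occur by {date}?"
    # Hypothetical suffix: reuses the wording from the card's example prompt.
    return question + " Provide specific predictions with probabilities."


if __name__ == "__main__":
    print(build_forecast_prompt("a new national election is called", "the end of 2029"))
```

The returned string can be passed as the `prompt` in the usage snippet.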
Performance
LlamaForecaster-8B performs competitively with much larger models such as DeepSeek-v3 and Qwen3-235B-A22B on forecasting benchmarks. Key improvements include:
- Improved Accuracy: Better prediction of future events
- Better Calibration: More reliable confidence estimates
- Enhanced Consistency: Reduced logical violations in predictions
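As a rough illustration of what "better calibration" means in practice, the sketch below computes a binned expected calibration error over (stated probability, was correct) pairs. This is a generic, commonly used metric and is shown only as an assumption about how one might check calibration on resolved questions; it is not necessarily the metric reported in the paper, and the function name is hypothetical.

```python
from typing import List, Tuple


def expected_calibration_error(
    predictions: List[Tuple[float, bool]], num_bins: int = 10
) -> float:
    """Size-weighted average of |mean confidence - accuracy| over equal-width bins."""
    if not predictions:
        raise ValueError("need at least one prediction")
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(num_bins)]
    for prob, correct in predictions:
        # Clamp the index so that prob == 1.0 falls into the last bin.
        index = min(int(prob * num_bins), num_bins - 1)
        bins[index].append((prob, correct))
    total = len(predictions)
    error = 0.0
    for bucket in bins:
        if not bucket:
            continue
        mean_conf = sum(p for p, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, c in bucket if c) / len(bucket)
        error += (len(bucket) / total) * abs(mean_conf - accuracy)
    return error
```

A value near 0 means the stated probabilities match how often the predictions turn out to be right; larger values indicate over- or under-confidence.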
Citation
If you use this model in any way, please cite the corresponding paper:
@article{chandak2025scaling,
title={Scaling Open-Ended Reasoning to Predict the Future},
author={Chandak, Nikhil and Goel, Shashwat and Prabhu, Ameya and Hardt, Moritz and Geiping, Jonas},
journal={arXiv preprint arXiv:2512.25070},
year={2025}
}
License
This model is released under the MIT License.
Contact
For questions or issues, please visit our website or open an issue on the model repository.

