Instructions for using prithivMLmods/Marco-o1-GGUF with libraries, inference providers, notebooks, and local apps. The sections below show how to get started with each.
- Libraries
- Transformers
How to use prithivMLmods/Marco-o1-GGUF with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="prithivMLmods/Marco-o1-GGUF")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("prithivMLmods/Marco-o1-GGUF", dtype="auto")
```

Note: since this repository contains GGUF files, Transformers may require pointing at a specific quantized file, e.g. passing `gguf_file="marco-o1-q4_k_m.gguf"` to `from_pretrained`.

- llama-cpp-python
How to use prithivMLmods/Marco-o1-GGUF with llama-cpp-python:
```python
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="prithivMLmods/Marco-o1-GGUF",
    filename="marco-o1-f16.gguf",
)

llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
```

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use prithivMLmods/Marco-o1-GGUF with llama.cpp:
Install from brew
```sh
brew install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf prithivMLmods/Marco-o1-GGUF:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf prithivMLmods/Marco-o1-GGUF:Q4_K_M
```
Install from WinGet (Windows)
```sh
winget install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf prithivMLmods/Marco-o1-GGUF:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf prithivMLmods/Marco-o1-GGUF:Q4_K_M
```
Use pre-built binary
```sh
# Download a pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases

# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf prithivMLmods/Marco-o1-GGUF:Q4_K_M

# Run inference directly in the terminal:
./llama-cli -hf prithivMLmods/Marco-o1-GGUF:Q4_K_M
```
Build from source code
```sh
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli

# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf prithivMLmods/Marco-o1-GGUF:Q4_K_M

# Run inference directly in the terminal:
./build/bin/llama-cli -hf prithivMLmods/Marco-o1-GGUF:Q4_K_M
```
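Whichever install method you use, once llama-server is running you can call its OpenAI-compatible endpoint directly. A minimal sketch; the port 8080 here is llama-server's usual default and is an assumption about your local setup, not part of the commands above:

```sh
# Query the running llama-server (assumed default port 8080):
curl -X POST "http://localhost:8080/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "messages": [
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'
```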
Use Docker
```sh
docker model run hf.co/prithivMLmods/Marco-o1-GGUF:Q4_K_M
```
- LM Studio
- Jan
- vLLM
How to use prithivMLmods/Marco-o1-GGUF with vLLM:
Install from pip and serve model
```sh
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "prithivMLmods/Marco-o1-GGUF"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "prithivMLmods/Marco-o1-GGUF",
    "messages": [
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'
```

Use Docker
```sh
docker model run hf.co/prithivMLmods/Marco-o1-GGUF:Q4_K_M
```
- SGLang
How to use prithivMLmods/Marco-o1-GGUF with SGLang:
Install from pip and serve model
```sh
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "prithivMLmods/Marco-o1-GGUF" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "prithivMLmods/Marco-o1-GGUF",
    "messages": [
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'
```

Use Docker images
```sh
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "prithivMLmods/Marco-o1-GGUF" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "prithivMLmods/Marco-o1-GGUF",
    "messages": [
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'
```

- Ollama
How to use prithivMLmods/Marco-o1-GGUF with Ollama:
```sh
ollama run hf.co/prithivMLmods/Marco-o1-GGUF:Q4_K_M
```
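Once the model has been pulled, Ollama also serves it over a local HTTP API. A minimal sketch, assuming Ollama's usual default port 11434 and its OpenAI-compatible endpoint (both are assumptions about your local setup):

```sh
# Query the local Ollama server (assumed default port 11434):
curl -X POST "http://localhost:11434/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "hf.co/prithivMLmods/Marco-o1-GGUF:Q4_K_M",
    "messages": [
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'
```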
- Unsloth Studio
How to use prithivMLmods/Marco-o1-GGUF with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
```sh
curl -fsSL https://unsloth.ai/install.sh | sh

# Run Unsloth Studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for prithivMLmods/Marco-o1-GGUF to start chatting
```
Install Unsloth Studio (Windows)
```powershell
irm https://unsloth.ai/install.ps1 | iex

# Run Unsloth Studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for prithivMLmods/Marco-o1-GGUF to start chatting
```
Using HuggingFace Spaces for Unsloth
```sh
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for prithivMLmods/Marco-o1-GGUF to start chatting
```
- Docker Model Runner
How to use prithivMLmods/Marco-o1-GGUF with Docker Model Runner:
```sh
docker model run hf.co/prithivMLmods/Marco-o1-GGUF:Q4_K_M
```
- Lemonade
How to use prithivMLmods/Marco-o1-GGUF with Lemonade:
Pull the model
```sh
# Download Lemonade from https://lemonade-server.ai/
lemonade pull prithivMLmods/Marco-o1-GGUF:Q4_K_M
```
Run and chat with the model
```sh
lemonade run user.Marco-o1-GGUF-Q4_K_M
```
List all available models
```sh
lemonade list
```
MarcoPolo / Marco-o1-GGUF Modelfile
The Marco-o1 model is designed to excel not only in structured disciplines like mathematics, physics, and coding, which traditionally benefit from reinforcement learning (RL), but also in addressing open-ended problems that require creativity, reasoning, and nuanced understanding. This unique capability positions it as a versatile tool for tasks demanding both precision and innovation.
| File Name | Size | Description | Upload Status |
|---|---|---|---|
| .gitattributes | 2.25 kB | Git attributes configuration file | Uploaded |
| README.md | 5.23 kB | Updated README | Uploaded |
| config.json | 29 Bytes | Configuration file | Uploaded |
| marco-o1-f16.gguf | 15.2 GB | MARCO-O1 model (F16 precision) | Uploaded (LFS) |
| marco-o1-q2_k.gguf | 3.02 GB | MARCO-O1 model (Q2_K quantization) | Uploaded (LFS) |
| marco-o1-q3_k_l.gguf | 4.09 GB | MARCO-O1 model (Q3_K_L quantization) | Uploaded (LFS) |
| marco-o1-q3_k_m.gguf | 3.81 GB | MARCO-O1 model (Q3_K_M quantization) | Uploaded (LFS) |
| marco-o1-q3_k_s.gguf | 3.49 GB | MARCO-O1 model (Q3_K_S quantization) | Uploaded (LFS) |
| marco-o1-q4_0.gguf | 4.43 GB | MARCO-O1 model (Q4_0 quantization) | Uploaded (LFS) |
| marco-o1-q4_k_m.gguf | 4.68 GB | MARCO-O1 model (Q4_K_M quantization) | Uploaded (LFS) |
| marco-o1-q4_k_s.gguf | 4.46 GB | MARCO-O1 model (Q4_K_S quantization) | Uploaded (LFS) |
| marco-o1-q5_0.gguf | 5.32 GB | MARCO-O1 model (Q5_0 quantization) | Uploaded (LFS) |
| marco-o1-q5_k_m.gguf | 5.44 GB | MARCO-O1 model (Q5_K_M quantization) | Uploaded (LFS) |
| marco-o1-q5_k_s.gguf | 5.32 GB | MARCO-O1 model (Q5_K_S quantization) | Uploaded (LFS) |
| marco-o1-q6_k.gguf | 6.25 GB | MARCO-O1 model (Q6_K quantization) | Uploaded (LFS) |
| marco-o1-q8_0.gguf | 8.1 GB | MARCO-O1 model (Q8_0 quantization) | Uploaded (LFS) |
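If you only need one quantization, you can fetch a single file instead of cloning the whole repository. A minimal sketch using the Hugging Face CLI; the Q4_K_M file is picked purely as an example from the table above:

```sh
# Install the Hugging Face CLI (assumption: not already installed):
pip install -U "huggingface_hub[cli]"

# Download just the Q4_K_M quantization (~4.68 GB) into the current directory:
huggingface-cli download prithivMLmods/Marco-o1-GGUF marco-o1-q4_k_m.gguf --local-dir .
```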
Key Features:
- Structured Problem Solving: Strong performance in tasks with definitive answers, such as calculations, algorithm design, and logical problem-solving.
- Open-Ended Resolutions: Exceptional capability to generate thoughtful, nuanced responses for abstract or subjective queries, making it ideal for discussions, ideation, and explorative problem-solving.
- Reinforcement Learning Optimization: Uses RL to improve accuracy in structured tasks, while sophisticated datasets enhance performance in creative or subjective domains.
Intended Applications:
- Academic Assistance: Solve complex mathematical or scientific problems and explain concepts with clarity.
- Creative Ideation: Generate innovative ideas, solutions, or approaches to open-ended challenges.
- Coding and Debugging: Provide reliable coding solutions, optimizations, and error debugging in various programming languages.
- Discussion and Debate: Engage in meaningful conversations on subjective or philosophical topics, offering well-reasoned perspectives.
The Marco-o1 model seamlessly blends analytical rigor with creative adaptability, making it an exceptional choice for a wide range of applications in education, research, and beyond.
Run Ollama [ Marco-o1 ]
Ollama is a powerful tool that simplifies running machine learning models, allowing you to manage GGUF models effortlessly. This guide outlines the steps to download, install, and run your models quickly. To get started, download Ollama from https://ollama.com/download and install it on your Windows or Mac system. Once installed, creating a GGUF model involves a few straightforward steps.
First, create a Modelfile and name it appropriately, for example metallama. Inside this file, include a FROM line to specify the base model you want to use, for instance FROM Llama-3.2-1B.F16.gguf. Ensure the specified GGUF file is in the same directory as the Modelfile. Next, open your terminal and run the command ollama create metallama -f ./metallama to build the model from the Modelfile, as sketched below. After the process completes, you can confirm the model was created successfully by running ollama list and checking that metallama appears in the list.
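A minimal sketch of those steps, using the example names from the paragraph above (substitute your own model name and GGUF file):

```sh
# Write a Modelfile pointing at a local GGUF file (example from above):
cat > metallama <<'EOF'
FROM Llama-3.2-1B.F16.gguf
EOF

# Build the Ollama model from the Modelfile:
ollama create metallama -f ./metallama

# Confirm the new model appears in the list:
ollama list
```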
To run your newly created model, use the command ollama run metallama in your terminal. You can then interact with the model directly. For example, asking the model to "write a mini passage about Space X" might generate a response highlighting Space X's revolutionary role in aerospace, its reusable rockets, and its vision for establishing colonies on Mars.
Sample Usage
In the command prompt, you can execute:
```
D:\>ollama run metallama
```
You can interact with the model like this:
```
>>> write a mini passage about space x
Space X, the private aerospace company founded by Elon Musk, is revolutionizing the field of space exploration.
With its ambitious goals to make humanity a multi-planetary species and establish a sustainable human presence in
the cosmos, Space X has become a leading player in the industry. The company's spacecraft, like the Falcon 9, have
demonstrated remarkable capabilities, allowing for the transport of crews and cargo into space with unprecedented
efficiency. As technology continues to advance, the possibility of establishing permanent colonies on Mars becomes
increasingly feasible, thanks in part to the success of reusable rockets that can launch multiple times without
sustaining significant damage. The journey towards becoming a multi-planetary species is underway, and Space X
plays a pivotal role in pushing the boundaries of human exploration and settlement.
```
With these easy steps, Ollama enables you to download, install, and operate custom or pre-trained models seamlessly. Whether you're exploring Llama’s capabilities or working on custom GGUF models, Ollama offers an efficient and user-friendly solution to achieve your machine learning objectives.
Model tree for prithivMLmods/Marco-o1-GGUF
- Base model: AIDC-AI/Marco-o1