Instructions to use huihui-ai/Huihui3.5-67B-A3B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use huihui-ai/Huihui3.5-67B-A3B with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("image-text-to-text", model="huihui-ai/Huihui3.5-67B-A3B")
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
                {"type": "text", "text": "What animal is on the candy?"}
            ]
        },
    ]
    pipe(text=messages)

    # Load model directly
    from transformers import AutoProcessor, AutoModelForImageTextToText

    processor = AutoProcessor.from_pretrained("huihui-ai/Huihui3.5-67B-A3B")
    model = AutoModelForImageTextToText.from_pretrained("huihui-ai/Huihui3.5-67B-A3B")
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
                {"type": "text", "text": "What animal is on the candy?"}
            ]
        },
    ]
    inputs = processor.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    outputs = model.generate(**inputs, max_new_tokens=40)
    print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use huihui-ai/Huihui3.5-67B-A3B with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "huihui-ai/Huihui3.5-67B-A3B"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "huihui-ai/Huihui3.5-67B-A3B",
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "text",
                            "text": "Describe this image in one sentence."
                        },
                        {
                            "type": "image_url",
                            "image_url": {
                                "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                            }
                        }
                    ]
                }
            ]
        }'
Use Docker
docker model run hf.co/huihui-ai/Huihui3.5-67B-A3B
- SGLang
How to use huihui-ai/Huihui3.5-67B-A3B with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "huihui-ai/Huihui3.5-67B-A3B" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "huihui-ai/Huihui3.5-67B-A3B",
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "text",
                            "text": "Describe this image in one sentence."
                        },
                        {
                            "type": "image_url",
                            "image_url": {
                                "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                            }
                        }
                    ]
                }
            ]
        }'
Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "huihui-ai/Huihui3.5-67B-A3B" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "huihui-ai/Huihui3.5-67B-A3B",
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "text",
                            "text": "Describe this image in one sentence."
                        },
                        {
                            "type": "image_url",
                            "image_url": {
                                "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                            }
                        }
                    ]
                }
            ]
        }'
- Docker Model Runner
How to use huihui-ai/Huihui3.5-67B-A3B with Docker Model Runner:
docker model run hf.co/huihui-ai/Huihui3.5-67B-A3B
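The curl calls shown for vLLM and SGLang can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `send_chat_request` are hypothetical, and the default port assumes the vLLM command above (SGLang listens on 30000 in the commands above).

```python
import json
import urllib.request

MODEL_ID = "huihui-ai/Huihui3.5-67B-A3B"


def build_chat_request(prompt, image_url=None, model=MODEL_ID):
    """Build the JSON body for an OpenAI-compatible chat completion call."""
    content = [{"type": "text", "text": prompt}]
    if image_url is not None:
        content.append({"type": "image_url", "image_url": {"url": image_url}})
    return {"model": model, "messages": [{"role": "user", "content": content}]}


def send_chat_request(body, base_url="http://localhost:8000"):
    """POST the body to a running server and return the decoded JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the body needs no server; sending it does.
body = build_chat_request(
    "Describe this image in one sentence.",
    image_url="https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg",
)
```

With a server running, `send_chat_request(body)` targets vLLM and `send_chat_request(body, base_url="http://localhost:30000")` targets SGLang. In the OpenAI-compatible response format, the generated text is normally found at `reply["choices"][0]["message"]["content"]`.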
huihui-ai/Huihui3.5-67B-A3B
Model Overview
huihui-ai/Huihui3.5-67B-A3B is a Mixture of Experts (MoE) language model developed by huihui.ai, built upon the Qwen/Qwen3.5-35B-A3B base model. It enhances the standard Transformer architecture by replacing MLP layers with MoE layers, each containing 512 experts, to achieve high performance with efficient inference. The model is designed for natural language processing tasks, including image-text-to-text generation, question answering, and conversational applications.
Note: huihui-ai/Huihui3.5-67B-A3B is not an ablated model.
This is only a test release. It explores another possibility: merging different variants of the same model type.
- Architecture: Qwen3_5MoeForConditionalGeneration model with 512 experts per layer, activating 8 experts per token.
- Total Parameters: ~67 billion (67B)
- Activated Parameters: ~3 billion (3B) during inference, comparable to Qwen/Qwen3.5-35B-A3B
- Developer: huihui.ai
- Release Date: January 2026
- License: Inherits the license of the Qwen3.5 base model (apache-2.0)
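To illustrate what "512 experts per layer, activating 8 experts per token" means, here is a toy sketch of top-k routing for a single token in plain Python. It is not the model's actual router: the gate logits are random stand-ins, and normalizing with a softmax over only the selected experts is an assumption about the scheme.

```python
import math
import random

NUM_EXPERTS = 512  # experts per MoE layer, from the model card
TOP_K = 8          # experts activated per token, from the model card


def route_token(gate_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts."""
    ranked = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)
    top = ranked[:top_k]
    # Softmax over the selected logits only, so the weights sum to 1 (assumed scheme).
    max_logit = max(gate_logits[i] for i in top)
    exps = [math.exp(gate_logits[i] - max_logit) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Random logits stand in for the router output of one token.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
selected = route_token(logits)

# Fraction of experts that run for each token: 8 / 512.
active_fraction = TOP_K / NUM_EXPERTS
```

Only the selected experts' MLPs run for a given token, which is why the activated parameter count (~3B) stays far below the total (~67B).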
Expert Models:
Expert 1-256:
Expert 257-512:
Instruction Following:
Training
- Base Model: Qwen/Qwen3.5-35B-A3B
- Conversion: The model copies embeddings, self-attention, and normalization weights from Qwen/Qwen3.5-35B-A3B, replacing MLP layers with MoE layers (512 experts). Gating weights are randomly initialized.
- Fine-Tuning: Not fine-tuned; users are recommended to fine-tune for specific tasks to optimize expert routing.
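The conversion step can be pictured with a toy sketch. It assumes, as the "Expert 1-256" and "Expert 257-512" entries suggest but do not spell out, that each layer's experts are taken from two source models and concatenated. The dict layout, the key names, and the `merge_experts` helper are hypothetical and do not reflect the real checkpoint format.

```python
import random


def merge_experts(base_layer, donor_layer, hidden_size, seed=0):
    """Combine two layers' expert lists and create a fresh random router.

    Each layer is a plain dict with hypothetical keys "attention", "norm",
    and "experts" (a list holding one entry per expert).
    """
    rng = random.Random(seed)
    experts = list(base_layer["experts"]) + list(donor_layer["experts"])
    # One router row per expert; small Gaussian values stand in for random init.
    gate = [[rng.gauss(0.0, 0.02) for _ in range(hidden_size)] for _ in experts]
    return {
        "attention": base_layer["attention"],  # copied from the base model
        "norm": base_layer["norm"],            # copied from the base model
        "experts": experts,
        "gate": gate,
    }


# Toy layers: strings stand in for real weight tensors.
base = {"attention": "attn-weights", "norm": "norm-weights",
        "experts": [f"base-expert-{i}" for i in range(256)]}
donor = {"attention": "unused", "norm": "unused",
         "experts": [f"donor-expert-{i}" for i in range(256)]}
merged = merge_experts(base, donor, hidden_size=4)
```

Because the router rows are random, the merged layer has no learned preference among its 512 experts, which is the reason fine-tuning is recommended.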
Ollama
You can use huihui_ai/huihui3.5:67b directly:
ollama run huihui_ai/huihui3.5:67b
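Once the model has been pulled, it can also be queried through Ollama's local REST API. The sketch below assumes a default Ollama install listening on port 11434; the helper names `build_ollama_chat` and `ollama_chat` are hypothetical.

```python
import json
import urllib.request

OLLAMA_MODEL = "huihui_ai/huihui3.5:67b"


def build_ollama_chat(prompt, model=OLLAMA_MODEL):
    """Build the JSON body for Ollama's /api/chat endpoint (non-streaming)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one JSON object instead of a stream of chunks
    }


def ollama_chat(prompt, host="http://localhost:11434"):
    """Send the prompt to a running Ollama server and return the reply text."""
    request = urllib.request.Request(
        f"{host}/api/chat",
        data=json.dumps(build_ollama_chat(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        reply = json.loads(response.read().decode("utf-8"))
    return reply["message"]["content"]


# Building the payload needs no server; ollama_chat() does.
payload = build_ollama_chat("Summarize what a Mixture of Experts model is.")
```

Calling `ollama_chat("...")` requires the Ollama service to be running with the model available locally.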
Applications
- Image-Text-to-Text Generation: Articles, dialogues, and creative writing.
- Question Answering: Information retrieval and query resolution.
- Conversational AI: Multi-turn dialogues for chatbots.
- Research: Exploration of MoE architectures and efficient model scaling.
Limitations
- Fine-Tuning Required: Randomly initialized gating weights may lead to suboptimal expert utilization without fine-tuning.
- Compatibility: Developed with transformers 5.5.0; ensure matching versions to avoid loading issues.
- Inference Speed: While efficient for an MoE model, performance depends on hardware (GPU recommended).
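The compatibility note can be checked before loading the model. This is a small sketch that compares the installed transformers release with the 5.5.0 named above; the helper names are hypothetical, and requiring an exact release match is a deliberately strict reading of "matching versions".

```python
from importlib import metadata

EXPECTED_TRANSFORMERS = "5.5.0"  # version named in the model card


def release_tuple(version):
    """Turn '5.5.0' or '5.5.0.dev0' into a tuple of its leading integer parts."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check_transformers(installed=None, expected=EXPECTED_TRANSFORMERS):
    """Return True when the installed release equals the expected release."""
    if installed is None:
        try:
            installed = metadata.version("transformers")
        except metadata.PackageNotFoundError:
            return False  # transformers is not installed at all
    return release_tuple(installed) == release_tuple(expected)
```

Calling `check_transformers()` with no arguments inspects the current environment; passing a version string, as in `check_transformers("5.5.0")`, checks that string instead.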
Ethical Considerations
- Bias: Inherits potential biases from the Qwen/Qwen3.5-35B-A3B base model; users should evaluate outputs for fairness.
- Usage: Intended for research and responsible applications; avoid generating harmful or misleading content.
Contact
- Developer: huihui.ai
- Repository: huihui-ai/Huihui3.5-67B-A3B (available locally or on Hugging Face)
- Issues: Report bugs or request features via the repository, or by email to support@huihui.ai
Model tree for huihui-ai/Huihui3.5-67B-A3B
Base model
Qwen/Qwen3.5-35B-A3B-Base