---
title: ComputeAgent - Hivenet AI Deployment
emoji: 🚀
colorFrom: yellow
colorTo: red
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
short_description: AI-powered deployment on Compute with Hivenet via MCP
tags:
- mcp-in-action-track-enterprise
- mcp-in-action-track-consumer
- mcp-in-action-track-creative
---

# 🚀 ComputeAgent - Autonomous AI Deployment via MCP

**An Intelligent Multi-Agent System for Zero-Friction Model Deployment on HiveCompute**

🔗 **Hackathon Entry:** [Agents & MCP Hackathon – Winter 2025 (Track 2: Agentic Applications)](https://huggingface.co/Agents-MCP-Hackathon-Winter25#-track-2-agentic-applications)

---

## 🎯 Overview

ComputeAgent transforms the complex process of deploying large-scale AI models into a single natural-language command. Built for the **MCP 1st Birthday Hackathon**, this autonomous system leverages the Model Context Protocol (MCP) to deploy any Hugging Face model onto HiveCompute infrastructure with zero manual configuration.

**What once required hours of DevOps work now takes seconds.** Simply say *"Deploy meta-llama/Llama-3.1-70B"* and ComputeAgent handles everything: capacity estimation, infrastructure provisioning, vLLM configuration, and deployment execution.

---

## 🔮 Live Demo

Try the chatbot: **[https://huggingface.co/spaces/MCP-1st-Birthday/Hivenet_ComputeAgent](https://huggingface.co/spaces/MCP-1st-Birthday/Hivenet_ComputeAgent)**

---

## 📹 Preview

---

## 📹 LinkedIn Post

**[https://www.linkedin.com/feed/update/urn:li:activity:7400886621627285505/](https://www.linkedin.com/feed/update/urn:li:activity:7400886621627285505/)**

---

## 💡 The Problem

Deploying AI models at scale remains frustratingly manual and error-prone:

- ❌ **Manual capacity planning** - Calculating GPU memory requirements for each model
- ❌ **Complex infrastructure setup** - SSH keys, networking, environment dependencies
- ❌ **Inference server configuration** - vLLM, TensorRT-LLM parameter tuning
- ❌ **Trial-and-error debugging** - Hours spent troubleshooting deployment issues
- ❌ **High barrier to entry** - Requires DevOps expertise that many researchers lack

This friction slows innovation and makes large-model deployment inaccessible to many teams.

---

## ✨ Our Solution

ComputeAgent introduces **autonomous compute orchestration** through a multi-agent MCP architecture that thinks, plans, and acts on your behalf:

### The Workflow

1. **🤖 Natural Language Interface** - Chat with the agent to deploy models
2. **🧠 Intelligent Analysis** - Automatically estimates GPU requirements from model architecture
3. **⚡ Automated Provisioning** - Spins up HiveCompute instances via MCP
4. **🔧 Smart Configuration** - Generates optimized vLLM commands (sketched below)
5. **✅ Human-in-the-Loop** - Review and approve each step with modification capabilities
6. **🎯 One-Click Deployment** - From request to running endpoint in minutes

**Powered entirely by open-source models** (GPT-OSS-20B orchestrator) running on HiveCompute infrastructure.
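To make step 4 concrete: a capacity plan ultimately becomes a `vllm serve` invocation. Here is a minimal sketch of one way to compose it; the `build_vllm_command` helper and its defaults are illustrative assumptions rather than ComputeAgent's actual code, while the flags themselves are standard vLLM CLI options.

```python
# Hypothetical helper -- not ComputeAgent's actual implementation.
# The flags below are standard vLLM CLI options.

def build_vllm_command(
    model_id: str,
    tensor_parallel: int = 1,   # GPUs to shard the model across
    max_model_len: int = 8192,  # context length to reserve KV cache for
    port: int = 8000,
) -> str:
    """Compose a `vllm serve` command from a capacity plan."""
    return (
        f"vllm serve {model_id}"
        f" --tensor-parallel-size {tensor_parallel}"
        f" --max-model-len {max_model_len}"
        f" --host 0.0.0.0 --port {port}"
    )

# A single RTX 4090 suffices for an 8B model in fp16
print(build_vllm_command("meta-llama/Llama-3.1-8B"))
```

For larger models the agent raises `tensor_parallel` so the weights shard across multiple GPUs, which is why GPU count and type are decided before the command is generated.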
---

## 🎮 Key Features

### 🤖 Conversational Deployment

Deploy any Hugging Face model through natural language:

```
"Deploy meta-llama/Llama-3.1-70B on RTX 5090 in France"
"I need Mistral-7B with low latency"
"Deploy GPT-OSS-20B for production"
```

### 🔧 Tool Approval System

Complete control with human-in-the-loop oversight:

- **✅ Approve All** - Execute all proposed tools
- **❌ Reject All** - Skip tool execution and get alternative responses
- **🔧 Selective Approval** - Choose specific tools (e.g., "1,3,5"); see the sketch after this list
- **📝 Modify Arguments** - Edit parameters before execution
- **🔄 Re-Reasoning** - Provide feedback for agent reconsideration
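As a rough sketch of how selective approval might map a reply like `"1,3,5"` onto the proposed tool calls: the function name and tool-call shape below are hypothetical, for illustration only; the real flow runs through the Gradio approval panel.

```python
# Hypothetical sketch of selective tool approval -- the function name and
# tool-call shape are illustrative, not ComputeAgent's actual code.

def select_approved(tool_calls: list[dict], reply: str) -> list[dict]:
    """Return the subset of proposed tool calls the user approved.

    `reply` may be "all", "none", or 1-based indices like "1,3,5".
    """
    reply = reply.strip().lower()
    if reply == "all":
        return tool_calls
    if reply == "none":
        return []
    indices = {int(tok) for tok in reply.split(",") if tok.strip().isdigit()}
    return [call for i, call in enumerate(tool_calls, start=1) if i in indices]

proposed = [
    {"name": "estimate_capacity"},
    {"name": "provision_instance"},
    {"name": "deploy_vllm"},
]
print([c["name"] for c in select_approved(proposed, "1,3")])
# -> ['estimate_capacity', 'deploy_vllm']
```

Indices are 1-based so they match the numbered list of proposed tools shown to the user in the chat.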
### 📊 Automatic Capacity Estimation

Intelligent resource planning:

- Calculates GPU memory from model architecture
- Recommends optimal GPU types and quantities
- Considers tensor parallelism and quantization
- Accounts for KV cache and activation memory

A sketch of this arithmetic appears after the Quick Start walk-through below.

### 🌍 Multi-Location Support

Deploy across global regions:

- 🇫🇷 **France**
- 🇦🇪 **UAE**
- 🇺🇸 **Texas**

### 🎯 GPU Selection

Support for the latest hardware:

- NVIDIA RTX 4090 (24GB VRAM)
- NVIDIA RTX 5090 (32GB VRAM)
- Multi-GPU configurations
- Automatic tensor parallelism setup

### 🔧 Custom Capacity Configuration

Override the automatic estimates with your own GPU count and type.

### 📝 Tool Modification

Edit tool arguments before execution:

```json
{
  "name": "meta-llama-llama-3-1-8b",
  "location": "uae",
  "config": "1x RTX4090"
}
```

### 💬 Interactive Gradio UI

Beautiful, responsive interface:

- Real-time chat interaction
- Tool approval panels
- Capacity configuration editor
- Session management

### ⚡ Real-time Processing

Fast and responsive:

- Async API with FastAPI

---

## 🚀 Quick Start

### Deploy Your First Model

#### 1. **Simple Deployment**

```
Deploy meta-llama/Llama-3.1-8B
```

The agent will:

- Analyze the model (8B parameters, ~16GB VRAM needed)
- Recommend 1x RTX 4090
- Generate the vLLM configuration
- Provision infrastructure
- Provide deployment commands
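The first two steps above come down to simple memory arithmetic. Below is a minimal sketch, assuming fp16 weights, a fixed KV-cache budget, and a flat overhead factor; the constants and helper names are illustrative assumptions, not ComputeAgent's actual estimator.

```python
import math

# Illustrative capacity estimate -- constants and helper names are
# assumptions, not ComputeAgent's actual estimator.

def estimate_vram_gb(
    params_billions: float,
    bytes_per_param: float = 2.0,  # fp16/bf16 weights
    kv_cache_gb: float = 4.0,      # budget reserved for KV cache
    overhead: float = 1.2,         # activations, CUDA context, fragmentation
) -> float:
    """Rough total VRAM needed to serve a model."""
    weights_gb = params_billions * bytes_per_param
    return (weights_gb + kv_cache_gb) * overhead

def gpus_needed(total_gb: float, gpu_vram_gb: float = 24.0) -> int:
    """How many GPUs of a given size cover the estimate (RTX 4090 = 24 GB)."""
    return max(1, math.ceil(total_gb / gpu_vram_gb))

# Llama-3.1-8B: ~16 GB of fp16 weights, fits on a single RTX 4090
total = estimate_vram_gb(8.0)
print(f"~{total:.0f} GB total -> {gpus_needed(total)}x RTX 4090")
# -> ~24 GB total -> 1x RTX 4090
```

When the estimate exceeds a single card, the GPU count feeds back into the `--tensor-parallel-size` setting from the command sketch earlier.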
---

## 🎓 Learning Resources

### Understanding MCP

- [Model Context Protocol Specification](https://modelcontextprotocol.io/)
- [MCP Documentation](https://github.com/modelcontextprotocol)

### LangGraph & Agents

- [LangGraph Documentation](https://langchain-ai.github.io/langgraph/)
- [Building Agentic Systems](https://python.langchain.com/docs/modules/agents/)

### vLLM Deployment

- [vLLM Documentation](https://docs.vllm.ai/)
- [Optimizing Inference](https://docs.vllm.ai/en/latest/serving/performance.html)

---

## 👥 Team

**Team Name:** Hivenet AI Team

**Team Members:**

- **Igor Carrara** - [@carraraig](https://huggingface.co/carraraig) - AI Scientist
- **Mamoutou Diarra** - [@mdiarra](https://huggingface.co/mdiarra) - AI Scientist

---

## 🎉 Hackathon Context

Created for the **MCP 1st Birthday Hackathon**, celebrating the first anniversary of the Model Context Protocol with innovative AI applications that demonstrate the power of standardized tool use and agent orchestration.

### Why This Matters

ComputeAgent showcases:

- ✅ **MCP's power** for building production-grade agents
- ✅ **Human-in-the-loop** design for responsible AI
- ✅ **Real-world utility** solving actual deployment pain points
- ✅ **Open-source first** approach with accessible technology

---

## 🤝 Contributing

We welcome contributions! Here's how you can help:

### Areas for Contribution

- 🐛 **Bug fixes** - Report and fix issues
- ✨ **New features** - Add support for more models or GPUs
- 📚 **Documentation** - Improve guides and examples
- 🧪 **Testing** - Add test coverage
- 🎨 **UI/UX** - Enhance the interface

---

## 📄 License

Apache 2.0

---

## 🌍 About Hivenet & HiveCompute

### What is Hivenet?

Hivenet provides secure, sustainable cloud storage and computing through a distributed network, using unused computing power from devices worldwide rather than relying on massive data centers. This approach makes cloud computing more efficient, affordable, and environmentally friendly.

### HiveCompute: Distributed GPU Cloud

**Compute with Hivenet** is a GPU cloud computing platform that democratizes access to high-performance computing resources.

#### 🎯 Key Features

**🚀 High-Performance GPUs**
- Instant access to dedicated GPU nodes powered by RTX 4090 and RTX 5090
- Performance that matches or exceeds traditional data center GPUs
- Perfect for AI inference, training, rendering, and scientific computing

**💰 Transparent & Affordable Pricing**
- Per-second billing with up to 58% savings compared to GCP, AWS, and Azure
- No hidden egress fees or long-term commitments
- Pay only for what you use with prepaid credits

**🌍 Global Infrastructure**
- GPU clusters run locally in the UAE, France, and the USA for lower latency and tighter compliance
- Built-in GDPR compliance
- Data stays local for faster AI model responses

**♻️ Sustainable Computing**
- Uses unused computing power from devices worldwide instead of power-hungry data centers
- Reduces carbon footprint by up to 77% compared to traditional cloud services
- Community-driven distributed infrastructure
- Utilizes existing, underutilized hardware, reducing the need for new data center construction

**⚡ Instant Setup**
- Launch GPU instances in seconds
- Pre-configured templates for popular frameworks
- Jupyter notebooks and SSH access included
- Pause/resume instances without losing your setup

**🔒 Enterprise-Grade Reliability**
- Workloads automatically replicate across trusted nodes, keeping downtime near zero
- Hive-Certified providers with a 99.9% uptime SLA
- Tier-3 data center equivalent quality

Learn more at [compute.hivenet.com](https://compute.hivenet.com/)

---

## 💬 Support

### Need Help?

- 📖 **Documentation** - Check this README and inline code comments
- 📧 **Email** - Contact the Hivenet team

---

**Built with ❤️ by the Hivenet Team**

*Making large-scale AI deployment accessible to everyone.*