---
title: ComputeAgent - Hivenet AI Deployment
emoji: 🚀
colorFrom: yellow
colorTo: red
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
short_description: AI-Powered Deployment using MCP of Compute by Hivenet
tags:
  - mcp-in-action-track-enterprise
  - mcp-in-action-track-consumer
  - mcp-in-action-track-creative
---

# 🚀 ComputeAgent - Autonomous AI Deployment via MCP

**An Intelligent Multi-Agent System for Zero-Friction Model Deployment on HiveCompute**

🏆 **Hackathon Entry:** [Agents & MCP Hackathon - Winter 2025 (Track 2: Agentic Applications)](https://huggingface.co/Agents-MCP-Hackathon-Winter25#-track-2-agentic-applications)

---

## 🎯 Overview

ComputeAgent transforms the complex process of deploying large-scale AI models into a single natural-language command. Built for the **MCP 1st Birthday Hackathon**, this autonomous system leverages the Model Context Protocol (MCP) to deploy any Hugging Face model onto HiveCompute infrastructure with zero manual configuration.

**What once required hours of DevOps work now takes seconds.** Simply say *"Deploy meta-llama/Llama-3.1-70B"* and ComputeAgent handles everything: capacity estimation, infrastructure provisioning, vLLM configuration, and deployment execution.
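The capacity-estimation step above is essentially a back-of-the-envelope VRAM calculation. The sketch below is illustrative only: the helper name, the 20% overhead factor, and the VRAM table are assumptions for this example, not ComputeAgent's actual logic.

```python
import math

# Usable VRAM per supported GPU type, in GB (assumed values for this sketch)
GPU_VRAM_GB = {"RTX4090": 24, "RTX5090": 32}

def estimate_deployment(params_billions: float, gpu: str = "RTX4090",
                        bytes_per_param: int = 2, overhead: float = 1.2) -> dict:
    """Rough VRAM estimate: fp16 weights plus ~20% headroom for KV cache/activations."""
    weights_gb = params_billions * bytes_per_param      # e.g. 8B params * 2 bytes = 16 GB
    total_gb = weights_gb * overhead                    # headroom for KV cache and activations
    num_gpus = math.ceil(total_gb / GPU_VRAM_GB[gpu])   # tensor-parallel degree needed
    return {"weights_gb": weights_gb, "total_gb": round(total_gb, 1), "num_gpus": num_gpus}

print(estimate_deployment(8))                  # Llama-3.1-8B: ~16 GB of weights, fits 1x RTX 4090
print(estimate_deployment(70, gpu="RTX5090"))  # Llama-3.1-70B: needs a multi-GPU setup
```

A real estimator would also account for quantization and the configured context length; this only shows the shape of the calculation.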
---

## 🎮 Live Demo

Try the chatbot: **[https://huggingface.co/spaces/MCP-1st-Birthday/Hivenet_ComputeAgent](https://huggingface.co/spaces/MCP-1st-Birthday/Hivenet_ComputeAgent)**

---

## 🎥 Preview

---

## 🎥 LinkedIn Post

LinkedIn post: **[https://www.linkedin.com/feed/update/urn:li:activity:7400886621627285505/](https://www.linkedin.com/feed/update/urn:li:activity:7400886621627285505/)**

---

## 💡 The Problem

Deploying AI models at scale remains frustratingly manual and error-prone:

- ❌ **Manual capacity planning** - Calculating GPU memory requirements for each model
- ❌ **Complex infrastructure setup** - SSH keys, networking, environment dependencies
- ❌ **Inference server configuration** - vLLM, TensorRT-LLM parameter tuning
- ❌ **Trial-and-error debugging** - Hours spent troubleshooting deployment issues
- ❌ **High barrier to entry** - Requires DevOps expertise that many researchers lack

This friction slows innovation and makes large-model deployment inaccessible to many teams.

---

## ✨ Our Solution

ComputeAgent introduces **autonomous compute orchestration** through a multi-agent MCP architecture that thinks, plans, and acts on your behalf:

### The Workflow

1. **🤖 Natural Language Interface** - Chat with the agent to deploy models
2. **🧠 Intelligent Analysis** - Automatically estimates GPU requirements from the model architecture
3. **⚡ Automated Provisioning** - Spins up HiveCompute instances via MCP
4. **🔧 Smart Configuration** - Generates optimized vLLM commands
5. **✅ Human-in-the-Loop** - Review and approve each step, with the option to modify it
6. **🎯 One-Click Deployment** - From request to running endpoint in minutes

**Powered entirely by open-source models** (GPT-OSS-20B orchestrator) running on HiveCompute infrastructure.
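The "Smart Configuration" step above amounts to assembling a vLLM launch command from the chosen model and GPU count. A minimal sketch, assuming the standard `vllm serve` CLI; the helper name is hypothetical and the flags shown are only a small subset of what a real deployment would set:

```python
def build_vllm_command(model_id: str, num_gpus: int = 1, port: int = 8000) -> str:
    """Assemble a vLLM launch command, sharding the model across GPUs if needed."""
    parts = [f"vllm serve {model_id}", f"--port {port}"]
    if num_gpus > 1:
        # Tensor parallelism splits each layer's weights across the GPUs
        parts.append(f"--tensor-parallel-size {num_gpus}")
    return " ".join(parts)

print(build_vllm_command("meta-llama/Llama-3.1-8B"))
# vllm serve meta-llama/Llama-3.1-8B --port 8000
print(build_vllm_command("meta-llama/Llama-3.1-70B", num_gpus=8))
# vllm serve meta-llama/Llama-3.1-70B --port 8000 --tensor-parallel-size 8
```

In the actual agent, the arguments to such a helper would come from the capacity-estimation step and the user's approved tool call.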
---

## 🎮 Key Features

### 🤖 Conversational Deployment

Deploy any Hugging Face model through natural language:

```
"Deploy meta-llama/Llama-3.1-70B on RTX 5090 in France"
"I need Mistral-7B with low latency"
"Deploy GPT-OSS-20B for production"
```

### 🧠 Tool Approval System

Complete control with human-in-the-loop oversight:

- **✅ Approve All** - Execute all proposed tools
- **❌ Reject All** - Skip tool execution and get alternative responses
- **🔧 Selective Approval** - Choose specific tools (e.g., "1,3,5")
- **📝 Modify Arguments** - Edit parameters before execution
- **🔄 Re-Reasoning** - Provide feedback for agent reconsideration

### 📊 Automatic Capacity Estimation

Intelligent resource planning:

- Calculates GPU memory from the model architecture
- Recommends optimal GPU types and quantities
- Considers tensor parallelism and quantization
- Accounts for KV cache and activation memory

### 🌍 Multi-Location Support

Deploy across global regions:

- 🇫🇷 **France**
- 🇦🇪 **UAE**
- 🇺🇸 **Texas**

### 🎯 GPU Selection

Support for the latest hardware:

- NVIDIA RTX 4090 (24GB VRAM)
- NVIDIA RTX 5090 (32GB VRAM)
- Multi-GPU configurations
- Automatic tensor parallelism setup

### Custom Capacity Configuration

Override the automatic estimates with your own GPU type and count.

### Tool Modification

Edit tool arguments before execution:

```json
{
  "name": "meta-llama-llama-3-1-8b",
  "location": "uae",
  "config": "1x RTX4090"
}
```

### 💬 Interactive Gradio UI

A clean, responsive interface:

- Real-time chat interaction
- Tool approval panels
- Capacity configuration editor
- Session management

### ⚡ Real-time Processing

Fast and responsive:

- Async API built on FastAPI

---

## 🚀 Quick Start

### Deploy Your First Model
**Simple Deployment**

```
Deploy meta-llama/Llama-3.1-8B
```

The agent will:

- Analyze the model (8B parameters, ~16GB VRAM needed)
- Recommend 1x RTX 4090
- Generate the vLLM configuration
- Provision infrastructure
- Provide deployment commands

---

## 📚 Learning Resources

### Understanding MCP

- [Model Context Protocol Specification](https://modelcontextprotocol.io/)
- [MCP Documentation](https://github.com/modelcontextprotocol)

### LangGraph & Agents

- [LangGraph Documentation](https://langchain-ai.github.io/langgraph/)
- [Building Agentic Systems](https://python.langchain.com/docs/modules/agents/)

### vLLM Deployment

- [vLLM Documentation](https://docs.vllm.ai/)
- [Optimizing Inference](https://docs.vllm.ai/en/latest/serving/performance.html)

---

## 👥 Team

**Team Name:** Hivenet AI Team

**Team Members:**

- **Igor Carrara** - [@carraraig](https://huggingface.co/carraraig) - AI Scientist
- **Mamoutou Diarra** - [@mdiarra](https://huggingface.co/mdiarra) - AI Scientist

---

## 🏆 Hackathon Context

Created for the **MCP 1st Birthday Hackathon**, celebrating the first anniversary of the Model Context Protocol with innovative AI applications that demonstrate the power of standardized tool use and agent orchestration.

### Why This Matters

ComputeAgent showcases:

- ✅ **MCP's power** for building production-grade agents
- ✅ **Human-in-the-loop** design for responsible AI
- ✅ **Real-world utility** solving actual deployment pain points
- ✅ **Open-source first** approach with accessible technology

---

## 🤝 Contributing

We welcome contributions! Here's how you can help:

### Areas for Contribution

- 🐛 **Bug fixes** - Report and fix issues
- ✨ **New features** - Add support for more models or GPUs
- 📚 **Documentation** - Improve guides and examples
- 🧪 **Testing** - Add test coverage
- 🎨 **UI/UX** - Enhance the interface

---

## 📄 License

Apache 2.0

---

## 🌐 About Hivenet & HiveCompute

### What is Hivenet?
Hivenet provides secure, sustainable cloud storage and computing through a distributed network, utilizing unused computing power from devices worldwide rather than relying on massive data centers. This approach makes cloud computing more efficient, affordable, and environmentally friendly.

### HiveCompute: Distributed GPU Cloud

**Compute with Hivenet** is a revolutionary GPU cloud computing platform that democratizes access to high-performance computing resources.

#### 🎯 Key Features

**🚀 High-Performance GPUs**

- Instant access to dedicated GPU nodes powered by RTX 4090 and RTX 5090
- Performance that matches or exceeds traditional data center GPUs
- Perfect for AI inference, training, rendering, and scientific computing

**💰 Transparent & Affordable Pricing**

- Per-second billing with up to 58% savings compared to GCP, AWS, and Azure
- No hidden egress fees or long-term commitments
- Pay only for what you use with prepaid credits

**🌍 Global Infrastructure**

- GPU clusters run locally in the UAE, France, and the USA for lower latency and tighter compliance
- Built-in GDPR compliance
- Data stays local for faster AI model responses

**♻️ Sustainable Computing**

- Uses unused computing power from devices worldwide instead of power-hungry data centers
- Reduces carbon footprint by up to 77% compared to traditional cloud services
- Community-driven distributed infrastructure
- Utilizes existing, underutilized hardware
- Reduces the need for new data center construction

**⚡ Instant Setup**

- Launch GPU instances in seconds
- Pre-configured templates for popular frameworks
- Jupyter notebooks and SSH access included
- Pause/resume instances without losing your setup

**🔒 Enterprise-Grade Reliability**

- Workloads automatically replicate across trusted nodes, keeping downtime near zero
- Hive-Certified providers with a 99.9% uptime SLA
- Tier-3 data center equivalent quality

Learn more at [compute.hivenet.com](https://compute.hivenet.com/)

---

## 💬 Support

### Need Help?
- 📖 **Documentation** - Check this README and inline code comments
- 📧 **Email** - Contact the Hivenet team

---