ALWAS ML Models — Analog Layout Workflow Automation System

Four production-ready ML models for the ALWAS system. They replace the Groq LLM API dependency with faster, free, local inference.

🎯 Models

Model	Task	Metric	Value
Hours Estimator	Predict layout hours from block metadata	R² / MAE	0.881 / 5.78h
Complexity Classifier	Classify Low/Medium/High complexity	Accuracy / F1	91.7% / 0.917
Bottleneck Predictor	Detect blocks at risk of getting stuck	Accuracy / F1	99.6% / 0.996
Completion Predictor	Predict remaining hours to completion	R² / MAE	0.945 / 1.65h

🏗️ Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    ALWAS ML Pipeline                            │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Block Created ──► Hours Estimator (XGBoost) ──► Est. Hours    │
│                ──► Complexity Classifier (XGB+LGB) ──► Class   │
│                                                                 │
│  Block In-Progress ──► Bottleneck Predictor ──► Risk Alert     │
│                    ──► Completion Predictor ──► ETA             │
│                                                                 │
│  Hourly Cron ──► Batch Bottleneck Scan ──► Notifications       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

🚀 Quick Start

Python (Direct)

import joblib
import numpy as np

# Load models
hours_model = joblib.load('models/hours_estimator.joblib')
complexity_xgb = joblib.load('models/complexity_xgb.joblib')
complexity_lgb = joblib.load('models/complexity_lgb.joblib')
bottleneck_model = joblib.load('models/bottleneck_predictor.joblib')
completion_model = joblib.load('models/completion_predictor.joblib')

# Load encoders
tech_node_encoder = joblib.load('models/tech_node_encoder.joblib')
block_type_encoder = joblib.load('models/block_type_encoder.joblib')
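
The quick start loads two complexity classifiers (`complexity_xgb` and `complexity_lgb`), and the architecture diagram labels them an XGB+LGB ensemble, but this README does not state how their outputs are combined. The following is a minimal sketch, assuming soft voting (a plain average of the per-class probabilities each model would return) and assuming an alphabetical class order (High, Low, Medium), as a LabelEncoder would produce. The `soft_vote` helper and the input numbers are illustrative, not the confirmed ALWAS logic or real model output.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed class order (alphabetical, as a LabelEncoder would produce).
CLASSES: List[str] = ["High", "Low", "Medium"]


def soft_vote(
    xgb_proba: Sequence[float], lgb_proba: Sequence[float]
) -> Tuple[str, Dict[str, float]]:
    """Average two probability vectors; return (label, per-class probabilities)."""
    if len(xgb_proba) != len(CLASSES) or len(lgb_proba) != len(CLASSES):
        raise ValueError("each probability vector needs one entry per class")
    averaged = [(a + b) / 2.0 for a, b in zip(xgb_proba, lgb_proba)]
    best = max(range(len(CLASSES)), key=lambda i: averaged[i])
    return CLASSES[best], dict(zip(CLASSES, averaged))


# Illustrative probabilities, not real model output.
label, probabilities = soft_vote([0.90, 0.02, 0.08], [0.80, 0.05, 0.15])
print(label, probabilities)
```

If the real ensemble weights the two models unequally, only the averaging line would need to change.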

REST API

# Install
pip install fastapi uvicorn joblib xgboost lightgbm scikit-learn numpy

# Run
MODEL_DIR=./models python inference_server.py

# Call
curl -X POST http://localhost:7860/predict/estimate \
  -H "Content-Type: application/json" \
  -d '{
    "block_type": "PLL",
    "tech_node": "7nm",
    "priority": "P1-Critical",
    "transistor_count": 80000,
    "has_dependencies": true,
    "num_dependencies": 3,
    "constraint_complexity": 2.5,
    "drc_iterations": 4
  }'

Response:

{
  "complexity": "High",
  "estimated_hours": 89.0,
  "confidence": 0.996,
  "risk_level": "high",
  "reasoning": "Advanced 7nm node requires extensive DRC/LVS iterations...",
  "recommended_drc_iterations": 4,
  "suggested_engineer_skill_level": "senior",
  "complexity_probabilities": {"High": 0.996, "Low": 0.0, "Medium": 0.003},
  "estimated_days": 11.1
}
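
The same call can be made from Python with only the standard library. This is a sketch, assuming the server runs on `localhost:7860` as in the curl example. The helper names `build_estimate_request` and `estimate` are hypothetical and not part of `inference_server.py`.

```python
import json
import urllib.request
from typing import Any, Dict

BASE_URL = "http://localhost:7860"  # host and port taken from the curl example


def build_estimate_request(block: Dict[str, Any]) -> urllib.request.Request:
    """Build (but do not send) the POST request for /predict/estimate."""
    body = json.dumps(block).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/predict/estimate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def estimate(block: Dict[str, Any], timeout: float = 5.0) -> Dict[str, Any]:
    """Send the request and decode the JSON reply; requires a running server."""
    request = build_estimate_request(block)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Same payload as the curl example above.
example_block = {
    "block_type": "PLL",
    "tech_node": "7nm",
    "priority": "P1-Critical",
    "transistor_count": 80000,
    "has_dependencies": True,
    "num_dependencies": 3,
    "constraint_complexity": 2.5,
    "drc_iterations": 4,
}
request = build_estimate_request(example_block)
print(request.get_method(), request.full_url)
```

With the server running, `estimate(example_block)["estimated_hours"]` would return the hours field shown in the response above.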

📡 API Endpoints

Method	Endpoint	Description
`POST`	`/predict/estimate`	Complexity & hours estimation (replaces Groq)
`POST`	`/predict/bottleneck`	Bottleneck risk prediction
`POST`	`/predict/completion`	Completion time prediction
`POST`	`/predict/bulk-estimate`	Bulk estimation (up to 200 blocks)
`GET`	`/model/metrics`	Model performance metrics
`GET`	`/model/supported-values`	Supported block types, tech nodes, etc.
`GET`	`/health`	Health check
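
Because `/predict/bulk-estimate` accepts at most 200 blocks per call, larger workloads have to be split on the client side. The sketch below shows one way to batch a list. The `chunk_blocks` helper is hypothetical, and the request body schema for the bulk endpoint is not documented here, so the sketch stops at producing the batches.

```python
from typing import Any, Dict, Iterator, List

MAX_BULK_BLOCKS = 200  # limit stated in the endpoint table


def chunk_blocks(
    blocks: List[Dict[str, Any]], size: int = MAX_BULK_BLOCKS
) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of at most `size` blocks."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(blocks), size):
        yield blocks[start:start + size]


# 450 placeholder blocks split into batches the endpoint can accept.
blocks = [{"block_type": "OTA", "tech_node": "28nm"} for _ in range(450)]
batch_sizes = [len(batch) for batch in chunk_blocks(blocks)]
print(batch_sizes)  # [200, 200, 50]
```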

🔌 ALWAS Integration

Replace Groq API in Express.js

Before (server/routes/blocks.js):

// Old: Groq LLM call ($0.002/request, 300ms latency)
const response = await groq.chat.completions.create({
  model: "llama-3.3-70b-versatile",
  messages: [{ role: "user", content: prompt }]
});

After (using ALWAS ML API):

// New: Local ML model (free, <5ms latency)
const response = await fetch('http://localhost:7860/predict/estimate', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    block_type: block.type,
    tech_node: block.techNode,
    priority: block.priority,
    transistor_count: block.transistorCount,
    has_dependencies: block.dependencies?.length > 0,
    num_dependencies: block.dependencies?.length || 0,
    constraint_complexity: block.constraintComplexity || 1.0,
    drc_iterations: block.drcIterations || 2
  })
});
const estimate = await response.json();

Add Bottleneck Scanning to Cron Job

// In server/cron/bottleneckScanner.js
const blocks = await Block.find({ status: { $ne: 'Completed' } });

for (const block of blocks) {
  const risk = await fetch('http://localhost:7860/predict/bottleneck', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      block_type: block.type,
      tech_node: block.techNode,
      estimated_hours: block.estimatedHours,
      hours_logged: block.hoursLogged,
      current_stage: block.status,
      days_in_current_stage: daysSinceLastTransition(block),
      drc_violations_total: block.drcViolations,
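      // A block without a due date is never overdue (new Date() > null is true in JS)
      is_overdue: Boolean(block.dueDate) && new Date() > block.dueDate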
    })
  });
  const result = await risk.json();
  
  if (result.should_alert) {
    // Create notification for manager
    await Notification.create({
      type: 'stuck',
      message: `ML Alert: ${block.name} has HIGH bottleneck risk`,
      recommendations: result.recommendations
    });
    io.emit('newNotification', { blockId: block._id, risk: result });
  }
}

Add Completion ETA to Block Detail

// In GET /api/blocks/:id
const completion = await fetch('http://localhost:7860/predict/completion', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    block_type: block.type,
    tech_node: block.techNode,
    estimated_hours: block.estimatedHours,
    current_stage: block.status,
    cumulative_hours: block.hoursLogged,
    cumulative_days: daysSinceStart(block),
    cumulative_drc_violations: block.drcViolations
  })
});
const eta = await completion.json();
// eta.remaining_hours, eta.estimated_completion_date, eta.progress_percent

📊 Supported Values

Block Types (20)

ADC, BGR, BandgapRef, Comparator, CurrentMirror, DAC, DiffAmp, LDO, LNA, LVDS_Driver, Mixer, OTA, Oscillator, PA, PLL, PowerDetector, SampleHold, SerDes, TIA, VCO

Technology Nodes (8)

5nm, 7nm, 12nm, 14nm, 22nm, 28nm, 45nm, 65nm

Pipeline Stages (7)

Not Started → In Progress → DRC → LVS → ERC → Review → Completed
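
Requests with values outside these lists can be caught before they reach the API. A minimal client-side check, assuming the lists above are current; `GET /model/supported-values` remains the authoritative source. The `validate_block` helper is hypothetical.

```python
from typing import Any, Dict, List

BLOCK_TYPES = frozenset({
    "ADC", "BGR", "BandgapRef", "Comparator", "CurrentMirror", "DAC",
    "DiffAmp", "LDO", "LNA", "LVDS_Driver", "Mixer", "OTA", "Oscillator",
    "PA", "PLL", "PowerDetector", "SampleHold", "SerDes", "TIA", "VCO",
})
TECH_NODES = frozenset({"5nm", "7nm", "12nm", "14nm", "22nm", "28nm", "45nm", "65nm"})
STAGES = ("Not Started", "In Progress", "DRC", "LVS", "ERC", "Review", "Completed")


def validate_block(block: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the block looks valid."""
    problems: List[str] = []
    if block.get("block_type") not in BLOCK_TYPES:
        problems.append(f"unsupported block_type: {block.get('block_type')!r}")
    if block.get("tech_node") not in TECH_NODES:
        problems.append(f"unsupported tech_node: {block.get('tech_node')!r}")
    stage = block.get("current_stage")
    # current_stage is only sent to the bottleneck and completion endpoints.
    if stage is not None and stage not in STAGES:
        problems.append(f"unsupported current_stage: {stage!r}")
    return problems


ok = validate_block({"block_type": "PLL", "tech_node": "7nm"})
bad = validate_block({"block_type": "PLL", "tech_node": "3nm", "current_stage": "Signoff"})
print(ok)
print(bad)
```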

📈 Feature Importance

Hours Estimation — Top Features

transistor_count_log (31.5%) — Most predictive: larger blocks take longer
transistor_count (28.6%) — Raw count captures non-log relationships
engineer_skill_factor (7.7%) — Skill level matters significantly
tech_node_encoded (6.8%) — Advanced nodes are harder
constraint_complexity (2.7%) — Analog constraints add overhead
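
The two transistor-count features above are the raw count and a log transform of it. How `transistor_count_log` is computed is not stated in this README. The sketch below assumes a natural log of (count + 1) via `math.log1p`; the training script may use a different base or offset. The `transistor_count_features` helper is hypothetical.

```python
import math
from typing import Dict


def transistor_count_features(transistor_count: int) -> Dict[str, float]:
    """Return the raw count alongside an assumed log1p transform."""
    if transistor_count < 0:
        raise ValueError("transistor_count must be non-negative")
    return {
        "transistor_count": float(transistor_count),
        # Assumption: natural log of (count + 1); the real transform may differ.
        "transistor_count_log": math.log1p(transistor_count),
    }


features = transistor_count_features(80000)
print(features)
```

Keeping both forms lets a tree model split on absolute size as well as on order of magnitude.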

Completion Prediction — Top Features

current_stage_idx (44.9%) — Current stage is the strongest signal
stages_completed (22.3%) — Progress through pipeline
avg_hours_per_stage_so_far (21.0%) — Pace of work predicts future
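
To illustrate how these three features relate to a block's state, here is a sketch that derives them from `current_stage` and the hours logged so far. The definitions are assumptions: `current_stage_idx` is taken as the position in the seven-stage pipeline, `stages_completed` as the number of stages before the current one, and the average as hours divided by completed stages (minimum 1). The actual training code may define them differently.

```python
from typing import Dict, List

STAGES: List[str] = [
    "Not Started", "In Progress", "DRC", "LVS", "ERC", "Review", "Completed",
]


def completion_features(current_stage: str, cumulative_hours: float) -> Dict[str, float]:
    """Derive the three top completion features under the stated assumptions."""
    if current_stage not in STAGES:
        raise ValueError(f"unknown stage: {current_stage!r}")
    stage_idx = STAGES.index(current_stage)
    # Assumption: every stage before the current one counts as completed.
    stages_completed = stage_idx
    # Guard against division by zero for blocks that have not started.
    avg_hours = cumulative_hours / max(stages_completed, 1)
    return {
        "current_stage_idx": float(stage_idx),
        "stages_completed": float(stages_completed),
        "avg_hours_per_stage_so_far": avg_hours,
    }


features = completion_features("LVS", 42.0)
print(features)
```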

🔧 Retraining

# Generate new training data from ALWAS MongoDB exports
python training/generate_dataset.py

# Train all models
python training/train_models.py
python training/train_completion.py

Recommended retraining schedule: monthly, or once more than 100 newly completed blocks have accumulated, whichever comes first.
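
The schedule can be expressed as a small check that a scheduler runs before kicking off the training scripts. This is a sketch, assuming "monthly" is approximated as 30 days; the `should_retrain` helper and its inputs are hypothetical and not part of the training code.

```python
from datetime import date, timedelta

RETRAIN_INTERVAL = timedelta(days=30)  # assumption: "monthly" approximated as 30 days
NEW_BLOCK_THRESHOLD = 100              # retrain once more than this many blocks complete


def should_retrain(last_trained: date, today: date, new_completed_blocks: int) -> bool:
    """True when either the time-based or the volume-based trigger fires."""
    interval_elapsed = (today - last_trained) >= RETRAIN_INTERVAL
    enough_new_data = new_completed_blocks > NEW_BLOCK_THRESHOLD
    return interval_elapsed or enough_new_data


last_trained = date(2026, 1, 1)
print(should_retrain(last_trained, date(2026, 1, 10), 101))  # volume trigger
print(should_retrain(last_trained, date(2026, 1, 10), 100))  # neither trigger
print(should_retrain(last_trained, date(2026, 2, 5), 0))     # time trigger
```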

📦 Files

models/
  hours_estimator.joblib          # XGBoost regressor
  complexity_xgb.joblib           # XGBoost classifier (ensemble member)
  complexity_lgb.joblib           # LightGBM classifier (ensemble member)
  bottleneck_predictor.joblib     # Calibrated XGBoost classifier
  completion_predictor.joblib     # XGBoost regressor for remaining time
  tech_node_encoder.joblib        # LabelEncoder
  block_type_encoder.joblib       # LabelEncoder
  priority_encoder.joblib         # OrdinalEncoder
  complexity_encoder.joblib       # LabelEncoder
  bottleneck_encoder.joblib       # LabelEncoder
  feature_config.json             # Feature lists and supported values
  metrics.json                    # Model evaluation metrics
inference_server.py               # FastAPI inference server
training/
  generate_dataset.py             # Synthetic data generator
  train_models.py                 # Model training (Models 1-3)
  train_completion.py             # Completion model training (Model 4)

📐 Performance vs Groq API

Metric	Groq llama-3.3-70b	ALWAS ML Models
Latency	~300ms	<5ms
Cost per request	$0.002	Free
Internet required	Yes	No
Structured output	Sometimes	Always (JSON guaranteed)
Batch support	Limited	200 blocks/call
Bottleneck detection	No	Yes (real-time)
Completion prediction	No	Yes (R²=0.945)
Explainability	LLM narrative	Feature importance + reasoning

License

MIT — Built for EPIC Build-A-Thon 2026 | Epical Layouts Pvt. Ltd.
