Mobile VLA: Vision-Language-Action System for Omniwheel Robot Navigation

Model Description

This model is a Vision-Language-Action (VLA) system adapted from the RoboVLMs framework for omniwheel robot navigation. It demonstrates the framework's cross-domain robustness: a system originally built for robot-manipulator tasks transfers successfully to mobile-robot navigation.

Performance

  • MAE: 0.222 (a 72.5% improvement over the baseline)
  • Task: Omniwheel Mobile Robot Navigation
  • Framework: RoboVLMs adapted for mobile robots
  • Performance Level: Practical
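
Here MAE is the mean absolute error between predicted and ground-truth 2D actions on the evaluation set. A minimal sketch of the computation; the tensors below are hypothetical stand-ins, since the evaluation data is not bundled with the model:

import torch

# Hypothetical (N, 2) tensors of predicted and ground-truth (linear_x, linear_y) actions
pred = torch.tensor([[0.80, 0.10], [0.50, -0.20]])
target = torch.tensor([[1.00, 0.00], [0.40, -0.30]])

mae = torch.mean(torch.abs(pred - target))  # reported value on the real evaluation set: 0.222

Assuming "improvement" means relative MAE reduction, the quoted 72.5% implies a baseline MAE of roughly 0.222 / (1 - 0.725) ≈ 0.807.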

Key Features

  • Task Adaptation: Successfully adapted from robot-manipulator tasks to mobile-robot navigation
  • Framework Robustness: Demonstrates cross-domain applicability of RoboVLMs
  • Omniwheel Optimization: Omnidirectional (holonomic) control for the mobile base
  • Real-world Applicability: Navigation performance at a practical level

Model Architecture

  • Vision Encoder: Kosmos-2-based image processing
  • Language Encoder: Korean text-command understanding
  • Action Predictor: 2D action prediction (linear_x, linear_y), sketched below
  • Output: Continuous action values for robot control
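
The checkpoint name (best_simple_lstm_model.pth) suggests an LSTM action head on top of the Kosmos-2 features. A minimal structural sketch under that reading; the feature and hidden dimensions below are illustrative assumptions, not the exact training configuration:

import torch
import torch.nn as nn

class MobileVLAActionHead(nn.Module):
    # Illustrative: LSTM over fused vision-language features -> 2D continuous action
    def __init__(self, feature_dim=2048, hidden_dim=512):
        super().__init__()
        self.lstm = nn.LSTM(feature_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, 2)  # (linear_x, linear_y)

    def forward(self, features):
        # features: (batch, seq_len, feature_dim) produced by the Kosmos-2 backbone
        out, _ = self.lstm(features)
        return self.fc(out[:, -1])  # action predicted from the last timestep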

Usage

import torch
from PIL import Image

# Load the trained checkpoint (saved as a full pickled module, so torch.load
# returns the model object itself)
model = torch.load("best_simple_lstm_model.pth", map_location="cpu")
model.eval()

# Example usage: predict a 2D action (linear_x, linear_y) from an image and a command
image = Image.open("robot_environment.jpg")
text_command = "Move forward to the target"

with torch.no_grad():
    action = model.predict_action(image, text_command)
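
The returned pair maps to the omniwheel base's translational velocities (linear_x forward, linear_y lateral), within the ranges listed under Training Data below.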

Training Data

  • Dataset: Mobile VLA Dataset
  • Total Frames: 1,296
  • Action Range: linear_x [0.0, 1.15], linear_y [-1.15, 1.15]
  • Action Distribution: Forward (56.1%), Left turn (10.0%), Right turn (7.2%)
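
Because the network outputs continuous values, deployments typically clamp predictions to this training range before sending them to the base. A minimal sketch; the clip_action helper is hypothetical, not part of the released code:

import numpy as np

# Hypothetical post-processing: clamp a predicted action to the training range
def clip_action(action):
    linear_x = float(np.clip(action[0], 0.0, 1.15))    # forward-only speed
    linear_y = float(np.clip(action[1], -1.15, 1.15))  # lateral speed, both directions
    return linear_x, linear_y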

Research Contribution

This work demonstrates the robustness of VLA frameworks by adapting RoboVLMs from robot-manipulator tasks to mobile-robot navigation, reaching practical performance (MAE 0.222).

Citation

@article{mobile_vla_2024,
  title={Mobile VLA: Vision-Language-Action System for Omniwheel Robot Navigation},
  author={Your Name},
  journal={arXiv preprint},
  year={2024}
}

License

MIT License


Model Performance: MAE 0.222 | Task: Omniwheel Robot Navigation | Framework: RoboVLMs Adapted
