# Mobile VLA: Vision-Language-Action System for Omniwheel Robot Navigation
## Model Description

This model is a Vision-Language-Action (VLA) system adapted from the RoboVLMs framework for omniwheel robot navigation. It demonstrates the framework's robustness by successfully transferring from robot-manipulator tasks to mobile-robot navigation.
## Performance

- MAE: 0.222 (a 72.5% improvement over the baseline)
- Task: Omniwheel mobile robot navigation
- Framework: RoboVLMs, adapted for mobile robots
- Performance Level: Practical
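For reference, the MAE here is the mean absolute error between predicted and ground-truth action values, averaged over all frames and both action dimensions. A minimal sketch of the metric (the tensor shapes are an assumption):

```python
import torch

def action_mae(pred: torch.Tensor, target: torch.Tensor) -> float:
    # pred, target: (N, 2) tensors of (linear_x, linear_y) actions
    return (pred - target).abs().mean().item()
```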
## Key Features

- Task Adaptation: Successfully transferred from manipulator tasks to mobile-robot tasks
- Framework Robustness: Cross-domain application capability
- Omniwheel Optimization: Omnidirectional control for mobile robots
- Real-world Applicability: Practical navigation performance
## Model Architecture

- Vision Encoder: Kosmos-2-based image processing
- Language Encoder: Korean text command understanding
- Action Predictor: 2D action prediction (`linear_x`, `linear_y`)
- Output: Continuous action values for robot control
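Below is a minimal, illustrative sketch of how these components could be wired together. The class name `SimpleLSTMPolicy`, the feature dimensions, and the fusion scheme are assumptions for illustration, not the released implementation; only the 2D action head (`linear_x`, `linear_y`) comes from the card.

```python
import torch
import torch.nn as nn

class SimpleLSTMPolicy(nn.Module):
    """Illustrative sketch: fuses image and text features and predicts
    a 2D continuous action (linear_x, linear_y). Sizes are assumptions."""

    def __init__(self, vision_dim=1024, text_dim=768, hidden_dim=512):
        super().__init__()
        # In the released model, vision features would come from a Kosmos-2
        # backbone and text features from its language encoder; these
        # dimensions are stand-ins for illustration.
        self.fuse = nn.Linear(vision_dim + text_dim, hidden_dim)
        self.lstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.action_head = nn.Linear(hidden_dim, 2)  # (linear_x, linear_y)

    def forward(self, vision_feats, text_feats):
        # vision_feats: (B, T, vision_dim), text_feats: (B, T, text_dim)
        x = torch.relu(self.fuse(torch.cat([vision_feats, text_feats], dim=-1)))
        out, _ = self.lstm(x)
        return self.action_head(out)  # (B, T, 2) continuous actions
```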
## Usage

```python
import torch
from PIL import Image

# Load the full pickled model (hence weights_only=False on recent PyTorch)
model = torch.load("best_simple_lstm_model.pth", map_location="cpu", weights_only=False)
model.eval()

# Example usage: predict a 2D action from a camera frame and a text command
image = Image.open("robot_environment.jpg")
text_command = "Move forward to the target"
with torch.no_grad():
    action = model.predict_action(image, text_command)  # (linear_x, linear_y)
```
## Training Data

- Dataset: Mobile VLA Dataset
- Total Frames: 1,296
- Action Range: `linear_x` [0.0, 1.15], `linear_y` [-1.15, 1.15] (see the clamping sketch below)
- Action Pattern: Forward (56.1%), Left turn (10.0%), Right turn (7.2%)
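Because the model outputs continuous values, it is sensible to clamp predictions to the dataset's action range before sending velocity commands to the robot. A minimal sketch using the bounds listed above (the helper name is hypothetical):

```python
import torch

# Action bounds taken from the dataset statistics above.
LINEAR_X_RANGE = (0.0, 1.15)
LINEAR_Y_RANGE = (-1.15, 1.15)

def clamp_action(action: torch.Tensor) -> torch.Tensor:
    # action: tensor of shape (2,) holding (linear_x, linear_y)
    linear_x = action[0].clamp(*LINEAR_X_RANGE)
    linear_y = action[1].clamp(*LINEAR_Y_RANGE)
    return torch.stack([linear_x, linear_y])
```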
## Research Contribution

This work demonstrates the robustness of VLA frameworks: RoboVLMs, originally built for robot-manipulator tasks, was successfully adapted to mobile-robot navigation and achieves practical performance with an MAE of 0.222.
Citation
@article{mobile_vla_2024,
title={Mobile VLA: Vision-Language-Action System for Omniwheel Robot Navigation},
author={Your Name},
journal={arXiv preprint},
year={2024}
}
## License

MIT License
Model Performance: MAE 0.222 | Task: Omniwheel Robot Navigation | Framework: Adapted RoboVLMs