Crab SmolVLA — SA-RWFM Teacher (Tactile)

Fine-tuned SmolVLA with Sensitivity-Aware Reward-Weighted Flow Matching (SA-RWFM) and dual tactile sensors for right-arm manipulation on the Crab robot.

This model serves as the tactile-conditioned teacher for knowledge distillation into HapticVLA.

Model Details

Base model: lerobot/smolvla_base (450M params) + DualTactileEncoder
Action space: 6-DOF absolute joint positions (indices 6–11)
State input: 6D proprioception + 128D tactile embedding (2×10×10 force matrices)
Training data: 27 demonstrations + reward labels across 3 tasks
Best validation loss: 6.56 (note: RWFM loss is not directly comparable to standard MSE)
Training: 50K steps, RTX 5090, ~4 hrs
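
The shape bookkeeping behind the state input can be sketched as follows. This is only an illustration under stated assumptions, not the model's DualTactileEncoder: the 12-entry joint vector, the reuse of indices 6–11 for proprioception, the random linear projection, and the concatenation into one 134D vector are all hypothetical choices made to show how 6D proprioception and a 128D embedding of two 10×10 force matrices could fit together.

```python
import random

PROPRIO_DIM = 6            # from the model card
TACTILE_SIDE = 10          # each sensor yields a 10x10 force matrix
EMBED_DIM = 128            # from the model card
RIGHT_ARM = slice(6, 12)   # indices 6-11 inclusive


def flatten(matrix):
    """Row-major flattening of a 2D list."""
    return [value for row in matrix for value in row]


def project(vector, weights):
    """Plain matrix-vector product; a stand-in for the real tactile encoder."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def build_state(joints, left_pad, right_pad, weights):
    """Concatenate right-arm joints with a tactile embedding (assumed layout)."""
    proprio = list(joints[RIGHT_ARM])                  # 6 values
    tactile = flatten(left_pad) + flatten(right_pad)   # 200 values
    embedding = project(tactile, weights)              # 128 values
    return proprio + embedding


rng = random.Random(0)
joints = [0.1 * i for i in range(12)]  # hypothetical 12-joint vector
left_pad = [[0.0] * TACTILE_SIDE for _ in range(TACTILE_SIDE)]
right_pad = [[1.0] * TACTILE_SIDE for _ in range(TACTILE_SIDE)]
weights = [
    [rng.uniform(-0.1, 0.1) for _ in range(2 * TACTILE_SIDE ** 2)]
    for _ in range(EMBED_DIM)
]

state = build_state(joints, left_pad, right_pad, weights)
print(len(state))  # 134
```

The real encoder is learned and may fuse the two sensors differently; only the dimensions 6, 128, and 2×10×10 come from this card.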

Key Features

Dual tactile sensing: Processes left and right 10×10 tactile force matrices
Reward-weighted flow matching: Upweights successful demonstrations, downweights failures
Anchor regularization: Prevents reward weight collapse
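
The reward-weighting idea can be illustrated with a minimal sketch. It is not the SA-RWFM implementation: the softmax weighting, the `beta` temperature, the linear noise-to-action path, and the `anchor_mix` term (a uniform mixture used here as a hypothetical stand-in for anchor regularization, which this card does not define) are all assumptions.

```python
import math


def reward_weights(rewards, beta=1.0, anchor_mix=0.1):
    """Softmax over rewards, mixed with uniform weights so none collapses to 0."""
    top = max(rewards)
    exps = [math.exp(beta * (r - top)) for r in rewards]  # shifted for stability
    total = sum(exps)
    n = len(rewards)
    return [(1.0 - anchor_mix) * e / total + anchor_mix / n for e in exps]


def flow_matching_target(noise, action, t):
    """Linear path x_t = (1 - t) * noise + t * action with velocity action - noise."""
    x_t = [(1.0 - t) * n + t * a for n, a in zip(noise, action)]
    u_t = [a - n for n, a in zip(noise, action)]
    return x_t, u_t


def weighted_fm_loss(pred_velocities, target_velocities, weights):
    """Sum over samples of weight times mean squared velocity error."""
    loss = 0.0
    for v, u, w in zip(pred_velocities, target_velocities, weights):
        mse = sum((vi - ui) ** 2 for vi, ui in zip(v, u)) / len(u)
        loss += w * mse
    return loss


# Two demonstrations: one successful (reward 1.0), one failed (reward 0.0).
weights = reward_weights([1.0, 0.0])
x_t, u_t = flow_matching_target([0.0, 0.0], [1.0, 2.0], t=0.5)

# A zero-velocity prediction is penalized more on the successful demonstration.
targets = [u_t, u_t]
zero_pred = [[0.0, 0.0], [0.0, 0.0]]
loss = weighted_fm_loss(zero_pred, targets, weights)
```

With `anchor_mix > 0`, every demonstration keeps a weight of at least `anchor_mix / n`, which is one simple way to keep the weights from concentrating on a handful of samples.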

Performance (Sync Mode, 20 trials per task)

Task	Success Rate	Force Errors
Eggs	85%	3/20
Can	55%	9/20
Waffles	85%	3/20
Mean	75.0%	15/60
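
The Mean row can be reproduced from the per-task rows. In the short check below, the success counts (17, 11, 17) are derived from the reported percentages at 20 trials per task; the variable names are illustrative.

```python
TRIALS_PER_TASK = 20

# Success counts derived from the table: 85% of 20 = 17, 55% of 20 = 11.
results = {
    "Eggs": {"successes": 17, "force_errors": 3},
    "Can": {"successes": 11, "force_errors": 9},
    "Waffles": {"successes": 17, "force_errors": 3},
}

total_trials = TRIALS_PER_TASK * len(results)
total_successes = sum(r["successes"] for r in results.values())
total_force_errors = sum(r["force_errors"] for r in results.values())

mean_success_pct = 100.0 * total_successes / total_trials

print(f"Mean success rate: {mean_success_pct:.1f}%")              # 75.0%
print(f"Force errors: {total_force_errors}/{total_trials}")       # 15/60
```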

Note: This model requires tactile sensor hardware at inference. For a tactile-free alternative with better performance, see HapticVLA.

Usage

import torch

# Load the checkpoint onto the CPU so that no GPU is needed just to read it.
checkpoint = torch.load("best/model.pt", map_location="cpu")

See Advanced-Robotic-Manipulation/crab for the full inference pipeline.

Citation

If you use this model, please cite our paper:

@article{gubernatorov2026hapticvla,
  title={HapticVLA: Contact-Rich Manipulation via Vision-Language-Action Model without Inference-Time Tactile Sensing},
  author={Gubernatorov, Konstantin and Sannikov, Mikhail and Mikhalchuk, Ilya and Kuznetsov, Egor and Artemov, Makar and Ouwatobi, Ogunwoye Faith and Fernando, Marcelino and Asanov, Artem and Guo, Ziang and Tsetserukou, Dzmitry},
  journal={arXiv preprint arXiv:2603.15257},
  year={2026}
}

Paper for armteam/crab-smolvla-rwfm

HapticVLA: Contact-Rich Manipulation via Vision-Language-Action Model without Inference-Time Tactile Sensing

Paper • 2603.15257 • Published Mar 16