SmolVLA Trainer with LeRobot

This runbook is for operators running SmolVLA training in Plato with LeRobot datasets. It complements the parameter contract in Configuration Parameters.

Migration note

Older setup notes may still reference uv sync --extra robotics. That root-package extra no longer exists. For current Plato, keep LeRobot / SmolVLA dependencies in a dedicated environment and verify them with import lerobot before launching these configs.

1) Setup

Install core dependencies:


uv sync

Add the optional LeRobot / SmolVLA robotics stack in a dedicated environment so the default Plato install stays lean.

The exact command depends on how you package the robotics dependencies in your environment. Plato itself only requires that lerobot is importable at runtime before you launch a LeRobot config.

Authenticate with Hugging Face when using private repos:


huggingface-cli login

Verify that the optional stack imports:


uv run python -c "import lerobot; print(lerobot.__version__)"

If your dataset is video-backed, ensure ffmpeg is installed on the host.
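The two setup checks above can be combined into one preflight script. The following is a minimal sketch, not part of Plato: the function name `preflight` and its `require_ffmpeg` flag are hypothetical, and the FFmpeg check only looks for an `ffmpeg` executable on PATH, which does not prove that the libraries needed by the video decoding stack are present.

```python
import importlib.util
import shutil


def preflight(require_ffmpeg: bool = False) -> list[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems = []
    # find_spec returns None when the top-level package is not installed.
    if importlib.util.find_spec("lerobot") is None:
        problems.append("lerobot is not importable in this environment")
    # shutil.which returns None when the executable is not on PATH.
    if require_ffmpeg and shutil.which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems


# Report every problem found; prints nothing when both checks pass.
for problem in preflight(require_ffmpeg=True):
    print(f"preflight: {problem}")
```

Pass `require_ffmpeg=True` only for video-backed datasets; image-only datasets do not need the check.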

2) Config Profiles to Start From

Use the configs added under configs/LeRobot/:

configs/LeRobot/lerobot_datasource_base.toml: shared LeRobot datasource include.
configs/LeRobot/smolvla_single_client_smoke.toml: minimal single-client smoke run.
configs/LeRobot/smolvla_fedavg_two_client_smoke.toml: 2-client FedAvg smoke run.
configs/LeRobot/smolvla_full_finetune.toml: longer full fine-tune profile.

3) Run Commands

Single-client smoke:


uv run python plato.py --config configs/LeRobot/smolvla_single_client_smoke.toml

Two-client federated smoke:


uv run python plato.py --config configs/LeRobot/smolvla_fedavg_two_client_smoke.toml

Full-finetune profile:


uv run python plato.py --config configs/LeRobot/smolvla_full_finetune.toml

4) Required Plato TOML Fields

Minimum contract for this integration:


[data]
datasource = "LeRobot"

[trainer]
type = "lerobot"
model_type = "smolvla"
model_name = "smolvla"

[parameters.policy]
type = "smolvla"
path = "lerobot/smolvla_base"
finetune_mode = "adapter" # or "full"
precision = "fp32"
device = "cpu"            # or "cuda" / "mps"

[parameters.dataset]
repo_id = "lerobot/pusht_image"
delta_timestamps = { observation_image = [-0.2, -0.1, 0.0] }

[parameters.transforms]
image_size = [224, 224]
normalize = true
interpolation = "bilinear"

5) Plato ↔ `lerobot-train` Mapping

Plato config field(s)	`lerobot-train` equivalent	Mapping type
`parameters.policy.path`	`--policy.path`	Direct
`parameters.dataset.repo_id`	`--dataset.repo_id`	Direct
`trainer.batch_size`	`--batch_size`	Direct
`parameters.policy.device`	`--policy.device`	Direct
`trainer.rounds` + `trainer.epochs`	`--steps`	Conceptual scheduling mapping
`server.checkpoint_path` / `server.model_path`	`--output_dir`	Conceptual output-location mapping
`parameters.dataset.delta_timestamps`	LeRobot dataset `delta_timestamps` usage during training	Conceptual data-window mapping
`parameters.policy.finetune_mode` (`full`/`adapter`)	Trainable-parameter strategy during policy training	Conceptual finetune-mode mapping

Notes:

Upstream LeRobot examples for SmolVLA commonly use --steps; Plato uses round/epoch scheduling.
Adapter-mode behavior in Plato is implemented via parameters.policy.finetune_mode and adapter parameter selection in the SmolVLA model wrapper.
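When comparing a Plato schedule against an upstream --steps budget, a rough conversion can be useful. The sketch below is a back-of-the-envelope estimate under stated assumptions, not Plato's scheduler: it assumes each round runs trainer.epochs local epochs over one client's samples, with one optimizer step per batch. The function name and the `samples_per_client` parameter are hypothetical.

```python
import math


def approx_steps_per_client(
    rounds: int, epochs: int, samples_per_client: int, batch_size: int
) -> int:
    """Estimate optimizer steps one client performs across all rounds."""
    # The last batch of an epoch may be partial, so round up.
    batches_per_epoch = math.ceil(samples_per_client / batch_size)
    return rounds * epochs * batches_per_epoch


# 10 rounds x 2 epochs x ceil(1000 / 32) = 10 x 2 x 32
print(approx_steps_per_client(rounds=10, epochs=2, samples_per_client=1000, batch_size=32))  # 640
```

The estimate is per client; in a federated run, clients train on their own partitions, so totals across clients are not directly comparable to a single-process --steps value.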

6) Troubleshooting

Missing optional robotics dependencies

Symptom:

ImportError mentioning missing lerobot / SmolVLA runtime dependencies.

Action:

Install the LeRobot / SmolVLA stack in the active environment, then rerun the import check:


uv run python -c "import lerobot; print(lerobot.__version__)"

Dataset repo not configured

Symptom:

LeRobot datasource requires "parameters.dataset.repo_id" to be set.

Action:

Set parameters.dataset.repo_id in your TOML.

Private dataset/model access failure

Symptoms:

SmolVLA load failure from parameters.policy.path.
Dataset access/auth failures from the Hub.

Actions:


huggingface-cli login
# optionally for non-interactive runs
export HF_TOKEN=<token>

No episodes found

Symptom:

No episodes found for LeRobot dataset "<repo_id>"

Actions:

Verify parameters.dataset.repo_id exists and contains episodes.
Confirm access permissions for the dataset repository.

Invalid `delta_timestamps` shape

Symptom:

"parameters.dataset.delta_timestamps" must be a mapping of key -> list[float].

Action:

Use mapping syntax, for example:


[parameters.dataset]
delta_timestamps = { observation_image = [-0.2, -0.1, 0.0] }
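The shape named in the error message, a mapping of key to list of floats, can be checked ahead of time. This is a minimal sketch of such a check, not the validation Plato performs: the function name is hypothetical, and accepting integers alongside floats is an assumption.

```python
from collections.abc import Mapping
from numbers import Real


def is_valid_delta_timestamps(value: object) -> bool:
    """Return True when value looks like a mapping of str -> list of numbers."""
    if not isinstance(value, Mapping):
        return False
    for key, offsets in value.items():
        if not isinstance(key, str) or not isinstance(offsets, list):
            return False
        # bool is a subclass of int, so exclude it explicitly.
        if not all(isinstance(x, Real) and not isinstance(x, bool) for x in offsets):
            return False
    return True


print(is_valid_delta_timestamps({"observation_image": [-0.2, -0.1, 0.0]}))  # True
print(is_valid_delta_timestamps([-0.2, -0.1, 0.0]))  # False: a bare list, not a mapping
print(is_valid_delta_timestamps({"observation_image": -0.2}))  # False: value is not a list
```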

Device/precision mismatch or OOM

Symptoms:

CUDA/MPS initialization failures.
Out-of-memory during training.

Actions:

Start from configs/LeRobot/smolvla_single_client_smoke.toml (CPU, tiny batch).
Reduce trainer.batch_size.
Use parameters.policy.device = "cpu" for smoke checks.
Move to cuda + higher batch sizes only after smoke passes.
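The CPU-first advice above can be expressed as a small fallback helper for launch scripts. This is a sketch under stated assumptions: `pick_device` is a hypothetical name, and whether Plato itself falls back to CPU when a backend is unavailable is not stated in this runbook. The availability flags are passed in so the helper does not depend on PyTorch being installed.

```python
def pick_device(requested: str, cuda_available: bool, mps_available: bool) -> str:
    """Return the requested device if its backend is usable, else "cpu"."""
    available = {"cpu": True, "cuda": cuda_available, "mps": mps_available}
    if requested not in available:
        raise ValueError(f"unknown device {requested!r}; expected cpu, cuda, or mps")
    return requested if available[requested] else "cpu"


print(pick_device("cuda", cuda_available=False, mps_available=False))  # cpu
print(pick_device("mps", cuda_available=False, mps_available=True))  # mps
```

In an environment with PyTorch, the flags can come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`; the result would then be written to parameters.policy.device.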

FFmpeg / build issues in robotics stack

Symptom:

Build/runtime errors mentioning FFmpeg or PyAV dependencies.

Actions:

Install the host FFmpeg development libraries and a build toolchain (for example cmake and build-essential), then reinstall the LeRobot / SmolVLA robotics stack in that environment.

SmolVLA + LeRobot Optional Setup

This setup path is optional. Core Plato federated workloads continue to use the default dependency set from uv sync.

Install the robotics stack in a separate environment

Keep LeRobot / SmolVLA dependencies out of the default Plato environment unless you are actively working on robotics workloads. The only hard requirement for Plato's LeRobot path is that import lerobot succeeds in the environment where you launch the training run.

Environment gating

When adding LeRobot-backed modules, keep imports guarded so that non-robotics environments fail with a clear, actionable message instead of an opaque crash at import time.


try:
    import lerobot
except ImportError as exc:
    raise ImportError(
        "LeRobot support is optional. Install the LeRobot / SmolVLA robotics stack in the active environment before using LeRobot configs."
    ) from exc

Runtime notes for SmolVLA/LeRobot

CUDA-capable GPUs are recommended for practical SmolVLA fine-tuning; CPU is mainly suitable for smoke checks.
Install ffmpeg on hosts that read video-backed LeRobot datasets.
Authenticate with Hugging Face (huggingface-cli login) when accessing private dataset repositories.
LeRobot currently constrains the Torch stack used by this optional path; if you need different Torch constraints for non-robotics research, keep a separate virtual environment.

Quick verification


uv run python -c "import lerobot; print(lerobot.__version__)"
