TensorBoard Integration

WorldForge ships an optional TensorBoard bridge for inspecting the LeWorldModel checkpoint used during local inference in the robotics showcase. It writes sanitized provenance text, latency and cost scalars, a histogram of candidate costs, and a per-event text feed to a local tfevents log directory.

The integration is host-owned and local-only:

  • TensorBoard is not a provider capability and the catalog never advertises it.
  • The base WorldForge install does not depend on tensorboard or torch. The bridge module imports the SDK lazily so the integration is a no-op when the optional extra is absent.
  • tfevents files stay under the host-chosen log directory. WorldForge never uploads them to a hosted service.
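The lazy-import behaviour described above can be illustrated with a minimal sketch. The helper names `load_optional` and `make_writer` are hypothetical and are not part of WorldForge; the sketch only shows the general pattern of deferring an optional import and degrading to a no-op when the import fails.

```python
import importlib
from types import ModuleType
from typing import Optional


def load_optional(module_name: str) -> Optional[ModuleType]:
    """Import a module on demand; return None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def make_writer(log_dir: str):
    """Hypothetical helper: return a SummaryWriter, or None without a backend."""
    backend = load_optional("torch.utils.tensorboard")
    if backend is None:
        # No-op path: the optional extra is absent, so nothing is written.
        return None
    return backend.SummaryWriter(log_dir=log_dir)
```

Because the import happens inside the helper rather than at module import time, importing the host package never fails just because the extra is missing.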

What you see in TensorBoard

| Tag prefix | Plugin | What it shows |
| --- | --- | --- |
| worldforge/leworldmodel/checkpoint/* | Text, Scalars | LeWorldModel checkpoint path, policy, repo, revision, provenance JSON, and a created indicator. |
| worldforge/leworldmodel/robotics_showcase/* | Text | Task description, checkpoint display path, state directory, sanitized JSON of the full run summary, inputs, and health snapshot. |
| worldforge/leworldmodel/scores/* | Scalars, Histogram | Per-candidate costs, min/max/mean, best index, best score, and a histogram of the cost distribution. |
| worldforge/leworldmodel/metrics/* | Scalars | Score stats and end-to-end latency metrics from the summary. |
| worldforge/leworldmodel/events/<provider>/<operation>/* | Scalars, Text | Per-event attempt count, duration, status code, failure/retry flags, and the sanitized event message. |
| worldforge/leworldmodel/weights/* | Histogram | Optional per-parameter weight distributions when a host passes a state dict to log_state_dict_histograms. |
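As a rough illustration of the score statistics listed for the scores tags, the sketch below derives min/max/mean and a best index from a list of candidate costs. It assumes the best candidate is the one with the lowest cost; the function name `summarize_costs` and the returned keys are hypothetical, not the bridge's actual output.

```python
from statistics import fmean


def summarize_costs(costs):
    """Return min/max/mean plus the index and value of the lowest cost."""
    if not costs:
        raise ValueError("costs must not be empty")
    # Assumption: lower cost is better, so the best candidate minimizes cost.
    best_index = min(range(len(costs)), key=costs.__getitem__)
    return {
        "min": min(costs),
        "max": max(costs),
        "mean": fmean(costs),
        "best_index": best_index,
        "best_score": costs[best_index],
    }
```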

The bridge sanitizes secret-like key=value fragments and signed-URL query parameters before writing text panels, so tfevents files are safe to share in adoption stories or evidence bundles.
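A sanitization pass of this kind might look like the sketch below. The regular expression and the `<redacted>` marker are assumptions made for illustration; the key names are the ones this page lists under Boundaries, and the bridge's real patterns may differ.

```python
import re

# Key names taken from the Boundaries section; the pattern itself is hypothetical.
_SECRET_KEYS = ("api_key", "token", "secret", "signature")
_SECRET_RE = re.compile(
    r"\b(" + "|".join(_SECRET_KEYS) + r")=([^\s&#\"']+)",
    re.IGNORECASE,
)


def redact(text: str) -> str:
    """Replace the value of secret-like key=value fragments with a marker."""
    return _SECRET_RE.sub(lambda m: f"{m.group(1)}=<redacted>", text)
```

The value match stops at whitespace, `&`, `#`, or a quote, so the rest of a query string or log line is left intact.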

Install

uv add "worldforge-ai[tensorboard]"

For repository development:

uv sync --group dev --extra tensorboard

The extra installs tensorboard>=2.16,<3. Base WorldForge still depends only on httpx. If the host already provides torch.utils.tensorboard (for example, through an existing PyTorch install) the bridge uses that; tensorboardX is accepted as a fallback when both torch and tensorboard are absent.

Robotics showcase flags

scripts/robotics-showcase enables the bridge by default when an interactive showcase is requested:

| Flag | Default | Purpose |
| --- | --- | --- |
| --tensorboard | implied (unless --no-tensorboard / --health-only / --json-only) | Enable the writer with a default log dir under .worldforge/tensorboard/. |
| --tensorboard-logdir <path> | .worldforge/tensorboard | Override the destination directory. Relative paths resolve against the repo root. |
| --tensorboard-run-name <name> | timestamped name | Subdirectory under --tensorboard-logdir for this run. Pick a stable name to make repeated runs comparable in the same TensorBoard view. |
| --tensorboard-flush-secs <int> | 30 | Flush interval (seconds) for the SummaryWriter. |
| --no-tensorboard | off | Disable the wrapper's default TensorBoard recording. |

A typical inspection workflow:

# Record a real LeRobot + LeWorldModel run. --json-only suppresses the implied
# TensorBoard default, so pass --tensorboard explicitly.
scripts/robotics-showcase --tensorboard --json-only --no-tui --no-rerun

# Open the run alongside the published checkpoint.
uvx --from "tensorboard>=2.16,<3" tensorboard --logdir .worldforge/tensorboard

For an explicit destination:

scripts/robotics-showcase \
  --tensorboard-logdir .worldforge/tensorboard/pusht \
  --tensorboard-run-name baseline \
  --no-tui --no-rerun --json-only

The summary JSON gains a "tensorboard" block listing log_dir, run_name, flush_secs, and an events_written flag that mirrors what the wrapper would show under the Artifacts section of the visual report.
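A host script that consumes the summary could read this block along the following lines. The key names come from the description above; the value types, the example payload in the test, and the function name `tensorboard_log_dir` are assumptions for illustration.

```python
import json
from typing import Optional


def tensorboard_log_dir(summary_json: str) -> Optional[str]:
    """Return log_dir from the "tensorboard" block, or None when unusable."""
    summary = json.loads(summary_json)
    block = summary.get("tensorboard")
    # The block is absent when recording was disabled (e.g. --no-tensorboard).
    if not isinstance(block, dict) or not block.get("events_written"):
        return None
    return block.get("log_dir")
```

Returning `None` for both a missing block and an unset `events_written` flag lets callers skip launching TensorBoard when there is nothing to display.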

The Textual showcase report (worldforge.harness.tui.RoboticsShowcaseApp) surfaces the run with a RoboticsTensorBoardPane and a t keybinding. The keybinding launches uvx --from "tensorboard>=2.16,<3" --with "setuptools<81" tensorboard --logdir <path> --port 6006 in a detached subprocess. The setuptools<81 constraint keeps pkg_resources available for TensorBoard's import path: TensorBoard still imports pkg_resources at startup, and setuptools>=81 no longer ships that package. The shortcut is also visible in the footer alongside o for Rerun.

Because TensorBoard is a web server (not a GUI app like Rerun), the binding then runs a Textual background worker that polls localhost:6006 every ~0.5 s for up to ~60 s and only opens the browser to http://localhost:6006/ once the port responds. This tolerates a slow first-run uvx resolve, where TensorBoard can take longer than a couple of seconds to bind. The URL is surfaced in the pane and in the notification so headless or remote users can copy and paste it (or set up an SSH tunnel). If webbrowser cannot find a default browser, the binding emits a warning telling the user to visit the URL manually.
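The poll-then-open behaviour can be approximated with the standard library alone. This is a sketch of the general technique rather than the worker's actual code: the function names are hypothetical, and the defaults simply mirror the ~0.5 s interval and ~60 s window mentioned above.

```python
import socket
import time
import webbrowser


def wait_for_port(host: str, port: int, timeout: float = 60.0,
                  interval: float = 0.5) -> bool:
    """Return True once a TCP connection succeeds, False if the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Port not bound yet (or connection refused); retry after a pause.
            time.sleep(interval)
    return False


def open_when_ready(host: str = "localhost", port: int = 6006) -> bool:
    """Open the browser only after the server accepts connections."""
    if not wait_for_port(host, port):
        return False
    # webbrowser.open returns False when no browser could be launched.
    return webbrowser.open(f"http://{host}:{port}/")
```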

The TensorBoard subprocess's stdout and stderr are captured to tensorboard.stdout.log and tensorboard.stderr.log next to the run's events.out.tfevents.* files. When the server never comes up within the poll window, the binding emits an error notification pointing at the stderr log so the failure is debuggable rather than silent.
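Redirecting a child process's output to log files next to the run can be sketched as follows. The function name `launch_with_logs` is hypothetical and the real launcher may differ; only the log file names follow the description above. `start_new_session` detaches the child on POSIX systems.

```python
import subprocess
from pathlib import Path


def launch_with_logs(command: list[str], run_dir: Path,
                     stem: str = "tensorboard") -> subprocess.Popen:
    """Start command with stdout/stderr written to <stem>.stdout.log / .stderr.log."""
    run_dir.mkdir(parents=True, exist_ok=True)
    stdout_log = (run_dir / f"{stem}.stdout.log").open("wb")
    stderr_log = (run_dir / f"{stem}.stderr.log").open("wb")
    try:
        return subprocess.Popen(
            command,
            stdout=stdout_log,
            stderr=stderr_log,
            stdin=subprocess.DEVNULL,
            start_new_session=True,  # POSIX: run in a new session, detached
        )
    finally:
        # The child keeps its own copies of the descriptors, so the parent
        # can close its handles immediately.
        stdout_log.close()
        stderr_log.close()
```

Keeping stderr in a file means a failed startup leaves a trace that an error notification can point to.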

If the summary lacks a "tensorboard" block (for example when --no-tensorboard is passed), the pane is omitted, no subprocess is started, no browser is opened, and the keybinding emits a warning notification instead.

Non-interactive launcher (worldforge-open-tensorboard)

The same launch / poll / probe flow is available as a CLI for non-interactive use - validating the wiring in CI, opening TensorBoard from a shell without running the Textual report, or testing changes to the launch command:

uv run worldforge-open-tensorboard --logdir .worldforge/tensorboard/<run>

Common flags:

| Flag | Default | Purpose |
| --- | --- | --- |
| --logdir <path> | required | Run directory with events.out.tfevents.* files. |
| --port <int> | 6006 | TCP port to bind. |
| --host <host> | localhost | Host to poll and probe. |
| --ready-timeout <sec> | 60 | How long to wait for the port to bind. |
| --poll-interval <sec> | 0.5 | Seconds between TCP probes during the ready wait. |
| --probe | off | After ready, fetch http://host:port/ and assert the body contains the TensorBoard marker. Tears the subprocess down on success or failure. Useful for "did it actually work" smoke checks. |
| --no-browser | off | Skip webbrowser.open after the server is ready. |
| --keep-running | off (implied without --probe) | Block until the subprocess exits or SIGINT is received. |
| --shutdown-timeout <sec> | 5 | Seconds to wait for graceful subprocess teardown. |

Exit codes: 0 on success, 1 on ready timeout or probe failure, 2 on bad input (missing --logdir, non-positive timing, launch OSError).

Example smoke check:

mkdir -p /tmp/tb-smoke
uv run worldforge-open-tensorboard \
  --logdir /tmp/tb-smoke \
  --probe --no-browser \
  --ready-timeout 120
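In CI, the smoke check above could be wrapped in a small script that maps the documented exit codes to readable messages. The wrapper below is a sketch: `describe_exit` and `run_smoke_check` are hypothetical names, and only the flags and exit codes are taken from this page.

```python
import subprocess

# Meanings as documented in the exit-code list above.
EXIT_MEANINGS = {
    0: "success",
    1: "ready timeout or probe failure",
    2: "bad input",
}


def describe_exit(code: int) -> str:
    """Translate a launcher exit code into a short description."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_smoke_check(logdir: str, ready_timeout: int = 120) -> int:
    """Run the documented smoke check and return the launcher's exit code."""
    result = subprocess.run(
        ["uv", "run", "worldforge-open-tensorboard",
         "--logdir", logdir,
         "--probe", "--no-browser",
         "--ready-timeout", str(ready_timeout)],
        check=False,  # inspect the return code instead of raising
    )
    return result.returncode
```

A CI step could then call `describe_exit(run_smoke_check("/tmp/tb-smoke"))` and fail the job on any non-zero code.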

Programmatic surface

Hosts that already drive WorldForge directly can use the bridge without going through the showcase script. The public surface mirrors worldforge.rerun and lives in worldforge.tensorboard:

from worldforge.tensorboard import (
    TensorBoardCheckpointInspector,
    TensorBoardLogConfig,
    TensorBoardSession,
    create_tensorboard_inspector,
)

inspector = create_tensorboard_inspector(
    log_dir=".worldforge/tensorboard/checkpoint-inspection",
    run_name="baseline",
)
inspector.log_checkpoint_summary(
    {
        "output": "/path/to/pusht/lewm_object.ckpt",
        "policy": "pusht/lewm",
        "repo_id": "galilai/lewm-pusht",
        "revision": "<commit-sha>",
    }
)
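# Example candidate costs; in a real run these come from the scoring step.
scores = [0.42, 0.17, 0.63]
inspector.log_score_distribution(scores, best_index=1, best_score=scores[1])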
inspector.log_metrics({"plan_latency_ms": 25.0, "total_latency_ms": 30.0})

# Optional: surface per-parameter weight histograms when the host has the live
# checkpoint loaded (``model`` is the host's own loaded module, not defined
# here). Tensor-like values (``detach``/``cpu``/``numpy``) are flattened
# automatically.
inspector.log_state_dict_histograms(model.state_dict())

inspector.flush()
inspector.close()

The session is initialized lazily, so constructing an inspector is free of side effects until the first log_* call. close() is idempotent.
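The two properties named here, lazy initialization and an idempotent close(), follow a common pattern that can be sketched independently of WorldForge. The `LazyWriter` class is hypothetical; in particular, raising on use after close is a choice made for this sketch and not a documented behaviour of the bridge.

```python
class LazyWriter:
    """Builds the underlying writer on first use; close() is safe to repeat."""

    def __init__(self, factory):
        self._factory = factory  # zero-argument callable that builds the writer
        self._writer = None
        self._closed = False

    def _ensure(self):
        if self._closed:
            raise RuntimeError("writer is closed")  # sketch-specific choice
        if self._writer is None:
            self._writer = self._factory()  # first log call pays the setup cost
        return self._writer

    def add_scalar(self, tag, value, step=0):
        self._ensure().add_scalar(tag, value, step)

    def close(self):
        if self._closed:
            return  # idempotent: later calls do nothing
        self._closed = True
        if self._writer is not None:
            self._writer.close()
            self._writer = None
```

Constructing the object touches neither the filesystem nor any backend, which is what makes it cheap to create an inspector that may never log.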

Boundaries

  • The bridge never advertises predict, embed, plan, score, or policy capabilities. It is purely observability.
  • The bridge never imports torch on its own. Weight histograms require the host to pass tensor-like objects through log_state_dict_histograms - appropriate when the LeWorldModel runtime is already loaded host-side.
  • The bridge never reads .env or any other secret file. Provider event messages are sanitized in worldforge.models before they reach the bridge; the bridge then applies a second pass that strips api_key=, token=, secret=, signature=, and signed-URL ?signature= / ?token= query fragments before writing text panels.
  • tfevents files stay local. If you want to share a TensorBoard run as release evidence, archive the log directory the same way you archive any other showcase artifact.
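The torch-free handling of tensor-like values mentioned in the boundaries above relies on duck typing. A minimal sketch of that idea follows, assuming the values expose some subset of detach/cpu/numpy; the function name `to_flat_list` is hypothetical and the bridge's own flattening may work differently.

```python
def to_flat_list(value):
    """Flatten a tensor-like or nested sequence into a flat list of floats."""
    # Duck-typed: call detach()/cpu()/numpy() only when the object offers them,
    # so torch never has to be imported here.
    for method in ("detach", "cpu", "numpy"):
        fn = getattr(value, method, None)
        if callable(fn):
            value = fn()
    if hasattr(value, "tolist"):
        value = value.tolist()  # array-likes become nested Python lists
    if isinstance(value, (int, float)):
        return [float(value)]
    flat = []
    for item in value:
        flat.extend(to_flat_list(item))
    return flat
```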