Contributing
WorldForge contributions should keep code, tests, docs, and agent context in sync.
uv sync --group dev
uv lock --check
uv run ruff check src tests examples scripts
uv run ruff format --check src tests examples scripts
uv run python scripts/generate_provider_docs.py --check
uv run python scripts/check_docs_commands.py
uv run python scripts/check_docs_snippets.py
uv run python scripts/check_wrapper_portability.py
uv run python scripts/check_optional_import_boundaries.py
uv run python scripts/check_core_performance.py
uv run python scripts/generate_quality_dashboard.py
uv run mkdocs build --strict
uv run pytest
uv run --extra harness pytest --cov=src/worldforge --cov-report=term-missing --cov-fail-under=90
bash scripts/test_package.sh
uv build --out-dir dist --clear --no-build-logs
Before tagging a release or publishing a package, also run the locked dependency audit from Operations. If setup fails before the gate starts, run:
uv run python scripts/contributor_doctor.py --format markdown
uv run python scripts/contributor_doctor.py --format json
The contributor doctor checks Python 3.13, uv, source-tree shape, docs tooling, GitHub CLI auth status, and optional runtime skip reasons without installing dependencies, reading secrets, or assuming LeWorldModel, LeRobot, GR00T, or Rerun are present. Its Markdown output is safe to paste into public issues.
`uv run python scripts/generate_provider_docs.py --check`,
`uv run python scripts/check_docs_commands.py`, `uv run python scripts/check_docs_snippets.py`,
`uv run python scripts/manage_fixture_snapshots.py --format markdown`,
`uv run python scripts/check_wrapper_portability.py`,
`uv run python scripts/check_optional_import_boundaries.py`, and `uv run mkdocs build --strict`
verify generated provider docs, documented command drift, selected executable docs snippets,
fixture snapshot drift, wrapper portability, optional-runtime import boundaries, and the MkDocs
Material site in strict mode. `bash scripts/test_package.sh` checks the wheel/sdist contents before
installing the built wheel and running tests against the installed package. See
Artifact Integrity for the release artifact hashing, quality dashboard,
and evidence-linking contract.
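When iterating locally, it can help to run the gate commands in order and stop at the first failure. The following is a minimal sketch, not a script that ships with the repository: `GATE_COMMANDS`, `run_gate`, and the injectable `runner` argument are hypothetical names, and the command list is only a subset of the gate shown at the top of this page. It assumes `uv` is on `PATH`.

```python
import shlex
import subprocess
from typing import Callable, Iterable, Optional

# Subset of the documented gate commands, kept in the documented order.
GATE_COMMANDS = [
    "uv lock --check",
    "uv run ruff check src tests examples scripts",
    "uv run ruff format --check src tests examples scripts",
    "uv run python scripts/generate_provider_docs.py --check",
    "uv run python scripts/check_docs_commands.py",
    "uv run python scripts/check_docs_snippets.py",
    "uv run mkdocs build --strict",
    "uv run pytest",
]


def _run(argv: list[str]) -> int:
    """Run one command and return its exit code without raising."""
    return subprocess.run(argv, check=False).returncode


def run_gate(
    commands: Iterable[str] = GATE_COMMANDS,
    runner: Callable[[list[str]], int] = _run,
) -> Optional[str]:
    """Run commands in order; return the first failing command, or None."""
    for command in commands:
        if runner(shlex.split(command)) != 0:
            return command
    return None
```

Calling `run_gate()` with no arguments executes the real commands; passing a fake `runner` keeps the ordering logic testable without invoking `uv`.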
Docs snippet markers
Use snippet markers directly before fenced code blocks when a Python or JSON example is stable enough to gate:
<!-- worldforge-snippet: execute -->
```python
from worldforge import WorldForge
forge = WorldForge()
print(forge.providers())
```
Use `<!-- worldforge-snippet: parse -->` for JSON blocks. The gate parses generic JSON and applies
schema checks for selected scenario and benchmark examples. Use explicit skip markers instead of
leaving fragile examples unmarked:
- `<!-- worldforge-snippet: skip-host-owned -->` for optional runtimes, checkpoints, GPU hosts, or prepared robotics environments.
- `<!-- worldforge-snippet: skip-credentialed -->` for paid providers, private endpoints, or examples requiring secrets.
- `<!-- worldforge-snippet: skip-illustrative -->` for fragments with placeholders such as `...`, undefined host objects, or intentionally incomplete code.
Run `uv run python scripts/check_docs_snippets.py` before changing Python or JSON examples in the
Python API, scenarios, provider routing, external provider, benchmarking, artifact, or report docs.
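To make the "directly before" rule concrete, here is a rough sketch of how a checker could pair each marker with the fenced block that follows it. This is not the implementation of `scripts/check_docs_snippets.py`; `Snippet`, `find_marked_snippets`, and the regular expressions are assumptions made for illustration, and the sketch ignores edge cases such as nested or tilde fences.

```python
import re
from typing import NamedTuple

FENCE = "`" * 3  # built programmatically so this example nests inside a fence
MARKER = re.compile(r"<!--\s*worldforge-snippet:\s*([a-z-]+)\s*-->")
FENCE_OPEN = re.compile(r"^" + re.escape(FENCE) + r"(\w*)\s*$")


class Snippet(NamedTuple):
    mode: str      # e.g. "execute", "parse", "skip-illustrative"
    language: str  # fence info string, e.g. "python" or "json"
    code: str      # fenced body without the fence lines


def find_marked_snippets(markdown: str) -> list[Snippet]:
    """Return fenced blocks whose opening fence directly follows a marker line."""
    lines = markdown.splitlines()
    snippets = []
    i = 0
    while i < len(lines) - 1:
        marker = MARKER.fullmatch(lines[i].strip())
        fence = FENCE_OPEN.match(lines[i + 1]) if marker else None
        if marker and fence:
            body = []
            j = i + 2
            # Collect lines until the closing fence (or the end of the text).
            while j < len(lines) and lines[j].strip() != FENCE:
                body.append(lines[j])
                j += 1
            snippets.append(Snippet(marker.group(1), fence.group(1), "\n".join(body)))
            i = j + 1
        else:
            i += 1
    return snippets
```

A fenced block with no marker on the preceding line is simply not returned, which is the situation the skip markers are meant to make explicit.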
Deterministic artifact tests
Use worldforge.testing determinism helpers when tests compare exact artifact, report, or manifest
output:
from worldforge.testing import DeterministicIdFactory, stable_json_dumps, stable_snapshot
ids = DeterministicIdFactory()
snapshot = stable_snapshot(payload, path_roots={tmp_path: "<tmp>"})
assert stable_json_dumps(snapshot) == expected_json
Exact snapshots are useful for schema-versioned JSON artifacts, issue templates, rendered reports, and stable CLI text. Prefer semantic assertions for real latency or throughput measurements, host paths, current git metadata, live timestamps, optional runtime warning text, and values owned by a prepared external runtime. Do not globally monkeypatch clocks or randomness for host-owned smokes; pass deterministic clocks or explicit IDs into the test helper or renderer being tested.
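The last point, passing deterministic inputs instead of monkeypatching globally, can be sketched with the standard library alone. `render_manifest`, `fixed_clock`, and `sequential_ids` below are hypothetical stand-ins, not WorldForge APIs; the only assumption is that the renderer under test accepts its clock and ID source as arguments.

```python
import json
from datetime import datetime, timezone
from itertools import count
from typing import Callable, Iterator


def render_manifest(
    payload: dict,
    *,
    now: Callable[[], datetime],
    ids: Iterator[str],
) -> str:
    """Hypothetical renderer that receives its clock and ID source explicitly."""
    manifest = {
        "id": next(ids),
        "created_at": now().isoformat(),
        "payload": payload,
    }
    # Sorted keys and fixed separators keep the serialized text byte-stable.
    return json.dumps(manifest, sort_keys=True, separators=(",", ":"))


def fixed_clock() -> datetime:
    """Deterministic clock for tests; always returns the same UTC instant."""
    return datetime(2024, 1, 1, tzinfo=timezone.utc)


def sequential_ids(prefix: str = "run") -> Iterator[str]:
    """Deterministic ID source: run-0001, run-0002, ..."""
    return (f"{prefix}-{n:04d}" for n in count(1))


first = render_manifest({"steps": 3}, now=fixed_clock, ids=sequential_ids())
second = render_manifest({"steps": 3}, now=fixed_clock, ids=sequential_ids())
assert first == second  # identical inputs give identical artifact text
```

Because nothing global is patched, a host-owned smoke running in the same process still sees the real clock and real randomness.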
Before changing public imports, CLI flags, provider capabilities, or artifact schemas, classify the surface through Public API Stability and the Artifact Schemas ownership map. Stable and provisional surfaces need a deprecation or migration plan unless the change fixes a security exposure, false capability claim, or persisted-state incoherence.
Key directories:
- `src/worldforge/models.py`: public compatibility facade and model re-exports.
- `src/worldforge/_model_utils.py`: shared JSON-native validation helpers and framework errors.
- `src/worldforge/scene_models.py`: geometry, action, scene object, structured-goal, and history contracts.
- `src/worldforge/capability_results.py`: embedding, action-score, and embodied-policy result contracts.
- `src/worldforge/provider_models.py`: compatibility facade for provider-facing contracts.
- `src/worldforge/provider_profiles.py`, `provider_request_policy.py`, `provider_events.py`, `provider_diagnostics.py`, and `provider_redaction.py`: focused provider contracts for capabilities/profile metadata, retries/timeouts, events, lifecycle diagnostics, and sanitization.
- `src/worldforge/framework.py`: runtime facade, provider registry, persistence, and diagnostics.
- `src/worldforge/framework_capabilities.py`: internal capability-protocol registry and dispatch.
- `src/worldforge/_world.py`: mutable world state, history, and planning.
- `src/worldforge/_world_prompt_seeders.py`: prompt-derived local seed-scene helpers.
- `src/worldforge/harness/tui_styles.py`: Textual-free CSS constants for the robotics showcase report.
- `src/worldforge/providers/`: provider interfaces, catalog, adapters, and scaffolds.
- `src/worldforge/testing/`: reusable provider contract helpers, fixture loaders, runtime markers, and deterministic artifact test controls.
- `tests/fixtures/fixture-snapshots.json`: manifest of tracked JSON fixtures; update it with `uv run python scripts/manage_fixture_snapshots.py --write` after intentional fixture changes.
- `src/worldforge/evaluation/`: deterministic evaluation suites.
- `src/worldforge/benchmark.py`: provider benchmark harness.
- `src/worldforge/observability.py`: provider event sinks.
- `docs/src/`: user docs, architecture, playbooks, provider pages, and API notes.
- `tests/`: behavior and regression tests.
- `examples/`: runnable examples and compatibility wrappers.
- `scripts/`: docs generation, scaffolding, package validation, and optional smokes.
Provider work belongs in `src/worldforge/providers/`. Keep adapter capabilities honest and add
tests for every new supported path.
Before editing, pick the matching Contributor Task Starters entry. The starter packs list likely files, forbidden shortcuts, validation commands, evidence artifacts, docs/changelog expectations, and review checklist items for provider, docs-only, demo, artifact, evaluation, and CLI work.
For adapter packages and in-repo providers, use the reusable contract helper:
from worldforge.testing import assert_provider_contract
report = assert_provider_contract(provider)
print(report.exercised_operations)
Score-capable providers must pass provider-specific score fixtures:
report = assert_provider_contract(
provider,
score_info=score_fixture["info"],
score_action_candidates=score_fixture["action_candidates"],
)
Contributor Triage And Labels
Use labels to make an issue's roadmap stream and evidence contract clear before work starts.
| Axis | Labels | Use when |
|---|---|---|
| Roadmap stream | `stream: provider-evidence` | provider selection, runtime contracts, provider promotion, runtime manifests, upstream validation |
| Roadmap stream | `stream: evidence-integrity` | evals, benchmarks, budgets, preserved run evidence, release evidence, provenance, public claims |
| Roadmap stream | `stream: ops-authoring` | operator workflows, robotics showcase evidence, adapter authoring loops, reference hosts, persistence, runbooks |
| Capability | `predict`, `embed`, `score`, `policy` | the issue changes or validates that public capability surface |
| Severity | `severity: blocking`, `severity: quality`, `type: hardening` | release blockers, quality regressions, validation/redaction/recovery hardening |
| Release scope | `release`, `release: provider-hardening-rc` | release process or named release-candidate scope |
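For triage scripts or review checklists, the label axes above can be expressed as plain data. The sketch below is illustrative only: `LABEL_AXES`, `group_labels`, and `missing_stream` are hypothetical names, and treating an issue without a stream label as incomplete is an assumption drawn from the guidance above, not an enforced project rule.

```python
# Label axes copied from the table above.
LABEL_AXES = {
    "Roadmap stream": {
        "stream: provider-evidence",
        "stream: evidence-integrity",
        "stream: ops-authoring",
    },
    "Capability": {"predict", "embed", "score", "policy"},
    "Severity": {"severity: blocking", "severity: quality", "type: hardening"},
    "Release scope": {"release", "release: provider-hardening-rc"},
}


def group_labels(labels: list[str]) -> dict[str, list[str]]:
    """Group an issue's labels by axis; unknown labels land under 'Other'."""
    present = set(labels)
    grouped = {axis: sorted(known & present) for axis, known in LABEL_AXES.items()}
    all_known = set().union(*LABEL_AXES.values())
    grouped["Other"] = sorted(present - all_known)
    return grouped


def missing_stream(labels: list[str]) -> bool:
    """True when no roadmap stream label is present (assumed to need triage)."""
    return not group_labels(labels)["Roadmap stream"]
```

For example, `group_labels(["stream: provider-evidence", "provider", "score"])` places `provider` under `Other`, since it is a topic label rather than one of the four axes.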
Provider runtime issues should use the provider adapter template, `stream: provider-evidence`,
`provider`, the claimed capability labels, and any relevant `optional-dependency`, `robotics`,
`security`, or `research` labels. New runtime families or unclear upstream contracts need a
selection record before implementation. Provider promotion work must cite the provider authoring
guide promotion gate, runtime manifest, fixtures, docs, and live-smoke or explicit blocker
evidence.
Evaluation, benchmark, artifact, budget, report, or claim issues should use the eval/benchmark
template with `stream: evidence-integrity`. Release-candidate or public-claim issues need preserved
run evidence, an evidence bundle, or release evidence before closure.
Operator workflow issues should use `stream: ops-authoring` plus `operations`, `harness`,
`developer-experience`, `persistence`, `reliability`, or `examples` as appropriate. The issue should
name the command to run, expected success signal, first triage step, and recovery command.
Architecture, persistence-boundary, provider-selection, or runtime-ownership changes need a design record or selection record before broad implementation.
Security-sensitive reports still route through the private Security tab. Do not open public issues containing vulnerabilities, credentials, signed URLs, private endpoints, or host-local artifacts.
Before publishing a branch:
- run the full release gate from User And Operator Playbooks.
- update provider docs and generated catalog tables for provider behavior changes.
- update Python API for public API or exception changes.
- update Architecture for new flows or ownership boundaries.
- update Artifact Schemas for new or changed public artifact families.
- update Operations and Playbooks for new operator work.
- update `CHANGELOG.md` for user-visible changes.
- update `mkdocs.yml` when the docs navigation changes.
- update `AGENTS.md` for new commands, constraints, gotchas, or architecture facts.