Static HTML Reports¶
WorldForge can export evaluation reports, benchmark reports, run comparisons, and issue-ready bundles as self-contained, sanitized HTML documents. The HTML form is for cases where a single static file needs to land in an issue body, a Slack post, or a release-review email without forcing the reader to install WorldForge — JSON and Markdown are still preferred for machine consumption and code review.
Open-file workflow¶
# Eval report → HTML.
uv run worldforge eval --suite planning --provider mock --format html > report.html
# Benchmark report → HTML.
uv run worldforge benchmark --provider mock --operation predict \
--iterations 5 --format html > benchmark.html
# Comparison across two preserved runs → HTML.
uv run worldforge runs compare path/to/run_a path/to/run_b \
--format html --output review/compare.html
# Baseline-vs-candidate regression report → HTML.
uv run worldforge runs compare path/to/baseline path/to/candidate \
--mode regression --format html --output review/regression.html
# Issue-ready bundle for one preserved run.
uv run worldforge runs bundle <run-id> --workspace-dir .worldforge \
--format html > issue.html
Open the resulting file in any browser; it is self-contained — no
external CSS, no JavaScript, no network-loaded assets, no anchor links.
Drop it in an issue attachment, attach to a release-review email, or
preview alongside the JSON/Markdown counterparts under
<workspace>/issue-bundles/<run-id>/.
worldforge runs bundle always writes both summary.html and
issue.html next to the existing summary.md and issue.md files;
selecting --format html simply prints issue.html to stdout.
For a checkout-safe non-developer review package that combines evaluation, benchmark, world-diff, and issue-bundle evidence into one static artifact set:
uv run python scripts/demo_showcases.py run non-developer-evidence-review \
--workspace-dir .worldforge/demo-showcases
Attach the generated review-package.html, review-package.json, and
review-package.md only after checking the share_policy rows. Relative safe
artifacts are linked for local inspection; host-local paths, signed URLs, and
raw provider payloads are marked local-only or excluded and should not be
uploaded to issues or release review.
When to use HTML¶
- The audience is non-developer (a release-review reader, a partner) and needs a rendered table without installing tools.
- A single artifact has to be attached and viewed offline.
- The report needs to be shared in a context where Markdown rendering is unreliable (an internal wiki that strips tables, an email client).
When not to use HTML¶
- The reader is human, technical, and reading inside GitHub or the terminal — Markdown is shorter, diff-friendly, and renders the same tables.
- The reader is a script — JSON or CSV are stable contracts; HTML is a presentation format with a layout that may evolve between releases.
- The output needs to be diffed across runs in version control — Markdown is a stable text format; HTML wraps additional markup.
Limitations¶
- HTML output is presentation-only. The structure may change between WorldForge releases as renderers improve. Treat JSON as the durable contract; HTML is a render of a JSON payload.
- The renderer never emits
<a href="…">anchors. URLs that appear in manifest fields are escaped as plain text. This is deliberate — a malformed manifest cannot inject an exfiltration link or an XSS payload into a shared report. - The HTML reader will not download or include external assets even if
the source manifest references them. Use
worldforge runs bundleto package the safe artifacts alongside the HTML when sharing. - Unsafe artifacts (oversized files, suffixes outside the safe-attach
list, host-local paths) are listed but not embedded. The issue-bundle
HTML surfaces a warning banner when the bundle's
safe_to_attachflag isfalse.
Sanitization guarantees¶
- All user-supplied text (provider names, commands, failure summaries,
validation errors, run identifiers) is escaped via
html.escapewithquote=Truebefore insertion. A provider literally named<script>renders as<script>in the document. - The renderer emits
<script>tags in zero outputs. Tests assert this on every artifact kind. - The renderer emits
<a>tags in zero outputs. Tests assert this for the issue bundle path. - The HTML document declares
lang="en"and uses inline styles only; no<link rel="stylesheet">, no<script src="…">.
Public Python surface¶
The renderer functions are re-exported on the top-level package:
from worldforge import (
render_evaluation_html,
render_benchmark_html,
render_comparison_html,
render_evidence_bundle_html,
render_issue_bundle_html,
)
# Or via `EvaluationReport.to_html()` / `BenchmarkReport.to_html()`
# and the `artifacts()` dict on each report class.
HTML_REPORT_SCHEMA_VERSION is exposed so callers can branch on
renderer-version drift; today it is 1.
Renderer Extension Points¶
External suites and host applications can register safe renderers without patching WorldForge internals. A renderer declares its artifact family, output format, media type, supported schema ids, and whether its output is safe to attach or local-only. Registration accepts callable renderers only; WorldForge does not load renderer plugins from arbitrary files.
from worldforge import ReportRenderer, register_report_renderer, render_report_artifact
register_report_renderer(
ReportRenderer(
artifact_family="comparison",
output_format="summary",
media_type="text/plain",
supported_schemas=("comparison:2",),
safe_to_attach=True,
render=lambda payload: f"runs={payload['run_count']}",
)
)
artifact = render_report_artifact("comparison", "summary", {"run_count": 2})
assert artifact.safe_to_attach is True
Built-in renderer families include comparison, evidence-bundle, and
issue-bundle. Duplicate family/format registrations fail unless the caller
explicitly replaces a renderer. Safe-to-attach renderer output is validated for
secret-like material before it can be returned.