Tutorial 60

TauEnergy Workbench: Replayable Training

Learn how the TauEnergy workbench turns Tau-checked formulas into measured route labels, trains an advisory energy ranker, and publishes a license-safe replayable demo.

View source Built 2026-05-20

This tutorial explains the replayable TauEnergy workbench.

The goal is narrow:

learn a better order for checking Tau optimizer routes

Tau, deterministic route certificates, or fallback checks determine route validity, rather than the learned model.

Scope

This tutorial explains an independent research demo in the TauLang-Experiments repository. It does not represent an official Tau feature, a production optimizer, or a license grant for Tau Language. The public demo publishes metrics only. Readers must obtain Tau from the official source before replaying the experiment.

1. The simple picture

Imagine a solver with several possible routes:

default route
read-once structural route
small truth-table route
Tseitin SAT cross-check route
ordered-BDD route
Tau fallback route

Several routes may be semantically safe for a given formula. One route may be faster on a read-once formula. Another may be better on an ordered-BDD-shaped formula. Fallback may be safest when the formula is outside the supported fragment.

TauEnergy learns this kind of preference:

For this checked formula shape, which route should be checked first?

The workbench keeps a separate authority rule:

low energy + checker rejects -> reject
low energy + checker accepts -> candidate route may be used

The score determines search order without affecting validity.

2. The roles

The workbench has four roles.

LLM interface

The language model can explain the experiment, propose a candidate route family, or help turn a question into structured work.

TauEnergy

TauEnergy is an energy ranker. It scores structured route candidates. Lower energy means “check this earlier.”

WES-style schedule

WES means the checker order is treated as a search schedule. If the ranker is good, the schedule reaches the best valid route faster.

Tau and route certificates

Tau checks syntax and formula status. Route-specific deterministic checks compare candidate route results against Tau status or a certificate.

3. The training pipeline

The measured training pipeline is:

generate or load Tau-shaped formula
-> run Tau syntax/status check
-> run candidate route checks
-> keep routes whose result matches Tau
-> label the fastest valid route
-> train TauEnergy to rank that route first
-> evaluate on held-out formulas

The label is measured from checked routes rather than manual estimation.

The important fields in a report are:

failed_check_count
invalid_accept_count
fitted_eval_test.top1_oracle_route_rate
hand_eval_test.top1_oracle_route_rate
mean_calls_to_oracle
family_holdout

The first two fields are gates. If either one is nonzero, the report does not support the tutorial claim.

4. The replay command

The experiment repository has a license-aware replay wrapper:

./scripts/run_tau_energy_training_demo.sh --accept-tau-license --quick

The larger replay is:

./scripts/run_tau_energy_training_demo.sh --accept-tau-license --full

That wrapper:

obtains Tau from the official IDNI repository,
applies the local research patches,
builds Tau locally,
runs the optimizer workbench,
trains the measured route ranker,
runs stress checks,
runs the ordered-BDD curriculum,
builds a public metrics-only snapshot.

The generated local reports live under:

results/local/tau-energy/

The public website reads a smaller JSON summary:

docs/assets/tau-energy-demo-summary.json

That summary is designed for publication and excludes Tau source, Tau binaries, full formulas, local paths, or full receipts.

5. The current result

The current public snapshot reports:

genuine optimization route: indexed_impacted_factor_solve
solver-call reduction: 8.0x
measured training top-1: 0.921569
ordered-BDD best top-1: 0.9375
invalid accepts: 0

This bounded workbench result indicates that the route ranker learned useful ordering on the checked corpus, without recommending that upstream Tau enable the route by default.

6. Why ordered-BDD needed its own curriculum

The stress report exposed a weak family:

ordered-BDD holdout

This indicates that a model trained on other families did not reliably rank the ordered-BDD route first when the entire ordered-BDD family was held out.

The targeted curriculum then asked:

How many measured ordered-BDD examples improve held-out ordered-BDD routing?

The result:

zero targeted BDD examples: 0.90625 top-1
32 targeted BDD examples: 0.9375 top-1
invalid accepts: 0

This is a useful lesson about training data. The next examples should target route-family weakness first. Adding generic formulas can come after the weak family is covered.

7. What can Tau developers do with this?

Tau developers can use the workbench as a route-design loop:

propose a route family
-> define its semantic guard
-> build deterministic checks or certificates
-> generate Tau-checked examples
-> measure route profit
-> train advisory ordering
-> stress by seed and formula family
-> promote only after the checker gates pass

This is useful for:

finding fragment-specific optimizations,
discovering where a hand-coded heuristic is too broad,
detecting route families that need more examples,
comparing fallback, hand rules, and learned routing,
turning failed route predictions into new training curricula.

The workbench is also useful when it recommends against promotion. Negative route telemetry can block a tempting optimization before it becomes a default behavior.

8. Assumptions and stress tests

Assumption A, formulas are representative

The synthetic formulas and small real-spec examples may not represent all Tau workloads.

Stress test:

hold out whole formula families
add more real Tau benchmark commands
track weakest-family top-1

Assumption B, measured labels are stable

A route can look profitable on one machine or Tau version and regress later.

Stress test:

record Tau grammar hashes
rerun after Tau syntax or solver changes
compare reports across seeds and source corpora

Assumption C, ranker features are sufficient

The current ranker is small. It can miss a route family if the features do not expose the relevant shape.

Stress test:

mine misranked examples
add fragment-specific features
compare against hand rules and fallback

9. What to remember

TauEnergy is useful because it turns route choice into a checked learning problem:

measure first
learn route order second
verify always
fallback on failure

The model reduces search cost without validating routes.