Research log

Energy-Based Optimizers and Telemetry

Evidence notes for using energy-based models as advisory optimizers in Tau and ZenoDEX: synthetic EBM training, real telemetry, ZenoEnergy verifier-call reduction, and fast-router timing.

View source Built 2026-06-29

Abstract

This note records the evidence used for Tutorial 59. The question is whether energy-based models can act as efficient optimizers for structured formal-methods workloads.

The current answer is conditional. For bounded candidate ranking, the evidence is strong enough to justify a research pattern:

generate candidates
rank by energy
verify deterministically
fall back if ranking misses

It is not a production authority claim. The model may change candidate order. It must not decide validity.

1. Assumptions

Assumption A, bounded candidate set

The useful EBM setting is finite:

C(x) = finite candidates for context x

If candidate generation is incomplete, the EBM can only rank the emitted candidates. It cannot recover candidates that were never generated.

Assumption B, verifier authority

The EBM is advisory. Validity comes from a semantic guard, replay, Tau parity check, certificate, or deterministic verifier.

The safety rule is:

LowEnergy(x,c) and VerifierRejects(x,c) -> reject

Assumption C, telemetry labels

Training labels must come from measured or verified outcomes. For Tau route optimization, the labels are route-profit telemetry. For ZenoEnergy v0, the labels are synthetic candidate rows checked by the UPBA v2 verifier.

Synthetic data can test method viability. It cannot prove production distribution performance by itself.

2. External Framing

Logical Intelligence’s article Energy-Based Models for Reasoning, LLMs for the Interface frames energy-based reasoning as scoring intermediate states against constraints and objectives.

The transfer used here is narrower:

intermediate reasoning state -> candidate route or settlement certificate
energy -> advisory priority
constraint -> semantic guard or verifier obligation
objective -> speed, verifier-call reduction, or settlement objective

This keeps the model useful without giving it authority over formal validity.

3. Tau Route-Telemetry Evidence

The Tau route-telemetry chain has four experiments:

Cycle	Corpus	Result	Boundary
v827	synthetic Tau-like route costs	guarded EBM beat hand rule	synthetic oracle only
v828	synthetic stress test	result survived small data and noisy labels	same synthetic oracle
v829	small real route-profit audit	EBM lost to baseline	too little real data
v830	expanded direct real receipts	EBM beat baseline on aggregate elapsed time	still regression-prone

The synthetic result was strong:

Policy	Accuracy	Invalid route rate	Mean regret
hand rule, iid	`0.830333`	`0.0`	`0.125538`
unguarded EBM, iid	`0.926222`	`0.025667`	`1142.106698`
guarded EBM, iid	`0.949222`	`0.0`	`0.012427`

The unguarded invalid-route rate is the key safety result. The energy score needs a support mask.

The expanded real telemetry run had:

Policy	Mean elapsed	Delta vs baseline	Mean regret	Regressions enabled
always baseline	`1534.561878 ms`	`0.0%`	`0.030701`	`0`
case-held-out EBM	`1517.503276 ms`	`+1.111627%`	`0.028217`	`6`

Interpretation:

telemetry improved the model
regression risk still blocks promotion
semantic support remains outside the EBM

4. ZenoEnergy v0 Evidence

ZenoEnergy v0 studies UPBA v2 partial-fill exact-in settlement candidate search. It trains a small CPU-only energy ranker:

features: 96
weights: 96
bias: 1
parameters: 97
model: linear pairwise-hinge energy scorer

The model ranks candidates before deterministic verifier checking. It does not authorize settlement.

The main reported holdout result:

Order	Winner-bearing batches	Mean verifier calls	Top-1	Top-5	Top-10	Invalid accepts
exhaustive	`1,983`	`19.99`	`0.000`	`0.000`	`0.000`	`0`
random	`1,983`	`10.21`	`0.048`	`0.258`	`0.527`	`0`
hand energy	`1,983`	`1.36`	`0.763`	`0.996`	`1.000`	`0`
gap-weighted learned	`1,983`	`1.017`	`0.983`	`1.000`	`1.000`	`0`

The reported cross-seed stress preserved the same pattern:

Order	Configs	Top-1 mean	Top-1 min	Top-5 mean	Top-10 min	Mean calls	p99 max	Invalid accepts
hand energy	`9`	`0.782`	`0.752`	`0.996`	`1.000`	`1.326`	`5`	`0`
learned linear	`9`	`0.982`	`0.964`	`0.999`	`1.000`	`1.026`	`2`	`0`

The current preferred gap-weighted model audit reports:

parameters: 97
feature_dim: 96
nonzero_weight_count: 38
reserved_nonzero_count: 0
forbidden_feature_names: none

The largest positive weights are verifier-shaped penalties. The largest negative weights reward executed volume and surplus.

5. Local Replays

A small local replay of the ZenoEnergy benchmark used 200 synthetic batches with 20 requested candidates per batch. It produced 198 winner-bearing batches.

Order	Mean verifier calls	Top-1	Top-5	Top-10	Invalid accepts	Permutation violations
exhaustive	`19.995`	`0.000`	`0.000`	`0.000`	`0`	`0`
random	`10.621`	`0.040`	`0.237`	`0.480`	`0`	`0`
hand energy	`1.359`	`0.778`	`0.990`	`1.000`	`0`	`0`
learned energy	`1.040`	`0.960`	`1.000`	`1.000`	`0`	`0`
hard-barrier hybrid	`1.040`	`0.960`	`1.000`	`1.000`	`0`	`0`

Focused ZenoEnergy tests passed:

24 passed

This replay is a sanity check, not a replacement for the larger reported receipt.

6. ZenoDEX Fast-Router Evidence

The ZenoDEX fast quote router is not presented here as the trained ZenoEnergy model. It is evidence for the same optimizer architecture:

cheap scalar score
bounded top-k frontier
exact integer replay
receipt verification
fallback

A local timing probe compared exhaustive routing to fast routing on synthetic CPMM markets:

Case	Two-hop pairs	Exact-in exhaustive	Fast exact-in	Speedup	Exact-out exhaustive	Fast exact-out	Speedup
small	`640`	`113.703 ms`	`12.410 ms`	`9.16x`	`102.402 ms`	`90.893 ms`	`1.13x`
medium	`2,500`	`141.357 ms`	`14.630 ms`	`9.66x`	`142.738 ms`	`92.290 ms`	`1.55x`
large	`9,000`	`251.964 ms`	`20.675 ms`	`12.19x`	`293.650 ms`	`99.290 ms`	`2.96x`

The fast route matched the exhaustive route’s final amount in those measured cases. The focused fast-router tests passed:

8 passed

Scope:

tested synthetic markets only
heuristic top-k routing only
receipt verification still required

7. LLM Comparison Boundary

This note does not claim a direct benchmark against a specific LLM API. LLM latency, price, and model behavior vary by provider, context length, and deployment.

The supported comparison is structural:

Task	Small EBM	LLM
Rank fixed numeric candidates	direct scalar scoring	requires serialization, generation, parsing, and verification
Preserve deterministic validity	no, verifier required	no, verifier required
Explain a result	limited	strong
Translate natural language into a formal request	limited	strong
Inner-loop top-k verifier scheduling	strong fit	usually inefficient

The useful system design is:

LLM for interface and proposal
EBM for cheap structured ranking
formal verifier for acceptance

8. Promotion Gate

A production candidate would need:

Exact candidate-domain definition or a certified generation boundary.
Proof or test evidence that ranking preserves candidate-set membership.
Deterministic fallback or a checked-stop certificate.
Zero invalid accepts under adversarial low-energy tests.
Held-out telemetry or replayed real workload evidence.
Regression budget stated before promotion.
Public replay receipts that rebuild without private data.

Current status:

Tau EBM: telemetry-positive, regression-prone, not promoted
ZenoEnergy v0: strong synthetic verifier-call reduction, research-only
ZenoDEX fast router: useful bounded timing signal, advisory heuristic

9. Tutorial Link

Beginner-facing explanation: Energy-Based Models as Fast Optimizers.