Grammar Witness Search Lab

Compare raw byte mutation against grammar-guided generation on a tiny arithmetic language. Watch parser acceptance, semantic execution, and bug discovery diverge under the same search budget.

raw: find any byte string that triggers Bad   |   grammar: find a valid expression that triggers Bad

What this lab shows

Both lanes use the same budget and the same bad-state predicate. The only difference is the witness language. One lane spends budget on arbitrary strings. The other stays inside the arithmetic grammar.

Try this first

Use the default seed and click Run both strategies. Then compare parser acceptance, semantic execution, and bug witnesses. After that, change the seed and see whether the grammar lane still keeps more of the budget alive.

Start here: run both strategies with the same seed to compare how much of the budget survives the parser boundary.

Tiny language and target

Expr ::= Num
       | (Expr + Expr)
       | (Expr / Expr)
Num  ::= 0 | 1 | 2 | 3
Bad(x) holds if:
1. x parses under the grammar
2. x reaches semantic evaluation
3. evaluation attempts division by zero
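
As a rough illustration of the three conditions above, here is a minimal Python sketch of a recursive-descent parser, an evaluator, and the Bad predicate. It is not the lab's own implementation. The names `parse`, `evaluate`, `bad`, and `ParseError` are hypothetical, and the sketch assumes that the concrete syntax has no whitespace (so `(1/0)`, not `(1 / 0)`) and that `/` is integer division.

```python
class ParseError(Exception):
    """Raised when the input is not a valid Expr (hypothetical name)."""


def _expr(text, pos):
    """Parse one Expr starting at pos; return (ast, next_pos)."""
    if pos >= len(text):
        raise ParseError("unexpected end of input")
    ch = text[pos]
    if ch in "0123":                       # Num ::= 0 | 1 | 2 | 3
        return int(ch), pos + 1
    if ch == "(":                          # (Expr + Expr) or (Expr / Expr)
        left, pos = _expr(text, pos + 1)
        if pos >= len(text) or text[pos] not in "+/":
            raise ParseError("expected '+' or '/'")
        op = text[pos]
        right, pos = _expr(text, pos + 1)
        if pos >= len(text) or text[pos] != ")":
            raise ParseError("expected ')'")
        return (op, left, right), pos + 1
    raise ParseError(f"unexpected character {ch!r}")


def parse(text):
    """Return the AST for text, or raise ParseError (condition 1)."""
    tree, end = _expr(text, 0)
    if end != len(text):
        raise ParseError("trailing input after expression")
    return tree


def evaluate(tree):
    """Evaluate an AST (condition 2). Assumption: '/' is integer division."""
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    a, b = evaluate(left), evaluate(right)
    if op == "+":
        return a + b
    return a // b                          # raises ZeroDivisionError if b == 0


def bad(text):
    """True only if text parses, is evaluated, and divides by zero."""
    try:
        tree = parse(text)
    except ParseError:
        return False                       # rejected at the parser boundary
    try:
        evaluate(tree)
    except ZeroDivisionError:
        return True                        # condition 3: division by zero
    return False
```

Under these assumptions, `bad("(1/0)")` and `bad("(1/(0+0))")` hold, while `bad("(1+2)")` and the malformed `bad("(1/")` do not.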

Comparison snapshot

Run both strategies first for the clearest comparison. Single-lane runs fill only one side.

Parser accepted:   Raw | Grammar
Division reached:  Raw | Grammar
Bug witnesses:     Raw | Grammar

Recent samples from last run

Last comparison

Raw-byte search

not run yet

Grammar search

not run yet

The grammar lane does not win by proving more. It is stronger here because it spends less of the budget on parser rejection and more of it in the semantic region. Even when random strings are built only from grammar terminals, random concatenation still wastes a large share of the budget on malformed strings.
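
The budget argument can be sketched in a few lines of Python. This is a hedged illustration rather than the lab's actual code: `raw_sample`, `grammar_sample`, `lane`, the 0.4 leaf probability, the depth bound, and the maximum raw length are all assumptions chosen for the example. The raw lane here concatenates random grammar terminals, the grammar lane derives strings from the productions, and both are counted against the same three metrics.

```python
import random

TERMINALS = "0123()+/"   # assumption: no whitespace in the concrete syntax


def raw_sample(rng, max_len=9):
    """Raw lane: random concatenation of grammar terminals."""
    return "".join(rng.choice(TERMINALS) for _ in range(rng.randint(1, max_len)))


def grammar_sample(rng, depth=3):
    """Grammar lane: derive a string from Expr, bounded by depth."""
    if depth == 0 or rng.random() < 0.4:
        return rng.choice("0123")
    op = rng.choice("+/")
    return f"({grammar_sample(rng, depth - 1)}{op}{grammar_sample(rng, depth - 1)})"


def parse(text):
    """Return an AST for a valid Expr, or None if the parser rejects it."""
    def expr(pos):
        if pos < len(text) and text[pos] in "0123":
            return int(text[pos]), pos + 1
        if pos < len(text) and text[pos] == "(":
            left, pos = expr(pos + 1)
            if pos >= len(text) or text[pos] not in "+/":
                raise ValueError("expected operator")
            op = text[pos]
            right, pos = expr(pos + 1)
            if pos >= len(text) or text[pos] != ")":
                raise ValueError("expected ')'")
            return (op, left, right), pos + 1
        raise ValueError("expected Num or '('")
    try:
        tree, end = expr(0)
    except ValueError:
        return None
    return tree if end == len(text) else None


def run(tree, stats):
    """Evaluate tree, recording whether any division was executed."""
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    a, b = run(left, stats), run(right, stats)
    if op == "+":
        return a + b
    stats["division"] = True
    return a // b        # ZeroDivisionError marks a bug witness


def lane(sample, budget, seed):
    """Spend budget samples; count accepted, division reached, and bugs."""
    rng = random.Random(seed)
    counts = {"accepted": 0, "division": 0, "bugs": 0}
    for _ in range(budget):
        tree = parse(sample(rng))
        if tree is None:
            continue     # budget lost at the parser boundary
        counts["accepted"] += 1
        stats = {"division": False}
        try:
            run(tree, stats)
        except ZeroDivisionError:
            counts["bugs"] += 1
        counts["division"] += stats["division"]
    return counts
```

A usage example with the same budget and seed for both lanes:

```python
print("raw    ", lane(raw_sample, budget=2000, seed=1))
print("grammar", lane(grammar_sample, budget=2000, seed=1))
```

In this sketch every grammar sample is accepted by construction, so the whole grammar budget reaches evaluation, while the raw lane loses every sample that fails to parse.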

Detailed log