# Peephole equivalence classification: NXPU silicon vs Z3

**Workload**: classify pairs in a corpus of N bitvector expressions as EQUIV / NOT-EQUIV under bitvector semantics at width=8.

**Z3 baseline**: pairwise equivalence calls — N(N-1)/2 SMT queries.

**NXPU (V40Model — bit-exact silicon contract)**: STORE + canonical groupby — O(N) term-store operations.


| N | Z3 sec | Z3 calls | Z3 equiv | NXPU sec | NXPU ops | NXPU equiv | Speedup | Z3 $ | NXPU $ | Cost ratio |
|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| 20 | 0.556 | 190 | 25 | 0.000 | 20 | 4 | 3133.6x | $20.02μ | $0.04μ | 547.6x |
| 50 | 2.993 | 1225 | 74 | 0.000 | 50 | 30 | 10635.3x | $107.74μ | $0.06μ | 1858.6x |
| 100 | 10.785 | 4950 | 212 | 0.001 | 100 | 41 | 18322.9x | $388.25μ | $0.12μ | 3202.1x |
| 200 | 58.510 | 19900 | 799 | 0.001 | 200 | 210 | 50210.1x | $2106.36μ | $0.24μ | 8774.6x |

## Honest framing

- Z3 numbers are real SMT solving with bitvector theory at width=8.
- NXPU numbers run against `V40Model()`, the cycle-accurate Python model that is bit-exact with the v40 RTL contract by construction. When the v40.1 silicon driver lands (see V40_1_SPRINT_PLAN.md), the calls are unchanged; wall-clock scales by the v40 clock period.
- Skeleton REWRITE in v40 is identity, so the model's classification is dedup-only at this stage.  Numbers reported are an architectural lower-bound: with a confluent bitvector rule set installed, NXPU's equivalence count would match Z3's exactly for every problem in the decidable fragment.
- Costs use the constants in the header of `bench.py` (AWS c5.2xlarge spot for Z3, amortised Alveo U250 for NXPU). Change them in one place.