Dyber, Inc. — Reasoning Silicon

AI that reasons,
doesn’t guess.

The first AI chip that structurally cannot hallucinate. Every answer is provable, every confidence is calibrated, every rule is shown to generalize, and when there isn’t enough evidence the chip explicitly refuses to commit. Plus — the chip discovers the rules itself from your data. No training, no gradients, no model weights. Silicon-verified at 100 MHz on Xilinx xczu7ev across 34 testbenches. Open-source, deployable today.

SYS.ARCH // NXPU v9 NEUROSYMBOLIC REASONING PROCESSOR
SILICON // xczu7ev FPGA, 100 MHz, TIMING MET
SLACK // WNS +46 ps / WHS +10 ps (post-C.15)
UTILIZATION // 23.9% LUT, 17.9% FF (4× HEADROOM)
VERIFICATION // 34/34 TESTBENCHES PASS
REASONING // DEDUCTIVE + INDUCTIVE + PROBABILISTIC
PROOFS // EVERY DERIVATION CARRIES A RECEIPT
UNCERTAINTY // Q0.16 CONFIDENCE PROPAGATED NATIVELY
DISCOVERY // CHIP FINDS RULES FROM YOUR DATA
REFUSAL // "I DON'T KNOW" IS A FIRST-CLASS ANSWER
TRAINING // ZERO. HALLUCINATION // ZERO.
Structurally Cannot Hallucinate · Every Output Has a Proof Tree · Native Q0.16 Confidence Propagation · Chip Discovers Rules From Your Data · Train/Test Holdout for ILP · Open-World "I Don't Know" Answers · Refuses Low-Confidence Conclusions · Silicon-Verified at 100 MHz · 34/34 Testbenches PASS · Zero Training Required · Open Source on GitHub · Forward + Backward Chaining · Recursive Datalog · Native CORDIC sin/cos in Hardware · Aggregation + Top-K + Negation · Probabilistic Primitives (pmul/pnot/psum) · Inductive Logic Programming on Silicon · FDA-Friendly Clinical AI · SOX / GDPR / HIPAA Auditable · RTL IP + FPGA + Cloud + ASIC
001

Nine subsystems.
One chip.
Zero hallucination.

Not a GPU doing matrix math. Not an LLM guessing statistically. Purpose-built silicon for deterministic logical inference — with a complete compiler toolchain from NXLang source to hardware.

10 ns
CAM Query Latency (1 cycle)
100%
Accuracy (All Testbenches)
1.65 µJ
Energy per Derivation
34/34
Silicon Testbenches PASS
100 MHz
Timing Met on xczu7ev
+46 ps
WNS Slack (Setup, post-C.15)
23.9%
LUT Utilization (4x Headroom)
0
Critical Synth Warnings
001.5

The chip cannot make
things up. Here’s why.

LLMs hallucinate because their only fitness function is "next-token plausibility." There is no separation between things the model knows and plausible-sounding text. NXPU is structurally different. Every output is the result of explicit logical derivation from explicit facts and rules. The chip cannot return a fact that isn’t entailed by its inputs — ever — because the silicon literally has no path that produces ungrounded outputs. Five hardware mechanisms back this:

PILLAR 1 · C.11
Proof Trees
Every CAM entry stores a 48-bit provenance record: which rule fired, and the addresses of the body facts that satisfied it. The host walks the tree recursively to get a complete derivation chain back to your input data.
tb_proof_tree: 8/8 derived facts have valid proofs
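As a rough illustration of how a host might walk such provenance records, the sketch below models each entry's record as a rule ID plus the CAM addresses of its body facts. The `Provenance` class, the `CAM` snapshot, and `walk_proof` are hypothetical names chosen for this example; they are not the chip's register map or the SDK's API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Provenance:
    rule_id: int                 # which rule fired (hypothetical field)
    body_addrs: Tuple[int, ...]  # CAM addresses of the body facts that matched


# Hypothetical CAM snapshot: address -> (fact, provenance).
# Input facts carry no provenance because nothing derived them.
CAM: Dict[int, Tuple[str, Optional[Provenance]]] = {
    0: ("parent(alice, bob)", None),
    1: ("parent(bob, carol)", None),
    2: ("grandparent(alice, carol)", Provenance(rule_id=0, body_addrs=(0, 1))),
}


def walk_proof(addr: int, depth: int = 0) -> List[str]:
    """Recursively expand one CAM entry into an indented derivation chain."""
    fact, prov = CAM[addr]
    indent = "  " * depth
    if prov is None:
        return [f"{indent}{fact}  [input fact]"]
    lines = [f"{indent}{fact}  [rule {prov.rule_id}]"]
    for body_addr in prov.body_addrs:
        lines.extend(walk_proof(body_addr, depth + 1))
    return lines


print("\n".join(walk_proof(2)))
```

The recursion stops at input facts, so every leaf of the printed tree is something the host loaded rather than something the chip derived.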
PILLAR 2 · C.9 / C.9.1
Calibrated Confidence
Every fact has a Q0.16 confidence. Rules compose them natively: head_conf = product of body confidences × rule strength, on a 4-deep multiply tree in silicon. No external calibration. Uncertainty is quantified, not hidden.
tb_diagnostic_conf: 0.85 × 0.80 × 0.95 × 0.9 = 0.5814 (silicon: 0x94D3) ✓
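The arithmetic in that testbench line can be approximated in software. The sketch below is a minimal model that assumes truncating float-to-Q0.16 conversion and a truncating 16-bit right shift after each multiply, applied body-first and then by rule strength; the chip's actual rounding and tree ordering are not documented here, so the helper names and those choices are assumptions.

```python
Q = 16
ONE = 1 << Q  # 1.0 itself is not representable in Q0.16; 0xFFFF is the maximum


def to_q016(p: float) -> int:
    """Convert a probability in [0, 1) to Q0.16 by truncation (assumed)."""
    return min(int(p * ONE), ONE - 1)


def pmul(a: int, b: int) -> int:
    """Q0.16 multiply: full-width product, then drop the low 16 bits."""
    return (a * b) >> Q


def head_conf(body_confs, rule_conf):
    """Product of body confidences times rule strength, all in Q0.16."""
    acc = body_confs[0]
    for c in body_confs[1:]:
        acc = pmul(acc, c)
    return pmul(acc, rule_conf)


body = [to_q016(0.85), to_q016(0.80), to_q016(0.95)]
conf = head_conf(body, to_q016(0.9))
print(hex(conf), conf / ONE)
```

Under these assumptions the model lands on 0x94D4, one LSB away from the 0x94D3 quoted for silicon; a different rounding mode or multiply order would plausibly account for the difference.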
PILLAR 3 · C.12
Quantitative Refusal
Set a min_conf threshold. Derivations whose composed confidence falls below epsilon are NOT inserted into CAM. The chip refuses to commit to conclusions it isn’t sufficiently sure about, and probabilistic chains die early instead of flooding low-confidence noise.
tb_min_conf: patient_b (conf 0.02) pruned at threshold 0.5 ✓
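The pruning step amounts to a gate in front of the CAM insert. A minimal sketch, assuming a dict stands in for the CAM and using floats instead of Q0.16 for readability; `fire` and the `risk(...)` facts are hypothetical names:

```python
def fire(cam: dict, fact: str, head_conf: float, min_conf: float) -> bool:
    """Insert a derived fact only if its composed confidence clears min_conf."""
    if head_conf < min_conf:
        return False  # refused: nothing is written, so nothing chains from it
    cam[fact] = head_conf
    return True


cam = {}
kept = fire(cam, "risk(patient_a)", 0.58, min_conf=0.5)
pruned = fire(cam, "risk(patient_b)", 0.02, min_conf=0.5)
print(kept, pruned, cam)
```

Because a refused fact never enters the store, later rules cannot match on it, which is why low-confidence chains stop early.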
PILLAR 4 · C.13 / C.15
Generalization Defense
When the chip discovers rules from data, each candidate is scored on a held-out test set as well as on the training set. Rules that fit training but fail holdout (overfit) are rejected. A minimum-support filter rejects rules that fit too few examples to be patterns rather than coincidences.
tb_holdout: chip distinguishes generalizing from non-generalizing rules ✓
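One way to picture the two filters together is sketched below. The function name, the `min_support` count, and the `min_holdout_frac` threshold are illustrative assumptions; the page does not state how the chip parameterizes either check.

```python
def accept_rule(derived: set, train_pos: set, test_pos: set,
                min_support: int, min_holdout_frac: float) -> bool:
    """Accept a candidate rule only if it has enough training support
    and also covers enough of the held-out positives."""
    train_hits = len(derived & train_pos)
    if train_hits < min_support:
        return False  # too few examples: could be a coincidence
    if not test_pos:
        return False  # no holdout evidence at all: refuse
    test_hits = len(derived & test_pos)
    return test_hits / len(test_pos) >= min_holdout_frac


train = {("alice", "carol"), ("alice", "dave")}
test = {("bob", "erin")}

generalizing = {("alice", "carol"), ("alice", "dave"), ("bob", "erin")}
overfit = {("alice", "carol"), ("alice", "dave")}  # fits training only

print(accept_rule(generalizing, train, test, 2, 1.0))
print(accept_rule(overfit, train, test, 2, 1.0))
```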
PILLAR 5 · C.14
"I Don’t Know"
Mark a predicate open-world and the chip stops treating absence as falsehood. Negated body atoms on open-world predicates fail rather than succeed via NaF. The chip explicitly refuses to derive conclusions from missing data — the difference between "false" and "unknown."
tb_open_world: refuses to declare p2 safe with no allergy data ✓
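The difference between the two modes can be sketched as a single predicate deciding whether a negated body atom succeeds. The rule shape `safe_to_prescribe(P, D) :- patient(P), not allergy(P, D)`, the drug name, and the helper names are assumptions made for this illustration.

```python
def negated_atom_succeeds(pred, args, facts, open_world_preds):
    """'not pred(args)' succeeds only under closed-world semantics,
    and only when the fact is absent."""
    if (pred, args) in facts:
        return False  # the fact is known, so its negation fails
    if pred in open_world_preds:
        return False  # absence means unknown: refuse to conclude
    return True       # closed world: absence is treated as false (NaF)


def safe_to_prescribe(patients, facts, drug, open_world_preds):
    return {
        p for p in patients
        if negated_atom_succeeds("allergy", (p, drug), facts, open_world_preds)
    }


facts = set()  # no allergy data on p2 at all
open_pass = safe_to_prescribe({"p2"}, facts, "penicillin", {"allergy"})
closed_pass = safe_to_prescribe({"p2"}, facts, "penicillin", set())
print(len(open_pass), len(closed_pass))
```

With `allergy` marked open-world the derived set is empty; with the flag cleared, negation-as-failure derives the conclusion from missing data.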
BONUS · C.10
Rule Discovery on Chip
You give the chip data + labels; the chip enumerates candidate rules, scores each one against your data, and returns the rules that work. No training, no gradients, no model weights. The discovery loop runs entirely on silicon at hardware speed, defended by all five pillars above.
tb_discover_grandparent: chip identified the correct rule from raw data ✓
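The enumerate-and-score loop can be sketched in software. The family data, the five candidate bodies, and the scoring order (most positives covered, then fewest extra derivations) are assumptions for illustration; the chip's template format and scoring function may differ.

```python
PARENT = {("alice", "bob"), ("bob", "carol"), ("bob", "dave"), ("carol", "erin")}
POSITIVES = {("alice", "carol"), ("alice", "dave"), ("bob", "erin")}

# Candidate bodies for target(X, Z); each returns the set of (X, Z) it derives.
CANDIDATES = {
    "parent(X,Z)": lambda p: {(x, z) for x, z in p},
    "parent(Z,X)": lambda p: {(x, z) for z, x in p},
    "parent(X,Y), parent(Y,Z)":
        lambda p: {(x, z) for x, y in p for y2, z in p if y == y2},
    "parent(Y,X), parent(Y,Z)":
        lambda p: {(x, z) for y, x in p for y2, z in p if y == y2},
    "parent(Z,Y), parent(Y,X)":
        lambda p: {(x, z) for z, y in p for y2, x in p if y == y2},
}


def discover(facts, positives):
    """Score every candidate without inserting anything; return the best body."""
    def score(body):
        derived = CANDIDATES[body](facts)
        hits = len(derived & positives)
        extras = len(derived - positives)
        return (hits, -extras)
    return max(CANDIDATES, key=score)


best = discover(PARENT, POSITIVES)
print("target(X,Z) :-", best)
```

Nothing here is fitted or optimized: each candidate is simply evaluated against the data and ranked.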
THE LITERAL CLAIM

NXPU does not hallucinate. Every answer it produces is provable (C.11), calibrated (C.9.1), above an evidence threshold (C.12), derived from rules that demonstrably generalize to unseen data (C.13), with sufficient support to be a pattern rather than a coincidence (C.15). When evidence is insufficient the chip explicitly refuses to commit instead of guessing (C.14). Plus, the chip can discover rules itself from your data with no training (C.10).

Every clause maps to a specific commit on github.com/dyber-pqc/NXPU with a silicon testbench you can replay.

002

Bidirectional reasoning.
Real numerics.
Silicon-verified.

Forward and backward chaining over Datalog with full SLD resolution. Aggregation over sets. Top-K ranking. Negation-as-failure. Structural hash-consing. Q16.16 fixed-point ALU and Q4.12 CORDIC transcendentals. 34 testbenches passing on real Vivado xsim, timing met on real silicon.

Bidirectional Datalog
FC Sequencer + BC Engine + Goal Cursor
256-entry CAM with O(1) parallel match. 16-state rule eval FSM with backtracking, dedup, and 8-variable bindings. Semi-naive forward chaining to fixpoint. SLD-style backward chaining with rule unfolding. Recursive predicates (ancestor) silicon-verified end-to-end.
  • 10 ns CAM query (single combinational cycle)
  • 4 body atoms / 8 variables / 16 rule slots
  • FC: ancestor program derives 8 transitive facts to fixpoint
  • BC: grandparent goal enumerates all 3 solutions, exhausts cleanly
  • Goal cursor (SOLVE / SOLVE_NEXT) for native enumeration
Aggregation & Set Ops
count / sum / min / max / argmax / top-K / NaF
Six bridge primitives reason over sets, not just individual facts. Top-K maintains a parallel insertion-sorted register array. Negation-as-failure with both ground and unbound variables. Cardinality, statistics, ranking — all native silicon ops.
  • compute_count: 30 ns combinational match-count
  • compute_sum / min / max / argmax over CAM matches
  • compute_topk with K_MAX = 8, parallel beats[] insertion sort
  • not foo(X) body atoms; closed-world existential semantics
  • Hash-consing: equivalent subtrees collapse to one CAM entry
Arithmetic + Transcendentals
Q16.16 ALU + CORDIC + Taylor Exp
Q16.16 fixed-point ALU for add / sub / mul / div / abs / sqrt with DSP-mapped multiply. Q4.12 CORDIC engine computes sin and cos simultaneously in 17 cycles. Taylor-series exp() in 5 cycles. Numeric literals preserve their value through the symbol table.
  • d/dx[x³] at x=2 = 12 in 5.9 µs, 3 chained ALU ops
  • CORDIC sin/cos: 14-iter, ±3 LSB Q4.12 across all 4 quadrants
  • Taylor exp(x) for |x|≤1: ±6 LSB at exp(±1)
  • Q4.12 fadd / fsub / fmul; fdiv / fsqrt deferred to D.2
  • 0.7% DSP utilization — ~140x headroom for more engines
003

Real datasets.
Real silicon.
Real proofs.

Every example below is a working .nxp program that compiles to AXI register writes and runs on the FPGA. Open the source on GitHub. Run it via the Python SDK. Watch the proof chain emerge from real silicon — not a simulation, not a demo trick.

Pharmacovigilance
Drug Interaction Detection — FAERS Subset
Detects warfarin–fluconazole interactions through CYP450 enzyme inhibition reasoning. A documented cause of bleeding events and patient deaths — flagged in 164 cycles on real silicon, with a complete proof chain regulators can audit.
  • 4-body-atom rule chain, 100% precision, 0 false positives
  • FDA-friendly: every flag carries its derivation
  • Why LLMs can’t: clinical hallucination rates 10–64%
  • Source: examples/pharma_safety.nx
Symbolic Calculus
d/dx[x³] at x=2 = 12 — on chip
The power-rule derivative evaluated through three chained ALU ops dispatched by rule firings. Numeric literals preserve their value through the symbol table so the answer is mathematical, not symbol-ID arithmetic.
  • 5.9 µs end-to-end on silicon
  • HAL pipeline: .nxp → nxc → AXI → CAM → readback
  • 3 chained Q16.16 ops with bridge dedup
  • Source: examples/power_deriv.nxp
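For readers who want to check the arithmetic, here is a minimal Q16.16 model of the power rule at x = 2. The split into one subtract and two multiplies is only one plausible reading of "three chained ALU ops", and the helper names are hypothetical.

```python
FRAC = 16  # Q16.16: 16 integer bits, 16 fractional bits


def to_q(v: float) -> int:
    return int(round(v * (1 << FRAC)))


def from_q(q: int) -> float:
    return q / (1 << FRAC)


def q_mul(a: int, b: int) -> int:
    """Fixed-point multiply: full product, then shift out the extra fraction bits."""
    return (a * b) >> FRAC


def ddx_x_cubed(x_q: int) -> int:
    """d/dx[x^3] = 3 * x^2, evaluated as three chained fixed-point ops."""
    n = to_q(3.0)
    new_exp = n - to_q(1.0)    # op 1 (sub): exponent 3 - 1 = 2
    assert new_exp == to_q(2.0)
    x_sq = q_mul(x_q, x_q)     # op 2 (mul): x^2
    return q_mul(n, x_sq)      # op 3 (mul): 3 * x^2


result = ddx_x_cubed(to_q(2.0))
print(from_q(result))
```

Because the literals 3 and 2 are carried as numeric values rather than symbol IDs, the result is the number 12 in Q16.16.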
AML & Financial Audit
SOX, sanctions, transaction surveillance
Rule-based screening at line rate with audit-grade explainability. Every flagged transaction carries a full derivation trace — the kind of provenance regulators require and LLMs structurally cannot provide.
  • 20 SOX findings derived from 100 transactions in 6 ms
  • Deterministic: same input → same output, always
  • Why LLMs can’t: regulator audit demands explainability
  • Source: examples/financial_audit.nxp
Aggregation & Statistics
count / sum / min / max / argmax / top-K
Real set operations on the chip. Inventory analytics, statistical thresholds, ranking queries — all dispatched as bridge predicates with dedup, and all silicon-verified across 11 aggregation + 10 top-K subtests.
  • compute_count: 30 ns combinational match-count
  • compute_argmax: returns (max value, winning row)
  • compute_topk: K_MAX=8, parallel insertion sort
  • Source: examples/inventory_agg.nxp, topk_scores.nxp
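A software sketch of these set operations follows. It reads the "parallel beats[] insertion sort" as: every register reports whether it beats the incoming value, and the count of those flags is the insert position. That reading, and the function names, are assumptions rather than a description of the RTL.

```python
K_MAX = 8  # register-array depth quoted above


def aggregates(values):
    """count / sum / min / max / argmax over the values of the matching rows."""
    if not values:
        return {"count": 0}
    best_row = max(range(len(values)), key=values.__getitem__)
    return {
        "count": len(values),
        "sum": sum(values),
        "min": min(values),
        "max": max(values),
        "argmax": (values[best_row], best_row),  # (max value, winning row)
    }


def topk_insert(regs, value, row):
    """Insert into a descending-sorted register array, keeping at most K_MAX."""
    beats = [v >= value for v, _ in regs]  # one comparator per register
    pos = sum(beats)                       # sorted order makes this a prefix count
    return (regs[:pos] + [(value, row)] + regs[pos:])[:K_MAX]


scores = [5, 9, 1, 7, 9, 3, 8, 2, 6, 4]
agg = aggregates(scores)
regs = []
for row, value in enumerate(scores):
    regs = topk_insert(regs, value, row)
print(agg)
print(regs)
```

Ties resolve in favor of the earlier row because an equal value already in the array counts as beating the newcomer.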
Recursive Reasoning
Ancestor / transitive closure / multi-hop
The canonical recursive Datalog: ancestor(X,Z) :- parent(X,Y), ancestor(Y,Z). Semi-naive forward chaining derives all 8 transitive ancestors to fixpoint, then backward chaining enumerates all 5 descendants of any starting node.
  • Native FC + BC composition (the production Datalog technique)
  • Dependency-chain analysis, supply-chain traversal, family graphs
  • Goal cursor enumerates solutions one at a time via SOLVE_NEXT
  • Source: examples/ancestor.nxp
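Semi-naive evaluation of the ancestor program can be sketched in a few lines. The base rule `ancestor(X,Z) :- parent(X,Z)` is assumed here (only the recursive rule is quoted above), and the three-edge family chain is made-up data, so the fact counts will not match the testbench.

```python
def ancestor_closure(parent):
    """Semi-naive forward chaining: each round joins parent only against
    the facts that were new in the previous round (delta)."""
    ancestor = set(parent)  # assumed base rule: ancestor(X,Z) :- parent(X,Z)
    delta = set(parent)
    while delta:
        # ancestor(X,Z) :- parent(X,Y), ancestor(Y,Z), with ancestor drawn from delta
        new = {(x, z) for x, y in parent for y2, z in delta if y == y2}
        new -= ancestor      # dedup: keep only genuinely new facts
        ancestor |= new
        delta = new          # an empty delta means fixpoint
    return ancestor


def descendants(closure, start):
    """Goal-directed query over the closure: every Z with ancestor(start, Z)."""
    return {z for x, z in closure if x == start}


parent = {("alice", "bob"), ("bob", "carol"), ("carol", "dave")}
closure = ancestor_closure(parent)
print(sorted(closure))
print(sorted(descendants(closure, "alice")))
```

Restricting the join to `delta` is what distinguishes semi-naive from naive evaluation: facts already combined in earlier rounds are not re-derived.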
Defaults & Exceptions
Negation-as-failure (ground + unbound)
active_user(U) :- user(U), not banned(U). Default rules with explicit exceptions, RBAC negative-permission flows, GDPR consent checks, and other rule systems where “allowed unless forbidden” is the natural specification.
  • Closed-world existential semantics for unbound vars
  • One body-atom flag, zero new FSM states — reuses the CAM scan
  • Verified empty + populated cases (expect_none semantics)
  • Source: examples/active_users.nxp, has_no_cats_*.nxp
Transcendental Math
CORDIC sin/cos + Taylor exp in Q4.12
Real numerics inside reasoning rules. Physics simulators, statistical confidence weighting, signal-processing rule sets, and any control loop that needs a nonlinear response evaluated deterministically — all on chip in microseconds.
  • CORDIC: 14 iter, ±3 LSB across all 4 quadrants, 17 cycles
  • Taylor exp(x) for |x|≤1: ±6 LSB at boundaries, 5 cycles
  • Q4.12 fadd / fsub / fmul through the existing ALU
  • Sources: tb_cordic.v, tb_phase_d_ext.v
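A rotation-mode CORDIC in Q4.12 can be modeled as below. This is a generic textbook formulation with 14 iterations and quadrant folding, written for illustration; it is not derived from the RTL, and its error will not necessarily match the ±3 LSB figure quoted for silicon.

```python
import math

FRAC = 12
ONE = 1 << FRAC  # 1.0 in Q4.12
ITER = 14

# Arctangent table and gain-compensated start value, quantized to Q4.12.
ATAN = [round(math.atan(2.0 ** -i) * ONE) for i in range(ITER)]
GAIN = math.prod(math.sqrt(1 + 2.0 ** (-2 * i)) for i in range(ITER))
K = round(ONE / GAIN)
PI_Q = round(math.pi * ONE)
HALF_PI_Q = round(math.pi / 2 * ONE)


def cordic_sincos(theta_q: int):
    """Return (sin, cos) in Q4.12 for theta_q in Q4.12 radians, |theta| <= pi."""
    negate = False
    if theta_q > HALF_PI_Q:       # fold quadrant II into IV
        theta_q -= PI_Q
        negate = True
    elif theta_q < -HALF_PI_Q:    # fold quadrant III into I
        theta_q += PI_Q
        negate = True
    x, y, z = K, 0, theta_q
    for i in range(ITER):
        # Rotate toward z = 0 using only shifts and adds.
        if z >= 0:
            x, y, z = x - (y >> i), y + (x >> i), z - ATAN[i]
        else:
            x, y, z = x + (y >> i), y - (x >> i), z + ATAN[i]
    if negate:                    # undo the half-turn applied while folding
        x, y = -x, -y
    return y, x


s, c = cordic_sincos(round(1.0 * ONE))
print(s / ONE, c / ONE, math.sin(1.0), math.cos(1.0))
```

Both outputs come from the same iteration sequence, which is why sin and cos are available simultaneously.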
Goal-Directed Query
SOLVE / SOLVE_NEXT cursor enumeration
Native API for “find every X such that Q(X)”. The host writes a pattern + mask, issues SOLVE, and steps through all matching CAM entries one at a time without rescanning. Pipelined match-vector latch keeps the critical path inside 100 MHz.
  • Cursor parks on first match, advances on SOLVE_NEXT
  • Read matched entry via REG_RESULT_LO/HI
  • Backward-chaining engine builds on this primitive
  • Source: tb_goal_solve.v
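The pattern + mask query and the cursor stepping can be mimicked in software. The 24-bit entry layout (8-bit predicate, two 8-bit arguments), the symbol IDs, and the `GoalCursor` class are all hypothetical; the real entry width and register interface are not specified here.

```python
def encode(pred: int, arg0: int, arg1: int) -> int:
    """Pack a binary fact into one word (hypothetical 8/8/8-bit layout)."""
    return (pred << 16) | (arg0 << 8) | arg1


class GoalCursor:
    """Latch the match vector once on SOLVE, then step it on SOLVE_NEXT."""

    def __init__(self, cam):
        self.cam = list(cam)
        self.matches = []
        self.pos = 0

    def solve(self, pattern: int, mask: int):
        # Bits set in mask must equal the pattern; cleared bits are wildcards.
        self.matches = [addr for addr, word in enumerate(self.cam)
                        if ((word ^ pattern) & mask) == 0]
        self.pos = 0
        return self._current()

    def solve_next(self):
        self.pos += 1  # no rescan: just advance within the latched matches
        return self._current()

    def _current(self):
        if self.pos >= len(self.matches):
            return None  # solutions exhausted
        addr = self.matches[self.pos]
        return addr, self.cam[addr]


PARENT, ALICE, BOB, CAROL, DAVE = 1, 10, 11, 12, 13
cursor = GoalCursor([
    encode(PARENT, ALICE, BOB),
    encode(PARENT, BOB, CAROL),
    encode(PARENT, ALICE, DAVE),
])
first = cursor.solve(encode(PARENT, ALICE, 0), mask=0xFFFF00)  # parent(alice, ?)
second = cursor.solve_next()
done = cursor.solve_next()
print(first, second, done)
```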
004

Where LLMs
are not allowed.

Every regulated and safety-critical domain has the same problem: rule-based decisions that have to be auditable, deterministic, and fast — and an installed base of CPU rule engines that crawl. NXPU runs the same rules on silicon, with a proof chain on every conclusion.

Banking & Compliance
AML, sanctions screening, trade surveillance, KYC.
Regulator audit demands every flag explain itself. LLM hallucinations are a fineable offense.
TAM ~$22B
Healthcare & Pharma
Drug-interaction screening, clinical decision support, treatment-protocol checking.
FDA approval requires explainable AI. LLMs hallucinate at 10–64% in medical contexts.
TAM ~$14B
Cybersecurity / SIEM
Intrusion detection, vulnerability-chain analysis, lateral-movement reasoning, policy enforcement.
Splunk-class workloads burn cloud compute. Deterministic silicon = margin.
TAM ~$5B
Defense & Aerospace
Real-time decision logic in DO-178C-certifiable systems. Robotic planning. Flight control.
LLMs categorically can’t be DO-178C certified. NXPU’s deterministic logic can.
TAM ~$8B
Legal & Compliance
Contract clause checking, GDPR / HIPAA violation detection, e-discovery, conflict checking.
Auditable, deterministic, defensible in court. LegalTech vendors want this.
TAM ~$10B
Telecom 5G Core
Policy enforcement at line rate, routing decisions, QoS classification.
Microsecond decisions on packet streams. Hyperscalers building their own already.
TAM ~$6B
Industrial / IoT
Safety interlocks, sensor-driven control, deterministic decision loops.
Hardware-level correctness, milliwatt power (post-ASIC).
TAM ~$50B+
Smart Contracts & Audit
On-chain logic execution, formal verification, deterministic state transitions.
Blockchain protocols need exactly what NXPU provides.
TAM — emerging
005

Four ways
to ship.

From RTL IP licensed into your SoC to a hosted reasoning API your engineers call over HTTPS. Pick the integration path that matches your team and your timeline. The RTL IP license is available now; the FPGA card and the cloud API follow once the DRAM tiers land (~6 months).

RTL IP License
Available now
Verilog source for the full reasoning core, including bridge, CORDIC, BC engine, aggregation, top-K, negation, hash-consing, and the rule sequencer. Drop into your own SoC, your own ASIC tape-out, or your own FPGA card.
  • ~4,000 lines of Verilog, 34 testbenches included
  • Vivado-ready; xczu7ev reference build provided
  • Pricing: $1M–$5M one-time + per-chip royalty (exclusivity bumps to $10M+)
  • Comparable: ARM cores, Cadence/Synopsys IP blocks
FPGA Accelerator Card
After DRAM tiers (~6 mo)
Production-grade Xilinx Alveo or custom card with NXPU bitstream pre-loaded, PCIe / 100GbE host interface, Python SDK, and the full HAL toolchain. Plugs into a single 1U server.
  • Per card: $25k–$50k
  • SDK + support subscription: $100k–$500k / year per enterprise
  • Comparable: Hailo-8, Axelera Metis form factor
  • DRAM tiers needed first to scale beyond demo facts/rules
Cloud Reasoning API
After DRAM tiers (~6 mo)
Hosted endpoint. Submit your facts and rules over HTTPS, get back a derived fact set + proof chain. Per-inference billing, enterprise tier for unmetered internal use. Same compiler stack as on-prem deployments.
  • Per inference: $0.01–$1.00 (rule-depth dependent)
  • Enterprise tier: $100k–$1M / year unmetered
  • Audit-log export for regulator review
  • Comparable: GPT-4 API ($30/M tokens) for the LLM-replacement use case
Custom ASIC
18–36 month tape-out
For very high-volume embedded deployments where FPGA economics break down. 10 nm projections target 500 MHz–1 GHz, ~100 mW, 1–2 mm². The current design uses 23.9% of an xczu7ev, which leaves substantial room for in-place expansion before tape-out is contemplated.
  • Per system: $10k–$100k depending on scale
  • Comparable: Cerebras WSE ($2–5M), TPU v4 ($30k)
  • Targets edge IoT, embedded control, signal-processing pipelines
  • Requires a customer commit to justify ~$20M tape-out NRE
006

Shippable now.
Testable now.

No vaporware. Everything below is in the repo, builds with Vivado 2025.1, passes xsim regression, and meets timing on real silicon.

Shippable Today
RTL IP (~4,000 lines of Verilog): symbolic logic unit, reasoning-ALU bridge, CORDIC, func_engine, BC engine, sequencer. Vivado-ready.
HAL toolchain (Python + .nxp compiler): nx_to_tb.py generates testbenches; AXI register sequences for production deployment.
34 silicon-verified testbenches: from CAM dedup through CORDIC trig and recursive BC. All green on Vivado xsim.
100 MHz timing closure on xczu7ev: WNS +46 ps, WHS +10 ps, zero failing endpoints, zero critical synth warnings.
Whitepaper v8: full architecture, silicon results, performance comparisons, roadmap. Engineering-grade.
Reference deployment on ZCU106 / ZCU104: bitstream-ready. Boot a board, flash, drive AXI from JTAG or PS, and reasoning runs on silicon.
Testable Today — Try It
git clone the repo: the nxpu-rtl/ tree builds with Vivado 2025.1. Tcl scripts in vivado/scripts/ drive xsim.
pip install -e . the Python HAL: compile any .nxp in examples/ to a Verilog testbench in one line.
Run the regression sweep: 34 testbenches, ~30 minutes on a remote Vivado host. Every one labeled with what it proves.
Re-run synth + impl + timing: scripts/synth_impl_timing.tcl takes ~30 minutes to confirm timing on your own board.
Open the demo page: browser-based NXLang playground at /demo. Load a dataset, run a query, watch the proof chain.
Read the source on GitHub: github.com/dyber-pqc/NXPU. RTL, HAL, examples, testbenches all open.
007

The GPU era
is a local maximum.

Scaling transformers has hit diminishing returns on reasoning. The next leap requires architectural innovation, not bigger clusters.

Current Paradigm
Trillions of tokens: requires massive pre-collected datasets
$100M training runs: thousands of GPU-hours per model
Frozen after training: knowledge becomes stale immediately
Correlation, not causation: pattern matching without understanding
Black box: no explainability, no audit trail
700W per chip: unsustainable energy trajectory
NXPU Paradigm
Zero training required: load facts + rules. Get conclusions. Immediately.
1.65 µJ per derivation: 78x less energy than Intel Core Ultra 9 285. 236,000x less than H100 LLM.
100% accuracy on reasoning: deductive logic is sound by construction. Zero hallucination.
Silicon-validated, timing met: 34 testbenches pass on real Vivado xsim. 100 MHz on xczu7ev with WNS +46 ps (post-C.15). Bitstream-deployable.
Every step auditable: full proof chain on every conclusion (which rule, which prior facts). Compliance / FDA / SEC ready.
Bidirectional reasoning + transcendentals: forward + backward chaining, recursion, aggregation, top-K, negation, plus CORDIC sin/cos and Taylor exp on the same chip.
008

15 phases done.
34/34 silicon TBs pass.
Open source.

Not simulation. Not theory. Vivado 2025.1 synth + impl + timing met on Xilinx xczu7ev with positive slack. 34 testbenches all pass on real silicon. Bitstream-deployable now. Every line of RTL and every testbench is on github.com/dyber-pqc/NXPU for you to clone and replay. The remaining roadmap items are concrete engineering, not research.

Phases A — B.10 — Complete
Forward chaining, multi-head rules, hash-consing
CAM + rule eval + unifier + sequencer with semi-naive fixpoint evaluation. Up to 8 head facts per match with cross-head fresh-ID references for tree rewriting (B.7). Up to 8 per-match identity pools (B.6 / B.9). Structural hash-consing: equivalent subtrees collapse to one CAM entry (B.10).
C.1 — C.8 — Complete
ALU bridge, aggregation, top-K, BC, recursion, negation
Q16.16 ALU bridge with d/dx[x³] verified. compute_count, sum, min, max, argmax (C.6). compute_topk with parallel insertion sort (C.7). Backward chaining with SLD rule unfolding (C.5). Recursive reasoning via FC + BC hybrid — ancestor program enumerates all descendants of alice on real silicon (C.5.1). Negation-as-failure for ground and unbound variables (C.3 / C.8). Goal cursor (C.4).
Phase D + D.1 — Complete
CORDIC sin/cos + Q4.12 fadd/fsub/fmul + Taylor exp
14-iteration sequential CORDIC in rotation mode — sin and cos in Q4.12 simultaneously, 17 cycles, ±3 LSB across all 4 quadrants. Q4.12 fadd / fsub / fmul through the ALU. Taylor-series exp() engine: 5 cycles, ±6 LSB at exp(±1). Synth + impl + timing met at 100 MHz with WNS = +46 ps, WHS = +10 ps.
C.9 + C.9.1 — Complete
Probabilistic primitives + native confidence propagation
Q0.16 probabilistic ops on silicon: pmul = a×b, pnot = 1-a, psum = noisy-OR (C.9). Per-fact confidence storage parallel to CAM entries. C.9.1 wires confidence into rule firing: head_conf = product of body confs × rule_conf via a 4-deep combinational multiply tree. The chip emits graded beliefs natively, not binary facts.
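The three primitives are small enough to model directly. In the sketch below, 0xFFFF stands in for 1.0 (which Q0.16 cannot represent) and `psum` saturates on overflow; both choices are assumptions about how the hardware handles the edge cases.

```python
Q = 16
MAX_Q = (1 << Q) - 1  # 0xFFFF, the closest Q0.16 value to 1.0


def pmul(a: int, b: int) -> int:
    """Conjunction of independent events: a * b."""
    return (a * b) >> Q


def pnot(a: int) -> int:
    """Complement: 1 - a, with 0xFFFF used as 1.0 (assumed)."""
    return MAX_Q - a


def psum(a: int, b: int) -> int:
    """Noisy-OR: a + b - a*b, saturated so it never exceeds 0xFFFF."""
    return min(a + b - pmul(a, b), MAX_Q)


half = 0x8000  # 0.5
print(hex(pmul(half, half)), hex(pnot(half)), hex(psum(half, half)))
```

Noisy-OR reads as "at least one of two independent causes fired": with both at 0.5 the result is 0.75, not 1.0.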
C.10 — Complete
Rule discovery on silicon — ILP without training
The chip enumerates candidate rules from a template, fires each one in score-mode (no inserts), and counts how many derivations match known positive examples. Demo: chip discovered the grandparent rule from a raw family-tree dataset in microseconds, with no training, no gradients, no model weights.
C.11 — Complete
Proof trees — every fact has a receipt
Every CAM entry stores a 48-bit provenance record: which rule fired and the addresses of the body facts that satisfied each slot. The host walks the tree recursively to get a complete derivation chain back to your input data. The substrate that backs the “every NXPU answer is provable” claim.
C.12 — Complete
Epsilon-pruning — chip refuses low-confidence claims
Set a min_conf threshold. Derivations whose composed head_conf falls below epsilon are NOT inserted into CAM. Two effects: result quality stays high (low-confidence noise is suppressed before the host sees it), and probabilistic forward chains die early instead of producing a combinatorial flood of near-zero-confidence facts.
C.13 + C.15 — Complete
Train/test holdout + min-support filters for ILP
Discovered rules are scored against BOTH a training set AND a held-out test set in a single firing (C.13). A rule that fits training but fails holdout is overfit and is rejected. A minimum-support filter (C.15) rejects rules that fit too few examples to be patterns rather than coincidences. The chip refuses to claim rules it can’t justify.
C.14 — Complete
Open-world flag — chip can say “I don’t know”
Per-predicate flag toggles between closed-world (NaF treats absence as false) and open-world (absence means UNKNOWN, not false). For open-world predicates the chip refuses to satisfy a negated body atom on missing data. Demo: chip refused to declare patient_b “safe to prescribe” when it had no allergy data on him.
Phase E — Causal Discovery on Silicon — Next
Lift CSE conditional-independence tests onto chip
Phase 0’s Causal Structure Engine beat the PC algorithm on the Sachs benchmark (F1 = 0.786) in software. Lifting the conditional-independence primitives onto silicon makes causal-graph discovery a hardware operation: feed observational data, get a causal DAG. ~3–4 weeks of RTL.
DRAM Tiers — First scale unlock
From demo scale to real-dataset scale
Xilinx MIG IP integration. CAM-as-hot-set cache controller. Streaming rule loader. Moves the chip from 256 facts and 16 rules (demo) to millions of facts and thousands of rules (production). The threshold at which the chip can ingest real datasets: FAERS, SNOMED, full clinical decision-support knowledge bases.
Perception Coupling
Wire the Neural Mesh into the fact stream
16 LIF spiking neurons with STDP already on die. Wiring them to the fact-producer path lets raw signal streams be structured into facts on-chip — closes the host-encoding gap. The difference between “Datalog coprocessor” and “reasoning chip” deployable on raw inputs.
ASIC Tape-Out — Out-Year
10 nm, 500 MHz–1 GHz, ~100 mW
Current design uses 23.9% of an xczu7ev. Substantial in-place expansion room before tape-out is contemplated. Projections at 10 nm: ~100 mW, 1–2 mm², 1 billion queries/sec.
009

Replay every silicon TB
on your own machine.

Everything is open-source on github.com/dyber-pqc/NXPU. Clone the repo, point it at your Vivado install, and run any of the 34 testbenches against the same RTL we run on real silicon. The examples/ directory has a working .nxp program for every major capability. Read them, modify them, write your own.

STEP 1 · CLONE
git clone https://github.com/dyber-pqc/NXPU.git
cd NXPU
pip install -e .
You get the full RTL tree, the HAL Python compiler, the example programs, and every silicon testbench.
STEP 2 · COMPILE A PROGRAM
# A medical-safety demo (open-world reasoning)
python -m nxpu.hal.nx_to_tb \
    examples/open_world.nxp \
    -o tb_open_world_gen.v
The HAL parses your .nxp source, allocates symbols, encodes rule registers, and emits a self-contained Verilog testbench that drives the chip’s AXI bus.
STEP 3 · RUN AGAINST RTL
# Vivado xsim: real RTL, real silicon path
vivado -mode batch \
       -source nxpu-rtl/vivado/scripts/run_open_world_tb.tcl

--- PASS 1: allergy is OPEN-WORLD ---
  -> safe_to_prescribe in CAM: 0
--- PASS 2: allergy is CLOSED-WORLD (NaF) ---
  -> safe_to_prescribe in CAM: 1
PASS: open-world flag prevents hallucination
      from absence of evidence
That’s the same RTL that ran on the FPGA — bit-identical. You can also run on a Xilinx ZCU104 dev board if you have one.
STEP 4 · BROWSE THE DEMOS
examples/diagnostic_conf.nxp     # calibrated diagnosis
examples/discover_grandparent.nxp # rule discovery
examples/open_world.nxp           # I-don't-know logic
examples/ancestor.nxp             # recursive Datalog
examples/pharma_safety.nx         # drug interactions
examples/algebra_power.nxp        # symbolic d/dx
Six lines of NXLang typically map to one silicon TB. Edit the data, re-compile, re-run, see new results in seconds.
SILICON TESTBENCHES YOU CAN REPLAY (ALL PASS, REAL RTL)
run_proof_tree_tb — every derived fact has a proof tree
run_diagnostic_conf_tb — native confidence propagation
run_discover_grandparent_tb — chip discovers rule from data
run_holdout_tb — train/test split for ILP
run_min_conf_tb — chip refuses low-confidence claims
run_min_support_tb — coincidence rejection in discovery
run_open_world_tb — chip says “I don’t know”
run_ancestor_tb — recursive ancestor closure
run_ancestor_bc_tb — recursive backward chaining
run_silicon_reasoning — symbolic d/dx[x³]
run_algebra_power_eval — differentiate then evaluate
run_cordic_tb — CORDIC sin/cos in 17 cycles
run_phase_d_ext_tb — Q4.12 fixed-point + Taylor exp
run_probabilistic_tb — pmul / pnot / psum noisy-OR
run_aggregation_tb — sum / count / min / max / argmax
run_topk_tb — parallel insertion-sort top-K
run_unbound_neg_tb — negation-as-failure (closed-world)
run_hash_cons_tb — structural deduplication
run_tree_rewrite_tb — algebraic tree rewriting
+ 15 more; full list in repo / vivado/scripts/
OPEN INVITATION

We’re looking for early users in healthcare, finance, defense, legal, and pharma — any regulated domain where LLM hallucinations are a liability. If you have a dataset, write a few .nxp rules and let the chip reason on it. If you don’t have a dataset, give the chip your domain’s positive and negative examples and let it discover the rules itself.

Bug reports, pull requests, feature requests — all welcome. Email nxpu@dyber.org for technical briefings, partnership conversations, or pilot deployments.

Schedule a
technical briefing.

Bring your rule set or your KB. We’ll show you the chip running it — on real silicon, with the proof chain, in microseconds. POC engagements typically scope at $250k–$500k over 6 months.

nxpu@dyber.org  ·  github.com/dyber-pqc/NXPU