Verification

This section tracks the unit-level verification harness that accompanies the v002 RTL. Functional / formal / silicon checks at the system level are planned but not yet in place — their scope is listed at the end of this page.

1. Current test suite

All tests run under Vivado xsim via the unified runner hw/sim/run_verification.sh in the RTL repo. Each test generates a .pccx trace that pccx-lab can visualise on the Timeline.

Testbench

Scope

Status

tb_GEMM_dsp_packer_sign_recovery

Dual-channel W4A8 pack + post-MAC sign recovery, 1024 cycles

PASS

tb_mat_result_normalizer

BF16 alignment / exponent delay, 256 cycles

PASS

tb_GEMM_weight_dispatcher

HP weight stream → INT4 tile dispatch, 128 cycles

PASS

tb_FROM_mat_result_packer

32 × 16-bit staggered capture → 128-bit AXIS pack, 4 cycles

PASS

tb_barrel_shifter_BF16

BF16 mantissa barrel shift, 512 cycles

PASS

tb_ctrl_npu_decoder

64-bit VLIW decode → typed structs, 6 cycles

PASS

Running the suite

cd pccx-FPGA-NPU-LLM-kv260/hw/sim
bash run_verification.sh

The script is idempotent — each tb writes to its own hw/sim/work/<tb_name>/ directory so concurrent runs do not stomp on each other. A one-line PASS / FAIL summary prints at the end along with the path of each emitted .pccx trace.

Last verified against

Commit 773bd82 @ pccxai/pccx-FPGA-NPU-LLM-kv260 (2026-04-21). Six testbenches PASS; tb_GEMM_fmap_staggered_delay is parked — see run_verification.sh for the reason.

2. Gaps (tracked as open items)

Area

Planned next step

System-level smoke

tb_NPU_top_smoke — AXI-Lite decode → MEMSET → MEMCPY → GEMV → MEMCPY readback, golden compare against llm-lite reference. Blocked on the ACP fanout / writeback wiring (see NPU Top-Level).

CVO bridge

tb_cvo_bridge — cvo_uop → READ burst → CVO echo → WRITE burst; covers the L2 port-B direct-address path once the arbiter lands.

mem_dispatcher uop table

Regression guard for MEMSET + LOAD + STORE + CVO routing.

Driver-RTL cross check

Host-only test: encode via uca_* API, dump 64-bit VLIW, cross-check against isa_pkg.sv bit layout.

Formal

SymbiYosys / JasperGold properties for the ISA decoder and memory arbitration (future).

3. Planned scope (system level)

Once the unit-level gaps above are closed, the full verification section will cover:

  • Verification plan — per-module functional coverage targets and sign-off criteria.

  • Testbench architecture — UVM / cocotb harness layout, reference model, and stimulus generators (golden data from the llm-lite CPU reference).

  • Coverage dashboard — functional + code + assertion coverage, rolled up by core (GEMM / GEMV / CVO / MEM).

  • Formal results — SymbiYosys / JasperGold properties for the controller, ISA decoder, and memory arbitration.

  • Silicon-level checks — post-implementation simulation, on-board bring-up scripts, and golden-trace comparison against the host-side C driver.

External tool interface — status surface boundaries

The verification workflow is incrementally surfacing JSON status summaries that external tools can consume without parsing free-form logs. These boundaries live outside this docs site:

  • pccx-lab — JSON / Sail / hybrid source-intake status summary boundaries, plus sample boundaries for plugins and external tool interfaces. Treated as an evolving boundary; no stability promise is made until the v002.0 verification evidence is published.

  • systemverilog-ide — module-graph reports (path, edge, port connection audit, reachability, span, file-layout, order) exposed as a separate boundary; consumed by reviewers but not part of the pccx documentation surface.

These boundaries are referenced here for traceability only. They do not constitute verification evidence on their own.

See also

Hardware Architecture

The architectural contracts this section verifies.

Instruction Set Architecture (ISA)

ISA-level invariants the testbench checks.

RTL Source Reference (v002)

RTL modules under test.