Accordo

Accordo automatically validates GPU kernel correctness by capturing and comparing kernel outputs from reference and optimized implementations.

Features

  • Automatic kernel extraction: Uses KernelDB to extract kernel signatures from binaries
  • Snapshot-based validation: Capture once, compare against multiple optimizations
  • Configurable tolerance: Set precision requirements for floating-point comparisons (atol, rtol, equal_nan)
  • Performance tracking: Measure and compare execution times
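
To make the three tolerance controls concrete, here is a minimal pure-Python sketch of an allclose-style check. It assumes Accordo follows the usual `numpy.allclose` convention, where the relative term is scaled by the second operand (here, the reference value); the helper names `is_close` and `all_close` are illustrative and not part of Accordo.

```python
import math
from typing import Sequence


def is_close(ref: float, opt: float, atol: float = 1e-8, rtol: float = 1e-5,
             equal_nan: bool = False) -> bool:
    """allclose-style check: |opt - ref| <= atol + rtol * |ref| (assumed convention)."""
    if math.isnan(ref) or math.isnan(opt):
        # NaNs only match each other, and only when equal_nan is set.
        return equal_nan and math.isnan(ref) and math.isnan(opt)
    if math.isinf(ref) or math.isinf(opt):
        # Infinities match only when they are identical (same sign).
        return ref == opt
    return abs(opt - ref) <= atol + rtol * abs(ref)


def all_close(ref: Sequence[float], opt: Sequence[float], **tol) -> bool:
    """Element-wise check over two equally sized sequences."""
    return len(ref) == len(opt) and all(is_close(r, o, **tol) for r, o in zip(ref, opt))


# Near zero the relative term vanishes, so atol decides the outcome.
print(is_close(0.0, 5e-7))             # False with the default atol=1e-8
print(is_close(0.0, 5e-7, atol=1e-6))  # True
```

The absolute tolerance matters most for values near zero, while the relative tolerance dominates for large magnitudes, which is why both are exposed.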

Installation

Accordo compiles C++ code (via KernelDB) during installation, so cmake, libdwarf-dev, and libzstd-dev must be installed first:

# System prerequisites (Debian/Ubuntu)
sudo apt-get update && sudo apt-get install -y cmake libdwarf-dev libzstd-dev
# Install
pip install "git+https://github.com/AMDResearch/intellikit.git#subdirectory=accordo"

Quick start

from accordo import Accordo
# Create validator for a specific kernel
validator = Accordo(binary="./app_ref", kernel_name="reduce_sum")
# Capture snapshots from reference and optimized binaries
ref = validator.capture_snapshot(binary="./app_ref")
opt = validator.capture_snapshot(binary="./app_opt")
# Compare with allclose-style controls
result = validator.compare_snapshots(ref, opt, atol=1e-6, rtol=1e-5)
if result.is_valid:
    print(f"PASS: {result.num_arrays_validated} arrays matched")
else:
    print(result.summary())

Testing multiple optimizations

validator = Accordo(binary="./ref", kernel_name="matmul")
ref = validator.capture_snapshot(binary="./ref")
for opt_binary in ["./opt_v1", "./opt_v2", "./opt_v3"]:
    opt = validator.capture_snapshot(binary=opt_binary)
    result = validator.compare_snapshots(ref, opt, atol=1e-6, rtol=1e-5)
    print(f"{opt_binary}: {'PASS' if result.is_valid else 'FAIL'}")
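
Since each snapshot also carries a timing, the loop above can be extended to choose a winner. The following sketch picks the fastest variant among those that passed validation. `fastest_passing` is a hypothetical helper, not an Accordo function; it reads only `execution_time_ms` and `is_valid`, and the `SimpleNamespace` objects (with made-up timings) stand in for real snapshots and results so the example runs without a GPU.

```python
from types import SimpleNamespace
from typing import Iterable, Optional, Tuple


def fastest_passing(candidates: Iterable[Tuple[str, object, object]]) -> Optional[str]:
    """Return the name of the fastest candidate whose validation passed.

    Each candidate is (name, snapshot, result); only snapshot.execution_time_ms
    and result.is_valid are read. Returns None when nothing passed.
    """
    passing = [(snap.execution_time_ms, name)
               for name, snap, result in candidates if result.is_valid]
    return min(passing)[1] if passing else None


# Stand-ins for illustration; real code would collect the Snapshot and
# ValidationResult objects returned inside the loop.
demo = [
    ("./opt_v1", SimpleNamespace(execution_time_ms=4.2), SimpleNamespace(is_valid=True)),
    ("./opt_v2", SimpleNamespace(execution_time_ms=3.1), SimpleNamespace(is_valid=False)),
    ("./opt_v3", SimpleNamespace(execution_time_ms=3.7), SimpleNamespace(is_valid=True)),
]
print(fastest_passing(demo))  # ./opt_v3
```

Filtering on correctness before comparing timings keeps a fast but wrong variant (here `./opt_v2`) from being selected.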

CLI reference

accordo validate \
    --kernel-name NAME \
    --ref-binary PATH_TO_EXECUTABLE \
    --opt-binary PATH_TO_EXECUTABLE \
    [--tolerance FLOAT]    # legacy alias for --atol
    [--atol FLOAT]         # absolute tolerance (default: 1e-08)
    [--rtol FLOAT]         # relative tolerance (default: 1e-05)
    [--equal-nan]          # treat NaN == NaN
    [--timeout SECONDS]    # per snapshot, default: 30
    [--working-dir DIR]    # default: .
    [--kernel-args 'n1:t1,n2:t2,...']
    [--log-level DEBUG|INFO|WARNING|ERROR]
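
When driving the CLI from a script, it can help to assemble the argument list programmatically. The sketch below builds an `accordo validate` command using only the flags listed above. `build_validate_command` is a hypothetical helper, and the `name:type` joining for `--kernel-args` is inferred from the `'n1:t1,n2:t2,...'` placeholder rather than from a documented grammar.

```python
import shlex
from typing import List, Optional, Sequence, Tuple


def build_validate_command(kernel_name: str, ref_binary: str, opt_binary: str,
                           atol: Optional[float] = None,
                           rtol: Optional[float] = None,
                           equal_nan: bool = False,
                           kernel_args: Optional[Sequence[Tuple[str, str]]] = None
                           ) -> List[str]:
    """Assemble an argv list for `accordo validate` from the documented flags."""
    cmd = ["accordo", "validate",
           "--kernel-name", kernel_name,
           "--ref-binary", ref_binary,
           "--opt-binary", opt_binary]
    if atol is not None:
        cmd += ["--atol", repr(atol)]
    if rtol is not None:
        cmd += ["--rtol", repr(rtol)]
    if equal_nan:
        cmd.append("--equal-nan")
    if kernel_args:
        # Assumed format: comma-separated name:type pairs.
        cmd += ["--kernel-args", ",".join(f"{n}:{t}" for n, t in kernel_args)]
    return cmd


cmd = build_validate_command("reduce_sum", "./app_ref", "./app_opt",
                             atol=1e-6, rtol=1e-5)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd)`. How the CLI signals a failed validation (for example through its exit code) is not stated here, so check that before relying on it in automation.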

API reference

Accordo(binary, kernel_name, **options)

Parameters:

  • binary (str | list) — binary path to extract kernel signature from
  • kernel_name (str) — name of the kernel to validate
  • kernel_args (list[tuple] | None) — manual kernel args as [(name, type), ...]. Auto-extracted if None.
  • working_directory (str) — working directory (default: ".")
  • force_rebuild (bool) — force rebuild even if library exists (default: False)
  • parallel_jobs (int) — number of parallel build jobs (default: 16)
  • log_level (str) — logging level (default: "WARNING")

Methods:

  • capture_snapshot(binary, timeout_seconds=30, dispatch_id=None) -> Snapshot
  • compare_snapshots(reference, optimized, tolerance=None, *, atol=1e-08, rtol=1e-05, equal_nan=False) -> ValidationResult

Snapshot

Attribute          Type                            Description
arrays             list[np.ndarray]                Captured output arrays (first dispatch)
dispatch_arrays    list[list[np.ndarray]] | None   Per-dispatch output arrays
execution_time_ms  float                           Execution time in milliseconds
grid_size          dict | None                     Kernel grid dimensions
block_size         dict | None                     Kernel block dimensions
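
Because `dispatch_arrays` may be `None`, code that walks every dispatch needs a fallback. A minimal sketch, assuming that a `None` value means only the first dispatch (exposed as `arrays`) is available: `arrays_by_dispatch` is a hypothetical helper, and plain Python lists stand in for NumPy arrays.

```python
from types import SimpleNamespace
from typing import List


def arrays_by_dispatch(snapshot) -> List[list]:
    """Return one list of arrays per dispatch, falling back to `arrays`."""
    if snapshot.dispatch_arrays is not None:
        return [list(dispatch) for dispatch in snapshot.dispatch_arrays]
    # Assumption: without per-dispatch data, `arrays` is the only dispatch.
    return [list(snapshot.arrays)]


# Stand-ins for real Snapshot objects.
single = SimpleNamespace(arrays=[[1.0, 2.0]], dispatch_arrays=None)
multi = SimpleNamespace(arrays=[[1.0, 2.0]],
                        dispatch_arrays=[[[1.0, 2.0]], [[3.0, 4.0]]])
print(len(arrays_by_dispatch(single)), len(arrays_by_dispatch(multi)))  # 1 2
```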

ValidationResult

Attribute             Type                 Description
is_valid              bool                 Whether validation passed
num_arrays_validated  int                  Total arrays checked
num_mismatches        int                  Failed comparisons
mismatches            list[ArrayMismatch]  Detailed mismatch info

Methods:

  • summary() -> str — human-readable validation summary
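
In a test suite it is often convenient to turn a failed validation into an exception. The sketch below wraps a result in an assertion that reports the mismatch count and the summary text. `assert_valid` is a hypothetical helper, and the `SimpleNamespace` objects stand in for real `ValidationResult` instances.

```python
from types import SimpleNamespace


def assert_valid(result, label: str = "kernel") -> None:
    """Raise AssertionError with the validation summary if the result failed."""
    if not result.is_valid:
        raise AssertionError(
            f"{label}: {result.num_mismatches} mismatches across "
            f"{result.num_arrays_validated} arrays\n{result.summary()}"
        )


# Stand-ins for illustration only.
ok = SimpleNamespace(is_valid=True, num_arrays_validated=2, num_mismatches=0,
                     summary=lambda: "all arrays matched")
bad = SimpleNamespace(is_valid=False, num_arrays_validated=2, num_mismatches=1,
                      summary=lambda: "array 1 differs")

assert_valid(ok, "reduce_sum")  # passes silently
try:
    assert_valid(bad, "reduce_sum")
except AssertionError as exc:
    print(exc)
```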

Requirements

  • Python >= 3.8
  • ROCm toolchain
  • KernelDB (automatically installed)