iree-tools CLI Reference

Synopsis

uv run python main.py <command> [options] <input>

Commands

compile

Compile a StableHLO MLIR file to IREE VMFB for both host and rv32 targets.

uv run python main.py compile <input.mlir>

Produces:

  • out/<stem>_host.vmfb — host-native bytecode (x86_64/arm64)

  • out/<stem>_rv32.vmfb — RISC-V 32-bit bytecode (not runnable on the NPU, which lacks the IREE runtime)

verify

Compile for host and run iree-run-module to produce reference output.

uv run python main.py verify <input.mlir>

If the MLIR function has arguments, IREE fills them with default values (zeros). The output serves as a correctness reference for comparing against simulator results.

generate-c

Transpile StableHLO MLIR to C source code compatible with coralnpu_v2_binary.

uv run python main.py generate-c <input.mlir> [-o output.cc]
Argument      Required  Description
input         Yes       Path to StableHLO MLIR file
-o, --output  No        Output C file path (default: out/<stem>.cc)

build-elf

Generate C source, write BUILD.bazel, and invoke Bazel to produce a bare-metal ELF.

uv run python main.py build-elf <input.mlir>

This command:

  1. Parses the MLIR and generates C source

  2. Writes .cc and BUILD.bazel to ../coralnpu/examples/generated/<stem>/

  3. Runs bazel build in the coralnpu workspace

Output ELF path: ../coralnpu/bazel-bin/examples/generated/<stem>/coralnpu_v2_<stem>.elf

simulate

Run an ELF binary on the MPACT behavioral simulator.

uv run python main.py simulate <input.elf> [--output-sizes sym=N ...]
Argument        Required  Description
input           Yes       Path to ELF binary
--output-sizes  No        Output symbol sizes as name=count pairs (default: output_0=16)

Example:

uv run python main.py simulate program.elf --output-sizes output_0=64 output_1=32
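Conceptually, each name=count pair maps an output symbol to its element count. The helper below is an illustrative sketch of that parsing, not a function in the tool (parse_output_sizes is a made-up name):

```python
def parse_output_sizes(pairs):
    """Turn "name=count" strings (as passed to --output-sizes) into a dict."""
    sizes = {}
    for pair in pairs:
        name, sep, count = pair.partition("=")
        if sep != "=" or not count.isdigit():
            raise ValueError(f"expected name=count, got {pair!r}")
        sizes[name] = int(count)
    return sizes

# e.g. parse_output_sizes(["output_0=64", "output_1=32"])
#      -> {"output_0": 64, "output_1": 32}
```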

run-all

Execute the full end-to-end pipeline: compile → verify → generate C → build ELF → simulate → compare.

uv run python main.py run-all <input.mlir>

Steps executed:

  1. Compile MLIR for host → out/<stem>_host.vmfb

  2. Run on host via iree-run-module → reference output

  3. Transpile MLIR → C source

  4. Write generated files to coralnpu/examples/generated/<stem>/

  5. Build ELF via Bazel

  6. Run ELF on MPACT simulator

  7. Print host vs. simulator outputs for comparison
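Step 7 prints both outputs and leaves the comparison to the reader; conceptually it amounts to an elementwise float comparison with a small tolerance. A minimal sketch (the function name and tolerance values are illustrative, not part of the tool):

```python
import math

def outputs_match(host, sim, rel_tol=1e-5, abs_tol=1e-6):
    """Compare the host reference output against the simulator output elementwise."""
    if len(host) != len(sim):
        return False
    return all(math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
               for a, b in zip(host, sim))
```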

IREE Compiler Flags

Used internally by the compile and verify commands:

Host Compilation

Flag                                     Value
--output-format                          vm-bytecode
--iree-input-type                        stablehlo
--iree-hal-target-device                 local
--iree-hal-local-target-device-backends  llvm-cpu

RV32 Compilation

Flag                                Value
--iree-llvmcpu-target-triple        riscv32-pc-linux-elf
--iree-llvmcpu-target-cpu           generic-rv32
--iree-llvmcpu-target-cpu-features  +m,+f
--iree-llvmcpu-target-abi           ilp32
--iree-llvmcpu-debug-symbols        false
--iree-stream-partitioning-favor    min-peak-memory

The +f CPU feature is required. Without it, the compiler emits soft-float code that calls runtime builtins (__mulsf3, __addsf3) that are not available in the bare-metal link environment.

Python Module Reference

mlir_parser.py

parse_module(text: str) → Module
    Parse complete StableHLO MLIR text into IR dataclasses.

parse_tensor_type(text: str) → TensorType
    Parse tensor<1x3x4x4xf32> into shape + element type.

parse_dense_values(text: str) → list[float]
    Extract float values from nested bracket notation.

parse_constant(line: str) → ConstantOp | None
    Parse a stablehlo.constant line.

parse_convolution(line: str) → ConvolutionOp | None
    Parse a stablehlo.convolution line with all attributes.

parse_binary_op(line: str) → BinaryOp | None
    Parse stablehlo.add/multiply/subtract/divide.

parse_convert(line: str) → ConvertOp | None
    Parse stablehlo.convert (type conversion).

parse_return(line: str) → ReturnOp | None
    Parse a return statement.
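As an illustration of the kind of parsing involved, here is a regex-based sketch of what parse_tensor_type does. It is simplified: it returns a plain (shape, element_type) tuple rather than a TensorType, and recognizes only a few element types.

```python
import re

def parse_tensor_type_sketch(text):
    """Split e.g. "tensor<1x3x4x4xf32>" into a (shape, element_type) pair."""
    m = re.fullmatch(r"tensor<([0-9x]*)(f32|f16|i32|i8)>", text.strip())
    if m is None:
        raise ValueError(f"not a recognized tensor type: {text!r}")
    dims, elem = m.groups()
    shape = [int(d) for d in dims.split("x") if d]
    return shape, elem

# parse_tensor_type_sketch("tensor<1x3x4x4xf32>") -> ([1, 3, 4, 4], "f32")
```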

ir.py

Class          Fields

TensorType
    shape: list[int], element_type: str

ConstantOp
    result_name: str, values: list[float], result_type: TensorType

ConvertOp
    result_name: str, operand: str, input_type: TensorType, result_type: TensorType

ConvolutionOp
    result_name: str, lhs: str, rhs: str, lhs_type: TensorType, rhs_type: TensorType,
    result_type: TensorType, strides: list[int], padding: list[list[int]],
    rhs_dilate: list[int], batch_group_count: int, feature_group_count: int

BinaryOp
    op: str, result_name: str, lhs: str, rhs: str, result_type: TensorType

ReturnOp
    values: list[str], types: list[TensorType]

FuncDef
    name: str, args: list[tuple[str, TensorType]], return_types: list[TensorType], body: list[Op]

Module
    functions: list[FuncDef]
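These are plain Python dataclasses. A minimal sketch of TensorType follows; the num_elements helper is illustrative and not necessarily part of ir.py.

```python
from dataclasses import dataclass

@dataclass
class TensorType:
    shape: list          # e.g. [1, 3, 4, 4]
    element_type: str    # e.g. "f32"

    def num_elements(self) -> int:
        # Total element count: the product of all dimensions.
        n = 1
        for d in self.shape:
            n *= d
        return n
```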

codegen.py

generate_c(module: Module) → str
    Generate complete C source from the IR module.

_generate_convolution(…​)
    Emit a 1×1-optimized or general 7-loop convolution.

_generate_binary_op(…​)
    Emit an element-wise for loop.
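To illustrate what _generate_binary_op produces, here is a sketch of emitting an element-wise C loop as a string; the exact variable naming and formatting in codegen.py may differ.

```python
def emit_binary_op(op, result, lhs, rhs, count):
    """Emit a C for-loop applying an element-wise binary op (illustrative)."""
    c_op = {"add": "+", "subtract": "-", "multiply": "*", "divide": "/"}[op]
    return (
        f"for (int i = 0; i < {count}; ++i) {{\n"
        f"  {result}[i] = {lhs}[i] {c_op} {rhs}[i];\n"
        f"}}\n"
    )

# emit_binary_op("add", "out", "a", "b", 16) yields:
#   for (int i = 0; i < 16; ++i) {
#     out[i] = a[i] + b[i];
#   }
```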

bazel_builder.py

write_generated_files(name, c_source) → (cc_path, build_path)
    Write the .cc + BUILD.bazel files to coralnpu/examples/generated/<name>/.

build_elf(name) → elf_path
    Invoke bazel build and return the ELF path.

simulator.py

run_elf(elf_path, input_data, output_symbols, output_sizes) → dict
    Load the ELF, write input data, run the simulator, and read output data and the cycle count.