QuantPolicy

Controls how quantized tensors are handled during weight loading.

Shared across all weight loaders (LLaMA, Gemma, etc.).

Entries

Keep quantized payloads as raw bytes (Int8 tensor) with quantized shape.

Dequantize to FP32 on load.

Mixed mode: load F32/F16/BF16 tensors as FP32, but keep quantized weight tensors (Q4_0, Q8_0, etc.) as raw bytes for native kernel consumption.
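The three behaviours above can be sketched as a single decision function. This is only an illustration under stated assumptions: the entry names are not shown on this page, so SketchPolicy, its constants, Storage, and storageFor are all hypothetical names, not the library's API. The page also does not say how the first policy treats non-quantized tensors, so the sketch returns null for that case rather than guessing.

```kotlin
// Hypothetical stand-in for QuantPolicy; the real entry names are not shown on this page.
enum class SketchPolicy { KEEP_RAW, DEQUANTIZE, MIXED }

// Hypothetical description of how a loaded tensor ends up being stored.
enum class Storage { RAW_BYTES, FP32 }

// Non-quantized source types, taken from the mixed-mode description.
val floatTypes = setOf("F32", "F16", "BF16")

// Returns the storage form for a tensor of the given source type,
// or null where this page does not specify the behaviour.
fun storageFor(policy: SketchPolicy, sourceType: String): Storage? = when (policy) {
    // Quantized payloads stay as raw bytes; float tensors are unspecified here.
    SketchPolicy.KEEP_RAW -> if (sourceType in floatTypes) null else Storage.RAW_BYTES
    // Everything is dequantized to FP32 on load.
    SketchPolicy.DEQUANTIZE -> Storage.FP32
    // Float tensors become FP32; quantized tensors stay raw for native kernels.
    SketchPolicy.MIXED -> if (sourceType in floatTypes) Storage.FP32 else Storage.RAW_BYTES
}

fun main() {
    println(storageFor(SketchPolicy.MIXED, "Q4_0"))      // RAW_BYTES
    println(storageFor(SketchPolicy.MIXED, "F16"))       // FP32
    println(storageFor(SketchPolicy.DEQUANTIZE, "Q8_0")) // FP32
}
```

Keeping the decision in one exhaustive when expression means the compiler flags any loader that forgets a policy if a new entry is added later.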

Properties

entries

Returns a representation of an immutable list of all enum entries, in the order they're declared.

Functions

valueOf

Returns the enum constant of this type with the specified name. The string must match exactly an identifier used to declare an enum constant in this type. (Extraneous whitespace characters are not permitted.)

values

Returns an array containing the constants of this enum type, in the order they're declared.
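The generated enum members described above behave the same on any Kotlin enum, so a small stand-in is enough to show them. DemoPolicy and its constants are hypothetical (the real QuantPolicy entry names are not shown on this page), and parsePolicy is an illustrative helper, not part of the library. The entries property assumes Kotlin 1.9 or later.

```kotlin
// Hypothetical stand-in enum; not the real QuantPolicy entry names.
enum class DemoPolicy { KEEP_RAW, DEQUANTIZE, MIXED }

// Illustrative helper: valueOf throws IllegalArgumentException when the name
// does not match a constant exactly, so wrap it to get a nullable result.
fun parsePolicy(name: String): DemoPolicy? =
    try {
        DemoPolicy.valueOf(name)
    } catch (e: IllegalArgumentException) {
        null
    }

fun main() {
    // entries: immutable list in declaration order (Kotlin 1.9+).
    println(DemoPolicy.entries)        // [KEEP_RAW, DEQUANTIZE, MIXED]
    // values(): a fresh array of the constants on each call.
    println(DemoPolicy.values().size)  // 3
    println(parsePolicy("MIXED"))      // MIXED
    // Extraneous whitespace is not permitted, so this lookup fails.
    println(parsePolicy(" MIXED"))     // null
}
```

When reading a policy name from a config file or command-line flag, a wrapper like this avoids an uncaught exception on a typo. Prefer entries over values() in new code, since it does not allocate a new array on every call.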