Transform

A type-safe transformation operation that converts input of type I to output of type O.

Transforms are the building blocks of data preprocessing pipelines in SKaiNET. They can be composed together using the then infix function to create complex preprocessing chains while maintaining full type safety.

Design Principles

Type Safety: Generic types I and O ensure that transforms can only be composed when their types are compatible.
Composability: Transforms can be chained using then to create pipelines:
```
 val pipeline = loadImage then resize(224, 224) then toTensor then normalize
```
Shape Awareness: Each transform can compute its output shape from the input shape, enabling shape inference without executing the full pipeline.
Immutability: Transforms should be stateless and immutable. Any configuration should be provided at construction time.

Example Usage

```
// Define a simple scaling transform
class Scale(private val factor: Float) : Transform<Float, Float> {
    override fun apply(input: Float): Float = input * factor
    override fun getOutputShape(inputShape: Shape): Shape = inputShape
}

// Compose transforms
val pipeline = Scale(2.0f) then Scale(0.5f)
val result = pipeline.apply(10.0f)  // Returns 10.0f
```

Parameters

I: The input type this transform accepts

O: The output type this transform produces

Functions

apply

abstract fun apply(input: I): O

Applies this transformation to the given input.

centerCrop

fun <I> Transform<I, PlatformBitmapImage>.centerCrop(size: Int): Transform<I, PlatformBitmapImage>

Chains an ImageCenterCrop transform that extracts a centered square.
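
To make "centered square" concrete, here is a minimal sketch of how the crop window could be located. The `CropWindow` type and `centerCropWindow` helper are hypothetical and not part of SKaiNET; the rounding rule and the rejection of oversized crops are assumptions, and the library may behave differently.

```kotlin
// Hypothetical helper, not part of SKaiNET: locates the top-left corner of a
// centered size x size window inside a width x height image.
data class CropWindow(val left: Int, val top: Int, val size: Int)

fun centerCropWindow(width: Int, height: Int, size: Int): CropWindow {
    require(size in 1..minOf(width, height)) { "crop size must fit inside the image" }
    // Integer division floors, so any odd leftover pixel ends up on the right/bottom edge.
    val left = (width - size) / 2
    val top = (height - size) / 2
    return CropWindow(left, top, size)
}

fun main() {
    // prints CropWindow(left=208, top=128, size=224)
    println(centerCropWindow(width = 640, height = 480, size = 224))
}
```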

clamp

fun <I, T : DType, V> Transform<I, Tensor<T, V>>.clamp(ctx: ExecutionContext, min: Float, max: Float): Transform<I, Tensor<T, V>>

Chains a Clamp transform that restricts values to a range.

crop

fun <I> Transform<I, PlatformBitmapImage>.crop(top: Int = 0, bottom: Int = 0, left: Int = 0, right: Int = 0): Transform<I, PlatformBitmapImage>

Chains an ImageCrop transform that removes pixels from edges.

flatten

fun <I, T : DType, V> Transform<I, Tensor<T, V>>.flatten(ctx: ExecutionContext, startDim: Int = 0, endDim: Int = -1): Transform<I, Tensor<T, V>>

Chains a Flatten transform.
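
The `startDim` and `endDim` defaults suggest that a range of dimensions is collapsed into one. The sketch below shows the shape arithmetic under two assumptions that this page does not confirm: the range is inclusive, and negative indices count from the last dimension. `flattenShape` is a hypothetical helper that models a shape as a plain list of sizes.

```kotlin
// Hypothetical stand-in: a shape is modeled as a plain list of dimension sizes.
fun flattenShape(shape: List<Int>, startDim: Int = 0, endDim: Int = -1): List<Int> {
    // Assumed convention: negative indices count from the last dimension.
    fun resolve(dim: Int) = if (dim < 0) dim + shape.size else dim
    val start = resolve(startDim)
    val end = resolve(endDim)
    require(start in shape.indices && end in shape.indices && start <= end) {
        "invalid dimension range [$startDim, $endDim] for rank ${shape.size}"
    }
    // Dimensions start..end (inclusive) collapse into one whose size is their product.
    val merged = shape.subList(start, end + 1).fold(1) { acc, d -> acc * d }
    return shape.subList(0, start) + merged + shape.subList(end + 1, shape.size)
}

fun main() {
    println(flattenShape(listOf(2, 3, 4, 5)))                           // [120]
    println(flattenShape(listOf(2, 3, 4, 5), startDim = 1))             // [2, 60]
    println(flattenShape(listOf(2, 3, 4, 5), startDim = 1, endDim = 2)) // [2, 12, 5]
}
```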

getOutputShape

abstract fun getOutputShape(inputShape: Shape): Shape

Computes the output shape that would result from applying this transform to data with the given input shape.
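
As an illustration of shape inference without execution, the sketch below threads a shape through a list of steps by calling only `getOutputShape`. `ShapeStep`, `ResizeStep`, `AddBatchStep`, and `inferShape` are hypothetical names, and shapes are simplified to `List<Int>` rather than SKaiNET's `Shape` type.

```kotlin
// Hypothetical stand-ins so the sketch is self-contained; SKaiNET's real Shape
// and Transform types are richer than this.
interface ShapeStep {
    fun getOutputShape(inputShape: List<Int>): List<Int>
}

// Replaces the two trailing (height, width) dimensions, keeping the leading ones.
class ResizeStep(private val height: Int, private val width: Int) : ShapeStep {
    override fun getOutputShape(inputShape: List<Int>): List<Int> =
        inputShape.dropLast(2) + listOf(height, width)
}

// Prepends a batch dimension of size 1.
class AddBatchStep : ShapeStep {
    override fun getOutputShape(inputShape: List<Int>): List<Int> =
        listOf(1) + inputShape
}

// Threads a shape through every step; no pixel or tensor data is ever touched.
fun inferShape(inputShape: List<Int>, steps: List<ShapeStep>): List<Int> =
    steps.fold(inputShape) { shape, step -> step.getOutputShape(shape) }

fun main() {
    val steps = listOf(ResizeStep(224, 224), AddBatchStep())
    println(inferShape(listOf(3, 480, 640), steps)) // [1, 3, 224, 224]
}
```

Because only shapes are passed around, a mismatch can be caught before any image is loaded.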

normalize

fun <I, T : DType, V> Transform<I, Tensor<T, V>>.normalize(ctx: ExecutionContext, mean: FloatArray, std: FloatArray, channelAxis: Int = -1): Transform<I, Tensor<T, V>>

Chains a Normalize transform that applies channel-wise normalization.
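
Channel-wise normalization is commonly computed as (value - mean[c]) / std[c] for channel c. The sketch below assumes that formula and a flat, channel-last layout; `normalizeChannelsLast` is a hypothetical helper and does not reflect how SKaiNET stores tensors or interprets `channelAxis`.

```kotlin
// Hypothetical sketch on a flat FloatArray whose last axis is the channel axis
// (values interleaved as c0, c1, c2, c0, c1, c2, ...). Not SKaiNET's implementation.
fun normalizeChannelsLast(data: FloatArray, mean: FloatArray, std: FloatArray): FloatArray {
    require(mean.size == std.size) { "mean and std need one entry per channel" }
    require(mean.isNotEmpty() && data.size % mean.size == 0) {
        "data length must be a multiple of the channel count"
    }
    val channels = mean.size
    // Assumed formula: (value - mean[channel]) / std[channel].
    return FloatArray(data.size) { i ->
        val c = i % channels
        (data[i] - mean[c]) / std[c]
    }
}

fun main() {
    val pixels = floatArrayOf(0.5f, 1.0f, 0.0f, 1.0f, 0.5f, 0.5f) // two RGB pixels
    val half = floatArrayOf(0.5f, 0.5f, 0.5f)
    println(normalizeChannelsLast(pixels, mean = half, std = half).toList())
    // prints [0.0, 1.0, -1.0, 1.0, 0.0, 0.0]
}
```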

pad

fun <I> Transform<I, PlatformBitmapImage>.pad(top: Int = 0, bottom: Int = 0, left: Int = 0, right: Int = 0, red: Int = 0, green: Int = 0, blue: Int = 0): Transform<I, PlatformBitmapImage>

Chains an ImagePad transform that adds pixels to edges.

rescale

fun <I, T : DType, V> Transform<I, Tensor<T, V>>.rescale(ctx: ExecutionContext, scale: Float = 255.0f): Transform<I, Tensor<T, V>>

Chains a Rescale transform that divides values by a scale factor.

reshape

fun <I, T : DType, V> Transform<I, Tensor<T, V>>.reshape(ctx: ExecutionContext, vararg dims: Int): Transform<I, Tensor<T, V>>

Chains a Reshape transform using vararg dimensions.

fun <I, T : DType, V> Transform<I, Tensor<T, V>>.reshape(ctx: ExecutionContext, shape: Shape): Transform<I, Tensor<T, V>>

Chains a Reshape transform.

resize

fun <I> Transform<I, PlatformBitmapImage>.resize(width: Int, height: Int, interpolation: Interpolation = Interpolation.BILINEAR): Transform<I, PlatformBitmapImage>

Chains an ImageResize transform.

rotate

fun <I> Transform<I, PlatformBitmapImage>.rotate(degrees: Float, interpolation: Interpolation = Interpolation.BILINEAR): Transform<I, PlatformBitmapImage>

Chains an ImageRotate transform.

scaleAndShift

fun <I, T : DType, V> Transform<I, Tensor<T, V>>.scaleAndShift(ctx: ExecutionContext, scale: Float, offset: Float = 0.0f): Transform<I, Tensor<T, V>>

Chains a ScaleAndShift transform: output = input * scale + offset
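
A small worked example of the formula, applied to a plain `FloatArray` instead of a tensor. The free function below is an illustrative stand-in that only mirrors the documented arithmetic; it is not the library's transform.

```kotlin
// Plain-function sketch of the documented formula: output = input * scale + offset.
fun scaleAndShift(values: FloatArray, scale: Float, offset: Float = 0.0f): FloatArray =
    FloatArray(values.size) { i -> values[i] * scale + offset }

fun main() {
    // scale = 2 and offset = -1 map the range [0, 1] onto [-1, 1].
    val out = scaleAndShift(floatArrayOf(0.0f, 0.5f, 1.0f), scale = 2.0f, offset = -1.0f)
    println(out.toList()) // prints [-1.0, 0.0, 1.0]
}
```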

squeeze

fun <I, T : DType, V> Transform<I, Tensor<T, V>>.squeeze(ctx: ExecutionContext, dim: Int? = null): Transform<I, Tensor<T, V>>

Chains a Squeeze transform that removes size-1 dimensions.

then

open infix fun <N> then(next: Transform<O, N>): Transform<I, N>

Chains this transform with another transform, creating a pipeline.
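
One way such a composition could work is sketched below. The interface is a simplified stand-in (shapes are `List<Int>`), and the body of `then` is an assumption about a plausible default, not SKaiNET's source. `ParseInt` and `Halve` are hypothetical transforms used only to show the type checking.

```kotlin
// Hypothetical stand-in for the documented interface. Shape is simplified to
// List<Int>; the body of `then` is an assumption about how composition could work.
interface Transform<I, O> {
    fun apply(input: I): O
    fun getOutputShape(inputShape: List<Int>): List<Int>

    infix fun <N> then(next: Transform<O, N>): Transform<I, N> {
        val first = this
        return object : Transform<I, N> {
            // Run the first transform, then feed its output to the next one.
            override fun apply(input: I): N = next.apply(first.apply(input))

            // Shapes flow through the same two steps, without touching data.
            override fun getOutputShape(inputShape: List<Int>): List<Int> =
                next.getOutputShape(first.getOutputShape(inputShape))
        }
    }
}

class ParseInt : Transform<String, Int> {
    override fun apply(input: String): Int = input.trim().toInt()
    override fun getOutputShape(inputShape: List<Int>): List<Int> = inputShape
}

class Halve : Transform<Int, Double> {
    override fun apply(input: Int): Double = input / 2.0
    override fun getOutputShape(inputShape: List<Int>): List<Int> = inputShape
}

fun main() {
    val pipeline: Transform<String, Double> = ParseInt() then Halve()
    println(pipeline.apply(" 42 ")) // 21.0
    // Halve() then ParseInt() would not compile: Halve outputs Double, ParseInt expects String.
}
```

The output type `O` of the left-hand transform must match the input type of `next`, so an incompatible pairing is rejected at compile time.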

toTensor

fun <I> Transform<I, PlatformBitmapImage>.toTensor(ctx: ExecutionContext): Transform<I, Tensor<FP16, Float>>

Chains an ImageToTensor transform that converts the image to a tensor.

unsqueeze

fun <I, T : DType, V> Transform<I, Tensor<T, V>>.unsqueeze(ctx: ExecutionContext, dim: Int): Transform<I, Tensor<T, V>>

Chains an Unsqueeze transform that adds a dimension.
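
The shape effect of unsqueeze, and of the squeeze function documented earlier, can be sketched on plain lists. `unsqueezeShape` and `squeezeShape` are hypothetical helpers. The handling of negative indices, and treating a squeeze of a dimension whose size is not 1 as a no-op, are assumptions rather than documented SKaiNET behavior.

```kotlin
// Hypothetical shape helpers on List<Int>; SKaiNET's own Shape type is not used here.
fun unsqueezeShape(shape: List<Int>, dim: Int): List<Int> {
    // Assumed convention: a negative dim counts from the end of the output shape.
    val index = if (dim < 0) dim + shape.size + 1 else dim
    require(index in 0..shape.size) { "dim $dim out of range for rank ${shape.size}" }
    return shape.subList(0, index) + 1 + shape.subList(index, shape.size)
}

fun squeezeShape(shape: List<Int>, dim: Int? = null): List<Int> {
    // With no dim given, every size-1 dimension is dropped.
    if (dim == null) return shape.filter { it != 1 }
    val index = if (dim < 0) dim + shape.size else dim
    require(index in shape.indices) { "dim $dim out of range for rank ${shape.size}" }
    // Assumed convention: squeezing a dimension whose size is not 1 leaves the shape unchanged.
    return if (shape[index] == 1) shape.filterIndexed { i, _ -> i != index } else shape
}

fun main() {
    println(unsqueezeShape(listOf(3, 224, 224), dim = 0)) // [1, 3, 224, 224]
    println(squeezeShape(listOf(1, 3, 1, 224)))           // [3, 224]
    println(squeezeShape(listOf(1, 3, 1, 224), dim = 2))  // [1, 3, 224]
}
```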