Audio

class Audio<T : DType, V>

Audio wrapper that combines tensor data with audio-specific metadata.

This class provides a type-safe representation of audio data for use in data processing pipelines. It wraps an underlying tensor and tracks sample rate, channel layout, and other audio properties.

Example:

val audio = Audio.fromTensor(tensor, 16000, ChannelLayout.MONO)
println("Duration: ${audio.duration} seconds")
println("Samples: ${audio.sampleCount}")

Parameters

T

The DType of the underlying tensor

V

The value type of tensor elements

Types

object Companion

Properties

Batch size (1 if not batched).

Number of audio channels (1 for mono, 2 for stereo, etc.).

Duration of the audio in seconds.

Duration in milliseconds.

Whether this audio has a batch dimension.

Whether this is mono (single channel) audio.

Whether this is stereo (two channel) audio.

Memory layout of the audio data.

Number of audio samples (per channel).

Sample rate in Hz (samples per second).

The shape of the underlying tensor.

val tensor: Tensor<T, V>

The underlying tensor data containing audio samples.
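
The derived properties above can be illustrated with a small sketch. It assumes that the duration is the per-channel sample count divided by the sample rate, and that mono and stereo correspond to channel counts of 1 and 2. `AudioInfo` is a hypothetical stand-in written for this example, not the library's `Audio` class, and the numeric types of the real properties may differ.

```kotlin
// Hypothetical stand-in for the metadata that Audio tracks; not the library's class.
data class AudioInfo(val sampleCount: Int, val sampleRate: Int, val channels: Int) {
    // Assumption: duration in seconds = samples per channel / samples per second.
    val duration: Double get() = sampleCount.toDouble() / sampleRate

    // Assumption: milliseconds are derived from the duration in seconds.
    val durationMs: Double get() = duration * 1000.0

    // Mono is one channel, stereo is two, as described in the property list.
    val isMono: Boolean get() = channels == 1
    val isStereo: Boolean get() = channels == 2
}

fun main() {
    // 32000 samples per channel at 16000 Hz, single channel.
    val info = AudioInfo(sampleCount = 32000, sampleRate = 16000, channels = 1)
    println("Duration: ${info.duration} seconds") // Duration: 2.0 seconds
    println("Duration: ${info.durationMs} ms")    // Duration: 2000.0 ms
    println("Mono: ${info.isMono}, stereo: ${info.isStereo}") // Mono: true, stereo: false
}
```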

Functions

open override fun toString(): String

fun withLayout(newLayout: ChannelLayout): Audio<T, V>

Creates a copy with a different layout. Only the metadata changes; the underlying tensor data is left as-is.

Link copied to clipboard
fun withSampleRate(newSampleRate: Int): Audio<T, V>

Creates a copy with a different sample rate. Only the metadata changes; the samples are not resampled.
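
A metadata-only change of sample rate has a practical consequence: the samples stay the same, so the reported duration changes. The sketch below shows this with a hypothetical `AudioMeta` stand-in (not the library's `Audio` class), assuming that duration is the sample count divided by the sample rate.

```kotlin
// Hypothetical stand-in used only for illustration; not the library's Audio class.
data class AudioMeta(val sampleCount: Int, val sampleRate: Int) {
    // Assumption: duration in seconds = samples per channel / sample rate.
    val duration: Double get() = sampleCount.toDouble() / sampleRate

    // Metadata-only update: the sample count is untouched, only the rate label changes.
    fun withSampleRate(newSampleRate: Int): AudioMeta = copy(sampleRate = newSampleRate)
}

fun main() {
    val original = AudioMeta(sampleCount = 48000, sampleRate = 16000)
    val relabeled = original.withSampleRate(48000)

    println(original.duration)   // 3.0
    println(relabeled.duration)  // 1.0
    println(relabeled.sampleCount == original.sampleCount) // true
}
```

If the goal is to keep the duration and change the rate, the samples themselves would need to be resampled in a separate step before the metadata is updated.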