
Audio

class Audio<T : DType, V>

Audio wrapper that combines tensor data with audio-specific metadata.

This class provides a type-safe representation of audio data for use in data processing pipelines. It wraps an underlying tensor and tracks sample rate, channel layout, and other audio properties.

Example:

val audio = Audio.fromTensor(tensor, 16000, ChannelLayout.MONO)
println("Duration: ${audio.duration} seconds")
println("Samples: ${audio.sampleCount}")

Parameters

T

The DType of the underlying tensor

V

The value type of tensor elements

Types

object Companion

Properties

val batchSize: Int

Batch size (1 if not batched).

val channelCount: Int

Number of audio channels (1 for mono, 2 for stereo, etc.).

val duration: Double

Duration of the audio in seconds.

val durationMs: Double

Duration in milliseconds.

val isBatched: Boolean

Whether this audio has a batch dimension.

val isMono: Boolean

Whether this is mono (single channel) audio.

val isStereo: Boolean

Whether this is stereo (two channel) audio.

val layout: ChannelLayout

Memory layout of the audio data.

val sampleCount: Int

Number of audio samples (per channel).

val sampleRate: Int

Sample rate in Hz (samples per second).

val shape: Shape

The shape of the underlying tensor.

val tensor: Tensor<T, V>

The underlying tensor data containing audio samples.
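The derived properties above follow from the basic metadata by simple arithmetic. The sketch below illustrates those relationships with a hypothetical stand-in class, `AudioInfo`, which is not part of the library. It assumes that `duration` is the per-channel sample count divided by the sample rate and that `isMono` and `isStereo` test the channel count; the actual implementation may differ.

```kotlin
// Hypothetical stand-in for the metadata that Audio exposes; NOT the library's class.
data class AudioInfo(
    val sampleRate: Int,   // Hz (samples per second)
    val sampleCount: Int,  // samples per channel
    val channelCount: Int
) {
    // Assumed relationship: seconds = samples per channel / samples per second.
    val duration: Double get() = sampleCount.toDouble() / sampleRate

    // Milliseconds are the same quantity scaled by 1000.
    val durationMs: Double get() = duration * 1000.0

    val isMono: Boolean get() = channelCount == 1
    val isStereo: Boolean get() = channelCount == 2
}

fun main() {
    // 32000 samples at 16 kHz, one channel.
    val info = AudioInfo(sampleRate = 16000, sampleCount = 32000, channelCount = 1)
    println("Duration: ${info.duration} s (${info.durationMs} ms), mono=${info.isMono}")
}
```

Converting `sampleCount` to `Double` before dividing avoids integer truncation when the sample count is not a multiple of the sample rate.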

Functions

open override fun toString(): String

fun withLayout(newLayout: ChannelLayout): Audio<T, V>

Creates a copy with a different layout. Only the metadata changes; the tensor data is left as-is.

fun withSampleRate(newSampleRate: Int): Audio<T, V>

Creates a copy with a different sample rate. Only the metadata changes; the samples are not resampled.
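Because a metadata-only change relabels the same samples, the reported duration changes even though no audio data is touched. The sketch below illustrates this with a hypothetical stand-in class, `AudioSketch`, that models the tensor as a plain `FloatArray`. Whether the real `Audio` shares or copies its tensor is not stated in the documentation, so the buffer sharing shown here is an assumption.

```kotlin
// Hypothetical stand-in; the real Audio wraps a Tensor<T, V>, modelled here as a FloatArray.
class AudioSketch(val samples: FloatArray, val sampleRate: Int) {
    val sampleCount: Int get() = samples.size

    // Assumed relationship: seconds = samples / samples per second.
    val duration: Double get() = sampleCount.toDouble() / sampleRate

    // Metadata-only copy: the sample buffer is reused (an assumption), nothing is resampled.
    fun withSampleRate(newSampleRate: Int): AudioSketch =
        AudioSketch(samples, newSampleRate)
}

fun main() {
    val original = AudioSketch(FloatArray(16000), sampleRate = 16000)
    val relabeled = original.withSampleRate(8000)

    // Same number of samples, but a different interpretation of their timing.
    println("original:  ${original.sampleCount} samples, ${original.duration} s")
    println("relabeled: ${relabeled.sampleCount} samples, ${relabeled.duration} s")
}
```

To change the rate while preserving the audible content, a separate resampling step would be needed; relabeling alone changes playback speed and pitch.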