SKaiNET

Vision

SKaiNET aims to democratize "Edge AI / On-device AI" by bridging the gap between high-level application development and low-level hardware optimization. We believe AI should be portable, type-safe, and developer-friendly, enabling seamless intelligence in everything from mobile apps to IoT devices without sacrificing performance.

For architecture details see ARCHITECTURE.md.

Quickstart

Add the core dependencies (Gradle Kotlin DSL):

dependencies {
    implementation("sk.ainet.core:SKaiNET-lang-core:0.19.0")
    implementation("sk.ainet.core:SKaiNET-backend-cpu:0.19.0")
}

Hello Neural Net

val model = nn {
    input(28 * 28)
    dense(out = 128)
    relu()
    dense(out = 10)
}

Core Tensor Ops

val a = tensor(shape(2, 2)) { float(1f, 2f, 3f, 4f) }
val b = tensor(shape(2, 2)) { float(5f, 6f, 7f, 8f) }

val c = a matMul b
val d = c.relu()
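As a sanity check, the product above can be reproduced in plain Kotlin, independent of SKaiNET (helper names here are illustrative, not library API):

```kotlin
// Hand-rolled 2x2 matrix multiply and ReLU, row-major layout.
// [[1,2],[3,4]] x [[5,6],[7,8]] = [[19,22],[43,50]]; ReLU keeps positives.
fun matMul2x2(a: FloatArray, b: FloatArray): FloatArray = floatArrayOf(
    a[0] * b[0] + a[1] * b[2], a[0] * b[1] + a[1] * b[3],
    a[2] * b[0] + a[3] * b[2], a[2] * b[1] + a[3] * b[3],
)

fun relu(x: FloatArray): FloatArray = FloatArray(x.size) { maxOf(0f, x[it]) }
```

Row by row: 1·5 + 2·7 = 19, 1·6 + 2·8 = 22, 3·5 + 4·7 = 43, 3·6 + 4·8 = 50.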

GGUF Model Loading

// Recommended: streaming reader — memory-efficient, supports quantized types
val source = JvmRandomAccessSource.open("model.gguf")
StreamingGGUFReader.open(source).use { reader ->
    println("Tensors: ${reader.tensorCount}")

    // Load a specific tensor on demand (no whole-file loading)
    val bytes = reader.loadTensor("token_embd.weight")

    // Or get a TensorStorage descriptor with encoding/placement metadata
    val storage = reader.loadTensorStorage("token_embd.weight")
}
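For context, a GGUF file opens with a small fixed header that any reader parses before the metadata and tensor sections. The sketch below (plain Kotlin, not SKaiNET code) follows the published GGUF layout: the magic bytes "GGUF", a u32 version, then u64 tensor and metadata-KV counts, all little-endian:

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Illustrative parser for the fixed GGUF file header (little-endian):
// magic "GGUF" | version: u32 | tensor_count: u64 | metadata_kv_count: u64
data class GgufHeader(val version: Int, val tensorCount: Long, val metadataKvCount: Long)

fun parseGgufHeader(bytes: ByteArray): GgufHeader {
    val buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN)
    val magic = ByteArray(4).also { buf.get(it) }
    require(magic.decodeToString() == "GGUF") { "not a GGUF file" }
    return GgufHeader(buf.int, buf.long, buf.long)
}
```

Everything after this header (metadata key-value pairs, tensor infos, then aligned tensor data) is what the streaming reader walks lazily.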

More examples: SKaiNET-examples | SKaiNET-notebook

Ecosystem

SKaiNET is a modular ecosystem. While this repository contains the core engine, specialized high-level libraries are maintained in standalone repositories:

Project               Description
SKaiNET-LLM           Llama, Gemma, and BERT inference runtimes
SKaiNET-transformers  Pre-built transformer architectures and layers
SKaiNET-examples      Sample projects and integration demos

Explore

Goal                          Start here
Examples and sample projects  SKaiNET-examples
Interactive notebooks         SKaiNET-notebook
LLM inference (Llama, Gemma)  SKaiNET-LLM

Features

Kotlin Multiplatform

  • Targets: JVM, macOS (Native), JS, WASM (Browser + WasmWasi)

  • Single codebase shared across all platforms via Kotlin Multiplatform

Optimized Execution

  • ComputeGraphExecutor: Optimized engine with fusion passes and trace-to-DAG bridging.

  • SDPA & Gather: High-performance Scaled Dot-Product Attention and indexing operations.

  • TurboQuant: Runtime KV-cache compression (~8x at 4-bit) for long-context LLM inference. Presets: safe-lowbit, balanced, experimental-max. See TurboQuantUsage for the integration guide.
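To see where the ~8x figure comes from, here is a minimal symmetric 4-bit quantizer in plain Kotlin. This is a generic sketch of the technique, not TurboQuant's actual algorithm: each 32-bit float becomes a 4-bit code in [-8, 7] plus one shared per-block scale, so storage shrinks roughly 8x (ignoring the scale overhead):

```kotlin
import kotlin.math.abs
import kotlin.math.roundToInt

// Generic symmetric 4-bit block quantization sketch (NOT TurboQuant itself).
// Each value is stored as round(x / scale), clamped to the 4-bit range.
fun quantize4bit(block: FloatArray): Pair<Float, IntArray> {
    val scale = (block.maxOf { abs(it) } / 7f).takeIf { it > 0f } ?: 1f
    val codes = IntArray(block.size) { (block[it] / scale).roundToInt().coerceIn(-8, 7) }
    return scale to codes
}

fun dequantize4bit(scale: Float, codes: IntArray): FloatArray =
    FloatArray(codes.size) { codes[it] * scale }
```

The round trip is lossy but bounded by half a quantization step, which is the trade-off any low-bit KV-cache scheme makes for memory.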

Agentic AI Infrastructure

  • ComputeGraph: Unified framework for defining agentic workflows and tool-calling loops.

  • Java facade: JavaAgentLoop (in skainet-lang-java)

Neural Network DSL

  • Sequential: nn { input(); dense(); relu(); dense() }

  • DAG / Graph: arbitrary wiring with dag { } for ResNet, YOLO-style architectures

  • Layers: Dense, Conv1d/2d/3d, MaxPool, AvgPool, BatchNorm, Dropout, LeakyReLU, ELU

  • KAN (Kolmogorov–Arnold Networks) layer (experimental)

  • Autograd engine with reverse-mode gradients, SGD and Adam/AdamW optimizers
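As a concept refresher (plain Kotlin, not the SKaiNET optimizer API), the SGD update the optimizers above perform is simply w -= lr * dL/dw, applied repeatedly:

```kotlin
// Gradient descent on L(w) = (w - 3)^2, whose gradient is 2(w - 3).
// The minimizer is w = 3; each step moves w a fraction lr of the way there.
fun sgdMinimize(start: Double, lr: Double, steps: Int): Double {
    var w = start
    repeat(steps) {
        val grad = 2 * (w - 3.0)  // dL/dw, here computed analytically
        w -= lr * grad
    }
    return w
}
```

In SKaiNET the gradient would come from the reverse-mode autograd engine instead of a hand-derived formula, but the update rule is the same.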

Data and I/O

  • Built-in loaders: MNIST, Fashion-MNIST, CIFAR-10

  • Formats: GGUF, ONNX, SafeTensors, JSON, Image (JPEG, PNG)

  • Type-safe transform DSL: resize, crop, normalize, toTensor
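For intuition, the normalize step of such a pipeline is just a per-value affine rescale, (x - mean) / std. A plain-Kotlin sketch (the MNIST statistics below are the commonly cited ones, used purely as an example):

```kotlin
// Channel-wise normalization: shift by the dataset mean, divide by std.
// E.g. the widely used MNIST values are mean = 0.1307, std = 0.3081.
fun normalize(pixels: FloatArray, mean: Float, std: Float): FloatArray =
    FloatArray(pixels.size) { (pixels[it] - mean) / std }
```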

Java 21+ Support

  • SKaiNET entry point, TensorJavaOps, builder-pattern model definition

  • Maven BOM (sk.ainet:skainet-bom) for one-line version management

Edge AI: Arduino / C99 Export

  • Export trained models to standalone, optimized C99 with static memory allocation

  • Ready-to-use Arduino library output

Compiler: MLIR / StableHLO

  • Lower Kotlin DSL to MLIR StableHLO dialect

  • Optimization passes: constant folding, operation fusion, dead code elimination

  • Valid IREE-compilable output with streaming API and public HloGenerator
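To illustrate what a constant-folding pass does (a toy Kotlin model, not the MLIR implementation): subtrees whose operands are all compile-time constants are evaluated once, so no runtime op is emitted for them:

```kotlin
// Tiny expression IR with a recursive constant-folding pass.
sealed interface Expr
data class Const(val v: Double) : Expr
data class Input(val name: String) : Expr
data class Add(val l: Expr, val r: Expr) : Expr
data class Mul(val l: Expr, val r: Expr) : Expr

fun fold(e: Expr): Expr = when (e) {
    is Const, is Input -> e
    is Add -> {
        val l = fold(e.l); val r = fold(e.r)
        if (l is Const && r is Const) Const(l.v + r.v) else Add(l, r)
    }
    is Mul -> {
        val l = fold(e.l); val r = fold(e.r)
        if (l is Const && r is Const) Const(l.v * r.v) else Mul(l, r)
    }
}
```

For example, x + (2 * 3) folds to x + 6: the multiply disappears, the data-dependent add stays.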

What's New in 0.19.0

  • Qwen / GPT-2 Byte-Level BPE Tokenizer — Full GPT-2-style pipeline (byte-to-unicode, pretokenization regex, merge-rank BPE, atomic special-token splitting). Builds from GGUF metadata or HuggingFace tokenizer.json; verified against Qwen2.5-0.5B reference token IDs.

  • LLaMA / SentencePiece Tokenizer — llama.cpp SPM pipeline with whitespace escape, score-priority BPE (SPM rule, opposite of GPT-2 merge-rank), and <0xNN> byte fallback. Builds from GGUF (tokenizer.ggml.model == "llama") and HuggingFace Unigram tokenizer.json.

  • TokenizerFactory Per-Architecture Dispatch — Tokenizer selection is now per-architecture, not per file format. Qwen/GPT-2 → byte-level BPE, LLaMA/Gemma/TinyLlama → SentencePiece, regardless of whether weights come from GGUF or SafeTensors.

  • Byte-Level BPE Fix for Qwen/GPT-2 — Previously these models encoded text into garbage tokens because GgufModelMetadata ignored tokenizer.ggml.merges entirely, blocking chat mode and tool calling. (#463)

  • LLaMA GGUF Tokenization Fix — TokenizerFactory previously threw UnsupportedTokenizerException for LLaMA-family GGUFs; the new SentencePiece path closes that gap. (#464)

  • GGUF UInt Field Fix — UINT32 fields (e.g. tokenizer.ggml.bos_token_id) are Kotlin UInt value classes, not subclasses of Number, and were silently dropped by as? Number casts. Fixed via a toIntFlexible helper that handles every signed and unsigned numeric type GGUF can produce.
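The UInt pitfall above is easy to reproduce in plain Kotlin: UInt is an unsigned value class that does not extend Number, so an `as? Number` cast silently yields null. The helper below is a sketch in the spirit of the described fix, not SKaiNET's actual toIntFlexible implementation:

```kotlin
// Kotlin's unsigned types (UInt, ULong, UShort, UByte) are value classes,
// not java.lang.Number subtypes, so `as? Number` drops them. A flexible
// conversion has to match them explicitly (illustrative sketch):
fun toIntFlexible(value: Any?): Int? = when (value) {
    is Number -> value.toInt()
    is UInt -> value.toInt()
    is ULong -> value.toInt()
    is UShort -> value.toInt()
    is UByte -> value.toInt()
    else -> null
}
```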

See CHANGELOG.md for the full release history.
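The merge-rank policy behind the GPT-2-style tokenizer above can be sketched with a toy vocabulary (plain Kotlin, not the shipped tokenizer): repeatedly merge the adjacent pair with the lowest merge rank until no ranked pair remains. SentencePiece-style BPE differs only in the selection rule, preferring the merge whose result has the highest score:

```kotlin
// Toy merge-rank BPE: at each step, apply the adjacent pair with the
// LOWEST rank in the merge table (GPT-2 convention), then repeat.
fun bpeMergeRank(tokens: List<String>, ranks: Map<Pair<String, String>, Int>): List<String> {
    var toks = tokens
    while (true) {
        val best = toks.zipWithNext()
            .filter { it in ranks }
            .minByOrNull { ranks.getValue(it) } ?: break
        val merged = mutableListOf<String>()
        var i = 0
        while (i < toks.size) {
            if (i + 1 < toks.size && (toks[i] to toks[i + 1]) == best) {
                merged += toks[i] + toks[i + 1]; i += 2
            } else {
                merged += toks[i]; i++
            }
        }
        toks = merged
    }
    return toks
}
```

With merges {("l","o") → rank 0, ("lo","w") → rank 1}, the input ["l","o","w"] collapses to ["low"] in two steps.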

Roadmap

  • Q1 2026: Comprehensive documentation ✅

  • Q2 2026: TurboQuant KV-cache compression ✅ (shipped in 0.18.0); Qwen/LLaMA tokenizers ✅ (shipped in 0.19.0)

  • Q3 2026: Agentic AI enhancements ✅ (tool calling shipped in 0.13.0; ongoing)

  • Q4 2026: Federated learning support for multi-device training

Contributing & Community

We love contributions! Whether it's a new operator, documentation, or a bug fix:

  1. Read our CONTRIBUTING.md.

  2. Check the Good First Issues.

  3. Open a discussion or issue on GitHub.

Browse the full codebase documentation on DeepWiki.

Contributors (0.14.0)

  • Dhia Chemingui (@dhiaspaner) — Android KMP plugin migration (#385, #386)

License

MIT — see LICENCE.
