CLI Reference
skainet (Unified CLI)
skainet -m <model.gguf> [options] [prompt]
Options
| Flag | Default | Description |
|---|---|---|
| `-m` | (required) | Path to the model file (`.gguf`) |
| `-s` | | Number of tokens to generate |
| `-k` | | Sampling temperature (0 = greedy) |
| `--chat` | | Interactive multi-turn chat mode |
| `--agent` | | Interactive agent mode with tool calling |
| `--demo` | | Tool calling demo (interactive or single-shot with prompt) |
| `--template=NAME` | (auto) | Force chat template |
| | | Cap context length (reduces memory) |
| | | Show help text |
Examples
# Text generation
skainet -m model.gguf "The meaning of life is"
# Chat
skainet -m model.gguf --chat
# Tool calling demo (interactive)
skainet -m model.gguf --demo
# Tool calling demo (single-shot, for testing)
skainet -m model.gguf --demo "What is 2 + 2?"
# Low temperature, more tokens
skainet -m model.gguf -s 128 -k 0.3 "Explain quantum computing"
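The single-shot form above composes naturally in scripts. The following is a minimal sketch, not part of skainet itself: it assumes `skainet` is on `PATH`, and the names `SKAINET_BIN` and `run_prompts` are hypothetical. It uses only the `-m`, `-s`, and `-k` flags shown in the examples.

```shell
#!/bin/sh
# Sketch: run one single-shot generation per line of input.
# SKAINET_BIN and run_prompts are hypothetical names, not part of skainet.
SKAINET_BIN="${SKAINET_BIN:-skainet}"

run_prompts() {
  # $1 = model path (-m), $2 = number of tokens (-s), $3 = temperature (-k)
  # Prompts are read from standard input, one per line; blank lines are skipped.
  while IFS= read -r prompt || [ -n "$prompt" ]; do
    [ -n "$prompt" ] || continue
    # Redirect stdin so the child process cannot consume the remaining prompts.
    "$SKAINET_BIN" -m "$1" -s "$2" -k "$3" "$prompt" </dev/null
  done
}
```

For example, `run_prompts model.gguf 128 0 < prompts.txt` would generate 128 tokens per prompt with greedy sampling. Setting `SKAINET_BIN=echo` turns the helper into a dry run that prints each argument list instead of executing it.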
kllama
kllama -m <model> [-t <tokenizer>] [-s <steps>] [-k <temp>] [-p <systemprompt>] [--chat] [--agent] [--demo] [--template=NAME] <prompt>
Same options as skainet, plus:
| Flag | Description |
|---|---|
| `-t <tokenizer>` | Path to external tokenizer file (auto-detected for GGUF) |
| `-p <systemprompt>` | System prompt prepended to user message |
| | Force compute backend |
| | List available compute backends and exit |
Supports .gguf, .safetensors, and .bin (Karpathy) model formats.
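A wrapper script can reject unsupported files before launching kllama. The sketch below is an illustration under assumptions: `is_supported_model` is a hypothetical helper, and it inspects only the file extension (case-sensitively), so it cannot confirm that a `.bin` file is actually in the Karpathy format.

```shell
#!/bin/sh
# Sketch: check a model path against the formats listed above.
# is_supported_model is a hypothetical helper, not part of kllama.
is_supported_model() {
  case "$1" in
    *.gguf|*.safetensors|*.bin) return 0 ;;  # extensions named in this reference
    *) return 1 ;;                           # anything else is rejected
  esac
}
```

A caller might guard the invocation with `is_supported_model "$model" && kllama -m "$model" -p "You are concise." "Hello"`, using the `-p` system prompt flag from the table above.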
Gradle Tasks
| Task | Description |
|---|---|
| | Unified CLI (auto-detects architecture) |
| | LLaMA/Qwen/Mistral CLI |
| | Qwen CLI (basic generation) |
| | Gemma CLI |
| | Apertus CLI |
| | Voxtral TTS CLI |
| | BERT embeddings CLI |