Tool Calling with Any Model
This tutorial shows how to use ChatSession to add tool calling to any model runtime, not just kllama.
How Tool Calling Works
The tool calling pipeline is decoupled from the model runtime: any model that implements InferenceRuntime and has a Tokenizer can use tool calling.
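To see why a runtime plus a tokenizer is enough, here is a minimal sketch of text generation built on two small interfaces. `SketchRuntime`, `SketchTokenizer`, and `generateText` are hypothetical names invented for this illustration; the library's actual InferenceRuntime and Tokenizer interfaces are not shown in this tutorial and will likely differ.

```kotlin
// Hypothetical minimal interfaces for illustration only; the library's real
// InferenceRuntime and Tokenizer are assumed to be richer than this.
interface SketchRuntime {
    // Returns the next token id given everything generated so far.
    fun nextToken(context: List<Int>): Int
}

interface SketchTokenizer {
    val eosId: Int
    fun encode(text: String): List<Int>
    fun decode(ids: List<Int>): String
}

// Text-in, text-out generation that depends only on the two interfaces above.
fun generateText(
    runtime: SketchRuntime,
    tokenizer: SketchTokenizer,
    prompt: String,
    maxTokens: Int
): String {
    val context = tokenizer.encode(prompt).toMutableList()
    val produced = mutableListOf<Int>()
    while (produced.size < maxTokens) {
        val id = runtime.nextToken(context)
        if (id == tokenizer.eosId) break   // stop at end-of-sequence
        context += id
        produced += id
    }
    return tokenizer.decode(produced)
}
```

Because the tool calling layer only needs text in and text out, anything above `generateText` (chat templates, tool call parsing, tool execution) can stay independent of how the model itself is implemented.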
Step 1: Create a ChatSession
val session = ChatSession(
    runtime = myRuntime,       // any InferenceRuntime<T>
    tokenizer = myTokenizer,   // any Tokenizer
    metadata = ModelMetadata(family = "qwen", architecture = "qwen3")
)
The ChatSession auto-detects the right chat template from ModelMetadata.
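The detection logic itself is not shown in this tutorial. As a rough sketch of how metadata-based selection could work, the following maps a family or architecture string to a template. `SketchMetadata`, `SketchTemplate`, and `detectTemplate` are hypothetical names, and both the matching rules and the ChatML fallback are assumptions rather than the library's documented behavior.

```kotlin
// Hypothetical stand-ins; these are NOT the library's real types.
data class SketchMetadata(val family: String?, val architecture: String?)

enum class SketchTemplate { QWEN, LLAMA3, GEMMA, CHATML }

// Match on family and architecture together, case-insensitively.
// Falling back to ChatML is an assumed default for this sketch.
fun detectTemplate(meta: SketchMetadata): SketchTemplate {
    val key = listOfNotNull(meta.family, meta.architecture)
        .joinToString(" ")
        .lowercase()
    return when {
        "qwen" in key -> SketchTemplate.QWEN
        "llama" in key -> SketchTemplate.LLAMA3
        "gemma" in key -> SketchTemplate.GEMMA
        else -> SketchTemplate.CHATML
    }
}
```

With this sketch, `detectTemplate(SketchMetadata("qwen", "qwen3"))` selects `SketchTemplate.QWEN`, which mirrors the metadata passed to ChatSession in Step 1.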
Step 2: Run a Single Tool Calling Round
val tools = listOf(myCalculatorTool, myFilesTool)
val response = session.runSingleTurn(
    prompt = "What is 2 + 2?",
    tools = tools,
    maxTokens = 256,
    temperature = 0.7f
)
println(response) // model output, for example "2 + 2 = 4"
Step 3: Build a Multi-Turn Agent
val registry = ToolRegistry()
registry.register(CalculatorTool())
registry.register(ListFilesTool())

val agentLoop = session.createAgentLoop(registry, maxTokens = 512)

val messages = mutableListOf(
    ChatMessage(role = ChatRole.SYSTEM, content = "You are a helpful assistant."),
    ChatMessage(role = ChatRole.USER, content = "List files in /tmp and count them")
)

val response = agentLoop.runWithEncoder(
    messages = messages,
    encode = { session.encode(it) }
)
The agent loop automatically:
- Formats the conversation using the chat template
- Generates tokens until EOS
- Parses tool calls from the output
- Executes tools and appends results
- Repeats until no more tool calls or max rounds reached
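The steps above can be sketched as a small standalone loop. This is a toy illustration, not the library's implementation: `Msg`, `Call`, `parseCalls`, and `runLoop` are hypothetical names, the `generate` callback stands in for template formatting plus decoding until EOS, and the `<tool_call>name:argument</tool_call>` wire format is a simplification assumed here in place of the JSON payloads that real chat templates use.

```kotlin
// Toy stand-ins for illustration only; none of these are the library's types.
data class Msg(val role: String, val content: String)
data class Call(val name: String, val argument: String)

// Assumed toy wire format: <tool_call>name:argument</tool_call>
val callPattern = Regex("<tool_call>(\\w+):(.*?)</tool_call>")

fun parseCalls(output: String): List<Call> =
    callPattern.findAll(output)
        .map { Call(it.groupValues[1], it.groupValues[2]) }
        .toList()

fun runLoop(
    history: MutableList<Msg>,
    generate: (List<Msg>) -> String,   // formatting + generation until EOS
    tools: Map<String, (String) -> String>,
    maxRounds: Int = 4
): String {
    var last = ""
    repeat(maxRounds) {
        last = generate(history)
        history += Msg("assistant", last)
        val calls = parseCalls(last)
        if (calls.isEmpty()) return last   // no tool calls: final answer
        for (call in calls) {
            // Unknown tools are reported back to the model as an error string.
            val result = tools[call.name]?.invoke(call.argument)
                ?: "Error: unknown tool ${call.name}"
            history += Msg("tool", result)
        }
    }
    return last                            // round limit reached
}
```

Driving `runLoop` with a scripted `generate` function that first emits a tool call and then a plain answer is a convenient way to test the control flow without loading a model. The round limit guards against a model that keeps requesting tools indefinitely.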
Step 4: Implement a Custom Tool
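// JSON helpers come from kotlinx.serialization; Tool and ToolDefinition come
// from the library (their package is not shown in this tutorial).
import kotlinx.serialization.json.JsonObject
import kotlinx.serialization.json.JsonPrimitive
import kotlinx.serialization.json.buildJsonObject
import kotlinx.serialization.json.jsonPrimitive
import kotlinx.serialization.json.put
import kotlinx.serialization.json.putJsonArray
import kotlinx.serialization.json.putJsonObject

class WeatherTool : Tool {
    // JSON Schema describing the tool's single required argument.
    override val definition = ToolDefinition(
        name = "get_weather",
        description = "Get current weather for a city",
        parameters = buildJsonObject {
            put("type", "object")
            putJsonObject("properties") {
                putJsonObject("city") {
                    put("type", "string")
                    put("description", "City name")
                }
            }
            putJsonArray("required") { add(JsonPrimitive("city")) }
        }
    )

    // Errors are returned as strings so the model can see them.
    override fun execute(arguments: JsonObject): String {
        val city = arguments["city"]?.jsonPrimitive?.content
            ?: return "Error: missing city"
        return "Weather in $city: 22C, sunny" // placeholder result
    }
}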
Supported Chat Templates
Tool calling support is auto-detected from model metadata:
| Family | Template | Format |
|---|---|---|
| Qwen2/3 | | JSON in |
| LLaMA 3 | | JSON in |
| Gemma | | Gemma-specific format |
| ChatML/Hermes | | JSON in |