Package-level declarations
Types
class SentencePieceTokenizer(tokens: List<String>, scores: List<Float>, val unknownTokenId: Int? = null, val bosTokenId: Int? = null, val eosTokenId: Int? = null, val addSpacePrefix: Boolean = true) : Tokenizer
SentencePiece tokenizer for LLaMA, Gemma, TinyLlama, Mistral-v0.1 and other models whose GGUF tokenizer.ggml.model is "llama" and whose HuggingFace tokenizer.json has model.type == "Unigram".
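To illustrate how the parallel tokens and scores lists are typically used by a Unigram model, here is a minimal standalone sketch of Viterbi segmentation: it picks the split of the input whose pieces have the highest total score. It is not this library's implementation. The function name unigramEncode, the SPACE_MARKER constant, the UNKNOWN_PENALTY value, and the per-character unknown fallback are all assumptions made for the example; only the parameter names tokens, scores, unknownTokenId and addSpacePrefix mirror the constructor above.

```kotlin
// Hypothetical sketch, NOT the library's SentencePieceTokenizer.
const val SPACE_MARKER = "\u2581"       // "▁", SentencePiece's visible space symbol
const val UNKNOWN_PENALTY = -100f       // assumed cost of emitting the unknown token

fun unigramEncode(
    text: String,
    tokens: List<String>,
    scores: List<Float>,
    unknownTokenId: Int? = null,
    addSpacePrefix: Boolean = true,
): List<Int> {
    require(tokens.size == scores.size) { "tokens and scores must be parallel lists" }
    val idOf = tokens.withIndex().associate { (i, t) -> t to i }
    val maxLen = tokens.maxOfOrNull { it.length } ?: 0

    // Optionally prepend a space, then turn every space into the marker.
    val input = ((if (addSpacePrefix) " " else "") + text).replace(" ", SPACE_MARKER)
    val n = input.length

    // best[i] = highest total score of any segmentation of input[0, i)
    val best = FloatArray(n + 1) { Float.NEGATIVE_INFINITY }
    val prev = IntArray(n + 1) { -1 }    // start index of the last piece ending at i
    val pieceId = IntArray(n + 1) { -1 } // token id of the last piece ending at i
    best[0] = 0f

    for (end in 1..n) {
        // Try every vocabulary piece that ends at `end`.
        for (start in maxOf(0, end - maxLen) until end) {
            if (best[start] == Float.NEGATIVE_INFINITY) continue
            val id = idOf[input.substring(start, end)] ?: continue
            val candidate = best[start] + scores[id]
            if (candidate > best[end]) {
                best[end] = candidate; prev[end] = start; pieceId[end] = id
            }
        }
        // Assumed fallback: cover one character with the unknown token.
        if (unknownTokenId != null) {
            val candidate = best[end - 1] + UNKNOWN_PENALTY
            if (candidate > best[end]) {
                best[end] = candidate; prev[end] = end - 1; pieceId[end] = unknownTokenId
            }
        }
    }
    check(best[n] != Float.NEGATIVE_INFINITY) { "input cannot be segmented" }

    // Walk the back-pointers from the end, then restore left-to-right order.
    val ids = ArrayList<Int>()
    var pos = n
    while (pos > 0) { ids.add(pieceId[pos]); pos = prev[pos] }
    ids.reverse()
    return ids
}

fun main() {
    val tokens = listOf("<unk>", "▁", "▁hello", "▁world", "hello", "wor", "ld")
    val scores = listOf(0f, -2f, -1f, -1f, -3f, -3f, -3f)
    println(unigramEncode("hello world", tokens, scores, unknownTokenId = 0)) // prints [2, 3]
}
```

With the toy vocabulary above, the input is normalised to "▁hello▁world", and the two-piece split "▁hello" + "▁world" (total score -2) beats the five-piece alternative, so the result is the ids 2 and 3. The sketch works on UTF-16 chars, so the unknown fallback would split surrogate pairs; a real tokenizer would need to handle code points and byte fallback.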
object TokenizerFactory
Selects the right Tokenizer implementation for a model.