TokenizerFactory
Selects the right Tokenizer implementation for a model.
Tokenizer selection is per-architecture, not per file format. A Qwen model needs byte-level BPE whether its weights come from .gguf or .safetensors; a LLaMA model needs SentencePiece regardless of format. Callers pass either a GGUF metadata field map or a HuggingFace tokenizer.json string, and this factory inspects the tokenizer type (tokenizer.ggml.model or model.type) to dispatch.
Currently supported:
- Byte-level BPE (Qwen, GPT-2, Mistral-Nemo) — via QwenByteLevelBpeTokenizer. Dispatched when tokenizer.ggml.model == "gpt2" or model.type == "BPE".
- SentencePiece (LLaMA, Gemma, TinyLlama, Mistral v0.1) — via SentencePieceTokenizer. Dispatched when tokenizer.ggml.model == "llama" or model.type == "Unigram".
WordPiece (BERT) is not yet supported; requesting it throws UnsupportedTokenizerException.
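The dispatch described above can be sketched as follows. This is a minimal illustration, not the actual factory: the class, enum, and method names are assumptions; only the metadata keys and type strings (tokenizer.ggml.model, model.type, "gpt2", "llama", "BPE", "Unigram") come from the text.

```java
import java.util.Map;

// Hypothetical sketch of TokenizerFactory's dispatch logic; the real API
// may differ. Only the metadata keys and type strings are from the docs.
final class TokenizerFactorySketch {

    enum TokenizerKind { BYTE_LEVEL_BPE, SENTENCEPIECE }

    /** Dispatch on the GGUF "tokenizer.ggml.model" metadata field. */
    static TokenizerKind fromGgufMetadata(Map<String, String> metadata) {
        String model = metadata.getOrDefault("tokenizer.ggml.model", "");
        switch (model) {
            case "gpt2":  return TokenizerKind.BYTE_LEVEL_BPE; // Qwen, GPT-2, Mistral-Nemo
            case "llama": return TokenizerKind.SENTENCEPIECE;  // LLaMA, Gemma, TinyLlama
            default:
                throw new IllegalArgumentException("Unsupported tokenizer: " + model);
        }
    }

    /** Dispatch on the HuggingFace tokenizer.json "model.type" field. */
    static TokenizerKind fromHfModelType(String modelType) {
        switch (modelType) {
            case "BPE":     return TokenizerKind.BYTE_LEVEL_BPE;
            case "Unigram": return TokenizerKind.SENTENCEPIECE;
            default:
                throw new IllegalArgumentException("Unsupported tokenizer: " + modelType);
        }
    }
}
```

Because both branches key on a single string, a caller never has to know the weight format: it extracts the tokenizer-type field from whichever container it loaded and lets the factory pick the implementation.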