fromTokenizerJson

Build from the root object of a parsed Hugging Face tokenizer.json whose model.type is "Unigram".

HF Unigram stores the vocab at model.vocab as a JSON array of [token, score] pairs; a token's id is its index in that array. The unknown token id is at model.unk_id.
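
The layout described above can be read with a few lines of Python. This is a minimal sketch, not the implementation of fromTokenizerJson: parse_unigram and UnigramVocab are hypothetical names used only for illustration, and the sketch assumes the vocab array lives at model.vocab and that model.unk_id may be absent or null.

```python
import json
from typing import Any, Dict, List, NamedTuple, Optional


class UnigramVocab(NamedTuple):
    """Hypothetical container for illustration; not part of any library."""
    tokens: List[str]      # tokens[i] is the token with id i
    scores: List[float]    # scores[i] is the score of the token with id i
    unk_id: Optional[int]  # id of the unknown token, if one is set


def parse_unigram(root: Dict[str, Any]) -> UnigramVocab:
    """Extract the Unigram vocab from a parsed tokenizer.json root object."""
    model = root["model"]
    if model.get("type") != "Unigram":
        raise ValueError("expected model.type == 'Unigram'")

    tokens: List[str] = []
    scores: List[float] = []
    # Each entry is a [token, score] pair; its position in the array is its id.
    for token, score in model["vocab"]:
        tokens.append(token)
        scores.append(float(score))

    # Assumption: unk_id may be missing or null, so treat both as "no unk".
    unk_id = model.get("unk_id")
    if unk_id is not None and not 0 <= unk_id < len(tokens):
        raise ValueError("model.unk_id is out of range for the vocab")

    return UnigramVocab(tokens, scores, unk_id)


# A tiny hand-written example document, not taken from a real tokenizer.
example = json.loads(
    '{"model": {"type": "Unigram", "unk_id": 0,'
    ' "vocab": [["<unk>", 0.0], ["a", -1.5], ["b", -2.25]]}}'
)
vocab = parse_unigram(example)
```

Keeping tokens and scores in parallel lists preserves the id ordering directly, so looking up a token by id is a plain index operation.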