Load a custom dictionary into jieba
YingTing 2018/08/05
A custom dictionary can be loaded on top of the jieba default dictionary. Jieba is able to identify new words on its own, but adding our own dictionary of new words helps ensure higher segmentation accuracy.
Usage:
jieba.load_userdict(file_name)
file_name is a file-like object or the file path of the custom dictionary.
If file_name is a path or a file opened in binary mode, the dictionary must be UTF-8 encoded.
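As a rough sketch of the two call forms described above, the helper below passes either a path string or a file object opened in binary mode to jieba.load_userdict. The names write_userdict and load_userdict_from are hypothetical and only serve the illustration. jieba is assumed to be installed (pip install jieba), and its import is deferred into the function so the file-writing helper can be tried without it.

```python
def write_userdict(path, lines):
    """Hypothetical helper: write dictionary lines to `path` as UTF-8."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for line in lines:
            f.write(line + "\n")


def load_userdict_from(path, as_file_object=False):
    """Hypothetical helper: load `path` into jieba's default tokenizer."""
    import jieba  # third-party; assumed installed via `pip install jieba`

    if as_file_object:
        # Binary mode: jieba receives bytes, so the content must be UTF-8.
        with open(path, "rb") as f:
            jieba.load_userdict(f)
    else:
        # Plain path string: jieba opens the file itself.
        jieba.load_userdict(path)
```

Writing the file with an explicit encoding="utf-8" avoids depending on the platform default encoding, which is not UTF-8 on every system.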
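For example, a dictionary is stored in the file "dict.txt" with one word per line. Each line holds the word, an optional frequency, and an optional part-of-speech tag, separated by spaces:
創新辦 3 i
雲計算 5
凱特琳 nz
台中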
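To make the line format concrete, here is a minimal sketch that splits one dictionary line into its word, optional frequency, and optional part-of-speech tag. It is an illustration only, not jieba's own parser: the names DictEntry and parse_entry are hypothetical, and the sketch assumes the word itself contains no spaces.

```python
from typing import NamedTuple, Optional


class DictEntry(NamedTuple):
    word: str
    freq: Optional[int]  # None when the frequency is omitted
    tag: Optional[str]   # None when the part-of-speech tag is omitted


def parse_entry(line: str) -> DictEntry:
    """Split one dictionary line into word, optional frequency, optional tag."""
    tokens = line.strip().split()
    if not tokens:
        raise ValueError("empty dictionary line")
    word, rest = tokens[0], tokens[1:]
    freq = None
    tag = None
    # A purely numeric token right after the word is read as the frequency.
    if rest and rest[0].isdecimal():
        freq = int(rest[0])
        rest = rest[1:]
    # Any token that remains is read as the part-of-speech tag.
    if rest:
        tag = rest[0]
    return DictEntry(word, freq, tag)


for line in ["創新辦 3 i", "雲計算 5", "凱特琳 nz", "台中"]:
    print(parse_entry(line))
```

The four sample lines cover every combination: word with frequency and tag, word with frequency only, word with tag only, and a bare word.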
The following block shows one sentence segmented by native jieba, and then segmented again after loading the custom dictionary "dict.txt" with jieba.load_userdict("dict.txt"):
[Before]: 李小福/是/創新/辦/主任/也/是/雲/計算/方面/的/專家/
[After]: 李小福/是/創新辦/主任/也/是/雲計算/方面/的/專家/
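A comparison like the one above could be produced along the following lines. This is a sketch under assumptions, not the exact code behind the output shown: segment and before_and_after are hypothetical helper names, jieba is assumed to be installed, and "dict.txt" is assumed to exist in the working directory when the comparison is run.

```python
def segment(sentence, cut=None):
    """Hypothetical helper: join the tokens of `sentence` with "/".

    `cut` is any callable returning an iterable of tokens. It defaults to
    jieba.cut, imported lazily so the helper can be tried without jieba.
    """
    if cut is None:
        import jieba  # third-party; assumed installed via `pip install jieba`
        cut = jieba.cut
    return "/".join(cut(sentence))


def before_and_after(sentence, dict_path):
    """Hypothetical helper: segment before and after loading `dict_path`."""
    import jieba

    # load_userdict changes jieba's default tokenizer for the rest of the
    # process, so the "before" result has to be computed first.
    before = segment(sentence)
    jieba.load_userdict(dict_path)
    after = segment(sentence)
    return before, after
```

Calling before_and_after("李小福是創新辦主任也是雲計算方面的專家", "dict.txt") returns the two segmentations as strings, without the trailing slash shown above. The exact tokens may vary with the jieba version and its default dictionary.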