当词元过滤器中的type为keep时,表示
只保留具有预定义单词集中的文本的token。可以在设置中定义一组单词,或者从包含每行一个单词的文本文件加载。
| keep_words | 要保留的单词列表 |
| keep_words_path | 一个文字文件的路径 |
| keep_words_case | 一个布尔值,表示是否小写单词(默认为false
) |
PUT /keep_words_example
{
"settings" : {
"analysis" : {
"analyzer" : {
"example_1" : {
"tokenizer" : "standard",
"filter" : ["standard", "lowercase", "words_till_three"]
},
"example_2" : {
"tokenizer" : "standard",
"filter" : ["standard", "lowercase", "words_in_file"]
}
},
"filter" : {
"words_till_three" : {
"type" : "keep",
"keep_words" : [ "one", "two", "three"]
},
"words_in_file" : {
"type" : "keep",
"keep_words_path" : "analysis/example_word_list.txt"
}
}
}
}
}