Mapping and Analysis in Elastic

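A mapping in Elastic exists mainly to make search more accurate. It answers questions such as: which fields should be analyzed (tokenized), which fields must be matched exactly as whole values, and which analyzer each field uses.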

1. Testing an analyzer

GET www.129.com/_analyze?analyzer=standard&text=Text to analyze
GET www.129.com/_analyze?text=Text to analyze
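
The text in the requests above contains spaces, which many HTTP clients will not send unencoded. As a minimal sketch, the query string can be percent-encoded before the request is issued. The helper name build_analyze_url is hypothetical, and the host is the one used in this post.

```python
from urllib.parse import quote, urlencode


def build_analyze_url(base_url, text, analyzer=None):
    """Return an _analyze URL whose query string is percent-encoded."""
    params = {}
    if analyzer is not None:
        # Omitting the analyzer leaves the choice to the server's default.
        params["analyzer"] = analyzer
    params["text"] = text
    # quote_via=quote encodes spaces as %20 rather than "+".
    return base_url.rstrip("/") + "/_analyze?" + urlencode(params, quote_via=quote)


url_with_analyzer = build_analyze_url("http://www.129.com", "Text to analyze", "standard")
url_default = build_analyze_url("http://www.129.com", "Text to analyze")
print(url_with_analyzer)
print(url_default)
```

The resulting URLs can be passed to any HTTP client; no request is sent by the sketch itself.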

2. Core simple field types

Type	Field datatypes
String	`string`
Whole number	`byte`, `short`, `integer`, `long`
Floating point	`float`, `double`
Boolean	`boolean`
Date	`date`
Object	`object`

Of the field types above, only string is analyzed (tokenized); the other types are not.

3. Creating a mapping

The index parameter controls how a string is indexed. It takes one of the following three values:

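Value	Meaning
`analyzed`	Analyze the string first, then index it. In other words, index this field as full text.
`not_analyzed`	Index this field so that it is searchable, but index the value exactly as given. The field is not analyzed.
`no`	Do not index this field. The field cannot be searched.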
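
The difference between the three values can be illustrated with a toy model of which terms end up in the index. This is only a sketch: the function names are hypothetical, and the regular expression is a rough stand-in for the standard analyzer, not its real tokenization rules.

```python
import re


def indexed_terms(value, index="analyzed"):
    """Return the terms a toy index would store for one string value."""
    if index == "analyzed":
        # Rough approximation of analysis: lowercase, then split into word tokens.
        return re.findall(r"\w+", value.lower())
    if index == "not_analyzed":
        # The whole value is stored as a single, unchanged term.
        return [value]
    if index == "no":
        # Nothing is indexed, so nothing can match.
        return []
    raise ValueError("unknown index value: " + index)


def term_matches(term, value, index):
    """Exact term lookup against the toy index."""
    return term in indexed_terms(value, index)


print(indexed_terms("Quick Brown Fox", "analyzed"))
print(indexed_terms("Quick Brown Fox", "not_analyzed"))
print(indexed_terms("Quick Brown Fox", "no"))
```

In this model a lookup for fox matches only the analyzed field, while a lookup for the full original string matches only the not_analyzed field.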

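For string fields, index defaults to analyzed. To map a field as an exact value, set index to not_analyzed. The other simple types (long, double, date, and so on) also accept the index parameter, but the only valid values are no and not_analyzed, because their values are never analyzed. An object field needs a mapping for each of its inner fields. Note also that one-dimensional and two-dimensional arrays are indexed differently. For details, see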
http://es.xiaoleilu.com/052_Mapping_Analysis/50_Complex_datatypes.html
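
The rule about which index values each simple type accepts can be written down as a small check. This is a sketch based only on the rule stated above; is_valid_index is a hypothetical helper, not part of any Elastic client.

```python
NON_STRING_SIMPLE_TYPES = {
    "byte", "short", "integer", "long", "float", "double", "boolean", "date",
}


def is_valid_index(field_type, index):
    """Check an index value against the rule described in the text."""
    if field_type == "string":
        return index in {"analyzed", "not_analyzed", "no"}
    if field_type in NON_STRING_SIMPLE_TYPES:
        # Non-string values are never analyzed, so "analyzed" is rejected.
        return index in {"not_analyzed", "no"}
    raise ValueError("unsupported field type: " + field_type)


print(is_valid_index("string", "analyzed"))
print(is_valid_index("long", "analyzed"))
print(is_valid_index("date", "no"))
```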

The analyzer parameter controls which analyzer is used; the default is standard.

PUT http://www.129.com/test

{
  "mappings": {
    "test": {
      "properties": {
        "storeId": {"type": "long"},
        "title": {"type": "string", "analyzer": "standard"},
        "description": {"type": "string", "analyzer": "standard"},
        "createTime": {"type": "long"},
        "updateTime": {"type": "long"},
        "storeTypeName": {"type": "string", "index": "not_analyzed"},
        "storePhone": {"type": "string", "index": "not_analyzed"},
        "storeAddress": {"type": "string"},
        "provinceId": {"type": "long"},
        "provinceName": {"type": "string", "index": "not_analyzed"},
        "cityName": {"type": "string"},
        "countyId": {"type": "long"},
        "countyName": {"type": "string"},
        "categoryId": {"type": "long"},
        "categoryName": {"type": "string", "index": "not_analyzed"}
      }
    }
  }
}
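
A request body like the one above is easy to get wrong by hand. One possible approach, sketched here, is to group the fields by how they should be indexed and generate the JSON. The grouping constants and the build_mapping helper are hypothetical; the field names are the ones from this post.

```python
import json

LONG_FIELDS = ["storeId", "createTime", "updateTime", "provinceId", "countyId", "categoryId"]
EXACT_FIELDS = ["storeTypeName", "storePhone", "provinceName", "categoryName"]
STANDARD_FIELDS = ["title", "description"]
DEFAULT_STRING_FIELDS = ["storeAddress", "cityName", "countyName"]


def build_mapping(type_name):
    """Build the request body for creating an index with one mapping type."""
    properties = {}
    for name in LONG_FIELDS:
        properties[name] = {"type": "long"}
    for name in EXACT_FIELDS:
        # Exact-value strings: indexed as-is, never analyzed.
        properties[name] = {"type": "string", "index": "not_analyzed"}
    for name in STANDARD_FIELDS:
        # Full-text strings with the analyzer named explicitly.
        properties[name] = {"type": "string", "analyzer": "standard"}
    for name in DEFAULT_STRING_FIELDS:
        # Strings left on the defaults.
        properties[name] = {"type": "string"}
    return {"mappings": {type_name: {"properties": properties}}}


body = build_mapping("test")
print(json.dumps(body, indent=2, sort_keys=True))
```

The printed JSON can be sent as the body of the PUT request that creates the index.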

4. Adding field mappings

PUT http://www.129.com/test/_mapping/test

{"properties":{"brand":{"type":"string","index":"not_analyzed"},"brandName":{"type":"string","analyzer":"standard"}}}
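
The same request can be prepared from Python's standard library. The sketch below only constructs the request object and does not send it; put_mapping_request is a hypothetical helper, and the Content-Type header is an assumption about what the server expects.

```python
import json
from urllib.request import Request


def put_mapping_request(base_url, index, type_name, properties):
    """Prepare (but do not send) a PUT request that adds field mappings."""
    url = "{}/{}/_mapping/{}".format(base_url.rstrip("/"), index, type_name)
    body = json.dumps({"properties": properties}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = put_mapping_request(
    "http://www.129.com",
    "test",
    "test",
    {
        "brand": {"type": "string", "index": "not_analyzed"},
        "brandName": {"type": "string", "analyzer": "standard"},
    },
)
print(request.get_method(), request.full_url)
# To send it against a reachable cluster: urllib.request.urlopen(request)
```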

5. Viewing the mapping

GET http://www.129.com/test/_mapping/test
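
Once the mapping has been retrieved, it can be inspected programmatically, for example to list the string fields that are indexed as exact values. The sketch below assumes the response nests the properties under index name, then "mappings", then type name. The sample response is a trimmed, hand-written illustration rather than real server output, and exact_string_fields is a hypothetical helper.

```python
def exact_string_fields(mapping_response, index, type_name):
    """Return the sorted names of string fields mapped as not_analyzed."""
    properties = mapping_response[index]["mappings"][type_name]["properties"]
    return sorted(
        name
        for name, spec in properties.items()
        if spec.get("type") == "string" and spec.get("index") == "not_analyzed"
    )


# Hand-written sample in the assumed response shape (not real output).
sample_response = {
    "test": {
        "mappings": {
            "test": {
                "properties": {
                    "storeId": {"type": "long"},
                    "title": {"type": "string", "analyzer": "standard"},
                    "storePhone": {"type": "string", "index": "not_analyzed"},
                    "brand": {"type": "string", "index": "not_analyzed"},
                }
            }
        }
    }
}

exact_fields = exact_string_fields(sample_response, "test", "test")
print(exact_fields)
```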
