{"query":{"bool":{"must":[{"term":{"architect.keyword":{"value":"郭锋"}}},{"range":{"NRunTime":{"lte":100}}}]}},"size":10,"from":100}
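from works like an SQL OFFSET and size is the number of hits per page. The request above skips the first 100 hits and returns the next 10, which is page 11 at 10 hits per page.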
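To make the from/size arithmetic concrete, here is a minimal Python sketch. The helper name page_query is hypothetical, and it only builds the request body; it does not talk to a cluster.

```python
import json


def page_query(query, page, size):
    """Build a search body for a 1-based page number (hypothetical helper)."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    # `from` is a zero-based offset: the number of hits to skip.
    return {"query": query, "from": (page - 1) * size, "size": size}


body = page_query({"match_all": {}}, page=11, size=10)
print(json.dumps(body))
```

Page 11 with 10 hits per page gives the same from value as the request above.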
{
  "query": {
    "bool": {
      "must": {
        "multi_match": {
          "query": "search_key",
          "type": "best_fields",
          "fields": ["column1", "column2"],   // match search_key against column1 and column2
          "analyzer": "ik_smart"              // tokenize Chinese text with ik_smart
        }
      },
      "filter": {                             // filter with range: lte (less than or equal), gte (greater than or equal)
        "range": {
          "column3": { "lte": 1 }             // column3 <= 1
        }
      }
    }
  },
  "stored_fields": ["column1", "column2", "column3", "column4", "column5"],
  "highlight": {                              // highlight the matches
    "fields": { "column1": {}, "column2": {} }
  }
}
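match: the query string is split into terms. For the query "java spark", match splits it into java / spark and returns documents that contain java, spark, or both, in any position.
match_phrase: the terms are not matched independently. For the query "java spark", match_phrase returns only documents in which java and spark appear next to each other in that order, that is, content of the form ...java spark... (the slop parameter relaxes how far apart the terms may be).
match raises recall; match_phrase raises precision.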
match query example:
GET /forum/article/_search
{
  "query": { "match": { "content": "java spark" } }
}
match_phrase query example:
GET /forum/article/_search
{
  "query": {
    "match_phrase": {
      "content": { "query": "java spark", "slop": 50 }
    }
  }
}
Result:
{
  "took": 3,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": {
    "total": 1,
    "max_score": 0.5753642,
    "hits": [
      {
        "_index": "forum",
        "_type": "article",
        "_id": "5",
        "_score": 0.5753642,
        "_source": {
          "articleID": "DHJK-B-1395-#Ky5",
          "userID": 3,
          "hidden": false,
          "postDate": "2017-03-01",
          "tag": ["elasticsearch"],
          "tag_cnt": 1,
          "view_cnt": 10,
          "title": "this is spark blog",
          "content": "spark is best big data solution based on scala, an programming language similar to java spark",
          "sub_title": "haha, hello world",
          "author_first_name": "Tonny",
          "author_last_name": "Peter Smith",
          "new_author_last_name": "Peter Smith",
          "new_author_first_name": "Tonny"
        }
      }
    ]
  }
}
The result shows that match_phrase returns only documents that contain both java and spark (within the allowed slop), so recall is lower than with match.
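The difference between the two queries can be imitated in a few lines of Python. This is only a rough sketch: the tokens function is a crude stand-in for a real analyzer, the three documents are made up, and match_phrase is modeled with slop = 0 only.

```python
def tokens(text):
    # Crude stand-in for an analyzer: lowercase and split on whitespace.
    return text.lower().split()


def match(doc, query):
    # OR semantics: at least one query term occurs in the document.
    doc_terms = set(tokens(doc))
    return any(term in doc_terms for term in tokens(query))


def match_phrase(doc, query):
    # All query terms must occur next to each other, in order (slop = 0).
    d, q = tokens(doc), tokens(query)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


# Hypothetical documents, not taken from the forum index.
docs = [
    "java is my favourite language",
    "spark runs on the jvm",
    "a language similar to java spark",
]
match_hits = [match(d, "java spark") for d in docs]
phrase_hits = [match_phrase(d, "java spark") for d in docs]
print(match_hits, phrase_hits)
```

All three documents satisfy match, but only the last one satisfies match_phrase, which is the recall versus precision trade-off described above.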
The analyzer parameter selects which analyzer to use.
Built-in analyzers:
① standard analyzer: splits text into words, lowercases them and removes punctuation. In short: split by word.
POST _analyze
{
  "analyzer": "standard",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
Result: [the, 2, quick, brown, foxes, jumped, over, the, lazy, dog's, bone]
② whitespace analyzer: splits on whitespace and does not lowercase. In short: split on spaces.
POST _analyze
{
  "analyzer": "whitespace",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
Result: [The, 2, QUICK, Brown-Foxes, jumped, over, the, lazy, dog's, bone.]
③ simple analyzer: splits text on non-letter characters and lowercases the tokens; digits are dropped. In short: split on non-letters.
POST _analyze
{
  "analyzer": "simple",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
Result: [the, quick, brown, foxes, jumped, over, the, lazy, dog, s, bone]
④ stop analyzer: splits text on non-letter characters and also removes common English words such as a, an and the; the stopwords setting lets you define your own list of words to filter out. Digits are dropped. In short: split on non-letters, then remove a, an, the.
PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_stop_analyzer": { "type": "stop", "stopwords": ["the", "over"] }
      }
    }
  }
}
POST my_index/_analyze
{
  "analyzer": "my_stop_analyzer",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
Result: [quick, brown, foxes, jumped, lazy, dog, s, bone]
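The whitespace, simple and stop behaviour can be approximated with a few lines of Python. These are ASCII-only approximations for illustration, not the real Lucene analyzers, and the function names are made up. The standard analyzer is left out because its Unicode word segmentation is not easy to imitate with a regular expression.

```python
import re

TEXT = "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."


def whitespace_analyzer(text):
    # Split on whitespace only; case and punctuation are preserved.
    return text.split()


def simple_analyzer(text):
    # Lowercase, then keep maximal runs of letters (digits are dropped).
    return re.findall(r"[a-z]+", text.lower())


def stop_analyzer(text, stopwords=("the", "over")):
    # Simple analyzer followed by stop-word removal; the default list
    # mirrors my_stop_analyzer above, not the English default list.
    return [t for t in simple_analyzer(text) if t not in stopwords]


print(whitespace_analyzer(TEXT))
print(simple_analyzer(TEXT))
print(stop_analyzer(TEXT))
```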
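Chinese analyzers:
① ik_max_word: splits the text at the finest granularity, producing as many words as possible.
② ik_smart: splits the text at the coarsest granularity; characters already assigned to one word are not reused by another word.
Hot-word update configuration for the ik analyzer:
Edit the IK configuration file: ES directory/plugins/ik/config/ik/IKAnalyzer.cfg.xml
ik_max_word vs ik_smart example: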
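aggs
"aggs": {
  "NAME": {            # name of the result
    "AGG_TYPE": {      # the aggregation method
      TODO             # the field(s) to aggregate on go here
    }
    TODO               # nested aggregations can be placed here
  }
}
Example:
{
  "size": 0,
  "aggs": {
    "sum_install": {
      "date_histogram": { "field": "yyyymmdd", "interval": "day" },
      "aggs": {
        "types": {
          "terms": { "field": "type.keyword", "size": 10 },
          "aggs": {
            "install": { "sum": { "field": "install" } }
          }
        }
      }
    }
  }
}
Purpose: for each day, compute the total install per type.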
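The nesting (day bucket, then type bucket, then a sum) is the same as a two-level group-by. A minimal Python sketch of that idea follows; the documents are hypothetical and only the field names are taken from the query above.

```python
from collections import defaultdict

# Hypothetical documents; field names follow the query above.
docs = [
    {"yyyymmdd": "2018-01-01", "type": "ios", "install": 3},
    {"yyyymmdd": "2018-01-01", "type": "android", "install": 5},
    {"yyyymmdd": "2018-01-01", "type": "ios", "install": 2},
    {"yyyymmdd": "2018-01-02", "type": "android", "install": 7},
]


def install_by_day_and_type(rows):
    # Outer bucket: day (date_histogram); inner bucket: type (terms);
    # metric: sum of install.
    out = defaultdict(lambda: defaultdict(int))
    for r in rows:
        out[r["yyyymmdd"]][r["type"]] += r["install"]
    return {day: dict(types) for day, types in out.items()}


result = install_by_day_and_type(docs)
print(result)
```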
range: field: lte / gte
must: AND; should: OR
{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "recive_time": {
              "gte": "2017-12-25T01:00:00.000Z",
              "lte": "2017-12-25T02:10:00.000Z"
            }
          }
        },
        {
          "bool": {
            "should": [
              { "range": { "live_delay": { "gte": 1500 } } },
              { "range": { "stream_break_count.keyword": { "gte": 1 } } }
            ]
          }
        }
      ]
    }
  }
}
Both conditions under must have to hold; of the two conditions under should, at least one has to hold.
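Read as a boolean expression, the query above is "time window AND (delay OR breaks)". The sketch below spells that out in Python. The sample documents and their id field are hypothetical, and stream_break_count is treated as a number here, which is an assumption (the query ranges over a keyword sub-field).

```python
def matches(doc):
    """Mirror the bool query: both must clauses, at least one should clause."""
    # ISO timestamps in the same format compare correctly as strings.
    in_window = ("2017-12-25T01:00:00.000Z" <= doc["recive_time"]
                 <= "2017-12-25T02:10:00.000Z")
    # The nested bool has only should clauses, so one of them must hold.
    degraded = doc["live_delay"] >= 1500 or doc["stream_break_count"] >= 1
    return in_window and degraded


# Hypothetical documents.
docs = [
    {"id": "a", "recive_time": "2017-12-25T01:30:00.000Z", "live_delay": 2000, "stream_break_count": 0},
    {"id": "b", "recive_time": "2017-12-25T01:30:00.000Z", "live_delay": 100, "stream_break_count": 0},
    {"id": "c", "recive_time": "2017-12-25T03:00:00.000Z", "live_delay": 2000, "stream_break_count": 3},
    {"id": "d", "recive_time": "2017-12-25T02:00:00.000Z", "live_delay": 100, "stream_break_count": 2},
]
selected = [d["id"] for d in docs if matches(d)]
print(selected)
```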
PUT /my_index/_mapping/my_type
{
  "properties": {
    "new_field_name": { "type": "string" }
  }
}
Assign a value to the newly added field:
// bulk updates use the _update_by_query API
POST my_index/_update_by_query
{
  "script": {
    "lang": "painless",
    "inline": "ctx._source.new_field_name='02'"   // set the value with a painless script
  }
}
aggs + avg + max + min + order + cardinality (equivalent to count(distinct a)) + filter + sort
terms is the equivalent of GROUP BY in SQL.
aggs: aggregations run on top of a search query and can be nested and combined into complex analyses.
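2.2 Count cars by color, MySQL implementation:
select color, count(color) as cnt from cars group by color order by cnt desc;
Result:
color  cnt
red    4
green  2
blue   2
3. Average price per color
3.1 Average price per color, DSL implementation: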
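terms is the equivalent of group by:
GET /cars/transactions/_search
{
  "size": 0,
  "aggs": {
    "colors": {
      "terms": { "field": "color.keyword" },
      "aggs": {
        "avg_price": { "avg": { "field": "price" } }
      }
    }
  }
}
The response contains one bucket per color, each carrying its avg_price.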
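The same grouping can be reproduced in plain Python. The eight rows below are an assumption: they are not listed in this post, but were chosen to be consistent with the SQL result tables shown in these notes.

```python
from collections import defaultdict

# Assumed sample rows: (price, color, make, sold).
CARS = [
    (10000, "red", "honda", "2014-10-28"),
    (20000, "red", "honda", "2014-11-05"),
    (30000, "green", "ford", "2014-05-18"),
    (15000, "blue", "toyota", "2014-07-02"),
    (12000, "green", "toyota", "2014-08-19"),
    (20000, "red", "honda", "2014-11-05"),
    (80000, "red", "bmw", "2014-01-01"),
    (25000, "blue", "ford", "2014-02-12"),
]


def avg_price_by_color(cars):
    buckets = defaultdict(list)  # plays the role of the terms aggregation
    for price, color, _make, _sold in cars:
        buckets[color].append(price)
    # Per bucket: (doc_count, average price).
    return {c: (len(p), sum(p) / len(p)) for c, p in buckets.items()}


by_color = avg_price_by_color(CARS)
print(by_color)
```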
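3.2 Average price per color, SQL implementation:
select color, count(color) as cnt, avg(price) as avg_price from cars group by color order by cnt desc;
color  cnt  avg_price
red    4    32500.0000
green  2    21000.0000
blue   2    20000.0000
4. Distribution of makes per color
4.1 Distribution of makes per color, DSL implementation:
GET /cars/transactions/_search
{
  "size": 0,
  "aggs": {
    "colors": {
      "terms": { "field": "color.keyword" },
      "aggs": {
        "make": { "terms": { "field": "make.keyword" } }
      }
    }
  }
}
The response contains, inside each color bucket, one nested bucket per make.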
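4.2 Distribution of makes per color, SQL implementation. Note: this does not correspond exactly to the DSL version.
select color, make from cars order by color;
color  make
blue   toyota
blue   ford
green  ford
green  toyota
red    bmw
red    honda
red    honda
red    honda
5. Lowest and highest price per make
5.1 Lowest and highest price per make, DSL implementation:
GET /cars/transactions/_search
{
  "size": 0,
  "aggs": {
    "make_class": {
      "terms": { "field": "make.keyword" },
      "aggs": {
        "min_price": { "min": { "field": "price" } },
        "max_price": { "max": { "field": "price" } }
      }
    }
  }
}
The response contains one bucket per make with its min_price and max_price.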
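5.2 Lowest and highest price per make, SQL implementation:
select make, min(price) as min_price, max(price) as max_price from cars group by make;
make    min_price  max_price
bmw     80000      80000
ford    25000      30000
honda   10000      20000
toyota  12000      15000
II. Advanced aggregations
1. Histogram (bar chart) aggregations
1.1 Total sales price per price range
GET /cars/transactions/_search
{
  "size": 0,
  "aggs": {
    "price": {
      "histogram": { "field": "price", "interval": 20000 },
      "aggs": {
        "revenue": { "sum": { "field": "price" } }
      }
    }
  }
}
The price interval is set to 20000; within each range the prices are added up with sum.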
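How the histogram assigns a price to a bucket can be sketched in Python: the bucket key is the price rounded down to a multiple of the interval. The price list is an assumption chosen to agree with the min/max table above, and empty buckets between the lowest and highest key are filled with 0 to imitate min_doc_count = 0.

```python
from collections import defaultdict

# Assumed prices, consistent with the per-make min/max table above.
PRICES = [10000, 20000, 30000, 15000, 12000, 20000, 80000, 25000]


def price_histogram(prices, interval):
    sums = defaultdict(int)
    for p in prices:
        key = (p // interval) * interval  # bucket key = lower bound of range
        sums[key] += p
    if not sums:
        return {}
    lo, hi = min(sums), max(sums)
    # Also emit the empty buckets between the first and the last key.
    return {k: sums.get(k, 0) for k in range(lo, hi + interval, interval)}


revenue = price_histogram(PRICES, 20000)
print(revenue)
```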
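1.2 Multiple metrics per make
GET /cars/transactions/_search
{
  "size": 0,
  "aggs": {
    "makes": {
      "terms": { "field": "make.keyword", "size": 10 },
      "aggs": {
        "stats": { "extended_stats": { "field": "price" } }
      }
    }
  }
}
For each make, the response carries the extended statistics of price (count, min, max, avg, sum, variance, std_deviation and so on).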
2.2 Monthly sales per make, SQL implementation:
SELECT make, count(make) as cnt, CONCAT(YEAR(sold), ',', MONTH(sold)) AS data_time FROM `cars` GROUP BY YEAR(sold) DESC, MONTH(sold)
Result:
make    cnt  data_time
bmw     1    2014,1
ford    1    2014,2
ford    1    2014,5
toyota  1    2014,7
toyota  1    2014,8
honda   1    2014,10
honda   2    2014,11
2.3 Including December, DSL implementation
The result of 2.1 above shows no statistics for December.
GET /cars/transactions/_search
{
  "size": 0,
  "aggs": {
    "sales": {
      "date_histogram": {
        "field": "sold",
        "interval": "month",
        "format": "yyyy-MM-dd",
        "min_doc_count": 0,
        "extended_bounds": { "min": "2014-01-01", "max": "2014-12-31" }
      }
    }
  }
}
2.4 Statistics per quarter, DSL implementation
GET /cars/transactions/_search
{
  "size": 0,
  "aggs": {
    "sales": {
      "date_histogram": {
        "field": "sold",
        "interval": "quarter",
        "format": "yyyy-MM-dd",
        "min_doc_count": 0,
        "extended_bounds": { "min": "2014-01-01", "max": "2014-12-31" }
      },
      "aggs": {
        "per_make_sum": {
          "terms": { "field": "make.keyword" },
          "aggs": {
            "sum_price": { "sum": { "field": "price" } }
          }
        },
        "top_sum": { "sum": { "field": "price" } }
      }
    }
  }
}
2.5 Aggregations scoped by a search (range-limited)
2.5.1 Basic query plus aggregation
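GET /cars/transactions/_search
{
  "query": { "match": { "make.keyword": "ford" } },
  "aggs": {
    "colors": { "terms": { "field": "color.keyword" } }
  }
}
Equivalent SQL:
select make, color from cars where make = "ford";
Result:
make  color
ford  green
ford  blue
III. Filtering and aggregations
1. Filtering: compute the average price of all cars as well as the average price of a single make.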
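GET /cars/transactions/_search
{
  "size": 0,
  "query": { "match": { "make.keyword": "ford" } },
  "aggs": {
    "single_avg_price": { "avg": { "field": "price" } },
    "all": {
      "global": {},
      "aggs": {
        "avg_price": { "avg": { "field": "price" } }
      }
    }
  }
}
This is equivalent to:
select make, color, avg(price) from cars where make = "ford";
select avg(price) from cars;
2. Range-limited filtering (filter bucket)
We can define a filter bucket; a document is added to the bucket when it satisfies the bucket's condition.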
GET /cars/transactions/_search
{
  "size": 0,
  "query": { "match": { "make": "ford" } },
  "aggs": {
    "recent_sales": {
      "filter": {
        "range": { "sold": { "from": "now-100M" } }
      },
      "aggs": {
        "average_price": { "avg": { "field": "price" } }
      }
    }
  }
}
MySQL implementation:
select *, avg(price) from cars where period_diff(date_format(now(), '%Y%m'), date_format(sold, '%Y%m')) > 30 and make = "ford";
MySQL result:
id  price  color  make  sold        avg
3   30000  green  ford  2014-05-18  27500.0000
3. Post filter: filter only the search hits, not the aggregation results, implemented with post_filter
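GET /cars/transactions/_search
{
  "query": { "match": { "make": "ford" } },
  "post_filter": { "term": { "color.keyword": "green" } },
  "aggs": {
    "all_colors": { "terms": { "field": "color.keyword" } }
  }
}
post_filter filters the search hits so that only green ford cars are shown. It runs after the query has executed, so the aggregation is not affected.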
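The order of operations is the important part, so here is a small Python sketch of it. The rows are assumed sample data (price, color, make), not taken from a real index, and the function name is made up.

```python
from collections import Counter

# Assumed sample rows: (price, color, make).
CARS = [
    (10000, "red", "honda"), (20000, "red", "honda"),
    (30000, "green", "ford"), (15000, "blue", "toyota"),
    (12000, "green", "toyota"), (20000, "red", "honda"),
    (80000, "red", "bmw"), (25000, "blue", "ford"),
]


def search_with_post_filter(cars, make, post_color):
    hits = [c for c in cars if c[2] == make]          # 1. the query
    all_colors = Counter(c[1] for c in hits)          # 2. aggs see every query hit
    shown = [c for c in hits if c[1] == post_color]   # 3. post_filter trims hits only
    return shown, dict(all_colors)


shown, all_colors = search_with_post_filter(CARS, "ford", "green")
print(shown, all_colors)
```

Only the green ford is returned as a hit, while the color counts still cover both fords.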
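Summary
Choosing the right kind of filtering (search hits, aggregations, or both) usually depends on how we want the user interaction to behave and how the results should be presented to the user.
A non-scoring query inside filter affects both the search hits and the aggregations. A filter bucket affects only the aggregations. post_filter affects only the search hits.
IV. Sorting multi-value buckets
4.1 Built-in sorts
GET /cars/transactions/_search
{
  "size": 0,
  "aggs": {
    "colors": {
      "terms": {
        "field": "color.keyword",
        "order": { "_count": "asc" }
      }
    }
  }
}
4.2 Sorting by a metric
The following sorts by average car price, ascending. Bucketing criterion: car color; metric: average price; sort key: average price, ascending.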
GET /cars/transactions/_search
{
  "size": 0,
  "aggs": {
    "colors": {
      "terms": {
        "field": "color.keyword",
        "order": { "avg_price": "asc" }
      },
      "aggs": {
        "avg_price": { "avg": { "field": "price" } }
      }
    }
  }
}
Sorting on one value of a multi-value metric looks like this:
GET /cars/transactions/_search
{
  "size": 0,
  "aggs": {
    "colors": {
      "terms": {
        "field": "color.keyword",
        "order": { "stats.variance": "asc" }
      },
      "aggs": {
        "stats": { "extended_stats": { "field": "price" } }
      }
    }
  }
}
4.3 Sorting on "deep" metrics
Too complex, not recommended!
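The metric-based ordering from 4.2 boils down to "compute the metric per bucket, then sort the buckets by it". A short Python sketch under the assumption of the sample (price, color) pairs below:

```python
# Assumed sample rows: (price, color).
SALES = [
    (10000, "red"), (20000, "red"), (30000, "green"), (15000, "blue"),
    (12000, "green"), (20000, "red"), (80000, "red"), (25000, "blue"),
]


def colors_by_avg_price(sales):
    buckets = {}
    for price, color in sales:
        buckets.setdefault(color, []).append(price)
    avg = {c: sum(p) / len(p) for c, p in buckets.items()}
    # Equivalent of "order": {"avg_price": "asc"}.
    return sorted(avg.items(), key=lambda kv: kv[1])


ordered = colors_by_avg_price(SALES)
print(ordered)
```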
V. Approximate aggregations
cardinality means the number of distinct values.
5.1 Count distinct values
GET /cars/transactions/_search
{
  "size": 0,
  "aggs": {
    "distinct_colors": { "cardinality": { "field": "color.keyword" } }
  }
}
Similar to:
SELECT COUNT(DISTINCT color) FROM cars;
The following computes it per month:
GET /cars/transactions/_search
{
  "size": 0,
  "aggs": {
    "months": {
      "date_histogram": { "field": "sold", "interval": "month" },
      "aggs": {
        "distinct_colors": { "cardinality": { "field": "color.keyword" } }
      }
    }
  }
}
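For intuition, the per-month distinct count can be written in Python with a set per month. Two caveats: this sketch counts exactly, whereas cardinality is an approximate aggregation, and months without any sale are simply left out here. The (sold, color) pairs are assumed sample data.

```python
from collections import defaultdict

# Assumed sample rows: (sold, color).
SOLD = [
    ("2014-10-28", "red"), ("2014-11-05", "red"), ("2014-05-18", "green"),
    ("2014-07-02", "blue"), ("2014-08-19", "green"), ("2014-11-05", "red"),
    ("2014-01-01", "red"), ("2014-02-12", "blue"),
]


def distinct_colors_per_month(rows):
    seen = defaultdict(set)
    for sold, color in rows:
        seen[sold[:7]].add(color)  # "YYYY-MM" plays the role of the month bucket
    return {month: len(colors) for month, colors in sorted(seen.items())}


per_month = distinct_colors_per_month(SOLD)
overall = len({color for _sold, color in SOLD})
print(per_month, overall)
```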