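Elasticsearch - Aggregations
The aggregations framework collects all the data selected by the search query. It is made up of many building blocks that can be combined to build complex summaries of the data. The basic structure of an aggregation is shown here:
"aggregations" : {
  "<aggregation_name>" : {
    "<aggregation_type>" : {
      <aggregation_body>
    }
    [,"meta" : { [<meta_data_body>] } ]?
    [,"aggregations" : { [<sub_aggregation>]+ } ]?
  }
  [,"<aggregation_name_2>" : { ... } ]*
}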
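There are different types of aggregations, each with its own purpose. This chapter discusses the metrics aggregations and aggregation metadata in detail.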
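The general structure above can be sketched as a small Python helper that assembles the request body as a dictionary. The helper name `build_aggregation` and its parameters are hypothetical and chosen only for illustration; the keys it emits (`meta`, `aggregations`) are the ones named in the structure.

```python
import json


def build_aggregation(name, agg_type, body, meta=None, sub_aggs=None):
    """Return {name: {agg_type: body, ["meta": ...], ["aggregations": ...]}}.

    Hypothetical helper: it only mirrors the structure shown in the text.
    """
    definition = {agg_type: body}
    if meta is not None:
        definition["meta"] = meta  # optional metadata block
    if sub_aggs:
        definition["aggregations"] = sub_aggs  # optional nested aggregations
    return {name: definition}


# One named aggregation of type "avg" over the "fees" field.
request_body = {
    "aggregations": build_aggregation("avg_fees", "avg", {"field": "fees"})
}
print(json.dumps(request_body, indent=2))
```

Elasticsearch accepts `aggs` as a shorter alias for the `aggregations` key, which is the form the requests in this chapter use.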
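Metrics Aggregations
These aggregations compute metrics from the field values of the aggregated documents. The values can also be generated by a script.
Numeric metrics are either single-valued, like the avg aggregation, or multi-valued, like the stats aggregation.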
Avg Aggregation
This aggregation returns the average of a numeric field across the aggregated documents. For example,
POST /schools/_search
{
  "aggs" : {
    "avg_fees" : { "avg" : { "field" : "fees" } }
  }
}
Running the above code gives the following result. Because the request does not set size=0, the matching documents are returned along with the aggregation:
{
  "took" : 41,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 2, "relation" : "eq" },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "schools",
        "_type" : "school",
        "_id" : "5",
        "_score" : 1.0,
        "_source" : {
          "name" : "Central School",
          "description" : "CBSE Affiliation",
          "street" : "Nagan",
          "city" : "paprola",
          "state" : "HP",
          "zip" : "176115",
          "location" : [ 31.8955385, 76.8380405 ],
          "fees" : 2200,
          "tags" : [ "Senior Secondary", "beautiful campus" ],
          "rating" : "3.3"
        }
      },
      {
        "_index" : "schools",
        "_type" : "school",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "name" : "City Best School",
          "description" : "ICSE",
          "street" : "West End",
          "city" : "Meerut",
          "state" : "UP",
          "zip" : "250002",
          "location" : [ 28.9926174, 77.692485 ],
          "fees" : 3500,
          "tags" : [ "fully computerized" ],
          "rating" : "4.5"
        }
      }
    ]
  },
  "aggregations" : {
    "avg_fees" : { "value" : 2850.0 }
  }
}
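Outside the Kibana console, the same request can be sent over HTTP. The sketch below uses only the Python standard library. The cluster address `http://localhost:9200` is an assumption (a local cluster without authentication), and the helper names `build_search_request` and `run_search` are hypothetical. The block only builds the request; the call that would contact the cluster is left commented out.

```python
import json
import urllib.request

# Assumption: a local cluster without security enabled; adjust for your setup.
ES_URL = "http://localhost:9200"


def build_search_request(index, body, size=None):
    """Build (but do not send) a POST /<index>/_search request."""
    url = f"{ES_URL}/{index}/_search"
    if size is not None:
        url += f"?size={size}"  # size=0 suppresses the hits in the response
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_search(request):
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


avg_request = build_search_request(
    "schools", {"aggs": {"avg_fees": {"avg": {"field": "fees"}}}}
)
print(avg_request.get_method(), avg_request.full_url)

# With a running cluster:
# result = run_search(avg_request)
# print(result["aggregations"]["avg_fees"]["value"])
```

The aggregation result is read from the `aggregations` object of the response under the name given in the request, here `avg_fees`.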
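Cardinality Aggregation
This aggregation returns the number of distinct values of a particular field.
POST /schools/_search?size=0
{
  "aggs" : {
    "distinct_name_count" : { "cardinality" : { "field" : "fees" } }
  }
}
Running the above code gives the following result:
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 2, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "distinct_name_count" : { "value" : 2 }
  }
}
Note: The cardinality is 2 because the fees field holds two distinct values, 2200 and 3500. Despite its name, distinct_name_count counts distinct values of fees, not of name.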
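To make the meaning of "distinct values" concrete, here is a minimal Python sketch that counts them exactly with a set, using the two `fees` values from the sample documents. This is an exact stand-in for illustration, not how Elasticsearch computes the result: the cardinality aggregation is approximate (it is based on the HyperLogLog++ algorithm), although it is accurate for small sets like this one. The function name `exact_cardinality` is hypothetical.

```python
# fees values of the two sample documents in the schools index
fees = [2200, 3500]


def exact_cardinality(values):
    """Count distinct values exactly by collecting them in a set."""
    return len(set(values))


print(exact_cardinality(fees))
# A repeated value does not increase the count.
print(exact_cardinality(fees + [2200]))
```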
Extended Stats Aggregation
This aggregation returns the metrics of the stats aggregation (count, min, max, avg, and sum) for a numeric field, plus sum_of_squares, variance, std_deviation, and std_deviation_bounds.
POST /schools/_search?size=0
{
  "aggs" : {
    "fees_stats" : { "extended_stats" : { "field" : "fees" } }
  }
}
Running the above code gives the following result:
{
  "took" : 8,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 2, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "fees_stats" : {
      "count" : 2,
      "min" : 2200.0,
      "max" : 3500.0,
      "avg" : 2850.0,
      "sum" : 5700.0,
      "sum_of_squares" : 1.709E7,
      "variance" : 422500.0,
      "std_deviation" : 650.0,
      "std_deviation_bounds" : {
        "upper" : 4150.0,
        "lower" : 1550.0
      }
    }
  }
}
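The numbers in this response can be reproduced by hand, which shows how the extra fields are derived. The sketch below assumes population variance (dividing by the count) and bounds of avg plus or minus `sigma` standard deviations with `sigma` set to 2; both assumptions match the values in the response above. The function name `extended_stats` is hypothetical.

```python
import math


def extended_stats(values, sigma=2.0):
    """Recompute extended_stats-style metrics for a non-empty list of numbers."""
    count = len(values)
    total = sum(values)
    avg = total / count
    # Population variance: mean of the squared deviations from the average.
    variance = sum((v - avg) ** 2 for v in values) / count
    std_deviation = math.sqrt(variance)
    return {
        "count": count,
        "min": min(values),
        "max": max(values),
        "avg": avg,
        "sum": total,
        "sum_of_squares": sum(v * v for v in values),
        "variance": variance,
        "std_deviation": std_deviation,
        "std_deviation_bounds": {
            "upper": avg + sigma * std_deviation,
            "lower": avg - sigma * std_deviation,
        },
    }


result = extended_stats([2200.0, 3500.0])
for key, value in result.items():
    print(key, value)
```

For the two fees, each value lies 650 away from the average of 2850, so the variance is 650 squared (422500) and the bounds are 2850 plus or minus 1300.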
Max Aggregation
This aggregation finds the maximum value of a numeric field across the aggregated documents.
POST /schools/_search?size=0
{
  "aggs" : {
    "max_fees" : { "max" : { "field" : "fees" } }
  }
}
Running the above code gives the following result:
{
  "took" : 16,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 2, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "max_fees" : { "value" : 3500.0 }
  }
}
Min Aggregation
This aggregation finds the minimum value of a numeric field across the aggregated documents.
POST /schools/_search?size=0
{
  "aggs" : {
    "min_fees" : { "min" : { "field" : "fees" } }
  }
}
Running the above code gives the following result:
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 2, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "min_fees" : { "value" : 2200.0 }
  }
}
Sum Aggregation
This aggregation calculates the sum of a numeric field across the aggregated documents.
POST /schools/_search?size=0
{
  "aggs" : {
    "total_fees" : { "sum" : { "field" : "fees" } }
  }
}
Running the above code gives the following result:
{
  "took" : 8,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 2, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "total_fees" : { "value" : 5700.0 }
  }
}
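The max, min, and sum requests above each run one aggregation, but a single search can carry several named aggregations side by side. A minimal Python sketch that assembles such a combined request body follows; the helper name `metric` is hypothetical, while the aggregation names are the ones used in the individual requests.

```python
import json


def metric(agg_type, field):
    """Return a single-field metric aggregation body, e.g. {"max": {"field": "fees"}}."""
    return {agg_type: {"field": field}}


# Three sibling aggregations in one request; each appears in the response
# under its own name inside "aggregations".
combined_request = {
    "aggs": {
        "max_fees": metric("max", "fees"),
        "min_fees": metric("min", "fees"),
        "total_fees": metric("sum", "fees"),
    }
}

print("POST /schools/_search?size=0")
print(json.dumps(combined_request, indent=2))
```

Sending one combined request avoids repeating the same search three times when all three values are needed together.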
Other metrics aggregations cover special cases, such as the geo bounds aggregation and the geo centroid aggregation for geolocation data.
Stats Aggregation
A multi-value metrics aggregation that computes statistics (count, min, max, avg, and sum) over the numeric values extracted from the aggregated documents.
POST /schools/_search?size=0
{
  "aggs" : {
    "grades_stats" : { "stats" : { "field" : "fees" } }
  }
}
Running the above code gives the following result:
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 2, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "grades_stats" : {
      "count" : 2,
      "min" : 2200.0,
      "max" : 3500.0,
      "avg" : 2850.0,
      "sum" : 5700.0
    }
  }
}
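The five values of a stats result are related to each other, which makes a quick sanity check possible when consuming the response in code. The sketch below is one illustrative check, not part of Elasticsearch; the function name `is_consistent` and its tolerance parameter are hypothetical.

```python
# The "grades_stats" object from the response above.
stats = {"count": 2, "min": 2200.0, "max": 3500.0, "avg": 2850.0, "sum": 5700.0}


def is_consistent(s, tol=1e-9):
    """Check that avg equals sum / count and lies between min and max."""
    if s["count"] == 0:
        return True  # nothing to check when no documents were aggregated
    expected_avg = s["sum"] / s["count"]
    avg_matches = abs(expected_avg - s["avg"]) <= tol * max(1.0, abs(s["avg"]))
    return avg_matches and s["min"] <= s["avg"] <= s["max"]


print(is_consistent(stats))
```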
Aggregation Metadata
You can attach data to an aggregation at request time with the meta key. The same data is returned with that aggregation in the response.
POST /schools/_search?size=0
{
  "aggs" : {
    "avg_fees" : {
      "avg" : { "field" : "fees" },
      "meta" : { "dsc" : "Lowest Fees This Year" }
    }
  }
}
Running the above code gives the following result:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 2, "relation" : "eq" },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "avg_fees" : {
      "meta" : { "dsc" : "Lowest Fees This Year" },
      "value" : 2850.0
    }
  }
}
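One use for the returned metadata is labelling results on the client side. The following Python sketch pairs each single-value aggregation in a response with its `dsc` entry, falling back to the aggregation name when no metadata is present. The response dictionary is the `aggregations` part of the result above; the function name `labelled_values` is hypothetical.

```python
# The "aggregations" part of the response above.
response = {
    "aggregations": {
        "avg_fees": {
            "meta": {"dsc": "Lowest Fees This Year"},
            "value": 2850.0,
        }
    }
}


def labelled_values(aggregations, meta_key="dsc"):
    """Return (label, value) pairs, using meta[meta_key] as the label if set."""
    rows = []
    for name, result in aggregations.items():
        label = result.get("meta", {}).get(meta_key, name)
        rows.append((label, result.get("value")))
    return rows


for label, value in labelled_values(response["aggregations"]):
    print(f"{label}: {value}")
```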