警告
此功能是实验性的,可能会在将来的版本中完全更改或删除。Elastic将采取最大的努力来解决此问题,但实验功能不受SLA官方功能的支持。
导数管道聚合,其计算父直方图(或日期 - 图形)聚合中指定度量的导数。指定的度量必须是数字,并且必须设置直方图min_doc_count为0(默认为直方图聚合)。
derivative(导数)
聚合结构如下:
"derivative": {
"buckets_path": "the_sum"
}
derivative(导数)
参数如下:
参数名称 | 描述 | 是否必填 | 默认值 |
---|---|---|---|
buckets_path | 想要计算导数值的桶路径,点击 the section called “buckets_path Syntaxedit”查看更多细节 |
必填 | |
gap_policy | 当数据缺口出现时采用的策略,点击the section called “Dealing with gaps in the dataedit”查看更多细节 | 可选 | skip |
format | 用于规范聚合输出值的格式 | 可选 | null |
以下代码段计算每月总销售额的导数:
POST /sales/_search
{
"size": 0,
"aggs" : {
"sales_per_month" : {
"date_histogram" : {
"field" : "date",
"interval" : "month"
},
"aggs": {
"sales": {
"sum": {
"field": "price"
}
},
"sales_deriv": {
"derivative": {
"buckets_path": "sales" #1
}
}
}
}
}
}
| 1 | buckets_path指示这个derivative聚合是想要得到sales_per_month日期直方图聚合中sales聚合值的导数。 |
响应可能如下所示:
{
"took": 11,
"timed_out": false,
"_shards": ...,
"hits": ...,
"aggregations": {
"sales_per_month": {
"buckets": [
{
"key_as_string": "2015/01/01 00:00:00",
"key": 1420070400000,
"doc_count": 3,
"sales": {
"value": 550.0
} #1
},
{
"key_as_string": "2015/02/01 00:00:00",
"key": 1422748800000,
"doc_count": 2,
"sales": {
"value": 60.0
},
"sales_deriv": {
"value": -490.0 #2
}
},
{
"key_as_string": "2015/03/01 00:00:00",
"key": 1425168000000,
"doc_count": 2, #3
"sales": {
"value": 375.0
},
"sales_deriv": {
"value": 315.0
}
}
]
}
}
}
| 1 | 由于我们至少需要2个数据点来计算导数,因此第一个桶没有值 | | 2 | 导数的单位默认和sales聚合以及父直方图相同。所以在这种情况下,如果价格字段的单位是美元,导数的单位就是美元/月 | | 3 | doc_count表示桶中的文档数 |
可以把导数管道聚合链接到另一个管道聚合的结果,计算二级导数。如以下示例所示,它将计算总月销售额的第一和第二阶导数:
POST /sales/_search
{
"size": 0,
"aggs" : {
"sales_per_month" : {
"date_histogram" : {
"field" : "date",
"interval" : "month"
},
"aggs": {
"sales": {
"sum": {
"field": "price"
}
},
"sales_deriv": {
"derivative": {
"buckets_path": "sales"
}
},
"sales_2nd_deriv": {
"derivative": {
"buckets_path": "sales_deriv" #1
}
}
}
}
}
}
| 1 | 二阶导数的buckets_path指向一阶导数的名称 |
响应可能如下所示:
{
"took": 50,
"timed_out": false,
"_shards": ...,
"hits": ...,
"aggregations": {
"sales_per_month": {
"buckets": [
{
"key_as_string": "2015/01/01 00:00:00",
"key": 1420070400000,
"doc_count": 3,
"sales": {
"value": 550.0
} #1
},
{
"key_as_string": "2015/02/01 00:00:00",
"key": 1422748800000,
"doc_count": 2,
"sales": {
"value": 60.0
},
"sales_deriv": {
"value": -490.0
} #2
},
{
"key_as_string": "2015/03/01 00:00:00",
"key": 1425168000000,
"doc_count": 2,
"sales": {
"value": 375.0
},
"sales_deriv": {
"value": 315.0
},
"sales_2nd_deriv": {
"value": 805.0
}
}
]
}
}
}
| 1 | 由于我们至少需要2个数据点,所以前两个桶没有二级导数 | | 2 | 一级导数计算二级导数 |
导数聚合允许指定导数值的单位。这将在响应normalized_value中返回一个额外的字段,汇报在X轴单位下的导数。在下面的例子中,我们计算出每月销售额的导数,但要求销售的导数按天计算:
POST /sales/_search
{
"size": 0,
"aggs" : {
"sales_per_month" : {
"date_histogram" : {
"field" : "date",
"interval" : "month"
},
"aggs": {
"sales": {
"sum": {
"field": "price"
}
},
"sales_deriv": {
"derivative": {
"buckets_path": "sales",
"unit": "day" #1
}
}
}
}
}
}
| 1 | unit指定用于导数计算的X轴的单位 |
响应可能如下所示:
{
"took": 50,
"timed_out": false,
"_shards": ...,
"hits": ...,
"aggregations": {
"sales_per_month": {
"buckets": [
{
"key_as_string": "2015/01/01 00:00:00",
"key": 1420070400000,
"doc_count": 3,
"sales": {
"value": 550.0
} #1
},
{
"key_as_string": "2015/02/01 00:00:00",
"key": 1422748800000,
"doc_count": 2,
"sales": {
"value": 60.0
},
"sales_deriv": {
"value": -490.0, #2
"normalized_value": -15.806451612903226 #3
}
},
{
"key_as_string": "2015/03/01 00:00:00",
"key": 1425168000000,
"doc_count": 2,
"sales": {
"value": 375.0
},
"sales_deriv": {
"value": 315.0,
"normalized_value": 11.25
}
}
]
}
}
}
| 1,2 | value值是原单位:按月 |
| 3 | normalized_value值是请求的单位:按天
|