i have following object value
attribute nested object type:
{ "metadata": { "tenant": "home", "timestamp": "2016-03-24t23:59:38z" }, "value": { { "key": "foo", "int_value": 100 }, { "key": "bar", "str_value": "taco" } } }
this type of object has following mapping:
{ "my_index": { "mappings": { "my_doctype": { "properties": { "metadata": { "properties": { "tenant": { "type": "string", "index": "not_analyzed" }, "timestamp": { "type": "date", "format": "dateoptionaltime" } } }, "value": { "type": "nested", "properties": { "str_value": { "type": "string", "index": "not_analyzed" }, "int_value": { "type": "long" }, "key": { "type": "string", "index": "not_analyzed" } } } } } } } }
with setup, perform aggregation performs following result:
- perform
term
aggregation onstr_value
attribute of objectskey
set"bar"
- in each bucket created above aggregation, calculate
sum
ofint_value
attributeskey
set"foo"
- have results laid out in
date_histogram
given time range.
with goal in mind, have been able term
, date_histogram
aggregations work on nested objects, have not had luck performing second level of calculation. here current query attempting working:
{ "query": { "match_all": {} }, "aggs": { "filters": { "filter": { "bool": { "must": [ { "term": { "metadata.org": "gw" } }, { "range": { "metadata.timestamp": { "gte": "2016-03-24t00:00:00.000z", "lte": "2016-03-24t23:59:59.999z" } } } ] } }, "aggs": { "intervals": { "date_histogram": { "field": "metadata.timestamp", "interval": "1d", "min_doc_count": 0, "extended_bounds": { "min": "2016-03-24t00:00:00z", "max": "2016-03-24t23:59:59z" }, "format": "yyyy-mm-dd't'hh:mm:ss'z'" }, "aggs": { "nested_type": { "nested": { "path": "value" }, "aggs": { "key_filter": { "filter": { "term": { "value.key": "bar" } }, "aggs": { "groupby": { "terms": { "field": "value.str_value" }, "aggs": { "other_nested": { "reverse_nested": { "path": "value" }, "aggs": { "key_filter": { "filter": { "term": { "value.key": "foo" } }, "aggs": { "amount_sum": { "sum": { "field": "value.int_value" } } } } } } } } } } } } } } } } } }
the result expecting receive elasticsearch following:
{ "took": 1, "timed_out": false, "_shards": { "total": 5, "successful": 5, "failed": 0 }, "hits": { "total": 7, "max_score": 0.0, "hits": [] }, "aggregations": { "filters": { "doc_count": 2, "intervals": { "buckets": [ { "key_as_string": "2016-03-24t00:00:00z", "key": 1458777600000, "doc_count": 2, "nested_type": { "doc_count": 5, "key_filter": { "doc_count": 2, "groupby": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [ { "key": "taco", "doc_count": 1, "other_nested": { "doc_count": 1, "key_filter": { "doc_count": 1, "amount_sum": { "value": 100.0 } } } } ] } } } } ] } } } }
however, innermost object (...groupby.buckets.key_filter.amount_sum
) having value
return 0.0
instead of 100.0
.
i think due fact nested objects indexed separate documents, filtering 1 key
attribute's value not allowing me query against key
.
would have idea on how type of query work?
for bit more context, reason document structure because not control content of json documents indexed, different tenant
s may have conflicting key names different values (e.g. {"tenant": "abc", "value": {"foo": "a"} }
vs. {"tenant": "xyz", "value": {"foo": 1} }
. method trying use 1 laid out elasticsearch blog post, recommends transform objects don't control structure , use nested objects (specifically nested fields each data type
section of article). open learn of better way handle situation of not controlling document's json structure if there 1 can perform aggregations.
thank you!
edit: using elasticsearch 1.5.
solved situation utilizing reverse_nested
aggregation in correct way described here: http://www.shayne.me/blog/2015/2015-05-18-elasticsearch-nested-docs/
Comments
Post a Comment