json - How to query an elasticsearch aggregation with a term and sum on different nested objects? -


i have following object value attribute nested object type:

{     "metadata": {         "tenant": "home",         "timestamp": "2016-03-24t23:59:38z"     },     "value": {         { "key": "foo", "int_value": 100 },         { "key": "bar", "str_value": "taco" }     } } 

this type of object has following mapping:

{     "my_index": {         "mappings": {             "my_doctype": {                 "properties": {                     "metadata": {                         "properties": {                             "tenant": {                                 "type": "string",                                 "index": "not_analyzed"                             },                             "timestamp": {                                 "type": "date",                                 "format": "dateoptionaltime"                             }                         }                     },                     "value": {                         "type": "nested",                         "properties": {                             "str_value": {                                 "type": "string",                                 "index": "not_analyzed"                             },                             "int_value": {                                 "type": "long"                             },                             "key": {                                 "type": "string",                                 "index": "not_analyzed"                             }                         }                     }                 }             }         }     } } 

with setup, perform aggregation performs following result:

  • perform term aggregation on str_value attribute of objects key set "bar"
  • in each bucket created above aggregation, calculate sum of int_value attributes key set "foo"
  • have results laid out in date_histogram given time range.

with goal in mind, have been able term , date_histogram aggregations work on nested objects, have not had luck performing second level of calculation. here current query attempting working:

{   "query": {     "match_all": {}   },   "aggs": {     "filters": {       "filter": {         "bool": {           "must": [             {               "term": {                 "metadata.org": "gw"               }             },             {               "range": {                 "metadata.timestamp": {                   "gte": "2016-03-24t00:00:00.000z",                   "lte": "2016-03-24t23:59:59.999z"                 }               }             }           ]         }       },       "aggs": {         "intervals": {           "date_histogram": {             "field": "metadata.timestamp",             "interval": "1d",             "min_doc_count": 0,             "extended_bounds": {               "min": "2016-03-24t00:00:00z",               "max": "2016-03-24t23:59:59z"             },             "format": "yyyy-mm-dd't'hh:mm:ss'z'"           },           "aggs": {             "nested_type": {               "nested": {                 "path": "value"               },               "aggs": {                 "key_filter": {                   "filter": {                     "term": {                       "value.key": "bar"                     }                   },                   "aggs": {                     "groupby": {                       "terms": {                         "field": "value.str_value"                       },                       "aggs": {                         "other_nested": {                           "reverse_nested": {                             "path": "value"                           },                           "aggs": {                             "key_filter": {                               "filter": {                                 "term": {                                   "value.key": "foo"                                 }                               },                               "aggs": {                                 "amount_sum": {                                   "sum": {                                     "field": "value.int_value"                                   }                                 }                               }                             }                           }                         }                       }                     }                   }                 }               }             }           }         }       }     }   } } 

the result expecting receive elasticsearch following:

{   "took": 1,   "timed_out": false,   "_shards": {     "total": 5,     "successful": 5,     "failed": 0   },   "hits": {     "total": 7,     "max_score": 0.0,     "hits": []   },   "aggregations": {     "filters": {       "doc_count": 2,       "intervals": {         "buckets": [           {             "key_as_string": "2016-03-24t00:00:00z",             "key": 1458777600000,             "doc_count": 2,             "nested_type": {               "doc_count": 5,               "key_filter": {                 "doc_count": 2,                 "groupby": {                   "doc_count_error_upper_bound": 0,                   "sum_other_doc_count": 0,                   "buckets": [                     {                       "key": "taco",                       "doc_count": 1,                       "other_nested": {                         "doc_count": 1,                         "key_filter": {                           "doc_count": 1,                           "amount_sum": {                             "value": 100.0                           }                         }                       }                     }                   ]                 }               }             }           }         ]       }     }   } } 

however, innermost object (...groupby.buckets.key_filter.amount_sum) having value return 0.0 instead of 100.0.

i think due fact nested objects indexed separate documents, filtering 1 key attribute's value not allowing me query against key.

would have idea on how type of query work?

for bit more context, reason document structure because not control content of json documents indexed, different tenants may have conflicting key names different values (e.g. {"tenant": "abc", "value": {"foo": "a"} } vs. {"tenant": "xyz", "value": {"foo": 1} }. method trying use 1 laid out elasticsearch blog post, recommends transform objects don't control structure , use nested objects (specifically nested fields each data type section of article). open learn of better way handle situation of not controlling document's json structure if there 1 can perform aggregations.

thank you!

edit: using elasticsearch 1.5.

solved situation utilizing reverse_nested aggregation in correct way described here: http://www.shayne.me/blog/2015/2015-05-18-elasticsearch-nested-docs/


Comments