Get count distinct by nested objects from Elasticsearch

I have index with following mapping

{
    "mappings": {
        "properties": {
            "typed_obj": {
                "type": "nested",
                "properties": {
                    "id": {"type": "keyword"},
                    "type": {"type": "keyword"}
                }
            }
        }
    }
}

and documents

{"index" : {}}
{"typed_obj": [{"id": "1", "type": "one"}, {"id": "2", "type": "two"}]}
{"index" : {}}
{"typed_obj": [{"id": "1", "type": "one"}, {"id": "2", "type": "one"}]}
{"index" : {}}
{"typed_obj": [{"id": "1", "type": "one"}, {"id": "3", "type": "one"}]}
{"index" : {}}
{"typed_obj": [{"id": "1", "type": "one"}, {"id": "4", "type": "two"}]}

How can i group typed_obj by type and calculate unique id ? Smth like

{
 "type": "one",
 "count": 3
},
{
 "type": "two",
 "count": 2
}

I made up query with agg

{
    "query": {
        "match_all": {}
    },
    "aggs": {
        "obj_nested": {
            "nested": {
                "path": "typed_obj"
            },
            "aggs": {
                "by_type_and_id": {
                    "multi_terms": {
                        "terms": [
                            {
                                "field": "typed_obj.type"
                            },
                            {
                                "field": "typed_obj.id"
                            }
                        ]
                    }
                }
            }
        }
    },
    "size": 0
}

and it returns

"buckets": [
                    {
                        "key": [
                            "one",
                            "1"
                        ],
                        "key_as_string": "one|1",
                        "doc_count": 4
                    },
                    {
                        "key": [
                            "one",
                            "2"
                        ],
                        "key_as_string": "one|2",
                        "doc_count": 1
                    },
                    {
                        "key": [
                            "one",
                            "3"
                        ],
                        "key_as_string": "one|3",
                        "doc_count": 1
                    },
                    {
                        "key": [
                            "two",
                            "2"
                        ],
                        "key_as_string": "two|2",
                        "doc_count": 1
                    },
                    {
                        "key": [
                            "two",
                            "4"
                        ],
                        "key_as_string": "two|4",
                        "doc_count": 1
                    }
                ]

In backend app i can group keys by first element (it is typed_obj type) and then retriev length, but my question is - is it possible to get types count without obtain from index all id+type pairs ?

1 answer

  • answered 2022-05-06 12:30 Sagar Patel

    You need to use Cardinality aggregation to count distinct values.

    Query:

    {
      "query": {
        "match_all": {}
      },
      "aggs": {
        "obj_nested": {
          "nested": {
            "path": "typed_obj"
          },
          "aggs": {
            "type":{
              "terms": {
                "field": "typed_obj.type",
                "size": 10
              },
              "aggs": {
                "id": {
                  "cardinality": {
                    "field": "typed_obj.id"
                  }
                }
              }
            }
          }
        }
      },
      "size": 0
    }
    

    Response

    "aggregations" : {
        "obj_nested" : {
          "doc_count" : 8,
          "type" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "one",
                "doc_count" : 6,
                "id" : {
                  "value" : 3
                }
              },
              {
                "key" : "two",
                "doc_count" : 2,
                "id" : {
                  "value" : 2
                }
              }
            ]
          }
        }
      }
    

    Note:

    A single-value metrics aggregation that calculates an approximate count of distinct values.

How many English words
do you know?
Test your English vocabulary size, and measure
how many words do you know
Online Test
Powered by Examplum