Elasticsearch aggregations - OR in buckets

Say I have 5 docs:

{
  "owner": "joe",
  "color": "black"
},
{
  "owner": "joe",
  "color": "red"
},
{
  "owner": "joe",
  "color": "blue"
},
{
  "owner": "jack",
  "color": "black"
},
{
  "owner": "jack",
  "color": "white"
}

and aggregations:

{
  "aggs": {
    "owner": {
      "terms": {
        "field": "owner"
      }
    },
    "color": {
      "terms": {
        "field": "color"
      }
    }
  }
}

to aggregate docs by owner and color.

If I run a match_all query, I get:

owner
joe: 3
jack: 2

color
black: 2
red: 1
blue: 1
white: 1

What I want to achieve: if I filter docs by owner: joe, I want to get the 3 docs where owner is joe, along with this color aggregation:

color
black: 1
red: 1
blue: 1

BUT I'd still like to get the full owner aggregation:

owner
joe: 3 [selected]
jack: 2 [possible to extend]

So I want the counts of the other buckets that could be selected to extend the final result, something like an "OR" between the buckets.

How can I achieve this?

2 answers

  • answered 2021-01-11 05:40 ESCoder

    As far as I understand, you want to aggregate on the owner as well as on color (where the owner is equal to joe). You can use a filter aggregation to achieve this:

    {
      "size": 0,
      "aggs": {
        "owner": {
          "terms": {
            "field": "owner.keyword"
          }
        },
        "filtered_aggregation": {
          "filter": {
            "term": {
              "owner.keyword": "joe"
            }
          },
          "aggs": {
            "color": {
              "terms": {
                "field": "color.keyword"
              }
            }
          }
        }
      }
    }
    

    Search Result:

    "aggregations": {
        "owner": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "joe",
              "doc_count": 3
            },
            {
              "key": "jack",
              "doc_count": 2
            }
          ]
        },
        "filtered_aggregation": {
          "doc_count": 3,
          "color": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "black",
                "doc_count": 1
              },
              {
                "key": "blue",
                "doc_count": 1
              },
              {
                "key": "red",
                "doc_count": 1
              }
            ]
          }
        }
      }
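
To use these counts in application code, the buckets can be flattened into plain dictionaries. The snippet below is only a sketch: it works on a hand-copied, trimmed version of the response above rather than on a live client call, and `bucket_counts` is a hypothetical helper name, not part of any Elasticsearch library.

```python
# Aggregations section of the response above, trimmed to the fields used here.
aggregations = {
    "owner": {
        "buckets": [
            {"key": "joe", "doc_count": 3},
            {"key": "jack", "doc_count": 2},
        ]
    },
    "filtered_aggregation": {
        "doc_count": 3,
        "color": {
            "buckets": [
                {"key": "black", "doc_count": 1},
                {"key": "blue", "doc_count": 1},
                {"key": "red", "doc_count": 1},
            ]
        },
    },
}


def bucket_counts(terms_agg):
    """Turn a terms aggregation result into a {key: doc_count} dict."""
    return {b["key"]: b["doc_count"] for b in terms_agg["buckets"]}


# Unfiltered owner counts: every owner the user could still select.
owner_counts = bucket_counts(aggregations["owner"])
# Color counts restricted to joe by the filter aggregation.
color_counts = bucket_counts(aggregations["filtered_aggregation"]["color"])

print(owner_counts)  # {'joe': 3, 'jack': 2}
print(color_counts)  # {'black': 1, 'blue': 1, 'red': 1}
```

Note that the color terms aggregation is nested one level down, under the name of the filter aggregation that wraps it.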
    

  • answered 2021-01-11 05:54 Val

    The usual way to achieve this is by using a post_filter. The query below will return:

    • only joe's colors (using filtered_colors)
    • only joe's documents (using post_filter)
    • all owners that you can filter on (using all_owners)

    Query:

    POST owners/_search
    {
      "aggs": {
        "filtered_colors": {
          "filter": {
            "term": {
              "owner.keyword": "joe"
            }
          },
          "aggs": {
            "color": {
              "terms": {
                "field": "color.keyword"
              }
            }
          }
        },
        "all_owners": {
          "terms": {
            "field": "owner.keyword"
          }
        }
      },
      "post_filter": {
        "term": {
          "owner.keyword": "joe"
        }
      }
    }
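
Why this works: aggregations are computed over the documents matched by the query (here there is no query, so all documents), while post_filter is applied afterwards and only narrows the returned hits. The following is a rough pure-Python model of that order of operations on the five sample documents; it does not talk to Elasticsearch, and `simulate_search` is a hypothetical name used only for illustration.

```python
from collections import Counter

# The five sample documents from the question.
docs = [
    {"owner": "joe", "color": "black"},
    {"owner": "joe", "color": "red"},
    {"owner": "joe", "color": "blue"},
    {"owner": "jack", "color": "black"},
    {"owner": "jack", "color": "white"},
]


def simulate_search(docs, selected_owner):
    """Model of the request above: aggregations first, post_filter last."""
    # No "query" clause, so the aggregation scope is every document.
    agg_scope = docs

    # "all_owners": plain terms aggregation over the whole scope.
    all_owners = Counter(d["owner"] for d in agg_scope)

    # "filtered_colors": filter aggregation, then terms on color inside it.
    filtered_colors = Counter(
        d["color"] for d in agg_scope if d["owner"] == selected_owner
    )

    # "post_filter": narrows the hits only, after aggregations are computed.
    hits = [d for d in agg_scope if d["owner"] == selected_owner]

    return hits, dict(all_owners), dict(filtered_colors)


hits, all_owners, filtered_colors = simulate_search(docs, "joe")
print(len(hits))        # 3
print(all_owners)       # {'joe': 3, 'jack': 2}
print(filtered_colors)  # {'black': 1, 'red': 1, 'blue': 1}
```

If the owner condition were placed in the main query instead of post_filter, the aggregation scope would shrink to joe's documents and jack would disappear from the owner buckets, which is exactly what the question wants to avoid.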