Skip to main content

Search Aggregation

There are several situations where specific data is not too relevant but needs lifting to a higher level. That is where aggregation comes into play. Aggregation allows grouping entities by one or more properties and then do math operations on that group.

For example, the following search will select and count compute instances that are older than 3 years:

> search is(instance) and age > 3y | count
​total matched: 21
​total unmatched: 0

The count command is actually a special type of aggregation.

Aggregation Functions

Aggregations can utilize the following functions:

  • sum
  • min
  • max
  • avg

Function arguments can be variable names (e.g., min(path.to.prop)), static values (e.g., sum(1)), or even calculations using simple expressions (min(path.to.prop * 3 + 2)).

Each grouping function can have an as <name> clause to give the function result a specific name: <function>(..) as <name>. If this as <name> clause is omitted, a name is derived from the function name and property path.

Example: min(memory), sum(1) as count, avg(instance_cores) as average_cores.

The aggregate command tells Fix Inventory to aggregate the search results based on the defined criteria. Each result of the search is then passed to the defined aggregation function(s).

The above example using the count could also be rewritten with aggregate like so:

> search is(instance) and age > 3y | aggregate sum(1) as count
​count: 21

Every element is counted as 1, so sum(1) is the number of elements.

It is also possible to define multiple aggregation functions. Let's count both instances and cores:

> search is(instance) and age > 3y | aggregate
sum(1) as count,
sum(instance_cores) as cores
​count: 21
​cores: 66

We could even compute the average, minimum, and maximum number of available cores:

> search is(instance) and age > 3y | aggregate
sum(1) as count,
sum(instance_cores) as cores,
min(instance_cores) as min_cores,
max(instance_cores) as max_cores,
avg(instance_cores) as avg_cores
​count: 21
​cores: 66
​min_cores: 1
​max_cores: 4
​avg_cores: 3.14

Aggregation Groups

The real power of aggregations is in defining groups and applying functions on those groups.

A group is defined using a property path. The value of this path is looked up in every document. All documents with the same value form a group.

For example, instances can be grouped by status:

> search is(instance) and age > 3y | aggregate instance_status: sum(1) as count
​group:
​ instance_status: running
​count: 1
​---
​group:
​ instance_status: stopped
​count: 15
​---
​group:
​ instance_status: terminated
​count: 5

Grouping variable can be named using as. By default, the last part of the path is used as the variable name.

Additional aggregation functions also get applied on each group:

> search is(instance) and age > 3y | aggregate instance_status as status:
sum(1) as count,
sum(instance_cores) as cores,
min(instance_cores) as min_cores,
max(instance_cores) as max_cores,
avg(instance_cores) as avg_cores
​group:
​ status: running
​count: 1
​cores: 4
​min_cores: 4
​max_cores: 4
​avg_cores: 4
​---
​ group:
​status: stopped
​count: 15
​cores: 51
​min_cores: 1
​max_cores: 4
​avg_cores: 3.4
​---
​ group:
​status: terminated
​count: 5
​cores: 11
​min_cores: 1
​max_cores: 4
​avg_cores: 2.2

Groups can also be defined using multiple grouping variables:

> search is(instance) and age > 3y | aggregate
instance_status as status, instance_type as type:
sum(1) as count,
sum(instance_cores) as cores
​group:
​ status: running
​ type: n1-standard-4
​count: 1
​cores: 4
​---
​group:
​ status: stopped
​ type: m5.xlarge
​count: 12
​cores: 48
​---
​group:
​ status: stopped
​ type: t2.micro
​count: 3
​cores: 3
​---
​group:
​ status: terminated
​ type: n1-standard-2
​count: 5
​cores: 11