Querying metrics

Introduction

Uptrace provides a powerful query language that supports joining, grouping, and aggregating multiple metrics in a single query.

To get the most out of this document, we recommend first learning about OpenTelemetry Metrics.

Timeseries

A timeseries is a metric with a unique set of attributes. For example, each host has a separate timeseries for the same metric name:

-- metric name { attr1, attr2... }
system.filesystem.usage { host.name='host1' } -- timeseries 1
system.filesystem.usage { host.name='host2' } -- timeseries 2

You can also use attributes to make metrics more detailed. For example, you can use the state attribute to report the number of free and used bytes in the filesystem:

system.filesystem.usage { host.name='host1', state = 'free' } -- timeseries 1
system.filesystem.usage { host.name='host1', state = 'used' } -- timeseries 2

system.filesystem.usage { host.name='host2', state = 'free' } -- timeseries 3
system.filesystem.usage { host.name='host2', state = 'used' } -- timeseries 4
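
To make the definition concrete, here is a minimal Python sketch (not Uptrace code) that treats a timeseries as the combination of a metric name and its attribute set. The `timeseries_key` helper and the sample values are hypothetical and exist only for illustration.

```python
# Illustrative model only: a timeseries is identified by its metric name
# plus the full set of attribute key/value pairs.

def timeseries_key(metric, attrs):
    """Return a hashable identity for a timeseries (hypothetical helper)."""
    return (metric, frozenset(attrs.items()))

# Made-up datapoints: (metric name, attributes, value in bytes).
datapoints = [
    ("system.filesystem.usage", {"host.name": "host1", "state": "free"}, 70),
    ("system.filesystem.usage", {"host.name": "host1", "state": "used"}, 30),
    ("system.filesystem.usage", {"host.name": "host2", "state": "free"}, 40),
    ("system.filesystem.usage", {"host.name": "host2", "state": "used"}, 60),
    # A later measurement for an existing timeseries does not create a new one.
    ("system.filesystem.usage", {"host.name": "host1", "state": "free"}, 65),
]

unique_series = {timeseries_key(metric, attrs) for metric, attrs, _ in datapoints}
print(len(unique_series))  # 4 distinct timeseries
```

Five datapoints produce four timeseries because the last datapoint repeats an attribute set that already exists.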

With just 2 attributes, you can write a number of useful queries:

-- the filesystem size (free+used bytes) on each host
system.filesystem.usage | group by host.name

-- the number of free bytes on each host
filter(system.filesystem.usage, state = 'free') as free | group by host.name

-- the size of your dataset (on all hosts)
filter(system.filesystem.usage, state = 'used') as dataset_size
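
The following Python sketch models what the three queries above compute. It is not how Uptrace evaluates queries; it assumes that values are summed when timeseries are grouped, and the helpers `filter_series` and `group_sum` as well as the byte values are made up for illustration.

```python
from collections import defaultdict

# Made-up latest values in bytes, one entry per timeseries.
series = [
    ({"host.name": "host1", "state": "free"}, 70),
    ({"host.name": "host1", "state": "used"}, 30),
    ({"host.name": "host2", "state": "free"}, 40),
    ({"host.name": "host2", "state": "used"}, 60),
]

def filter_series(series, conditions):
    """Keep timeseries whose attributes match every key/value in conditions."""
    return [
        (attrs, value)
        for attrs, value in series
        if all(attrs.get(key) == want for key, want in conditions.items())
    ]

def group_sum(series, key):
    """Sum the values of timeseries that share the same value of attribute `key`."""
    totals = defaultdict(int)
    for attrs, value in series:
        totals[attrs[key]] += value
    return dict(totals)

# the filesystem size (free+used bytes) on each host
fs_size = group_sum(series, "host.name")

# the number of free bytes on each host
free = group_sum(filter_series(series, {"state": "free"}), "host.name")

# the size of the dataset (used bytes on all hosts)
dataset_size = sum(value for _, value in filter_series(series, {"state": "used"}))

print(fs_size, free, dataset_size)
```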

Writing queries

Start a query by selecting metrics and giving each one a short alias, for example:

-- metric aliases always start with a dollar sign
system.filesystem.usage as $fs_usage
system.network.packets as $packets

Because Uptrace supports multiple metrics in the same query, you must use the alias to reference the metric and metric attributes, for example:

-- unlike metric aliases, column aliases don't start with a dollar
$fs_usage as disk_size

filter($fs_usage, state = 'used') as used_space
-- the same as the previous query
$fs_usage as used_space | where $fs_usage.state = 'used'

-- disk size on the specified device
$fs_usage | where $fs_usage.host.name = 'host1' and $fs_usage.device = '/dev/sdd1'

per_min($packets) as packets_per_min | group by $packets.host.name

You can use multiple metrics to construct arithmetic expressions, for example, given the following 2 metrics:

api.user_cache.hits as $hits
api.user_cache.misses as $misses

You can write a query to plot the number of hits, misses, and calculate the hit rate:

per_min($hits) as hits | per_min($misses) as misses | hits / (hits + misses) as hit_rate
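
As a worked example of the arithmetic, the Python sketch below computes the same three columns from made-up counts. It assumes that per_min scales a count observed over an interval to a one-minute rate; the function names and numbers are hypothetical, and the zero-traffic handling is a choice made for this sketch, not documented Uptrace behavior.

```python
def per_min(count, interval_seconds):
    """Scale a count observed over an interval to a per-minute rate (assumed semantics)."""
    return count * 60 / interval_seconds

def hit_rate(hits, misses):
    """hits / (hits + misses); returns None when there is no traffic at all."""
    total = hits + misses
    return hits / total if total else None

# Made-up counts observed over a 5-minute (300 s) window.
hits = per_min(450, 300)    # 90.0 hits per minute
misses = per_min(50, 300)   # 10.0 misses per minute
rate = hit_rate(hits, misses)

print(hits, misses, rate)
```

Because both columns are scaled by the same factor, the hit rate is the same whether it is computed from raw counts or from per-minute rates.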

Instruments

OpenTelemetry provides different instruments that support different aggregation functions in Uptrace:

Otel Instrument Name                  Uptrace Name  Aggregations
Counter, CounterObserver              counter       per_min, per_sec
UpDownCounter, UpDownCounterObserver  additive      sum of last values
GaugeObserver                         gauge         last value
Histogram                             histogram     percentiles, min, max, per_min, per_sec, count, avg
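
To illustrate how the aggregations listed above differ, here is a simplified Python model. It is a sketch under stated assumptions, not Uptrace's implementation: all function names are hypothetical, per_min is assumed to scale the total increase to one minute, and the histogram is summarized from raw observations even though real histograms are usually pre-aggregated into buckets.

```python
# Illustrative model only; the function names below are hypothetical.

def counter_per_min(increments, interval_seconds):
    """counter: total increase over the interval, scaled to one minute."""
    return sum(increments) * 60 / interval_seconds

def additive_sum_of_last(series):
    """additive: sum of the last value of each timeseries."""
    return sum(values[-1] for values in series)

def gauge_last(values):
    """gauge: the most recent value."""
    return values[-1]

def histogram_summary(observations):
    """histogram: a few of the listed aggregations over raw observations."""
    return {
        "count": len(observations),
        "min": min(observations),
        "max": max(observations),
        "avg": sum(observations) / len(observations),
    }

# Made-up sample data.
requests_per_min = counter_per_min([10, 20, 30], interval_seconds=180)
open_connections = additive_sum_of_last([[5, 7], [3, 4]])
temperature = gauge_last([21.5, 22.0])
latency = histogram_summary([10, 20, 60])

print(requests_per_min, open_connections, temperature, latency)
```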

Dashboards

Uptrace supports 2 types of dashboards:

  • A grid-based dashboard looks like a classical grid of charts.
  • A table-based dashboard is a table of items where each item leads to a separate grid-based dashboard for that item. For example, a table of hostnames with some metrics for each hostname.

In other words, table-based dashboards let you parameterize grid-based dashboards with attributes from the table. For example, Uptrace uses a table-based dashboard to monitor the number of sampled and dropped spans for each project:

select
  filter($spans, type = 'spans') as sampled_spans,
  filter($spans, type = 'dropped') as dropped_spans
from uptrace.projects.spans as $spans
group by project_id

project_id  metric         metric         Link to a grid-based dashboard
1           sampled_spans  dropped_spans  Dashboard with where project_id = 1
2           sampled_spans  dropped_spans  Dashboard with where project_id = 2
...
999         sampled_spans  dropped_spans  Dashboard with where project_id = 999
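
The parameterization can be pictured with a short Python sketch: each table row contributes the value of the grouping attribute, and that value becomes the filter applied to the shared grid-based dashboard. The `grid_filter` helper is hypothetical and only mirrors the where clauses shown in the table; it does not describe how Uptrace builds dashboard links internally.

```python
def grid_filter(attr, value):
    """Build the where clause applied to the grid-based dashboard for one table row."""
    if isinstance(value, str):
        # String values are single-quoted, as in the query examples in this document.
        return f"where {attr} = '{value}'"
    return f"where {attr} = {value}"

# Made-up table rows: one project ID per row.
project_ids = [1, 2, 999]
filters = {pid: grid_filter("project_id", pid) for pid in project_ids}

for pid, clause in filters.items():
    print(pid, clause)
```

One grid-based dashboard definition serves every row; only the filter changes.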

Alternatively, you could create a single grid-based dashboard and clone it for each project. That approach does not scale well, but it can still be an option if you have only a few items, for example, a handful of database clusters or availability zones.
