What is tracing?
Tracing allows you to see how a request progresses through different services and components, the timing of every operation, and any logs and errors as they occur.
A typical web application consists of multiple components written in different languages and running on different platforms:
- Load balancer (e.g. nginx).
- Frontend code (e.g. React).
- Backend monolith or microservices.
- At least one database.
- Task/job queue.
Distributed tracing collects data from such diverse environments and allows you to:
- Monitor the performance of each operation (for example, an SQL query), each component (for example, the database), and the whole request round trip.
- Monitor logs and errors no matter where they come from.
- Tie everything together into a single trace.
What is OpenTelemetry?
OpenTelemetry is a vendor-neutral API for distributed traces and metrics. It specifies how to collect and send telemetry data to backend platforms. With OpenTelemetry, you can instrument your application once and then quickly add or change OpenTelemetry-compatible backends without changing the instrumentation. OpenTelemetry is available for most programming languages and provides tracing interoperability across different languages and environments.
Uptrace uses OpenTelemetry to collect traces, logs, and errors. The outline of the process is the following:
- You instrument your application with the OpenTelemetry API.
- The OpenTelemetry SDK exports the collected data to Uptrace.
- Uptrace uses that data to help you profile, monitor, and debug your application.
A span represents an operation (unit of work) in a trace. A span could be a remote procedure call (RPC), a database query, or potentially interesting code. A span has:
- A parent span.
- A span name (operation name).
- A span kind.
- Start and end time (duration = end_time - start_time).
- A status that reports whether the operation succeeded or failed.
- A set of key-value attributes describing the operation.
- A timeline of events.
- A list of links to other spans.
- A span context that propagates trace ID and other data between different services.
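The fields listed above can be pictured as a plain record. The following is a minimal sketch using only the Python standard library; the `Span` class and its field names are hypothetical illustrations of the data model, not the OpenTelemetry SDK types.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """Hypothetical record mirroring the span fields listed above."""

    trace_id: str                      # shared by every span in the trace
    span_id: str                       # unique within the trace
    name: str                          # operation name
    kind: str = "internal"
    parent_id: Optional[str] = None    # None marks the root span
    start_time: float = 0.0            # seconds
    end_time: float = 0.0              # seconds
    status: str = "unset"
    attributes: dict = field(default_factory=dict)
    events: list = field(default_factory=list)
    links: list = field(default_factory=list)

    @property
    def duration(self) -> float:
        # duration = end_time - start_time
        return self.end_time - self.start_time


span = Span(
    trace_id="trace-1",
    span_id="span-1",
    name="GET /projects/:id",
    kind="server",
    start_time=10.0,
    end_time=10.25,
    attributes={"http.method": "GET", "http.route": "/projects/:id"},
)
print(span.duration)  # 0.25
```

In a real application the SDK fills in the IDs and timestamps for you; the sketch only shows how the pieces relate.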
A trace is a tree of spans that shows the path that a request takes through an app. Each span is an operation that the app performs while handling the request.
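To make the tree structure concrete, here is a small sketch that assembles spans into a trace by following parent IDs. The span tuples, the names in them, and the `render_trace` helper are all made up for illustration and use only the standard library.

```python
from collections import defaultdict

# (span_id, parent_id, name); a parent_id of None marks the root span.
# These spans are illustrative sample data.
spans = [
    ("a", None, "GET /projects/:id"),
    ("b", "a", "AuthMiddleware"),
    ("c", "a", "SELECT * FROM projects WHERE id = ?"),
]


def render_trace(spans):
    """Return the trace as indented lines, children nested under parents."""
    children = defaultdict(list)
    for span_id, parent_id, name in spans:
        children[parent_id].append((span_id, name))

    lines = []

    def walk(parent_id, depth):
        # .get avoids inserting new keys into the dict while walking
        for span_id, name in children.get(parent_id, []):
            lines.append("  " * depth + name)
            walk(span_id, depth + 1)

    walk(None, 0)
    return lines


for line in render_trace(spans):
    print(line)
```

The root span corresponds to the incoming request, and each level of indentation is work done on behalf of the span above it.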
Uptrace uses span names and some attributes to group similar spans together. To group spans properly, give them short and concise names. The total number of unique span names should be less than 1000. Otherwise, you will have too many span groups and your Uptrace experience will suffer.
The following names are good because they are short, distinctive, and help group similar spans together:
|Good. A route name with placeholders instead of params.|
|Good. A function name without arguments.|
|Good. A database query with placeholders.|
The following span names are bad because they contain params and args:
|Bad. Contains a variable param |
|Bad. Contains a variable |
|Bad. Contains a variable arg |
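One way to keep span names free of variable params is to derive the name from the route template rather than the concrete URL. The sketch below is a rough stand-in for that idea: `route_span_name` is a hypothetical helper that replaces numeric path segments with an `:id` placeholder. In practice, a web framework's router usually exposes the matched route template, which is the better source.

```python
import re

# Matches a path segment made only of digits, e.g. "/42" in "/projects/42".
_NUMERIC_SEGMENT = re.compile(r"/\d+(?=/|$)")


def route_span_name(method: str, path: str) -> str:
    """Build a low-cardinality span name from an HTTP method and path."""
    return f"{method} {_NUMERIC_SEGMENT.sub('/:id', path)}"


print(route_span_name("GET", "/projects/42"))            # GET /projects/:id
print(route_span_name("GET", "/projects/42/members/7"))  # GET /projects/:id/members/:id
```

The concrete ID is not lost: it belongs in a span attribute, where it does not affect grouping.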
To record contextual information, you can annotate spans with attributes that carry information specific to the operation. For example, an HTTP endpoint may have attributes such as http.method = GET and http.route = /projects/:id.
You can name attributes as you like, but for common operations you should follow the semantic attributes convention. It defines a list of common attribute keys with their meaning and possible values.
You can annotate spans with events that have a start time and an arbitrary number of attributes. The main difference between events and spans is that events don't have an end time (and therefore no duration).
Events usually represent exceptions, errors, logs, and messages (such as in RPC). But you can create custom events as well.
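As an illustration, an exception can be captured as an event with a single timestamp and a few attributes. This is a sketch with plain dictionaries, not the SDK's event type; `exception_event` is a hypothetical helper, while the `exception.type`, `exception.message`, and `exception.stacktrace` keys follow the OpenTelemetry semantic conventions for exceptions.

```python
import time
import traceback


def exception_event(exc: BaseException) -> dict:
    """Describe an exception as an event: one timestamp, no end time."""
    return {
        "name": "exception",
        "timestamp": time.time(),
        "attributes": {
            "exception.type": type(exc).__name__,
            "exception.message": str(exc),
            "exception.stacktrace": "".join(
                traceback.format_exception(type(exc), exc, exc.__traceback__)
            ),
        },
    }


try:
    int("not a number")
except ValueError as exc:
    event = exception_event(exc)

print(event["attributes"]["exception.type"])  # ValueError
```

A custom event would have the same shape with a different name and whatever attributes are relevant.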
Span kind must have one of the following values:
- server for server operations.
- client for client operations.
- producer for message producers.
- consumer for message consumers and async processing in general.
- internal for internal (nested) spans to further instrument spans.
Status code reports whether the operation succeeded or failed. It must have one of the following values:
- ok - the operation succeeded.
- error - the operation failed.
- unset - the default value which allows backends to assign the status.
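A common pattern is to leave the status at unset when nothing goes wrong and switch it to error only when the operation raises. The sketch below assumes the ok and error values that OpenTelemetry defines alongside unset; the `traced` context manager and the dictionary-based span are hypothetical and only illustrate the idea.

```python
from contextlib import contextmanager


@contextmanager
def traced(span: dict):
    """Mark the span as failed if the wrapped block raises."""
    span.setdefault("status", "unset")  # default: let the backend decide
    try:
        yield span
    except Exception as exc:
        span["status"] = "error"
        span["status_message"] = str(exc)
        raise  # never swallow the caller's exception


ok_span = {"name": "load-config"}
with traced(ok_span):
    pass  # the operation completes normally

failed_span = {"name": "parse-number"}
try:
    with traced(failed_span):
        int("oops")
except ValueError:
    pass

print(ok_span["status"], failed_span["status"])  # unset error
```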
Trace context is request-scoped data such as:
- trace id - unique trace identifier;
- span id - unique span identifier;
- trace flags - various flags such as sampled, deferred, and debug.
OpenTelemetry propagates context between functions within a process (in-process propagation) and even from one service to another (distributed propagation). Distributed tracing uses context for span correlation, for example, assembling spans from multiple services into a single trace.
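Distributed propagation usually means serializing the trace context into request headers. The sketch below assumes the W3C Trace Context `traceparent` header format (`version-traceid-spanid-flags`), which OpenTelemetry uses by default. The `inject` and `extract` functions are simplified, hypothetical stand-ins for what a propagator does and handle only the sampled flag.

```python
import re
import secrets

# traceparent: 2 hex version, 32 hex trace id, 16 hex span id, 2 hex flags
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)
SAMPLED = 0x01


def inject(headers: dict, trace_id: str, span_id: str, sampled: bool = True) -> None:
    """Write the current trace context into outgoing request headers."""
    flags = SAMPLED if sampled else 0
    headers["traceparent"] = f"00-{trace_id}-{span_id}-{flags:02x}"


def extract(headers: dict):
    """Read trace context from incoming headers; None if absent or malformed."""
    match = _TRACEPARENT.match(headers.get("traceparent", ""))
    if match is None:
        return None
    _version, trace_id, parent_span_id, flags = match.groups()
    return {
        "trace_id": trace_id,
        "parent_span_id": parent_span_id,
        "sampled": bool(int(flags, 16) & SAMPLED),
    }


# Service A: starts a span and calls service B.
trace_id = secrets.token_hex(16)  # 32 hex characters
span_id = secrets.token_hex(8)    # 16 hex characters
headers = {}
inject(headers, trace_id, span_id)

# Service B: continues the same trace with a child span.
ctx = extract(headers)
child = {
    "trace_id": ctx["trace_id"],
    "span_id": secrets.token_hex(8),
    "parent_id": ctx["parent_span_id"],
}
```

Because both spans carry the same trace ID and the child records the caller's span ID as its parent, a backend can stitch them into a single trace.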