OpenTelemetry Go distro for Uptrace

uptrace-go configures opentelemetry-go to export spans and metrics to Uptrace using the OpenTelemetry protocol (OTLP).

Installation

To install uptrace-go:

go get github.com/uptrace/uptrace-go

Configuration

You configure the Uptrace client using a DSN (Data Source Name, for example, https://<token>@api.uptrace.dev/<project_id>) from the project settings page.

import "github.com/uptrace/uptrace-go/uptrace"

uptrace.ConfigureOpentelemetry(
    // copy your project DSN here or use UPTRACE_DSN env var
    //uptrace.WithDSN(""),

    uptrace.WithServiceName("myservice"),
    uptrace.WithServiceVersion("v1.0.0"),
)
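To make the DSN structure concrete, the following sketch splits a DSN into its parts using only the standard library. The `dsnParts` type and the `parseDSN` function are hypothetical helpers written for illustration; they are not part of uptrace-go, which parses the DSN for you.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// dsnParts holds the pieces of a DSN. It is a hypothetical type used for
// illustration and is not part of uptrace-go.
type dsnParts struct {
	Token     string
	Host      string
	ProjectID string
}

// parseDSN splits a DSN of the form https://<token>@<host>/<project_id>.
func parseDSN(dsn string) (dsnParts, error) {
	u, err := url.Parse(dsn)
	if err != nil {
		return dsnParts{}, err
	}
	if u.User == nil || u.User.Username() == "" {
		return dsnParts{}, fmt.Errorf("DSN %q does not contain a token", dsn)
	}
	projectID := strings.Trim(u.Path, "/")
	if projectID == "" {
		return dsnParts{}, fmt.Errorf("DSN %q does not contain a project id", dsn)
	}
	return dsnParts{Token: u.User.Username(), Host: u.Host, ProjectID: projectID}, nil
}

func main() {
	parts, err := parseDSN("https://mytoken@api.uptrace.dev/123")
	if err != nil {
		fmt.Println("invalid DSN:", err)
		return
	}
	fmt.Printf("token=%s host=%s project=%s\n", parts.Token, parts.Host, parts.ProjectID)
}
```

A check like this can be useful at startup to fail fast on a mistyped DSN before any telemetry is exported.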

The full list of available configuration options can be found in config.go.

| Option | Description |
| --- | --- |
| DSN | A data source that specifies Uptrace project credentials. For example, https://<key>@api.uptrace.dev/<project_id>. |
| ServiceName | service.name resource attribute. For example, myservice. |
| ServiceVersion | service.version resource attribute. For example, 1.0.0. |
| ResourceAttributes | Any other resource attributes. |
| Resource | Resource contains attributes representing an entity that produces telemetry. Resource attributes are copied to all spans and events. |

You can also use environment variables to configure the client:

| Env var | Description |
| --- | --- |
| UPTRACE_DSN | A data source that is used to connect to uptrace.dev. For example, https://<key>@api.uptrace.dev/<project_id>. |
| OTEL_RESOURCE_ATTRIBUTES | Key-value pairs to be used as resource attributes. For example, service.name=myservice,service.version=1.0.0. |
| OTEL_PROPAGATORS | Propagators to be used as a comma-separated list. The default is tracecontext,baggage. |
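
The OTEL_RESOURCE_ATTRIBUTES format is a comma-separated list of key=value pairs. The sketch below shows how such a string might be split into a map. The `parseResourceAttributes` function is a hypothetical helper, and it skips details a real SDK may handle, such as percent-decoding of values.

```go
package main

import (
	"fmt"
	"strings"
)

// parseResourceAttributes parses "k1=v1,k2=v2" into a map.
// Pairs without "=" or with an empty key are skipped.
func parseResourceAttributes(s string) map[string]string {
	attrs := make(map[string]string)
	for _, pair := range strings.Split(s, ",") {
		kv := strings.SplitN(pair, "=", 2)
		if len(kv) != 2 {
			continue
		}
		key := strings.TrimSpace(kv[0])
		if key == "" {
			continue
		}
		attrs[key] = strings.TrimSpace(kv[1])
	}
	return attrs
}

func main() {
	attrs := parseResourceAttributes("service.name=myservice,service.version=1.0.0")
	fmt.Println(attrs["service.name"], attrs["service.version"])
}
```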

Tracing

All the code below is also available as a runnable example.

Creating a tracer

To start creating spans, you need a tracer. You create a tracer by specifying a tracer name (AKA instrumentation library name):

import "go.opentelemetry.io/otel"

var tracer = otel.Tracer("app_or_package_name")

You can have as many tracers as you want, but usually you need only one tracer for each app/library. Use tracer names to identify the library that produces the spans.

Creating a span

Once you have a tracer, creating spans is easy:

import "go.opentelemetry.io/otel/trace"

// Create a span with name "operation-name" and kind="server".
ctx, span := tracer.Start(ctx, "operation-name", trace.WithSpanKind(trace.SpanKindServer))

// End the span when the operation we are measuring is done.
defer span.End()

doSomeWork()

Adding span attributes

To record contextual information, you can annotate spans with attributes that carry information specific to the operation. For example, an HTTP endpoint may have such attributes as http.method = GET and http.route = /projects/:id.

import "go.opentelemetry.io/otel/attribute"

// To avoid expensive computations, check that span is recording
// before setting any attributes.
if span.IsRecording() {
	span.SetAttributes(
		attribute.String("http.method", "GET"),
		attribute.String("http.route", "/projects/:id"),
	)
}

You can name attributes as you like, but for common operations you should follow the semantic attributes convention, which defines a list of common attribute keys with their meanings and possible values.

Adding span events

You can annotate spans with events that have a start time and an arbitrary number of attributes. The main difference between events and spans is that events don't have an end time (and therefore have no duration).

Events usually represent exceptions, errors, logs, and messages (such as in RPC). But you can record custom events as well.

import (
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/trace"
)

span.AddEvent("log", trace.WithAttributes(
	attribute.String("log.severity", "error"),
	attribute.String("log.message", "User not found"),
	attribute.String("enduser.id", "123"),
))

Recording errors

OpenTelemetry provides a shortcut to record an error.

import "go.opentelemetry.io/otel/codes"

if err != nil {
	// Record the error and update the span status.
	span.RecordError(err)
	span.SetStatus(codes.Error, err.Error())
}

Occasionally you may want to record an error when there are no active spans. In such cases, use uptrace.ReportError to report the error without associating it with a span:

uptrace.ReportError(ctx, errors.New("Hello from Uptrace!"))

Trace context and the active span

OpenTelemetry stores the active span in a context.Context. You propagate the context by passing it as the first function argument. You can nest contexts inside each other and OpenTelemetry will automatically activate the parent span context when you end the span.

Usually you obtain the context from an HTTP request:

// Get the context.
ctx = req.Context()

// Set the context.
req = req.WithContext(ctx)
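
As a sketch of how the request context travels through an HTTP handler chain, the example below uses a hypothetical `withUser` middleware that stores a value in the request context and a handler that reads it back. It uses only net/http and httptest, with no OpenTelemetry calls; the same get/set pattern applies when the value is a span.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// ctxKey is a private key type that avoids collisions with other packages.
type ctxKey struct{}

// withUser is a hypothetical middleware that stores a value in the request
// context and passes the updated request to the next handler.
func withUser(user string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		// Get the context, derive a new one, and set it on the request.
		ctx := context.WithValue(req.Context(), ctxKey{}, user)
		next.ServeHTTP(w, req.WithContext(ctx))
	})
}

// userFromContext returns the stored value, or "" if there is none.
func userFromContext(ctx context.Context) string {
	user, _ := ctx.Value(ctxKey{}).(string)
	return user
}

// handle runs one request through the middleware and returns the response body.
func handle(user string) string {
	handler := withUser(user, http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		fmt.Fprint(w, userFromContext(req.Context()))
	}))
	rec := httptest.NewRecorder()
	handler.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Body.String()
}

func main() {
	fmt.Println(handle("alice"))
}
```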

tracer.Start sets the active span for you, but you can also activate the span manually:

import "go.opentelemetry.io/otel/trace"

// Get the active span from a context.
span = trace.SpanFromContext(ctx)

// Save the active span in a context.
ctx = trace.ContextWithSpan(ctx, span)
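
To illustrate why ending a child span leaves the parent span active, here is a minimal sketch that stores a span in a context with context.WithValue. The `span` type and the `contextWithSpan` and `spanFromContext` functions are hypothetical stand-ins written for this example, not the OpenTelemetry implementations. Because contexts are immutable, the parent context keeps pointing at the parent span regardless of what happens to the child context.

```go
package main

import (
	"context"
	"fmt"
)

// span is a hypothetical stand-in for trace.Span.
type span struct {
	name string
}

type spanKey struct{}

// contextWithSpan returns a copy of ctx that carries s as the active span.
func contextWithSpan(ctx context.Context, s *span) context.Context {
	return context.WithValue(ctx, spanKey{}, s)
}

// spanFromContext returns the active span, or nil if there is none.
func spanFromContext(ctx context.Context) *span {
	s, _ := ctx.Value(spanKey{}).(*span)
	return s
}

// nestedNames nests a child context inside a parent context and reports
// which span each context considers active.
func nestedNames() (child, parent string, rootEmpty bool) {
	root := context.Background()
	parentCtx := contextWithSpan(root, &span{name: "parent"})
	childCtx := contextWithSpan(parentCtx, &span{name: "child"})

	return spanFromContext(childCtx).name,
		spanFromContext(parentCtx).name,
		spanFromContext(root) == nil
}

func main() {
	child, parent, rootEmpty := nestedNames()
	fmt.Println(child, parent, rootEmpty)
}
```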

Sampling

To collect only half of the traces:

import sdktrace "go.opentelemetry.io/otel/sdk/trace"

// Sample 50% of all traces.
sampler := sdktrace.ParentBased(sdktrace.TraceIDRatioBased(0.5))

uptrace.ConfigureOpentelemetry(
    uptrace.WithTraceSampler(sampler),
)
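
The following sketch illustrates the idea behind combining a ratio-based sampler with a parent-based one: the decision for a root span is derived deterministically from the trace ID, and child spans follow their parent's decision. The `parentInfo` type and the `ratioSampler`, `parentBased`, and `sampledCount` functions are hypothetical and only approximate what an SDK does; they are not the sdktrace API.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/rand"
)

// parentInfo describes the parent span, if any (hypothetical type).
type parentInfo struct {
	Valid   bool
	Sampled bool
}

// ratioSampler returns a function that samples roughly the given fraction
// of trace IDs. The same trace ID always yields the same decision.
func ratioSampler(fraction float64) func(traceID [16]byte) bool {
	if fraction <= 0 {
		return func([16]byte) bool { return false }
	}
	if fraction >= 1 {
		return func([16]byte) bool { return true }
	}
	upperBound := uint64(fraction * (1 << 63))
	return func(traceID [16]byte) bool {
		// Interpret the low 8 bytes as a 63-bit number and compare it
		// with the threshold.
		x := binary.BigEndian.Uint64(traceID[8:16]) >> 1
		return x < upperBound
	}
}

// parentBased follows the parent's decision when there is a parent and
// delegates to root otherwise.
func parentBased(root func([16]byte) bool) func(parentInfo, [16]byte) bool {
	return func(p parentInfo, traceID [16]byte) bool {
		if p.Valid {
			return p.Sampled
		}
		return root(traceID)
	}
}

// sampledCount generates n pseudo-random root trace IDs and counts how many
// of them are sampled.
func sampledCount(n int, fraction float64) int {
	rng := rand.New(rand.NewSource(1))
	sample := parentBased(ratioSampler(fraction))
	count := 0
	for i := 0; i < n; i++ {
		var id [16]byte
		binary.BigEndian.PutUint64(id[0:8], rng.Uint64())
		binary.BigEndian.PutUint64(id[8:16], rng.Uint64())
		if sample(parentInfo{}, id) {
			count++
		}
	}
	return count
}

func main() {
	fmt.Println("sampled:", sampledCount(10000, 0.5), "of 10000")
}
```

Deriving the decision from the trace ID means that every service that sees the same trace can reach the same decision independently, which helps keep traces complete.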

Metrics

TIP

To use metrics, you need uptrace-go v1.0.0+.

To get started with metrics, you need a MeterProvider which provides access to Meters:

import (
	"go.opentelemetry.io/otel/metric"
	"go.opentelemetry.io/otel/metric/global"
)

// Meter can be a global/package variable.
var Meter = metric.Must(global.Meter("app_or_package_name"))

Using the meter, you can create instruments and use them to measure operations. The simplest Counter instrument looks like this:

import "go.opentelemetry.io/otel/metric"

counter := Meter.NewInt64Counter("app_or_package_name.component1.counter1",
	metric.WithDescription("Optional description"),
	metric.WithUnit(""), // optional metric unit
)

counter.Add(ctx, 1)
counter.Add(ctx, 10)

You can find more examples on GitHub.

Instruments

Instruments can be synchronous or asynchronous, additive (summable numbers) or grouping (histograms or non-summable numbers). Additive instruments that measure non-decreasing numbers are also called monotonic.

A single instrument can produce multiple metric series (timeseries). A metric series is a metric with a unique set of attributes. For example, each host has a separate timeseries for the same metric name.
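
To show how one metric name can fan out into several series, the sketch below builds a canonical key from the metric name and its attributes. The `seriesKey` function and the key format are assumptions made for illustration; an SDK or backend may identify series differently.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// seriesKey builds a canonical identifier from a metric name and its
// attributes. The order in which attributes are supplied does not matter.
func seriesKey(name string, attrs map[string]string) string {
	keys := make([]string, 0, len(attrs))
	for k := range attrs {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	b.WriteString(name)
	b.WriteString("{")
	for i, k := range keys {
		if i > 0 {
			b.WriteString(",")
		}
		b.WriteString(k + "=" + attrs[k])
	}
	b.WriteString("}")
	return b.String()
}

func main() {
	series := make(map[string]int64)
	series[seriesKey("http.requests", map[string]string{"host": "host1"})] += 3
	series[seriesKey("http.requests", map[string]string{"host": "host2"})] += 5
	series[seriesKey("http.requests", map[string]string{"host": "host1"})] += 1

	// One instrument, two hosts, two series.
	fmt.Println(len(series))
}
```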

Synchronous instruments are invoked together with operations they are measuring. For example, to measure the number of requests, you can call counter.Add(ctx, 1) whenever there is a new request. Synchronous measurements can be associated with a context.

For synchronous instruments, the difference between additive and grouping instruments is that additive instruments produce summable timeseries and grouping instruments produce a histogram. Summable timeseries are timeseries whose values, when added together, produce a meaningful timeseries. For example, you can sum the number of requests from different hosts to get the total number of requests, but you usually don't want to sum request durations.

| Instrument | Properties | Aggregation | Example |
| --- | --- | --- | --- |
| Counter | additive, monotonic | sum -> delta | number of requests, request size |
| UpDownCounter | additive | last value -> sum | number of connections |
| Histogram | grouping | histogram | request duration, request size |

Asynchronous instruments (observers) periodically invoke a callback function to collect measurements. For example, you can use observers to periodically measure memory or CPU usage. Asynchronous measurements can't be associated with a context.
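
The callback model can be sketched without any SDK: callbacks are registered once and invoked on every collection cycle. The `registry` and `observer` types below are hypothetical and only mimic the pattern; a real SDK would call `collect` periodically from its own goroutine.

```go
package main

import (
	"fmt"
	"runtime"
)

// observer is a hypothetical asynchronous instrument: a name plus a callback
// that returns the current value.
type observer struct {
	name     string
	callback func() int64
}

// registry keeps the registered observers.
type registry struct {
	observers []observer
}

// register adds a callback that is invoked on every collection cycle.
func (r *registry) register(name string, cb func() int64) {
	r.observers = append(r.observers, observer{name: name, callback: cb})
}

// collect invokes every callback once and returns the measurements by name.
func (r *registry) collect() map[string]int64 {
	out := make(map[string]int64, len(r.observers))
	for _, o := range r.observers {
		out[o.name] = o.callback()
	}
	return out
}

func main() {
	var reg registry
	reg.register("goroutines", func() int64 {
		return int64(runtime.NumGoroutine())
	})
	fmt.Println(reg.collect())
}
```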

When choosing between UpDownCounterObserver (additive) and GaugeObserver (grouping), choose UpDownCounterObserver for summable timeseries and GaugeObserver otherwise. For example, to measure system.memory.usage (bytes), you should use UpDownCounterObserver, because it makes sense to sum timeseries from different hosts to get the total memory usage. But to measure system.memory.utilization (percent), you should use GaugeObserver, because the sum of timeseries from different hosts would be incorrect (> 100%, for example, 90% + 90% = 180%).

| Instrument | Properties | Aggregation | Example |
| --- | --- | --- | --- |
| CounterObserver | additive, monotonic | sum -> delta | CPU time |
| UpDownCounterObserver | additive | last value -> sum | Memory usage (bytes) |
| GaugeObserver | grouping | last value | Memory utilization (%) |
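
The difference between summable and non-summable values can be checked with a little arithmetic. In the sketch below, the `hostMemory` type and both functions are hypothetical: memory usage in bytes is simply added across hosts, while overall utilization has to be recomputed from the totals rather than obtained by adding percentages.

```go
package main

import "fmt"

// hostMemory is a hypothetical per-host measurement.
type hostMemory struct {
	UsedBytes  int64
	TotalBytes int64
}

// totalUsage sums memory usage across hosts. The sum is meaningful because
// usage in bytes is additive.
func totalUsage(hosts []hostMemory) int64 {
	var used int64
	for _, h := range hosts {
		used += h.UsedBytes
	}
	return used
}

// overallUtilization returns used/total across all hosts as a fraction
// between 0 and 1. Adding per-host percentages would give a wrong result.
func overallUtilization(hosts []hostMemory) float64 {
	var used, total int64
	for _, h := range hosts {
		used += h.UsedBytes
		total += h.TotalBytes
	}
	if total == 0 {
		return 0
	}
	return float64(used) / float64(total)
}

func main() {
	hosts := []hostMemory{
		{UsedBytes: 9 << 30, TotalBytes: 10 << 30}, // 90% utilized
		{UsedBytes: 9 << 30, TotalBytes: 10 << 30}, // 90% utilized
	}
	fmt.Println("total usage (bytes):", totalUsage(hosts))
	fmt.Printf("overall utilization: %.0f%%\n", overallUtilization(hosts)*100)
}
```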

Counter

Counter is a synchronous instrument which measures additive non-decreasing values.

// counter demonstrates how to measure non-decreasing numbers, for example,
// number of requests or connections.
func counter(ctx context.Context) {
	counter := meter.NewInt64Counter("app_or_package_name.component1.requests",
		metric.WithDescription("Number of requests"),
	)

	for {
		counter.Add(ctx, 1)
		time.Sleep(time.Millisecond)
	}
}

You can get more interesting results by adding attributes to your measurements:

// counterWithLabels demonstrates how to add different attributes ("hits" and "misses")
// to measurements. Using this simple trick, you can get number of hits, misses,
// sum = hits + misses, and hit_rate = hits / (hits + misses).
func counterWithLabels(ctx context.Context) {
	counter := meter.NewInt64Counter("app_or_package_name.component1.cache",
		metric.WithDescription("Cache hits and misses"),
	)
	// Bind the counter to some labels.
	hits := counter.Bind(attribute.String("type", "hits"))
	misses := counter.Bind(attribute.String("type", "misses"))

	for {
		if rand.Float64() < 0.3 {
			misses.Add(ctx, 1)
		} else {
			hits.Add(ctx, 1)
		}

		time.Sleep(time.Millisecond)
	}
}
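
Once hits and misses are recorded under the same metric with different attribute values, the derived numbers mentioned in the comment above are simple arithmetic. The `hitRate` function and the sample counts below are hypothetical and only illustrate the calculation.

```go
package main

import "fmt"

// hitRate returns hits / (hits + misses), or 0 when nothing was recorded.
func hitRate(hits, misses int64) float64 {
	total := hits + misses
	if total == 0 {
		return 0
	}
	return float64(hits) / float64(total)
}

func main() {
	// Counts grouped by the "type" attribute (made-up sample values).
	counts := map[string]int64{"hits": 70, "misses": 30}

	total := counts["hits"] + counts["misses"]
	fmt.Printf("total=%d hit_rate=%.2f\n", total, hitRate(counts["hits"], counts["misses"]))
}
```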

UpDownCounter

UpDownCounter is a synchronous instrument which measures additive values which increase or decrease with time.

// upDownCounter demonstrates how to measure numbers that can go up and down, for example,
// number of goroutines or customers.
//
// See upDownCounterObserver for a better example how to measure number of goroutines.
func upDownCounter(ctx context.Context) {
	counter := meter.NewInt64UpDownCounter("app_or_package_name.component1.goroutines",
		metric.WithDescription("Number of goroutines"),
	)

	var prev int64
	for {
		// Add only the change since the last measurement, because
		// the instrument sums all recorded values.
		num := int64(runtime.NumGoroutine())
		counter.Add(ctx, num-prev)
		prev = num

		time.Sleep(time.Second)
	}
}

Histogram

Histogram is a synchronous instrument which produces a histogram from recorded values.

// valueRecorder demonstrates how to record a distribution of individual values, for example,
// request or query timings. With this instrument you get total number of records,
// avg/min/max values, and heatmaps/percentiles.
func valueRecorder(ctx context.Context) {
	durRecorder := meter.NewInt64Histogram("app_or_package_name.component1.request_duration",
		metric.WithUnit("microseconds"),
		metric.WithDescription("Duration of requests"),
	)

	for {
		dur := time.Duration(rand.NormFloat64()*10000+100000) * time.Microsecond
		durRecorder.Record(ctx, dur.Microseconds())

		time.Sleep(time.Millisecond)
	}
}
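
To give an idea of what a histogram keeps from the recorded values, here is a minimal fixed-bucket sketch. The `histogram` type, its bucket boundaries, and its methods are hypothetical; an SDK may choose boundaries and aggregation differently.

```go
package main

import (
	"fmt"
	"sort"
)

// histogram is a hypothetical fixed-bucket histogram.
type histogram struct {
	bounds   []float64 // upper bucket bounds, sorted ascending
	counts   []uint64  // len(bounds)+1 buckets; the last one is overflow
	count    uint64
	sum      float64
	min, max float64
}

func newHistogram(bounds ...float64) *histogram {
	return &histogram{bounds: bounds, counts: make([]uint64, len(bounds)+1)}
}

// record places v into the first bucket whose upper bound is >= v and
// updates the running count, sum, min, and max.
func (h *histogram) record(v float64) {
	i := sort.SearchFloat64s(h.bounds, v)
	h.counts[i]++
	if h.count == 0 || v < h.min {
		h.min = v
	}
	if h.count == 0 || v > h.max {
		h.max = v
	}
	h.count++
	h.sum += v
}

// avg returns the mean of the recorded values, or 0 if there are none.
func (h *histogram) avg() float64 {
	if h.count == 0 {
		return 0
	}
	return h.sum / float64(h.count)
}

func main() {
	h := newHistogram(50, 100, 200) // bucket bounds in milliseconds
	for _, v := range []float64{20, 80, 90, 150, 500} {
		h.record(v)
	}
	fmt.Println(h.counts, h.count, h.avg(), h.min, h.max)
}
```

Percentiles and heatmaps can then be estimated from the bucket counts without storing every individual value.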

CounterObserver

CounterObserver is an asynchronous instrument which measures additive non-decreasing values.

// sumObserver demonstrates how to measure monotonic (non-decreasing) numbers,
// for example, number of requests or connections.
func sumObserver(ctx context.Context) {
	// stats is our data source updated by some library.
	var stats struct {
		Hits   int64 // atomic
		Misses int64 // atomic
	}

	var hitsCounter, missesCounter metric.Int64CounterObserver

	batchObserver := meter.NewBatchObserver(
		// SDK periodically calls this function to grab results.
		func(ctx context.Context, result metric.BatchObserverResult) {
			result.Observe(nil,
				hitsCounter.Observation(atomic.LoadInt64(&stats.Hits)),
				missesCounter.Observation(atomic.LoadInt64(&stats.Misses)),
			)
		})

	hitsCounter = batchObserver.NewInt64CounterObserver("app_or_package_name.component2.cache_hits")
	missesCounter = batchObserver.NewInt64CounterObserver("app_or_package_name.component2.cache_misses")

	for {
		if rand.Float64() < 0.3 {
			atomic.AddInt64(&stats.Misses, 1)
		} else {
			atomic.AddInt64(&stats.Hits, 1)
		}

		time.Sleep(time.Millisecond)
	}
}
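
The callback above reports cumulative totals. One plausible reading of the "sum -> delta" aggregation is that consecutive cumulative observations are turned into per-interval increases. The `deltas` function below is a hypothetical sketch of that conversion, including an assumed rule for counter resets; it does not describe how the SDK or Uptrace necessarily implements it.

```go
package main

import "fmt"

// deltas converts cumulative counter observations into per-interval
// increases. If a value is lower than the previous one, the counter is
// assumed to have been reset and the new value is used as the delta.
func deltas(cumulative []int64) []int64 {
	if len(cumulative) < 2 {
		return nil
	}
	out := make([]int64, 0, len(cumulative)-1)
	for i := 1; i < len(cumulative); i++ {
		d := cumulative[i] - cumulative[i-1]
		if d < 0 {
			// Counter reset, for example after a process restart.
			d = cumulative[i]
		}
		out = append(out, d)
	}
	return out
}

func main() {
	fmt.Println(deltas([]int64{10, 25, 25, 40, 5}))
}
```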

UpDownCounterObserver

UpDownCounterObserver is an asynchronous instrument which measures additive values which can increase or decrease with time.

// upDownCounterObserver demonstrates how to measure numbers that can go up and down,
// for example, number of goroutines or customers.
func upDownCounterObserver(ctx context.Context) {
	_ = meter.NewInt64UpDownCounterObserver("app_or_package_name.component2.goroutines",
		func(ctx context.Context, result metric.Int64ObserverResult) {
			num := runtime.NumGoroutine()
			result.Observe(int64(num))
		},
		metric.WithDescription("Number of goroutines"),
	)
}

GaugeObserver

GaugeObserver is an asynchronous instrument which measures non-additive values for which a sum does not produce a meaningful (correct) result.

// valueObserver demonstrates how to measure non-additive numbers that can go up and down.
// The number of goroutines is used here only to keep the example short.
func valueObserver(ctx context.Context) {
	_ = meter.NewInt64GaugeObserver("app_or_package_name.component2.goroutines2",
		func(ctx context.Context, result metric.Int64ObserverResult) {
			num := runtime.NumGoroutine()
			result.Observe(int64(num))
		},
		metric.WithDescription("Number of goroutines"),
	)
}

What is next?

By now, you should be able to use the OpenTelemetry API to instrument your app. To help with that, we've created examples that show how to use OpenTelemetry instrumentations for popular frameworks and libraries.