What is tracing?
When operating distributed systems, you need to be able to analyze requests that pass through several services or infrastructure components. For example, you may need to find out which service caused an error, how long each service took to handle the request, how a particular request was executed, and which services took part in processing it. In cases like these, metrics and logs are not enough: they do not give you the full request execution context, because the request goes through several infrastructure layers before the response is returned to the client.
Basic terms
Request tracing
Request tracing is used to analyze distributed requests. It lets you visualize and track the execution path of a specific request as it passes through multiple services and infrastructure components. The execution path of a request is a sequence of operations called spans.
Span
A span is the basic element of distributed tracing and represents a single operation in your system, such as a database query, an HTTP request, or a function call. Each span has a name, start and end times, labels, logs, and an execution context. Spans may contain links to other spans, which join them together into a trace.
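To make the list of span attributes more concrete, here is a minimal sketch of a span as a plain data record. The class and its field names (`trace_id`, `span_id`, `parent_id`, and so on) are assumptions made for illustration; they are not the data model of any particular tracing system.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """Illustrative span record; field names are assumptions, not a real tracing API."""
    name: str                          # operation name
    trace_id: str                      # execution context: the trace this span belongs to
    span_id: str                       # identifier of this span
    parent_id: Optional[str] = None    # link to the parent span; None for the root span
    start: float = 0.0                 # start time, in seconds
    end: float = 0.0                   # end time, in seconds
    labels: dict = field(default_factory=dict)
    logs: list = field(default_factory=list)

    @property
    def duration(self) -> float:
        """Time the operation took, in seconds."""
        return self.end - self.start


# A hypothetical database query executed as part of a larger request.
query = Span(
    name="SELECT orders",
    trace_id="t1",
    span_id="s2",
    parent_id="s1",
    start=0.010,
    end=0.045,
    labels={"db.system": "postgresql"},
    logs=["rows=3"],
)
```

Here `parent_id` plays the role of the link that joins spans into a trace, and `trace_id` stands in for the shared execution context.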
Trace
A trace is a combination of spans forming the execution path of a particular request.
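As a rough illustration of how spans combine into a trace, the sketch below groups flat span records by a shared trace identifier and follows parent links to recover the execution path. The record layout, the sample data, and the `build_trace` helper are hypothetical and exist only for this example.

```python
from collections import defaultdict

# Hypothetical flat span records, as they might arrive from several services.
spans = [
    {"trace_id": "t1", "span_id": "s1", "parent_id": None, "name": "GET /checkout"},
    {"trace_id": "t1", "span_id": "s2", "parent_id": "s1", "name": "auth.verify"},
    {"trace_id": "t1", "span_id": "s3", "parent_id": "s1", "name": "orders.create"},
    {"trace_id": "t1", "span_id": "s4", "parent_id": "s3", "name": "INSERT orders"},
    {"trace_id": "t2", "span_id": "s9", "parent_id": None, "name": "GET /health"},
]


def build_trace(spans, trace_id):
    """Return the execution path of one request as a list of (depth, name) pairs."""
    # Index the spans of this trace by their parent so children can be looked up.
    children = defaultdict(list)
    for span in spans:
        if span["trace_id"] == trace_id:
            children[span["parent_id"]].append(span)

    path = []

    def walk(parent_id, depth):
        # Depth-first traversal: each span is followed by the spans it links to.
        for span in children.get(parent_id, []):
            path.append((depth, span["name"]))
            walk(span["span_id"], depth + 1)

    walk(None, 0)  # root spans have no parent
    return path


# Print the trace as an indented tree.
for depth, name in build_trace(spans, "t1"):
    print("  " * depth + name)
```

Spans that belong to another request (`t2` here) are left out, so each trace describes exactly one request.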
Purpose of tracing
You can use a trace to get answers to the following questions:
- Which service or component threw an error when processing a distributed request?
- Which operations or services have slowed down the request?
- How did the execution of a particular distributed request proceed?
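The questions above can be sketched as simple queries over the spans of a single trace. Everything in the example is an assumption made for illustration: the record layout, the `error` flag, the sample timings in milliseconds, and the helper functions. It also assumes that child spans of the same parent do not overlap in time, so a span's own time is its duration minus the durations of its direct children.

```python
# Hypothetical spans of one trace; start and end are in milliseconds.
spans = [
    {"span_id": "s1", "parent_id": None, "service": "gateway",
     "name": "GET /checkout", "start": 0, "end": 480, "error": False},
    {"span_id": "s2", "parent_id": "s1", "service": "auth",
     "name": "auth.verify", "start": 10, "end": 40, "error": False},
    {"span_id": "s3", "parent_id": "s1", "service": "orders",
     "name": "orders.create", "start": 50, "end": 470, "error": True},
    {"span_id": "s4", "parent_id": "s3", "service": "orders-db",
     "name": "INSERT orders", "start": 60, "end": 460, "error": True},
]


def duration(span):
    return span["end"] - span["start"]


def self_time(span):
    """Time spent in the span itself, excluding its direct children."""
    nested = sum(duration(s) for s in spans if s["parent_id"] == span["span_id"])
    return duration(span) - nested


def error_origins(spans):
    """Failed spans with no failed child: the likely source of the error."""
    return [
        s for s in spans
        if s["error"] and not any(
            c["parent_id"] == s["span_id"] and c["error"] for c in spans
        )
    ]


# Which service or component threw the error?
failed_services = [s["service"] for s in error_origins(spans)]

# Which operation slowed the request down the most?
slowest = max(spans, key=self_time)

# How did the request proceed? List the operations in start order.
timeline = [s["name"] for s in sorted(spans, key=lambda s: s["start"])]
```

Ranking by own time rather than total duration keeps the root span, which always covers the whole request, from hiding the operation that actually caused the delay.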