What is distributed tracing?
The term distributed tracing describes a method for telling apart the individual processes in microservices architectures and making each one identifiable. Profile IDs serve as the tool for this. The aim is to find the causes of errors more quickly.
Distributed tracing (often also called distributed request tracing) is a method for identifying individual processes in microservices architectures. At its core, it is a matter of separation: to make the behavior of the processes recognizable and explainable, it must be possible to differentiate between them.
Explanation of the problem
An increasing number of applications rely on microservices. Cloud-native applications, for example, use them almost without exception. The idea behind this is modular design: apps work better if their individual functions are provided separately by specialized services. Each service can concentrate on one task and, if necessary, be replaced more easily.
To illustrate with a metaphor: if you want to build a house (in our case, an application), you don’t just commission 30 random people. Instead, you hire skilled craftsmen such as plumbers, bricklayers and electricians. Nobody builds everything, but thanks to their specialization each of them does the best possible job. They are our microservices.
Now a problem arises: a wall is damp, for example. Bricklayers, plumbers, carpenters and electricians all worked on it, so it is not possible to say who made the mistake. From the outside, it is only apparent that there is a problem. Translated back from the metaphor: if a microservices infrastructure fails, there can be a multitude of potential culprits, because numerous services touch the point where the problem arose.
Distributed tracing as a solution to the problem
Distributed tracing solves this problem. It works as follows:
- Each process within each request is given its own unique profile ID.
- Services must use this ID to “sign” their work.
- The ID appears in all logs.
- A connected system uses the ID to store all important data, such as start and end times.
All processes can therefore be identified and analyzed using the unique profile ID. To return to the metaphor: the craftsmen have to sign and log each work step unmistakably. If it then turns out, for example, that only the plumber worked with water, the service in which the error occurred has been found.
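The sequence above can be sketched in a few lines of Python. This is only a minimal illustration, not a real tracing system: the names Record, Collector, run_service and handle_request are hypothetical, the "services" are plain functions taken from the house metaphor, and the sketch assumes that one ID is generated per request and shared by every service that works on it.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Record:
    """One signed work step: which service worked on which request."""
    profile_id: str
    service: str
    start: float
    end: float
    error: Optional[str] = None


@dataclass
class Collector:
    """Hypothetical stand-in for the connected system that stores the data."""
    records: list = field(default_factory=list)

    def for_request(self, profile_id):
        # All work steps that were signed with the given ID.
        return [r for r in self.records if r.profile_id == profile_id]


def run_service(name, profile_id, collector, work):
    """Run one service's work, log it and sign the result with the ID."""
    start = time.monotonic()
    error = None
    try:
        work()
    except Exception as exc:
        error = str(exc)
    end = time.monotonic()
    # The ID appears in the log line ...
    print(f"[{profile_id}] {name} finished, error={error}")
    # ... and in the stored record, together with start and end.
    collector.records.append(Record(profile_id, name, start, end, error))


def leaky_pipe():
    raise RuntimeError("pipe leaks")


def handle_request(collector):
    # A fresh, unique ID for this request (assumption: one ID per request).
    profile_id = uuid.uuid4().hex
    run_service("bricklayer", profile_id, collector, lambda: None)
    run_service("plumber", profile_id, collector, leaky_pipe)
    run_service("electrician", profile_id, collector, lambda: None)
    return profile_id


collector = Collector()
request_id = handle_request(collector)
culprits = [r.service for r in collector.for_request(request_id) if r.error]
print(culprits)  # ['plumber']
```

Because every record carries the ID, filtering by it is enough to see which services touched the request and which of them reported an error.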
Implementation of distributed tracing
Distributed tracing doesn’t work without additional code. It is an analysis and optimization tool used by developers, and it is therefore excellent for debugging. However, there are no standards or templates that fit every setup, because infrastructures are too individual. As a rule, distributed tracing grows over time along with the actual app. Microservices providers often share their own experience, though, which can help.
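What such additional code might look like depends on the app. The following is one possible sketch using only the Python standard library (contextvars and logging): a filter stamps the current ID onto every log line, and a decorator logs the start and end of a function. The names current_id, IdFilter, traced, check_stock and the logger name "shop" are assumptions made for this example and are not taken from any particular tracing tool.

```python
import contextvars
import functools
import logging
import uuid

# Holds the ID of the request currently being processed ("-" when none).
current_id = contextvars.ContextVar("current_id", default="-")


class IdFilter(logging.Filter):
    """Copy the current ID onto every log record so it can be formatted."""

    def filter(self, record):
        record.profile_id = current_id.get()
        return True


logger = logging.getLogger("shop")
handler = logging.StreamHandler()
handler.addFilter(IdFilter())
handler.setFormatter(logging.Formatter("%(profile_id)s %(name)s %(message)s"))
logger.addHandler(handler)
logger.setLevel(logging.INFO)


def traced(func):
    """Log the start and end of func under the current ID."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.info("start %s", func.__name__)
        try:
            return func(*args, **kwargs)
        finally:
            logger.info("end %s", func.__name__)

    return wrapper


@traced
def check_stock(item):
    # Hypothetical piece of business logic.
    return item == "book"


def handle_request(item):
    # Set a fresh ID for this request and restore the old value afterwards.
    token = current_id.set(uuid.uuid4().hex)
    try:
        return check_stock(item)
    finally:
        current_id.reset(token)
```

Calling handle_request("book") writes a start line and an end line that share the same ID. The business function itself stays unchanged apart from the decorator, which is one way to let the tracing code grow alongside the app.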