The difference between observability and conventional monitoring starts at the data level. Monitoring typically relies on a set of pre-configured dashboards designed to alert you to anticipated performance issues. These dashboards track and evaluate known (expected) types of problems. Monitoring tools are thus designed to answer questions that are known in advance.
Observability, on the other hand, provides information that allows us to detect actual or potential problems of kinds we have not yet encountered. It can therefore answer questions nobody thought to ask in advance, and so covers the so-called "unknown unknowns".
The concept of observability is based on three pillars: metrics, logs and traces.
Metrics are numerical values, such as CPU usage or memory usage, measured at regular time intervals. Because they form time series, mathematical models and predictions can be applied to them. They are usually used for basic analysis and evaluation of system performance.
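As a rough illustration of applying a simple model to a metric, the following Python sketch smooths a series of CPU readings with a moving average and extrapolates one interval ahead with a least-squares line. The sample values, the 60-second interval and the function names are assumptions made for this example only; real monitoring systems typically use more robust models.

```python
from statistics import mean

# Hypothetical CPU usage samples (percent), one per 60-second interval.
cpu_samples = [40.0, 42.0, 44.0, 46.0, 48.0]


def moving_average(samples, window):
    """Average of the most recent `window` samples."""
    return mean(samples[-window:])


def linear_forecast(samples, steps_ahead=1):
    """Fit a least-squares line through (index, value) pairs and
    extrapolate it `steps_ahead` intervals past the last sample."""
    n = len(samples)
    xs = list(range(n))
    x_mean = mean(xs)
    y_mean = mean(samples)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, samples)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean
    return intercept + slope * (n - 1 + steps_ahead)


smoothed = moving_average(cpu_samples, window=3)
predicted = linear_forecast(cpu_samples, steps_ahead=1)
print(f"smoothed={smoothed:.1f}% predicted_next={predicted:.1f}%")
```

With a steadily rising series like the one above, the forecast simply continues the trend; it says nothing about sudden spikes.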
Logs are machine-generated records of events of various kinds, usually containing a timestamp and data related to the specific event. They may also carry information about the level (severity) of the event, identification of the source application, the name or IP address of the server, and more.
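To make the shape of such a record concrete, here is a minimal Python sketch that builds a log entry with the fields mentioned above and serializes it as one JSON line. The field names, the application name and the IP address are hypothetical; actual log formats vary from system to system.

```python
import json
from datetime import datetime, timezone


def make_log_record(level, app, host, message, timestamp=None):
    """Build a structured log record; field names are illustrative only."""
    ts = timestamp or datetime.now(timezone.utc)
    return {
        "timestamp": ts.isoformat(),  # when the event happened
        "level": level,               # severity of the event
        "app": app,                   # source application
        "host": host,                 # server name or IP address
        "message": message,           # event-specific data
    }


record = make_log_record(
    "ERROR",
    "checkout-service",
    "10.0.0.12",
    "payment gateway timeout",
    timestamp=datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc),
)

# One JSON object per line is a common way to ship logs to other tools.
line = json.dumps(record)
parsed = json.loads(line)
print(parsed["timestamp"], parsed["level"], parsed["message"])
```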
Traces use a unique identifier to link the sequence of calls to individual services, systems and applications back to the user request that triggered the processing. In the event of a problem, it is possible to follow the end-to-end path of the request, find the root cause of the problem, or identify a bottleneck in the overall process.
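The idea of linking calls through a shared identifier can be sketched in a few lines of Python. The Span class, the service names and the durations below are invented for illustration, and each duration is assumed to be the time spent inside that service alone; production tracing systems record considerably more (parent spans, start times, attributes).

```python
import uuid
from dataclasses import dataclass


@dataclass
class Span:
    trace_id: str       # identifier shared by all calls of one request
    service: str        # service that handled this step
    duration_ms: float  # time spent in this service only (assumed)


def new_trace_id():
    """Generate a unique identifier for one user request."""
    return uuid.uuid4().hex


def spans_for_trace(spans, trace_id):
    """Reconstruct the path of one request, in the order spans were recorded."""
    return [s for s in spans if s.trace_id == trace_id]


def bottleneck(spans):
    """Return the span where the request spent the most time."""
    return max(spans, key=lambda s: s.duration_ms)


trace_a = new_trace_id()
trace_b = new_trace_id()

# Spans from two concurrent requests arrive interleaved.
collected = [
    Span(trace_a, "api-gateway", 12.0),
    Span(trace_b, "api-gateway", 9.0),
    Span(trace_a, "order-service", 35.0),
    Span(trace_a, "inventory-db", 480.0),
    Span(trace_b, "order-service", 30.0),
]

path = spans_for_trace(collected, trace_a)
slowest = bottleneck(path)
print(" -> ".join(s.service for s in path), "| slowest:", slowest.service)
```

Filtering by the trace identifier separates the two interleaved requests, and the slowest span points at the likely bottleneck for that request.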
To collect, reduce and clean all this data and then send only the truly valuable information to the target analytics system, a tool called an "observability pipeline" is useful. The term refers to a control layer placed between the various data sources and the target systems for data analysis and processing. It can receive data in a wide range of formats, extract its informational value, and route it to the appropriate target. Because less data reaches the target systems, the result is typically better performance and lower ICT infrastructure costs.
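The clean, reduce and route stages described above can be sketched as a small Python program. This is only a toy model under stated assumptions: the event fields, the rule that DEBUG events carry no value, and the target names "alerting" and "archive" are all hypothetical, and a real observability pipeline would handle many formats and destinations.

```python
def clean(event):
    """Normalize an event; return None if it is unusable (no message)."""
    if "message" not in event:
        return None
    cleaned = dict(event)
    cleaned["level"] = str(event.get("level", "INFO")).upper()
    cleaned["message"] = str(event["message"]).strip()
    return cleaned


def is_valuable(event):
    """Reduce step: in this sketch, DEBUG events are assumed to be noise."""
    return event["level"] != "DEBUG"


def route(event):
    """Pick a target system for the event (target names are hypothetical)."""
    return "alerting" if event["level"] in ("ERROR", "CRITICAL") else "archive"


def run_pipeline(events):
    """Clean, reduce and route events; returns the events per target."""
    targets = {"alerting": [], "archive": []}
    for raw in events:
        event = clean(raw)
        if event is None or not is_valuable(event):
            continue  # dropped before it reaches any target
        targets[route(event)].append(event)
    return targets


incoming = [
    {"level": "debug", "message": "cache hit"},
    {"level": "error", "message": "  disk full  "},
    {"level": "info", "message": "user logged in"},
    {"level": "warn"},  # malformed: no message
]

result = run_pipeline(incoming)
for target, events in result.items():
    print(target, [e["message"] for e in events])
```

Of the four incoming events, only two are forwarded, which is where the reduction in load on the target systems comes from.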