Catchpoint this week added a visualization capability to its Internet performance monitoring (IPM) platform that makes it simpler to surface in real time the status of any application or service.
The Internet Stack Map is an extension of the company’s IPM platform, which monitors the various networking services that modern distributed applications depend on. Catchpoint applies analytics and machine learning algorithms to telemetry data collected via open source OpenTelemetry agent software to pinpoint the root cause of any issue that is adversely impacting application performance and availability.
Catchpoint CEO Mehdi Daoudi said that as applications become more distributed, organizations lack meaningful visibility into the root cause of issues involving networking services. The IPM platform from Catchpoint now makes it possible to visualize Internet services using data collected via OpenTelemetry agents, which are rapidly becoming a de facto IT standard, he added.
Advanced under the auspices of the Cloud Native Computing Foundation (CNCF), OpenTelemetry is a collection of application programming interfaces (APIs), software development kits (SDKs) and other tools to instrument IT environments in a way that provides a standard method for collecting and exporting metrics, logs and traces that provide critical visibility into IT events.
That approach eliminates the need to rely on proprietary agent software that many organizations no longer want to deploy and support, said Daoudi. It also makes it more cost-effective to instrument application portfolios that, in addition to becoming more complex, continue to expand, he noted.
Many of those applications are also now more latency-sensitive than ever, added Daoudi. The only way to ensure application performance is to collect enough telemetry data for platforms to apply machine learning algorithms that identify issues and surface recommendations for remediating them. Catchpoint does that, he noted, by combining the core monitoring capabilities it developed with capabilities for monitoring APIs and microservices originally developed by Thundra.io.
It’s not clear to what degree IT teams are instrumenting application environments, but historically those efforts have been limited because instrumenting every application was simply cost-prohibitive. With the rise of OpenTelemetry, there is now a free, standard approach to instrumenting application environments that multiple observability and monitoring platforms can employ.
Ultimately, the goal should be to reduce the mean time to resolution (MTTR) for any IT incident to about a minute, said Daoudi. The IT industry as a whole may not yet be able to achieve that goal, but as observability and AI continue to improve, the speed at which any issue can be resolved is steadily improving, even as the size of the average IT environment continues to expand.
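For readers unfamiliar with the metric, mean time to resolution is simply the average interval between when an incident is detected and when it is resolved. A minimal Python sketch illustrates the arithmetic. It uses made-up incident timestamps and a hypothetical helper function, not Catchpoint data or any Catchpoint API:

```python
from datetime import datetime, timedelta

# Hypothetical incident records: (detected_at, resolved_at).
# These timestamps are invented for illustration only.
INCIDENTS = [
    (datetime(2024, 1, 5, 9, 0), datetime(2024, 1, 5, 9, 30)),
    (datetime(2024, 1, 12, 14, 0), datetime(2024, 1, 12, 14, 10)),
    (datetime(2024, 1, 20, 22, 0), datetime(2024, 1, 20, 22, 20)),
]


def mean_time_to_resolution(incidents):
    """Return the average of (resolved - detected) across all incidents."""
    if not incidents:
        raise ValueError("at least one incident is required")
    # Sum the individual resolution times, starting from a zero timedelta.
    total = sum(
        (resolved - detected for detected, resolved in incidents),
        timedelta(),
    )
    return total / len(incidents)


print(mean_time_to_resolution(INCIDENTS))
```

With these three hypothetical incidents, which took 30, 10 and 20 minutes to resolve, the result is 20 minutes.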
In the meantime, IT teams should be defining a set of best practices for managing the IT incidents that will inevitably occur. Whether the cause is a network outage or a cybersecurity attack, IT teams are expected, regardless of how complex IT has become, to provide a level of application resiliency that enables the organization to continue to operate. The challenge, as always, is that IT teams cannot effectively manage what they can’t see in the first place.