Services are increasingly digital today, delivered through mobile apps, the web, or both. Any service disruption can lead to near-permanent loss of the customers who experience it, especially in B2C businesses. Not every business can afford to engineer for 100% uptime, especially early-stage start-ups.
Modern component-based architectures and technology stacks rely on many components interacting with each other to deliver the final user experience: mobile apps, CDNs, API gateways, authentication layers, application layers, cache layers, and database layers, to name a few. Each has its own infrastructure, network, and scaling mechanism. Even with the best monitoring of each component's availability, and with performance metrics tracked to keep each one at its peak, there are still times when services are disrupted.
Murphy’s Law always catches up, even if you have designed for the best possible availability. In such scenarios, it is crucial to have a quick way to troubleshoot the issues that impact customer experience. To do that, you need not just minute-by-minute monitoring or metrics for individual components, but a traceable, coherent, comprehensive, end-to-end real-time monitoring dashboard that quickly helps narrow down the point of concern that has impacted the services.
As a cloud service provider, Amazon offers Amazon CloudWatch, an integrated tool for real-time monitoring and metrics tracking for each service, including the serverless services, and extensible through custom metrics. Amazon has now added an extra layer of real-time monitoring called “ServiceLens”.
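To make the idea of a custom metric concrete, here is a minimal sketch in Python that builds one data point in the shape the CloudWatch PutMetricData API expects and, optionally, publishes it with boto3. The helper names, the metric name `CheckoutLatency`, the `Service` dimension, and the `Demo/Shop` namespace are all hypothetical, chosen only for illustration.

```python
from datetime import datetime, timezone


def build_metric_datum(name, value, unit="Milliseconds", dimensions=None):
    """Build one entry for the MetricData list of a PutMetricData call."""
    return {
        "MetricName": name,
        # Dimensions are name/value pairs that identify what the metric describes.
        "Dimensions": [
            {"Name": key, "Value": val} for key, val in (dimensions or {}).items()
        ],
        "Timestamp": datetime.now(timezone.utc),
        "Value": float(value),
        "Unit": unit,
    }


def publish(namespace, datum):
    """Send the datum to CloudWatch. Needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder works without AWS access

    boto3.client("cloudwatch").put_metric_data(
        Namespace=namespace, MetricData=[datum]
    )


# Hypothetical example: latency of a checkout request, in milliseconds.
datum = build_metric_datum(
    "CheckoutLatency", 182, dimensions={"Service": "checkout-api"}
)
# publish("Demo/Shop", datum)  # uncomment in an environment with AWS credentials
```

Keeping the payload builder separate from the network call means the metric shape can be checked locally before anything is sent to AWS.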