Causal Discovery for Cloud Microservice Architectures

Published: 12 Dec 2024, Last Modified: 06 Mar 2025
Venue: AAAI 2025 Workshop AICT (Oral)
License: CC BY 4.0
Keywords: Cloud Computing, AI, causality, causal discovery, causal AI, Applied AI
TL;DR: We develop a constraint-based causal discovery framework to quickly and accurately identify drivers of latency in cloud microservice applications.
Abstract: Microservice-based architectures are becoming more prominent because of their manageability, scalability, and flexibility. However, they can be complex to manage, and high latencies can degrade their performance and lead to violations of the Service Level Objective (SLO). To identify the causes of high latency, we present a causal modelling framework capable of analysing and reconstructing latency within a microservice-based architecture. To this end, we employ causal discovery to find the drivers of latency. Our model integrates domain knowledge to impose constraints on the causal graph, ensuring the accuracy of the discovered relationships and accelerating causal discovery. To validate our approach, we reconstruct the latency metrics using machine learning techniques, and we demonstrate its effectiveness by accurately capturing the interrelations between the resources of microservices. Our framework provides an enhanced understanding of the causes of latency leading to SLO violations and paves the way for sophisticated mechanisms enabling proactive management of cloud resources.
Submission Number: 2