Available under a Creative Commons Attribution Non-Commercial Share Alike 4.0 International Licence



Publication Details

IEEE Access.


One desired aspect of a self-adapting microservices architecture is the ability to continuously monitor the operational environment, detect and observe anomalous behaviour, and provide a reasonable policy for self-scaling, self-healing, and self-tuning the computational resources so that it can respond dynamically to sudden changes in its operational environment. The behaviour of a microservices architecture changes continuously over time, which makes it challenging to use a statistical model to identify both the normal and abnormal behaviour of the running services. The performance of the microservices cluster can fluctuate with demand as it accommodates scalability, orchestration, and load-balancing requirements. To achieve the desired high level of self-adaptability, this research implements a microservices architecture model that follows the MAPE-K model. Our proposed architecture employs a Markov decision process (MDP) to identify the transition from one cluster state to another, and a deep Q-learning network (DQN) to dynamically select the adaptation action that yields the highest reward. This paper evaluates the effectiveness of using a DQN and an MDP agent to achieve a high level of self-adaptability in a microservices architecture. We argue that integrating DQN and MDP in the MAPE-K model gives a microservices architecture self-adaptability against contextual changes in its operational environment. The self-adaptation property is achieved by allowing the MDP agent to explore the observation space and letting the DQN select the adaptation policy with the highest reward; the MDP agent then executes the adaptation action and observes the changes. We believe that integrating DQN into the adaptation action selection process improves the effectiveness of the adaptation and reduces adaptation risks, including resource over-provisioning and thrashing.
The proposed model preserves the cluster state and prevents multiple actions from taking place at the same time. Our model also guarantees that the executed adaptation action fits the current execution context and achieves the adaptation goals.
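
The observe, select, execute, and observe-again cycle described above can be illustrated with a minimal sketch. It is not the paper's implementation: a tabular Q-learning update stands in for the deep Q-network to keep the example dependency-free, and the toy cluster, the action names, the reward weights, and helpers such as `ToyCluster`, `select_action`, and `train` are all hypothetical. A non-blocking lock shows one possible way to keep two adaptation actions from running at the same time.

```python
import random
import threading

# Hypothetical adaptation actions; the paper's real action set is not given here.
ACTIONS = ("scale_in", "no_op", "scale_out")
MAX_REPLICAS = 5


class ToyCluster:
    """Toy MDP whose state is (load level, replica count)."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.load = 1
        self.replicas = 1
        self._lock = threading.Lock()  # at most one adaptation action in flight

    def observe(self):
        return (self.load, self.replicas)

    def execute(self, action):
        # Refuse to start a second action while one is still running.
        if not self._lock.acquire(blocking=False):
            raise RuntimeError("another adaptation action is in progress")
        try:
            if action == "scale_out":
                self.replicas = min(MAX_REPLICAS, self.replicas + 1)
            elif action == "scale_in":
                self.replicas = max(1, self.replicas - 1)
            # Demand drifts randomly within [1, MAX_REPLICAS].
            drift = self.rng.choice((-1, 0, 1))
            self.load = min(MAX_REPLICAS, max(1, self.load + drift))
            # Assumed reward: under-provisioning costs twice as much
            # as over-provisioning; a perfect match scores 0.
            gap = self.replicas - self.load
            reward = -2.0 * abs(gap) if gap < 0 else -1.0 * gap
            return self.observe(), reward
        finally:
            self._lock.release()


def select_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, else take the best-valued action."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))


def train(steps=20000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning loop standing in for the DQN training described in the text."""
    rng = random.Random(seed)
    env = ToyCluster(seed)
    q = {}
    state = env.observe()
    for _ in range(steps):
        action = select_action(q, state, epsilon, rng)
        next_state, reward = env.execute(action)
        best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
        old = q.get((state, action), 0.0)
        q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
        state = next_state
    return q
```

After training, calling `select_action(q, state, 0.0, rng)` with `epsilon` set to zero returns the greedy adaptation action for an observed state. A deep Q-network would replace the dictionary `q` with a neural approximator over a richer observation space, while the surrounding loop would keep the same shape.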