AI/ML-Driven Microservices Architecture for Scalable Cloud Computing Applications
Keywords:
Microservices Architecture, AI/ML, Cloud Computing, Auto-Scaling, Kubernetes, LSTM, Anomaly Detection, Scalability, DevOps, Containerization
Abstract
The convergence of Artificial Intelligence (AI), Machine Learning (ML), and microservices architecture has opened transformational opportunities for building highly scalable, resilient, and intelligent cloud applications (Cahill 2019). Conventional monolithic systems have well-documented weaknesses in scalability, fault isolation, and dynamic resource management under changing workloads. This paper presents and evaluates an AI/ML-driven microservices architecture that combines predictive auto-scaling based on Long Short-Term Memory (LSTM) forecasting, anomaly detection, and ensemble classification models in a containerized Kubernetes deployment. The architecture is compared against a traditional monolithic system and a standard microservices system along several performance dimensions: response latency, throughput, fault recovery time, and resource utilization. Experimental results show that the AI/ML-enhanced pipeline reduces response latency by as much as 87 percent, increases throughput by 370 percent, and maintains near-linear horizontal scalability up to 32 nodes. SHAP-based model explainability supports regulatory transparency. The framework has significant implications for cloud-native financial systems, real-time analytics systems, and IoT-based enterprise applications.
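To illustrate how a load forecast can drive predictive auto-scaling, the following is a minimal sketch and not the implementation evaluated in this paper. It assumes a forecast of requests per second is already available from some predictor (for example, an LSTM model) and converts it into a replica count. The function name, the per-pod capacity, the 25 percent headroom, and the replica bounds are all hypothetical values chosen for illustration.

```python
import math


def desired_replicas(forecast_rps, capacity_per_pod_rps,
                     min_replicas=1, max_replicas=32, headroom=0.25):
    """Convert a forecast load into a replica count (illustrative only).

    forecast_rps: predicted requests per second for the next interval,
        assumed to come from an external forecaster such as an LSTM.
    capacity_per_pod_rps: assumed sustainable requests per second per pod.
    headroom: assumed safety margin added on top of the forecast.
    """
    if capacity_per_pod_rps <= 0:
        raise ValueError("capacity_per_pod_rps must be positive")
    if forecast_rps < 0:
        raise ValueError("forecast_rps must be non-negative")

    # Scale the forecast up by the safety margin, then round up so the
    # provisioned capacity never falls below the padded forecast.
    needed = math.ceil(forecast_rps * (1 + headroom) / capacity_per_pod_rps)

    # Clamp to the configured bounds, as an autoscaler would.
    return max(min_replicas, min(max_replicas, needed))


if __name__ == "__main__":
    for rps in (0, 1000, 10000):
        print(rps, desired_replicas(rps, capacity_per_pod_rps=100))
```

The point of scaling on a forecast is that replicas can be provisioned before the load arrives, whereas a reactive autoscaler responds only after a metric has already crossed its threshold. The upper bound of 32 used here is an arbitrary default for the sketch and is not derived from the paper's experimental setup.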
