Inventory Optimization in Retail Supply Chains Using Deep Reinforcement Learning
DOI:
https://doi.org/10.62177/amit.v1i3.470
Keywords:
Inventory Optimization, Deep Reinforcement Learning, Retail Supply Chains, Markov Decision Process, DDPG, Intelligent Replenishment, Demand Forecasting, Stockout Minimization
Abstract
Inventory management is a critical component of retail supply chains, directly affecting operational efficiency, customer satisfaction, and profitability. Traditional approaches to inventory optimization often rely on heuristic rules or static mathematical models, which struggle to cope with the high-dimensional, stochastic, and dynamic nature of modern retail environments. This paper proposes a novel framework utilizing deep reinforcement learning (DRL) to optimize inventory control decisions in end-to-end retail supply chains. The supply chain system is modeled as a Markov Decision Process (MDP), where the agent observes states such as stock levels, sales trends, supplier lead times, and demand forecasts. A DRL agent, trained with the Deep Deterministic Policy Gradient (DDPG) algorithm, learns to generate real-time replenishment and ordering strategies that maximize long-term performance by minimizing costs and avoiding stockouts. Experimental evaluations using both simulated and real-world retail data demonstrate that the proposed method outperforms classical baselines such as economic order quantity (EOQ) and safety stock models in terms of inventory turnover, service level, and total cost. The results suggest that DRL can serve as a robust and adaptive solution to inventory optimization under uncertainty.
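As a rough illustration of the MDP formulation described in the abstract, the following minimal sketch shows one state transition for a single product, together with the classical EOQ baseline. The abstract does not give the paper's actual state layout, reward function, or cost parameters, so everything here is an assumption made for illustration: the names `InventoryState`, `step`, and `eoq`, the lost-sales treatment of unmet demand, and the default cost values. In a DDPG setup, the continuous `order_qty` would be the actor network's output.

```python
import math
from dataclasses import dataclass


@dataclass
class InventoryState:
    """Hypothetical state: stock on hand plus orders still in transit."""
    on_hand: float
    pipeline: tuple  # pipeline[0] arrives at the start of the next period


def step(state, order_qty, demand,
         holding_cost=1.0, stockout_cost=5.0, unit_cost=2.0):
    """One assumed MDP transition; returns (next_state, reward).

    Order of events: receive the oldest in-transit order, serve demand
    (unmet demand is lost, an assumption), then place a new order.
    Reward is the negative of the period's total cost.
    """
    available = state.on_hand + state.pipeline[0]
    sold = min(available, demand)
    unmet = demand - sold
    on_hand = available - sold
    # The new order joins the back of the pipeline (lead time = pipeline length).
    next_pipeline = state.pipeline[1:] + (order_qty,)
    cost = (holding_cost * on_hand
            + stockout_cost * unmet
            + unit_cost * order_qty)
    return InventoryState(on_hand, next_pipeline), -cost


def eoq(annual_demand, order_cost, holding_cost_per_unit):
    """Classical economic order quantity: sqrt(2 * D * S / H)."""
    return math.sqrt(2.0 * annual_demand * order_cost / holding_cost_per_unit)


if __name__ == "__main__":
    s0 = InventoryState(on_hand=10.0, pipeline=(5.0, 0.0))
    s1, reward = step(s0, order_qty=8.0, demand=12.0)
    print(s1, reward)
    print(eoq(annual_demand=1200, order_cost=100, holding_cost_per_unit=6))
```

The reward combines holding, stockout, and purchasing costs in a single scalar, which is one common way to encode the cost-versus-stockout trade-off the abstract mentions. A fixed-quantity EOQ policy ignores the state entirely, whereas a learned policy can condition `order_qty` on `on_hand` and `pipeline`.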
License
Copyright (c) 2025 Min-Jae Park, Olivia Turner, Thomas Becker

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.