Article code | Journal code | Publication year | English article | Full-text version
479420 | 1445990 | 2016 | 10-page PDF | Free download
English title of the ISI article
New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system
Related topics
Engineering and basic sciences; Computer engineering; Computer science (general)
English abstract


• We propose new approximate dynamic programming algorithms.
• These algorithms can solve large-scale undiscounted Markov decision processes.
• Optimal control problems for production systems with 35,973,840 states were solved.
• The kanban, base stock, CONWIP, hybrid and extended kanban systems are considered.
• We show numerical comparisons between optimal controls and optimized pull controls.

Undiscounted Markov decision processes (UMDPs) can formulate optimal stochastic control problems that minimize the expected total cost per period for various systems. We propose new approximate dynamic programming (ADP) algorithms for large-scale UMDPs that can overcome the curses of dimensionality. These algorithms, called simulation-based modified policy iteration (SBMPI) algorithms, are extensions of the simulation-based modified policy iteration method (SBMPIM) (Ohno, 2011) for optimal control problems of multistage JIT-based production and distribution systems with stochastic demand and production capacity. The main new concepts of the SBMPI algorithms are that the simulation-based policy evaluation step of the SBMPIM is replaced by the partial policy evaluation step of the modified policy iteration method (MPIM), and that the algorithms start from the expected total cost per period and relative values estimated by simulating the system under a reasonable initial policy.

For numerical comparisons, the optimal control problem of the three-stage JIT-based production and distribution system with stochastic demand and production capacity is formulated as a UMDP. The demand distribution is changed from the shifted binomial distribution in Ohno (2011) to a Poisson distribution, and near-optimal policies of the optimal control problems with 35,973,840 states are computed by the SBMPI algorithms and the SBMPIM. The computational results show that the SBMPI algorithms are at least 100 times faster than the SBMPIM in solving the numerical problems and are robust with respect to initial policies. Numerical examples are solved to show the effectiveness of the near-optimal control obtained with the SBMPI algorithms compared with optimized pull systems whose optimal parameters are computed using the SBOS (simulation-based optimal solutions) from Ohno (2011).
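
The partial policy evaluation idea mentioned in the abstract can be illustrated with a minimal sketch of textbook modified policy iteration for an average-cost (undiscounted) MDP. This is not the authors' SBMPI algorithm: it is exact and tabular, it uses no simulation, and the two-state problem, the function name `modified_policy_iteration`, and the parameters `m`, `eps`, and `v0` are all assumptions made for illustration. The optional `v0` argument only hints at where simulation-based initial relative values could be supplied.

```python
def modified_policy_iteration(P, c, m=5, eps=1e-9, max_iter=10_000, v0=None):
    """Average-cost modified policy iteration on a small tabular MDP.

    P[s][a] is the list of next-state probabilities and c[s][a] the
    one-period cost. Returns (policy, gain, relative_values).
    All names here are hypothetical, chosen for this sketch only.
    """
    n = len(c)
    v = list(v0) if v0 is not None else [0.0] * n  # initial relative values

    def backup(s, a, values):
        # One-step cost plus expected relative value of the next state.
        return c[s][a] + sum(p * values[t] for t, p in enumerate(P[s][a]))

    for _ in range(max_iter):
        # Policy improvement: pick the cost-minimizing action in every state.
        q = [[backup(s, a, v) for a in range(len(c[s]))] for s in range(n)]
        policy = [min(range(len(q[s])), key=q[s].__getitem__) for s in range(n)]
        tv = [q[s][policy[s]] for s in range(n)]

        # min(Tv - v) and max(Tv - v) bracket the optimal cost per period.
        diff = [tv[s] - v[s] for s in range(n)]
        lo, hi = min(diff), max(diff)
        if hi - lo < eps:
            return policy, 0.5 * (lo + hi), [x - tv[0] for x in tv]

        # Partial policy evaluation: only m sweeps under the fixed policy,
        # instead of solving the evaluation equations exactly.
        v = tv
        for _ in range(m):
            v = [backup(s, policy[s], v) for s in range(n)]
        v = [x - v[0] for x in v]  # keep values relative to state 0

    raise RuntimeError("no convergence within max_iter")


# Toy problem (an assumption): two states, two actions, self-loops everywhere.
P = [
    [[0.5, 0.5], [0.9, 0.1]],  # state 0: action 0, action 1
    [[0.5, 0.5], [0.9, 0.1]],  # state 1: action 0, action 1
]
c = [[1.0, 3.0], [10.0, 11.0]]

policy, gain, h = modified_policy_iteration(P, c)
print(policy, round(gain, 6), [round(x, 6) for x in h])
```

On this toy problem the sketch selects action 1 in both states, with an expected cost per period of about 3.8, which agrees with the stationary distribution (0.9, 0.1) of that policy. The stopping rule relies on the standard span bound for unichain, aperiodic models; problems of the size reported in the abstract would need the approximations the paper describes rather than full sweeps over all states.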

Publisher
Database: Elsevier - ScienceDirect
Journal: European Journal of Operational Research - Volume 249, Issue 1, 16 February 2016, Pages 22–31