Article code | Journal code | Year of publication | English article | Full-text version |
---|---|---|---|---|
476833 | 1446074 | 2012 | 11-page PDF | Free download |

This paper investigates a problem faced by a make-to-order (MTO) firm that can accept or reject orders and can set prices and lead-times to influence demand. The model accounts for inventory holding costs for orders completed early, tardiness costs for orders delivered late, order rejection costs, variable manufacturing costs, and fixed costs. To maximize expected profit over an infinite planning horizon with stochastic demand, the firm must decide which orders to accept or reject, how to trade off price against lead-time, and how to balance the potential for increased demand against capacity constraints. We model the problem as a Semi-Markov Decision Problem (SMDP) and develop a reinforcement learning (RL) based Q-learning algorithm (QLA) to solve it. In addition, we build a discrete-event simulation model to validate the performance of the QLA and compare the experimental results with two benchmark policies: the First-Come-First-Serve (FCFS) policy and a threshold heuristic policy. The results show that the QLA outperforms both benchmark policies.
► In this study, we investigate joint pricing, lead-time, and scheduling decisions in MTO systems.
► We model the problem as a Semi-Markov Decision Problem (SMDP).
► We develop a reinforcement learning (RL) based Q-learning algorithm (QLA).
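
To make the Q-learning idea above concrete, the following is a minimal sketch of a single Q-learning update for a semi-Markov decision problem. It is not the paper's QLA: the abstract does not give the update rule, so this uses the common discounted-reward SMDP variant, in which the discount applied to the next state's value depends on the random time between decision epochs. The function name `smdp_q_update`, the state and action encodings, and the values of `alpha` and `beta` are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def smdp_q_update(q, state, action, reward, sojourn_time,
                  next_state, next_actions, alpha=0.1, beta=0.05):
    """Apply one discounted SMDP Q-learning update and return the new Q-value.

    Assumptions (not taken from the paper):
      - `reward` is the lump-sum net profit observed for this transition,
        e.g. price minus holding, tardiness, or rejection costs.
      - `sojourn_time` is the elapsed time until the next decision epoch.
      - `beta` is a continuous-time discount rate, `alpha` a learning rate.
    """
    # In an SMDP the time between decisions varies, so the discount
    # factor is exp(-beta * tau) rather than a fixed constant.
    discount = math.exp(-beta * sojourn_time)

    # Value of the best action available in the next state (0 if none).
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)

    # Standard temporal-difference step toward the bootstrapped target.
    target = reward + discount * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical usage: the state is the number of orders in the system and
# the action is either to reject or to accept the arriving order.
q_table = defaultdict(float)
smdp_q_update(q_table, state=0, action="accept", reward=10.0,
              sojourn_time=2.0, next_state=1,
              next_actions=["accept", "reject"], alpha=0.5)
# q_table[(0, "accept")] is now 5.0, since all other entries start at 0.
```

In a fuller treatment the action would also carry the quoted price and lead-time, and the update would be driven by transitions generated from a discrete-event simulation of order arrivals and completions.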
Journal: European Journal of Operational Research - Volume 221, Issue 1, 16 August 2012, Pages 99–109