Article ID Journal Published Year Pages File Type
10348782 Simulation Modelling Practice and Theory 2005 18 Pages PDF
Abstract
This paper presents a methodology that, for the problem of scheduling of a single server on multiple products, finds a dynamic control policy via intelligent agents. The dynamic (state dependent) policy optimizes a cost function based on the WIP inventory, the backorder penalty costs and the setup costs, while meeting the productivity constraints for the products. The methodology uses a simulation optimization technique called Reinforcement Learning (RL) and was tested on a stochastic lot-scheduling problem (SELSP) having a state-action space of size 1.8 × 107. The dynamic policies obtained through the RL-based approach outperformed various cyclic policies. The RL approach was implemented via a multi-agent control architecture where a decision agent was assigned to each of the products. A Neural Network based approach (least mean square (LMS) algorithm) was used to approximate the reinforcement value function during the implementation of the RL-based methodology. Finally, the dynamic control policy over the large state space was extracted from the reinforcement values using a commercially available tree classifier tool.
Related Topics
Physical Sciences and Engineering Computer Science Computer Science (General)
Authors
, ,