Semiconductor final test scheduling with Sarsa(λ, k) algorithm

Article ID	Journal	Published Year	Pages	File Type
478480	European Journal of Operational Research	2011	13 Pages	PDF

Abstract

The semiconductor test scheduling problem is a variation of the reentrant unrelated parallel machine problem that involves multiple resource constraints, intricate {product, tester, kit, enabler assembly} eligibility constraints, sequence-dependent setup times, etc. A multi-step reinforcement learning (RL) algorithm called Sarsa(λ, k) is proposed and applied to the scheduling problem with a throughput-related objective. By allowing enabler reconfiguration, the production capacity of the test facility is expanded and scheduling optimization is performed at the bottom level. Two forms of Sarsa(λ, k), namely forward-view Sarsa(λ, k) and backward-view Sarsa(λ, k), are constructed and proved equivalent in off-line updating. An upper bound on the error of the action-value function in tabular Sarsa(λ, k) is provided for deterministic problems. To apply Sarsa(λ, k), the scheduling problem is transformed into an RL problem by representing states and constructing actions, the reward function and the function approximator. Sarsa(λ, k) achieves a mean scheduling objective value smaller than that of the Industrial Method (IM) by 68.59% on real industrial problems and by 76.89% on randomly generated test problems. Computational experiments show that Sarsa(λ, k) outperforms IM and any individual action constructed from heuristics derived from existing heuristics or scheduling rules.
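For readers unfamiliar with the family of algorithms the abstract builds on, the following is a minimal sketch of standard tabular backward-view Sarsa(λ) with accumulating eligibility traces. It is not the Sarsa(λ, k) proposed in the paper: the role of the parameter k is not defined in the abstract and is not modelled here. The five-state chain environment, the step cost of -1, the function name `sarsa_lambda` and all parameter values are assumptions chosen for illustration only, and bear no relation to the test scheduling problem.

```python
import random


def sarsa_lambda(n_states=5, episodes=500, alpha=0.1, gamma=1.0,
                 lam=0.8, epsilon=0.1, seed=0):
    """Standard tabular backward-view Sarsa(lambda) on a toy chain.

    Illustrative only: this is NOT the paper's Sarsa(lambda, k).
    States 0..n_states-1 lie on a line; the last state is terminal.
    Every move costs -1, so shorter paths to the goal are better.
    """
    rng = random.Random(seed)
    terminal = n_states - 1
    actions = (0, 1)  # 0 = move left, 1 = move right
    Q = [[0.0, 0.0] for _ in range(n_states)]  # action-value table

    def step(s, a):
        # Deterministic toy dynamics (hypothetical environment).
        s2 = max(s - 1, 0) if a == 0 else s + 1
        return s2, -1.0, s2 == terminal

    def policy(s):
        # Epsilon-greedy action selection over the current Q table.
        if rng.random() < epsilon:
            return rng.choice(actions)
        return max(actions, key=lambda a: Q[s][a])

    for _ in range(episodes):
        E = [[0.0, 0.0] for _ in range(n_states)]  # eligibility traces
        s = 0
        a = policy(s)
        done = False
        while not done:
            s2, r, done = step(s, a)
            a2 = None if done else policy(s2)
            # On-policy TD error; no bootstrapping from the terminal state.
            target = r if done else r + gamma * Q[s2][a2]
            delta = target - Q[s][a]
            E[s][a] += 1.0  # accumulating trace for the visited pair
            for i in range(n_states):
                for j in actions:
                    Q[i][j] += alpha * delta * E[i][j]
                    E[i][j] *= gamma * lam  # decay every trace
            s, a = s2, a2
    return Q


if __name__ == "__main__":
    for state, values in enumerate(sarsa_lambda()):
        print(state, values)
```

In this toy setting, the value of moving right from the state next to the goal converges to the one-step cost of -1, because that action always ends the episode. The trace decay `gamma * lam` controls how far each temporal-difference error is propagated back to earlier state-action pairs, which is the multi-step aspect of the backward view.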

► We propose a multi-step reinforcement learning algorithm called Sarsa(λ, k).
► We construct forward-view Sarsa(λ, k) and backward-view Sarsa(λ, k) and prove their equivalence in off-line updating.
► We provide an upper bound on the error of the action-value function in tabular Sarsa(λ, k) when solving deterministic problems.
► Sarsa(λ, k) outperforms the Industrial Method and any individual action.

Keywords

Scheduling; Semiconductor; Reinforcement learning