Stochastic scheduling concerns scheduling problems involving random attributes, such as random processing times, random due dates, random weights, and stochastic machine breakdowns. Major applications arise in manufacturing systems, computer systems, communication systems, logistics and transportation, and machine learning, among others.
The objective of the stochastic scheduling problems can be regular objectives such as minimizing the total flowtime, the makespan, or the total tardiness cost of missing the due dates; or can be irregular objectives such as minimizing both earliness and tardiness costs of completing the jobs, or the total cost of scheduling tasks under likely arrival of a disastrous event such as a severe typhoon.[1]
The performance of such systems, as evaluated by a regular performance measure or an irregular performance measure, can be significantly affected by the scheduling policy adopted to prioritize over time the access of jobs to resources. The goal of stochastic scheduling is to identify scheduling policies that can optimize the objective.
Stochastic scheduling problems can be classified into three broad types: problems concerning the scheduling of a batch of stochastic jobs, multi-armed bandit problems, and problems concerning the scheduling of queueing systems[2] . These three types are usually under the assumption that complete information is available in the sense that the probability distributions of the random variables involved are known in advance. When such distributions are not fully specified and there are multiple competing distributions to model the random variables of interest, the problem is referred to as incomplete information. The Bayesian method has been applied to treat stochastic scheduling problems with incomplete information.
In this class of models, a fixed batch of
n
m
The simplest model in this class is the problem of sequencing a set of
n
Gi( ⋅ )
pi
i
Let
wi\ge0
i
\tilde{C}i
\Pi
E\pi[ ⋅ ]
\pi\in\Pi
The optimal solution in the special deterministic case is given by the Shortest Weighted Processing Time rule of Smith:[3] sequence jobs in nonincreasing order of the priority index
wipi
In general, the rule that assigns higher priority to jobs with shorter expected processing time is optimal for the flowtime objective under the following assumptions: when all the job processing time distributions are exponential;[5] when all the jobs have a common general processing time distribution with a nondecreasing hazard rate function;[6] and when job processing time distributions are stochastically ordered.[7]
Multi-armed bandit models form a particular type of optimal resource allocation (usually working with time assignment), in which a number of machines or processors are to be allocated to serve a set of competing projects (termed as arms). In the typical framework, the system consists of a single machine and a set of stochastically independent projects, which will contribute random rewards continuously or at certain discrete time points, when they are served. The objective is to maximize the expected total discounted rewards over all dynamically revisable policies.[1]
The first version of multi-bandit problems was formulated in the area of sequential designs by Robbins (1952).[8] Since then, there had not been any essential progress in two decades, until Gittins and his collaborators made celebrated research achievements in Gittins (1979),[9] Gittins and Jones (1974),[10] Gittins and Glazebrook (1977),[11] and Whittle (1980)[12] under the Markov and semi-Markov settings. In this early model, each arm is modeled by a Markov or semi-Markov process in which the time points of making state transitions are decision epochs. The machine can at each epoch pick an arm to serve with a reward represented as a function of the current state of the arm being processed, and the solution is characterized by allocation indices assigned to each state that depends only on the states of the arms. These indices are therefore known as Gittins indices and the optimal policies are usually called Gittins index policies, due to his reputable contributions.
Soon after the seminal paper of Gittins, the extension to branching bandit problem to model stochastic arrivals (also known as the open bandit or arm acquiring bandit problem) was investigated by Whittle (1981).[13] Other extensions include the models of restless bandit, formulated by Whittle (1988),[14] in which each arm evolves restlessly according to two different mechanisms (idle fashion and busy fashion), and the models with switching costs/delays by Van Oyen et al. (1992),[15] who showed that no index policy is optimal when switching between arms incurs costs/delays.
Models in this class are concerned with the problems of designing optimal service disciplines in queueing systems, where the jobs to be completed arrive at random epochs over time, instead of being available at the start. The main class of models in this setting is that of multiclass queueing networks (MQNs), widely applied as versatile models of computer communications and manufacturing systems.
The simplest types of MQNs involve scheduling a number of job classes in a single server. Similarly as in the two model categories discussed previously, simple priority-index rules have been shown to be optimal for a variety of such models.
More general MQN models involve features such as changeover times for changing service from one job class to another (Levy and Sidi, 1990),[16] or multiple processing stations, which provide service to corresponding nonoverlapping subsets of job classes. Due to the intractability of such models, researchers have aimed to design relatively simple heuristic policies which achieve a performance close to optimal.
The majority of studies on stochastic scheduling models have largely been established based on the assumption of complete information, in the sense that the probability distributions of the random variables involved, such as the processing times and the machine up/downtimes, are completely specified a priori.
However, there are circumstances where the information is only partially available. Examples of scheduling with incomplete information can be found in environmental clean-up,[17] project management,[18] petroleum exploration,[19] sensor scheduling in mobile robots,[20] and cycle time modeling,[21] among many others.
As a result of incomplete information, there may be multiple competing distributions to model the random variables of interest. An effective approach is developed by Cai et al. (2009), to tackle this problem, based on Bayesian information update. It identifies each competing distribution by a realization of a random variable, say
\Theta
\Theta
\Theta