Table Of ContentSubmitted to Operations Research
manuscript (Please, provide the mansucript number!)
An Overloaded Multiclass FIFO Queue with
Abandonments
Otis B. Jennings
FuquaSchoolofBusiness,DukeUniversity,Durham,NC27708,[email protected]
Josh E Reed
LeonardN.SternSchoolofBusiness,NewYorkUniversity,NewYork,NY10012,[email protected]
In this paper, we consider a single-server queue fed by K independent renewal arrival streams, each rep-
resenting a different job class. Jobs are processed in a FIFO fashion, regardless of class. The total amount
of work arriving to the system exceeds the server’s capacity. That is, the nominal traffic intensity of the
system is assumed to be greater than one. Jobs arriving to the system grow impatient and abandon the
queue after a random amount of time if service has not yet begun. Interarrival, service and abandonment
times are assumed to be generally distributed and class specific.
We approximate this system using both fluid and diffusion limits. To this end, we consider a sequence
of systems indexed by n in which the arrival and service rates are proportional to n; the abandonment
distributionremainsfixedacrossthesequence.Inourfirstmainresult,weshowthatinthelimitasntends
toinfinity,thevirtualwaitingtimeprocessforthosejobsthatdonotabandonfromthequeueconvergestoa
limitingdeterministicprocess.ThislimitmaybecharacterizedasthesolutiontoafirstorderODE.Specific
examplesarethenpresentedforwhichtheODEmaybeexplicitlysolved.Inoursecondmainresult,werefine
thedeterministicfluidapproximationbyshowingthatthefluidcenteredanddiffusionscaledvirtualwaiting
timeprocessweaklyconvergestoanOrnstein-Uhlenbeckprocesswhosedriftandinfinitesimalvarianceboth
varyovertime.Thisprocessmayalsobesolvedforexplicitly,thusyieldingapproximationstothetransient
as well as steady state behavior of the queue.
Key words: Queueing, Abandonment, Supercritical Loading, Overloaded Queue, Ornstein-Uhlenbeck
Process, Diffusion Limit, Virtual Waiting Time Process
1. Introduction
Increasingly, queueing models have included abandonment in their formulations, and for good
reason. From a modeling standpoint, abandonment is an undeniable reality in many industrial
settings. Two examples of queues with abandonment are call centers and organ transplant waiting
1
Author: AnOverloadedAbandonmentQueue
2 ArticlesubmittedtoOperationsResearch;manuscriptno. (Please,providethemansucriptnumber!)
lists. In the call center context, abandonment is straightforward. Individuals seeking to speak with
an agent are placed on hold and may eventually lose patience, hanging up before the actual service
can begin. See Brown et al. (4) for an extensive review of call centers and modeling complexities.
For organ transplants, those for kidneys being a prime example, needy individuals are placed on a
waiting list, remaining there until they reach the top of the list and a viable organ match is found.
Abandonment in this case is either death or reaching a state of health for which transplantation is
no longer an option. Overloaded queueing models in this context are particularly relevant, as the
need for kidneys exceeds the supply of donors. For more on queueing models of organ transplant
waiting lists, in particular for kidneys, see Su and Zenios (12, 13) and references therein.
The usual course of action taken when introducing abandonment into queueing models is to
assume that a customer’s patience – measured in how long one is willing to wait for service – is
exponentially distributed. Analysis of models with exponential abandonment was first undertaken
by Ancker and Gafarian (1). Although simplifying the analysis considerably, this assumption has
been shown in many cases to be untenable. In this paper, we stray from the exponential aban-
donment assumption by allowing the jobs arriving to our system to have general abandonment
distributions. In particular, we consider a single-server, multiclass first-in-fist-out (FIFO) queue
where each job class has a general abandonment distribution. Our main quantity of interest is the
virtual waiting time process, which tracks the amount of time until all jobs currently in the queue
depart, either due to a service completion or abandonment. One may think of the virtual wait-
ing time as the amount of time that a hypothetical customer would wait before being processed,
assuming she was sufficiently patient and eventually received service. By studying the virtual wait-
ing time process, one is able to obtain key performance measures related to the operation of the
system, such as the long run fraction of jobs who abandon or the average amount of time that jobs
that eventually receive service will have waited.
Recently, others have begun to consider queueing systems with general abandonment distri-
butions. In (? ), Bacelli, Boyer and Hebuterne consider a GI/GI/1+GI queue and determine
necessary and sufficient conditions for the system to be stable. Baccelli and Hebuterne (? ) study
Author: AnOverloadedAbandonmentQueue
ArticlesubmittedtoOperationsResearch;manuscriptno. (Please,providethemansucriptnumber!) 3
the M/M/1+GI queue and obtain closed form expressions for a variety of quantities of interest
related to this system. More recently, authors have begun to provide an asymptotic analysis of
queueing systems with general abandonment distributions in which the traffic intensity is close
to one. Ward and Glynn (14) provide diffusion limits for the virtual waiting time process and
the queue length process of the GI/GI/1+GI queue; these limits incorporate the density of the
abandonment distribution at the origin. Building off of the work of Ward and Glynn and by scaling
the hazard rate function of the abandonment distribution, Reed and Ward (11) provide diffusion
limits for both the virtual waiting time process and the queue length process of the GI/GI/1+GI
queue; these limits incorporate the entire abandonment distribution. In work similar to Ward and
Glynn (14), but for the many-server “Halfin-Whitt” heavy-traffic regime, Zeltyn and Mandelbaum
(10) provide an asymptotic analysis of the M/M/N +GI queue which takes as its starting point
the work of Bacelli and Hebuterne (? ).
There is a growing body of literature studying multi-server queues in an overloaded regime in
which the traffic intensity is strictly greater than one. This line of research was initiated by the
work of Whitt (17) in which formal fluid limit approximations to the queue length process of the
M/M/N+GI in an overloaded regime were conjectured to hold and proven to be true in a discrete
time setting. In subsequent work, the analysis conducted in (17) has been extended in several
directions, see for instance (? ) and (? ). Although in this paper we concentrate predominately on
the virtual waiting time process, we anticipate in future work using the analysis presented here
as a stepping stone for the studying the queue length process as well, and perhaps verifying the
conjecture originally set forth in (17).
As mentioned at the outset, in this paper we study the virtual waiting time process of the
overloaded, multiclass, single-server queue. In the first of our two main results, Theorem 1, we
provide a first order fluid limit for the virtual waiting time process which turns out to be the
solution to a first order ODE. For certain instances, this ODE may be explicitly solved; some
examples are provided for which this is the case. We also show that as t grows large the solution
to the ODE converges monotonically to a stationary state w(cid:63), a quantity dependent on the initial
Author: AnOverloadedAbandonmentQueue
4 ArticlesubmittedtoOperationsResearch;manuscriptno. (Please,providethemansucriptnumber!)
conditions of the system. This result is consistent with other works in which the service capacity
is significantly less than the offered load, such as Whitt (16), Garnett et al. (8) and de V´ericourt
and Jennings (? ).
Our second main result, Theorem 2, shows that the stochastic fluctuations of the virtual waiting
time process about its fluid limit may be well approximated by a time inhomogeneous Ornstein-
Uhlenbeck process. The drift of this process at time t is equal to the negative of the process
value multiplied by the traffic-intensity-weighted sum of the abandonment density functions, each
evaluatedatthevalueofthefluidlimitattimet.Thus,ourworkiscomplementarytothatofWard
and Glynn (14) and Zeltyn and Mandelbaum (? ), who also obtain approximations in which the
valueoftheabandonmentdensityplaysarole.Ourresultsuggeststhatthesteadystatedistribution
of the virtual waiting time process may be well approximated by a normal distribution with mean
w(cid:63) andavariancewhichmaybeexplicitlycalculated.Thisresultisalsocomplementarytothework
of Ward and Glynn (14) who suggest using a truncated normal distribution as an approximation
for the steady state distribution of the virtual waiting time process of the GI/GI/1+GI queue in
a critically loaded regime.
The remainder of this paper is organized as follows. In the following section, we present the
basic model which is used for the remainder of the paper. In Section 3, we provide results on the
long-run behavior of the virtual waiting time process as well the long-run fraction of time that the
serverallocatestoeachparticularjobclass.InSection4,weobtainfluidlimitresultsforthevirtual
waiting time process. The main result in this section is Theorem 1, which provides a fluid limit for
the virtual waiting time process in the overloaded regime described above. We then provide several
examplesforwhichourlimitingprocessmaybeexplicitlysolved.Inthefollowingsection,wecenter
the virtual waiting time process by its deterministic fluid limit of Theorem 1 and then scale by an
appropriate constant. Our main result in Section 5 is Theorem 2, which shows that the diffusion
scaledvirtualwaitingtimeprocessconvergestoatimeinhomogeneousOrnstein-Uhlenbeckprocess
whose time varying drift and infinitesimal variance may be explicitly calculated. Finally, in Section
6 we provide closing remarks and directions for possible future research.
Author: AnOverloadedAbandonmentQueue
ArticlesubmittedtoOperationsResearch;manuscriptno. (Please,providethemansucriptnumber!) 5
Unless stated otherwise, processes are assumed to be elements of D(R ,R), the space of right
+
continuousleftlimitfunctionsmappingR toR,abbreviatedD.WeassumethespaceD isendowed
+
with the usual Skorohod J topology; see, e.g., (15). Let ⇒ denote convergence in distribution for
1
both sequences of stochastic processes as well as sequences of random variables and →P denote
convergence in probability. Convergence of processes is assumed to occur in D.
The notation throughout is necessarily involved. When invoking the Skorohod Representation
Theorem, our convention is to denote alternative probability spaces as (Ω´,F´,P´) and processes on
these spaces similarly, e.g. X´. In several instances we define prelimit processes using the limiting
virtual waiting time process W. Our convention is to denote such prelimit processes by apply theˆ
marker; e.g., Mˆn. Similarly, we define some prelimit processes in terms of both the limiting virtual
waiting time process and the sequence of waiting time processes Wn, employing theˇmarker,
e.g., Mˇn. Lastly, we consider fluid scaled processes and diffusion scaled processes throughout. The
former and their limits are designated by bars and the later and their limits a denoted by tildes;
e.g., M¯n and W˜ n.
2. The model
In this section, we provide the model of the multiclass GI/GI/1+GI queue which will be used
for the remainder of the paper. We begin by assuming that the multiclass queue is fed by K
independent arrival streams, each of which is representative of a different job class. We index each
job class by k∈{1,...,K}.
Lettingk∈{1,...,K},foreachi≥1wedenotebyξ (i)thetimebetweenthe(i−1)st andith class
k
k job arrivals to the system, where we assume that {ξ (i),i≥1} is an i.i.d. sequence of random
k
variables with mean λ−1 and variance σ2 . Thus, the total number of class k arrivals to the system
k a,k
by time t≥0 is given by
(cid:40) (cid:41)
i
(cid:88)
A (t)=sup i≥0: ξ (j)≤t .
k k
j=1
We also denote by
t (i) = ξ (1)+...+ξ (i), (1)
k k k
Author: AnOverloadedAbandonmentQueue
6 ArticlesubmittedtoOperationsResearch;manuscriptno. (Please,providethemansucriptnumber!)
the arrival time of the ith class k job to the system. Notice the inverse relation: A (t (i))=i.
k k
The potential service time of the ith class k job to arrive to the system is given by ν (i) where
k
we assume that {ν (i),i≥1} is an i.i.d. sequence of positive random variables with mean µ−1 and
k k
variance σ2 . Note, however, that the service time ν (i) will not be realized if job i abandons the
s,k k
queue before its service can commence.
The quantity d (i) represents the amount of time that the ith class k job is willing to wait to
k
receive service before abandoning from the system. We refer to this quantity as the deadline of
the ith class k job. Thus, if d (i) time units pass before the servicing of job i can begin, then that
k
job will abandon from the system. The cumulative distribution of d (i) is denoted by F and its
k k
complement is given by G =1−F .
k k
For each k∈{1,...,K}, the sequence of arrival, service and abandonment time random variables
are each i.i.d. Moreover, these sequences are independent of each other and also independent
between job classes. Finally, we assume that jobs are processed in a FIFO fashion, without regard
to job class designation except possibly when two jobs arrive simultaneously, which we assume
occurs only on a set of measure zero. This assumption is valid, for instance, if the first arrival time
for each job class has a continuous distribution and is independent of all future arrivals.
In this paper, we assume that the queueing system is overloaded in the sense that the nominal
load exceeds the system’s capacity. Defining ρ =λ /µ to be the offered load of the kth job class
k k k
for k=1,...,K, we assume that that overall offered load of the system is greater than one, that is
K
(cid:88)
ρ= ρ >1. (2)
k
k=1
Our primary performance measure of interest is the virtual waiting time process, W ={W(t),t≥
0}, which tracks, as a function of time, the amount of time that a newly arriving job would have
to wait before being served, if the job were sufficiently patient. Note that there are three main
contributors to the dynamics of W. First, there is the initial value at time zero, denoted by W(0).
Next,astimeprogresses,ifajobisbeingserved,thentheserverworksatrate1,therebydecreasing
the virtual waiting time at a unit rate. When no jobs are present, the server necessarily idles.
Author: AnOverloadedAbandonmentQueue
ArticlesubmittedtoOperationsResearch;manuscriptno. (Please,providethemansucriptnumber!) 7
Finally, newly arriving jobs contribute to increases in W if they are to begin service before their
deadline expires; i.e., they do not abandon. Jobs that eventually abandon do not contribute to the
virtual waiting time.
Assuming that no two jobs arrive simultaneously, one may now express the virtual waiting time
process as the unique solution to the equation
K (cid:90) t
(cid:88)
W(t) = W(0)+ ν (A (s))·1 dA (s)−B(t), t≥0, (3)
k k {dk(Ak(s))>W(s−)} k
k=1 0
where
(cid:90) t
B(t) = 1 ds, t≥0,
{W(s)>0}
0
is the cumulative server busy time up to time t.
Equation(3)providesthefundamentalequationforthevirtualwaitingtimeprocess.However,it
is not in a form that is particularly well-suited to analysis. We therefore now provide an alternative
representation of the virtual waiting time process which will facilitate proving our limit theorems
in the sections that follow. For each k∈{1,...,K}, let
i (cid:18) (cid:19)
(cid:88) 1
M (i) = ν (j)− 1 , i≥0, (4)
ν,k k µ {dk(j)>W(tk(j)−)}
k
j=1
and
i
1 (cid:88)(cid:0) (cid:1)
M (i) = 1 −G (W(t (j)−)) , i≥0. (5)
d,k µ {dk(j)>W(tk(j)−)} k k
k
j=1
The process M = (M (i),i ≥ 1) tracks the centered service times of those jobs that do not
ν,k ν,k
abandon from the system, whereas M =(M (i),i≥1) tracks the number of jobs that do not
d,k d,k
abandon centered by its expected value. Defining the filtration Fk=(Fk,i≥1) by
i
Fk = σ(ξ (l),ν (l),d (l);l≥1,j(cid:54)=k;ξ (l),ν (l),d (l),l=1,...,i−1,ξ (i)), i≥1,
i j j j k k k k
it is clear from (4) and (5) that the processes M and M are martingales with respect to Fk.
ν,k d,k
This fact will be useful later in the paper. In particular, it will allow us to take advantage of
martingale convergence theorems when proving certain results.
Author: AnOverloadedAbandonmentQueue
8 ArticlesubmittedtoOperationsResearch;manuscriptno. (Please,providethemansucriptnumber!)
By (3) we may express the virtual waiting time process as
W(t) = W(0)−t+I(t) (6)
(cid:88)K (cid:20) 1 (cid:90) t (cid:21)
+ M (A (t))+M (A (t))+ G (W(s−))dA (s) ,
ν,k k d,k k µ k k
k=1 k 0
for t≥0, where
I(t)=t−B(t), t≥0,
is the cumulative idle time of the server up until time t. We define I =(I(t),t≥0) to be the idle
time process.
Furthermore, for each k∈{1,...,K}, setting
1 (cid:18)(cid:90) t (cid:90) t (cid:19)
ε (t) = G (W(s−))dA (s)− G (W(s−))d(λ s) , t≥0, (7)
k µ k k k k
k 0 0
it follows from (6) that
W(t) = W(0)−t+I(t) (8)
K
(cid:88)
+ [M (A (t))+M (A (t))+εn(t)]
ν,k k d,k k k
k=1
(cid:90) t K
(cid:88)
+ ρ G (W(s))ds, t≥0,
k k
0 k=1
where the final equality follows since
1 (cid:90) t (cid:90) t
G (W(s−))d(λ s) = ρ G (W(s))ds, t≥0. (9)
µ k k k k
k 0 0
We next turn to analyzing the virtual waiting time process on a heuristic level.
3. Service allocation and virtual waiting time
In preparation for the results to follow in future sections, we now show that there exists a unique
long run allocation of the server’s effort amongst each particular job class. We begin with the
following definition.
Definition 1 A vector α=(α ,...,α ) is called a feasible allocation if α ≥0 for each k=1,...,K,
1 K k
(cid:80) (cid:80)
and α ≤1. If α =1, then the allocation is called complete.
k k k k
Author: AnOverloadedAbandonmentQueue
ArticlesubmittedtoOperationsResearch;manuscriptno. (Please,providethemansucriptnumber!) 9
For each k∈{1,...,K}, one may interpret α as the long run proportion of time that the server
k
dedicates to serving class k jobs. Hence, the qualifier “feasible” is superfluous in the sense that we
donotconsiderwould-beinfeasibleallocationsinthispaper.Alsonotethata“complete”allocation
is representative of a fully utilized server.
Now suppose that the virtual waiting time has reached a limiting value w and remains constant
thereafter. In such a scenario, it is clear that all jobs arriving to the queue will experience an
identical virtual waiting time, namely w. Furthermore, the larger the value of w, the larger the
fraction of jobs that will abandon from the system. In particular, for each delay w≥0, there is
an associated survival rate vector (λ G (w),...,λ G (w)) where, for each k∈{1,...,K}, λ G (w)
1 1 K K k k
represents the long run rate of class k jobs arriving to the system that eventually enter service if
the delay is always equal to w.
We now determine which values of w are achievable. Interestingly, one of the key factors deter-
mining a delay’s achievability is the existence of a feasible allocation by which the delay may be
sustained in the long run. This in turn motivates the following definition.
Definition 2 For a given delay w≥0, a feasible allocation vector α is called
sufficient if α µ ≥λ G (w) for each k,
k k k k
partially insufficient if α µ <λ G (w) for some k,
k k k k
partially excessive if α µ >λ G (w) for some k,
k k k k
critical if α µ =λ G (w) for each k, and
k k k k
perfect if it is critical and complete.
Making use of Definition 2, the following proposition now shows that there exists a unique range
of virtual waiting times [w(cid:63),w(cid:63)] which may be sustained by a perfect allocation. Before stating
l u
this proposition, however, we need the following assumption.
Assumption (A1) For each job class k∈{1,...,K}, the deadline distribution F is continuous.
k
Furthermore, F =0 and lim F (x)=1.
k x→∞ k
Author: AnOverloadedAbandonmentQueue
10 ArticlesubmittedtoOperationsResearch;manuscriptno. (Please,providethemansucriptnumber!)
We may now state Proposition 1.
Proposition 1 Under assumption (A1), there exists a unique perfect allocation α(cid:63). Moreover,
there exists a unique critical delay range quantity [w(cid:63),w(cid:63)], where w(cid:63) ≤w(cid:63), associated with this
l u l u
perfect allocation such that
• if the delay quantity is less than w(cid:63), then any feasible allocation will be partially insufficient,
l
and
• if the delay quantity exceeds w(cid:63), then any complete allocation will be partially excessive.
u
Furthermore, if for each k∈{1,...,K} the abandonment distribution F is strictly increasing, then
k
w(cid:63)=w(cid:63) so that there exists a unique delay w(cid:63). Equivalently, F (w(cid:63))=F (w(cid:63)) for each k.
l u k l k u
W e start with the proof of uniqueness. Suppose there are two perfect allocations, α and ν.
Then, without loss of generality, there exist delays w > w such that α = λ µ−1G (w ) and
1 2 k k k k 1
ν =λ µ−1G (w )foreachk∈{1,...,K},andα <ν forsome(cid:96)∈{1,...,K}.Bythemonotonicity
k k k 2 (cid:96) (cid:96)
and decreasing nature of the G ’s and the statement that both α and ν are perfect allocations, we
k
have that
(cid:88) (cid:88)
1= α < ν =1,
k k
k k
a contradiction. Thus there is at most one perfect allocation.
We now show existence of a perfect allocation. For the remainder of the proof, consider the
function σ=(σ(w),w≥0), where σ(w)=(cid:80) λ µ−1G (w). By Assumption (A1), we have σ(0)=
k k k k
ρ>1 and lim σ(w)=0. Also by (A1), since each G is decreasing, monotonic and continuous,
w→∞ k
so is σ. It follows that there exists a w(cid:63)>0 such that σ(w(cid:63))=1. Therefore there exists a perfect
allocation α(cid:63)=(α ,k=1,...,K), where α(cid:63)=λ µ−1G (w(cid:63)). The monotonicity of the G ’s dictates
k k k k k k
Description:Josh E Reed. Leonard distribution remains fixed across the sequence. to infinity, the virtual waiting time process for those jobs that do not abandon from the queue converges to a . When invoking the Skorohod Representation.