Abstract
We study a single class of traffic acting on a symmetric set of processor-sharing queues with finite buffers, and we consider the case where the load scales with the number of servers. We address the problem of giving robust performance bounds based on the study of the asymptotic behaviour of the insensitive load balancing schemes, which have the desirable property that the stationary distribution of the resulting stochastic network depends on the distribution of job-sizes only through its mean. It was shown for small systems with losses that they give good estimates of performance indicators, generalizing henceforth Erlang formula, whereas optimal policies are already theoretically and computationally out of reach for networks of moderate size. We characterize the response of symmetric systems under those schemes at different scales and show that three amplitudes of deviations can be identified according to whether \(\rho < 1\), \(\rho = 1\), or \(\rho > 1\). A central limit scaling takes place for a sub-critical load; for \(\rho =1\), the number of free servers scales like \(n^{ {\theta \over \theta +1}}\) (\(\theta \) being the buffer depth and n being the number of servers) and is of order 1 for super-critical loads. This further implies the existence of different phases for the blocking probability. Before a (refined) critical load \(\rho _c(n)=1-a n^{- {\theta \over \theta +1}}\), the blocking is exponentially small and becomes of order \( n^{- {\theta \over \theta +1}}\) at \(\rho _c(n)\). This generalizes the well-known quality-and-efficiency-driven regime, or Halfin—Whitt regime, for a one-dimensional queue and leads to a generalized staffing rule for a given target blocking probability.
Similar content being viewed by others
Notes
Optimal in the sense that it minimizes the blocking probability or any convex criterion.
When the value of c is clear from the context, we will use the notation p instead of p(c).
We are assuming a single dispatcher.
References
Alzer, H., Baricz, A.: Functional inequalities for the incomplete gamma function. J. Math. Anal. Appl. 385(1), 167–178 (2012)
Atar, R.: A diffusion regime with nondegenerate slowdown. Oper. Res. 60(2), 490–500 (2012)
Benäim, M.: Dynamics of stochastic approximation algorithms. Lecture Notes in Math. Springer, Berlin, Séminaire de Probabilités XXXIII (1999)
Bonald, T., Jonckheere, M., Proutière, A.: Insensitive load balancing. SIGMETRICS Perform. Eval. Rev. 32(1), 367–377 (2004)
Bonald, T., Massoulié, L., Proutière, A., Virtamo, J.: A queueing analysis of max-min fairness, proportional fairness and balanced fairness. Queueing Syst. Theory Appl. 53(1–2), 65–84 (2006)
Bonald, T., Proutière, A.: Insensitive bandwidth sharing in data networks. Queueing Syst. Theory Appl. 44(1), 69–100 (2003)
Bramson, M., Lu, Y., Prabakhar, B.: Asymptotic independence of queues under randomized load balancing. Queueing Syst. 71, 247–292 (2012)
Cover, T.M., Thomas, J.A.: Elements of Information Theory. Wiley, Hoboken (2006)
Eschenfeldt, P., Gamarnik, D.: Join the shortest queue with many servers. In: The Heavy Traffic Asymptotics (2015)
Foss, S., Stolyar, A.: Large-scale join-idle-queue system with general service times. ArXiv e-prints May (2016)
Graham, C.: Chaoticity on path space for a queueing network with selection of the shortest queue among several. J. Appl. Probab. 37(1), 198–211 (2000)
Halfin, S., Whitt, W.: Heavy-traffic limits for queues with many exponential servers. Oper. Res. 29(3), 567–588 (1981)
Hordijk, A.: Insensitive bounds for performance measures. In: Proceedings of the 12th International Teletraffic Congress (ITC 12), Torino (1988)
Hordijk, A., Koole, G.: On the optimality of the generalized shortest queue policy. Probab. Eng. Inf. Sci. 4(4), 477–487 (1990)
Hordijk, A., Ridder, A.: Insensitive bounds for the stationary distribution of non-reversible Markov chains. J. Appl. Probab. 25(1), 9–20 (1988)
http://information-technology.web.cern.ch/services/load-balancing-services
Jagerman, D.L.: Some properties of the Erlang loss function. Bell Syst. Tech. J. 53(3), 525–551 (1974)
Janssen, A.J.E.M., Van Leeuwaarden, J.S.H., Zwart, B.: Gaussian expansions and bounds for the Poisson distribution applied to the Erlang B formula. Adv. Appl. Probab. 40(1), 122–143 (2008)
Janssen, A.J.E.M., van Leeuwaarden, J.S.H., Zwart, A.P.: Corrected asymptotics for a multi-server queue in the Halfin–Whitt regime. Queueing Syst. Theory Appl. 58(4), 261–301 (2008)
Janssen, A.J.E.M., van Leeuwaarden, J.S.H., Zwart, A.P.: Refining square-root safety staffing by expanding Erlang C. Oper. Res. 59(6), 1512–1522 (2011)
Jonckheere, M.: Insensitive versus efficient dynamic load balancing in networks without blocking. Queueing Syst. 54(3), 193–202 (2006)
Jonckheere, M., Mairesse, J.: Towards an Erlang formula for multiclass networks. Queueing Syst. 66(1), 53–78 (2010)
Kipnis, C., Landim, C.: Scaling Limits of Interacting Particle Systems. Grundlehren der Mathematischen Wissenschaften. Springer, Berlin (2013)
Knessl, C., Yao, H.: On the finite capacity shortest queue problem. Prog. Appl. Math. 2(1), 1–34 (2011)
Le Boudec, J.Y.: The stationary behaviour of fluid limits of reversible processes is concentrated on stationary points. Netw. Hetegnereous Med. 8(2), 1529–540 (2013)
Leino, J., Virtamo, J.: Insensitive load balancing in data networks. Comput. Netw. 50(8), 1059–1068 (2006)
Lu, Y., Xie, Q., Kliot, G., Geller, A., Larus, J.R., Greenberg, A.: Join-idle-queue: a novel load balancing algorithm for dynamically scalable web services. Perform. Eval. 68(11), 1056–1071 (2011)
Maman, S.: Uncertainty in the demand for service: the case of call centers and emergency departments. Master’s Thesis, Technion, April (2009)
Mitzenmacher, M.: The power of two choices in randomized load balancing. Ph.D. Thesis (1996)
Mukherjee, D., Borst, S.C., van Leeuwaarden, J.S.H., Whiting, P.A.: Universality of load balancing schemes on the diffusion scale. J. Appl. Probab. 53(4), 1111–1124 (2016). 12
Mukhopadhyay, A., Karthik, A., Mazumdar, R.R., Guillemin, F.: Mean field and propagation of chaos in multi-class heterogeneous loss models. Perform. Eval. 91, 117–131 (2015). (Special issue: performance 2015)
Nemes, G.: An explicit formula for the coefficients in Laplace’s method. Constr. Approx. 38(3), 471–487 (2013). arXiv:1207.5222
Pla, V., Virtamo, J., Martinez-Bauset, J.: Optimal robust policies for bandwidth allocation and admission control in wireless networks. Comput. Netw. 52(17), 3258–3272 (2008)
Robert, P.: Stochastic Networks and Queues. Springer, Berlin (2003)
Sagitov, S.: Weak Convergence of Probability Measures. http://www.math.chalmers.se/~serik/WeakConv/C-space.pdf (2013)
Sparaggis, P.D., Towsley, D., Cassandras, C.G.: Extremal properties of the shortest/longest non-full queue policies in finite-capacity systems with state-dependent service rates. J. Appl. Probab. 30(1), 223–236 (1993)
Stolyar, A.: Pull-based load distribution among heterogeneous parallel servers: the case of multiple routers. ArXiv e-prints (2015)
Vvedenskaya, N.D., Dobrushin, R.L., Karpelevich, F.I.: Queueing system with selection of the shortest of two queues: an asymptotic approach. Probl. Inf. Transm. 32(1), 15–27 (1996)
Whitt, W.: Heavy traffic approximations for service systems with blocking. Bell Syst. Tech. J. 63(4), 689–708 (1984)
Whittle, P.: Partial balance and insensitivity. J. Appl. Probab. 22(1), 168–176 (1985)
Xie, Q., Dong, X., Lu, Y., Srikant, R.: Power of d choices for large-scale bin packing: a loss model. In: Proceedings of the 2015 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS’15, pp. 321–334. ACM, New York (2015)
Acknowledgements
This work was partially supported by the Basque Center for Applied Mathematics BCAM and the Bizkaia Talent and European Commission through COFUND programme, under the project titled “High-dimensional stochastic networks and particles systems”, awarded in the 2014 Aid Programme with request Reference Number AYD-000-273, and by the STIC-AmSud Project No. 14STIC03. We thank the referees for their insightful comments which have improved the quality of the paper.
Author information
Authors and Affiliations
Corresponding author
Additional information
A preliminary version appeared in the proceeding of Sigmetrics 2016.
Proofs of results in Sect. 5
Proofs of results in Sect. 5
1.1 A.1 Proof of Theorem 8
Proof
We first prove a local convergence. Let \(q = \hat{p}+ \beta /\sqrt{n}\), and let \(c = \sum _k k q_k\), \(\bar{\beta } = \sum _i i\beta _i\). Since \(\sum _k q_k = \sum _k \hat{p}_k = 1\), we have
We remind the reader that in order to simplify notation, we shall use p instead of p(c). Starting from (33),
We shall compute the asymptotics of the two products separately. The first product gives
where the last equality follows from (73). For the second product, from (27),
Thus,
where the equalities (73), \(\sum _k\beta _k = \bar{\beta }\) and \(\sum _k\hat{p}_k = \hat{c}\) helped in the simplification.
Substituting the asymptotics of the two products in (77), we get
Consider the exponent on the RHS. Since \(\sum _i\beta _i = 0\), we have \(\bar{\beta } = \sum _i i \beta _i = \sum _{i=0}^{\theta -1} i \beta _i - \theta \sum _{i=0}^{\theta -1}\beta _i = -\sum _i(\theta - i)\beta _i\). Therefore,
Since the multivariate Gaussian distribution has exponent \(-\frac{1}{2}\beta \varSigma ^{-1}\beta \), we can deduce from the above equation the inverse of the covariance matrix to be that stated in the theorem and the local convergence of \(\frac{\pi (q)}{\pi (p)}\) to the Gaussian density.
Using the approximation in (32), combined with the blocking probability estimates obtained in Theorem 5, it can be easily seen that
which in turn implies that, for any \(q =\hat{p}+ \beta /\sqrt{n}\),
Generalizing slightly the previous computations, the same would hold for any \(q =\hat{p}+ (\beta + \varepsilon _n)/\sqrt{n} \), with \(\varepsilon _n\) vanishing when n goes to infinity. Hence, to derive a global convergence result of the distribution function as stated in the theorem, we can now appeal to a variant of Scheffé’s lemma (see, for instance, Theorem 1.29 in [35] with \(\delta _i(n)= {1 \over \sqrt{n}}, i =1, \ldots , k \) and \(k= \theta \)). \(\square \)
1.2 A.2 Proof of Theorem 9
Proof
Instead of defining q according to a predefined scaling as in the previous proof, we shall this time define it with an arbitrary scaling which shall be made precise later. Let \(q = \hat{p}+ {\beta }^{(n)}\), where again we have \(\sum _k {\beta }^{(n)}_k = 0\). For \(\rho = 1\), \(\hat{p}_0 = \cdots = \hat{p}_{\theta -1} = 0\), \(\hat{p}_\theta = 1\), so we shall assume that \({\beta }^{(n)}_k \ge 0\) for \(k < \theta \). Also, for \(\rho = 1\), we have \(\hat{c}= \theta \) and \(\psi (\hat{c})=1\) so that
where \({\bar{\beta }}^{(n)} = \sum _k k{\beta }^{(n)}_k <0\).
Our starting point is again (33), which for the present case reduces to
Since \({\beta }^{(n)} \sim 0\) and \({\bar{\beta }}^{(n)} < 0\), the value of \(k < \theta \) that makes the largest contribution is \(\theta -1\). For all other values of k, \(\frac{({\bar{\beta }}^{(n)})^{\theta -k}}{{\beta }^{(n)}_k} \rightarrow 0\) with respect to this fraction for \(k = \theta - 1\). That is, fluctuations under this scaling will be visible only in \({S}^{(n)}_{\theta -1}\) and \({S}^{(n)}_\theta \) and not in lower values of k. In other words, given the number in the system, we can deduce the configuration to be \({S}^{(n)}_{\theta -1} = n\theta - nc\) and \({S}^{(n)}_\theta = n-{S}^{(n)}_{\theta -1}\). Therefore, there is only one vector p in the set \({\mathcal {P}}^{(n)}_c\). As a consequence, the only possible value of q in (90) is p, which then leads to
Consider \(q_{\theta -1} = {\beta }^{(n)} \ge 0\), where \({\beta }^{(n)}\) is a scalar from now on. Since \(q_\theta = 1 - {\beta }^{(n)}\), we have \({\bar{\beta }}^{(n)} = -{\beta }^{(n)}\). Let us compute the asymptotics of the term with \(\psi \):
where the last asymptotic form is a consequence of Lemma 4. Substituting the above relation back in (94), we get
where we have used the identity \({\bar{\beta }}^{(n)} = -{\beta }^{(n)}\) which was noted previously.
Consequently, the right scaling for \({\beta }^{(n)}\) is \(z n^{-1/(\theta +1)}\), for \(z > 0\), which means that \({S}^{(n)}_{\theta - 1} = n{\beta }^{(n)}\) lives on a scale of \(n^{\theta /(\theta +1)}\). As for the proof of the central limit theorem, we can pass from local to global convergence by combining (96), the estimate of the blocking probabilities given in Theorem 7, and Theorem 1.29 in [35] with \(\delta _1(n)= {1 \over \sqrt{n}}\) and \(k= 1\). \(\square \)
1.3 A.3 Proof of Theorem 10
Proof
Following the same steps as in the proof of Theorem 9 until (93), we can arrive at the conclusion that \({S}^{(n)}_{\theta -1}\) and \({S}^{(n)}_\theta \) will be nonzero. Note that the only difference with the \(\rho = 1\) case is that now
and \(p_k\) has a factor \(\rho ^{-(\theta -k)}\). Going further until (96) leads us to
The only possible scaling is, thus, \({\beta }^{(n)} = zn^{-1}\), which means that the fluctuations of \({S}^{(n)}_\theta \) around \(n\theta \) are O(1).
We cannot carry on from this stage along the same line as that in the proof of Theorem 10, because to arrive at (94) we had assumed that the nonzero fluctuations were increasing with n. (This was needed to apply Stirling’s approximation.) So, we shall work directly with the stationary distribution. From (12),
which is a consequence of Stirling’s approximation. \(\square \)
1.4 A.4 Concavity of R
Lemma 2
The function \(R:{\mathbb {R}}_+ \rightarrow {\mathbb {R}}\) defined by
is concave.
Proof
Recall that \(g_\theta (t) = \sum _{k=0}^{\theta }\frac{t^k}{k!}.\) Rewrite \(g_\theta (t)\) in terms of the incomplete gamma function using the following steps:
where \(\tilde{\varGamma }\) is the normalized incomplete gamma function, that is, \(\tilde{\varGamma }(m,x) = \frac{\varGamma (m,x)}{\varGamma (m,0)}\).
To show the concavity of R, we shall show that its second derivative is negative. Note that \(g^\prime _\theta (t) = g_{\theta -1}(t)\) so that
and
It is shown in [1] that \(\tilde{\varGamma }\), viewed as a function of \(\theta \), is log-concave for all \(t > 0\). We can thus infer that R is concave in \((0,\infty )\). \(\square \)
1.5 A.5 Generating functions
Let
be the moment generating function of \(S^{(n)}\).
Theorem 12
Proof
From (12) and using the facts that \(x! = \int _0^\infty t^x e^{-t} dt\) and \((n\theta - \bar{s}) = \sum _k(\theta -k)s_k\),
where the last identity is a consequence of the multinomial theorem followed by a relabelling of the index inside the sum. Finally, making the transformation \(t \mapsto tn\rho \) inside the integral completes the proof. \(\square \)
Let
be the moment generating function of the number of tasks in steady state.
Lemma 3
Proof
From its definition,
\(\square \)
Theorem 13
Proof
The result is a direct consequence of Theorem 12 and Lemma 3. \(\square \)
1.6 A.6 Miscellaneous results
Lemma 4
For \(\theta \ge 1\),
Proof
Let \(h_\theta (t)=\log \left( \sum _{i=0}^\theta \frac{t^i}{i!}\right) \). The proof is based on computing the coefficients in the Taylor series expansion of h around 0. The first derivative of h is
where \(g_\theta (t) = \sum _{i=0}^\theta \frac{t^i}{i!}\), which gives the coefficient of t as 1.
For \(k\le \theta \), taking the kth derivative of \(h^{(1)}\) and evaluating it using the product rule for higher-order derivatives, we obtain the \((k+1)\)th derivative of h as
where \((g_\theta (t)^{-1})^{(k-j)}\) is the \((k-j)\)th derivative of \(g_\theta (t)^{-1}\). Assuming that the derivatives of \(g_\theta (t)^{-1}\) do not go to \(\infty \) at \(t=0\) (which can be seen to be true), at \(t=0\), the only nonzero derivative is obtained for \(k=\theta \) and \(j=k\). That is,
\(\square \)
Rights and permissions
About this article
Cite this article
Jonckheere, M., Prabhu, B.J. Asymptotics of insensitive load balancing and blocking phases. Queueing Syst 88, 243–278 (2018). https://doi.org/10.1007/s11134-017-9559-5
Received:
Revised:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11134-017-9559-5