1 Introduction
A growth rule which has received considerable attention in the literature is the following preferential attachment algorithm on permutations written as products of cycles, known as the Chinese restaurant process (CRP) [1, 5, 9, 12–14, 16, 20, 23]. For $n\ge 1$, given a permutation of $[n-1]:=\left\{1,2,\dots ,n-1\right\}$ which has been constructed at step $n-1$, element n is either appended to the permutation as a new singleton cycle with probability $\frac{\theta }{\theta +n-1}$, or inserted into a uniformly random position within the existing permutation with probability $\frac{n-1}{\theta +n-1}$. Let ${S_{n}}$ be the set of permutations on $[n]$. The permutation obtained at step n has the Ewens distribution, which is the uniform distribution on the $n!$ permutations in the case $\theta =1$ and, in general, has probability distribution ${\theta ^{|\pi |}}/{\theta _{(n)}}$, $\pi \in {S_{n}}$, where ${\theta _{(n)}}:=\theta (\theta +1)\cdots (\theta +n-1)$ and $|\pi |$ is the number of cycles in π.
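To make the growth rule concrete, here is a minimal Python sketch of this construction (our own illustration; crp_permutation is a hypothetical helper, not code from the cited sources):
\begin{verbatim}
import random

def crp_permutation(n, theta=1.0, rng=random):
    """Sample a permutation of [n] as a list of cycles via the CRP."""
    cycles = []  # each cycle is a list of labels
    for i in range(1, n + 1):
        if rng.random() < theta / (theta + i - 1):
            cycles.append([i])           # i starts a new singleton cycle
        else:
            j = rng.randrange(i - 1)     # uniform over the i - 1 existing elements
            for c in cycles:             # locate that element and insert i after it
                if j < len(c):
                    c.insert(j + 1, i)
                    break
                j -= len(c)
    return cycles
\end{verbatim}
For $\theta =1$ every permutation is returned with probability $1/n!$; for general θ the number of returned cycles fluctuates around $\theta \log n$, in line with the asymptotics below.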
Let ${\Pi _{n}}$ be the random permutation of $[n]$ at the nth step of the CRP. The distribution of ${\Pi _{n}}$ is invariant under relabelling. For different orders the permutations are consistent, in the sense that ${\Pi _{m}}$, for $m\lt n$, can be derived from ${\Pi _{n}}$ by removing elements $m+1,\dots ,n$ from their cycles and deleting empty cycles if necessary.
The CRP expresses every permutation π of $[n]$ uniquely as the product
\[ \pi =({x_{1,1}}\dots {x_{1,{m_{1}}}}{y_{1}})({x_{2,1}}\dots {x_{2,{m_{2}}}}{y_{2}})\cdots ({x_{\tau ,1}}\dots {x_{\tau ,{m_{\tau }}}}{y_{\tau }}),\hspace{1em}\tau \ge 1,\]
where the following two conditions are satisfied:
1. ${y_{1}}\lt {y_{2}}\lt \cdots \lt {y_{\tau }}$.
2. ${y_{i}}\lt {x_{i,j}}$ for $1\le j\le {m_{i}}$, $1\le i\le \tau $.
We will see in Section 2 that there is a similar unique expression for permutations of multisets.

The number of cycles in ${\Pi _{n}}$, denoted by ${K_{n}}$, has probability generating function (p.g.f.)
\[ \mathbb{E}\left({z^{{K_{n}}}}\right)=\frac{{(\theta z)_{(n)}}}{{\theta _{(n)}}}={\prod \limits_{i=1}^{n}}\frac{\theta z+i-1}{\theta +i-1},\]
which corresponds to the distribution of the sum of n independent Bernoulli variables ${K_{n}}={\textstyle\sum _{i=1}^{n}}{I_{i}}$, ${I_{i}}\sim \mathrm{Bernoulli}(\theta /(i+\theta -1))$; see [1]. We have ${I_{i}}=1$ exactly when the element i starts a new cycle. As $n\to \infty $,
\[ {K_{n}}\sim \theta \log n\hspace{3.33333pt}\hspace{3.33333pt}\mathrm{a}.\mathrm{s}.,\hspace{3.33333pt}\hspace{3.33333pt}\hspace{3.33333pt}\frac{{K_{n}}-\theta \log n}{\sqrt{\theta \log n}}\stackrel{d}{\to }\mathrm{N}(0,1),\]
where $\stackrel{d}{\to }$ denotes convergence in distribution.

2 Multiset permutations and their factorisations into cycles
There are at least two different ways of defining cycles of multiset permutations [19, 22]. Knuth’s way [19] will be used in this paper. In this section, we summarise the discussion and results on multiset permutation factorisation taken from Section 5.1.2 of [19] that are needed in this paper. In particular, we are going to use Theorem 1 below to factor multiset permutations. A multiset is like a set except that it can have repetitions of identical elements, for example
\[ M=\{1,1,1,2,2,3,4,4,4,4\},\]
where the set $\{1,2,3,4\}$ is given the usual ordering. A permutation of a multiset is an arrangement of its elements in a row, for example
\[ 3\hspace{3.33333pt}1\hspace{3.33333pt}2\hspace{3.33333pt}4\hspace{3.33333pt}4\hspace{3.33333pt}1\hspace{3.33333pt}2\hspace{3.33333pt}4\hspace{3.33333pt}1\hspace{3.33333pt}4.\]
We may write this in two-line notation as
(1)
\[ \left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c}1& 1& 1& 2& 2& 3& 4& 4& 4& 4\\ {} 3& 1& 2& 4& 4& 1& 2& 4& 1& 4\end{array}\right).\]
The intercalation product ${\pi _{1}}\top {\pi _{2}}$ of two multiset permutations ${\pi _{1}}$ and ${\pi _{2}}$ is defined in [6, 11] and is obtained by
1. Expressing ${\pi _{1}}$ and ${\pi _{2}}$ in the two-line notation.
2. Juxtaposing these two-line notations.
3. Sorting the columns into nondecreasing order of the top line, in such a way that the sorting is stable, in the sense that left-to-right order in the bottom line is preserved when the corresponding top line elements are equal.
For example,
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}& & \displaystyle \left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c}1& 1& 2& 3& 4\\ {} 3& 1& 4& 1& 2\end{array}\right)\top \left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c}1& 2& 4& 4& 4\\ {} 2& 4& 4& 1& 4\end{array}\right)\\ {} & \displaystyle =& \displaystyle \left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c}1& 1& 1& 2& 2& 3& 4& 4& 4& 4\\ {} 3& 1& 2& 4& 4& 1& 2& 4& 1& 4\end{array}\right).\end{array}\]
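Since the stable sort is exactly what Python’s built-in sorting performs, the intercalation product is one line of code. The following sketch (our own; the column-list representation is an assumption, not notation from [19]) reproduces the example above:
\begin{verbatim}
def intercalation(p1, p2):
    """Intercalation product of two multiset permutations, each given as a
    list of (top, bottom) columns of its two-line notation: juxtapose the
    columns, then stable-sort them by the top entry."""
    return sorted(p1 + p2, key=lambda col: col[0])  # Python's sort is stable

p1 = [(1, 3), (1, 1), (2, 4), (3, 1), (4, 2)]
p2 = [(1, 2), (2, 4), (4, 4), (4, 1), (4, 4)]
print(intercalation(p1, p2))
# bottom line reads 3 1 2 4 4 1 2 4 1 4, i.e. the permutation (1)
\end{verbatim}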
For possibly repeated elements ${x_{1}},{x_{2}},\dots ,{x_{m}}$, the cycle $({x_{1}}\dots {x_{m}})$ stands for the permutation obtained in two-line form by sorting the columns of
\[ \left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c}{x_{1}}& {x_{2}}& \dots & {x_{m}}\\ {} {x_{2}}& {x_{3}}& \dots & {x_{1}}\end{array}\right)\]
in a stable manner. For example, we have
(2)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}& & \displaystyle (4\hspace{3.33333pt}2\hspace{3.33333pt}4\hspace{3.33333pt}4\hspace{3.33333pt}1\hspace{3.33333pt}3\hspace{3.33333pt}1\hspace{3.33333pt}1\hspace{3.33333pt}2\hspace{3.33333pt}4)\\ {} & \displaystyle =& \displaystyle \left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c}4& 2& 4& 4& 1& 3& 1& 1& 2& 4\\ {} 2& 4& 4& 1& 3& 1& 1& 2& 4& 4\end{array}\right)\\ {} & \displaystyle =& \displaystyle \left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c}1& 1& 1& 2& 2& 3& 4& 4& 4& 4\\ {} 3& 1& 2& 4& 4& 1& 2& 4& 1& 4\end{array}\right)\end{array}\]
and so (1) is a cycle. For these general cycles, $({x_{1}}{x_{2}}\dots {x_{m}})$ is not always the same as $({x_{2}}\dots {x_{m}}{x_{1}})$ and it can happen that a cycle can be written as the intercalation product of other cycles, as we shall see at the end of this section. Knuth [19] proves the following factorisation of multiset permutations into cycles.
Theorem 1 ([19]).
Let the elements of the multiset M be linearly ordered by the relation <. Every permutation π of M has a unique representation as the intercalation
(3)
\[ \pi =({x_{1,1}}\dots {x_{1,{m_{1}}}}{y_{1}})\top ({x_{2,1}}\dots {x_{2,{m_{2}}}}{y_{2}})\top \cdots \top ({x_{\tau ,1}}\dots {x_{\tau ,{m_{\tau }}}}{y_{\tau }}),\hspace{1em}\tau \ge 1,\]
where the following two conditions are satisfied:
1. ${y_{1}}\le {y_{2}}\le \cdots \le {y_{\tau }}$.
2. ${y_{i}}\lt {x_{i,j}}$ for $1\le j\le {m_{i}}$, $1\le i\le \tau $.
Note that (2) is not in the form (3). The factorisation of (1) corresponding to Theorem 1 is
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}& & \displaystyle (3\hspace{3.33333pt}1)\top (1)\top (2\hspace{3.33333pt}4\hspace{3.33333pt}2\hspace{3.33333pt}4\hspace{3.33333pt}4\hspace{3.33333pt}1)\top (4)\\ {} & \displaystyle =& \displaystyle \left(\begin{array}{c@{\hskip10.0pt}c}3& 1\\ {} 1& 3\end{array}\right)\top \left(\begin{array}{c}1\\ {} 1\end{array}\right)\top \left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c}2& 4& 2& 4& 4& 1\\ {} 4& 2& 4& 4& 1& 2\end{array}\right)\top \left(\begin{array}{c}4\\ {} 4\end{array}\right)\\ {} & \displaystyle =& \displaystyle \left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c}3& 1& 1& 2& 4& 2& 4& 4& 1& 4\\ {} 1& 3& 1& 4& 2& 4& 4& 1& 2& 4\end{array}\right)\\ {} & \displaystyle =& \displaystyle \left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c}1& 1& 1& 2& 2& 3& 4& 4& 4& 4\\ {} 3& 1& 2& 4& 4& 1& 2& 4& 1& 4\end{array}\right).\end{array}\]
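This factorisation can be checked mechanically. Continuing the Python sketch from Section 2 (our own; cycle is a hypothetical helper implementing the stable-sort definition of a cycle above):
\begin{verbatim}
def cycle(xs):
    """Two-line columns of the cycle (x1 x2 ... xm): each xi maps to
    x(i+1), cyclically, and the columns are stable-sorted by the top."""
    cols = [(xs[i], xs[(i + 1) % len(xs)]) for i in range(len(xs))]
    return sorted(cols, key=lambda col: col[0])

lhs = cycle([3, 1])
for c in ([1], [2, 4, 2, 4, 4, 1], [4]):
    lhs = intercalation(lhs, cycle(c))
assert lhs == cycle([4, 2, 4, 4, 1, 3, 1, 1, 2, 4])  # both equal (1), cf. (2)
\end{verbatim}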
3 The CRP for multiset permutations
Given a fixed sequence of positive integers ${n_{i}}\ge 1$, $i\ge 1$, we will construct a CRP process on permutations of multisets indexed by integers $t\ge 0$. We define $M({n_{1}},\dots ,{n_{t}})$ to be the multiset which contains ${n_{i}}$ copies of i for $i\in [t]$, $t\ge 1$, and ∅ for $t=0$. We give the set of elements $[t]$ of $M({n_{1}},\dots ,{n_{t}})$ the usual ordering. The set of permutations of $M({n_{1}},\dots ,{n_{t}})$ is denoted by $S({n_{1}},\dots ,{n_{t}})$ and is of size
\[ |S({n_{1}},\dots ,{n_{t}})|=\left(\genfrac{}{}{0.0pt}{}{{N_{t}}}{{n_{1}},\dots ,{n_{t}}}\right),\]
where ${N_{t}}={n_{1}}+\cdots +{n_{t}}$. Note that ${S_{n}}=S(\underset{n\hspace{2.5pt}\text{times}}{\underbrace{1,1,\dots ,1}})$. The random multiset permutation ${\tilde{\Pi }_{t}}$ of our process at the tth step is an element of $S({n_{1}},\dots ,{n_{t}})$, represented in accordance with (3). At step $t\ge 1$, there are ${n_{t}}$ copies of t either to be inserted in the existing permutation ${\tilde{\Pi }_{t-1}}$ or to be put into singleton cycles. (It is not possible to put more than one copy of t in a new cycle because of condition 2 of Theorem 1.) Suppose that ${k_{t}}\in \{0\}\cup [{n_{t}}]$ of the ${n_{t}}$ copies of t start ${k_{t}}$ new singleton cycles containing t, and the ${n_{t}}-{k_{t}}$ other copies of t are inserted into the existing permutation. The ${n_{t}}-{k_{t}}$ copies of t can be inserted to the left of any of the ${y_{i}}$ or ${x_{ij}}$ in (3). A ‘stars and bars’ argument shows that the number of ways of inserting ${n_{t}}-{k_{t}}$ copies of t to the left of elements of a permutation in $S({n_{1}},\dots ,{n_{t-1}})$ formatted like (3) is
\[ \left(\genfrac{}{}{0.0pt}{}{{N_{t-1}}-1+{n_{t}}-{k_{t}}}{{n_{t}}-{k_{t}}}\right),\]
recalling that ${N_{t-1}}={n_{1}}+\cdots +{n_{t-1}}$ is the number of elements of ${\tilde{\Pi }_{t-1}}$.

We now define the multiset CRP. Given $\theta \gt 0$, define
(4)
\[ {F_{t}}(\theta )={\sum \limits_{{k_{t}}=0}^{{n_{t}}}}\left(\genfrac{}{}{0.0pt}{}{{N_{t-1}}-1+{n_{t}}-{k_{t}}}{{n_{t}}-{k_{t}}}\right){\theta ^{{k_{t}}}}\]
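As a quick numerical sanity check (our own Python sketch; F is a hypothetical helper), one can evaluate (4) directly and confirm both the special case (5) below and the reduction to the classical CRP when ${n_{t}}=1$:
\begin{verbatim}
from math import comb

def F(theta, N_prev, n_t):
    """Definition (4): sum over k of C(N_{t-1}-1+n_t-k, n_t-k)*theta**k."""
    return sum(comb(N_prev - 1 + n_t - k, n_t - k) * theta**k
               for k in range(n_t + 1))

assert F(1, 7, 3) == comb(10, 3)    # instance of (5): N_{t-1}=7, n_t=3, N_t=10
assert F(2.5, 1, 1) == 2.5 + 2 - 1  # n_t = 1, t = 2: F_t(theta) = theta + t - 1
\end{verbatim}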
At time $t\ge 1$, ${k_{t}}$ new singleton cycles containing t are created and the other ${n_{t}}-{k_{t}}$ copies are inserted into the existing permutation in a specified way with probability ${\theta ^{{k_{t}}}}/{F_{t}}(\theta )$. If ${n_{t}}=1$ for all t, then ${F_{t}}(\theta )=\theta +t-1$ and we recover the usual CRP. An induction argument on t shows that all possible $\pi \in S({n_{1}},\dots ,{n_{t}})$ can be constructed in this manner. This can also be seen from the identity obtained by setting $\theta =1$ in (4). The special case
(5)
\[ {F_{t}}(1)={\sum \limits_{{k_{t}}=0}^{{n_{t}}}}\left(\genfrac{}{}{0.0pt}{}{{N_{t-1}}-1+{n_{t}}-{k_{t}}}{{n_{t}}-{k_{t}}}\right)=\left(\genfrac{}{}{0.0pt}{}{{N_{t}}}{{n_{t}}}\right)\]
results from the well known identity
(6)
\[ {\sum \limits_{\rho =0}^{\nu }}\left(\genfrac{}{}{0.0pt}{}{\mu +\rho }{\rho }\right)=\left(\genfrac{}{}{0.0pt}{}{\mu +\nu +1}{\nu }\right),\hspace{1em}\nu ,\mu \ge 0\hspace{3.33333pt}\mathrm{integers};\]
see page 161 of [15]. Note that (5) is the total number of ways of extending the permutation from time $t-1$ to time t, which equals $|S({n_{1}},\dots ,{n_{t}})|/|S({n_{1}},\dots ,{n_{t-1}})|$.

Next we will find the distribution of ${\tilde{\Pi }_{t}}$.

Theorem 2.
For every $\pi \in S({n_{1}},\dots ,{n_{t}})$, $t\ge 1$,
(7)
\[ \mathbb{P}({\tilde{\Pi }_{t}}=\pi )=\frac{{\theta ^{|\pi |}}}{{\textstyle\prod _{s=1}^{t}}{F_{s}}(\theta )}.\]
Proof.
We will use the method in [23]. An arborescence is a directed, rooted tree with edges oriented in agreement with paths from the root to the leaves. Our arborescence has vertex set $V={\textstyle\bigcup _{s=0}^{t}}S({n_{1}},\dots ,{n_{s}})$ and the root is ∅ (when $s=0$). The leaves of the arborescence are $S({n_{1}},\dots ,{n_{t}})$. Give each leaf $\pi \in S({n_{1}},\dots ,{n_{t}})$ the weight ${\theta ^{|\pi |}}$, where $|\pi |$ is the number of cycles in π. There is a directed edge from ${\pi _{1}}\in S({n_{1}},\dots ,{n_{s-1}})$ to ${\pi _{2}}\in S({n_{1}},\dots ,{n_{s}})$, $s\in [t]$, if and only if, for some ${k_{s}}\in \{0\}\cup [{n_{s}}]$, ${\pi _{2}}$ is obtained from ${\pi _{1}}$ by inserting ${n_{s}}-{k_{s}}$ copies of s to the left of elements of ${\pi _{1}}$, when ${\pi _{1}}$ is written as (3), and creating ${k_{s}}$ singleton cycles each containing s. For each $\pi \in V$, define $Q(\pi )$ to be the sum of the weights of all leaves ${\pi ^{\prime }}\in S({n_{1}},\dots ,{n_{t}})$ for which there is a directed path from π to ${\pi ^{\prime }}$. For $\pi \in S({n_{1}},\dots ,{n_{s}})$, we have
(8)
\[ Q(\pi )={\theta ^{|\pi |}}{\prod \limits_{u=s+1}^{t}}{F_{u}}(\theta ),\]
which follows immediately from the interpretation of the binomial coefficient preceding (4). Taking $\pi =\varnothing $ and $s=0$ in (8) gives (7). With ${\pi _{1}}$ and ${\pi _{2}}$ taken as above, we have $|{\pi _{2}}|=|{\pi _{1}}|+{k_{s}}$, and therefore
(9)
\[ {p_{{\pi _{1}},{\pi _{2}}}}:=\frac{Q({\pi _{2}})}{Q({\pi _{1}})}=\frac{{\theta ^{{k_{s}}}}}{{F_{s}}(\theta )}.\]
These are the transition probabilities of the CRP. Theorem 1 of [23] states that if we put the transition probabilities (9) on the edges of the arborescence, and let the leaves be absorbing states, then we obtain a Markov chain whose probability of absorption at a given leaf is proportional to the weight of the leaf. By our choice of weights we are done. □

For different steps the permutations are consistent, in the sense that ${\tilde{\Pi }_{s}}$, for $s\lt t$, can be derived from ${\tilde{\Pi }_{t}}$ by removing elements $s+1,\dots ,t$ from their cycles and deleting empty cycles if necessary.
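To illustrate the process end to end, here is a minimal Python sketch of the multiset CRP (our own illustration; multiset_crp and its way of sampling insertion patterns are assumptions about one way to realise the construction, not code from [23]):
\begin{verbatim}
import random
from math import comb

def multiset_crp(ns, theta=1.0, rng=random):
    """Sample the multiset CRP after len(ns) steps, returned as a list of
    cycles in the canonical form (3) of Theorem 1."""
    cycles = []  # new singleton cycles (t) are appended on the right
    N = 0        # N_{t-1}: number of elements placed so far
    for t, n in enumerate(ns, start=1):
        if N == 0:
            k = n  # nothing to insert into: all copies start singletons
        else:
            # P(k new cycles) proportional to C(N-1+n-k, n-k)*theta**k, cf. (4)
            weights = [comb(N - 1 + n - k, n - k) * theta**k
                       for k in range(n + 1)]
            k = rng.choices(range(n + 1), weights=weights)[0]
        j = n - k    # copies of t inserted to the left of existing elements
        if j > 0:
            # uniform 'stars and bars' pattern: a uniform multiset of j
            # slots out of the N current elements
            picks = sorted(rng.sample(range(N + j - 1), j))
            slots = [p - i for i, p in enumerate(picks)]  # nondecreasing
            for done, s in enumerate(slots):
                pos = s + done           # shift for copies already inserted
                for c in cycles:         # locate the element at index pos
                    if pos < len(c):
                        c.insert(pos, t)  # insert t to its left
                        break
                    pos -= len(c)
        cycles.extend([t] for _ in range(k))  # k new singleton cycles
        N += n
    return cycles

print(multiset_crp([2, 2, 2]))  # theta = 1: uniform over the 90 elements
                                # of S(2, 2, 2), by Theorem 2
\end{verbatim}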
4 The distribution of the number of cycles
Let ${X_{t}}$ denote the number of new cycles added in moving from time $t-1$ to time t, $t\ge 1$, which is the choice of ${k_{t}}$ above. The ${X_{t}}$ are independent with distributions given by
(10)
\[ \mathbb{P}({X_{t}}={k_{t}})=\left(\genfrac{}{}{0.0pt}{}{{N_{t-1}}-1+{n_{t}}-{k_{t}}}{{n_{t}}-{k_{t}}}\right)\frac{{\theta ^{{k_{t}}}}}{{F_{t}}(\theta )},\hspace{1em}{k_{t}}\in \{0\}\cup [{n_{t}}].\]
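As a sanity check (our own Python sketch, with arbitrarily chosen parameters), the probabilities in (10) sum to one and, for $\theta =1$, the mean agrees with the value ${n_{t}}/({N_{t-1}}+1)$ derived in (13) below:
\begin{verbatim}
from fractions import Fraction
from math import comb

def pmf_X(theta, N_prev, n_t):
    """P(X_t = k) from (10), computed exactly with rational arithmetic."""
    w = [Fraction(comb(N_prev - 1 + n_t - k, n_t - k)) * Fraction(theta)**k
         for k in range(n_t + 1)]
    total = sum(w)
    return [p / total for p in w]

p = pmf_X(1, 7, 3)  # theta = 1, N_{t-1} = 7, n_t = 3
assert sum(p) == 1
assert sum(k * pk for k, pk in enumerate(p)) == Fraction(3, 8)  # n_t/(N_{t-1}+1)
\end{verbatim}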
If $\theta =1$, then, by (5), we have
(11)
\[ \mathbb{P}({X_{t}}={k_{t}})=\frac{1}{\left(\genfrac{}{}{0.0pt}{}{{N_{t}}}{{n_{t}}}\right)}\left(\genfrac{}{}{0.0pt}{}{{N_{t-1}}-1+{n_{t}}-{k_{t}}}{{n_{t}}-{k_{t}}}\right)=\frac{\left(\genfrac{}{}{0.0pt}{}{{n_{t}}}{{k_{t}}}\right)}{\left(\genfrac{}{}{0.0pt}{}{{N_{t}}}{{k_{t}}}\right)}\frac{{N_{t}}-{n_{t}}}{{N_{t}}-{k_{t}}}.\]
This equals the probability that, starting from an urn with ${n_{t}}$ balls representing failures and ${N_{t}}-{n_{t}}$ balls representing successes, the first ${k_{t}}$ balls drawn from the urn without replacement are failures and the $({k_{t}}+1)$th ball drawn is a success. This description shows that ${Y_{t}}={X_{t}}+1$ has a negative hypergeometric distribution, which we now describe. The negative hypergeometric distribution [17, 18, 21] with parameters N, M and r is like the negative binomial distribution, but without replacement. There are N balls in total, M balls representing successes and $N-M$ balls representing failures. The negative hypergeometric distribution gives the probability that the rth success happens on the κth draw, $\kappa =r,r+1,\dots ,N$. A negative hypergeometric random variable Y with these parameters has probability mass function
(12)
\[ \mathbb{P}(Y=\kappa )=\frac{\left(\genfrac{}{}{0.0pt}{}{M}{r-1}\right)\left(\genfrac{}{}{0.0pt}{}{N-M}{\kappa -r}\right)}{\left(\genfrac{}{}{0.0pt}{}{N}{\kappa -1}\right)}\cdot \frac{M-r+1}{N-\kappa +1}.\]
It is easily checked that (11) is (12) with $N={N_{t}}$, $M={N_{t}}-{n_{t}}={N_{t-1}}$, $r=1$ and $\kappa ={k_{t}}+1$. The expectation of Y is
\[ \mathbb{E}(Y)=\frac{r(N+1)}{M+1}\]
and the variance of Y is
\[ \mathrm{Var}(Y)=\frac{r(N+1)(M-r+1)(N-M)}{{(M+1)^{2}}(M+2)}.\]
Therefore,
(13)
\[ \mathbb{E}({X_{t}})=\mathbb{E}({Y_{t}})-1=\frac{{N_{t}}+1}{{N_{t-1}}+1}-1=\frac{{n_{t}}}{{N_{t-1}}+1}\]
and
(14)
\[ \mathrm{Var}({X_{t}})=\mathrm{Var}({Y_{t}})=\frac{{n_{t}}({N_{t}}+1){N_{t-1}}}{{({N_{t-1}}+1)^{2}}({N_{t-1}}+2)}.\]
For the third centred moment (see [17, 18]), we have
(15)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle \mathbb{E}({[{X_{t}}-\mathbb{E}({X_{t}})]^{3}})& \displaystyle =& \displaystyle \mathbb{E}({[{Y_{t}}-\mathbb{E}({Y_{t}})]^{3}})\\ {} & \displaystyle =& \displaystyle \mathrm{Var}({Y_{t}})\cdot \frac{({n_{t}}+{N_{t}}+1)({N_{t-1}}-1)}{({N_{t-1}}+1)({N_{t-1}}+3)}.\end{array}\]
It follows that the number of cycles ${K_{t}}={\textstyle\sum _{s=1}^{t}}{X_{s}}$ has expectation
\[ \mathbb{E}({K_{t}})={\sum \limits_{s=1}^{t}}\frac{{n_{s}}}{{N_{s-1}}+1}\]
and variance
\[ \mathrm{Var}({K_{t}})={\sum \limits_{s=1}^{t}}\mathrm{Var}({X_{s}})={\sum \limits_{s=1}^{t}}\frac{{n_{s}}({N_{s}}+1){N_{s-1}}}{{({N_{s-1}}+1)^{2}}({N_{s-1}}+2)}.\]

Given sequences ${a_{t}}$ and ${b_{t}}$, we use the notation ${a_{t}}=O({b_{t}})$ to mean ${a_{t}}\le C{b_{t}}$ for a constant $C\gt 0$; ${a_{t}}\asymp {b_{t}}$ to mean ${C_{1}}{b_{t}}\le {a_{t}}\le {C_{2}}{b_{t}}$ for constants ${C_{1}}\gt 0$, ${C_{2}}\gt 0$; and ${a_{t}}\sim {b_{t}}$ to mean ${\lim \nolimits_{t\to \infty }}{a_{t}}/{b_{t}}=1$. We also define ${a_{t}}=o({b_{t}})$ to mean ${\lim \nolimits_{t\to \infty }}{a_{t}}/{b_{t}}=0$. We will impose the bound ${n_{t}}=O(1)$ to obtain a central limit theorem for ${K_{t}}$ for all θ.
Theorem 3.
If $\theta =1$ and ${n_{i}}=n\ge 1$ for all $i\ge 1$, then
(16)
\[ {K_{t}}\sim \log t\hspace{3.57777pt}\hspace{3.57777pt}\mathrm{a}.\mathrm{s}.,\hspace{3.57777pt}\hspace{3.57777pt}\hspace{3.57777pt}\frac{{K_{t}}-\log t}{\sqrt{\log t}}\stackrel{d}{\to }\mathrm{N}(0,1).\]
For general θ, let ${n_{1}},{n_{2}},\dots $ be given, ${n_{i}}\ge 1$ for all i, and suppose
(17)
\[ \underset{i\ge 1}{\sup }\hspace{0.1667em}{n_{i}}\lt \infty .\]
Then,
(18)
\[ \mathbb{E}({K_{t}})\asymp \log t,\hspace{1em}\mathrm{Var}({K_{t}})\asymp \log t,\]
and
(19)
\[ {K_{t}}\sim \mathbb{E}({K_{t}})\hspace{3.57777pt}\hspace{3.57777pt}\mathrm{a}.\mathrm{s}.,\hspace{3.57777pt}\hspace{3.57777pt}\hspace{3.57777pt}\hspace{3.57777pt}\hspace{3.57777pt}\hspace{3.57777pt}\frac{{K_{t}}-\mathbb{E}({K_{t}})}{\sqrt{\mathrm{Var}({K_{t}})}}\stackrel{d}{\to }\mathrm{N}(0,1).\]

Proof.
First we prove the theorem for $\theta =1$. From (13), (14) and (17) we get $\mathbb{E}({X_{t}})\asymp 1/t$ and $\mathrm{Var}({X_{t}})\asymp 1/t$, which result in (18), after which the Lindeberg-Feller Theorem easily shows the central limit theorem in (19), because ${X_{t}}\le {n_{t}}=O(1)$. If we impose the condition ${n_{i}}=n$, for $i\ge 1$, then we immediately obtain (16). For θ general, we reduce to the case $\theta =1$. This follows from noting that in (10), ${\theta ^{{k_{t}}}}\asymp 1$ and ${F_{t}}(\theta )\asymp {F_{t}}(1)$ uniformly over $t\ge 1$, by (17), and so (18) still holds. For the almost sure convergence, modify the proof of (6.6) Theorem on page 44 of [8]. □
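A quick Monte Carlo illustration of Theorem 3 (our own sketch, with arbitrary parameters, reusing multiset_crp from the sketch in Section 3):
\begin{verbatim}
import random
import statistics

rng = random.Random(0)
t, reps = 500, 2000
ks = [len(multiset_crp([2] * t, theta=2.0, rng=rng)) for _ in range(reps)]
m, s = statistics.mean(ks), statistics.stdev(ks)
frac = sum(abs((k - m) / s) <= 1 for k in ks) / reps
print(frac)  # close to 0.68 if the standardised K_t is approximately normal
\end{verbatim}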
The following theorem gives a central limit theorem even when ${n_{t}}$ is unbounded, a case not covered by Theorem 3.

Theorem 4.
Suppose that $\theta =1$ and ${n_{t}}=O({N_{t-1}})$. Then
\[ \frac{{K_{t}}-\mathbb{E}({K_{t}})}{\sqrt{\mathrm{Var}({K_{t}})}}\stackrel{d}{\to }\mathrm{N}(0,1).\]

Proof.
In order to prove limiting normality, we will verify Lyapounov’s condition in [3] with $\delta =1$, by showing that
(20)
\[ \underset{t\to \infty }{\lim }\hspace{0.1667em}\frac{1}{\mathrm{Var}{({K_{t}})^{3/2}}}{\sum \limits_{s=1}^{t}}\mathbb{E}({[{X_{s}}-\mathbb{E}({X_{s}})]^{3}})=0.\]
The assumption ${n_{t}}=O({N_{t-1}})$, together with (14) and (15), produces
\[ \mathbb{E}({[{X_{t}}-\mathbb{E}({X_{t}})]^{3}})\asymp \mathrm{Var}({X_{t}})\asymp {n_{t}}/{N_{t-1}},\hspace{1em}t\ge 2.\]
We have the lower bound
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {\sum \limits_{s=2}^{t}}\frac{{n_{s}}}{{N_{s-1}}}& \displaystyle \ge & \displaystyle {\sum \limits_{s=2}^{t}}\ln \left(1+\frac{{n_{s}}}{{N_{s-1}}}\right)\\ {} & \displaystyle =& \displaystyle {\sum \limits_{s=2}^{t}}\left(\ln ({N_{s}})-\ln ({N_{s-1}})\right)\\ {} & \displaystyle =& \displaystyle \ln ({N_{t}})-\ln ({N_{1}})\to \infty ,\end{array}\]
where the convergence to infinity follows from the assumption ${n_{t}}\ge 1$ for all $t\ge 1$. Hence,
\[ \mathrm{Var}({K_{t}})\asymp {\sum \limits_{s=1}^{t}}\mathbb{E}({[{X_{s}}-\mathbb{E}({X_{s}})]^{3}})\asymp {\sum \limits_{s=2}^{t}}\frac{{n_{s}}}{{N_{s-1}}}\to \infty \]
and (20) holds. □

An example where Theorem 4 holds is obtained by setting $\theta =1$ and ${n_{t}}=\lceil C{t^{\alpha }}\rceil $ for constants $C\gt 0$, $\alpha \gt 0$, where $\lceil x\rceil $ is the smallest integer greater than or equal to x. Then ${n_{t}}=C{t^{\alpha }}+O(1)$ and ${N_{t}}=C{t^{\alpha +1}}/(\alpha +1)+O({t^{\alpha }})$, and the assumptions of Theorem 4 hold. The expectation and variance are asymptotically
\[ \mathbb{E}({K_{t}})\sim (\alpha +1)\log t,\hspace{1em}\mathrm{Var}({K_{t}})\sim (\alpha +1)\log t.\]
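As a numerical check of these asymptotics (our own sketch; the values of C and α are arbitrary choices), one can accumulate $\mathbb{E}({K_{t}})$ exactly via (13):
\begin{verbatim}
from math import ceil, log

C, alpha, T = 2.0, 1.5, 200_000
N, EK = 0, 0.0
for t in range(1, T + 1):
    n = ceil(C * t**alpha)
    EK += n / (N + 1)   # E(X_t) = n_t/(N_{t-1} + 1), by (13)
    N += n
print(EK / ((alpha + 1) * log(T)))  # tends to 1, slowly: the error is O(1/log T)
\end{verbatim}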
Another example for $\theta =1$ is obtained through regular variation [4]. A positive, measurable function $\ell (x)$, defined on $[\eta ,\infty )$ for η real, is slowly varying if
(21)
\[ \underset{x\to \infty }{\lim }\frac{\ell (\lambda x)}{\ell (x)}=1,\hspace{1em}\forall \lambda \gt 0.\]
The Uniform Convergence Theorem [4] provides that the convergence in (21) is uniform on compact λ-sets in $(0,\infty )$. A positive, measurable function f defined on $[\eta ,\infty )$ is regularly varying with real index α if
\[ \underset{x\to \infty }{\lim }\frac{f(\lambda x)}{f(x)}={\lambda ^{\alpha }},\hspace{1em}\forall \lambda \gt 0.\]
A regularly varying function f with index α can be written as
\[ f(x)={x^{\alpha }}\ell (x),\]
where $\ell (x)$ is a slowly varying function. Suppose that $\ell (x)$ is locally bounded in $[\eta ,\infty )$, i.e. bounded on compact subsets of $[\eta ,\infty )$, and $\alpha \gt -1$. Karamata’s theorem gives
\[ {\int _{\eta }^{t}}{x^{\alpha }}\ell (x)\hspace{0.1667em}dx\sim {t^{\alpha +1}}\ell (t)/(\alpha +1),\hspace{1em}t\to \infty .\]
We further suppose that $\eta \le 1$, $\alpha \ge 0$, with ${\lim \nolimits_{x\to \infty }}\ell (x)=\infty $ in the case $\alpha =0$, and define
\[ {n_{t}}=\Big\lceil {\int _{t}^{t+1}}{x^{\alpha }}\ell (x)\hspace{0.1667em}dx\Big\rceil ,\hspace{1em}t\ge 1.\]
The Uniform Convergence Theorem implies that
\[ \underset{t\le x\le t+1}{\sup }\left|\frac{\ell (x)}{\ell (t)}-1\right|\le \underset{1\le \lambda \le 2}{\sup }\left|\frac{\ell (\lambda t)}{\ell (t)}-1\right|=o(1),\]
from which
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {\int _{t}^{t+1}}{x^{\alpha }}\ell (x)\hspace{0.1667em}dx& \displaystyle =& \displaystyle (1+o(1))\ell (t){\int _{t}^{t+1}}{x^{\alpha }}\hspace{0.1667em}dx\\ {} & \displaystyle =& \displaystyle (1+o(1))\ell (t)({t^{\alpha }}+o({t^{\alpha }}))\end{array}\]
and
\[ {n_{t}}\sim {t^{\alpha }}\ell (t),\hspace{1em}t\to \infty .\]
Moreover, Karamata’s theorem gives us
\[ {N_{t}}={\sum \limits_{s=1}^{t}}{n_{s}}\sim \frac{{t^{\alpha +1}}\ell (t)}{\alpha +1},\hspace{1em}t\to \infty .\]
Hence, the conditions of Theorem 4 hold.

We will now show that ${n_{t}}$ may be chosen in such a way that the conclusion of the central limit theorem does not hold for ${K_{t}}$. Suppose that $\theta =1$ and the ${n_{t}}$ are chosen such that
(22)
\[ \underset{t\to \infty }{\lim }\frac{{n_{t}}}{{N_{t-1}}}=\infty .\]
By (13) and (14), we have
\[ \mathbb{E}({X_{t}})\sim \frac{{n_{t}}}{{N_{t-1}}},\hspace{1em}\mathrm{Var}({X_{t}})\sim {\left(\frac{{n_{t}}}{{N_{t-1}}}\right)^{2}}.\]
We further choose the ${n_{t}}$ such that
(23)
\[ {\sum \limits_{s=1}^{t-1}}\mathrm{Var}({X_{s}})=o(\mathrm{Var}({X_{t}})).\]
A particular example of such a sequence is ${n_{t}}=\lceil {e^{{t^{3}}}}\rceil $, $t\ge 1$, from which
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {N_{t}}& \displaystyle =& \displaystyle {\sum \limits_{s=1}^{t}}{e^{{s^{3}}}}+O(t)\\ {} & \displaystyle =& \displaystyle {e^{{t^{3}}}}\left(1+{\sum \limits_{s=1}^{t-1}}{e^{{s^{3}}-{t^{3}}}}\right)+O(t)\\ {} & \displaystyle =& \displaystyle {e^{{t^{3}}}}\left(1+O\left(t{e^{{(t-1)^{3}}-{t^{3}}}}\right)\right)+O(t)\\ {} & \displaystyle =& \displaystyle {e^{{t^{3}}}}(1+o(1)).\end{array}\]
Clearly, (22) holds and we also have
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {\sum \limits_{s=1}^{t-1}}\mathrm{Var}({X_{s}})& \displaystyle =& \displaystyle O\left({\sum \limits_{s=1}^{t-1}}{e^{2{s^{3}}-2{(s-1)^{3}}}}\right)\\ {} & \displaystyle =& \displaystyle O\left({\sum \limits_{s=1}^{t-1}}{e^{6{s^{2}}-6s}}\right)\\ {} & \displaystyle =& \displaystyle O\left(t{e^{6{(t-1)^{2}}-6(t-1)}}\right)\\ {} & \displaystyle =& \displaystyle o\left({e^{6{t^{2}}-6t}}\right)\end{array}\]
and so (23) holds, as well, because $\mathrm{Var}({X_{t}})\sim {e^{6{t^{2}}-6t+2}}$.

Assuming (22) and (23), we have
(24)
\[ \frac{{K_{t}}-\mathbb{E}({K_{t}})}{\sqrt{\mathrm{Var}({K_{t}})}}=\frac{{X_{t}}-\mathbb{E}({X_{t}})}{\sqrt{\mathrm{Var}({X_{t}})}}(1+o(1))+\frac{{\textstyle\textstyle\sum _{s=1}^{t-1}}({X_{s}}-\mathbb{E}({X_{s}}))}{\sqrt{\mathrm{Var}({K_{t}})}}.\]
Suppose that
\[ \frac{{K_{t}}-\mathbb{E}({K_{t}})}{\sqrt{\mathrm{Var}({K_{t}})}}\stackrel{d}{\to }\mathrm{N}(0,1).\]
By our assumptions, the variance of the last term of (24) is $o(1)$ and so by Slutsky’s theorem [8]
(25)
\[ \underset{t\to \infty }{\lim }\mathbb{P}\left(\frac{{X_{t}}-\mathbb{E}({X_{t}})}{\sqrt{\mathrm{Var}({X_{t}})}}\le 0\right)=\underset{t\to \infty }{\lim }\mathbb{P}\left(\frac{{K_{t}}-\mathbb{E}({K_{t}})}{\sqrt{\mathrm{Var}({K_{t}})}}\le 0\right)=1/2.\]
We will show that (25) is impossible. The c.d.f. of ${X_{t}}$ is given by
(26)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle \mathbb{P}({X_{t}}\le {x_{t}})& \displaystyle =& \displaystyle {\sum \limits_{{k_{t}}=0}^{{x_{t}}}}\mathbb{P}({X_{t}}={k_{t}})\\ {} & \displaystyle =& \displaystyle \frac{1}{\left(\genfrac{}{}{0.0pt}{}{{N_{t}}}{{n_{t}}}\right)}{\sum \limits_{{k_{t}}=0}^{{x_{t}}}}\left(\genfrac{}{}{0.0pt}{}{{N_{t-1}}-1+{n_{t}}-{k_{t}}}{{n_{t}}-{k_{t}}}\right)\end{array}\]
(27)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}& \displaystyle =& \displaystyle \frac{1}{\left(\genfrac{}{}{0.0pt}{}{{N_{t}}}{{n_{t}}}\right)}\left({\sum \limits_{j=0}^{{n_{t}}}}\left(\genfrac{}{}{0.0pt}{}{{N_{t-1}}-1+j}{j}\right)-{\sum \limits_{j=0}^{{n_{t}}-{x_{t}}-1}}\left(\genfrac{}{}{0.0pt}{}{{N_{t-1}}-1+j}{j}\right)\right)\\ {} & \displaystyle =& \displaystyle \frac{1}{\left(\genfrac{}{}{0.0pt}{}{{N_{t}}}{{n_{t}}}\right)}\left(\left(\genfrac{}{}{0.0pt}{}{{N_{t}}}{{n_{t}}}\right)-\left(\genfrac{}{}{0.0pt}{}{{N_{t}}-{x_{t}}-1}{{n_{t}}-{x_{t}}-1}\right)\right)\end{array}\]
(28)
\[ \ge 1-{\left(\frac{{N_{t}}-{x_{t}}-1}{{N_{t}}}\right)^{{N_{t-1}}}},\]
where we have used (11) at (26), (6) at (27), and $({N_{t}}-{x_{t}}-1-i)/({N_{t}}-i)\le ({N_{t}}-{x_{t}}-1)/{N_{t}}$ for all $0\le i\le {N_{t-1}}-1$ at (28). We therefore have
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle \underset{t\to \infty }{\liminf }\hspace{0.1667em}\mathbb{P}({X_{t}}\le \mathbb{E}({X_{t}}))& \displaystyle \ge & \displaystyle 1-\underset{t\to \infty }{\lim }{\left(\frac{{N_{t}}-\mathbb{E}({X_{t}})-1}{{N_{t}}}\right)^{{N_{t-1}}}}\\ {} & \displaystyle =& \displaystyle 1-\underset{t\to \infty }{\lim }{\left(\frac{{N_{t}}-{n_{t}}(1+o(1))/{N_{t-1}}-1}{{N_{t}}}\right)^{{N_{t-1}}}}\\ {} & \displaystyle =& \displaystyle 1-\underset{t\to \infty }{\lim }{\left(1-(1+o(1))/{N_{t-1}}\right)^{{N_{t-1}}}}\\ {} & \displaystyle =& \displaystyle 1-{e^{-1}}\gt 1/2,\end{array}\]
contradicting (25).

5 Discussion
There is scope for improving these results. The CRP we have defined places all ${n_{t}}$ elements labelled t at the same time; placing them sequentially seems to be a difficult problem. It would be interesting to obtain better estimates for $\mathbb{E}({K_{t}})$ and $\mathrm{Var}({K_{t}})$, especially when $\theta \ne 1$. Perhaps the hypotheses of Theorem 4 could be weakened.
In the case of random permutations, there is a Poisson approximation of ${K_{n}}$. The total variation distance between the law $\mathcal{L}({K_{n}})$ of ${K_{n}}$ and the $\mathrm{Poisson}(\mathbb{E}({K_{n}}))$ distribution is of order
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {d_{\mathrm{TV}}}(\mathcal{L}({K_{n}}),\mathrm{Poisson}(\mathbb{E}({K_{n}})))& \displaystyle :=& \displaystyle \frac{1}{2}{\sum \limits_{k=1}^{n}}\left|\mathbb{P}({K_{n}}=k)-\exp \left(-\mathbb{E}({K_{n}})\right)\frac{\mathbb{E}{({K_{n}})^{k}}}{k!}\right|\\ {} & \displaystyle \asymp & \displaystyle \frac{1}{\log n};\end{array}\]
see [2]. By considering the process of cycle counts $({C_{1}}(n),{C_{2}}(n),\dots ,{C_{n}}(n))$, where ${C_{i}}(n)$ is the number of cycles of size i, we may write ${K_{n}}={\textstyle\sum _{i=1}^{n}}{C_{i}}(n)$. Moreover, [7] shows that ${B_{n}}(\cdot )\stackrel{d}{\to }B(\cdot )$, where
\[ {B_{n}}(t):=\frac{{\textstyle\textstyle\sum _{i=1}^{\lfloor {n^{t}}\rfloor }}{C_{i}}(n)-t\log n}{\sqrt{\log n}},\hspace{1em}0\le t\le 1,\]
and $B(t)$ is standard Brownian motion. If suitable approximations could be made to the process of cycle counts for multiset permutations, then similar progress might be made for the results obtained in this paper when $\theta =1$.