Probabilistic analysis of vantage point trees

Bohun, Vladyslav

doi:10.15559/21-VMSTA188

Abstract

Probabilistic properties of vantage point trees are studied. A vp-tree built from a sequence of independent identically distributed points in ${[-1,\hspace{0.1667em}1]^{d}}$ with the ${\ell _{\infty }}$-distance function is considered. The length of the leftmost path in the tree, as well as partitions over the space it produces are analyzed. The results include several convergence theorems regarding these characteristics, as the number of nodes in the tree tends to infinity.

1 Introduction

A vantage point tree (vp-tree) is a data structure for fast executing of nearest neighbor search queries in a given metric space. This class of trees was first introduced in 1993 in [17], and is widely used since then. It is not the only class of trees used for nearest neighbor search, other famous examples being kd-trees [3, 12], ball trees [14] as well as many other data structures.

Nearest neighbor search is used nowadays in a wide range of tasks. A simple example of day-to-day usage is geographic search of objects close to a specific location. The nearest points search is used in computational geometry, for example, for identifying intersections, which is a routine task in computer graphics or physics simulations. Last but not least example we mention here is a similarity search, such as search for similar images or similar articles.

Another major field, where the nearest neighbor search is an important tool, is machine learning. More precisely, it is used in variety of regression models. We refer the reader to [5] and [7] for a nice introduction to such models. We also refer to [6], which is a good exposition of ideas behind solving classification problem with the nearest neighbors search. These models proved useful in areas of classification such as text categorization, multimedia categorization and search, pattern recognition, information retrieval and many others. Recent studies have started to combine the nearest neighbor search and neural networks for better and more stable results, see, for example, [15].

Although the fast nearest neighbor search is important by itself, as well as an important ingredient in many algorithms, it and the corresponding data structures are not fully understood from the mathematical viewpoint. Most of the results describing the structure of these trees were obtained using empirical methods. For example, in [9] or [16] vp-trees and their heuristic variations were statistically investigated on specific datasets. Another approach is to compare different tree structures, for example, vp-trees and kd-trees, and collate their characteristics on the same datasets, see [11, 13]. In this paper we use a probabilistic approach to analyze a vp-tree and some related characteristics, as the number of nodes in the tree goes to infinity.

The structure of this paper is the following. In Section 2 we describe a vp-tree model and introduce characteristics that we are going to study. Section 3 contains our main results. In Section 4 we specialize some of the results to a number of partial cases and analyze them in more details.

2 Random vantage point tree model

Let $(\mathcal{X},\text{dist})$ be an arbitrary metric space. A vp-tree is a binary tree in which each node T stores two values: an element ${x_{T}}\in \mathcal{X}$ called a vantage point and a threshold ${\tau _{T}}>0$. One of the most common ways to construct a vp-tree is to do this in a recursive manner from a sequence of elements ${x_{1}},{x_{2}},\dots \hspace{0.1667em}$. The tree is empty at the beginning, the first element ${x_{1}}$ is always stored in its root and some threshold τ is assigned. In order to insert the n-th element ${x_{n}}$ the following recursive procedure is executed. We start from node T which is a root of the tree. If $\text{dist}({x_{T}},{x_{n}})\le \tau $ then we proceed recursively by adding a new element to the T-th left son, otherwise we proceed into the right son. When on some step the node is empty, then the value ${x_{n}}$ is stored in it and some threshold is assigned.

In this article we study a special class of random vp-trees. More precisely we consider the metric space $({[-1,1]^{d}},\text{dist})$ ($d\ge 1$) with a max-distance function

(1)

\[ \text{dist}(x,y)=\underset{1\le i\le d}{\max }|{x_{i}}-{y_{i}}|.\]

We also need to specify the choice of the thresholds. To this end, fix a parameter $\tau \in (0,1)$. The threshold of each node T is given by the rule ${\tau _{T}}={\tau ^{h}}$, where h is the distance from T to the root of the tree plus 1. Another possible option is to pick h as the number of “left turns” in the path from the root to T plus 1. Let us emphasize that such a choice of thresholds is quite natural. If one wants to assign thresholds as mean distances then they will also follow the exponential pattern.

We define the leftmost path of a vp-tree (or of an arbitrary binary tree) as a path which starts in the root and always goes into the left son of the current node till reaching a leaf. Note that for both aforementioned rules of picking a threshold, the value h is the same for elements of the leftmost path.

The sequence of elements ${x_{1}},{x_{2}},\dots \hspace{0.1667em}$ is a sequence of independent uniformly distributed points on ${[-1,1]^{d}}$ and this makes the vp-tree, that we construct, random.

Each node in the leftmost path can be enumerated starting from 1. To understand how leftmost path is formed and evolves let us associate with each node ${T_{h}}$ ($h\ge 1$) in this path a part of space ${I_{h}}\subset \mathcal{X}$, which contains all points x that, if added to the tree would be saved in left subtree of ${T_{h}}$. It is obvious that ${I_{h}}\subset {I_{h-1}}$ and, furthermore, a point x is saved in the left subtree of ${T_{h}}$ if $\text{dist}(x,{x_{{T_{h}}}})\le {\tau ^{h}}$. This gives

(2)

\[ {I_{h}}={I_{h-1}}\cap {B_{{\tau ^{h}}}}({x_{{T_{h}}}}),\hspace{1em}h\ge 1,\]

where ${B_{{\tau ^{h}}}}({x_{{T_{h}}}})$ is a metric ball of radius ${\tau ^{h}}$ and center ${x_{{T_{h}}}}$, that is,

(3)

\[ {B_{{\tau ^{h}}}}({x_{{T_{h}}}})=\{x\in \mathcal{X}:\text{dist}(x,{x_{{T_{h}}}})\le {\tau ^{h}}\}.\]

Note that by our choice of the metric, $\text{dist}(x,{x_{{T_{h}}}})\le {\tau ^{h}}$ is a d-dimensional cube. The root ${T_{1}}$ of the tree is always a part of the leftmost path, hence we can naturally put ${I_{0}}:=\mathcal{X}={[-1,1]^{d}}$ which is also a cube. From (2) we conclude that, for all $h\ge 1$, ${I_{h}}$ is an intersection of a rectangle and a cube, and hence is a rectangle itself with edges being parallel to coordinate axes. For such figures it is enough (up to parallel translations) to store only information about their sizes in each dimension which we denote by ${I_{h}^{(i)}}$, $1\le i\le d$.

Let ${T_{h-1}}$ be currently the last node in the leftmost path. Then the new node ${T_{h}}$ will be added to the left of ${T_{h-1}}$ if a new point ${x_{n}}$ falls inside ${I_{h-1}}$. From the properties of the uniform distribution, conditionally on ${x_{n}}\in {I_{h-1}}$, ${x_{n}}$ has the uniform distribution on ${I_{h-1}}$. Taking into account that ${I_{h-1}}$ is a rectangle, we also obtain that ${x_{n}}$’s position inside ${I_{h-1}}$ has the uniform distribution along every dimension $i=1,\dots ,d$. Thus, we can write d recursive equations in each dimension for the size of each side of ${I_{h}}$, which is just a size of an intersection of two segments: a segment of length ${I_{h-1}^{(i)}}$ and a segment of radius ${\tau ^{h}}$ and a center distributed uniformly inside the first segment. We compute the size ${I_{h}^{(i)}}$ as the distance between right and left endpoints of the intersection. Let ${u_{h}^{(i)}}$, $h\ge 1$, $1\le i\le d$, be a family of independent uniformly distributed random variables on $[0,1]$. We have, for $h\ge 1$, $1\le i\le d$,

(4)

\[\begin{aligned}{}{I_{h}^{(i)}}& =\min \{{I_{h-1}^{(i)}},{u_{h}^{(i)}}{I_{h-1}^{(i)}}+{\tau ^{h}}\}-\max \{0,{u_{h}^{(i)}}{I_{h-1}^{(i)}}-{\tau ^{h}}\}\\ {} & =({u_{h}^{(i)}}{I_{h-1}^{(i)}}-\max \{0,{u_{h}^{(i)}}{I_{h-1}^{(i)}}-{\tau ^{h}}\})\\ {} & \hspace{56.9055pt}+(\min \{{I_{h-1}^{(i)}},{u_{h}^{(i)}}{I_{h-1}^{(i)}}+{\tau ^{h}}\}-{u_{h}^{(i)}}{I_{h-1}^{(i)}})\\ {} & =({u_{h}^{(i)}}{I_{h-1}^{(i)}}+\min \{0,{\tau ^{h}}-{u_{h}^{(i)}}{I_{h-1}^{(i)}}\})\\ {} & \hspace{56.9055pt}+(\min \{{I_{h-1}^{(i)}},{u_{h}^{(i)}}{I_{h-1}^{(i)}}+{\tau ^{h}}\}-{u_{h}^{(i)}}{I_{h-1}^{(i)}})\\ {} & =\min \{{u_{h}^{(i)}}{I_{h-1}^{(i)}},{\tau ^{h}}\}+\min \{(1-{u_{h}^{(i)}}){I_{h-1}^{(i)}},{\tau ^{h}}\}.\end{aligned}\]

Making the substitution ${X_{h}^{(i)}}:={I_{h}^{(i)}}/{\tau ^{h}}$ we obtain the basic recursive formula describing evolution of ${I_{h}}$ in max-distance case:

(5)

\[ {X_{h}^{(i)}}=\min \left\{\frac{{u_{h}^{(i)}}{X_{h-1}^{(i)}}}{\tau },1\right\}+\min \left\{\frac{(1-{u_{h}^{(i)}}){X_{h-1}^{(i)}}}{\tau },1\right\}.\]

3 Convergence of the length of the leftmost path

3.1 The analysis of the process $({X_{h}})$

The process (5) defines sizes of ${X_{h}}={I_{h}}/{\tau ^{h}}$ in different dimensions. ${({X_{h}})_{h\ge 0}}$ can be considered as a Markov chain over rectangles (with omitting their positions). We define by $|{X_{h}}|$ (or $|{I_{h}}|$) a Lebesgue measure of a corresponding rectangle, which can be simply computed knowing sizes of each edge

(6)

\[ |{X_{h}}|={\prod \limits_{i=1}^{d}}{X_{h}^{(i)}}.\]

Proposition 3.1.

The Markov chain (5) satisfies

(7)

\[ 1\le {X_{h}^{(i)}}\le 2,\]

for $h\ge 0$ and $1\le i\le d$.

Proof.

Let us fix i and prove the statement on h by using induction. For a random vp-tree ${I_{0}}=\mathcal{X}={B_{1}}(0)={[-1,1]^{d}}$, hence ${X_{0}^{(i)}}={I_{0}^{(i)}}=2$ and the statement holds. Since $\min \{a,b\}\le b$ from (5) we have

(8)

\[ {X_{h}^{(i)}}\le 1+1=2.\]

To show that ${X_{h}^{(i)}}\ge 1$ we first note that if any of the minimums in (5) takes value 1, then the statement holds. On the other hand, if both minimums are less than 1, then

(9)

\[ {X_{h}^{(i)}}=\frac{{u_{h}^{(i)}}{X_{h-1}^{(i)}}}{\tau }+\frac{(1-{u_{h}^{(i)}}){X_{h-1}^{(i)}}}{\tau }=\frac{{X_{h-1}^{(i)}}}{\tau }\ge {X_{h-1}^{(i)}}\ge 1\]

by the inductive assumption, and since $\tau \in (0,1)$. □

Let us denote by S a set of all possible values the chain $({X_{h}})$ can visit provided that ${X_{0}}={B_{1}}(0)$. Note that S is uncountable.

We shall use the notation of stochastic ordering which we recall for convenience.

Definition 3.2.

For two random variables X and Y we say that Y stochastically dominates X and write $X\stackrel{st}{\le }Y$ if

(10)

\[ \mathbb{P}\{X\le t\}\ge \mathbb{P}\{Y\le t\}\hspace{1em}\text{for all}\hspace{0.2778em}\hspace{0.2778em}t\in \mathbb{R}.\]

The key ingredient for the subsequent analysis is the following theorem which says that the chain ${({X_{h}})_{h\ge 0}}$ visits the state ${B_{1}}(0)$ relatively often.

Theorem 3.3.

Fix $\alpha \in S$ and put

\[ {R_{\alpha }}:=\inf \{h\ge 1:{X_{h}}={B_{1}}(0)\textit{given that}\hspace{2.5pt}{X_{0}}=\alpha \}.\]

Then $\mathbb{E}{R_{\alpha }}<\infty $.

Remark 3.4.

We do believe that the counterpart of this theorem holds in arbitrary metric on ${[-1,\hspace{0.1667em}1]^{d}}$, however we have not been able to prove it in such generality.

Proof of Theorem 3.3.

We rewrite (2) as follows:

(11)

\[ {X_{h+1}}=\frac{{X_{h}}}{\tau }\bigcap {B_{1}}({u_{h+1}^{\ast }}),\]

where ${u_{h+1}^{\ast }}$ is uniformly distributed inside ${X_{h}}/\tau $.

Since the positions of rectangles are not important, the recursive equation (11) can be also rewritten as

(12)

\[ {X_{h+1}}=\frac{{X_{h}}-{u_{h+1}}}{\tau }\bigcap {B_{1}}(0),\]

where ${u_{h+1}}$ is uniformly distributed in ${X_{h}}$.

From Proposition 3.1 it follows that for any state ${X_{h}}$ there is a d-dimensional square with sides 1 inside ${X_{h}}$, or in other words there is always a point ${\hat{x}_{h}}$ such that ${B_{\delta }}({\hat{x}_{h}})\subset {X_{h}}$ with $\delta =1/2$.

Let us assume that point ${u_{h+1}}$ is chosen inside ${B_{\delta }}({\hat{x}_{h}})$ and denote

(13)

\[ \Delta =\text{dist}({u_{h+1}},{\hat{x}_{h}})\le \delta .\]

From the triangle inequality we obtain

(14)

\[ \underset{y\in \partial {B_{\delta }}({\hat{x}_{h}})}{\max }\text{dist}({u_{h+1}},y)\le \Delta +\delta ,\]

(15)

\[ \underset{y\in \partial {B_{\delta }}({\hat{x}_{h}})}{\min }\text{dist}({u_{h+1}},y)\ge \delta -\Delta ,\]

where $\partial B$ defines the boundary of the set B.

Then, in view of (12), we have

(16)

\[ \underset{y\in \partial \left(({B_{\delta }}({\hat{x}_{h}})-{u_{h+1}})/\tau \right)}{\max }\text{dist}(0,y)\le \frac{\Delta +\delta }{\tau },\]

and

(17)

\[ \underset{y\in \partial \left(({B_{\delta }}({\hat{x}_{h}})-{u_{h+1}})/\tau \right)}{\min }\text{dist}(0,y)\ge \frac{\delta -\Delta }{\tau }.\]

From (16) we have our first property that if $\frac{\Delta +\delta }{\tau }\le 1$ (or equally $\Delta \le \tau -\delta $), then ${B_{\delta /\tau }}((\hat{{x_{h}}}-{u_{h+1}})/\tau )\subset {X_{h+1}}$. From (17) we have our second property that if $\frac{\delta -\Delta }{\tau }\ge 1$ (or equally $\Delta \le \delta -\tau $), then ${X_{h+1}}={B_{1}}(0)$.

Let us now fix a constant $k=\min \{k\in \mathbb{N}:\frac{1+\tau }{2}{\tau ^{k-1}}\le \frac{1}{2}\}$. Since ${B_{1/2}}({\hat{x}_{h}})\subset {X_{h}}$, we have ${B_{\frac{1+\tau }{2}{\tau ^{k-1}}}}({\hat{x}_{h}})\subset {X_{h}}$. Since $k\ge 2$, it holds $\frac{1+\tau }{2}{\tau ^{k-1}}<\tau $. Therefore, if ${u_{h+1}}\in {B_{{\varepsilon _{1}}}}({\hat{x}_{h}})$ with ${\varepsilon _{1}}=\tau -\frac{1+\tau }{2}{\tau ^{k-1}}$, which happens with a positive probability, we will obtain ${B_{\frac{1+\tau }{2}{\tau ^{k-2}}}}({\hat{x}_{h+1}})\subset {X_{h+1}}$ from the first property. Repeating this $k-1$ times in total, where ${\varepsilon _{i}}=\tau -\frac{1+\tau }{2}{\tau ^{k-i}}>0$, for $1\le i\le k-1$, we obtain that ${B_{\frac{1+\tau }{2}}}({\hat{x}_{h+k-1}})\subset {X_{h+k-1}}$ and $\frac{1+\tau }{2}>\tau $. Thus, if ${u_{h+k}}\in {B_{{\varepsilon _{k}}}}({\hat{x}_{h+k-1}})$, which happens with a positive probability, where ${\varepsilon _{k}}=\frac{1+\tau }{2}-\tau $, we get that ${X_{h+k}}={B_{1}}(0)$ from the second property. Summarizing, we see that from an arbitrary state $\alpha \in S$ the chain jumps in k steps to the state ${B_{1}}(0)$ with a positive probability which is independent of the starting state α. Note that k also only depends on τ.

We split the evolution of the chain into blocks of length k. The state at the end of each block can be ${B_{1}}(0)$ with a positive probability, say p, obtained from the described scheme. We denote by g a number of the first block which ends up in the state ${B_{1}}(0)$. Then since actual probability to end up in the state ${B_{1}}(0)$ at the end of the block is not less than p we have, for any $t\ge 0$,

(18)

\[ \mathbb{P}\{g\le t\}\ge \mathbb{P}\{G\le t\},\]

where G has a geometric distribution with parameter p (number of trials). Or in terms of stochastic order

(19)

\[ g\stackrel{st}{\le }G.\]

Similarly, since the moment ${R_{\alpha }}$ of the first visit to state ${B_{1}}(0)$ (after starting state) can occur in the middle of each block, we can write

(20)

\[ {R_{\alpha }}\stackrel{st}{\le }kg.\]

From the properties of stochastic order, inequality also holds for expectations, hence combining last two comparisons we have

(21)

\[ \mathbb{E}{R_{\alpha }}\le k\mathbb{E}G=\frac{k}{p}<\infty .\]

□

Remark 3.5.

We note that in the proof of Theorem 3.3 we did not significantly use that ${u_{h}}$ has the uniform distribution, and it works for any absolutely continuous distribution, since probability for ${u_{h}}$ to fall inside some ${B_{\varepsilon }}({\hat{x}_{h}})$ is positive and larger than some fixed positive constant.

Remark 3.6.

The process (11) in particular case $\tau =1$ is called the diminishing process of Balint Toth. This process was studied in various partial cases including one-dimensional [1] and multi-dimensional case [10] with ${B_{1}}(0)={[-1,1]^{d}}$, as well as in other metrics [2, 10]. However evolution of this process with $\tau <1$ differs a lot and requires an independent study, whilst methods used for $\tau =1$ cannot be applied to treat the case $\tau <1$.

From Theorem 3.3 it follows that the chain ${({X_{h}})_{h\ge 0}}$ evolves as follows. There is a special state ${B_{1}}(0)$, which the chain visits infinitely often and time to return to this state is a.s. finite and, moreover, has a finite mean. Thus, the whole evolution of ${X_{h}}$ can be split into independent cycles between visiting the state ${B_{1}}(0)$.

In order to proceed we need to recall a notion of the Harris chains. For information about Harris chains and examples we refer the reader to the book [8, Chapter 6]. We will only present here the required definitions.

Definition 3.7.

A Markov chain ${({X_{h}})_{h\ge 0}}$ on a state space S is a Harris chain if there are two sets $A,B\subset S$, a positive function $q(x,y)\ge \varepsilon >0$ for $x\in A$, $y\in B$, and a probability measure ρ concentrated on B such that the following two conditions hold:

1. $\mathbb{P}\{\inf \{h\ge 0:{X_{h}}\in A\}<\infty \}>0$ for all possible starting states ${X_{0}}\in S$;
2. if $x\in A$ and $C\subset B$, then $\mathbb{P}\{{X_{h+1}}\in C|{X_{h}}=x\}\ge {\textstyle\int _{C}}q(x,y)\rho (dy)$.

Definition 3.8.

A Harris chain is called recurrent if there is a state α such that $\mathbb{P}\{\inf \{h\ge 1:{X_{h}}=\alpha \}<\infty \}=1$ when ${X_{0}}=\alpha $.

Definition 3.9.

A recurrent Harris chain is aperiodic if the greatest common divisor of set $\{h\ge 1:\mathbb{P}\{{X_{h}}=\alpha |{X_{0}}=\alpha \}>0\}$ is 1.

Proposition 3.10.

${({X_{h}})_{h\ge 0}}$ is an aperiodic recurrent Harris chain.

Proof.

Let us first show that ${({X_{h}})_{h\ge 0}}$ is a Harris chain. Put $A=B=\{{B_{1}}(0)\}$. From Theorem 3.3 it follows that $\mathbb{P}\{{R_{{X_{0}}}}<\infty \}=1$. Since $\inf \{h\ge 0:{X_{h}}\in A\}<\infty \}\le {R_{{X_{0}}}}$ for all possible starting states, the first condition holds. For the second part of the definition we put $q(x,y):=p:=\mathbb{P}\{{X_{h+1}}={B_{1}}(0)|{X_{h}}={B_{1}}(0)\}$. From the proof of Theorem 3.3, more precisely from the property (17), it follows that $p>0$.

To show recurrence of the chain we put $\alpha ={B_{1}}(0)$. Then from Theorem 3.3 we have that ${R_{\alpha }}<\infty $ almost surely. For the aperiodicity it is enough to note that $\mathbb{P}\{{X_{1}}=\alpha |{X_{0}}=\alpha \}=p>0$. □

Theorem 3.11.

The Markov chain ${({X_{h}})_{h\ge 0}}$ has a stationary distribution ${I_{\infty }}$ and

(22)

\[ \underset{n\to \infty }{\lim }\| \mathbb{P}\{{X_{h}}=\cdot |{X_{0}}={B_{1}}(0)\}-{I_{\infty }}(\cdot ){\| _{TV}}=0,\]

where $\| \cdot {\| _{TV}}$ denotes the total variance distance.

Proof.

From Proposition 3.10 we know that $({X_{h}})$ is an aperiodic recurrent Harris chain. By Theorem 6.8.5 in [8, p. 323] this chain has a stationary measure. By Theorem 6.8.8 in [8, p. 324] an aperiodic recurrent Harris chain converges to a stationary distribution in the total variation distance as (22) if ${R_{{B_{1}}(0)}}$ is almost surely finite. The latter is secured by Theorem 3.3. □

3.2 Limit theorems for the length of the leftmost path

Using the results on the behavior of the chain $({X_{h}})$ we can analyze the length of the leftmost path of a vp-tree constructed from n random points. Let us introduce the following random walk

(23)

\[ {S_{0}}:=0,\hspace{1em}{S_{k}}:={\sum \limits_{h=0}^{k-1}}{G_{h}}\hspace{1em}k\ge 1,\]

where, given the path of ${({I_{h}})_{h\ge 0}}$, ${({G_{h}})_{h\ge 0}}$ are conditionally independent random variables such that the distribution of ${G_{h}}$ is geometric on $\mathbb{N}$ with parameter $\frac{|{I_{h}}|}{{2^{d}}}$. This sum shows how many points are needed to have the leftmost path of size at least k, since the new node is added to it when a point ${x_{n}}$ falls into area of size $|{I_{h}}|$ which is, given ${({I_{h}})_{h\ge 0}}$, a Bernoulli trial.

The size (number of nodes) of the leftmost path in terms of ${S_{k}}$ can be expressed as

(24)

\[ {L_{n}}=\max \{k\ge 0:{S_{k}}\le n\}.\]

The length of the leftmost path can be simply computed from its size as ${L_{n}}-1$.

Theorem 3.12.

For a random vp-tree the size of the leftmost path ${L_{n}}$ satisfies the following weak law of large numbers, as $n\to \infty $,

(25)

\[ \frac{{L_{n}}}{\log n}\stackrel{\mathbb{P}}{\to }\frac{1}{-\log \tau }.\]

To prove this theorem we need two lemmas. We note that if $X\sim \text{Geom}(a)$ and $Y\sim \text{Geom}(b)$, then

(26)

\[ X\stackrel{st}{\le }Y\Leftrightarrow a\ge b.\]

Lemma 3.13.

For random variables ${G_{k}}\sim \textit{Geom}(\frac{|{I_{k}}|}{{2^{d}}})$ ($k\ge 0$)

(27)

\[ \frac{1}{k}\log {G_{k}}\stackrel{\mathbb{P}}{\to }-\log \tau ,\]

as $k\to \infty $.

Proof.

We compute the distribution function of $\frac{1}{k}\log {G_{k}}$. Note that

(28)

\[ \mathbb{P}\left\{\frac{1}{k}\log {G_{k}}\le x\right\}=\mathbb{E}\left\{\mathbb{P}\left\{\frac{1}{k}\log {G_{k}}\le x\Big|{I_{k}}\right\}\right\},\]

and further

(29)

\[ \mathbb{P}\{{k^{-1}}\log {G_{k}}\le x|{I_{k}}\}=\mathbb{P}\{{G_{k}}\le \exp (kx)|{I_{k}}\}=1-{(1-|{I_{k}}|/{2^{d}})^{\lfloor \exp (kx)\rfloor }}.\]

From Proposition 3.1 we have that

(30)

\[ {\tau ^{k}}\le |{I_{k}}|\le {2^{d}}{\tau ^{k}}.\]

Thus we can bound the probability in (29) by inserting lower and upper bounds of $|{I_{k}}|$. Thus,

\[ 1-{(1-{2^{-d}}{\tau ^{k}})^{\lfloor \exp (kx)\rfloor }}\le \mathbb{P}\{{k^{-1}}\log {G_{k}}\le x|{I_{k}}\}\le 1-{(1-{\tau ^{k}})^{\lfloor \exp (kx)\rfloor }}.\]

Both upper and lower bounds converge, as $k\to \infty $, to

(31)

\[ \left\{\begin{array}{l@{\hskip10.0pt}l}0,\hspace{1em}& \text{if}\hspace{2.5pt}\log \tau +x<0\\ {} 1,\hspace{1em}& \text{if}\hspace{2.5pt}\log \tau +x>0\end{array}\right..\]

Hence by the standard sandwich argument $\mathbb{P}\{{k^{-1}}\log {G_{k}}\le x|{I_{k}}\}$ converges to the same values almost surely. By the dominated convergence theorem, the same holds for the unconditional probability $\mathbb{P}\left\{\frac{1}{k}\log {G_{k}}\le x\right\}$. This finishes the proof. □

Lemma 3.14.

For a random walk ${S_{k}}$ described in (23)

(32)

\[ \frac{1}{k}\log {S_{k}}\stackrel{\mathbb{P}}{\to }-\log \tau \]

as $k\to \infty $.

Proof.

We write the distribution function of ${S_{k}}$ as

(33)

\[ \mathbb{P}\{{S_{k}}\le x\}=\mathbb{E}\{\mathbb{P}\{{S_{k}}\le x|{({I_{h}})_{0\le h<k}}\}\}\]

and study the conditional probability under the expectation.

Since ${S_{k}}\ge {G_{k-1}}$ we have an upper bound

(34)

\[ \mathbb{P}\{{S_{k}}\le x|{({I_{h}})_{0\le h<k}}\}\le \mathbb{P}\{{G_{k-1}}\le x|{({I_{h}})_{0\le h<k}}\},\]

and the inequality holds for expectations, that is, the unconditional probability.

Let us recall that, given ${({I_{h}})_{0\le h\le k}}$, the random variables ${({G_{h}})_{0\le h\le k}}$ are independent. We also have that ${G_{h}}\stackrel{st}{\le }{G_{k}}$ ($0\le h<k$) since the sequence ${(|{I_{h}}|)_{h\ge 0}}$ is nonincreasing. Thus, we have the bound

(35)

\[ {\sum \limits_{h=0}^{k-1}}{G_{h}}\stackrel{st}{\le }{\sum \limits_{h=0}^{k-1}}{G_{k}^{(h)}}\stackrel{st}{\le }k\underset{0\le h<k}{\max }{G_{k}^{(h)}}=k{\hat{G}_{k}}\]

where ${({G_{k}^{(h)}})_{0\le h<k}}$ is a set of independent copies of geometrically distributed random variables with the parameter ${\tau ^{k}}/{2^{d}}$, providing ${G_{k}}\stackrel{st}{\le }{G_{k}^{(1)}}$, and we put ${\hat{G}_{k}}:={\max _{0\le h<k}}{G_{k}^{(h)}}$. Note that the inequality stays for expectations, giving that it holds for unconditional values of ${S_{k}}$.

We compute the distribution function of $\hat{{G_{k}}}$:

(36)

\[ \mathbb{P}\left\{\frac{1}{k}\log {\hat{G}_{k}}\le x\right\}=\mathbb{P}\{{\hat{G}_{k}}\le \exp (kx)\}={\left(1-{(1-{\tau ^{k}}/{2^{d}})^{\lfloor \exp (kx)\rfloor }}\right)^{k}},\]

and therefore, as $k\to \infty $,

(37)

\[ \mathbb{P}\left\{\frac{1}{k}\log {\hat{G}_{k}}\le x\right\}\to \left\{\begin{array}{l@{\hskip10.0pt}l}0,\hspace{1em}& \text{if}\hspace{2.5pt}\log \tau +x<0\\ {} 1,\hspace{1em}& \text{if}\hspace{2.5pt}\log \tau +x\ge 0\end{array}\right..\]

Now we combine (34) and (35) to obtain

(38)

\[ \log {G_{k-1}}\stackrel{st}{\le }\log {S_{k}}\stackrel{st}{\le }\log k+\log {\hat{G}_{k}}.\]

Dividing everything by k we obtain

(39)

\[ \frac{k-1}{k}\frac{\log {G_{k-1}}}{k-1}\stackrel{st}{\le }\frac{\log {S_{k}}}{k}\stackrel{st}{\le }\frac{\log k}{k}+\frac{\log \hat{{G_{k}}}}{k}.\]

From Lemma 3.13 and (37) sending $k\to \infty $ yields

(40)

\[ \frac{1}{k}\log {S_{k}}\stackrel{d}{\to }-\log \tau .\]

Convergence in probability is implied from the fact that the limit is a constant. □

Proof of Theorem 3.12.

We start by computing the distribution function of $\frac{{L_{n}}}{\log n}$ in any point $x>0$ and $x\ne -1/\log \tau $:

(41)

\[\begin{aligned}{}\mathbb{P}\left\{\frac{{L_{n}}}{\log n}>x\right\}& =\mathbb{P}\{{L_{n}}>x\log n\}=\mathbb{P}\{{S_{\lfloor x\log n\rfloor }}\le n\}\\ {} & =\mathbb{P}\{\log {S_{\lfloor x\log n\rfloor }}\le \log n\}=\mathbb{P}\left\{\frac{\log {S_{\lfloor x\log n\rfloor }}}{\log n}\le 1\right\}\\ {} & =\mathbb{P}\left\{\frac{\log {S_{\lfloor x\log n\rfloor }}}{x\log n}\le \frac{1}{x}\right\}.\end{aligned}\]

By Lemma 3.14 we have

(42)

\[ \mathbb{P}\left\{\frac{\log {S_{\lfloor x\log n\rfloor }}}{x\log n}\le \frac{1}{x}\right\}\to \left\{\begin{array}{l@{\hskip10.0pt}l}0,\hspace{1em}& 1/x<-\log \tau \\ {} 1,\hspace{1em}& 1/x>-\log \tau \end{array}\right.,\]

as $k\to \infty $, and therefore

(43)

\[ \mathbb{P}\left\{\frac{{L_{n}}}{\log n}\le x\right\}=1-\mathbb{P}\left\{\frac{{L_{n}}}{\log n}>x\right\}\to \left\{\begin{array}{l@{\hskip10.0pt}l}1,\hspace{1em}& x>-1/\log \tau \\ {} 0,\hspace{1em}& x<-1/\log \tau \end{array}\right..\]

Since the convergence in distribution to a constant implies the convergence in probability to the same constant, the theorem is proved. □

We note that for proving Theorem 3.12 we only used that $|{I_{h}}|/{\tau ^{h}}$ is bounded both from zero and infinity by some absolute constants. The following results demonstrate convergence of ${L_{n}}$ over subsequences. We first state and prove the result for the random walk $({S_{k}})$. Recall that ${I_{\infty }}$ is the stationary distribution of the chain ${({I_{h}}/{\tau ^{h}})_{h\ge 0}}$.

Lemma 3.15.

For a random walk ${S_{k}}$ defined by (23), as $k\to \infty $,

(44)

\[ {\tau ^{k}}{S_{k}}\stackrel{d}{\to }{G_{\infty }},\]

where ${G_{\infty }}$ is a random series ${\textstyle\sum _{j=1}^{\infty }}{\xi _{j}}$. Here ${({\xi _{j}})_{j\ge 0}}$ are conditionally independent, given ${I_{\infty }}$, random variables such that ${\xi _{j}}\sim \textit{Exp}(\frac{|{I_{\infty }}|}{{2^{d}}{\tau ^{j}}})$.

Proof.

Let us write the characteristic function of ${\tau ^{k}}{S_{k}}$:

(45)

\[\begin{aligned}{}\mathbb{E}{e^{it{\tau ^{k}}{S_{k}}}}& =\mathbb{E}\left[\mathbb{E}\left({e^{it{\tau ^{k}}{S_{k}}}}|{({I_{h}})_{0\le h<k}}\right)\right]=\mathbb{E}\left[{\prod \limits_{j=0}^{k-1}}\frac{|{I_{j}}|/{2^{d}}}{{e^{-it{\tau ^{k}}}}-(1-|{I_{j}}|/{2^{d}})}\right]\\ {} =& \mathbb{E}\left[{\prod \limits_{j=0}^{k-1}}\frac{|{I_{j}}|/{2^{d}}}{1-it{\tau ^{k}}+O({\tau ^{2k}})-(1-|{I_{j}}|/{2^{d}})}\right]\\ {} =& \mathbb{E}\left[{\prod \limits_{j=0}^{k-1}}\frac{|{I_{j}}|/{2^{d}}}{|{I_{j}}|/{2^{d}}-it{\tau ^{k}}+O({\tau ^{2k}})}\right]\\ {} =& \mathbb{E}\left[{\prod \limits_{j=0}^{k-1}}\frac{1}{1-{2^{d}}it{\tau ^{k-j}}\frac{{\tau ^{j}}}{|{I_{j}}|}+O(\frac{{\tau ^{2k}}}{|{I_{j}}|})}\right]\\ {} =& \mathbb{E}\left[{\prod \limits_{j=1}^{k}}\frac{1}{1-{2^{d}}it{\tau ^{j}}\frac{{\tau ^{k-j}}}{|{I_{k-j}}|}+O(\frac{{\tau ^{k}}}{|{I_{k-j}}|})O({\tau ^{k}})}\right]\\ {} =& \mathbb{E}\left[{\prod \limits_{j=1}^{M}}\frac{1}{1-{2^{d}}it{\tau ^{j}}\frac{{\tau ^{k-j}}}{|{I_{k-j}}|}+O({\tau ^{k}})}\cdot {\prod \limits_{j=M+1}^{k}}\frac{1}{1-{2^{d}}it{\tau ^{j}}\frac{{\tau ^{k-j}}}{|{I_{k-j}}|}+O({\tau ^{k}})}\right]\\ {} =& \mathbb{E}\left[{A_{k,M}}(t)\cdot {B_{k,M}}(t)\right],\end{aligned}\]

where $1\le M<k$ is a fixed parameter and $O(\frac{{\tau ^{k}}}{|{I_{k-j}}|})=O(1)$ by Proposition 3.1.

From Theorem 3.11 the convergence in total variance implies the convergence in distribution, the convergence of modules is also implied considering specifics of given rectangles. Thus, for every fixed M as $k\to \infty $ we have

(46)

\[ {A_{k,M}}(t)\stackrel{d}{\to }{\prod \limits_{j=1}^{M}}\frac{1}{1-{2^{d}}it{\tau ^{j}}/|{I_{\infty }}|}.\]

Let us show now that, for any $\varepsilon >0$,

(47)

\[ \underset{M\to \infty }{\lim }\underset{k\to \infty }{\limsup }\mathbb{P}\{|{A_{k,M}}(t){B_{k,M}}(t)-{A_{k,M}}(t)|\ge \varepsilon \}=0.\]

By Markov’s inequality

(48)

\[ \mathbb{P}\{|{A_{k,M}}(t){B_{k,M}}(t)-{A_{k,M}}(t)|\ge \varepsilon \}\le \frac{\mathbb{E}|{A_{k,M}}(t)||{B_{k,M}}(t)-1|}{\varepsilon }.\]

At first we want to show that ${B_{k,M}}(t)$ converges to 1 by separately showing that module and argument of the complex value converge. We have

(49)

\[ |{B_{k,M}}(t){|^{2}}={\prod \limits_{j=M+1}^{k}}\frac{1}{{(1+O({\tau ^{k}}))^{2}}+{({2^{d}}t{\tau ^{j}}\frac{{\tau ^{k-j}}}{|{I_{k-j}}|}+O({\tau ^{k}}))^{2}}},\]

(50)

\[ \begin{aligned}{}\log |{B_{k,M}}(t){|^{2}}=& -{\sum \limits_{j=M+1}^{k}}\log \left[{(1+O({\tau ^{k}}))^{2}}+{\left({2^{d}}t{\tau ^{j}}\frac{{\tau ^{k-j}}}{|{I_{k-j}}|}+O({\tau ^{k}})\right)^{2}}\right]\\ {} =& -{\sum \limits_{j=M+1}^{k}}\log \left[1+O({\tau ^{k}})+{2^{2d}}{t^{2}}{\tau ^{2j}}{\left(\frac{{\tau ^{k-j}}}{|{I_{k-j}}|}\right)^{2}}+O({\tau ^{k}})\right]\\ {} & \hspace{-56.9055pt}=-{\sum \limits_{j=M+1}^{k}}\left[{2^{2d}}{t^{2}}{\tau ^{2j}}{\left(\frac{{\tau ^{k-j}}}{|{I_{k-j}}|}\right)^{2}}+O({\tau ^{k}})+O({\tau ^{4j}})+O({\tau ^{2j+k}})+O({\tau ^{2k}})\right]\\ {} =& -{\sum \limits_{j=M+1}^{k}}\left[O({\tau ^{2j}})+O({\tau ^{k}})\right].\end{aligned}\]

The expansion for logarithm holds starting with some large enough M. We also used Proposition 3.1 which shows that ${\tau ^{k-j}}/|{I_{k-j}}|$ is bounded by nonrandom constants, which also do not dependent on $k-j$. We note that for a fixed t, the constants in the remainders $O({\tau ^{2j}})$ and $O({\tau ^{k}})$ are uniform, that is, do not depend on j, k. Thus, we can write

(51)

\[ \underset{M\to \infty }{\lim }\underset{k\to \infty }{\limsup }{\sum \limits_{j=M+1}^{k}}\left[O({\tau ^{2j}})+O({\tau ^{k}})\right]=0\hspace{1em}\text{a.s.}\]

For the argument we write

(52)

\[\begin{aligned}{}\text{arg}{B_{k,M}}(t)& =-{\sum \limits_{j=M+1}^{k}}\text{Arg}\left(1-{2^{d}}it{\tau ^{j}}\frac{{\tau ^{k-j}}}{|{I_{k-j}}|}+O({\tau ^{k}})\right)\\ {} & ={\sum \limits_{j=M+1}^{k}}\left[\frac{{2^{d}}t{\tau ^{j}}\frac{{\tau ^{k-j}}}{|{I_{k-j}}|}}{\sqrt{1+{2^{2d}}{t^{2}}{\tau ^{2j}}{\left(\frac{{\tau ^{k-j}}}{|{I_{k-j}}|}\right)^{2}}}}+O({\tau ^{3j}})+O({\tau ^{k}})\right]\\ {} & ={\sum \limits_{j=M+1}^{k}}\left[O({\tau ^{j}})+O({\tau ^{k}})\right].\end{aligned}\]

The expansion for the principal argument holds starting from some large enough M. Arguing as before we can write

(53)

\[ \underset{M\to \infty }{\lim }\underset{k\to \infty }{\limsup }\text{arg}{B_{k,M}}(t)=0\hspace{1em}\text{a.s.}\]

We also showed in process that starting from some large enough k and M the quantity $|{B_{k,M}}(t)|$ is bounded by a nonrandom constant $\overline{B}>0$ on every stochastic path. Since $|{A_{k,M}}(t){B_{k,M}}(t)|=1$, then $|{A_{k,M}}(t)|$ is also bounded by some constant $\overline{A}>0$. Thus, we can write

(54)

\[ \frac{\mathbb{E}|{A_{k,M}}(t)||{B_{k,M}}(t)-1|}{\varepsilon }\le \frac{\overline{A}\mathbb{E}|{B_{k,M}}(t)-1|}{\varepsilon }.\]

We already showed that

\[ \underset{M\to \infty }{\lim }\underset{k\to \infty }{\limsup }|{B_{k,M}}(t)-1|=0\hspace{1em}\text{a.s.}\]

Since $|{B_{k,M}}(t)-1|\le 1+\overline{B}$, then by the dominated convergence theorem

(55)

\[ \underset{M\to \infty }{\lim }\underset{k\to \infty }{\lim }\mathbb{E}|{B_{k,M}}(t)-1|=0.\]

Combining pieces together we obtain (47).

By Theorem 3.2 in [4, p. 28] we have that as $k\to \infty $

(56)

\[ {e^{it{\tau ^{k}}{S_{k}}}}\stackrel{d}{\to }\underset{M\to \infty }{\lim }{\prod \limits_{j=1}^{M}}\frac{1}{1-{2^{d}}it{\tau ^{j}}/|{I_{\infty }}|}={\prod \limits_{j=1}^{\infty }}\frac{1}{1-{2^{d}}it{\tau ^{j}}/|{I_{\infty }}|}.\]

Elements of the sequence ${({e^{it{\tau ^{k}}{S_{k}}}})_{k\ge 0}}$ are uniformly integrable as characteristic functions, so expectations also converge as $k\to \infty $,

(57)

\[ \mathbb{E}{e^{it{\tau ^{k}}{S_{k}}}}\to \mathbb{E}{\prod \limits_{j=1}^{\infty }}\frac{1}{1-{2^{d}}it{\tau ^{j}}/|{I_{\infty }}|},\]

and the limiting value here is a characteristic function of ${G_{\infty }}$ which finishes the proof. □

Theorem 3.16.

For any fixed $T>0$,

(58)

\[ {L_{\lfloor T{\tau ^{-n}}\rfloor }}-n\stackrel{d}{\to }{L_{\infty }^{(T)}}\]

as $n\to \infty $ and the distribution function of ${L_{\infty }^{(T)}}$ is given by

(59)

\[ \mathbb{P}\{{L_{\infty }^{(T)}}\le m\}=1-\mathbb{P}\{{G_{\infty }}\le T{\tau ^{m}}\}\]

for any $m\in \mathbb{Z}$ with ${G_{\infty }}$ defined in Lemma 3.15.

Proof.

Since ${L_{\lfloor T{\tau ^{-n}}\rfloor }}-n$ takes values only on $\mathbb{Z}$, we study the distribution function only over integer values. For fixed $T>0$ and $m\in \mathbb{Z}$ we have

(60)

\[\begin{aligned}{}\mathbb{P}\{{L_{\lfloor T{\tau ^{-n}}\rfloor }}-n>m\}& =\mathbb{P}\{{L_{\lfloor T{\tau ^{-n}}\rfloor }}>n+m\}=\mathbb{P}\{{S_{n+m}}\le \lfloor T{\tau ^{-n}}\rfloor \}\\ {} & =\mathbb{P}\{{\tau ^{n+m}}{S_{n+m}}\le {\tau ^{n+m}}\lfloor T{\tau ^{-n}}\rfloor \}.\end{aligned}\]

From Lemma 3.15 and the fact that ${\tau ^{n}}\lfloor T{\tau ^{-n}}\rfloor \to T$ as $n\to \infty $ we obtain

(61)

\[ \underset{n\to \infty }{\lim }\mathbb{P}\{{\tau ^{n+m}}{S_{n+m}}\le {\tau ^{n+m}}\lfloor T{\tau ^{-n}}\rfloor \}=\mathbb{P}\{{G_{\infty }}\le T{\tau ^{m}}\},\]

which finishes the proof. □

4 One-dimensional convergence

In this section we analyze the random recursive equation (5). Sending $h\to \infty $ in (5) we formally obtain the stochastic fixed-point equation for the limit ${X_{\infty }}$ of ${X_{h}^{(i)}}$:

(62)

\[ {X_{\infty }}\stackrel{d}{=}\min \{{\tau ^{-1}}u{X_{\infty }},1\}+\min \{{\tau ^{-1}}(1-u){X_{\infty }},1\},\]

where u is a uniformly distributed random variable on $[0,1]$. Let us show, using the contraction method, that ${X_{h}^{(i)}}$ converges to ${X_{\infty }}$, as $h\to \infty $.

Theorem 4.1.

The function

(63)

\[ f(X,u)=\mathrm{Law}\left(\min \left\{\frac{uX}{\tau },1\right\}+\min \left\{\frac{(1-u)X}{\tau },1\right\}\right)\]

is a strict contraction in the space of probability measures on $[1,2]$ endowed with the ${\ell _{1}}$-minimal metric, where u is a uniformly distributed random variable on $[0,1]$ independent of a random variable X taking values in $[1,\hspace{0.1667em}2]$.

Definition 4.2.

The ${\ell _{1}}$-minimal metric between two random variables X and Y is given by

(64)

\[ {\ell _{1}}(X,Y)=\inf \mathbb{E}|\hat{X}-\hat{Y}|,\]

where the infimum is taken over all pairs of random variables $(\hat{X},\hat{Y})$ such that $\hat{X}\sim X$ and $\hat{Y}\sim Y$.

Proof of Theorem 4.1.

Similarly to the proof of Proposition 3.1 it can be checked that if X takes values in $[1,2]$, the same holds also for $f(X,u)$.

We prove the contraction property in two steps.

Step 1. We compute the ${\ell _{1}}$-minimal distance between $f(x,{u_{1}})$ and $f(y,{u_{2}})$ for fixed nonrandom numbers $x,y\in [1,\hspace{0.1667em}2]$. We have

(65)

\[ {\ell _{1}}(f(x,u),f(y,u))=\inf \mathbb{E}|f(x,{u_{1}})-f(x,{u_{2}})|.\]

Since the infimum is taken over all possible dependencies between ${u_{1}}$ and ${u_{2}}$, we may pick ${u_{1}}={u_{2}}=u$. Then

(66)

\[ {\ell _{1}}(f(x,{u_{1}}),f(y,{u_{2}}))\le {\int _{[0,1]}}|f(x,u)-f(y,u)|\mathrm{d}u.\]

Let us assume for the time being that there is $\alpha \in (0,1)$ such that

(67)

\[ {\int _{[0,1]}}|f(x,u)-f(y,u)|\mathrm{d}u\le \alpha |x-y|.\]

Step 2. We compute the ${\ell _{1}}$-minimal distance between $f(X,{u_{1}})$ and $f(Y,{u_{2}})$. Combining the above estimates, we obtain

(68)

\[ {\ell _{1}}(f(X,{u_{1}}),f(Y,{u_{2}}))\le \alpha \mathbb{E}|\widehat{X}-\widehat{Y}|,\]

for arbitrary pair $(\widehat{X},\widehat{Y})$ such that $\widehat{X}\sim X$ and $\widehat{Y}\sim Y$. Passing to the infimum over all such pairs, we obtain

(69)

\[ {\ell _{1}}(f(X,{u_{1}}),f(Y,{u_{2}}))\le \alpha {\ell _{1}}(X,Y)\]

for some constant $\alpha \in (0,1)$, which is the definition of contraction.

It remains to prove (67). At first we divide f into sum of two functions.

(70)

\[ f(x,u)={f_{1}}(x,u)+{f_{2}}(x,u),\]

where

(71)

\[ {f_{1}}(x,u)=\min \left\{\frac{uX}{\tau },1\right\}=\left\{\begin{array}{l@{\hskip10.0pt}l}\frac{ux}{\tau },\hspace{1em}& u<\frac{\tau }{x}\\ {} 1,\hspace{1em}& u\ge \frac{\tau }{x}\end{array}\right.\]

and ${f_{2}}(x,u)=\min \{\frac{(1-u)X}{\tau },1\}$.

We compute the following integral under the condition $x\le y$:

(72)

\[\begin{aligned}{}{\int _{0}^{1}}|{f_{1}}(x,u)-{f_{1}}(y,u)|du& ={\int _{0}^{\tau /y}}\left(\frac{uy}{\tau }-\frac{ux}{\tau }\right)du\\ {} & \hspace{1em}+{\int _{\tau /y}^{\tau /x}}\left(1-\frac{ux}{\tau }\right)du+{\int _{\tau /x}^{1}}(1-1)du\\ {} & ={\int _{0}^{\tau /y}}\frac{u}{\tau }(y-x)du+{\int _{\tau /y}^{\tau /x}}du-{\int _{\tau /y}^{\tau /x}}\frac{x}{\tau }du\\ {} & =\frac{y-x}{\tau }\frac{{u^{2}}}{2}{\Big|_{0}^{\tau /y}}+\left(\frac{\tau }{x}-\frac{\tau }{y}\right)-\frac{x}{\tau }\frac{{u^{2}}}{2}{\Big|_{\tau /y}^{\tau /x}}\\ {} & =\frac{y-x}{2\tau }\frac{{\tau ^{2}}}{{y^{2}}}+\left(\frac{\tau }{x}-\frac{\tau }{y}\right)-\frac{x}{2\tau }\left(\frac{{\tau ^{2}}}{{x^{2}}}-\frac{{\tau ^{2}}}{{y^{2}}}\right)\\ {} & =\frac{\tau (y-x)}{2{y^{2}}}+\frac{\tau }{x}-\frac{\tau }{y}-\frac{\tau }{2x}+\frac{\tau x}{2{y^{2}}}\\ {} & =\frac{\tau x(y-x)+2\tau {y^{2}}-2\tau xy-\tau {y^{2}}+\tau {x^{2}}}{2x{y^{2}}}\\ {} & =\frac{-\tau xy+\tau {y^{2}}}{2xy}=\frac{\tau }{2}\frac{y-x}{xy}.\end{aligned}\]

Similarly if $y\le x$ we obtain that

(73)

\[ {\int _{0}^{1}}|{f_{1}}(x,u)-{f_{1}}(y,u)|du\le \frac{\tau }{2}\frac{x-y}{xy},\]

which results in

(74)

\[ {\int _{0}^{1}}|{f_{1}}(x,u)-{f_{1}}(y,u)|du\le \frac{\tau }{2}\frac{|x-y|}{xy}.\]

Now we estimate the same integral for ${f_{2}}$, where

(75)

\[ {f_{2}}(x,u)=\left\{\begin{array}{l@{\hskip10.0pt}l}1,\hspace{1em}& 1-u\ge \frac{\tau }{x}\\ {} \frac{(1-u)x}{\tau },\hspace{1em}& 1-u<\frac{\tau }{x}\end{array}\right..\]

In the integral we make a substitution $v=1-u$:

(76)

\[ {\int _{0}^{1}}|{f_{2}}(x,u)-{f_{2}}(y,u)|du={\int _{0}^{1}}|{f_{2}}(x,1-v)-{f_{2}}(y,1-v)|dv,\]

where we instantly obtain the same estimation as (74), since ${f_{2}}(x,1-v)={f_{1}}(x,v)$.

Finally,

(77)

\[ \begin{aligned}{}{\int _{[0,1]}}|f(x,u)-f(y,u)|du\le & {\int _{0}^{1}}|{f_{1}}(x,u)-{f_{1}}(y,u)|du\\ {} & +{\int _{0}^{1}}|{f_{2}}(x,u)-{f_{2}}(y,u)|du\\ {} \le & \frac{\tau }{2}\frac{|x-y|}{xy}+\frac{\tau }{2}\frac{|x-y|}{xy}=\tau \frac{|x-y|}{xy}\le \tau |x-y|,\end{aligned}\]

which proves (67) with $\alpha =\tau $. □

Corollary 4.3.

The stochastic fixed-point equation (62) has the unique solution ${X_{\infty }}$ in the space of probability measures on $[1,2]$ and ${X_{h}^{(i)}}\stackrel{d}{\to }{X_{\infty }}$, as $h\to \infty $, for every fixed $i=1,\dots ,d$.

Remark 4.4.

We obtained the same result as in the previous section without resort to geometric interpretation but with explicit usage of the distribution of sequence of points ${x_{1}},{x_{2}},\dots \hspace{0.1667em}$. It also allowed us to deduce the stochastic fixed-point equation for ${X_{\infty }}$. Recall that ${I_{\infty }}$ is a rectangle with edges being parallel to coordinate axes, its sizes in all dimensions can be given by $({X_{\infty }^{(1)}},\dots ,{X_{\infty }^{(d)}})$, where ${({X_{\infty }^{(i)}})_{1\le i\le d}}$ are independent copies of ${X_{\infty }}$.

It turns out that the distribution of ${X_{\infty }}$ can be found explicitly using the stochastic fixed-point equation (62) by directly computing its distribution function $F(t)=\mathbb{P}\{{X_{\infty }}<t\}$. The solution depends heavily on the parameter τ.

We start with a rather simple and special example $\tau \le 1/2$.

For $t\in [1,2]$ we write

(78)

\[\begin{aligned}{}\mathbb{P}\{{X_{\infty }}<t\}& =\mathbb{P}\left\{\min \left\{\frac{u{X_{\infty }}}{\tau },1\right\}+\min \left\{\frac{(1-u){X_{\infty }}}{\tau },1\right\}<t\right\}\\ {} & =\mathbb{P}\left\{1+1<t,\frac{u{X_{\infty }}}{\tau }\ge 1,\frac{(1-u){X_{\infty }}}{\tau }\ge 1\right\}\\ {} & \hspace{1em}+\mathbb{P}\left\{\frac{u{X_{\infty }}}{\tau }+\frac{(1-u){X_{\infty }}}{\tau }<t,\frac{u{X_{\infty }}}{\tau }<1,\frac{(1-u){X_{\infty }}}{\tau }<1\right\}\\ {} & \hspace{1em}+\mathbb{P}\left\{\frac{u{X_{\infty }}}{\tau }+1<t,\frac{u{X_{\infty }}}{\tau }<1,\frac{(1-u){X_{\infty }}}{\tau }\ge 1\right\}\\ {} & \hspace{1em}+\mathbb{P}\left\{1+\frac{(1-u){X_{\infty }}}{\tau }<t,\frac{u{X_{\infty }}}{\tau }\ge 1,\frac{(1-u){X_{\infty }}}{\tau }<1\right\}.\end{aligned}\]

The first term is obviously 0, since we assume $t\le 2$. The second term is also 0 since events $\{\frac{u{X_{\infty }}}{\tau }<1\}$ and $\{\frac{(1-u){X_{\infty }}}{\tau }<1\}$ cannot occur at the same time for $\tau \le 1/2$. And the last two terms are symmetrical.

(79)

\[\begin{aligned}{}\mathbb{P}\{{X_{\infty }}<t\}& =2\mathbb{P}\left\{\frac{u{X_{\infty }}}{\tau }+1<t,\frac{u{X_{\infty }}}{\tau }<1,\frac{(1-u){X_{\infty }}}{\tau }\ge 1\right\}\\ {} & =2\mathbb{P}\left\{\frac{u{X_{\infty }}}{\tau }+1<t,\frac{(1-u){X_{\infty }}}{\tau }\ge 1\right\}\\ {} & =2\mathbb{P}\left\{u<\frac{\tau (t-1)}{{X_{\infty }}},u\le \frac{{X_{\infty }}-\tau }{{X_{\infty }}}\right\}=2\mathbb{P}\left\{u<\frac{\tau (t-1)}{{X_{\infty }}}\right\}.\end{aligned}\]

Here in the second equality we used that the event $\{\frac{u{X_{\infty }}}{\tau }+1<t\}$ implies the event $\{\frac{u{X_{\infty }}}{\tau }<1\}$. And in the last equality we used that the event $\{u<\frac{\tau (t-1)}{{X_{\infty }}}\}$ includes the event $\{u\le \frac{{X_{\infty }}-\tau }{{X_{\infty }}}\}$ since $\tau t<1\le {X_{\infty }}$. Finally,

(80)

\[\begin{aligned}{}\mathbb{P}\{{X_{\infty }}<t\}& =2\mathbb{P}\left\{u<\frac{\tau (t-1)}{{X_{\infty }}}\right\}=2\mathbb{E}\left(\mathbb{P}\left\{\frac{\tau (t-1)}{{X_{\infty }}}>u\right\}\big|{X_{\infty }}\right)\\ {} & =2\mathbb{E}\left(\frac{\tau (t-1)}{{X_{\infty }}}\right)=2\tau \mathbb{E}\frac{1}{{X_{\infty }}}(t-1).\end{aligned}\]

And we also have a special state ${X_{\infty }}=2$:

(81)

\[\begin{aligned}{}\mathbb{P}\{{X_{\infty }}=2\}& =\mathbb{P}\left\{1+1=2,\frac{u{X_{\infty }}}{\tau }\ge 1,\frac{(1-u){X_{\infty }}}{\tau }\ge 1\right\}\\ {} & =\mathbb{P}\left\{u\ge \frac{\tau }{{X_{\infty }}},u\le 1-\frac{\tau }{{X_{\infty }}}\right\}=1-2\tau \mathbb{E}\frac{1}{{X_{\infty }}}.\end{aligned}\]

From (80) and (81) we see that ${X_{\infty }}$ is a combination of the uniform distribution on $[1,2)$ and an atom at point 2. We only need to compute a constant $\mathbb{E}\frac{1}{{X_{\infty }}}$.

(82)

\[\begin{aligned}{}\mathbb{E}\frac{1}{{X_{\infty }}}& =\frac{1}{2}\mathbb{P}\{{X_{\infty }}=2\}+{\int _{1}^{2}}\frac{1}{t}d\mathbb{P}\{{X_{\infty }}<t\}\\ {} & =\frac{1}{2}\left(1-2\tau \mathbb{E}\frac{1}{{X_{\infty }}}\right)+{\int _{1}^{2}}\frac{2\tau \mathbb{E}\frac{1}{{X_{\infty }}}}{t}dt\\ {} & =\frac{1}{2}\left(1-2\tau \mathbb{E}\frac{1}{{X_{\infty }}}\right)+2\tau \mathbb{E}\frac{1}{{X_{\infty }}}(\log 2-\log 1).\end{aligned}\]

We solve (82) for an unknown $\mathbb{E}\frac{1}{{X_{\infty }}}$ and obtain that

(83)

\[ \mathbb{E}\frac{1}{{X_{\infty }}}=\frac{1}{2\tau (1-2\log 2+1/\tau )}.\]

On Figure 1 the reader can see simultaneous plots of just obtained theoretical values of $F(t)$ and statistical values. We refer to statistical values as an empirical cumulative distribution function obtained from values of ${X_{h}^{(i)}}$ taken in 10000 consecutive iterations of the process (5).

Fig. 1.

Plot of theoretical and statistical values of the distribution function $F(t)$ of ${X_{\infty }}$ with $\tau =0.4$

From now on we assume that $\tau >1/2$. We begin as in previous case but write all possibilities for minimums separately.

The situation ${X_{\infty }}=2$ is possible only when both minimums are 1.

(84)

\[\begin{aligned}{}\mathbb{P}\{{X_{\infty }}=2\}& =\mathbb{P}\left\{1+1=2,1\le \frac{u{X_{\infty }}}{\tau },1\le \frac{(1-u){X_{\infty }}}{\tau }\right\}\\ {} & =\mathbb{P}\left\{\frac{\tau }{{X_{\infty }}}\le u\le \frac{{X_{\infty }}-\tau }{{X_{\infty }}},{X_{\infty }}\ge 2\tau \right\}\\ {} & ={\int _{[2\tau ,2]}}\left(1-\frac{\tau }{x}\right)dF(x)-{\int _{[2\tau ,2]}}\frac{\tau }{x}dF(x)\\ {} & =1-F(2\tau )-2\tau {\int _{[2\tau ,2]}}\frac{1}{x}dF(x)\\ {} & =1-F(2\tau )-2\tau \left(\mathbb{E}\frac{1}{{X_{\infty }}}-{\int _{[1,2\tau )}}\frac{1}{x}dF(x)\right).\end{aligned}\]

When ${X_{\infty }}<2$ there are two possibilities that only one of the minimums in (62) is 1, and both of these situations are symmetrical, so we compute only one of the probabilities:

(85)

\[\begin{aligned}{}& \mathbb{P}\left\{\frac{u{X_{\infty }}}{\tau }+1<t,\frac{u{X_{\infty }}}{\tau }<1,\frac{(1-u){X_{\infty }}}{\tau }\ge 1\right\}\\ {} & =\mathbb{P}\left\{u<\frac{\tau (t-1)}{{X_{\infty }}},{X_{\infty }}\ge \tau t\right\}+\mathbb{P}\left\{u\le \frac{{X_{\infty }}-\tau }{{X_{\infty }}},{X_{\infty }}<\tau t\right\}\\ {} & ={\int _{[\tau t,2]}}\frac{\tau (t-1)}{x}dF(x)+{\int _{[1,\tau t]}}\left(1-\frac{\tau }{x}\right)dF(x)\\ {} & =\tau (t-1)\left(\mathbb{E}\frac{1}{{X_{\infty }}}-{\int _{[1,\tau t)}}\frac{1}{x}dF(x)\right)+F(\tau t)-\tau {\int _{[1,\tau t)}}\frac{1}{x}dF(x).\end{aligned}\]

And the last possibility of ${X_{\infty }}<2$ is when both minimums are less than 1.

(86)

\[\begin{aligned}{}& \mathbb{P}\left\{\frac{u{X_{\infty }}}{\tau }+\frac{(1-u){X_{\infty }}}{\tau }<t,\frac{u{X_{\infty }}}{\tau }<1,\frac{(1-u){X_{\infty }}}{\tau }<1\right\}\\ {} & =\mathbb{P}\left\{\frac{{X_{\infty }}-\tau }{{X_{\infty }}}<u<\frac{\tau }{{X_{\infty }}},x<\tau t\right\}={\int _{[1,\tau t]}}\frac{\tau }{x}dF(x)-{\int _{[1,\tau t]}}(1-\frac{\tau }{x})dF(x)\\ {} & =2\tau {\int _{[1,\tau t]}}\frac{1}{x}dF(x)-F(\tau t).\end{aligned}\]

We combine (85) and (86) and get the following for $t\in [1,2]$:

(87)

\[\begin{aligned}{}\mathbb{P}\{{X_{\infty }}<t\}& =\mathbb{P}\left\{\frac{u{X_{\infty }}}{\tau }+\frac{(1-u){X_{\infty }}}{\tau }<t,\frac{u{X_{\infty }}}{\tau }<1,\frac{(1-u){X_{\infty }}}{\tau }<1\right\}\\ {} & \hspace{1em}+2\mathbb{P}\left\{\frac{u{X_{\infty }}}{\tau }+1<t,\frac{u{X_{\infty }}}{\tau }<1,\frac{(1-u){X_{\infty }}}{\tau }\ge 1\right\}\\ {} & =F(\tau t)+2\tau (t-1)\left(\mathbb{E}\frac{1}{{X_{\infty }}}-{\int _{[1,\tau t)}}\frac{1}{x}dF(x)\right).\end{aligned}\]

Now we will show how to retrieve exact values for probability (87). On step 1 we consider that $t\in [1,1/\tau ]$, then $\tau t<1$ and from (87) one gets:

(88)

\[ \mathbb{P}\{{X_{\infty }}<t\}=0+2\tau (t-1)\left(\mathbb{E}\frac{1}{{X_{\infty }}}-0\right)=2\tau (t-1)\mathbb{E}\frac{1}{{X_{\infty }}}.\]

We also compute the derivative

(89)

\[ dF(t)=2\tau \mathbb{E}\frac{1}{{X_{\infty }}}dt.\]

We repeat the procedure by computing probabilities and the derivatives for $t\in [1/\tau ,1/{\tau ^{2}}]$ and so on following intervals by using results from previous intervals. More generally, let us compute an interval on step k when $t\in [1/{\tau ^{k-1}},1/{\tau ^{k}}]$.

We have $\tau t\in [1/{\tau ^{k-2}},1/{\tau ^{k-1}}]$ which means that $F(\tau t)$ is a value from previous interval and we can use it. We also have an integral part in (87):

(90)

\[ {\int _{[1,\tau t)}}\frac{1}{x}dF(x)={\int _{1}^{1/\tau }}\frac{1}{x}dF(x)+{\int _{1/\tau }^{1/{\tau ^{2}}}}\frac{1}{x}dF(x)+\cdots +{\int _{1/{\tau ^{k-2}}}^{\tau t}}\frac{1}{x}dF(x)\]

where the integrals are taken by using computed derivatives on first $k-1$ intervals respectively.

At some k we have $1/{\tau ^{k}}>2$ so we stop computing intervals. We have only one unknown variable $\mathbb{E}\frac{1}{{X_{\infty }}}$. We compute this value directly by using computed probabilities and solve an equation for it.

Example 4.5.

Let us compute ${X_{\infty }}$ when $1/\tau <2\le 1/{\tau ^{2}}$ ($k=2$). We also have that $2\tau \le 1/\tau $.

When $t\le 1/\tau $, we have

(91)

\[ \mathbb{P}\{{X_{\infty }}<t\}=2\tau (t-1)\mathbb{E}\frac{1}{{X_{\infty }}},\hspace{1em}dF(t)=2\tau \mathbb{E}\frac{1}{{X_{\infty }}}dt.\]

When $t\ge 1/\tau $,

(92)

\[ {\int _{[1,\tau t]}}\frac{1}{x}dF(x)={\int _{1}^{\tau t}}\frac{2\tau \mathbb{E}\frac{1}{{X_{\infty }}}}{x}dx=2\tau \mathbb{E}\frac{1}{{X_{\infty }}}\log (\tau t).\]

After plugging (91) and (92) into (87) we obtain

(93)

\[\begin{aligned}{}\mathbb{P}\{{X_{\infty }}<t\}& =F(\tau t)+2\tau (t-1)\left(\mathbb{E}\frac{1}{{X_{\infty }}}-2\tau \mathbb{E}\frac{1}{{X_{\infty }}}\log (\tau t)\right)\\ {} & =2\tau (\tau t-1)\mathbb{E}\frac{1}{{X_{\infty }}}+2\tau (t-1)\mathbb{E}\frac{1}{{X_{\infty }}}-4{\tau ^{2}}(t-1)\log (\tau t)\\ {} & =2\tau \mathbb{E}\frac{1}{{X_{\infty }}}(t+\tau t-2\tau t\log (\tau t)+2\tau \log (\tau t)-2),\end{aligned}\]

and the derivative is

(94)

\[ dF(t)=2\tau \mathbb{E}\frac{1}{{X_{\infty }}}\left(1+\tau -2\tau (1+\log (\tau t))+2\tau \frac{1}{t}\right).\]

Finally, we also return to a special case value ${X_{\infty }}=2$ which was computed at the beginning in (84):

(95)

\[\begin{aligned}{}\mathbb{P}\{{X_{\infty }}=2\}& =1-F(2\tau )-2\tau \left(\mathbb{E}\frac{1}{{X_{\infty }}}-{\int _{[1,2\tau )}}\frac{1}{x}dF(x)\right)=1-\mathbb{P}\{{X_{\infty }}<2\}\\ {} & =1-2\tau \mathbb{E}\frac{1}{{X_{\infty }}}(2+2\tau -4\tau \log (2\tau )+2\tau \log (2\tau )-2)\\ {} & =1-4{\tau ^{2}}\mathbb{E}\frac{1}{{X_{\infty }}}(1-\log (2\tau )).\end{aligned}\]

We only need to compute the constant $\mathbb{E}\frac{1}{{X_{\infty }}}$:

(96)

\[\begin{aligned}{}\mathbb{E}\frac{1}{{X_{\infty }}}& =\frac{1}{2}\left(1-4{\tau ^{2}}\mathbb{E}\frac{1}{{X_{\infty }}}(1-\log (2\tau ))\right)+{\int _{1}^{1/\tau }}\frac{2\tau \mathbb{E}\frac{1}{{X_{\infty }}}}{x}dx\\ {} & \hspace{1em}+{\int _{1/\tau }^{2}}\frac{2\tau \mathbb{E}\frac{1}{{X_{\infty }}}(1+\tau -2\tau (1+\log (\tau t))+2\tau \frac{1}{t})}{x}dx\\ {} & =\frac{1}{2}\left(1-4{\tau ^{2}}\mathbb{E}\frac{1}{{X_{\infty }}}(1-\log (2\tau ))\right)+{\int _{1}^{1/\tau }}\frac{2\tau \mathbb{E}\frac{1}{{X_{\infty }}}}{x}dx\\ {} & \hspace{1em}+2\tau \mathbb{E}\frac{1}{{X_{\infty }}}\left({\int _{1/\tau }^{2}}\frac{1+\tau -2\tau }{x}dx+{\int _{1/\tau }^{2}}\frac{-2\tau \log \tau x}{x}dx+{\int _{1/\tau }^{2}}\frac{2\tau }{{x^{2}}}dx\right)\\ {} & =\frac{1}{2}\left(1-4{\tau ^{2}}\mathbb{E}\frac{1}{{X_{\infty }}}(1-\log (2\tau ))\right)+{\int _{1}^{1/\tau }}\frac{2\tau \mathbb{E}\frac{1}{{X_{\infty }}}}{x}dx+2\tau \mathbb{E}\frac{1}{{X_{\infty }}}\ast \\ {} & \hspace{1em}\ast \left((1-\tau )(\log 2-\log (1/\tau ))-\tau ({\log ^{2}}(2\tau )-{\log ^{2}}(\tau /\tau ))-2\tau \left(\frac{1}{2}-\tau \right)\right)\\ {} & =\frac{1}{2}\left(1-4{\tau ^{2}}\mathbb{E}\frac{1}{{X_{\infty }}}(1-\log (2\tau ))\right)-2\tau \mathbb{E}\frac{1}{{X_{\infty }}}\log (\tau )\\ {} & \hspace{1em}+2\tau \mathbb{E}\frac{1}{{X_{\infty }}}\left((1-\tau )\log (2\tau )-\tau {\log ^{2}}(2\tau )+2\tau \left(\tau -\frac{1}{2}\right)\right).\end{aligned}\]

By solving this equation for an unknown $\mathbb{E}\frac{1}{{X_{\infty }}}$ we obtain:

(97)

\[ \mathbb{E}\frac{1}{{X_{\infty }}}=\frac{1}{2(1-4{\tau ^{3}}+4{\tau ^{2}}+2{\tau ^{2}}{\log ^{2}}(2\tau )+2\tau \log (\tau )-2\tau \log (2\tau ))}.\]

Fig. 2.

Plot of theoretical and statistical values of the distribution function $F(t)$ of ${X_{\infty }}$ with $\tau =0.7$

Fig. 3.

Plot of statistical values of the distribution function $F(t)$ of ${X_{\infty }}$ with $\tau =0.9$

As the end of the example, Figure 2 shows plots of theoretical and statistical values of $F(t)$ with τ lying in the studied range.

Finally, without computing theoretical values of the distribution function, we want to show how it further evolves with increasing τ by showing only its statistical values on Figure 3 with $\tau =0.9$.

Authors

Abstract

1 Introduction

2 Random vantage point tree model

(1)

(2)

(3)

(4)

(5)

3 Convergence of the length of the leftmost path

3.1 The analysis of the process $({X_{h}})$

(6)

Proposition 3.1.

(7)

Proof.

(8)

(9)

Definition 3.2.

(10)

Theorem 3.3.

Remark 3.4.

Proof of Theorem 3.3.

(11)

(12)

(13)

(14)

(15)

(16)

(17)

(18)

(19)

(20)

(21)

Remark 3.5.

Remark 3.6.

Definition 3.7.

Definition 3.8.

Definition 3.9.

Proposition 3.10.

Proof.

Theorem 3.11.

(22)

Proof.

3.2 Limit theorems for the length of the leftmost path

(23)

(24)

Theorem 3.12.

(25)

(26)

Lemma 3.13.

(27)

Proof.

(28)

(29)

(30)

(31)

Lemma 3.14.

(32)

Proof.

(33)

(34)

(35)

(36)

(37)

(38)

(39)

(40)

Proof of Theorem 3.12.

(41)

(42)

(43)

Lemma 3.15.

(44)

Proof.

(45)

(46)

(47)

(48)

(49)

(50)