Probabilistic properties of vantage point trees are studied. A vp-tree built from a sequence of independent, identically distributed points in

A vantage point tree (vp-tree) is a data structure for fast execution of nearest neighbor search queries in a given metric space. This class of trees was first introduced in 1993 in [
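As a purely illustrative sketch of the data structure itself (not the random construction analyzed in this paper, which chooses its vantage points and thresholds according to the model introduced below), a vp-tree over a finite point set can be built and queried as follows. All function and variable names here are our own; the vantage point is taken to be the first remaining point and the threshold to be the median distance, which is one common textbook convention.

```python
import math
import random


def build_vp_tree(points, dist):
    """Recursively build a vantage-point tree.

    Each node stores a vantage point, a distance threshold mu (the median
    distance from the vantage point to the remaining points), an inner
    subtree (points at distance < mu) and an outer subtree (distance >= mu).
    """
    if not points:
        return None
    vp, rest = points[0], points[1:]
    if not rest:
        return {"vp": vp, "mu": 0.0, "inner": None, "outer": None}
    mu = sorted(dist(vp, p) for p in rest)[len(rest) // 2]  # median threshold
    inner = [p for p in rest if dist(vp, p) < mu]
    outer = [p for p in rest if dist(vp, p) >= mu]
    return {"vp": vp, "mu": mu,
            "inner": build_vp_tree(inner, dist),
            "outer": build_vp_tree(outer, dist)}


def nearest(node, q, dist, best=None):
    """Nearest-neighbor query; the triangle inequality is used to prune a
    subtree whenever it cannot contain a point closer than the current best."""
    if node is None:
        return best
    d = dist(node["vp"], q)
    if best is None or d < best[0]:
        best = (d, node["vp"])
    if d < node["mu"]:  # q lies in the inner ball: search it first
        best = nearest(node["inner"], q, dist, best)
        if d + best[0] >= node["mu"]:   # outer region may still intersect
            best = nearest(node["outer"], q, dist, best)
    else:               # q lies outside: search the outer subtree first
        best = nearest(node["outer"], q, dist, best)
        if d - best[0] < node["mu"]:    # inner ball may still intersect
            best = nearest(node["inner"], q, dist, best)
    return best


# demo: Euclidean metric on the unit square
random.seed(0)
dist = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
pts = [(random.random(), random.random()) for _ in range(100)]
tree = build_vp_tree(pts, dist)
q = (0.5, 0.5)
d_best, p_best = nearest(tree, q, dist)
assert abs(d_best - min(dist(q, p) for p in pts)) < 1e-12
```

The pruning conditions are exactly the two triangle-inequality bounds: a point of the outer subtree is at distance at least mu - d from the query, and a point of the inner subtree at distance greater than d - mu, so each subtree is visited only when that bound is below the current best distance.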

Nearest neighbor search is used nowadays in a wide range of tasks. A simple example of day-to-day usage is the geographic search for objects close to a specific location. Nearest-point search is also used in computational geometry, for example for identifying intersections, which is a routine task in computer graphics and physics simulations. The last but not least example we mention here is similarity search, such as the search for similar images or similar articles.

Another major field where nearest neighbor search is an important tool is machine learning. More precisely, it is used in a variety of regression models. We refer the reader to [

Although fast nearest neighbor search is important in its own right, as well as being an important ingredient in many algorithms, it and the corresponding data structures are not fully understood from the mathematical viewpoint. Most of the results describing the structure of these trees have been obtained using empirical methods. For example, in [

The structure of the paper is as follows. In Section

Let

In this article we study a special class of random vp-trees. More precisely, we consider the metric space

We also need to specify the choice of the thresholds. To this end, fix a parameter

We define the

The sequence of elements

The nodes in the leftmost path can be enumerated starting from 1. To understand how the leftmost path is formed and evolves, let us associate with each node

Let

The process (

Let us fix

Let us denote by

We shall use the notion of stochastic ordering, which we recall for convenience.

For two random variables
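For completeness, the standard formulation of the (first-order) stochastic order, which the definition above presumably follows, reads:
\[
X \le_{\mathrm{st}} Y
\quad\Longleftrightarrow\quad
\mathbb{P}(X > t) \le \mathbb{P}(Y > t) \quad \text{for all } t \in \mathbb{R}.
\]
Equivalently, $X \le_{\mathrm{st}} Y$ if and only if the two variables admit a coupling $(\hat X, \hat Y)$ on a common probability space with $\hat X \le \hat Y$ almost surely; in particular, $\mathbb{E}X \le \mathbb{E}Y$ whenever both expectations exist, a fact used below when passing from stochastic comparisons to comparisons of means.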

The key ingredient for the subsequent analysis is the following theorem which says that the chain

We believe that a counterpart of this theorem holds for an arbitrary metric on

We rewrite (

Since the positions of rectangles are not important, the recursive equation (

From Proposition

Let us assume that point

From the triangle inequality we obtain

Then, in view of (

From (

Let us now fix a constant

We split the evolution of the chain into blocks of length

Similarly, since the moment

By the properties of the stochastic order, the inequality also holds for expectations; hence, combining the last two comparisons, we obtain

We note that in the proof of Theorem

The process (

From Theorem

In order to proceed we need to recall the notion of a Harris chain. For information about Harris chains and examples, we refer the reader to the book [

A Markov chain

if

A Harris chain is called recurrent if there is a state

A recurrent Harris chain is aperiodic if the greatest common divisor of the set
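For the reader's convenience, a standard textbook formulation of these notions (stated here as a paraphrase, which may differ in details from the exact definition used in this paper) is the following. A Markov chain $(X_n)_{n \ge 0}$ with transition kernel $p$ on a state space $(S, \mathcal{S})$ is a Harris chain if there exist sets $A, B \in \mathcal{S}$, a constant $\varepsilon > 0$ and a probability measure $\rho$ concentrated on $B$ such that
\[
\mathbb{P}_z(\tau_A < \infty) > 0 \ \text{ for all } z \in S,
\qquad
p(x, C) \ge \varepsilon \, \rho(C) \ \text{ for all } x \in A, \ C \subseteq B,
\]
where $\tau_A = \inf\{n \ge 0 : X_n \in A\}$. The minorization condition allows one to construct an auxiliary atom $\alpha$ by splitting the transitions from $A$; the chain is recurrent when the return time to $\alpha$ is almost surely finite, and aperiodic when $\gcd\{n \ge 1 : \mathbb{P}_\alpha(X_n = \alpha) > 0\} = 1$.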

Let us first show that

To show recurrence of the chain we put

From Proposition

Using the results on the behavior of the chain

The size (number of nodes) of the leftmost path in terms of

The length of the leftmost path can be simply computed from its size as

To prove this theorem we need two lemmas. We note that if

We compute the distribution function of

From Proposition

We write the distribution function of

Since

Let us recall that, given

We compute the distribution function of

Now we combine (

We start by computing the distribution function of

We note that for proving Theorem

Let us write the characteristic function of

From Theorem

By Markov’s inequality

First, we want to show that

For the argument we write

Along the way we also showed that, starting from some large enough

By Theorem 3.2 in [

Elements of the sequence

Since

In this section we analyze the random recursive equation (

The

Similarly to the proof of Proposition

We prove the contraction property in two steps.

Since the infimum is taken over all possible dependencies between

Let us assume for the time being that there is

It remains to prove (

We compute the following integral under the condition

Similarly if

Now we estimate the same integral for

In the integral we make a substitution

Finally,

We have obtained the same result as in the previous section without resorting to the geometric interpretation, but with explicit use of the distribution of the sequence of points

It turns out that the distribution of

We start with a rather simple and special example

For

We also have a special state

From (

In Figure

Plot of theoretical and statistical values of the distribution function

From now on we assume that

The situation

When

Finally, the last possibility of

We combine (

Now we will show how to retrieve exact values for probability (

We repeat the procedure by computing probabilities and the derivatives for

We have

At some

Let us compute

When

When

After plugging (

Finally, we also return to a special case value

We only need to compute the constant

Plot of theoretical and statistical values of the distribution function

Plot of statistical values of the distribution function

To conclude the example, Figure

Finally, without computing the theoretical values of the distribution function, we want to show how it evolves further with increasing

I would like to thank my supervisor, Prof. Alexander Marynych, for his dedicated guidance and support throughout this research. I am grateful for his prompt and friendly responses to my questions and queries, as well as for his genuine care about my work. I also thank the anonymous referee for useful suggestions which improved the presentation of my results.