Supported by grant ANR-17-CE40-0017 of the French National Research Agency (ANR project ASPAG). The authors are indebted to the anonymous referees for their very insightful remarks which helped them to improve this paper.

Two optimal transport problems between determinantal point processes (DPP for short) are investigated. It is shown how to estimate the Kantorovitch–Rubinstein and Wasserstein-2 distances between distributions of DPP. These results are applied to evaluate the accuracy of a fast but approximate simulation algorithm of the Ginibre point process restricted to a circle. One can now simulate in a reasonable amount of time more than ten thousands points.

Determinantal point processes (DPP) have been introduced in the seventies [

A careful analysis of the simulation algorithm given in [

Peaks and valleys of some densities

As a consequence, it is hardly feasible to simulate a DPP with more than one thousand points in less than a few hours. As some DPP arise as locations of eigenvalues of some matrix ensembles, it may seem faster and simpler to draw random matrices and compute their eigenvalues with the optimized libraries to do so. However, there are several drawbacks to this approach: a) we cannot control the domain into which the points fall, and for some applications it may be important to simulate DPP restricted to some compact sets, b) as eigenvalues belong to

Several refinements of Algorithm

We evaluate the impact of these approximations by bounding the distances between the original distribution of the DPP to be simulated and the distribution according to which the points are drawn. The computations of the error margin are specific to the Ginibre DPP but could be done for other processes with radial symmetry like polyanalytic Ginibre ensembles [

Actually, there are several notions of distances between the distributions of point processes (see [

The paper is organized as follows. We first recall the definition and salient properties of DPP. In Section

Let

Next, let

A locally finite point process is defined as a probability measure on

A finite point process is a probability measure on

For

A random point process

A point process

We see that

A measure

For details about the relationships between correlation functions and Janossy densities, see [

For details, we mainly refer to [

The spectrum of

The map

The determinantal measure on

There is a particular class of DPP which is the building blocks on which general DPP are built upon.

A DPP whose spectrum is reduced to the singleton

If

Alternatively, when the spectrum of

For a compact subset

For any compact

It is straightforward to see that the spectrum of

The Ginibre point process, denoted by

The simulation algorithm introduced in [

In the following, let

Sampling of the locations of the points given the set

We have two kinds of difficulties here: drawing of

To illustrate this second difficulty, we simulate

Average overhead (over 10 runs) due to rejections in the simulation of

1 | 0.7 | 0.5 | 0.25 | 0.1 | |

Overhead | 3.1 | 4.5 | 6.1 | 13 | 29 |

These are the problems we intend to address in the following.

Note that this algorithm is fully applicable even if

For details on optimal transport in

When

There are several ways to define a distance between point processes on the same underlying space

Consider the distance in total variation

By the definition of the total variation, it is straightforward that for any compact set

In the sequel, we assume that

Cost between two configurations. On the left, there is no feasible coupling as the number of atoms are different. When the two configurations have the same cardinality, there are several possible coupling. One is given on the right

Moreover, the optimum cost is attained at the permutation of

This means that whenever the distance between

It is shown in [

For determinantal point processes, we can evaluate the effect of a modification of the eigenvalues with the Kantorovitch–Rubinstein distance and the effect of a modification of the eigenvectors with the Wasserstein-2 distance.

The hypothesis means that there exists

We make a coupling of the Bernoulli random variables which appear in Lemma

The next corollary is an immediate consequence of the alternative definition of the Kantorovitch–Rubinstein distance for point processes, see (

This means that the Kantorovitch–Rubinstein distance between point processes focuses on the number of atoms in any compact. As we shall see now, the Wasserstein-2 distance evaluates the matching distance between configurations when they have the same cardinality.

If the Wasserstein-2 distance between

Conversely, if the two spectra coincide, it remains to verify point 2 of Theorem

The next lemma is a straightforward consequence of Lemma

Consider that

Then, Theorem

Theorems

For two rank-3 DPP, to compute the optimal map at stage 2 (i.e. for configurations with two points), we compute the coefficient of the convex combination which will be used on both sides to compute the Janossy density. Then we solve the optimal transport problem between these two densities. This gives the map to be applied to configurations with 2 points of the first DPP to obtain the matching configuration in the second DPP

Recall from Definition

We know that the points of

In this section, we show how the previous theorems can be applied to give some guarantees when we make an approximate simulation of a Ginibre point process. We will consider Ginibre point processes but our reasoning could be applied to any rotational invariant determinantal process like the polyanalytic ensembles [

Actually, the proof says that with high probability,

First, using the integral expression

As a corollary of the previous proof, we have

As in the previous proof, we will reduce the problem to bounding a sum of reduced incomplete gamma functions.

The combination of Lemma

The next step of the algorithm is to draw the points according to a density given by a determinant. Since we do not have explicit expression of the inverse cumulative function of these densities, we have to resort to rejection sampling. Fortunately, the particular form of the eigenfunctions of isotropic point processes, like the Ginibre point process, is prone to the simulation of modulus and arguments by inverting their respective cumulative distribution function. The key remark is that along the iterations which are necessary to draw the

Given a sequence of complex numbers

Given the modulus, we can now simulate the arguments.

Similarly to the simulation of the modulus, for

Computing

The total cost of sampling the

Gathering the results of this section, we get in Algorithm

Simulation of a compact symmetric projection point process restricted to the disc

Using Theorem

For a given constant

For a complex

We now show that replacing

According to Theorem

Finally, using the same techniques as above, we bound the sum of

We split the sum in three parts:

For

Simulation of determinantal point processes. Left: kernel

An implementation in Python of this algorithm publicly available in [