1 Introduction
A probability measure μ on ${\mathbb{R}}^{d}$ is said to satisfy the log-Sobolev inequality if, for every smooth compactly supported function $f:{\mathbb{R}}^{d}\to \mathbb{R}$, the entropy of ${f}^{2}$, which by definition equals
\[\operatorname{\mathbf{Ent}}_{\mu }{f}^{2}=\int _{{\mathbb{R}}^{d}}{f}^{2}\log {f}^{2}\hspace{0.1667em}d\mu -\bigg(\int _{{\mathbb{R}}^{d}}{f}^{2}\hspace{0.1667em}d\mu \bigg)\log \bigg(\int _{{\mathbb{R}}^{d}}{f}^{2}\hspace{0.1667em}d\mu \bigg),\]
possesses a bound
(1)
\[\operatorname{\mathbf{Ent}}_{\mu }{f}^{2}\le 2c\int _{{\mathbb{R}}^{d}}\| \nabla f{\| }^{2}\hspace{0.1667em}d\mu \]
with some constant c. The least possible constant c such that (1) holds for every compactly supported smooth f is called the log-Sobolev constant for the measure μ; the multiplier 2 in (1) is chosen in such a way that for the standard Gaussian measure on ${\mathbb{R}}^{d}$, the log-Sobolev constant equals 1.

The weighted log-Sobolev inequality has the form
(2)
\[\operatorname{\mathbf{Ent}}_{\mu }{f}^{2}\le 2\int _{{\mathbb{R}}^{d}}\| W\nabla f{\| }^{2}\hspace{0.1667em}d\mu ,\]
where the function W, taking values in ${\mathbb{R}}^{d\times d}$, has the meaning of a weight. Clearly, one can consider (1) as a particular case of (2) with the constant weight W equal to $\sqrt{c}$ multiplied by the identity matrix. The problem of giving explicit conditions on μ that ensure the log-Sobolev inequality or its modifications is intensively studied in the literature, in particular because of the numerous connections of these inequalities with measure concentration, semigroup properties, and so on (see, e.g., [8]). Motivated by this general problem, in this paper we propose an approach that is based mainly on martingale methods and provides explicit bounds for the entropy with the right-hand side given in a certain integral form.

Our approach is motivated by the well-known fact that, on the path space of a Brownian motion, the log-Sobolev inequality possesses a simple proof based on fine martingale properties of the space (cf. [1, 6]). We observe that a part of this proof is, to a high extent, insensitive to the structure of the probability space; we formulate a respective martingale bound for the entropy in Section 1.1. To apply this general bound on a probability space of the form $({\mathbb{R}}^{d},\mu )$, one needs a proper martingale structure therein. In Section 2, we introduce such a structure in terms of a trimming filtration, defined by means of a set of trimmed regions in ${\mathbb{R}}^{d}$. This leads to an integral bound for the entropy on $({\mathbb{R}}^{d},\mu )$. In Section 3, we show how this bound can be used to obtain a weighted log-Sobolev inequality; this is done in the one-dimensional case $d=1$, although we expect that similar arguments should be effective in the multidimensional case as well; this is a subject of our further research.
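The normalization in (1) is easy to check numerically for the standard Gaussian measure in dimension $d=1$: with $c=1$, the entropy of ${f}^{2}$ must be dominated by $2\int {({f^{\prime }})}^{2}\hspace{0.1667em}d\mu $. A minimal sketch (the test function below is an arbitrary illustrative choice, not one used in the paper):

```python
import numpy as np
from scipy import integrate

f = lambda x: np.cos(x) + 2.0                              # smooth test function
df = lambda x: -np.sin(x)
phi = lambda x: np.exp(-x ** 2 / 2) / np.sqrt(2 * np.pi)   # N(0,1) density

# entropy of f^2 w.r.t. the standard Gaussian measure
Ef2 = integrate.quad(lambda x: f(x) ** 2 * phi(x), -10, 10)[0]
ent = integrate.quad(lambda x: f(x) ** 2 * np.log(f(x) ** 2) * phi(x), -10, 10)[0] \
      - Ef2 * np.log(Ef2)

# right-hand side of (1) with c = 1
rhs = 2 * integrate.quad(lambda x: df(x) ** 2 * phi(x), -10, 10)[0]

assert 0 <= ent <= rhs    # log-Sobolev inequality with log-Sobolev constant 1
```

The truncation of the integrals to $[-10,10]$ is harmless because the Gaussian density makes the tails negligible.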
1.1 A martingale bound for the entropy
Let $(\varOmega ,\mathcal{F},\mathbf{P})$ be a probability space with filtration $\mathbb{F}=\{\mathcal{F}_{t},t\in [0,1]\}$, which is right-continuous and complete, that is, every $\mathcal{F}_{t}$ contains all P-null sets from $\mathcal{F}$. Let $\{M_{t},t\in [0,1]\}$ be a nonnegative square-integrable martingale w.r.t. $\mathbb{F}$ on this space, with càdlàg trajectories. We will use the following standard facts and notation (see [4]).
The martingale M has a unique decomposition $M={M}^{c}+{M}^{d}$, where ${M}^{c}$ is a continuous martingale, and ${M}^{d}$ is a purely discontinuous martingale (see [4], Definition 9.20). Denote by $\langle {M}^{c}\rangle $ the quadratic variation of ${M}^{c}$, by
\[[M]_{t}=\big\langle {M}^{c}\big\rangle _{t}+\sum \limits_{0<s\le t}{(M_{s}-M_{s-})}^{2}\]
the optional quadratic variation of M, and by $\langle M\rangle $ the predictable quadratic variation of M, that is, the projection of $[M]$ on the set of $\mathbb{F}$-predictable processes. Alternatively, $\langle M\rangle $ is identified as the $\mathbb{F}$-predictable process that appears in the Doob–Meyer decomposition of ${M}^{2}$, that is, the $\mathbb{F}$-predictable nondecreasing process A such that $A_{0}=0$ and ${M}^{2}-A$ is a martingale.
For a nonnegative r.v. ξ, define its entropy by $\operatorname{\mathbf{Ent}}\xi =\mathbf{E}\xi \log \xi -\mathbf{E}\xi \log (\mathbf{E}\xi )$ with the convention $0\log 0=0$.

Theorem 1.
Suppose that the σ-algebra $\mathcal{F}_{0}$ is degenerate. Then
\[\operatorname{\mathbf{Ent}}M_{1}\le \mathbf{E}{\int _{0}^{1}}\frac{1}{M_{t-}}\hspace{0.1667em}d\langle M\rangle _{t}.\]

Proof.
Consider first the case where
(3)
\[c_{1}\le M_{t}\le c_{2},\hspace{1em}t\in [0,1],\]
with some positive constants $c_{1},c_{2}$. Consider a smooth function Φ, bounded with all its derivatives, such that
\[\varPhi (x)=x\log x,\hspace{1em}x\in [c_{1},c_{2}].\]
Then by the Itô formula (see [4], Theorem 12.19),
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \varPhi (M_{1})-\varPhi (M_{0})& \displaystyle ={\int _{0}^{1}}{\varPhi ^{\prime }}(M_{t-})\hspace{0.1667em}dM_{t}+\frac{1}{2}{\int _{0}^{1}}{\varPhi ^{\prime\prime }}(M_{t-})\hspace{0.1667em}d\big\langle {M}^{c}\big\rangle _{t}\\{} & \displaystyle \hspace{1em}+\sum \limits_{0<t\le 1}\big[\varPhi (M_{t})-\varPhi (M_{t-})-{\varPhi ^{\prime }}(M_{t-})(M_{t}-M_{t-})\big].\end{array}\]
Clearly,
\[\mathbf{E}{\int _{0}^{1}}{\varPhi ^{\prime }}(M_{t-})\hspace{0.1667em}dM_{t}=0.\]
Because $\mathcal{F}_{0}$ is assumed to be degenerate, $M_{0}=\mathbf{E}[M_{1}|\mathcal{F}_{0}]=\mathbf{E}M_{1}$ a.s., and hence
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \operatorname{\mathbf{Ent}}M_{1}& \displaystyle =\mathbf{E}\big(\varPhi (M_{1})-\varPhi (M_{0})\big)\\{} & \displaystyle =\frac{1}{2}\mathbf{E}{\int _{0}^{1}}{\varPhi ^{\prime\prime }}(M_{t-})\hspace{0.1667em}d\big\langle {M}^{c}\big\rangle _{t}\\{} & \displaystyle \hspace{1em}+\mathbf{E}\sum \limits_{0<t\le 1}\big[\varPhi (M_{t})-\varPhi (M_{t-})-{\varPhi ^{\prime }}(M_{t-})(M_{t}-M_{t-})\big].\end{array}\]
For $x\in [c_{1},c_{2}]$, we have ${\varPhi ^{\prime }}(x)=1+\log x$ and ${\varPhi ^{\prime\prime }}(x)=1/x$. Observe that for any $x,\delta $ such that $x,x+\delta \in [c_{1},c_{2}]$,
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \varPhi (x+\delta )-\varPhi (x)-{\varPhi ^{\prime }}(x)\delta & \displaystyle =(x+\delta )\log (x+\delta )-x\log x-\delta (1+\log x)\\{} & \displaystyle =(x+\delta )\log \bigg(1+\frac{\delta }{x}\bigg)-\delta \le (x+\delta )\frac{\delta }{x}-\delta =\frac{{\delta }^{2}}{x}.\end{array}\]
Then
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \operatorname{\mathbf{Ent}}M_{1}& \displaystyle \le \frac{1}{2}\mathbf{E}{\int _{0}^{1}}\frac{1}{M_{t-}}d\big\langle {M}^{c}\big\rangle _{t}+\mathbf{E}\sum \limits_{0<t\le 1}\frac{{(M_{t}-M_{t-})}^{2}}{M_{t-}}\le \mathbf{E}{\int _{0}^{1}}\frac{1}{M_{t-}}d[M]_{t}.\end{array}\]
Because the process $M_{t-},\hspace{0.1667em}t\in [0,1]$, is $\mathbb{F}$-predictable, we have
\[\mathbf{E}{\int _{0}^{1}}\frac{1}{M_{t-}}d[M]_{t}=\mathbf{E}{\int _{0}^{1}}\frac{1}{M_{t-}}d\langle M\rangle _{t},\]
which completes the proof of the required bound under assumption (3).

The upper bound in this assumption can be removed using the following standard localization procedure. For $N\ge 1$, define
\[\tau _{N}=\inf \big\{t\in [0,1]:M_{t}\ge N\big\}\]
with the convention $\inf \varnothing =1$. Then, repeating the above argument, we get
\[\operatorname{\mathbf{Ent}}M_{\tau _{N}}\le \mathbf{E}{\int _{0}^{\tau _{N}}}\frac{1}{M_{t-}}d\langle M\rangle _{t}\le \mathbf{E}{\int _{0}^{1}}\frac{1}{M_{t-}}d\langle M\rangle _{t}.\]
We have $M_{\tau _{N}}\to M_{1},N\to \infty $ a.s. On the other hand, $\mathbf{E}{M_{\tau _{N}}^{2}}\le \mathbf{E}{M_{1}^{2}}$, and
\[|x\log x{|}^{4/3}\le C\big(1+{x}^{2}\big),\hspace{1em}x\ge 0,\]
with some constant C. Hence, the family $\{M_{\tau _{N}}\log M_{\tau _{N}},N\ge 1\}$ is bounded in ${L}_{4/3}$ and therefore uniformly integrable, and
\[\operatorname{\mathbf{Ent}}M_{\tau _{N}}\to \operatorname{\mathbf{Ent}}M_{1},\hspace{1em}N\to \infty .\]
Passing to the limit as $N\to \infty $, we obtain the required statement under the assumption $M_{t}\ge c_{1}>0$. Taking $M_{t}+(1/n)$ instead of $M_{t}$ and then passing to the limit as $n\to \infty $, we complete the proof of the theorem. □

We further give two examples where the martingale bound for the entropy obtained above is applied. In these examples, it is more convenient to assume that t varies in $[0,\infty )$ instead of $[0,1]$; the respective version of Theorem 1 can be proved by literally the same argument.
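In discrete time, a martingale $(M_{k})$ is a pure-jump càdlàg martingale, and the bound of Theorem 1 reads $\operatorname{\mathbf{Ent}}M_{n}\le \mathbf{E}\sum _{k}\mathbf{E}[{(M_{k}-M_{k-1})}^{2}|\mathcal{F}_{k-1}]/M_{k-1}$. A minimal sketch that checks this exactly on a two-step binary tree (the transition probabilities and the positive terminal values are illustrative choices):

```python
import math
from itertools import product

p = [0.3, 0.6]                                  # P(k-th coin flip = 1)
X = {(0, 0): 0.5, (0, 1): 2.0, (1, 0): 1.5, (1, 1): 4.0}   # terminal values X > 0

def prob(path):
    return math.prod(p[k] if b else 1 - p[k] for k, b in enumerate(path))

def M(prefix):
    """Martingale value M_k = E[X | first k flips]."""
    num = den = 0.0
    for w in X:
        if w[:len(prefix)] == prefix:
            num += prob(w) * X[w]
            den += prob(w)
    return num / den

EX = M(())
ent = sum(prob(w) * X[w] * math.log(X[w]) for w in X) - EX * math.log(EX)

# E sum_k E[(M_k - M_{k-1})^2 | F_{k-1}] / M_{k-1}, computed path by path
rhs = sum(
    prob(w) * (M(w[:k]) - M(w[:k - 1])) ** 2 / M(w[:k - 1])
    for w in X for k in (1, 2)
)

assert 0 <= ent <= rhs
```

Here the predictable quadratic variation is computed exactly over the finite tree, so the check involves no simulation error.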
Example 1 (Log-Sobolev inequality on a Brownian path space; [1, 6]).
Let $B_{t},\hspace{0.1667em}t\ge 0$, be a Wiener process on $(\varOmega ,\mathcal{F},\mathbf{P})$ such that $\mathcal{F}=\sigma (B)$. Let $\{\mathcal{F}_{t}\}$ be the natural filtration for B. Then for every $\zeta \in L_{2}(\varOmega ,\mathbf{P})$, the following martingale representation is available:
\[\zeta =\mathbf{E}\zeta +{\int _{0}^{\infty }}\eta _{s}\hspace{0.1667em}dB_{s},\]
with the Itô integral of a (unique) square-integrable $\{\mathcal{F}_{t}\}$-adapted process $\{\eta _{t}\}$ in the right-hand side (cf. [3]). Take $\xi \in L_{4}(\varOmega ,\mathbf{P})$ and put $\zeta ={\xi }^{2}$ and
\[M_{t}=\mathbf{E}[\zeta |\mathcal{F}_{t}]=\mathbf{E}\zeta +{\int _{0}^{t}}\eta _{s}\hspace{0.1667em}dB_{s},\hspace{1em}t\ge 0.\]
Then the calculation from the proof of Theorem 1 gives the bound
\[\operatorname{\mathbf{Ent}}{\xi }^{2}\le \frac{1}{2}\mathbf{E}{\int _{0}^{\infty }}\frac{1}{M_{t-}}d\big\langle {M}^{c}\big\rangle _{t}=\frac{1}{2}\mathbf{E}{\int _{0}^{\infty }}\frac{{\eta _{t}^{2}}}{M_{t}}dt=\frac{1}{2}\mathbf{E}{\int _{0}^{\infty }}\frac{{\eta _{t}^{2}}}{\mathbf{E}[{\xi }^{2}|\mathcal{F}_{t}]}dt.\]
Note the extra factor $1/2$, which appears because the martingale M is continuous.

Next, recall the Ocone representation [10] for the process $\{\eta _{t}\}$, which is valid if ζ possesses the Malliavin derivative $D\zeta =\{D_{t}\zeta ,t\ge 0\}$:
(4)
\[\eta _{t}=\mathbf{E}[D_{t}\zeta |\mathcal{F}_{t}],\hspace{1em}t\ge 0.\]
We omit the details concerning the Malliavin calculus, referring the reader, if necessary, to [9]. Because the Malliavin derivative possesses the chain rule, we have
(5)
\[\eta _{t}=2\mathbf{E}[\xi D_{t}\xi |\mathcal{F}_{t}],\hspace{1em}t\ge 0.\]
By the Cauchy inequality,
\[{\eta _{t}^{2}}=4{\big(\mathbf{E}[\xi D_{t}\xi |\mathcal{F}_{t}]\big)}^{2}\le 4\mathbf{E}\big[{\xi }^{2}\big|\mathcal{F}_{t}\big]\mathbf{E}\big[{(D_{t}\xi )}^{2}\big|\mathcal{F}_{t}\big],\]
and consequently the following log-Sobolev-type inequality holds:
(6)
\[\operatorname{\mathbf{Ent}}{\xi }^{2}\le 2\mathbf{E}{\int _{0}^{\infty }}\mathbf{E}\big[{(D_{t}\xi )}^{2}\big|\mathcal{F}_{t}\big]\hspace{0.1667em}dt=2\mathbf{E}\| D\xi {\| _{H}^{2}},\]
where $D\xi $ is considered as a random element in $H=L_{2}(0,\infty )$. By a proper approximation procedure one can show that (6) holds for every $\xi \in L_{2}(\varOmega ,\mathbf{P})$ that has a Malliavin derivative $D\xi \in L_{2}(\varOmega ,\mathbf{P},H)$.

The previous example is classic and well known. The next one apparently is new, which is a bit surprising because the main ingredients therein (the Malliavin calculus on the Poisson space and the respective analogue of the Clark–Ocone representation (4), (5)) are well known (cf. [2, 5]).
Example 2 (Log-Sobolev inequality on the Poisson path space).
Let $N_{t}$, $t\ge 0$, be a Poisson process with intensity λ, and let $\mathcal{F}=\sigma (N)$. Denote by $\tau _{k},\hspace{0.1667em}k\ge 1$, the moments of consequent jumps of the process N, and by $\mathcal{F}_{t}=\sigma (N_{s},s\le t)$, $t\ge 0$, the natural filtration for N. For any variable of the form
\[\xi =F(\tau _{1},\dots ,\tau _{n})\]
with some $n\ge 1$ and some compactly supported $F\in {C}^{1}({\mathbb{R}}^{n})$, define the random element $D\xi $ in $H=L_{2}(0,\infty )$ by
\[D_{t}\xi =-\sum \limits_{k=1}^{n}\partial _{k}F(\tau _{1},\dots ,\tau _{n})1_{t\le \tau _{k}}.\]
Denote by the same symbol D the closure of D, considered as an unbounded operator $L_{2}(\varOmega ,\mathbf{P})\to L_{2}(\varOmega ,\mathbf{P},H)$. Then the following analogue of the Clark–Ocone representation (4), (5) is available ([5]): for every ζ that possesses the stochastic derivative $D\zeta $, the following martingale representation holds:
\[\zeta =\mathbf{E}\zeta +\frac{1}{\lambda }{\int _{0}^{\infty }}\eta _{s}\hspace{0.1667em}d\tilde{N}_{s},\]
where $\tilde{N}_{t}=N_{t}-\lambda t$ denotes the compensated Poisson process corresponding to N, and $\{\eta _{t}\}$ is the projection in $L_{2}(\varOmega ,\mathbf{P},H)$ of $D\zeta $ on the subspace generated by the $\{\mathcal{F}_{t}\}$-predictable processes.

2 Trimmed regions on ${\mathbb{R}}^{d}$ and associated integral bounds for the entropy
Let μ be a probability measure on ${\mathbb{R}}^{d}$ with Borel σ-algebra $\mathcal{B}({\mathbb{R}}^{d})$. Our further aim is to apply the general martingale bound from Theorem 1 in the particular setting $(\varOmega ,\mathcal{F},\mathbf{P})=({\mathbb{R}}^{d},\mathcal{B}({\mathbb{R}}^{d}),\mu )$. To this end, we first construct a filtration $\{\mathcal{F}_{t},t\in [0,1]\}$.
In what follows, we denote $\mathcal{N}_{\mu }=\{A\in \mathcal{F}:\mu (A)=0\}$ (the class of μ-null Borel sets).
Fix a family $\{D_{t},t\in [0,1]\}$ of closed subsets of ${\mathbb{R}}^{d}$ such that:

(i) $D_{s}\subseteq D_{t}$, $s\le t$;

(ii) $\mu (D_{0})=0$ and $\bigcup _{t<1}D_{t}={\mathbb{R}}^{d}$;

(iii) the family is continuous in t, that is, $D_{t}=\bigcap _{s>t}D_{s}$ for every $t<1$, and $D_{t}$ coincides with the closure of $\bigcup _{s<t}D_{s}$ for every $t>0$.

We call the sets $D_{t},\hspace{0.1667em}t\in [0,1]$, trimmed regions, following the terminology used frequently in multivariate analysis (cf. [7]). Given the family $\{D_{t}\}$, we define the respective trimmed filtration $\{\mathcal{F}_{t}\}$ by the following convention. Denote $Q_{t}={\mathbb{R}}^{d}\setminus D_{t}$. Then, by definition, a set $A\in \mathcal{F}$ belongs to $\mathcal{F}_{t}$ if either $A\cap Q_{t}\in \mathcal{N}_{\mu }$ or $Q_{t}\setminus A\in \mathcal{N}_{\mu }$.
By the construction, $\mathbb{F}=\{\mathcal{F}_{t}\}$ is complete. It is also clear that, by property (ii) of the family $\{D_{t}\}$, the σ-algebra $\mathcal{F}_{0}$ is degenerate and, by property (iii), the filtration $\mathbb{F}$ is continuous. Hence, we can apply Theorem 1.
Fix a Borel-measurable function $g:{\mathbb{R}}^{d}\to {\mathbb{R}}^{+}$ that is square-integrable w.r.t. μ. Consider it as a random variable on $(\varOmega ,\mathcal{F},\mathbf{P})=({\mathbb{R}}^{d},\mathcal{B}({\mathbb{R}}^{d}),\mu )$ and define
\[g_{t}=\mathbf{E}[g|\mathcal{F}_{t}],\hspace{1em}t\in [0,1].\]
Since the σ-algebra $\mathcal{F}_{t}$ possesses an explicit description, we can calculate every $g_{t}$ directly; namely, for $t>0$ and μ-a.a. x, we have
(8)
\[g_{t}(x)=\left\{\begin{array}{l@{\hskip10.0pt}l}g(x),& x\in D_{t},\\{} G_{t},& x\in Q_{t},\end{array}\right.\]
where we denote
\[G_{t}=\frac{1}{\mu (Q_{t})}\int _{Q_{t}}g(y)\hspace{0.1667em}\mu (dy).\]
Note that $\mu (Q_{t})>0$ for $t<1$ and the function $G:[0,1)\to {\mathbb{R}}^{+}$ is continuous. In what follows, we consider the modification of the process $\{g_{t}\}$ defined by (8) for every $x\in {\mathbb{R}}^{d}$. Its trajectories can be described as follows. Denote
\[\tau (x)=\inf \{t:x\in D_{t}\};\]
then by property (iii) of the family $\{D_{t}\}$ we have $\tau (x)=\min \{t:x\in D_{t}\}$, and by property (ii) we have $\tau (x)<1,x\in {\mathbb{R}}^{d}$, and $\tau (x)=0\Leftrightarrow x\in D_{0}$. Then, for a fixed $x\in {\mathbb{R}}^{d}$, we have
\[g_{t}(x)=\left\{\begin{array}{l@{\hskip10.0pt}l}G_{t},& t<\tau (x),\\{} g(x),& t\ge \tau (x),\end{array}\right.\]
which is a càdlàg function because $\{G_{t}\}$ is continuous on $[0,1)$.

Theorem 2.
Let g be a Borel-measurable function $g:{\mathbb{R}}^{d}\to {\mathbb{R}}^{+}$, square-integrable w.r.t. μ, and let $\{D_{t}\}$ be a family of trimmed regions that satisfies (i)–(iii). Then
\[\operatorname{\mathbf{Ent}}_{\mu }g\le \int _{{\mathbb{R}}^{d}}\frac{{(g(x)-G_{\tau (x)})}^{2}}{G_{\tau (x)}}\hspace{0.1667em}\mu (dx).\]
Proof.
We have already verified the assumptions of Theorem 1: the filtration $\{\mathcal{F}_{t}\}$ is complete and right continuous, and the square-integrable martingale $\{g_{t}\}$ has càdlàg trajectories. Because $g_{1}=g$ a.s. and $\mathcal{F}_{0}$ is degenerate, by Theorem 1 we have the bound
\[\operatorname{\mathbf{Ent}}_{\mu }g\le \mathbf{E}{\int _{0}^{1}}\frac{1}{g_{t-}}d\langle g\rangle _{t}.\]
Hence, we only have to specify the integral in the right-hand side of this bound. Namely, our aim is to prove that
(11)
\[\mathbf{E}{\int _{0}^{1}}\frac{1}{g_{t-}}d\langle g\rangle _{t}=\int _{{\mathbb{R}}^{d}}\frac{{(g(x)-G_{\tau (x)})}^{2}}{G_{\tau (x)}}\mu (dx).\]
First, we observe the following.

Lemma 1.
Let $0\le s<t<1$, and let α be a bounded $\mathcal{F}_{s}$-measurable random variable. Then
\[\mathbf{E}\big[\alpha \big(\langle g\rangle _{t}-\langle g\rangle _{s}\big)\big]=A\int _{D_{t}\setminus D_{s}}{\big(g(x)-G_{\tau (x)}\big)}^{2}\hspace{0.1667em}\mu (dx),\]
where A denotes the constant that α equals on $Q_{s}$ μ-a.s.

Proof.
By the definition of $\langle g\rangle $, the process ${g}^{2}-\langle g\rangle $ is a martingale; hence,
\[\mathbf{E}\big[\alpha \big(\langle g\rangle _{t}-\langle g\rangle _{s}\big)\big]=\mathbf{E}\big[\alpha \big({g_{t}^{2}}-{g_{s}^{2}}\big)\big]=\mathbf{E}\big[\alpha \big(\mathbf{E}\big({g_{t}^{2}}\big|\mathcal{F}_{s}\big)-{g_{s}^{2}}\big)\big].\]
We have
\[{g_{s}^{2}}(x)=\left\{\begin{array}{l@{\hskip10.0pt}l}{g}^{2}(x),& x\in D_{s},\\{} {G_{s}^{2}},& x\in Q_{s},\end{array}\right.\hspace{1em}{g_{t}^{2}}(x)=\left\{\begin{array}{l@{\hskip10.0pt}l}{g}^{2}(x),& x\in D_{t},\\{} {G_{t}^{2}},& x\in Q_{t},\end{array}\right.\]
and applying formula (8) with $g={g_{t}^{2}}$ and $t=s$, we get
\[\mathbf{E}\big({g_{t}^{2}}\big|\mathcal{F}_{s}\big)(x)-{g_{s}^{2}}(x)=\left\{\begin{array}{l@{\hskip10.0pt}l}0,& x\in D_{s},\\{} \frac{H_{t,s}}{\mu (Q_{s})},& x\in Q_{s},\end{array}\right.\]
where
\[H_{t,s}=\bigg(\int _{D_{t}\setminus D_{s}}\big({g}^{2}(x)-{G_{s}^{2}}\big)\hspace{0.1667em}\mu (dx)+\int _{Q_{t}}\big({G_{t}^{2}}-{G_{s}^{2}}\big)\hspace{0.1667em}\mu (dx)\bigg).\]
Because α is $\mathcal{F}_{s}$-measurable, it equals a constant on $Q_{s}$ μ-a.s. Denote this constant by A; then the previous calculation gives
\[\mathbf{E}\big[\alpha \big(\langle g\rangle _{t}-\langle g\rangle _{s}\big)\big]=AH_{t,s}.\]
Write $H_{t,s}$ in the form
\[H_{t,s}=\int _{D_{t}\setminus D_{s}}{g}^{2}(x)\hspace{0.1667em}\mu (dx)+\mu (Q_{t}){G_{t}^{2}}-\mu (Q_{s}){G_{s}^{2}}.\]
Denote
\[\mu _{t}=\mu (Q_{t}),\hspace{2em}I_{t}=\int _{Q_{t}}g(x)\hspace{0.1667em}\mu (dx);\]
then
\[G_{t}=\frac{I_{t}}{\mu _{t}}.\]
Observe that the functions $\mu _{t},t\in [0,1]$, and $I_{t},t\in [0,1]$, are continuous functions of bounded variation, and $\mu _{t}>0,\hspace{0.1667em}t<1$. Then
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \mu (Q_{t}){G_{t}^{2}}-\mu (Q_{s}){G_{s}^{2}}& \displaystyle ={\int _{s}^{t}}d\bigg(\frac{{I_{v}^{2}}}{\mu _{v}}\bigg)={\int _{s}^{t}}\bigg(-\frac{{I_{v}^{2}}}{{\mu _{v}^{2}}}d\mu _{v}+2\frac{I_{v}}{\mu _{v}}dI_{v}\bigg)\\{} & \displaystyle ={\int _{s}^{t}}\big(-{G_{v}^{2}}\hspace{0.1667em}d\mu _{v}+2G_{v}\hspace{0.1667em}dI_{v}\big).\end{array}\]
It is easy to show that
(12)
\[-{\int _{s}^{t}}{G_{v}^{2}}\hspace{0.1667em}d\mu _{v}=\int _{D_{t}\setminus D_{s}}{G_{\tau (x)}^{2}}\hspace{0.1667em}\mu (dx).\]
Indeed, because G is continuous on $[0,1)$, the left-hand side integral can be approximated by the integral sum
\[-\sum \limits_{k=1}^{m}{G_{v_{k}}^{2}}(\mu _{v_{k}}-\mu _{v_{k-1}}),\]
where $s=v_{0}<\cdots <v_{m}=t$ is some partition of $[s,t]$. This sum equals
\[\sum \limits_{k=1}^{m}\int _{D_{v_{k}}\setminus D_{v_{k-1}}}{G_{v_{k}}^{2}}\hspace{0.1667em}\mu (dx).\]
For $x\in D_{v_{k}}\setminus D_{v_{k-1}}$, we have $\tau (x)\in [v_{k-1},v_{k}]$. Hence, this sum equals
\[\sum \limits_{k=1}^{m}\int _{D_{v_{k}}\setminus D_{v_{k-1}}}{G_{\tau (x)}^{2}}\hspace{0.1667em}\mu (dx)=\int _{D_{t}\setminus D_{s}}{G_{\tau (x)}^{2}}\hspace{0.1667em}\mu (dx)\]
up to a residue term that is dominated by
\[\underset{u,v\in [s,t],|u-v|\le \max _{k}(v_{k}-v_{k-1})}{\sup }\big|{G_{u}^{2}}-{G_{v}^{2}}\big|\]
and tends to zero as the size of the partition tends to zero. This proves (12). Similarly, we can show that
\[{\int _{s}^{t}}G_{v}\hspace{0.1667em}dI_{v}=-\int _{D_{t}\setminus D_{s}}G_{\tau (x)}g(x)\hspace{0.1667em}\mu (dx).\]
We can summarize this calculation as follows:
\[\mathbf{E}\big[\alpha \big(\langle g\rangle _{t}-\langle g\rangle _{s}\big)\big]=A\int _{D_{t}\setminus D_{s}}{\big(g(x)-G_{\tau (x)}\big)}^{2}\hspace{0.1667em}\mu (dx).\]
Because $\alpha (x)=A$ for μ-a.a. $x\notin D_{s}$, this completes the proof. □

Let us continue with the proof of (11). Assume first that $g\ge c$ with some $c>0$. Then $g_{t}\ge c$, and consequently the process $1/g_{t-}$ is left continuous and bounded. In addition, the function $G_{t}=I_{t}/\mu _{t}$ is bounded on every segment $[0,T]\subset [0,1)$.
Fix $T<1$ and take a sequence $\{{\lambda }^{n}\}$ of dyadic partitions of $[0,T]$,
\[{\lambda }^{n}=\big\{{t_{k}^{n}},k=0,\dots ,{2}^{n}\big\},\hspace{1em}{t_{k}^{n}}=\frac{Tk}{{2}^{n}},\]
and define
\[{g_{t}^{n}}=g_{0}1_{t=0}+\sum \limits_{k=1}^{{2}^{n}}g_{{t_{k-1}^{n}}}1_{t\in ({t_{k-1}^{n}},{t_{k}^{n}}]}.\]
For every fixed $t>0$, the value ${g_{t}^{n}}$ equals the value of the process $g_{\cdot }$ at some (dyadic) point $t_{n}<t$, and $t_{n}\to t-$. Hence,
\[\frac{1}{{g_{t}^{n}}}\to \frac{1}{g_{t-}},\hspace{1em}n\to \infty ,\]
pointwise. In addition, because of the additional assumption $g\ge c$, this sequence is bounded by $1/c$. Hence, by the dominated convergence theorem,
\[\mathbf{E}{\int _{0}^{T}}\frac{1}{g_{t-}}d\langle g\rangle _{t}=\underset{n\to \infty }{\lim }\mathbf{E}\sum \limits_{k=1}^{{2}^{n}}\frac{1}{g_{{t_{k-1}^{n}}}}\big(\langle g\rangle _{{t_{k}^{n}}}-\langle g\rangle _{{t_{k-1}^{n}}}\big);\]
here we take into account that the point $t=0$ in the left-hand side integral is negligible because $g_{t}\to \mathbf{E}g,\hspace{0.1667em}t\to 0+$, in $L_{2}$, and consequently $\langle g\rangle _{t}\to 0$, $t\to 0+$, in $L_{1}$. By Lemma 1,
\[\mathbf{E}\sum \limits_{k=1}^{{2}^{n}}\frac{1}{g_{{t_{k-1}^{n}}}}\big(\langle g\rangle _{{t_{k}^{n}}}-\langle g\rangle _{{t_{k-1}^{n}}}\big)=\mathbf{E}\sum \limits_{k=1}^{{2}^{n}}\int _{D_{{t_{k}^{n}}}\setminus D_{{t_{k-1}^{n}}}}\frac{{(g(x)-G_{\tau (x)})}^{2}}{G_{{t_{k-1}^{n}}}}\mu (dx);\]
recall that $g_{{t_{k-1}^{n}}}(x)=G_{{t_{k-1}^{n}}}$ for $x\notin D_{{t_{k-1}^{n}}}$. Next, for $x\in D_{{t_{k}^{n}}}\setminus D_{{t_{k-1}^{n}}}$, we have $|\tau (x)-{t_{k-1}^{n}}|\le {2}^{-n}$. Because $G_{t},t\in [0,T]$, is uniformly continuous and separated from zero, and $G_{\tau (x)},x\in D_{T}$ is bounded, we obtain that
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \mathbf{E}{\int _{0}^{T}}\frac{1}{g_{t-}}d\langle g\rangle _{t}& \displaystyle =\underset{n\to \infty }{\lim }\mathbf{E}\sum \limits_{k=1}^{{2}^{n}}\int _{D_{{t_{k}^{n}}}\setminus D_{{t_{k-1}^{n}}}}\frac{{(g(x)-G_{\tau (x)})}^{2}}{G_{{t_{k-1}^{n}}}}\mu (dx)\\{} & \displaystyle =\int _{D_{T}}\frac{{(g(x)-G_{\tau (x)})}^{2}}{G_{\tau (x)}}\mu (dx).\end{array}\]
Taking $T\to 1-$ and applying the monotone convergence theorem to both sides of the above identity, we get (11).

To remove the additional assumption $g\ge c$, consider the family ${g_{t}^{n}}=g_{t}+1/n$. Then $\langle {g}^{n}\rangle =\langle g\rangle $, ${g}^{n}(x)-{G_{\tau (x)}^{n}}=g(x)-G_{\tau (x)}$, ${g_{t-}^{n}}=g_{t-}+(1/n)$, and ${G_{\tau (x)}^{n}}=G_{\tau (x)}+(1/n)$. Hence, we can write (11) for ${g}^{n}$, apply the monotone convergence theorem to both sides of this identity, and get (11) for g. □
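The bound of Theorem 2 can be illustrated numerically in the simplest one-dimensional setting: μ the standard Gaussian measure and $D_{t}$ the symmetric quantile segments, so that $Q_{\tau (x)}=\{y:|y|>|x|\}$. A minimal sketch (the positive test function is an illustrative choice, taken even so that the tail integral can be computed over $(|x|,\infty )$):

```python
import numpy as np
from scipy import integrate, stats

phi = stats.norm.pdf
g = lambda x: np.cos(x) ** 2 + 1.0                   # positive, even test function

# For N(0,1) with D_t = [-b_t, b_t], the region entered at time tau(x) has
# complement Q_{tau(x)} = {y : |y| > |x|}; G_{tau(x)} is the mean of g there.
def G_tau(x):
    tail = 2 * (1 - stats.norm.cdf(abs(x)))          # mu(Q_{tau(x)})
    num = 2 * integrate.quad(lambda y: g(y) * phi(y), abs(x), 6)[0]  # g is even
    return num / tail

Eg = integrate.quad(lambda x: g(x) * phi(x), -6, 6)[0]
ent = integrate.quad(lambda x: g(x) * np.log(g(x)) * phi(x), -6, 6)[0] \
      - Eg * np.log(Eg)
rhs = integrate.quad(lambda x: (g(x) - G_tau(x)) ** 2 / G_tau(x) * phi(x),
                     -6, 6)[0]

assert 0 <= ent <= rhs    # integral entropy bound of Theorem 2
```

The truncation to $[-6,6]$ is an assumption of the sketch; the Gaussian tails make the neglected mass negligible at this accuracy.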
3 One corollary: a weighted log-Sobolev inequality on $\mathbb{R}$
In this section, we show how the integral bound for the entropy established in Theorem 2 can be used to obtain weighted log-Sobolev inequalities. Consider a continuous probability measure μ on $(\mathbb{R},\mathcal{B}(\mathbb{R}))$ and denote by $p_{\mu }$ the density of its absolutely continuous part. Fix a family of segments $D_{t}=[a_{t},b_{t}],t\in [0,1)$, where $a_{0}=b_{0}$, the function $a_{\cdot }$ is continuous and decreasing to $-\infty $ as $t\to 1-$, and the function $b_{\cdot }$ is continuous and increasing to $+\infty $ as $t\to 1-$. Then the family
\[\big\{D_{t},t\in [0,1)\big\},\hspace{2em}D_{1}=\mathbb{R},\]
satisfies the assumptions imposed before. Hence, Theorem 2 is applicable.

We call a function $f:\mathbb{R}\to \mathbb{R}$ symmetric w.r.t. the family $\{D_{t}\}$ if
\[f(a_{t})=f(b_{t}),\hspace{1em}t\in [0,1).\]
In the following proposition, we apply Theorem 2 to $g={f}^{2}$, where f is smooth and symmetric.
Proposition 1.
Let $f:\mathbb{R}\to \mathbb{R}$ be smooth and symmetric w.r.t. the family $\{D_{t}\}$. Then
\[\operatorname{\mathbf{Ent}}_{\mu }{f}^{2}\le 4\int _{\mathbb{R}}{\big(V(z){f^{\prime }}(z)\big)}^{2}\log \bigg(\frac{1}{\mu _{\tau (z)}}\bigg)\hspace{0.1667em}\mu (dz),\]
where
\[V(z)=\frac{1}{p_{\mu }(z)}\left\{\begin{array}{l@{\hskip10.0pt}l}\mu \big((b_{\tau (z)},\infty )\big),& z>b_{0},\\{} \mu \big((-\infty ,a_{\tau (z)})\big),& z<a_{0}.\end{array}\right.\]

Proof.
Write
\[g(x)-G_{\tau (x)}=\frac{1}{\mu _{\tau (x)}}\int _{Q_{\tau (x)}}\big(g(x)-g(y)\big)\hspace{0.1667em}\mu (dy)=\frac{1}{\mu _{\tau (x)}}\int _{Q_{\tau (x)}}{\int _{x}^{y}}{g^{\prime }}(z)\hspace{0.1667em}dz\hspace{0.1667em}\mu (dy).\]
Let us analyze the expression in the right-hand side. Observe that now $Q_{\tau (x)}$ is the union of two intervals $(-\infty ,a_{\tau (x)})$ and $(b_{\tau (x)},+\infty )$. Denote
\[{Q_{t}^{+}}=(b_{t},\infty ),\hspace{2em}{Q_{t}^{-}}=(-\infty ,a_{t}),\hspace{2em}{\mu _{t}^{\pm }}=\mu \big({Q_{t}^{\pm }}\big).\]
The point x equals either $a_{\tau (x)}$ or $b_{\tau (x)}$; hence, because $g={f}^{2}$ is symmetric,
\[g(x)=g(a_{\tau (x)})=g(b_{\tau (x)}).\]
Then we have
\[{\int _{x}^{y}}{g^{\prime }}(z)\hspace{0.1667em}dz=\left\{\begin{array}{l@{\hskip10.0pt}l}{\textstyle\int _{b_{\tau (x)}}^{y}}{g^{\prime }}(z)\hspace{0.1667em}dz,& y\in {Q_{\tau (x)}^{+}},\\{} {\textstyle\int _{a_{\tau (x)}}^{y}}{g^{\prime }}(z)\hspace{0.1667em}dz,& y\in {Q_{\tau (x)}^{-}}.\end{array}\right.\]
Consequently,
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \big|g(x)-G_{\tau (x)}\big|& \displaystyle \le \frac{1}{\mu _{\tau (x)}}\bigg[\int _{{Q_{\tau (x)}^{+}}}\int _{{Q_{\tau (x)}^{+}},\tau (z)\le \tau (y)}\big|{g^{\prime }}(z)\big|\hspace{0.1667em}dz\hspace{0.1667em}\mu (dy)\\{} & \displaystyle \hspace{1em}+\int _{{Q_{\tau (x)}^{-}}}\int _{{Q_{\tau (x)}^{-}},\tau (z)\le \tau (y)}\big|{g^{\prime }}(z)\big|\hspace{0.1667em}dz\hspace{0.1667em}\mu (dy)\bigg].\end{array}\]
Using Fubini’s theorem, we get
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \big|g(x)-G_{\tau (x)}\big|& \displaystyle \le \frac{1}{\mu _{\tau (x)}}\bigg[\int _{{Q_{\tau (x)}^{+}}}{\mu _{\tau (z)}^{+}}\big|{g^{\prime }}(z)\big|\hspace{0.1667em}dz+\int _{{Q_{\tau (x)}^{-}}}{\mu _{\tau (z)}^{-}}\big|{g^{\prime }}(z)\big|\hspace{0.1667em}dz\bigg]\\{} & \displaystyle \le \frac{1}{\mu _{\tau (x)}}\int _{Q_{\tau (x)}}V(z)\big|{g^{\prime }}(z)\big|\hspace{0.1667em}\mu (dz).\end{array}\]
Because $g={f}^{2}$ and hence ${g^{\prime }}=2f{f^{\prime }}$, by the Cauchy inequality we then have
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle {\big(g(x)-G_{\tau (x)}\big)}^{2}\\{} & \displaystyle \hspace{1em}\le 4\bigg(\frac{1}{\mu _{\tau (x)}}\int _{Q_{\tau (x)}}{\big(V(z){f^{\prime }}(z)\big)}^{2}\hspace{0.1667em}\mu (dz)\bigg)\bigg(\frac{1}{\mu _{\tau (x)}}\int _{Q_{\tau (x)}}{\big(f(z)\big)}^{2}\hspace{0.1667em}\mu (dz)\bigg)\\{} & \displaystyle \hspace{1em}=4\bigg(\frac{1}{\mu _{\tau (x)}}\int _{Q_{\tau (x)}}{\big(V(z){f^{\prime }}(z)\big)}^{2}\hspace{0.1667em}\mu (dz)\bigg)G_{\tau (x)}.\end{array}\]
Observe that
\[z\in Q_{\tau (x)}\hspace{1em}\Leftrightarrow \hspace{1em}\tau (z)>\tau (x)\hspace{1em}\Leftrightarrow \hspace{1em}x\in D_{\tau (z)}\setminus \{a_{\tau (z)},b_{\tau (z)}\}.\]
Hence, by Theorem 2 and Fubini’s theorem we have
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \operatorname{\mathbf{Ent}}_{\mu }\big({f}^{2}\big)& \displaystyle \le 4\int _{\mathbb{R}}\bigg(\frac{1}{\mu _{\tau (x)}}\int _{Q_{\tau (x)}}{\big(V(z){f^{\prime }}(z)\big)}^{2}\mu (dz)\bigg)\hspace{0.1667em}\mu (dx)\\{} & \displaystyle =4\int _{\mathbb{R}}\bigg(\int _{D_{\tau (z)}}\frac{\mu (dx)}{\mu _{\tau (x)}}\bigg){\big(V(z){f^{\prime }}(z)\big)}^{2}\hspace{0.1667em}\mu (dz).\end{array}\]
Similarly to the proof of (12), we can show that
\[\int _{D_{t}}\frac{\mu (dx)}{\mu _{\tau (x)}}=-\log \mu _{s}{\big|_{s=0}^{s=t}}=\log \bigg(\frac{1}{\mu _{t}}\bigg);\]
the last identity holds because $\mu _{0}=1$. This completes the proof. □

Next, we develop a symmetrization procedure in order to remove the restriction for f to be symmetric. For any $x\ne a_{0}$, one border point of the segment $D_{\tau (x)}$ equals x; let us denote by $s(x)$ the other border point. Denote also $s(a_{0})=a_{0}$. Define the σ-algebra $\hat{\mathcal{F}}$ of symmetric sets $A\in \mathcal{F}$, that is, such that $x\in A\Leftrightarrow s(x)\in A$. For a function $f\in L_{2}(\mathbb{R},\mu )$, consider its $L_{2}$-symmetrization
\[\hat{f}=\sqrt{\mathbf{E}_{\mu }\big[{f}^{2}\big|\hat{\mathcal{F}}\big]}.\]
It can be seen easily that there exists a measurable function $p:\mathbb{R}\to [0,1]$ such that, for μ-a.a. $x\in \mathbb{R}$,
\[\mathbf{E}_{\mu }\big[h\big|\hat{\mathcal{F}}\big](x)=\int _{\mathbb{R}}h\hspace{0.1667em}d\nu _{x},\hspace{1em}h\in L_{1}(\mathbb{R},\mu ),\]
where we denote
\[\nu _{x}=p(x)\delta _{x}+\big(1-p(x)\big)\delta _{s(x)}.\]
We have
\[\mathbf{E}_{\mu }{f}^{2}=\mathbf{E}_{\mu }{(\hat{f})}^{2}\]
and, consequently,
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \operatorname{\mathbf{Ent}}_{\mu }{f}^{2}-\operatorname{\mathbf{Ent}}_{\mu }{(\hat{f})}^{2}& \displaystyle =\mathbf{E}_{\mu }{f}^{2}\log {f}^{2}-\mathbf{E}_{\mu }{(\hat{f})}^{2}\log {(\hat{f})}^{2}\\{} & \displaystyle =\mathbf{E}_{\mu }\big(\mathbf{E}_{\mu }\big[{f}^{2}\log {f}^{2}-{(\hat{f})}^{2}\log {(\hat{f})}^{2}\big|\hat{\mathcal{F}}\big]\big)\\{} & \displaystyle =\int _{\mathbb{R}}\big(\operatorname{\mathbf{Ent}}_{\nu _{x}}{f}^{2}\big)\hspace{0.1667em}\mu (dx).\end{array}\]
It is well known (cf. [8]) that for a Bernoulli measure $\nu =p\delta _{1}+q\delta _{-1}$ ($p+q=1$), the following discrete analogue of the log-Sobolev inequality holds:
\[\operatorname{\mathbf{Ent}}_{\nu }{f}^{2}\le C_{p}{(Df)}^{2},\hspace{2em}C_{p}=\left\{\begin{array}{l@{\hskip10.0pt}l}pq\frac{\log p-\log q}{p-q},& p\ne q,\\{} \frac{1}{2},& p=q,\end{array}\right.\]
where we denote $Df=f(1)-f(-1)$. This yields the bound
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \operatorname{\mathbf{Ent}}_{\mu }{f}^{2}-\operatorname{\mathbf{Ent}}_{\mu }{(\hat{f})}^{2}& \displaystyle \le \int _{\mathbb{R}}C_{p(x)}{\big(f(x)-f\big(s(x)\big)\big)}^{2}\hspace{0.1667em}\mu (dx)\\{} & \displaystyle =\int _{\mathbb{R}}C_{p(x)}{\bigg(\int _{D_{\tau (x)}}{f^{\prime }}(z)\hspace{0.1667em}dz\bigg)}^{2}\hspace{0.1667em}\mu (dx).\end{array}\]
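The two-point inequality used above is straightforward to test numerically; the sketch below sweeps grids of values of p and of $(f(1),f(-1))$ (the grids are arbitrary illustrative choices):

```python
import math

def C(p):
    """The constant C_p for the Bernoulli measure p*delta_1 + (1-p)*delta_{-1}."""
    q = 1 - p
    return 0.5 if abs(p - q) < 1e-12 else p * q * (math.log(p) - math.log(q)) / (p - q)

def bernoulli_ent(p, f1, fm1):
    """Ent_nu f^2 for nu = p*delta_1 + (1-p)*delta_{-1}."""
    h = lambda u: u * math.log(u) if u > 0 else 0.0
    m = p * f1 ** 2 + (1 - p) * fm1 ** 2
    return p * h(f1 ** 2) + (1 - p) * h(fm1 ** 2) - h(m)

# ratio Ent / (C_p * (Df)^2) over the grid; it must never exceed 1
ratios = [
    bernoulli_ent(p, f1, fm1) / (C(p) * (f1 - fm1) ** 2)
    for p in (0.1, 0.25, 0.5, 0.75, 0.9)
    for f1 in (0.2, 1.0, 3.0)
    for fm1 in (0.5, 2.0)
]
assert max(ratios) <= 1 + 1e-12
```

The constant $C_{p}$ is sharp, so some ratios on the grid come close to 1.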
By the Cauchy inequality,
\[{\bigg(\int _{D_{\tau (x)}}{f^{\prime }}(z)\hspace{0.1667em}dz\bigg)}^{2}\le \bigg(\int _{D_{\tau (x)}}{\big({f^{\prime }}(z)\big)}^{2}\frac{{\mu _{\tau (z)}^{3/2}}}{{p_{\mu }^{2}}(z)}\mu (dz)\bigg)\bigg(\int _{D_{\tau (x)}}\frac{\mu (dz)}{{\mu _{\tau (z)}^{3/2}}}\bigg),\]
and, similarly to the proof of (12), we can show that
\[\int _{D_{\tau (x)}}\frac{\mu (dz)}{{\mu _{\tau (z)}^{3/2}}}=2\big({\mu _{\tau (x)}^{-1/2}}-1\big)<2{\mu _{\tau (x)}^{-1/2}}.\]
This yields the following bound for the difference $\operatorname{\mathbf{Ent}}_{\mu }{f}^{2}-\operatorname{\mathbf{Ent}}_{\mu }{(\hat{f})}^{2}$, formulated in terms of ${f^{\prime }}$:
\[\operatorname{\mathbf{Ent}}_{\mu }{f}^{2}-\operatorname{\mathbf{Ent}}_{\mu }{(\hat{f})}^{2}\le 2\int _{\mathbb{R}}{\big({f^{\prime }}(z)\big)}^{2}U(z)\hspace{0.1667em}\mu (dz),\]
where
\[U(z)=\frac{{\mu _{\tau (z)}^{3/2}}}{{p_{\mu }^{2}}(z)}\int _{Q_{\tau (z)}}C_{p(x)}\frac{\mu (dx)}{{\mu _{\tau (x)}^{1/2}}}.\]
Note that $C_{p}\le 1$ for any $p\in [0,1]$, and hence we have
\[U(z)\le \frac{{\mu _{\tau (z)}^{3/2}}}{{p_{\mu }^{2}}(z)}\int _{Q_{\tau (z)}}\frac{\mu (dx)}{{\mu _{\tau (x)}^{1/2}}}=\frac{2{\mu _{\tau (z)}^{2}}}{{p_{\mu }^{2}}(z)};\]
the last identity follows similarly to the proof of (12).
Assuming that the bound from Proposition 1 is applicable to $\hat{f}$ (which is yet to be studied because $\hat{f}$ may fail to be smooth), we obtain the following inequality, valid without the assumption of symmetry of f, where we put $W(x)={V}^{2}(x)\log (1/\mu _{\tau (x)})$:
(13)
\[\operatorname{\mathbf{Ent}}_{\mu }{f}^{2}\le \int _{\mathbb{R}}\big(4W(x){\big({(\hat{f})^{\prime }}(x)\big)}^{2}+2U(x){\big({f^{\prime }}(x)\big)}^{2}\big)\hspace{0.1667em}\mu (dx).\]
The right-hand side of this inequality contains the derivative of $\hat{f}$ and hence depends on the choice of the family of trimmed regions $\{D_{t}\}$. We further give a particular corollary, which appears when $\{D_{t}\}$ is the set of quantile trimmed regions. In what follows, we assume μ to possess a positive distribution density $p_{\mu }$ and choose $\{D_{t}=[a_{t},b_{t}]\}$ in the following way. Denote $q_{v}={F_{\mu }^{-1}}(v)$, that is, the quantile of μ of the level v, and put
\[a_{t}=q_{(1-t)/2},\hspace{2em}b_{t}=q_{(1+t)/2},\hspace{1em}t\in [0,1).\]
In particular, $a_{0}=b_{0}=m$, the median of μ. Denote also $\hat{F}_{\mu }=\min (F_{\mu },1-F_{\mu })$; observe that now we have
\[\mu _{\tau (x)}=2\hat{F}_{\mu }(x),\hspace{1em}x\in \mathbb{R}.\]

Theorem 3.
Let μ be a probability measure on $\mathbb{R}$ with positive distribution density $p_{\mu }$. Then, for any absolutely continuous f, we have
(14)
\[\operatorname{\mathbf{Ent}}_{\mu }{f}^{2}\le 8\int _{\mathbb{R}}{\bigg(\frac{\hat{F}_{\mu }(x)}{p_{\mu }(x)}\bigg)}^{2}\bigg(2+\log \frac{1}{2\hat{F}_{\mu }(x)}\bigg){\big({f^{\prime }}(x)\big)}^{2}\hspace{0.1667em}\mu (dx).\]

Proof.
First, observe that now the $L_{2}$-symmetrization of a function f has the form
\[\hat{f}(x)=\sqrt{\frac{{f}^{2}(x)+{f}^{2}(s(x))}{2}}.\]
This identity is evident for functions f of the form $1_{(-\infty ,{F_{\mu }^{-1}}(v))}$, $v\in (0,1/2]$, and $1_{[{F_{\mu }^{-1}}(v),\infty )}$, $v\in [1/2,1)$, and then easily extends to general f.
Next, observe that
(15)
\[F_{\mu }\big(s(x)\big)=1-F_{\mu }(x),\hspace{1em}\text{that is,}\hspace{1em}s(x)={F_{\mu }^{-1}}\big(1-F_{\mu }(x)\big),\]
and because $F_{\mu }$ is absolutely continuous and strictly increasing, $s(x)$ is absolutely continuous as well. Then $\hat{f}$ is absolutely continuous with
\[{(\hat{f})^{\prime }}(x)=\frac{f(x){f^{\prime }}(x)+f(s(x)){f^{\prime }}(s(x)){s^{\prime }}(x)}{\sqrt{2({f}^{2}(x)+{f}^{2}(s(x)))}};\]
here and below the derivatives are well defined for a.a. x. Using a standard localization/approximation procedure, we can show that Proposition 1 is applicable to any absolutely continuous function. Hence, it is applicable to $\hat{f}$, and (13) holds.

We have
\[\begin{array}{r@{\hskip0pt}l}\displaystyle {\big({(\hat{f})^{\prime }}(x)\big)}^{2}& \displaystyle \le \frac{{(f(x){f^{\prime }}(x))}^{2}+{(f(s(x)){f^{\prime }}(s(x)){s^{\prime }}(x))}^{2}}{{f}^{2}(x)+{f}^{2}(s(x))}\\{} & \displaystyle \le {\big({f^{\prime }}(x)\big)}^{2}+{\big({f^{\prime }}\big(s(x)\big){s^{\prime }}(x)\big)}^{2}.\end{array}\]
The function $W(x)$ in (13) now can be rewritten as
\[W(x)={\bigg(\frac{\hat{F}_{\mu }(x)}{p_{\mu }(x)}\bigg)}^{2}\log \frac{1}{2\hat{F}_{\mu }(x)};\]
hence,
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle \int _{\mathbb{R}}W(x){\big(\big({\hat{f}^{\prime }}(x)\big)\big)}^{2}\hspace{0.1667em}\mu (dx)\\{} & \displaystyle \hspace{1em}\le \int _{\mathbb{R}}W(x){\big({f^{\prime }}(x)\big)}^{2}\hspace{0.1667em}\mu (dx)+\int _{\mathbb{R}}W(x){\big({f^{\prime }}\big(s(x)\big)\big)}^{2}{\big({s^{\prime }}(x)\big)}^{2}\hspace{0.1667em}\mu (dx).\end{array}\]
Let us analyze the second integral in the right-hand side. By (15),
\[{s^{\prime }}(x)=-\frac{p_{\mu }(x)}{p_{\mu }(s(x))};\]
hence,
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle \int _{\mathbb{R}}W(x){\big({f^{\prime }}\big(s(x)\big)\big)}^{2}{\big({s^{\prime }}(x)\big)}^{2}\hspace{0.1667em}\mu (dx)\\{} & \displaystyle \hspace{1em}=\int _{\mathbb{R}}{\big({f^{\prime }}\big(s(x)\big)\big)}^{2}{\bigg(\frac{\hat{F}_{\mu }(x)}{p_{\mu }(s(x))}\bigg)}^{2}\log \frac{1}{2\hat{F}_{\mu }(x)}p_{\mu }(x)\hspace{0.1667em}dx.\end{array}\]
Change the variables $y=s(x)$; observe that we have $x=s(y)$ and $\hat{F}_{\mu }(x)=\hat{F}_{\mu }(y)$. Then we finally get
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle \int _{\mathbb{R}}W(x){\big({f^{\prime }}\big(s(x)\big)\big)}^{2}{\big({s^{\prime }}(x)\big)}^{2}\hspace{0.1667em}\mu (dx)\\{} & \displaystyle \hspace{1em}=\int _{\mathbb{R}}{\big({f^{\prime }}(y)\big)}^{2}{\bigg(\frac{\hat{F}_{\mu }(y)}{p_{\mu }(y)}\bigg)}^{2}\log \frac{1}{2\hat{F}_{\mu }(y)}p_{\mu }\big(s(y)\big)\hspace{0.1667em}\frac{p_{\mu }(y)}{p_{\mu }(s(y))}dy\\{} & \displaystyle \hspace{1em}=\int _{\mathbb{R}}W(y){\big({f^{\prime }}(y)\big)}^{2}\hspace{0.1667em}\mu (dy),\end{array}\]
and therefore
\[\int _{\mathbb{R}}W(x){\big({(\hat{f})^{\prime }}(x)\big)}^{2}\hspace{0.1667em}\mu (dx)\le 2\int _{\mathbb{R}}W(x){\big({f^{\prime }}(x)\big)}^{2}\hspace{0.1667em}\mu (dx).\]
Combining this bound with (13) and the inequality $U(z)\le 2{\mu _{\tau (z)}^{2}}/{p_{\mu }^{2}}(z)=8{(\hat{F}_{\mu }(z)/p_{\mu }(z))}^{2}$, we arrive at the required inequality. □