Chapter 9 Multiple Risks

9.1 Introduction

Traditionally, actuarial developments rest on the assumption of independence among the random variables involved. Notable exceptions are credibility theory and bonus-malus systems (which we will study in a later chapter), where the serial dependence between the annual numbers or amounts of claims caused by an insured is exploited to revise the premium that the insured must pay for the insurer’s coverage.

The ever-increasing complexity of insurance products and the obligation to cover events that were once excluded from coverage (such as floods or earthquakes) have highlighted the importance of considering dependence. Therefore, it seemed pertinent to dedicate a (lengthy) chapter to the study of stochastic dependence, a topic traditionally ignored in risk theory literature.

There is essentially only one way for risks to be mutually independent, namely that they do not influence each other, which is expressed mathematically by the fact that their joint distribution factorizes into the product of their marginals; there are, of course, infinitely many ways to introduce dependence into an actuarial model. For a long time, the multivariate normal distribution was the only tool used to account for dependence, in statistics as well as in finance and actuarial science. Only recently have techniques relying (implicitly or explicitly) on the multivariate normal distribution been severely criticized, and more realistic alternatives have been developed to model the dependence between random variables with arbitrary marginal distributions. The purpose of this chapter is precisely to introduce readers to the basic concepts underlying these new approaches.

9.2 Comonotonicity and Antimonotonicity

9.2.1 Fréchet Classes

Named in honor of the French mathematician Maurice Fréchet, these sets contain all probability distributions with fixed marginals. Fréchet classes are, therefore, the ideal framework for studying dependence, as two elements in the same class differ only in their correlation structure, not in their marginal behavior.

Definition 9.1 We denote \(\mathcal{F}\left( F_1, F_2 \right)\) as the set of bivariate distribution functions with marginal distribution functions \(F_1\) and \(F_2\), respectively, i.e., \[\begin{eqnarray*} \mathcal{F}\left( F_1, F_2 \right) &=& \Big\{ \text{distribution functions }F_{\boldsymbol{X}} \text{ such that } \\ && \lim_{t \rightarrow \infty} F_{\boldsymbol{X}}\left( x_1, t \right) = F_1\left( x_1 \right), \text{ for any } x_1 \in \mathbb{R} \\ && \lim_{t \rightarrow \infty} F_{\boldsymbol{X}}\left( t, x_2 \right) = F_2\left( x_2 \right), \text{ for any } x_2 \in \mathbb{R} \Big\}. \end{eqnarray*}\]

9.2.2 Fréchet Bounds

9.2.2.1 Definition

Within each Fréchet class \(\mathcal{F}\left( F_1, F_2 \right)\), two elements play a very particular role: they are the upper and lower Fréchet bounds, defined as follows.

Definition 9.2 The distribution function \(W\) defined as \[ W(x_1, x_2) = \min \left\{ F_{1}(x_{1}), F_2(x_2) \right\}, \hspace{2mm} \boldsymbol{x}\in \mathbb{R}^{2}, \] is called the upper Fréchet bound in \(\mathcal{F}\left( F_1, F_2 \right)\). Similarly, the distribution function \(M\) defined as \[ M(x_1, x_2) = \max \left\{ F_{1}(x_{1}) + F_2(x_2) - 1, 0 \right\}, \hspace{2mm} \boldsymbol{x}\in \mathbb{R}^{2}, \] is called the lower Fréchet bound in \(\mathcal{F}\left( F_1, F_2 \right)\).

The term “Fréchet bound” comes from the following result, which shows that any distribution function \(F_{\boldsymbol{X}}\) in \(\mathcal{F}\left( F_1, F_2 \right)\) is bounded from below by \(M\) and from above by \(W\). The Fréchet bounds thus delimit the set \(\mathcal{F}\left( F_1, F_2 \right)\).

Proposition (Fréchet bounds). The Fréchet class \(\mathcal{F}\left( F_1, F_2 \right)\) is bounded in the sense that for all \(F_{\boldsymbol{X}} \in \mathcal{F}\left( F_1, F_2 \right)\), \[ M(\boldsymbol{x}) \leq F_{\boldsymbol{X}}(\boldsymbol{x}) \leq W(\boldsymbol{x}) \text{ for any } \boldsymbol{x}\in \mathbb{R}^2. \]

Proof. The announced result comes from the following inequality for any random events \(A_1\) and \(A_2\): \[\begin{eqnarray} \label{IneEvent2} \max\Big\{{\Pr}[A_1] + {\Pr}[A_2] - 1, 0\Big\} &\leq& {\Pr}[A_1 \cap A_2] \nonumber \\ &\leq& \min\Big\{{\Pr}[A_1], {\Pr}[A_2]\Big\}. \end{eqnarray}\] The first of these inequalities comes from \[ 1 \geq {\Pr}[A_1 \cup A_2] = {\Pr}[A_1] + {\Pr}[A_2] - {\Pr}[A_1 \cap A_2], \] while the second is explained by the fact that \((A_1 \cap A_2) \subseteq A_1\) and \((A_1 \cap A_2) \subseteq A_2\). Applying \(\eqref{IneEvent2}\) with \(A_i = \{X_i \leq x_i\}\), \(i=1,2\), yields the desired result.
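To make these bounds concrete, here is a minimal numerical sketch (in Python with numpy; the exponential marginals and the grid are illustrative choices, not taken from the text) that evaluates \(M\), \(W\) and the independence distribution function \(F_1F_2\) on a grid and checks the ordering stated above.

```python
import numpy as np

# Illustrative exponential marginals F_i(x) = 1 - exp(-x / b_i)
b1, b2 = 1.0, 2.0
F1 = lambda x: 1.0 - np.exp(-x / b1)
F2 = lambda x: 1.0 - np.exp(-x / b2)

# Fréchet bounds of F(F1, F2), with the chapter's convention W = upper, M = lower
W = lambda x1, x2: np.minimum(F1(x1), F2(x2))
M = lambda x1, x2: np.maximum(F1(x1) + F2(x2) - 1.0, 0.0)
indep = lambda x1, x2: F1(x1) * F2(x2)

x1, x2 = np.meshgrid(np.linspace(0.0, 10.0, 201), np.linspace(0.0, 10.0, 201))
assert np.all(M(x1, x2) <= indep(x1, x2) + 1e-12)   # M <= F1 F2
assert np.all(indep(x1, x2) <= W(x1, x2) + 1e-12)   # F1 F2 <= W
print("M <= F1*F2 <= W verified on the grid")
```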

Remark. Regardless of the marginal distribution functions, the Fréchet classes \(\mathcal{F}\left( F_1, F_2 \right)\) are never empty. To see this, it suffices to note that \(\mathcal{F}\left( F_1, F_2 \right)\) always contains the distribution functions \(M\), \(W\), and \(F_1F_2\), for example. In reality, \(\mathcal{F}\left( F_1, F_2 \right)\) always contains many elements. Indeed, for any \(\theta \in \left[ 0, 1 \right]\), and distribution functions \(F_{\boldsymbol{X}}\) and \(F_{\boldsymbol{Y}}\) in \(\mathcal{F}\left( F_1, F_2 \right)\), the distribution function \[ F_{\boldsymbol{Z}}(\boldsymbol{x}) = \theta F_{\boldsymbol{X}}(\boldsymbol{x}) + (1-\theta) F_{\boldsymbol{Y}}(\boldsymbol{x}) \] belongs to \(\mathcal{F}\left( F_1, F_2 \right)\). Using the Fréchet bounds of \(\mathcal{F}\left( F_1, F_2 \right)\), we can define the family of distribution functions \[\begin{equation*} F_\theta(\boldsymbol{x}) = \theta M\left( \boldsymbol{x}\right) + \left( 1-\theta \right) W\left( \boldsymbol{x}\right), \hspace{2mm} \theta \in \left[ 0, 1 \right], \end{equation*}\] all within \(\mathcal{F}\left( F_1, F_2 \right)\).

The distribution function \(F_1F_2\) corresponding to independence is not part of the family we just defined (i.e., there is no value of \(\theta\) such that \(F_\theta = F_1F_2\)). To include it, we can propose the family \[\begin{eqnarray*} F_\theta(\boldsymbol{x}) & =& \frac{1}{2}\theta^{2}(1-\theta)M\left( \boldsymbol{x}\right) \\ && + \left( 1-\theta^{2} \right) F_1(x_1)F_2(x_2) \\ && + \frac{1}{2}\theta^{2}(1+\theta)W\left( \boldsymbol{x}\right), \hspace{2mm} \theta \in [-1, 1], \end{eqnarray*}\] known as the Mardia family, which includes the Fréchet bounds and independence as special cases. Specifically, we obtain \(W\) for \(\theta=1\), \(M\) for \(\theta=-1\), and \(F_1F_2\) for \(\theta=0\).

One may wonder to which random variable pairs the Fréchet bounds in \({\mathcal{F}}(F_{1},F_{2})\) correspond. The following result provides an answer to this question.

Proposition 9.1 Let \(U\sim\mathcal{U}ni(0,1)\). In \({\mathcal{F}}(F_{1},F_{2})\),

  1. \(W\) is the distribution function of the pair \((F_{1}^{-1}(U),F_{2}^{-1}(U))\).
  2. \(M\) is the distribution function of the pair \((F_{1}^{-1}(U),F_{2}^{-1}(1-U))\).

Proof. For any \(\boldsymbol{x}\in {\mathbb{R}}^2\), we have \[\begin{eqnarray*} &&{\Pr}[F_{1}^{-1}(U)\leq x_1,F_{2}^{-1}(U)\leq x_2]\\ & = & \Pr[U\leq F_1(x_1),U\leq F_2(x_2)]\\ &=&{\Pr}[U\leq\min\{F_1(x_1),F_2(x_2)\}] \\ & = &W(x_1,x_2), \end{eqnarray*}\] and \[\begin{eqnarray*} &&{\Pr}[F_{1}^{-1}(U)\leq x_1,F_{2}^{-1}(1-U)\leq x_2] \\ & = & {\Pr}[U\leq F_1(x_1),1-U\leq F_2(x_2)] \\ & = & {\Pr}[1-F_2(x_2)\leq U\leq F_1(x_1)] \\ & = & \max\{F_1(x_1)+F_2(x_2)-1,0\} = M(x_1,x_2). \end{eqnarray*}\] The announced result is thus established.

Proposition \(\ref{Prop3B3}\) teaches us, in particular, that the support of \(W\) is a non-decreasing curve in \({\mathbb{R}}^2\), while that of \(M\) is a non-increasing curve in \({\mathbb{R}}^2\).
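As a quick illustration (a sketch in Python with numpy; the exponential marginals are an arbitrary choice), one can simulate the two couplings of Proposition 9.1 and check that their empirical joint distribution functions match \(W\) and \(M\).

```python
import numpy as np

rng = np.random.default_rng(0)
U = rng.uniform(size=200_000)

# Illustrative exponential marginals: F_i^{-1}(u) = -b_i log(1 - u)
b1, b2 = 1.0, 2.0
Finv1 = lambda u: -b1 * np.log1p(-u)
Finv2 = lambda u: -b2 * np.log1p(-u)
F1 = lambda x: 1.0 - np.exp(-x / b1)
F2 = lambda x: 1.0 - np.exp(-x / b2)

como = (Finv1(U), Finv2(U))       # comonotone pair: its distribution function should be W
anti = (Finv1(U), Finv2(1 - U))   # antimonotone pair: its distribution function should be M

for x1, x2 in [(0.5, 1.0), (1.0, 1.0), (2.0, 0.5)]:
    emp_W = np.mean((como[0] <= x1) & (como[1] <= x2))
    emp_M = np.mean((anti[0] <= x1) & (anti[1] <= x2))
    print(emp_W, min(F1(x1), F2(x2)))            # close to W(x1, x2)
    print(emp_M, max(F1(x1) + F2(x2) - 1, 0.0))  # close to M(x1, x2)
```

In this sketch the comonotone points lie on the non-decreasing line \(x_2=(b_2/b_1)x_1\), while the antimonotone points lie on a non-increasing curve, in line with the remark above.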

9.2.3 Perfect Dependence: Comonotonicity and Antimonotonicity

9.2.3.1 Definition

Perfect dependence is said to occur when two risks can be written as increasing or decreasing functions of the same underlying random variable.

Definition 9.3

  1. The pair \(\boldsymbol{X}=(X_1,X_2)\) is called comonotone if there exist non-decreasing functions \(g_1\) and \(g_2\) and a random variable \(Z\) such that \[ \boldsymbol{X}\stackrel{d}{=}(g_1(Z),g_2(Z)). \]
  2. The pair \(\boldsymbol{X}=(X_1,X_2)\) is called antimonotone if there exist a non-decreasing function \(g_1\), a non-increasing function \(g_2\), and a random variable \(Z\) such that \[ \boldsymbol{X}\stackrel{d}{=}(g_1(Z),g_2(Z)). \]


Now that we have defined the concepts of perfect dependence, let’s examine some situations where comonotonicity and antimonotonicity naturally come into play.

Example 9.1 Very often, a risk \(X\) is divided into layers (tranches) covered by different economic agents (insured, insurer, reinsurer, etc.). The layer \((a,a+h]\) of width \(h\) of the risk \(X\) is defined as \[ X_{(a,a+h]}=\left\{ \begin{array}{l} 0\text{ if }0\leq X<a\\ X-a\text{ if }a\leq X<a+h\\ h\text{ if }a+h\leq X \end{array} \right. \] where \(a\) is called the retention. The tail function of \(X_{(a,a+h]}\) is given by \[ \overline{F}_{X_{(a,a+h]}}(t)=\left\{ \begin{array}{l} \overline{F}_X(a+t)\text{ if }0\leq t<h\\ 0\text{ if }t\geq h \end{array} \right. \] so the pure premium for the coverage of this layer is \[\begin{eqnarray*} \mathbb{E}[X_{(a,a+h]}]&=&\int_{t=0}^{+\infty}\overline{F}_{X_{(a,a+h]}}(t)dt\\ &=&\int_{t=0}^h\overline{F}_X(a+t)dt\\ &=&\int_{t=a}^{a+h}\overline{F}_X(t)dt. \end{eqnarray*}\]
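As a quick check of the pure-premium formula (a sketch in Python with numpy; the exponential loss and the parameter values are purely illustrative), one can compare a Monte Carlo estimate of \(\mathbb{E}[X_{(a,a+h]}]\) with the closed-form value of \(\int_a^{a+h}\overline{F}_X(t)\,dt\).

```python
import numpy as np

rng = np.random.default_rng(1)
b, a, h = 2.0, 1.0, 3.0                      # illustrative: exponential mean b, layer (a, a+h]
X = rng.exponential(scale=b, size=500_000)

# Cost of the layer (a, a+h]: min{X - a, h} on the event {X > a}
layer = np.clip(X - a, 0.0, h)

# Pure premium: Monte Carlo average vs the integral of the tail function over (a, a+h]
premium_formula = b * (np.exp(-a / b) - np.exp(-(a + h) / b))   # value of the integral for an exponential loss
print(layer.mean(), premium_formula)         # the two values should be close
```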

Now consider two layers \((a,a+h]\) and \((b,b+h]\) of the same risk \(X\) (whose distribution function is assumed to be continuous, for simplicity). The costs \(X_{(a,a+h]}\) and \(X_{(b,b+h]}\) are comonotonic because both are non-decreasing functions of \(X\) (which satisfies Definition \(\ref{DefComonot}\)(i)). Indeed, \[\begin{eqnarray*} X_{(a,a+h]}&=&\min\{X-a,h\}\mathbb{I}[X>a]=g_1(X)\\ X_{(b,b+h]}&=&\min\{X-b,h\}\mathbb{I}[X>b]=g_2(X) \end{eqnarray*}\] where \(g_1\) and \(g_2\) are non-decreasing functions.

Example (Comonotonic exchange of risks)

Consider a risk \(X\) divided into \[ X_1=\left\{\begin{array}{l} X,\text{ if } X\leq d,\\ d,\text{ otherwise}, \end{array} \right. \] covered by the insurer, and \[ X_2=\left\{\begin{array}{l} 0,\text{ if } X\leq d,\\ X-d,\text{ otherwise}, \end{array} \right. \] covered by the reinsurer in a stop-loss treaty. Since \(X_1\) and \(X_2\) are non-decreasing functions of the risk \(X\), they are comonotonic according to Definition \(\ref{DefComonot}\)(i).

This is the most commonly encountered situation, with most risk exchanges leading to comonotonic variables. This ensures that both partners participating in the exchange will see their financial exposure increase when the underlying risk increases. More precisely, if we denote \(I:{\mathbb{R}}^+\rightarrow {\mathbb{R}}^+\) as the indemnity function (i.e., \(I(x)\) is the amount the insurer (in the broad sense) will have to pay if a loss of amount \(x\) occurs), we require that \(I\) be non-decreasing. This ensures that \(X\) and \(I(X)\) are comonotonic. If we further require that \(I\) grows less rapidly than the identity (which is equivalent to demanding \(I'\leq 1\) when \(I\) is differentiable), then \(X-I(X)\) and \(X\) are also comonotonic.

There are many examples of insurance coverages that satisfy these constraints (a short numerical check is sketched after the following list). These include

  1. compulsory excess: \(I(x)=(x-\delta)_+\) for an excess \(\delta\geq 0\);
  2. coinsurance: \(I(x)=\alpha x\) for \(\alpha\in[0,1]\);
  3. intervention cap: \(I(x)=\min\{x,\omega\}\) for \(\omega\geq 0\);
  4. coverages combining several of these mechanisms, such as \(I(x)=\min\{\alpha(x-d_1)_+,d_2\}\).
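Here is a minimal numerical check (in Python with numpy; the parameter values are hypothetical) that each indemnity function listed above is non-decreasing and grows no faster than the identity, so that both \(I(X)\) and \(X-I(X)\) are comonotonic with the underlying loss.

```python
import numpy as np

# Hypothetical parameter values for the indemnity mechanisms listed above
delta, alpha, omega, d1, d2 = 100.0, 0.8, 500.0, 50.0, 400.0
schemes = {
    "deductible":  lambda x: np.maximum(x - delta, 0.0),
    "coinsurance": lambda x: alpha * x,
    "cap":         lambda x: np.minimum(x, omega),
    "combined":    lambda x: np.minimum(alpha * np.maximum(x - d1, 0.0), d2),
}

x = np.linspace(0.0, 1000.0, 10_001)
for name, I in schemes.items():
    y = I(x)
    assert np.all(np.diff(y) >= -1e-9)      # I is non-decreasing
    assert np.all(np.diff(x - y) >= -1e-9)  # x - I(x) is also non-decreasing (I grows at most like x)
    print(name, "ok: I(X) and X - I(X) are both non-decreasing functions of X")
```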

Example (Derivative products in finance)

Beautiful examples of comonotonicity and antimonotonicity are provided by modern stochastic finance. Let \(Z\) be the price of a stock at time \(t\), and consider call and put options with strike price \(K\) and maturity \(t\) on this stock. At maturity, the payoff of the call option is \[ V_{\text{call}}=(Z-K)_+ \] and the payoff of the put option is \[ V_{\text{put}}=(K-Z)_+. \] Returning to Definition \(\ref{DefComonot}\), it is easy to see that \(V_{\text{call}}\) and \(Z\) are comonotonic, while \(V_{\text{put}}\) and \(Z\) are antimonotonic, as are \(V_{\text{call}}\) and \(V_{\text{put}}\).


This situation of perfect dependence corresponds precisely to Fréchet bounds, as shown by the following result, to be related to Proposition \(\ref{Prop3B3}\).

Proposition 9.2

  1. The pair \(\boldsymbol{X}=(X_1,X_2)\) is comonotonic if, and only if, it has \(W\) as its distribution function.
  2. The pair \(\boldsymbol{X}=(X_1,X_2)\) is antimonotonic if, and only if, it has \(M\) as its distribution function.

In the case where the marginal distribution functions \(F_1\) and \(F_2\) are continuous, this result can be further strengthened using Property \(\ref{NombrAl}\).

Proposition 9.3 Suppose \(F_{1}\) and \(F_{2}\) are continuous. Then,

  1. \(\boldsymbol{X}\) is comonotonic if, and only if, \[ (X_1,X_2)=_{\text{law}}(X_1,F_2^{-1}(F_1(X_1))); \]
  2. \(\boldsymbol{X}\) is antimonotonic if, and only if, \[ (X_1,X_2)=_{\text{law}}(X_1,F_2^{-1}(\overline{F}_1(X_1))). \]

Proof. We only prove (1), the reasoning leading to (2) being similar. Property \(\ref{NombrAl}\) allows us to write \[\begin{eqnarray*} & & {\Pr}\Big[X_1\leq x_1,F_2^{-1}\Big(F_1(X_1)\Big)\leq x_2\Big]\\&=& {\Pr}\Big[X_1\leq F_1^{-1}\Big(F_1(x_1)\Big),F_2^{-1}\Big(F_1(X_1)\Big)\leq x_2\Big]\\ &=& {\Pr}\Big[F_1(X_1)\leq F_1(x_1),F_1(X_1)\leq F_2(x_2)\Big]\\&=&W(x_1,x_2) \end{eqnarray*}\] which completes the proof thanks to Proposition \(\ref{FunctLink}\).

The following result indicates that VaRs are additive in the case of comonotonic risks.

Proposition 9.4 Let \(\boldsymbol{X}\) be comonotonic, with its joint distribution function belonging to \(\mathcal{F}(F_1,F_2)\), where \(F_1\) and \(F_2\) are continuous and increasing. Then, for any probability level \(\alpha\in[0,1]\), we have \[ {\text{VaR}}[X_1+X_2;\alpha]={\text{VaR}}[X_1;\alpha]+{\text{VaR}}[X_2;\alpha]. \]

Proof. If \(X_1\) and \(X_2\) are comonotonic, then according to Propositions \(\ref{Prop3B3}\) and \(\ref{PropDh1}\) \[ X_{1}+X_{2}=_{\text{law}}\Psi \left( U\right) \] with \(U\sim\mathcal{U}ni(0,1)\), and the function \(\Psi\) is given by \[ \Psi \left( u\right) =F_1^{-1}(u)+F_2^{-1}(u),\qquad 0\leq u\leq 1. \] The function \(\Psi\) defined in this way is clearly non-decreasing. For \(0<p<1\), it is sufficient to invoke Lemma \(\ref{InvGQuant}\) to write \[ F_{X_1+X_2}^{-1}(p)=F_{\Psi \left( U\right) }^{-1}(p)=\Psi \left( F_{U}^{-1}(p)\right) =\Psi (p), \] as stated. It remains to deal with the limit cases, i.e., to verify that the result still holds for \(p=0\) and \(p=1\). Indeed, \[ F_{X_1+X_2}^{-1}(1)=F_1^{-1}(1)+F_2^{-1}(1) \] is correct since \({X_1+X_2}\) reaches its maximum value if and only if each of the two terms reaches its maximum value (both being non-decreasing functions of the same underlying variable), and the case \(p=0\) is treated similarly. This completes the proof.
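The additivity of VaR for comonotonic risks is easy to check by simulation. Below is a minimal sketch (Python with numpy; the exponential and Pareto marginals and the parameter values are illustrative) that builds a comonotone pair from a common uniform variable and compares the empirical VaR of the sum with the sum of the marginal VaRs.

```python
import numpy as np

rng = np.random.default_rng(2)
U = rng.uniform(size=1_000_000)

# Illustrative continuous marginals: exponential and Pareto quantile functions
b, alpha_p, theta = 1.0, 3.0, 2.0
q1 = lambda p: -b * np.log1p(-p)                        # F1^{-1}(p), exponential with mean b
q2 = lambda p: theta * ((1 - p) ** (-1 / alpha_p) - 1)  # F2^{-1}(p), Pareto(alpha_p, theta)

S = q1(U) + q2(U)                                        # comonotone sum X1 + X2

for level in (0.90, 0.99, 0.995):
    var_sum = np.quantile(S, level)                      # VaR[X1 + X2; level]
    var_add = q1(level) + q2(level)                      # VaR[X1; level] + VaR[X2; level]
    print(level, var_sum, var_add)                       # agree up to Monte Carlo error
```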

Example 9.2 Suppose \(X_1\sim\mathcal{E}xp(1/b_1)\) and \(X_2\sim\mathcal{E}xp(1/b_2)\) with \(b_1>0\) and \(b_2>0\). If \(X_1\) and \(X_2\) are comonotonic, then the inverse of the tail function of \(X_1+X_2\) is \[ \overline{F}_{X_1+X_2}^{-1}(p)=-b_\bullet\ln p\text{ with }b_\bullet=b_1+b_2, \] i.e., \(X_1+X_2\sim\mathcal{E}xp(1/b_\bullet)\). In other words, the sum of two exponentially distributed and comonotonic random variables is also exponentially distributed.

Example 9.3 Suppose \(X_1\sim\mathcal{P}ar(\alpha,\theta_1)\) and \(X_2\sim\mathcal{P}ar(\alpha,\theta_2)\). If \(X_1\) and \(X_2\) are comonotonic, the inverse of the tail function of their sum \(X_1+X_2\) is \[ \overline{F}_{X_1+X_2}^{-1}(p)=\theta_\bullet(p^{-1/\alpha}-1)\text{ with }\theta_\bullet=\theta_1+\theta_2, \] i.e., \(X_1+X_2\sim\mathcal{P}ar(\alpha,\theta_\bullet)\).
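Examples 9.2 and 9.3 can be checked numerically in the same way; the following sketch (Python with numpy, illustrative parameters) verifies the Pareto case by comparing quantiles of the simulated comonotone sum with those of a \(\mathcal{P}ar(\alpha,\theta_1+\theta_2)\) distribution.

```python
import numpy as np

# Pareto(alpha, theta) quantile function: F^{-1}(p) = theta * ((1 - p)^(-1/alpha) - 1)
alpha, theta1, theta2 = 2.5, 1.0, 3.0
qpar = lambda p, theta: theta * ((1 - p) ** (-1 / alpha) - 1)

rng = np.random.default_rng(3)
U = rng.uniform(size=1_000_000)
S = qpar(U, theta1) + qpar(U, theta2)     # comonotone sum of the two Pareto risks

for p in (0.5, 0.9, 0.99):
    print(np.quantile(S, p), qpar(p, theta1 + theta2))   # quantiles of Par(alpha, theta1 + theta2)
```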

9.3 Measures of Dependence

9.3.1 Concept

Although common in insurance and finance, the term “correlation” is often misunderstood. While correlation is just a specific measure of dependence in statistics (often called linear correlation or Pearson correlation), practitioners tend to refer to all concepts of dependence with the same term.

In this section, we will study some measures of dependence. We will see that the coefficient of linear correlation, a canonical measure of dependence in the Gaussian world, loses much of its relevance when departing from this domain. For continuous variables, we will find that the Spearman and Kendall correlation coefficients provide good measures of dependence.

Before we proceed, it is important to define what is meant by a “good measure of dependence.” Probabilists quickly wondered what properties a measure of dependence should possess to be practical. This led to the definition of a concordance measure.

Definition 9.4 A measure of dependence \(\delta(.,.)\) is a concordance measure if it has the following desirable properties:

  • P1 (Symmetry) \(\delta(X_1,X_2)=\delta(X_2,X_1)\);
  • P2 (Normalization) \(-1\leq\delta(X_1,X_2)\leq 1\);
  • P3 \(\delta(X_1,X_2)=1\) if, and only if, \(X_1\) and \(X_2\) are comonotonic;
  • P4 \(\delta(X_1,X_2)=-1\) if, and only if, \(X_1\) and \(X_2\) are antimonotonic;
  • P5 for any strictly monotonic function \(g:{\mathbb{R}}\to{\mathbb{R}}\), \[ \delta(g(X_1),X_2)=\left\{ \begin{array}{l} \delta(X_1,X_2)\text{ if $g$ is increasing}\\ -\delta(X_1,X_2)\text{ if $g$ is decreasing}. \end{array} \right. \]

At first sight, one might wish to impose other desirable properties, but these are not necessarily compatible with P1-P5. Another seemingly interesting property is \[\begin{equation} \label{DesDep} \delta(X_1,X_2)=0\Leftrightarrow X_1\text{ and }X_2\text{ are independent}. \end{equation}\] Unfortunately, this property is incompatible with P5, as shown in the following result.

Proposition 9.5 There is no concordance measure satisfying (\(\ref{DesDep}\)).

Proof. Consider the pair \((X_1,X_2)\) uniformly distributed along the unit circle in the plane \({\mathbb{R}}^2\), i.e., \[ (X_1, X_2)=(\cos Z,\sin Z)\text{ where }Z\sim\mathcal{U}ni[0,2\pi]. \] Since \((-X_1,X_2)\stackrel{\text{law}}{=}(X_1,X_2)\), we have \[ \delta(X_1,X_2)=\delta(-X_1,X_2)=-\delta(X_1,X_2), \] where the first equality uses the identity in law and the second uses P5 (with the decreasing function \(g(x)=-x\)). Hence \(\delta(X_1,X_2)=0\) even though \(X_1\) and \(X_2\) are clearly dependent (but not comonotonic!), which contradicts \(\eqref{DesDep}\).
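The circle example is also a classic illustration that a zero linear correlation does not mean independence. A minimal simulation (Python with numpy) makes the point:

```python
import numpy as np

rng = np.random.default_rng(4)
Z = rng.uniform(0.0, 2.0 * np.pi, size=1_000_000)
X1, X2 = np.cos(Z), np.sin(Z)

print(np.corrcoef(X1, X2)[0, 1])         # close to 0: no linear correlation
print(np.corrcoef(X1**2, X2**2)[0, 1])   # close to -1: X1^2 + X2^2 = 1, so the pair is strongly dependent
```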

9.3.2 Linear Correlation or Pearson Correlation

9.3.2.1 Covariance

In general, the variance of a sum is not simply the sum of variances (this is only true if the random variables involved are independent, as we saw in Property \(\ref{VarSomme}\)). Indeed, the variability of a sum of random variables can be more or less than the simple aggregation of the variabilities of the individual terms, depending on how they interact. Let’s illustrate this. To do so, let’s focus on the variance of a sum of two random variables \(X_1+X_2\): \[\begin{eqnarray*} \mathbb{V}[X_1+X_2]&=&\mathbb{E}\Big[\big\{X_1-\mathbb{E}[X_1]+X_2-\mathbb{E}[X_2]\big\}^2\Big]\\ &=&\mathbb{E}\big[(X_1-\mathbb{E}[X_1])^2\big]+\mathbb{E}\big[(X_2-\mathbb{E}[X_2])^2\big]\\ & & +2\mathbb{E}\Big[\big\{X_1-\mathbb{E}[X_1]\big\}\big\{X_2-\mathbb{E}[X_2]\big\}\Big]\\ &=&\mathbb{V}[X_1]+\mathbb{V}[X_2]+2\mathbb{C}[X_1,X_2]. \end{eqnarray*}\] The variance of a sum is thus equal to the sum of the variances of the two terms plus twice the expectation of the product of the centered variables (a term which vanishes when \(X_1\) and \(X_2\) are independent). This expectation is called the covariance (as it expresses how \(X_1\) and \(X_2\) “co-vary”) and is defined as follows.

Definition 9.5 The covariance between \(X_1\) and \(X_2\), denoted as \(\mathbb{C}[X_1,X_2]\), is defined as \[\begin{eqnarray*} \mathbb{C}[X_1,X_2] &=&\mathbb{E}\big[ \left( X_1-\mathbb{E}[X_1]\right) \left( X_2-\mathbb{E}[ X_2] \right) \big]\\ &=&\mathbb{E}[X_1X_2] -\mathbb{E}[ X_1] \mathbb{E}[X_2]. \end{eqnarray*}\]

Thus, \[\begin{equation} \label{VarSom} \mathbb{V}[X_1+X_2]=\mathbb{V}[X_1]+\mathbb{V}[X_2]+2\mathbb{C}[X_1,X_2] \end{equation}\] and it is the covariance that quantifies the excess variability (or reduction of it) of the sum of two random variables compared to the sum of their variances.

Remark. It is worth noting that \(\mathbb{V}[X]=\mathbb{C}[X,X]\), so the variance of a random variable measures how it “co-varies” with itself.

Example (Covariance between the components of a normal vector)

If \(\boldsymbol{X}\sim\mathcal{N}or(\boldsymbol{\mu},\boldsymbol{\Sigma})\), then it can be easily verified that the off-diagonal elements \(\sigma_{ij}\) of \(\boldsymbol{\Sigma}\) are the covariances between pairs of components of \(\boldsymbol{X}\), i.e., \[ \sigma_{ij}=\mathbb{C}[X_i,X_j]\text{ for }i\neq j. \]

Example (Covariance between the components of a multinomial vector) Consider the vector \(\boldsymbol{N}\sim\mathcal{M}ult(m,p_1,\ldots,p_n)\). Since \(N_i\sim\mathcal{B}in(m,p_i)\) and \(N_i+N_j\sim\mathcal{B}in(m,p_i+p_j)\) for any \(i\neq j\), we can deduce from \(\eqref{VarSom}\) that \[\begin{eqnarray*} \mathbb{C}[N_i,N_j]&=&\frac{1}{2}\Big(\mathbb{V}[N_i+N_j]-\mathbb{V}[N_i]-\mathbb{V}[N_j]\Big)\\ &=&\frac{1}{2}\Big(m(p_i+p_j)(1-p_i-p_j)-mp_i(1-p_i)-mp_j(1-p_j)\Big)\\ &=&-mp_ip_j. \end{eqnarray*}\]
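A quick simulation (Python with numpy; \(m\) and the probabilities are arbitrary) confirms the formula \(\mathbb{C}[N_i,N_j]=-mp_ip_j\):

```python
import numpy as np

rng = np.random.default_rng(5)
m, p = 20, np.array([0.2, 0.3, 0.5])
N = rng.multinomial(m, p, size=500_000)      # each row is a multinomial draw (N_1, N_2, N_3)

emp_cov = np.cov(N[:, 0], N[:, 1])[0, 1]
print(emp_cov, -m * p[0] * p[1])             # empirical covariance vs -m * p_i * p_j
```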

If \(X_1\) and \(X_2\) are independent, their covariance is zero. However, the converse is not true, as demonstrated in the following example.

Example 9.4 A zero covariance indicates only the absence of a linear relationship: the variables may be independent, or they may be dependent in a non-linear way. For example, if \(X\sim\mathcal{N}or(0,1)\), \[ \mathbb{C}[X,X^2]=\mathbb{E}[X^3]-\mathbb{E}[X]\mathbb{E}[X^2]=0 \] even though \(X\) and \(X^2\) are strongly dependent. Note that \(X\) and \(X^2\) are not comonotonic!
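This can be seen numerically (a sketch in Python with numpy): the sample covariance of \(X\) and \(X^2\) is essentially zero, while, for instance, \(|X|\) and \(X^2\), which are comonotonic, are strongly correlated.

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.standard_normal(2_000_000)

print(np.cov(X, X**2)[0, 1])                 # close to 0: E[X^3] - E[X]E[X^2] = 0
print(np.corrcoef(np.abs(X), X**2)[0, 1])    # strongly positive: the dependence is real
```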

Remark. A null covariance is closely related to the concept of orthogonality. Indeed, the space \(L^2\) of square-integrable random variables is a Hilbert space equipped with the inner product \(\langle X_1,X_2\rangle=\mathbb{E}[X_1X_2]\), and \(\mathbb{C}[X_1,X_2]=0\) means that the centered variables \(X_1-\mathbb{E}[X_1]\) and \(X_2-\mathbb{E}[X_2]\) are orthogonal in \(L^2\).

Just as we defined conditional expectation and conditional variance, we can introduce the concept of conditional covariance.

Definition 9.6 For a random vector \(\boldsymbol{\Theta}\) with \(m\) dimensions, the conditional covariance of random variables \(X_1\) and \(X_2\) given \(\boldsymbol{\Theta}\) is the random variable \[\begin{eqnarray*} \mathbb{C}[X_1,X_2|\boldsymbol{\Theta}]&=&\mathbb{E}\Big[\big\{X_1-\mathbb{E}[X_1|\boldsymbol{\Theta}]\big\} \big\{X_2-\mathbb{E}[X_2|\boldsymbol{\Theta}]\big\}\Big|\boldsymbol{\Theta}\Big]\\ &=&\mathbb{E}[X_1X_2|\boldsymbol{\Theta}]-\mathbb{E}[X_1|\boldsymbol{\Theta}]\mathbb{E}[X_2|\boldsymbol{\Theta}]. \end{eqnarray*}\]

Furthermore, conditional covariance enjoys the following properties: (i) \(\mathbb{C}[X_1,X_2]=\mathbb{E}\Big[\mathbb{C}[X_1,X_2|\boldsymbol{\Theta}]\Big]+\mathbb{C}\Big[\mathbb{E}[X_1|\boldsymbol{\Theta}],\mathbb{E}[X_2|\boldsymbol{\Theta}]\Big]\); and (ii) if \(X_1\) and \(X_2\) are independent conditionally on \(\boldsymbol{\Theta}\), then \(\mathbb{C}[X_1,X_2|\boldsymbol{\Theta}]=0\).

Proof. We will only prove (i). It suffices to write \[\begin{eqnarray*} \mathbb{C}[X_1,X_2] & = & \mathbb{E}\Big[\mathbb{E}\big[(X_1-\mathbb{E}[X_1])(X_2-\mathbb{E}[X_2])\big|\boldsymbol{\Theta}\big]\Big]\\ &=&\mathbb{E}\Big[\mathbb{E}\big[(X_1-\mathbb{E}[X_1|\boldsymbol{\Theta}]+\mathbb{E}[X_1|\boldsymbol{\Theta}]-\mathbb{E}[X_1])\\ &&\hspace{10mm} (X_2-\mathbb{E}[X_2|\boldsymbol{\Theta}]+\mathbb{E}[X_2|\boldsymbol{\Theta}]-\mathbb{E}[X_2])\big|\boldsymbol{\Theta}\big]\Big]\\ &=&\mathbb{E}\Big[\mathbb{E}\big[(X_1-\mathbb{E}[X_1|\boldsymbol{\Theta}])(X_2-\mathbb{E}[X_2|\boldsymbol{\Theta}])\big|\boldsymbol{\Theta}\big]\Big]\\ &&+\mathbb{E}\Big[\underbrace{\mathbb{E}\big[X_1-\mathbb{E}[X_1|\boldsymbol{\Theta}]|\boldsymbol{\Theta}]}_{=0}(\mathbb{E}[X_2|\boldsymbol{\Theta}]-\mathbb{E}[X_2])\Big] \\ &&+\mathbb{E}\Big[\underbrace{\mathbb{E}\big[X_2-\mathbb{E}[X_2|\boldsymbol{\Theta}]|\boldsymbol{\Theta}]}_{=0}(\mathbb{E}[X_1|\boldsymbol{\Theta}]-\mathbb{E}[X_1])\Big]\\ &&+\mathbb{E}\Big[(\mathbb{E}[X_1|\boldsymbol{\Theta}]-\mathbb{E}[X_1])(\mathbb{E}[X_2|\boldsymbol{\Theta}]-\mathbb{E}[X_2])\Big] \\ &=&\mathbb{E}\Big[\mathbb{C}[X_1,X_2|\boldsymbol{\Theta}]\Big]+ \mathbb{C}\Big[\mathbb{E}[X_1|\boldsymbol{\Theta}],\mathbb{E}[X_2|\boldsymbol{\Theta}]\Big]. \end{eqnarray*}\]

Example 9.5 Suppose that, conditional on \(\Theta=\theta\), \(N_1\) and \(N_2\) are independently distributed with respective Poisson distributions \(\mathcal{P}oi(\lambda_1\theta)\) and \(\mathcal{P}oi(\lambda_2\theta)\). Then, in accordance with Property \(\ref{PropCovCond}\)(i), we have \[ \mathbb{C}[N_1,N_2]=\mathbb{E}\Big[\mathbb{C}[N_1,N_2|\Theta]\Big] +\mathbb{C}\Big[\mathbb{E}[N_1|\Theta],\mathbb{E}[N_2|\Theta]\Big]. \] Now, \[ \mathbb{C}[N_1,N_2|\Theta]=0 \] due to Property \(\ref{PropCovCond}\)(ii), and thus \[ \mathbb{C}[N_1,N_2]=\mathbb{C}[\lambda_1\Theta,\lambda_2\Theta]=\lambda_1\lambda_2\mathbb{V}[\Theta] \] which, for example, reduces to \(\lambda_1\lambda_2/a\) when \(\Theta\sim\mathcal{G}am(a,a)\).
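The common-frailty construction of Example 9.5 is easy to simulate; the sketch below (Python with numpy; illustrative parameters) compares the empirical covariance of \(N_1\) and \(N_2\) with \(\lambda_1\lambda_2/a\) when \(\Theta\sim\mathcal{G}am(a,a)\).

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1_000_000
a, lam1, lam2 = 2.0, 1.5, 0.8                        # illustrative parameters

Theta = rng.gamma(shape=a, scale=1.0 / a, size=n)    # Gamma(a, a): mean 1, variance 1/a
N1 = rng.poisson(lam1 * Theta)                       # conditionally independent Poisson counts
N2 = rng.poisson(lam2 * Theta)

print(np.cov(N1, N2)[0, 1], lam1 * lam2 / a)         # empirical covariance vs lambda1*lambda2/a
```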

Covariance, by its sign, informs us about the direction of “co-variation” between \(X_1\) and \(X_2\), but its absolute value does not tell us about the magnitude of this co-variation. Indeed, the value of the covariance is sensitive to the inherent variability of both variables \(X_1\) and \(X_2\). In particular, the covariance can change drastically if we change the units of measurement for \(X_1\) and/or \(X_2\) (switching from thousands to millions of euros for \(X_1\) and \(X_2\) would divide the covariance by \(10^6\)!).

Correlation aims to correct this drawback of covariance by providing an index that does not depend on units of measurement or the individual variability of the variables.

Definition 9.7 The linear correlation coefficient between \(X_1\) and \(X_2\), denoted as \(r(X_1,X_2)\), is defined as \[ r\left(X_1,X_2\right) =\frac{\mathbb{C}[X_1,X_2]}{\sqrt{\mathbb{V}[ X_1] \mathbb{V}[X_2]}}. \]

We observe that this is a dimensionless number that exists only when the variances and covariance that compose it are well-defined. The sign of the covariance coincides with that of the correlation. Therefore, what was mentioned earlier about the sign of the covariance is always applicable to the sign of the correlation.

Remark. It is easy to see that \(r(X_1, X_2)\) is, in fact, the covariance between the two centered and standardized variables, i.e., \[ r(X_1, X_2) = \mathbb{C}[V_1, V_2], \] where \[ V_1 = \frac{X_1 - \mathbb{E}[X_1]}{\sqrt{\mathbb{V}[X_1]}} \text{ and } V_2 = \frac{X_2 - \mathbb{E}[X_2]}{\sqrt{\mathbb{V}[X_2]}}. \]

For any random variables \(X_1\) and \(X_2\), we always have \[ -1 \leq r(X_1, X_2) \leq 1 \] according to the Cauchy-Schwarz inequality. The bounds \(+1\) and \(-1\) are reached when \(X_1\) and \(X_2\) are related by a linear relationship. More precisely, if \(X_2 = a + bX_1\) almost surely with \(b \neq 0\), then the correlation is 1 in absolute value, and it has the sign of \(b\). The converse is also true.

Contrary to what one might think, there are many situations where the linear correlation coefficient cannot reach the bounds -1 and 1, as demonstrated in the following example.

Example 9.6 Let \(X_1\) and \(X_2\) be two random variables both having \({\mathbb{R}}^+\) as support. Then, \(r(X_1, X_2) > -1\). To establish this result, let us reason by contradiction and assume that \(r(X_1, X_2) = -1\), which implies \[ X_2 = aX_1 + b \text{ almost surely, with } a < 0, \hspace{2mm} b \in {\mathbb{R}}. \] It follows that, for any \(x_2 < 0\), \[\begin{align*} F_2(x_2) = \Pr[aX_1 + b \leq x_2] & = \Pr\left[X_1 \geq \frac{x_2 - b}{a}\right] \\ & \geq \Pr\left[X_1 > \frac{x_2 - b}{a}\right] \\ & = \overline{F}_1\left(\frac{x_2 - b}{a}\right) > 0, \end{align*}\] where the last inequality holds because the support of \(X_1\) is unbounded. This clearly contradicts the fact that \(F_2(x_2) = 0\) for every \(x_2 < 0\), since \(X_2\) is non-negative.

The following technical lemma will be very useful in obtaining the possible values for the linear correlation coefficient. It can be seen as a two-dimensional generalization of Property \(\ref{RepEsp}\).

Lemma 9.1 For any non-negative random variables \(X_1\) and \(X_2\) with joint distribution function \(F_{\boldsymbol{X}}\), we have \[ \mathbb{E}[X_1X_2] = \int_{x_1=0}^{+\infty}\int_{x_2=0}^{+\infty} \overline{F}_{\boldsymbol{X}}(x_1,x_2)dx_1dx_2. \]

Proof. Let us start by writing \[\begin{align*} &\int_{x_1=0}^{+\infty}\int_{x_2=0}^{+\infty} \Pr[X_1>x_1,X_2>x_2]dx_1dx_2\\ &= \int_{x_1=0}^{+\infty}\int_{x_2=0}^{+\infty}\int_{y_1=x_1}^{+\infty}\int_{y_2=x_2}^{+\infty} dF_{\boldsymbol{X}}(y_1,y_2)dx_1dx_2. \end{align*}\] Now, let’s invoke Fubini’s theorem to interchange the integrals, yielding \[\begin{align*} &\int_{x_1=0}^{+\infty}\int_{x_2=0}^{+\infty} \Pr[X_1>x_1,X_2>x_2]dx_1dx_2\\ &= \int_{y_1=0}^{+\infty}\int_{y_2=0}^{+\infty}\int_{x_1=0}^{y_1}\int_{x_2=0}^{y_2} dx_1dx_2dF_{\boldsymbol{X}}(y_1,y_2)\\ &= \int_{y_1=0}^{+\infty}\int_{y_2=0}^{+\infty}y_1y_2dF_{\boldsymbol{X}}(y_1,y_2) = \mathbb{E}[X_1X_2], \end{align*}\] which completes the proof.

Applying this lemma to \(\boldsymbol{X}\) and to a pair with the same marginals but independent components, and taking the difference, we obtain the following representation of the covariance: \[ \mathbb{C}[X_1,X_2]=\int_{x_1=0}^{+\infty}\int_{x_2=0}^{+\infty} \Big\{\overline{F}_{\boldsymbol{X}}(x_1,x_2)-\overline{F}_1(x_1)\overline{F}_2(x_2)\Big\}dx_1dx_2. \] This representation portrays the covariance as a distance between the joint distribution function \(F_{\boldsymbol{X}}\) of \(\boldsymbol{X}\) and that of a pair with the same marginal distributions as \(\boldsymbol{X}\) but with independent components.

We deduce from Corollary \(\ref{TchenIn}\) that for any pair \(\boldsymbol{X}\), the following inequalities hold: \[\begin{align*} & \int_{x_1=0}^{+\infty}\int_{x_2=0}^{+\infty} \Big\{M(x_1,x_2)-F_1(x_1)F_2(x_2)\Big\}dx_1dx_2\\ & = \mathbb{C}[F_1^{-1}(U),F_2^{-1}(1-U)]\\ & \leq \mathbb{C}[X_1,X_2]\\ & \leq \int_{x_1=0}^{+\infty}\int_{x_2=0}^{+\infty} \Big\{W(x_1,x_2)-F_1(x_1)F_2(x_2)\Big\}dx_1dx_2\\ & = \mathbb{C}[F_1^{-1}(U),F_2^{-1}(U)]. \end{align*}\]

9.3.3 Values of the Linear Correlation Coefficient

If we define \[ r_{\min} = \frac{\mathbb{C}[F_1^{-1}(U),F_2^{-1}(1-U)]} {\sqrt{\mathbb{V}[X_1]\mathbb{V}[X_2]}} \] and \[ r_{\max} = \frac{\mathbb{C}[F_1^{-1}(U),F_2^{-1}(U)]} {\sqrt{\mathbb{V}[X_1]\mathbb{V}[X_2]}} \] then the linear correlation coefficient \(r(X_1,X_2)\) satisfies the inequalities \[\begin{equation} \label{InCoeffCorr} r_{\min}\leq r(X_1,X_2) \leq r_{\max}, \end{equation}\] so that a value of \(\pm 1\) is generally not admissible.

We show in the following result that the extreme values of the linear correlation coefficient characterize comonotonicity and antimonotonicity.

Proposition 9.6 Let \(\boldsymbol{X}\) have a distribution function \(F_{\boldsymbol{X}}\) in \(\mathcal{F}(F_1,F_2)\). Then,

  1. \(r(X_1,X_2)=r_{\max}\) if, and only if, \(\boldsymbol{X}\) is comonotonic.
  2. \(r(X_1,X_2)=r_{\min}\) if, and only if, \(\boldsymbol{X}\) is antimonotonic.

Proof. We establish only (1), with the reasoning leading to (2) being similar. Corollary \(\ref{TchenIn}\) shows that under the conditions of (1) \[ 0=\int_{x_1=0}^{+\infty}\int_{x_2=0}^{+\infty} \Big\{W(x_1,x_2)-F_{\boldsymbol{X}}(x_1,x_2)\Big\}dx_1dx_2. \] Since the integrand is everywhere non-negative, the integral’s nullity implies \(F_{\boldsymbol{X}}=W\), from which the result follows by virtue of Proposition \(\ref{Prop3B3}\).

The following example illustrates the possible values of the linear correlation coefficient for a pair of log-normally distributed random variables.

Example 9.7 Let \(X_1\sim \mathcal{LN}or(0,1)\) and \(X_2\sim \mathcal{LN}or (0,\sigma ^{2})\). The extreme values for the linear correlation coefficient are achieved when \(X_1\) and \(X_2\) are perfectly dependent. Thus, \[\begin{align*} r_{\max}(\sigma)&=r(\exp (Z),\exp(\sigma Z))\\ &=\frac{\exp(\sigma)-1}{\sqrt{\exp(\sigma^2)-1}\sqrt{e-1}} \end{align*}\] where \(Z\sim \mathcal{N}or(0,1)\), and \[\begin{align*} r_{\min}(\sigma)&=r(\exp (Z),\exp (-\sigma Z))\\ &=\frac{\exp(-\sigma)-1}{\sqrt{\exp(\sigma^2)-1}\sqrt{e-1}}. \end{align*}\] These bounds are plotted as functions of \(\sigma\) in Figure \(\ref{ExtPearson}\). It can be seen that \[ \lim_{\sigma\to +\infty}r_{\max}(\sigma)=\lim_{\sigma\to +\infty}r_{\min}(\sigma)=0. \] Therefore, it is possible to have a pair \(\boldsymbol{X}\) with a nearly zero correlation coefficient even though its components are perfectly dependent. This clearly contradicts the intuition that low values of the correlation coefficient imply low dependence.
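The behaviour described in Example 9.7 can be reproduced numerically. The sketch below (Python with numpy) evaluates the closed-form bounds \(r_{\max}(\sigma)\) and \(r_{\min}(\sigma)\) for several values of \(\sigma\), and checks one of them by simulating the comonotone pair \((e^Z,e^{\sigma Z})\); the Monte Carlo estimate is noisy because of the heavy lognormal tails.

```python
import numpy as np

def r_bounds(sigma):
    """Closed-form r_max(sigma), r_min(sigma) for LN(0,1) and LN(0, sigma^2) marginals."""
    denom = np.sqrt((np.exp(sigma**2) - 1.0) * (np.e - 1.0))
    return (np.exp(sigma) - 1.0) / denom, (np.exp(-sigma) - 1.0) / denom

for sigma in (0.5, 1.0, 2.0, 5.0):
    print(sigma, r_bounds(sigma))       # both bounds shrink towards 0 as sigma grows

# Monte Carlo check at sigma = 1.5 for the comonotone (upper) bound
rng = np.random.default_rng(8)
Z = rng.standard_normal(1_000_000)
sigma = 1.5
print(np.corrcoef(np.exp(Z), np.exp(sigma * Z))[0, 1], r_bounds(sigma)[0])
```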

Example (Exchangeability and correlation)

Random variables \(X_{1},...,X_{n}\) are said to be exchangeable if, for any permutation \(\pi\) of \(\{1,\ldots,n\}\), \[\begin{equation*} \left( X_{1},...,X_{n}\right) =_{\text{law}}\left(X_{\pi \left( 1\right) },...,X_{\pi \left( n\right) }\right). \end{equation*}\]

It can be noted that if the random variables \(X_{1},...,X_{n}\) are independent and identically distributed, then they are exchangeable. However, the converse is false: if \(Y\) is a symmetric random variable and \((X_{1},X_{2})=(Y,-Y)\), then the pair \((X_{1},X_{2})\) is exchangeable, but \(X_1\) and \(X_2\) are not independent (although they are identically distributed). The concept of exchangeability is very important for modeling homogeneous risks, for example.

Consider exchangeable random variables \(X_{1},...,X_{n}\). Then, denoting \(\sigma ^{2}=\mathbb{V}[X_{i}]\) and \(\rho =r(X_{i},X_{j})\) for \(i\neq j\), we have \[\begin{equation*} 0\leq \mathbb{V}[ X_{1}+...+X_{n}] =n\sigma ^{2}+n\left(n-1\right) \rho \sigma ^{2}. \end{equation*}\] As a result, the common correlation \(\rho\) of any pair among \(n\) exchangeable variables satisfies \[ \rho \geq \frac{-1}{ n-1}. \] This bound can be achieved by considering Gaussian vectors.

Despite its drawbacks, the correlation coefficient remains widely used in practice for three major reasons:

  1. The linear correlation coefficient naturally appears as a parameter in the density of Gaussian vectors.

  2. The coefficients \(a^*\) and \(b^*\) that minimize \(\mathbb{E}[(X_2-aX_1-b)^2]\) are given by \[ a^*=\frac{\mathbb{C}[X_1,X_2]}{\mathbb{V}[X_1]}\text{ and }b^*=\mathbb{E}[X_2]-a^*\mathbb{E}[X_1]. \] The relation \[ r^2(X_1,X_2)=\frac{\mathbb{V}[X_2]-\min_{a,b}\mathbb{E}[(X_2-aX_1-b)^2]}{\mathbb{V}[X_2]} \] shows that the square of the correlation coefficient represents the proportion of the variability of \(X_2\) (measured by its variance) that can be explained by a linear function of \(X_1\).

  3. The linear correlation coefficient appears as a fundamental factor in the portfolio theory introduced by Markowitz in 1952. Indeed, if we denote \(R_{M}\) the market return, \(R_{i}\) the return of the \(i\)th risky asset, and \(R_{0}\) the risk-free rate, then the CAPM allows us to write \[\begin{equation*} \mathbb{E}[R_{i}-R_0] =\beta _{i}\,\mathbb{E}[R_{M}-R_0] \text{ where }\beta _{i}=\frac{\mathbb{C}[ R_{i},R_{M}]}{\mathbb{V}[R_{M}] }. \end{equation*}\] The coefficient \(\beta _{i}\) is thus directly related to the correlation between the return of asset \(i\) and the market return.
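The regression interpretation in the second point above is easy to verify numerically. Here is a minimal sketch (Python with numpy; the simulated linear-plus-noise pair is an arbitrary choice) computing \(a^*\), \(b^*\) and checking that the variance-reduction ratio equals \(r^2(X_1,X_2)\).

```python
import numpy as np

rng = np.random.default_rng(9)
n = 200_000
X1 = rng.standard_normal(n)
X2 = 2.0 * X1 + rng.standard_normal(n)          # illustrative linear-plus-noise relation

a_star = np.cov(X1, X2, ddof=0)[0, 1] / np.var(X1)
b_star = X2.mean() - a_star * X1.mean()

resid = X2 - a_star * X1 - b_star               # best linear prediction error
r2 = (np.var(X2) - np.mean(resid**2)) / np.var(X2)
print(r2, np.corrcoef(X1, X2)[0, 1] ** 2)       # both equal the squared correlation
```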

9.3.4 Kendall’s Rank Correlation Coefficient

9.3.4.1 Definition

As reasonable and intuitive as it may seem, one must be cautious not to blindly trust the linear correlation coefficient \(r\). Indeed, while it is the canonical measure of dependence in the multivariate Gaussian world, we have seen above that it loses much of its relevance as soon as we leave this ideal world.

To address the issues encountered with the linear correlation coefficient, we can turn to rank correlation coefficients, such as Kendall’s tau and Spearman’s rho. These measures of dependence are based on the concordances and discordances observed among collected data. Given two pairs \((X_1,X_2)\) and \((X_1',X_2')\), independently and identically distributed, Kendall’s tau is defined as the probability of “concordance” (i.e., the probability that we simultaneously have \(X_1<X_1'\) and \(X_2<X_2'\), or \(X_1>X_1'\) and \(X_2>X_2'\)) minus the probability of “discordance” (i.e., the probability that we simultaneously have \(X_1>X_1'\) and \(X_2<X_2'\), or \(X_1<X_1'\) and \(X_2>X_2'\)). This leads to the following definition.

Definition 9.8 Kendall’s tau associated with the pair \((X_1, X_2)\) of random variables with continuous marginal distribution functions is defined as follows: \[\begin{eqnarray*} \tau(X_1, X_2) & = & \Pr\left[(X_1-X_1')(X_2-X_2') > 0\right]\\ && - \Pr\left[(X_1-X_1')(X_2-X_2') < 0\right], \end{eqnarray*}\] where \((X_1', X_2')\) is independent of \((X_1, X_2)\) and has the same distribution as the latter.

Kendall’s tau can also be expressed as follows: \[ \tau(X_1,X_2)=4\,\mathbb{E}\big[F_{\boldsymbol{X}}(X_1,X_2)\big]-1. \]

Proof. Starting from the Definition \(\ref{DefTau}\) of Kendall’s tau, we can write: \[\begin{eqnarray*} \tau(X_1, X_2) & = & 2\Pr\left[(X_1-X_1')(X_2-X_2') > 0\right] - 1\\ &=& 2\left\{\Pr\left[X_1 < X_1', X_2 < X_2'\right] + \Pr\left[X_1 > X_1', X_2 > X_2'\right]\right\} - 1\\ & = & 2\left\{\mathbb{E}\left[F_{\boldsymbol{X}}(X_1', X_2')\right] + \mathbb{E}\left[F_{\boldsymbol{X}}(X_1, X_2)\right]\right\} - 1. \end{eqnarray*}\] Since \((X_1',X_2')\) has the same distribution \(F_{\boldsymbol{X}}\) as \((X_1,X_2)\), the two expectations coincide, so that \(\tau(X_1,X_2)=4\,\mathbb{E}\big[F_{\boldsymbol{X}}(X_1,X_2)\big]-1\), which completes the proof.
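The definition can be checked against a library implementation. The sketch below (Python with numpy and scipy; the simulated pair is illustrative) estimates Kendall’s tau directly as the proportion of concordant minus discordant pairs and compares it with `scipy.stats.kendalltau`.

```python
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(10)
n = 2_000
Z = rng.standard_normal(n)
X1 = Z + 0.5 * rng.standard_normal(n)      # illustrative dependent pair (continuous, no ties)
X2 = Z + 0.5 * rng.standard_normal(n)

# Concordance minus discordance over all n(n-1)/2 pairs, as in the definition
i, j = np.triu_indices(n, k=1)
tau_def = np.sign((X1[i] - X1[j]) * (X2[i] - X2[j])).mean()

print(tau_def, kendalltau(X1, X2)[0])      # the two estimates coincide
```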

Kendall’s tau enjoys the property of functional invariance (as it is based on the ranks of the observations rather than on their values): for any strictly increasing functions \(g_1\) and \(g_2\), \[\begin{equation} \label{FunctInvTau} \tau\big(g_1(X_1),g_2(X_2)\big)=\tau(X_1,X_2). \end{equation}\]

In particular, Kendall’s tau satisfies Property P5 of Definition \(\ref{DefMesConc}\) of concordance measures. Indeed, when \(g\) is increasing, this is an immediate consequence of \(\eqref{FunctInvTau}\). If \(g\) is decreasing, we have: \[\begin{eqnarray*} \tau(g(X_1), X_2) & = & 2\Pr\left[(g(X_1)-g(X_1'))(X_2-X_2') > 0\right] - 1\\ &=& 2\Pr\left[(X_1-X_1')(X_2-X_2') < 0\right] - 1\\ &=& 2\left\{1 - \Pr\left[(X_1-X_1')(X_2-X_2') > 0\right]\right\} - 1\\ &=& -\tau(X_1, X_2). \end{eqnarray*}\]

It is easy to see that Kendall’s tau takes its values in the interval \([-1, 1]\) and is defined regardless of the distributions involved (the restriction to finite variance distributions no longer applies to this measure of dependence). The values \(\pm 1\) can be achieved regardless of the marginals, and they characterize perfect dependence (thus, Kendall’s tau satisfies Properties P3 and P4 of the Definition \(\ref{DefMesConc}\) of concordance measures).

Proof. In accordance with Proposition \(\ref{PropDh1}\)(i), it suffices to show that the maximum value of 1 is achieved when \(X_2=g(X_1)\) with \(g\) non-decreasing (\(g\) is the function \(F_2^{-1}\circ F_1\)). This follows from: \[ \tau\left(X_1, g(X_1)\right) = 2\Pr\left[\left(X_1-X_1'\right)\left(g(X_1)-g(X_1')\right) > 0\right] - 1 = 1, \] since \(X_1-X_1'\) and \(g(X_1)-g(X_1')\) obviously always have the same sign. Similarly, the value -1 for Kendall’s tau is achieved when \(X_2=g(X_1)\) with \(g\) non-increasing.

Conversely, if \(\tau(X_1, X_2) = 1\), then the distribution function is \(W\). Indeed, the equality: \[ \tau(X_1, X_2) = \tau(U_1, U_2) \text{ where } U_1=F_1(X_1) \text{ and } U_2=F_2(X_2), \] is valid, and we can, in accordance with Property \(\ref{NombrAl}\), assume without loss of generality that the marginals \(F_1\) and \(F_2\) are \(\mathcal{U}ni(0,1)\). We must therefore demonstrate the implication: \[ \tau(U_1, U_2) = 1 \Rightarrow (U_1, U_2) =_{\text{law}} (U, U) \text{ where } U\sim\mathcal{U}ni(0,1). \] Let \(C\) be the distribution function of \((U_1, U_2)\) and \(C_U\) that of \((U, U)\); Proposition \(\ref{Prop6.2.3}\) ensures that \(C\leq C_U\). The following inequalities hold: \[ \mathbb{E}[C(U_1, U_2)]\leq \mathbb{E}[C_U(U_1, U_2)]\leq \mathbb{E}[C_U(U, U)] = \frac{1}{2}, \] since \(\min\{U_1, U_2\}\leq U_1\). We thus have the equivalence: \[ \tau(U_1, U_2) = \tau(U, U) \Leftrightarrow \mathbb{E}[C(U_1, U_2)] = \mathbb{E}[C_U(U, U)] \] which in turn implies: \[ \mathbb{E}[C_U(U_1, U_2)] - \mathbb{E}[C(U_1, U_2)] = \int_{u_1=0}^1\int_{u_2=0}^1\left\{C_U(u_1, u_2) - C(u_1, u_2)\right\}dC(u_1, u_2) = 0, \] allowing us to conclude that \(C=C_U\).

Similarly, let \(C_L\) be the distribution function of the pair \((U, 1-U)\); Proposition \(\ref{Prop6.2.3}\) ensures that \(C_L\leq C\). We can reason as above starting from the inequalities: \[ \mathbb{E}[C_L(U, 1-U)]\leq \mathbb{E}[C_L(U_1, U_2)]\leq \mathbb{E}[C(U_1, U_2)]. \]

So we have established that Kendall’s tau is indeed a measure of concordance. We can also note that if \(X_1\) and \(X_2\) are independent, then \(\tau(X_1, X_2) = 0\). Indeed, \[\begin{eqnarray*} \tau(X_1, X_2) & = & 2\Pr\left[(X_1-X_1')(X_2-X_2') > 0\right] - 1 \\ & = & 2\left\{\Pr\left[X_1-X_1' > 0, X_2-X_2' > 0\right] + \Pr\left[X_1-X_1' < 0, X_2-X_2' < 0\right]\right\} - 1 \\ & = & 2\left\{\frac{1}{4} + \frac{1}{4}\right\} - 1 = 0. \end{eqnarray*}\]

9.3.5 Spearman’s Rank Correlation Coefficient

9.3.5.1 Definition

Just like Kendall’s tau, Spearman’s rho is based on the concepts of concordance and discordance.

Definition 9.9 Consider the pair \(\boldsymbol{X}\) with continuous marginal distribution functions \(F_1\) and \(F_2\), and let \(\boldsymbol{X}^\perp = (X_1^\perp, X_2^\perp)\) be an independent version of \(\boldsymbol{X}\) (i.e., \(\boldsymbol{X}^\perp\) is independent of \(\boldsymbol{X}\) and has the joint distribution function \(F_1F_2\)). Spearman’s rho is then defined as three times the difference between the probabilities of concordance and discordance of \(\boldsymbol{X}\) and \(\boldsymbol{X}^\perp\), i.e., \[\begin{eqnarray*} \rho(X_1, X_2) & = & 3\left\{\Pr\left[(X_1-X_1^\perp)(X_2-X_2^\perp) > 0\right] \right. \\ && \left. - \Pr\left[(X_1-X_1^\perp)(X_2-X_2^\perp) < 0\right]\right\}. \end{eqnarray*}\]

Spearman’s rank correlation coefficient addresses the shortcomings of Pearson’s linear correlation coefficient by considering not the original variables \(X_1\) and \(X_2\), but their uniform versions \(F_1(X_1)\) and \(F_2(X_2)\): one can show that \(\rho(X_1,X_2)\) is nothing but the linear correlation coefficient \(r\big(F_1(X_1),F_2(X_2)\big)\). In particular, as soon as the variables are linked by a monotone relationship, the extreme values \(\pm 1\) are attained, whatever the marginal distributions.
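This identity is easy to check empirically. The sketch below (Python with numpy and scipy; the simulated pair is illustrative) computes the Pearson correlation of the rank-based uniform transforms and compares it with `scipy.stats.spearmanr`.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(11)
n = 100_000
Z = rng.standard_normal(n)
X1 = np.exp(Z)                              # lognormal marginal
X2 = Z + 0.8 * rng.standard_normal(n)       # dependent pair with a different marginal

def pseudo_uniform(x):
    """Empirical analogue of F(X): ranks rescaled to (0, 1)."""
    return (np.argsort(np.argsort(x)) + 1) / (len(x) + 1)

rho_unif = np.corrcoef(pseudo_uniform(X1), pseudo_uniform(X2))[0, 1]
print(rho_unif, spearmanr(X1, X2)[0])       # the two values coincide (no ties)
```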

We are now able to define the supermodular order, which will allow us to compare the intensity of dependence between the components of random couples. Recall that a function \(g:{\mathbb{R}}^2\to{\mathbb{R}}\) is said to be supermodular if \[ g(x_1\wedge y_1,x_2\wedge y_2)+g(x_1\vee y_1,x_2\vee y_2)\geq g(x_1,x_2)+g(y_1,y_2) \] for all \((x_1,x_2)\) and \((y_1,y_2)\) in \({\mathbb{R}}^2\), where \(\wedge\) and \(\vee\) denote the minimum and the maximum; when \(g\) is twice differentiable, this amounts to requiring \(\frac{\partial^2}{\partial x_1\partial x_2}g\geq 0\).

Definition 9.10 The vector \({\boldsymbol{X}}\) is said to be inferior to the vector \({\boldsymbol{Y}}\) in the supermodular sense, which will now be denoted as \({\boldsymbol{X}}{\preceq_{\text{sm}}}{\boldsymbol{Y}}\), when the inequality \(\mathbb{E}[g({\boldsymbol{X}})]\leq \mathbb{E}[g({\boldsymbol{Y}})]\) holds for all supermodular functions \(g:{\mathbb{R}}^2\to {\mathbb{R}}\) such that expectations exist.

The stochastic inequality \({\boldsymbol{X}}{\preceq_{\text{sm}}}{\boldsymbol{Y}}\) should be understood as “the components of \({\boldsymbol{X}}\) are less dependent than those of \({\boldsymbol{Y}}\),” meaning that \(F_{\boldsymbol{Y}}\), on average, assigns more probability mass to points located on the upward diagonal of all rectangles in the plane than \(F_{\boldsymbol{X}}\).

Directly verifying the conditions of Definition \(\ref{DefSm}\) is generally difficult. However, a result that we will accept without proof allows us to restrict ourselves to regular (twice-differentiable) supermodular functions, as defined in Property \(\ref{PropSM}\)(i).

9.3.6 Functional Stability of Supermodular Comparisons

Consider two random couples such that \(\boldsymbol{X}\preceq_{\text{sm}}\boldsymbol{Y}\), i.e., the components \(Y_1\) and \(Y_2\) are more dependent than \(X_1\) and \(X_2\). The same judgment then holds for any non-decreasing transformations of the components of these couples: if \(f_1\) and \(f_2\) are non-decreasing, then \(\big(f_1(X_1),f_2(X_2)\big)\preceq_{\text{sm}}\big(f_1(Y_1),f_2(Y_2)\big)\), as shown by the following proof.

Proof. The result is easily obtained from Property \(\ref{PropSM}\)(ii). Indeed, for any supermodular function \(g\), define the function \[ \Psi(x_1,x_2)=g\big(f_1(x_1),f_2(x_2)\big), \] which is supermodular since \(f_1\) and \(f_2\) are non-decreasing. Then, we can write \[\begin{eqnarray*} \mathbb{E}\Big[g\big(f_1(X_1),f_2(X_2)\big)\Big]&=&\mathbb{E}[\Psi(X_1,X_2)]\\ &\leq &\mathbb{E}[\Psi(Y_1,Y_2)]\\ &&\text{ because $\Psi$ is supermodular and $\boldsymbol{X}\preceq_{\text{sm}}\boldsymbol{Y}$}\\ &=&\mathbb{E}\Big[g\big(f_1(Y_1),f_2(Y_2)\big)\Big], \end{eqnarray*}\] which completes the proof.

9.3.7 Supermodular Comparison and Fréchet Space

The functions \(g_1\) and \(g_2\) defined by \[ g_1(\boldsymbol{y})= \mathbb{I}[{\boldsymbol{y}}>{\boldsymbol{x}}]\text{ and } g_2(\boldsymbol{y})= \mathbb{I}[{\boldsymbol{y}}\le{\boldsymbol{x}}] \] are both supermodular for any fixed \({\boldsymbol{x}}\), and it is easy to see that \[\begin{equation} \label{eq9.A.16} {\boldsymbol{X}}\preceq_{\text{sm}}{\boldsymbol{Y}}\Rightarrow\left\{ \begin{array}{l} \mathbb{E}[g_1(\boldsymbol{X})]=\Pr[\boldsymbol{X}>\boldsymbol{x}]\leq\mathbb{E}[g_1(\boldsymbol{Y})]=\Pr[\boldsymbol{Y}>\boldsymbol{x}],\\ \hspace{10mm}\text{ for all }\boldsymbol{x}\in{\mathbb{R}}^2,\\ \mathbb{E}[g_2(\boldsymbol{X})]=\Pr[\boldsymbol{X}\leq\boldsymbol{x}]\leq\mathbb{E}[g_2(\boldsymbol{Y})]=\Pr[\boldsymbol{Y}\leq\boldsymbol{x}],\\ \hspace{10mm}\text{ for all }\boldsymbol{x}\in{\mathbb{R}}^2,\\ \end{array} \right. \end{equation}\] It also follows from \(\eqref{eq9.A.16}\) that \[ \boldsymbol{X}\preceq_{\text{sm}}\boldsymbol{Y}\Rightarrow X_i=_{\text{law}}Y_i\text{ for }i=1,2. \] So, the relation \(\preceq_{\text{sm}}\) can only be used within the same Fréchet space.

9.3.8 Supermodular Comparison and Joint Distribution/Tail Functions

The following result shows that in fact, \(\eqref{eq9.A.16}\) characterizes \(\preceq_{\text{sm}}\).

Characterization. Let \(\boldsymbol{X}=(X_1,X_2)\) and \(\boldsymbol{Y}=(Y_1,Y_2)\) be random couples whose distribution functions belong to the same Fréchet space \(\mathcal{F}(F_1,F_2)\). Then, \[\begin{equation} \label{eq9.A.3} \boldsymbol{X}\preceq_{\text{sm}}\boldsymbol{Y}\Leftrightarrow F_{\boldsymbol{X}}(x_1,x_2)\le F_{\boldsymbol{Y}}(x_1,x_2),\text{ for all }\boldsymbol{x}\in{\mathbb{R}}^2, \end{equation}\] or equivalently \[\begin{equation} \label{eq9.A.4} \boldsymbol{X}\preceq_{\text{sm}}\boldsymbol{Y}\Leftrightarrow \overline{F}_{\boldsymbol{X}}(x_1,x_2)\le\overline{F}_{\boldsymbol{Y}}(x_1,x_2),\text{ for all }\boldsymbol{x}\in{\mathbb{R}}^2. \end{equation}\]

Proof. Let \(g:{\mathbb{R}}^2\to{\mathbb{R}}\) satisfy \(\eqref{RegSuperm}\). Integration by parts then gives \[\begin{eqnarray*} &&\mathbb{E}[g(\boldsymbol{Y})]-\mathbb{E}[g(\boldsymbol{X})]\\ &=& \int\int_{\boldsymbol{x}\in{\mathbb{R}}^2}g(\boldsymbol{x})d\Big\{F_{\boldsymbol{Y}}(\boldsymbol{x})-F_{\boldsymbol{X}}(\boldsymbol{x})\Big\}\\ &=& \int\int_{\boldsymbol{x}\in{\mathbb{R}}^2}\frac{\partial^2}{\partial x_1\partial x_2}g(\boldsymbol{x}) \Big\{F_{\boldsymbol{Y}}(\boldsymbol{x})-F_{\boldsymbol{X}}(\boldsymbol{x})\Big\}d\boldsymbol{x}, \end{eqnarray*}\] so that \(F_{\boldsymbol{X}}\le F_{\boldsymbol{Y}}\) implies \(\mathbb{E}[g(\boldsymbol{X})]\le\mathbb{E}[g(\boldsymbol{Y})]\); the converse implication was already obtained in \(\eqref{eq9.A.16}\). This completes the proof.

Remark. It is important at this stage to emphasize that Characterization \(\ref{CorrelSm}\) is specific to dimension 2 (in the sense that the reverse of implication \(\eqref{eq9.A.16}\) is not valid in dimension 3 and higher).

9.3.9 Extreme Structures of Supermodular Dependence

The Fréchet bounds correspond to the extreme dependence structures with respect to the supermodular order (the lower bound being the weakest, the upper bound the strongest), as shown by the following result, which immediately follows from Characterization \(\ref{CorrelSm}\).

Proposition 9.7 In \(\mathcal{F}(F_1,F_2)\), we have \[ (F_1^{-1}(U),F_2^{-1}(1-U))\preceq_{\text{sm}}\boldsymbol{X}\preceq_{\text{sm}}(F_1^{-1}(U),F_2^{-1}(U)) \] where \(U\sim\mathcal{U}ni(0,1)\).

9.3.10 Supermodular Comparison and Correlation Coefficients

The following result shows that supermodular comparison of multivariate normal distributions boils down to comparing the covariance matrices.

Example (Gaussian vectors) Let \(\boldsymbol{X}\sim \mathcal{N}or(\boldsymbol{\mu}_{\boldsymbol{X}},\boldsymbol{\Sigma}_{\boldsymbol{X}})\) and \(\boldsymbol{Y}\sim \mathcal{N}or(\boldsymbol{\mu}_{\boldsymbol{Y}},\boldsymbol{\Sigma}_{\boldsymbol{Y}})\) with the same means and variances. Then, \[ \boldsymbol{X}\preceq_{\text{sm}}\boldsymbol{Y}\Leftrightarrow \mathbb{C}[X_1,X_2]\leq\mathbb{C}[Y_1,Y_2]. \] This comes from the following representation, valid for \(g\) satisfying \(\eqref{RegSuperm}\): \[\begin{eqnarray} \label{normaleq} &&\mathbb{E}[g(\boldsymbol{Y})] - \mathbb{E}[g(\boldsymbol{X})] \nonumber\\ &=&\int_0^1 \int\int_{\boldsymbol{x}\in{\mathbb{R}}^2} \Big( (\boldsymbol{\mu}_{\boldsymbol{Y}} - \boldsymbol{\mu}_{\boldsymbol{X}})^\top \nabla g(\boldsymbol{x}) \nonumber\\ &&\hspace{15mm}+ \frac{1}{2} \mathrm{tr}\big((\boldsymbol{\Sigma}_{\boldsymbol{Y}} - \boldsymbol{\Sigma}_{\boldsymbol{X}})\boldsymbol{H}_g(\boldsymbol{x})\big) \Big)\, \phi_\lambda(\boldsymbol{x}) \, d\boldsymbol{x}\, d\lambda, \end{eqnarray}\] where \(\nabla g\) and \(\boldsymbol{H}_g\) denote the gradient vector and the Hessian matrix of the function \(g\), \(\mathrm{tr}(\boldsymbol{A})\) is the trace of the matrix \(\boldsymbol{A}\) (i.e., the sum of the diagonal elements of \(\boldsymbol{A}\)), and \(\phi_\lambda\) is the density associated with the distribution \(\mathcal{N}or\big(\lambda \boldsymbol{\mu}_{\boldsymbol{Y}} + (1-\lambda) \boldsymbol{\mu}_{\boldsymbol{X}},\lambda \boldsymbol{\Sigma}_{\boldsymbol{Y}} + (1-\lambda) \boldsymbol{\Sigma}_{\boldsymbol{X}}\big)\), \(0 \le \lambda \le 1\).

If \(\boldsymbol{X}\preceq_{\text{sm}}\boldsymbol{Y}\), then the means and variances of the components of \(\boldsymbol{X}\) and \(\boldsymbol{Y}\) are the same, and \(\eqref{normaleq}\) reduces to \[\begin{eqnarray*} &&\mathbb{E}[g(\boldsymbol{Y})] - \mathbb{E}[g(\boldsymbol{X})]\\ &=&\int_0^1 \int\int_{\boldsymbol{x}\in{\mathbb{R}}^2} \frac{1}{2}\, \mathrm{tr}\big((\boldsymbol{\Sigma}_{\boldsymbol{Y}} - \boldsymbol{\Sigma}_{\boldsymbol{X}})\boldsymbol{H}_g(\boldsymbol{x})\big)\, \phi_\lambda(\boldsymbol{x}) \, d\boldsymbol{x}\, d\lambda, \end{eqnarray*}\] which proves the announced result.
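A quick simulation (Python with numpy; the correlations 0.2 and 0.7 and the two test functions are arbitrary choices) illustrates the comparison: for two standard bivariate normal pairs differing only in their correlation, the expectation of a supermodular function is larger for the more correlated pair.

```python
import numpy as np

rng = np.random.default_rng(12)
n = 1_000_000

def bivariate_normal(rho):
    """Standard bivariate normal sample with correlation rho."""
    Z1 = rng.standard_normal(n)
    Z2 = rho * Z1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    return Z1, Z2

X = bivariate_normal(0.2)     # less dependent pair
Y = bivariate_normal(0.7)     # more dependent pair: X precedes Y in the supermodular sense

# Two supermodular test functions: the product, and a stop-loss transform of the sum
for g in (lambda x1, x2: x1 * x2,
          lambda x1, x2: np.maximum(x1 + x2 - 1.0, 0.0)):
    print(np.mean(g(*X)), "<=", np.mean(g(*Y)))
```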

However, this result does not have general applicability. Aside from the Gaussian case and the case of binary variables (see Exercise \(\ref{DepBinaires}\)), supermodular comparison generally requires more than just comparing covariances. Supermodular comparison, however, never contradicts the usual correlation coefficients, as shown by the following result.

Proof. The stated result follows from \[\begin{eqnarray*} &&\Pr[f_1(X_1)>x_1,f_2(X_2)>x_2] \\ & = & \Pr[X_1>f_1^{-1}(x_1),X_2>f_2^{-1}(x_2)] \\ & \geq & \Pr[X_1>f_1^{-1}(x_1)]\Pr[X_2>f_2^{-1}(x_2)] \\ & = & \Pr[f_1(X_1)>x_1]\Pr[f_2(X_2)>x_2], \end{eqnarray*}\] which completes the proof.

Positive quadrant dependence (recall that \(\boldsymbol{X}\) is said to be positively quadrant dependent when \(F_{\boldsymbol{X}}(x_1,x_2)\geq F_1(x_1)F_2(x_2)\) for all \(\boldsymbol{x}\in{\mathbb{R}}^2\), or equivalently \(\boldsymbol{X}^\perp\preceq_{\text{sm}}\boldsymbol{X}\)) can be detected by a positive correlation coefficient, as shown in the following result, which can be easily deduced from Property \(\ref{SmCorrel}\) (remembering the interpretation of positive quadrant dependence in terms of supermodular comparison).

We know that in general, non-correlation and independence are not equivalent (just take a look at Remark \(\ref{CorrelNIndep}\) to see this). However, these two notions become equivalent under well-chosen dependence structures, such as the one discussed in this section. More precisely, when \(\boldsymbol{X}\) is positively quadrant dependent, the following statements are equivalent: (i) \(X_1\) and \(X_2\) are independent; (ii) \(r(X_1,X_2)=0\); (iii) \(\tau(X_1,X_2)=0\); (iv) \(\rho(X_1,X_2)=0\).

Proof. Let’s show the equivalence between (i) and (ii). The nullity of the linear correlation coefficient is equivalent to the nullity of the covariance, which allows us to write, thanks to Corollary \(\ref{TchenIn}\), \[ 0=\int_{x_1=0}^{+\infty}\int_{x_2=0}^{+\infty} \Big\{\overline{F}_{\boldsymbol{X}}(x_1,x_2)-\overline{F}_1(x_1)\overline{F}_2(x_2)\Big\}dx_1dx_2. \] Since the integrand is everywhere non-negative when \(\boldsymbol{X}\) has positive quadrant dependence, we deduce that \(\overline{F}_{\boldsymbol{X}}=\overline{F}_1\overline{F}_2\), which means that \(X_1\) and \(X_2\) are independent.

Now, let’s move on to the equivalence between (i) and (iii). Since \(\boldsymbol{X}\) is positively quadrant dependent, it dominates \(\boldsymbol{X}^\perp\) in the \(\preceq_{\text{sm}}\) sense. This allows us to write the chain of inequalities: \[ \mathbb{E}\big[F_1(X_1^\perp)F_2(X_2^\perp)\big]\leq \mathbb{E}\big[F_{\boldsymbol{X}}(X_1^\perp,X_2^\perp)\big]\leq \mathbb{E}\big[F_{\boldsymbol{X}}(X_1,X_2)\big]. \] The nullity of Kendall’s tau then guarantees \[ \mathbb{E}\big[F_1(X_1^\perp)F_2(X_2^\perp)\big]= \mathbb{E}\big[F_{\boldsymbol{X}}(X_1,X_2)\big] \] which in turn implies \[ 0=\int_{x_1=0}^{+\infty}\int_{x_2=0}^{+\infty} \Big\{{F}_{\boldsymbol{X}}(x_1,x_2)-{F}_1(x_1){F}_2(x_2)\Big\}dF_1(x_1)dF_2(x_2). \] The desired result then follows directly from this last equality (the integrand is everywhere non-negative, and the integral is zero).

Similarly, we can show the equivalence between (i) and (iv). The nullity of Spearman’s rho also guarantees \[ 0=\int_{x_1=0}^{+\infty}\int_{x_2=0}^{+\infty} \Big\{{F}_{\boldsymbol{X}}(x_1,x_2)-{F}_1(x_1){F}_2(x_2)\Big\}dF_1(x_1)dF_2(x_2), \] from which the desired result follows as above.

The following result is easily deduced from Properties \(\ref{PQDFuncInv}\) and \(\ref{Prop1.1.11bis}\).

Proposition 9.8 The pair \(\boldsymbol{X}\) has positive quadrant dependence if, and only if, the inequality \[ \mathbb{C}[g_1(X_1),g_2(X_2)]\geq 0 \] holds for all non-decreasing functions \(g_1\) and \(g_2\) such that the covariance exists.

This result sheds light on the link between covariance and positive quadrant dependence: for \(X_1\) and \(X_2\) to be positively quadrant dependent, not only must the covariance between \(X_1\) and \(X_2\) be non-negative, but the same must hold for the covariance between \(g_1(X_1)\) and \(g_2(X_2)\) for any non-decreasing transformations \(g_1\) and \(g_2\).

As mentioned earlier, if \(\boldsymbol{X}\) has positive quadrant dependence, it dominates \(\boldsymbol{X}^\perp\) in the supermodular sense. Returning to Proposition \(\ref{SMDecideurs}\), it is not difficult to establish the following result.

Proposition 9.9 If risks \(X_{1}\) and \(X_{2}\) have positive quadrant dependence, then \(X_{1}^{\perp }+X_{2}^{\perp }\preceq_{\text{TVaR,=}}X_{1}\,+\,X_{2}\).

If the actuary neglects the dependence between the risks and calculates the stop-loss premium \(\mathbb{E}[(X_1+X_2-d)_+]\) with retention \(d\) as if the risks were independent, they will underestimate this quantity. The same holds for any quantity that can be expressed in the form \(\mathbb{E}[g(\boldsymbol{X})]\) with \(g\) supermodular.

Returning to (\(\ref{JoeEq2.1}\)) and (\(\ref{JoeEq2.2}\)), it is easy to verify that when \(\boldsymbol{X}\) has positive quadrant dependence, the random variables \(\min\{X_1,X_2\}\), \(\min\{X_1^\perp,X_2^\perp\}\), \(\max\{X_1,X_2\}\), and \(\max\{X_1^\perp,X_2^\perp\}\) are comparable in terms of \(\preceq_{\text{VaR}}\).

Proof. It suffices to note that \[\begin{eqnarray*} \Pr[\min \{X_1,X_2\}>t]&=&\overline{F}_{\boldsymbol{X}}(t,t)\\ &\geq &\overline{F}_1(t)\overline{F}_2(t)\\ &=&\Pr[\min \{X_1^\perp,X_2^\perp\}>t]. \end{eqnarray*}\] Similarly, \[\begin{eqnarray*} \Pr[\max\{X_1,X_2\}\leq t]&=&{F}_{\boldsymbol{X}}(t,t)\\ &\geq &{F}_1(t){F}_2(t)\\ &=&\Pr[\max \{X_1^\perp,X_2^\perp\}\leq t], \end{eqnarray*}\] which completes the proof.

9.3.11 Association

9.3.11.1 Definition

In order to define a notion of positive dependence between two risks \(X_1\) and \(X_2\), one might think of using \(\mathbb{C}[X_1,X_2]\geq 0\) or \(\mathbb{C}[g_1(X_1),g_2(X_2)]\geq 0\) for any pair of increasing functions \(g_1\) and \(g_2\). In the latter case, we recover the concept of positive quadrant dependence. A more restrictive condition would be to consider covariances of the form \(\mathbb{C}[\Psi_1(X_1,X_2),\Psi_2(X_1,X_2)]\) for non-decreasing functions \(\Psi_1\) and \(\Psi_2\), and demand non-negativity of these covariances. This leads to the notion of association.

Definition 9.11 Random variables \(X_1\) and \(X_2\) are said to be associated when \[\begin{equation} \label{assoc} \mathbb{C}\big[\Psi_1(X_1,X_2),\Psi_2(X_1,X_2)\big]\geq 0 \end{equation}\] for any non-decreasing functions \(\Psi_1\) and \(\Psi_2:{\mathbb{R}}^2\to{\mathbb{R}}\) such that the covariance exists.

Returning to Proposition \(\ref{PQDbis}\), we see that association is a stronger notion than positive quadrant dependence, in the sense that \[ \boldsymbol{X}\text{ associated}\Rightarrow\boldsymbol{X}\text{ positively quadrant dependent}. \] Therefore, the results established for positively quadrant dependent risks remain valid when they are associated.

Independence appears as a limiting case of association: if \(X_1\) and \(X_2\) are independent, then they are associated, as the following proof shows.

Proof. We use Property \(\ref{PropCovCond}\)(i) to write \[\begin{eqnarray*} &&\mathbb{C}\big[\Psi_1(X_1,X_2),\Psi_2(X_1,X_2)\big]\\ &=&\mathbb{E}\Big[\mathbb{C}\big[\Psi_1(X_1,X_2),\Psi_2(X_1,X_2)\big|X_2\big]\Big]\\ &&+\mathbb{C}\Big[\mathbb{E}\big[\Psi_1(X_1,X_2)\big|X_2\big],\mathbb{E}\big[\Psi_2(X_1,X_2)\big|X_2\big]\Big]. \end{eqnarray*}\] The first term is non-negative due to Example \(\ref{PQDXX}\) and Proposition \(\ref{PQDbis}\): conditionally on \(X_2=x_2\), \(\Psi_1(X_1,x_2)\) and \(\Psi_2(X_1,x_2)\) are non-decreasing functions of \(X_1\) alone. The functions \[ x_2\mapsto \mathbb{E}\big[\Psi_i(X_1,X_2)\big|X_2=x_2\big],\hspace{2mm}i=1,2, \] are both non-decreasing (by the independence of \(X_1\) and \(X_2\)). Therefore, the random variables \(\mathbb{E}\big[\Psi_1(X_1,X_2)\big|X_2\big]\) and \(\mathbb{E}\big[\Psi_2(X_1,X_2)\big|X_2\big]\) are comonotonic, so the second term is also non-negative, as shown in Example \(\ref{PQDXX}\). This completes the proof.

Example (Ambagaspitiya counting distribution) Consider the random vector \(\boldsymbol{N}\) defined from the random vector \(\boldsymbol{M}\) as follows: \[ \left( \begin{array}{c} N_1\\ N_2\\ \end{array}\right)= \left( \begin{array}{cc} a_{11} & a_{12} \\ a_{21} & a_{22} \\ \end{array}\right) \left( \begin{array}{c} M_1\\ M_2 \\ \end{array}\right), \] where \(a_{ij}\in \mathbb{N}\) for all \(i\) and \(j\), and \(M_1\) and \(M_2\) are independent. Such a vector is associated because for any non-decreasing functions \(\Psi_1\) and \(\Psi_2:\mathbb{R}^2\to \mathbb{R}\), there exist non-decreasing functions \(\widetilde{\Psi}_1\) and \(\widetilde{\Psi}_2:\mathbb{R}^2\to \mathbb{R}\) such that \[ \mathbb{C}[\Psi_1({\boldsymbol{N}}),\Psi_2({\boldsymbol{N}})]=\mathbb{C}[\widetilde{\Psi}_1({\boldsymbol{M}}),\widetilde{\Psi}_2({\boldsymbol{M}})]. \] This latter covariance is non-negative since \(\boldsymbol{M}\) is associated according to Property \(\ref{IndAssoc}\).

Example \(\ref{Ambagaspitiya}\) can be further generalized as follows.

We have seen in Property \(\ref{PQDNor}\) that when \(\boldsymbol{X}\) follows a bivariate normal distribution, positive correlation implies positive quadrant dependence. In fact, positive correlation is synonymous with association in this case, as shown by the following result.

Proof. Without loss of generality, we can assume \(\boldsymbol{\mu}=\boldsymbol{0}\). Let \(\boldsymbol{X}'\) be a random vector independent of \(\boldsymbol{X}\) and with the same distribution. For \(\lambda\in[0,1]\), define the random vector \[ \boldsymbol{Y}(\lambda)=\lambda\boldsymbol{X}+\sqrt{1-\lambda^2}\boldsymbol{X}'. \] For a fixed \(\lambda\), \(\boldsymbol{Y}(\lambda)\sim\mathcal{N}or(\boldsymbol{0},\boldsymbol{\Sigma})\), and \[\begin{eqnarray*} \mathbb{C}[X_i,Y_j(\lambda)]&=&\mathbb{C}[X_i,\lambda X_j+\sqrt{1-\lambda^2}X_j']\\ &=&\mathbb{C}[X_i,\lambda X_j]+\mathbb{C}[X_i,\sqrt{1-\lambda^2}X_j']\\ &=&\lambda\sigma_{ij}. \end{eqnarray*}\] Define the function \[ \psi(\lambda)=\mathbb{E}\big[\Psi_1(\boldsymbol{X})\Psi_2(\boldsymbol{Y}(\lambda))\big] \] where \(\Psi_1\) and \(\Psi_2\) are non-decreasing and differentiable functions. The function \(\psi\) is continuous in \(\lambda\), and \[ \psi(0)=\mathbb{E}[\Psi_1(\boldsymbol{X})]\mathbb{E}[\Psi_2(\boldsymbol{X})] \] while \[ \psi(1)=\mathbb{E}[\Psi_1(\boldsymbol{X})\Psi_2(\boldsymbol{X})]. \] Therefore, it suffices to show that the derivative \(\psi^{(1)}\) exists and is non-negative for \(0\leq\lambda<1\); this ensures \[ \psi(1)\geq\psi(0),\text{ i.e., }\mathbb{C}\big[\Psi_1(\boldsymbol{X}),\Psi_2(\boldsymbol{X})\big]\geq 0, \] so that \(\boldsymbol{X}\) is associated. Let \(\boldsymbol{C}=\{c_{ij}\}\) be the inverse of \(\boldsymbol{\Sigma}\). The density of \(\boldsymbol{X}\) can be written as \[ f_{\boldsymbol{X}}(\boldsymbol{x})=\frac{1}{2\pi}\{|\boldsymbol{C}|\}^{1/2}\exp\left\{-\frac{1}{2} \sum_{i,j=1}^2c_{ij}x_ix_j\right\}. \] Conditioned on \(\boldsymbol{X}=\boldsymbol{x}\), \(\boldsymbol{Y}(\lambda)\) has the same distribution as \(\lambda\boldsymbol{x}+\sqrt{1-\lambda^2}\boldsymbol{X}'\) and therefore has the conditional density \[ f_{\boldsymbol{Y}(\lambda)|\boldsymbol{X}}(\boldsymbol{y}|\boldsymbol{x})=\frac{1}{1-\lambda^2} f_{\boldsymbol{X}}\left(\frac{\boldsymbol{y}-\lambda\boldsymbol{x}}{\sqrt{1-\lambda^2}}\right). \] Then, \[ \psi(\lambda)=\int\int_{\boldsymbol{x}\in\mathbb{R}^2}f_{\boldsymbol{X}}(\boldsymbol{x})\Psi_1(\boldsymbol{x}) \left\{\int\int_{\boldsymbol{y}\in\mathbb{R}^2}f_{\boldsymbol{Y}(\lambda)|\boldsymbol{X}}(\boldsymbol{y}|\boldsymbol{x})\Psi_2(\boldsymbol{y})d\boldsymbol{y}\right\} d\boldsymbol{x}. \] A difficult and tedious calculation would show that the derivative is then given by \[ \psi^{(1)}(\lambda)=\frac{1}{\lambda}\int\int_{\boldsymbol{x}\in\mathbb{R}^2}f_{\boldsymbol{X}}(\boldsymbol{x}) \left\{\sum_{i,j=1}^2\sigma_{ij}\frac{\partial}{\partial x_i}\Psi_1(\boldsymbol{x}) \frac{\partial}{\partial x_j}g(\lambda,\boldsymbol{x})\right\}d\boldsymbol{x} \] where \[ g(\lambda,\boldsymbol{x})=\int\int_{\boldsymbol{y}\in\mathbb{R}^2}\Psi_2(\lambda\boldsymbol{x}-\boldsymbol{y})\frac{f_{\boldsymbol{X}}(\boldsymbol{y}/\sqrt{1-\lambda^2})} {1-\lambda^2}d\boldsymbol{y} \] is such that \[ \frac{\partial}{\partial x_1}g(\lambda,\boldsymbol{x})\geq 0\mbox{ and }\frac{\partial}{\partial x_2}g(\lambda,\boldsymbol{x})\geq 0. \] Since \(\sigma_{ij}\geq 0\) for all \(i,j\) by assumption and \(\Psi_1\) is non-decreasing, every term of the sum above is non-negative, which completes the verification.

9.3.12 Conditional Increasingness

9.3.12.1 Definition

The abstract definition of association through inequality (\(\ref{assoc}\)) makes it a concept that is difficult to use and to establish in a concrete situation. This is why a notion of dependence that is stronger than association, yet easy to verify, is quite useful. Conditional increasingness is one such concept.

Definition 9.12 The random vector \(\boldsymbol{X}\) is said to be conditionally increasing when:

  1. For any \(x_1\leq y_1\) in the support of \(X_1\), \[ \Pr[X_2>t|X_1=x_1]\leq\Pr[X_2>t|X_1=y_1] \] for all \(t\in{\mathbb{R}}\);
  2. For any \(x_2\leq y_2\) in the support of \(X_2\), \[ \Pr[X_1>t|X_2=x_2]\leq\Pr[X_1>t|X_2=y_2] \] for all \(t\in{\mathbb{R}}\).

Definition \(\ref{DefCIS}\) can be interpreted as follows: when \(\boldsymbol{X}\) is conditionally increasing, an increase in one of the two components makes the other component stochastically larger, in the spirit of comonotonicity.

It is easy to see that independence is a limiting case of conditional increasingness (the inequalities in (1) and (2) of Definition \(\ref{DefCIS}\) become equalities in this case), and that this concept is invariant under increasing transformations, i.e., \[ \boldsymbol{X}\text{ is conditionally increasing} \] \[ \Rightarrow(f_1(X_1),f_2(X_2))\text{ is conditionally increasing} \] for all continuous and increasing functions \(f_1\) and \(f_2\).

Let us show that conditional increasingness is indeed a stronger notion of dependence than association (and therefore than positive quadrant dependence).

Proof. Consider non-decreasing functions \(\Psi_1\) and \(\Psi_2:{\mathbb{R}}^2\to{\mathbb{R}}\). Property \(\ref{PropCovCond}\)(i) allows us to write \[\begin{eqnarray*} &&\mathbb{C}[\Psi_1(X_1,X_2),\Psi_2(X_1,X_2)] \\ & = & \mathbb{E}\Big[\mathbb{C}[\Psi_1(X_1,X_2),\Psi_2(X_1,X_2)|X_1]\Big]\\ & & +\mathbb{C}\Big[\mathbb{E}[\Psi_1(X_1,X_2)|X_1], \mathbb{E}[\Psi_2(X_1,X_2)|X_1]\Big]. \end{eqnarray*}\] First, note that for each fixed \(x_1\) the functions \(t\mapsto\Psi_i(x_1,t)\), \(i=1,2\), are non-decreasing, so that \[ \mathbb{C}[\Psi_1(x_1,X_2),\Psi_2(x_1,X_2)|X_1=x_1]\geq 0, \] which implies \[ \mathbb{E}\Big[\mathbb{C}[\Psi_1(X_1,X_2),\Psi_2(X_1,X_2)|X_1]\Big]\geq 0. \] The first term is therefore non-negative. Let us show that the second term is also non-negative. Since \(\boldsymbol{X}\) is conditionally increasing, the functions \[ x_1\mapsto\mathbb{E}[\Psi_i(X_1,X_2)|X_1=x_1],\hspace{2mm}i=1,2, \] are both non-decreasing. Therefore, the random variables \(\mathbb{E}[\Psi_1(X_1,X_2)|X_1]\) and \(\mathbb{E}[\Psi_2(X_1,X_2)|X_1]\) are comonotonic, and \[ \mathbb{C}\Big[\mathbb{E}[\Psi_1(X_1,X_2)|X_1],\mathbb{E}[\Psi_2(X_1,X_2)|X_1]\Big]\geq 0, \] which completes the proof.

9.4 Introduction to Copula Theory

9.4.1 Principle

Copulas (or copulae) were introduced by (Sklar 1959). The copula is also referred to as “dependence function” by (Deheuvels 1979), or uniform representation by (Kimeldorf and Sampson 1975). Kruskal also introduced this function as early as 1958 to define notions of association, (Kruskal 1958).

The idea behind the concept of copula can be presented as follows. Let’s start with a couple \((Z_1,Z_2)\sim\mathcal{N}or(\boldsymbol{0},\boldsymbol{\Sigma})\) where \[ \boldsymbol{\Sigma}=\left(\begin{array}{cc} 1&\alpha\\ \alpha&1\end{array}\right) \] and define the couple \((\Phi(Z_1),\Phi(Z_2))\). This couple has uniform margins \(\mathcal{U}ni(0,1)\) and a joint cumulative distribution function of the form \[\begin{eqnarray} \label{CopNorm} &&C_\alpha(u_1,u_2)\\ &=&\frac{1}{2\pi\sqrt{1-\alpha^2}}\int_{\xi_1=-\infty}^{\Phi^{-1}(u_1)} \int_{\xi_2=-\infty}^{\Phi^{-1}(u_2)}\exp\left\{\frac{-(\xi_1^2-2\alpha\xi_1\xi_2+\xi_2^2)} {2(1-\alpha^2)}\right\}d\xi_1d\xi_2.\nonumber \end{eqnarray}\] The corresponding probability density function is \[\begin{eqnarray*} c_\alpha(u_1,u_2)&=&\frac{\partial^2}{\partial u_1\partial u_2}C_\alpha(u_1,u_2)\\ &=&\frac{1}{2\pi\sqrt{1-\alpha^2}}\exp\left\{\frac{-(\zeta_1^2-2\alpha \zeta_1\zeta_2+\zeta_2^2)} {2(1-\alpha^2)}\right\}\\ &&\frac{d}{du_1}\Phi^{-1}(u_1)\frac{d}{du_2}\Phi^{-1}(u_2) \end{eqnarray*}\] where \(\zeta_i=\Phi^{-1}(u_i)\), \(i=1,2\). As \[ \frac{d}{du_i}\Phi^{-1}(u_i)=\frac{1}{\phi\big(\Phi^{-1}(u_i)\big)}=\sqrt{2\pi}\exp(\zeta_i^2/2) \] we finally obtain \[\begin{equation} \label{DensCopNorm} c_\alpha(u_1,u_2) =\frac{1}{\sqrt{1-\alpha^2}}\exp\left\{\frac{-(\zeta_1^2-2\alpha \zeta_1\zeta_2+\zeta_2^2)} {2(1-\alpha^2)}\right\}\exp\left\{\frac{\zeta_1^2+\zeta_2^2}{2}\right\}. \end{equation}\] You can see in Figure \(\ref{pdfNorm1}\) graphs of this density for different values of the Kendall’s tau coefficient \[ \tau(Z_1,Z_2)=\tau\big(\Phi(Z_1),\Phi(Z_2)\big). \]

At this point, starting from the bivariate normal distribution, we have constructed a density \(\eqref{DensCopNorm}\) with marginal distributions that have been “uniformized.” Thanks to Property \(\ref{NombrAl}\), we can now move on to a density with the desired marginal distributions for the actuary (such as Pareto, Lognormal, or Gamma, for example). Specifically, let \(F_1\) and \(F_2\) be the desired marginal distributions. If we construct the couple \[ \boldsymbol{X}=\Big(F_1^{-1}\big(\Phi(Z_1)\big), F_2^{-1}\big(\Phi(Z_2)\big)\Big), \] it is easy to see that the cumulative distribution function of \(\boldsymbol{X}\) is part of \(\mathcal{F}(F_1,F_2)\), as desired. Moreover, the dependence structure is induced by that of the bivariate normal distribution. The cumulative distribution function of \(\boldsymbol{X}\) is then given by \[\begin{eqnarray*} F_{\boldsymbol{X}}(x_1,x_2)&=&\Pr\Big[F_1^{-1}\big(\Phi(Z_1)\big)\leq x_1,F_2^{-1}\big(\Phi(Z_2)\big)\leq x_2\Big]\\ &=&\Pr\Big[\Phi(Z_1)\leq F_1(x_1),\Phi(Z_2)\leq F_2(x_2)\Big]\\ &=&C_\alpha\big(F_1(x_1),F_2(x_2)\big) \end{eqnarray*}\] where \(C_\alpha\) is given by \(\eqref{CopNorm}\). The density of \(\boldsymbol{X}\) is then \[\begin{eqnarray*} f_{\boldsymbol{X}}(x_1,x_2)&=&\frac{\partial^2}{\partial x_1\partial x_2}F_{\boldsymbol{X}}(x_1,x_2)\\ &=&f_1(x_1)f_2(x_2)c_\alpha\big(F_1(x_1),F_2(x_2)\big) \end{eqnarray*}\] where \(f_1\) and \(f_2\) are the densities corresponding to \(F_1\) and \(F_2\), and where \(c_\alpha\) is the density \(\eqref{DensCopNorm}\). You can see in Figure \(\ref{pdfNorm3}\) bivariate densities with \(\mathcal{G}am(3,1)\) marginals obtained in this way.
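To fix ideas, here is a minimal numerical sketch of this construction (in Python, using numpy and scipy; the value of \(\alpha\) and the sample size are arbitrary): the normal pair is simulated, its margins are uniformized through \(\Phi\), and the \(\mathcal{G}am(3,1)\) quantile function is applied to each component.

```python
import numpy as np
from scipy.stats import norm, gamma

# Sketch of the construction X = (F1^{-1}(Phi(Z1)), F2^{-1}(Phi(Z2))):
# sample a bivariate normal pair, "uniformize" each margin with Phi,
# then apply the inverse of the desired marginal cdf (here Gam(3,1)).
rng = np.random.default_rng(seed=1)
alpha = 0.7                                   # correlation of (Z1, Z2)
n = 10_000

cov = np.array([[1.0, alpha], [alpha, 1.0]])
z = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)
u = norm.cdf(z)                               # uniform margins, normal copula
x = gamma.ppf(u, a=3, scale=1.0)              # Gam(3,1) margins, same copula

print(np.corrcoef(x[:, 0], x[:, 1])[0, 1])
```

Only the margins change in the last step; the dependence structure of the simulated pair remains that of the normal copula \(C_\alpha\).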

9.4.2 Definition

Let’s now provide a precise definition of the concept of a copula.

Definition: A two-dimensional copula is a mapping \(C\) from \([0,1]^2\) into the interval \([0,1]\) satisfying:

  1. \(C\left( u,0\right) =C\left( 0,u\right) =0\) and \(C\left( u,1\right) =C\left( 1,u\right) =u\) for all \(0\leq u\leq1\);
  2. \(C\) is supermodular.

9.4.3 Sklar’s Theorem

This result generalizes the construction based on the bivariate normal distribution that we presented as an introduction. It plays a fundamental role in copula theory, showing how the dependence structure can be separated from the marginal distributions.

Theorem (Sklar’s Theorem): Let \(\boldsymbol{X}\) be a couple whose cumulative distribution function \(F_{\boldsymbol{X}}\) belongs to \(\mathcal{F}(F_1,F_2)\) with continuous \(F_1\) and \(F_2\). Then, there exists a unique copula \(C\) such that for all \(\boldsymbol{x}\in{\mathbb{R}}^2\): \[ F_{\boldsymbol{X}}(x_1,x_2)=C\left(F_1(x_1),F_2(x_2)\right). \] Conversely, if \(C\) is a copula and \(F_1\) and \(F_2\) are univariate cumulative distribution functions, the function \(F_{\boldsymbol{X}}\) defined by the equation above is a bivariate cumulative distribution function in \(\mathcal{F}(F_1,F_2)\).

In passing, note that we have obtained an explicit formula for the copula involved in the equation above when the marginals are continuous, specifically: \[ C(\boldsymbol{u}) = F_{\boldsymbol{X}}\big(F_1^{-1}(u_1),F_2^{-1}(u_2)\big), \hspace{2mm}\boldsymbol{u}\in[0,1]^2. \]

Remark. However, Sklar’s Theorem loses much of its interest when one of the two marginal cumulative distribution functions is not continuous. In this case, the copula \(C\) involved in the representation \(\eqref{SklarMain}\) is no longer unique on the unit square \([0,1]^2\) but only on the Cartesian product of the images of \(F_1\) and \(F_2\). This multiplicity of copulas makes the decomposition \(\eqref{SklarMain}\) less useful in practice.

Remark. The use of uniform laws may seem natural, but the normal law remains the reference in statistics. As noted by (Hutchinson and Lai 1991), the basic idea behind copulas is to decouple marginal behavior and dependence structure by transforming the margins. There are indeed several advantages to reducing to uniform marginal laws, such as:

  1. Independence generally does not have a simple geometric interpretation. Few laws have a “simple” density when the components are independent, except for the case where the margins are \(\mathcal{U}ni(0,1)\), in which case the density is constant and equal to 1 on the square \([0,1] \times [0,1]\). However, it should be noted that the case where the components are Gaussian and have the same distribution, for example \(\mathcal{N}or(0,1)\), is also simple: the level curves of the density are then circular.
  2. Common measures of dependence, especially Kendall’s tau and Spearman’s rho, were developed for the case where the marginal distributions are uniform.

We have seen the normal copula in the introduction of this section (see formulas \(\eqref{CopNorm}\) and \(\eqref{DensCopNorm}\) for the cumulative distribution function and probability density function, respectively). Below, we review some other examples of copulas.

Copula of the upper Fréchet bound:

The copula corresponding to the upper Fréchet bound \(W\), denoted as \(C_W\), is given by: \[ C_W(u_1,u_2)=\min\{u_1,u_2\}, \hspace{3mm}(u_1,u_2)\in[0,1]\times[0,1]. \] This is the cumulative distribution function of the couple \((U,U)\), where \(U\sim\mathcal{U}ni(0,1)\). The graph of the function \(C_W\) is shown in Figure \(\ref{GraphCU}\).

The copula corresponding to the lower Fréchet bound \(M\), denoted as \(C_M\), is given by \[ C_M(u_1,u_2)=\max\{0,u_1+u_2-1\}, \hspace{3mm}(u_1,u_2)\in[0,1]\times[0,1]. \] This is the distribution function of the pair \((U,1-U)\), where \(U\sim\mathcal{U}ni(0,1)\). The graph of \(C_M\) is shown in Figure \(\ref{GraphCL}\).

The independence copula, denoted as \(C_I\), is given by \[ C_I(u_1,u_2)=u_1u_2,\hspace{3mm}(u_1,u_2)\in[0,1]\times[0,1]. \] The graph of \(C_I\) is shown in Figure \(\ref{GraphCI}\).

The Gumbel copula, introduced by (Gumbel 1960), is given by \[ C_\alpha(u_1,u_2)=\exp\left(-\left\{(-\ln u_1)^{\alpha} +(-\ln u_2)^{\alpha}\right\}^{1/\alpha}\right), \hspace{2mm}\alpha\geq 1. \] The Kendall’s tau is given by \[ \tau=1-\alpha^{-1}. \] The probability density associated with \(C_\alpha\) is \[ c_\alpha(\boldsymbol{u})=\frac{C_\alpha(\boldsymbol{u})}{u_1u_2}\frac{(\ln u_1\ln u_2)^{\alpha-1}}{\big((-\ln u_1)^\alpha+(-\ln u_2)^\alpha\big)^{2-\frac{1}{\alpha}}} \Big\{\big((-\ln u_1)^\alpha+(-\ln u_2)^\alpha\big)^{1/\alpha}+\alpha-1\Big\}. \] The Gumbel family contains the independence copula \(C_I\) and the upper Fréchet bound copula \(C_W\), specifically, \[ C_1=C_I\text{ and }\lim_{\alpha\to +\infty}C_\alpha=C_W. \]
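These limiting statements are easy to check numerically; the following sketch (Python, with an arbitrary grid and arbitrary parameter values) verifies that \(C_1=C_I\) and that \(C_\alpha\) approaches \(C_W\) for large \(\alpha\).

```python
import numpy as np

def gumbel_copula(u1, u2, alpha):
    """Gumbel copula C_alpha(u1, u2), alpha >= 1."""
    return np.exp(-((-np.log(u1)) ** alpha + (-np.log(u2)) ** alpha) ** (1.0 / alpha))

u = np.linspace(0.05, 0.95, 19)
U1, U2 = np.meshgrid(u, u)

# alpha = 1 gives the independence copula C_I(u1, u2) = u1 * u2 ...
print(np.max(np.abs(gumbel_copula(U1, U2, 1.0) - U1 * U2)))              # ~ 0
# ... and large alpha approaches the upper Fréchet bound C_W = min(u1, u2).
print(np.max(np.abs(gumbel_copula(U1, U2, 50.0) - np.minimum(U1, U2))))  # small
```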

This family of copulas was introduced by (Kimeldorf and Sampson 1978) and (Clayton 1978): for \(\alpha > 0\), \[\begin{equation} \label{clinton} C_\alpha(u_1,u_2)=\left(u_1^{-\alpha}+u_2^{-\alpha}-1\right)^{-1/\alpha}. \end{equation}\] The Kendall’s tau is \(\alpha/(\alpha +2)\). The associated density is \[ c_\alpha(\boldsymbol{u})= \frac{1+\alpha}{(u_1u_2)^{\alpha+1}}\big(u_1^{-\alpha}+u_2^{-\alpha}-1\big)^{-2-\frac{1}{\alpha}}. \] The Clayton family contains the upper Fréchet bound and independence copulas, specifically, \[ \lim_{\alpha\to +\infty}C_\alpha(u_1,u_2)=\min\{u_1,u_2\}=C_W(\boldsymbol{u}) \] and \[ \lim_{\alpha\to 0}C_\alpha(u_1,u_2)=u_1u_2=C_I(\boldsymbol{u}). \]

The density \(c_\alpha(\boldsymbol{u})\) is depicted in Figure \(\ref{pdfClayton1}\) for various values of \(\tau\). Figure \(\ref{pdfClayton2}\) displays the probability density of a pair of random variables with \(\mathcal{N}or(0,1)\) distribution, where the dependence structure is described by the Clayton copula. The corresponding contour plots are shown in Figure \(\ref{pdfClayton3}\). It’s evident that the characteristic ellipses of bivariate normal distribution have been altered. Lastly, Figure \(\ref{pdfClayton4}\) illustrates the density associated with a pair of random variables with \(\mathcal{G}am(3,1)\) distribution, where the dependence structure is given by the Clayton copula, for \(\tau=0.25\), 0.5, and 0.75.

(M. J. Frank 1979) introduced a family of copulas obtained by solving a functional equation. This family has the distribution function: \[ C_\alpha(u_1,u_2)=-\frac{1}{\alpha}\ln\left(1+\frac{(\exp(-\alpha u_1)-1) (\exp(-\alpha u_2)-1)}{\exp(-\alpha)-1}\right), \] where \(\alpha\neq 0\). Its density is given by: \[ c_\alpha(\boldsymbol{u})=\frac{\alpha\exp\big(-\alpha(u_1+u_2)\big)\big(1-\exp(-\alpha)\big)}{\Big\{\exp\big(-\alpha(u_1+u_2)\big) -\exp(-\alpha u_1)-\exp(-\alpha u_2)+\exp(-\alpha)\Big\}^2}. \]

The probability density associated with the Frank copula is shown in Figure \(\ref{pdfFrank1}\) for \(\tau=0.1\Leftrightarrow \alpha=0.91\), \(\tau=0.4\Leftrightarrow \alpha=4.16\), \(\tau=0.7\Leftrightarrow \alpha=11.4\), \(\tau=0.9\Leftrightarrow \alpha=20.9\). For \(\tau=0.1\), 0.4, 0.7, and 0.9, Figure \(\ref{pdfFrank2}\) displays the probability density of a pair of random variables with the Frank copula and marginal distributions \(\mathcal{N}or(0,1)\). As shown in Figure \(\ref{pdfFrank3}\), the Frank copula is not limited to positive dependence: it depicts the density \(c_\alpha(\boldsymbol{u})\) for \(\tau=-0.4\) and \(\tau=-0.7\). Figure \(\ref{pdfFrank4}\) illustrates the density for \(\tau=-0.4\) and \(\tau=-0.7\) with marginal distributions \(\mathcal{N}or(0,1)\).

The Frank family contains the copulas \(C_M\), \(C_W\), and \(C_I\) as special cases; specifically, \[ \lim_{\alpha\to -\infty}C_\alpha=C_M,\hspace{2mm} \lim_{\alpha\to +\infty}C_\alpha=C_W\text{ and } \lim_{\alpha\to 0}C_\alpha=C_I. \]

9.4.4 Properties of Copulas

9.4.4.1 Survival Copulas

If \(C\) is a copula, the same applies to \(\overline{C}\) defined as \[ \overline{C}(u_1,u_2)=C(1-u_1,1-u_2)+u_1+u_2-1; \] \(\overline{C}\) is called the survival copula associated with copula \(C\). The representation of the survival function \(\overline{F}_{\boldsymbol{X}}\) of \(\boldsymbol{X}\) in terms of \(\overline{F}_1\) and \(\overline{F}_2\) is given by \[\begin{eqnarray*} \overline{F}_{\boldsymbol{X}}(\boldsymbol{x})&=&1-F_1(x_1)-F_2(x_2)+F_{\boldsymbol{X}}(\boldsymbol{x})\\ &=& \overline{C}(\overline{F}_1(x_1),\overline{F}_2(x_2)); \end{eqnarray*}\] thus, \(\overline{C}\) couples the univariate survival functions just as \(C\) coupled the univariate distribution functions in \(\eqref{SklarMain}\).

Remark. It is important not to confuse the survival copula \(\overline{C}\) with the survival function of a pair of uniform random variables \((U_1,U_2)\) whose joint cumulative distribution function is \(C\), which is given by \[ \Pr[U_1>u_1,U_2>u_2]=1-u_1-u_2+C(u_1,u_2)\neq\overline{C}(u_1,u_2). \] The function \[ C^*(u_1,u_2)=1-u_1-u_2+C(u_1,u_2) \] is called the dual copula of \(C\).

::: {.example}[Pareto Copula] Consider the joint cumulative distribution function \(\eqref{CdfBivPar}\); the underlying copula is given by \[ C(u_1,u_2)=u_1+u_2-1+\Big((1-u_1)^{-1/\alpha}+(1-u_2)^{-1/\alpha}-1\Big)^{-\alpha}. \] This function is also called the Pareto copula (as it corresponds to the bivariate Pareto distribution). The survival copula is \[ \overline{C}(\boldsymbol{u})=\Big(u_1^{-1/\alpha}+u_2^{-1/\alpha}-1\Big)^{-\alpha}, \] where we recognize the Clayton copula \(\eqref{clinton}\) with parameter \(1/\alpha\). The dual copula is given by \[ {C}^*(\boldsymbol{u})=\Big((1-u_1)^{-1/\alpha}+(1-u_2)^{-1/\alpha}-1\Big)^{-\alpha}. \] :::
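These identities are easy to verify numerically; the following Python sketch (with an arbitrary value of \(\alpha\)) checks that \(C(1-u_1,1-u_2)+u_1+u_2-1\) coincides with the Clayton form given above.

```python
import numpy as np

alpha = 2.0  # parameter of the Pareto copula above

def pareto_copula(u1, u2, a):
    return u1 + u2 - 1.0 + ((1 - u1) ** (-1 / a) + (1 - u2) ** (-1 / a) - 1.0) ** (-a)

def survival_copula(u1, u2, a):
    # bar{C}(u1, u2) = C(1-u1, 1-u2) + u1 + u2 - 1
    return pareto_copula(1 - u1, 1 - u2, a) + u1 + u2 - 1.0

def clayton_like(u1, u2, a):
    # Clayton copula with parameter 1/a, as identified in the example
    return (u1 ** (-1 / a) + u2 ** (-1 / a) - 1.0) ** (-a)

u = np.linspace(0.05, 0.95, 19)
U1, U2 = np.meshgrid(u, u)
print(np.max(np.abs(survival_copula(U1, U2, alpha) - clayton_like(U1, U2, alpha))))  # ~ 0
```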

The utility of copulas comes from the fact that they are not affected by monotonically increasing transformations of the components of the vectors involved, or vary predictably when these transformations are monotonic but not increasing.

Proposition 9.10 Let \(X_1\) and \(X_2\) be two random variables with cumulative distribution functions \(F_1\) and \(F_2\), both continuous, and their dependence structure described by copula \(C\), and let \(g_1\) and \(g_2\) be monotonic functions. Then:

  1. if \(g_1\) and \(g_2\) are increasing, then \(\big(g_1(X_1),g_2(X_2)\big)\) also has \(C\) as its copula.
  2. if \(g_1\) is increasing and \(g_2\) is decreasing, then \(\big(g_1(X_1),g_2(X_2)\big)\) has the copula \(u_1-C(u_1,1-u_2)\).
  3. if \(g_1\) is decreasing and \(g_2\) is increasing, then \(\big(g_1(X_1),g_2(X_2)\big)\) has the copula \(u_2-C(1-u_1,u_2)\).
  4. if \(g_1\) and \(g_2\) are decreasing, then \(\big(g_1(X_1),g_2(X_2)\big)\) has the copula \(u_1+u_2-1+C(1-u_1,1-u_2)\).

Proof.

  1. The cumulative distribution function of \(g_i(X_i)\), \(i=1,2\), is \[\begin{eqnarray*} G_i(x)&=&\Pr[g_i(X_i)\leq x]\\ &=&\Pr[X_i\leq g_i^{-1}(x)]\\ &=&F_i(g_i^{-1}(x)),\hspace{2mm}i=1,2. \end{eqnarray*}\] Denoting \(\widetilde{C}\) as the copula for \((g_1(X_1),g_2(X_2))\), we have \[\begin{eqnarray*} \widetilde{C}\Big(G_1(x_1),G_2(x_2)\Big)&=&\Pr[g_1(X_1)\leq x_1,g_2(X_2)\leq x_2]\\ &=&\Pr[X_1\leq g_1^{-1}(x_1),X_2\leq g_2^{-1}(x_2)]\\ &=&C\Big(F_1\big(g_1^{-1}(x_1)\big),F_2\big(g_2^{-1}(x_2)\big)\Big)\\ &=&C\Big(G_1(x_1),G_2(x_2)\Big) \end{eqnarray*}\] which allows us to conclude that \(\widetilde{C}\equiv C\).

  2. The cumulative distribution function of \(g_1(X_1)\) is \(G_1\) as defined in (i), but that of \(g_2(X_2)\) is now given by \[\begin{eqnarray*} G_2(x)&=&\Pr[g_2(X_2)\leq x]\\ &=&\Pr[X_2\geq g_2^{-1}(x)]\\ &=&1-F_2(g_2^{-1}(x)). \end{eqnarray*}\] Therefore, denoting \(\widetilde{C}\) as the copula of \((g_1(X_1),g_2(X_2))\), we have \[\begin{eqnarray*} \widetilde{C}\Big(G_1(x_1),G_2(x_2)\Big)&=&\Pr[g_1(X_1)\leq x_1,g_2(X_2)\leq x_2]\\ &=&\Pr[X_1\leq g_1^{-1}(x_1),X_2\geq g_2^{-1}(x_2)]\\ &=&\Pr[X_1\leq g_1^{-1}(x_1)]\\ &&-\Pr[X_1\leq g_1^{-1}(x_1),X_2\leq g_2^{-1}(x_2)]\\ &=&G_1(x_1)-C\Big(G_1(x_1),1-G_2(x_2)\Big) \end{eqnarray*}\] which concludes the proof of (ii). The proof of (iii)-(iv) is entirely similar and is left as an exercise to the reader.

It is worth noting that the form of the copula for the pair \(\big(g_1(X_1),g_2(X_2)\big)\) does not explicitly depend on the functions \(g_1\) and \(g_2\) but only on whether they are increasing or decreasing. This indicates that the copula perfectly summarizes the dependence structure between the variables at hand without being influenced by the scale of each of them.

Let \((U_1,U_2)\) be a pair of random variables with copula \(C\) as their joint cumulative distribution function. The conditional cumulative distribution function of \(U_2\) given \(U_1=u_1\) is given by \[\begin{eqnarray*} C_{2|1}(u_2|u_1)&=&\Pr[U_2\leq u_2|U_1=u_1]\\ &=&\lim_{\Delta u_1\to 0} \frac{\Pr[u_1\leq U_1\leq u_1+\Delta u_1, U_2\leq u_2]}{\Pr[u_1\leq U_1\leq u_1+\Delta u_1]}\\ &=&\lim_{\Delta u_1\to 0} \frac{C(u_1+\Delta u_1,u_2)-C(u_1,u_2)}{\Delta u_1}\\ &=&\frac{\partial}{\partial u_1} C(u_1,u_2). \end{eqnarray*}\] If \(X_1\) and \(X_2\) have cumulative distribution functions \(F_1\) and \(F_2\), both continuous, and copula \(C\), the conditional cumulative distribution function of \(X_2\) given \(X_1=x_1\) is given by \[ \Pr[X_2\leq x_2|X_1=x_1]=F_{2|1}(x_2|x_1)=C_{2|1}(F_2(x_2)|F_1(x_1)). \] By symmetry, the conditional cumulative distribution function of \(X_1\) given \(X_2=x_2\) is \[ \Pr[X_1\leq x_1|X_2=x_2]=F_{1|2}(x_1|x_2)=C_{1|2}(F_1(x_1)|F_2(x_2)) \] where \[ C_{1|2}(u_1|u_2)=\frac{\partial}{\partial u_2}C(u_1,u_2). \]

::: {.example}[Clayton Copula] Consider a pair \((U_1,U_2)\) with uniform \(\mathcal{U}ni(0,1)\) marginals and the Clayton copula \(C_\alpha\) as their joint cumulative distribution function. The conditional distribution of \(U_2\) given \(U_1=u_1\) is described by \[ C_{2|1}(u_2|u_1)=\frac{\partial}{\partial u_1}C_\alpha(u_1,u_2)=\big(1+u_1^\alpha(u_2^{-\alpha}-1)\big)^{-1-\frac{1}{\alpha}}. \] If \(X_1\) and \(X_2\) have cumulative distribution functions \(F_1\) and \(F_2\), both continuous, and the copula \(C_\alpha\), then \[ \Pr[X_2\leq x_2|X_1=x_1]=\Big(1+\big(F_1(x_1)\big)^\alpha\Big(\big(F_2(x_2)\big)^{-\alpha}-1\Big)\Big)^{-1-\frac{1}{\alpha}}. \]

:::
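The expression for \(C_{2|1}\) provides a simple simulation algorithm for the Clayton copula: draw \(U_1\) uniformly, then invert \(C_{2|1}(\cdot|u_1)\) at an independent uniform variate \(v\); a short calculation gives \(u_2=\big(1+u_1^{-\alpha}(v^{-\alpha/(1+\alpha)}-1)\big)^{-1/\alpha}\). A minimal Python sketch (parameter values arbitrary, the function name is ours):

```python
import numpy as np
from scipy.stats import kendalltau

def rclayton(n, alpha, rng):
    """Sample n pairs from the Clayton copula by inverting C_{2|1}(.|u1)."""
    u1 = rng.uniform(size=n)
    v = rng.uniform(size=n)                     # plays the role of C_{2|1}(u2|u1)
    w = v ** (-alpha / (1.0 + alpha))
    u2 = (1.0 + (w - 1.0) * u1 ** (-alpha)) ** (-1.0 / alpha)
    return u1, u2

rng = np.random.default_rng(seed=1)
alpha = 2.0
u1, u2 = rclayton(20_000, alpha, rng)
# Kendall's tau should be close to alpha / (alpha + 2) = 0.5 here.
tau, _ = kendalltau(u1, u2)
print(tau, alpha / (alpha + 2.0))
```

The same inversion of \(C_{2|1}\) works for any copula whose partial derivative with respect to \(u_1\) can be inverted explicitly.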

The comparison of dependence using \(\preceq_{\text{sm}}\) relies exclusively on the copulas of the pairs involved when their common marginals are continuous. Indeed, \(\preceq_{\text{sm}}\) can only be used within \(\mathcal{F}(F_1,F_2)\), and Property \(\ref{SmFonc}\) allows us to write \[\begin{eqnarray*} \boldsymbol{X}\preceq_{\text{sm}}\boldsymbol{Y}&\Leftrightarrow & \big(F_1(X_1),F_2(X_2)\big)\preceq_{\text{sm}}\big(F_1(Y_1),F_2(Y_2)\big)\\ &\Leftrightarrow &C_{\boldsymbol{X}}(u_1,u_2)\leq C_{\boldsymbol{Y}}(u_1,u_2)\text{ for all }u_1,u_2\in [0,1], \end{eqnarray*}\] where \(C_{\boldsymbol{X}}\) and \(C_{\boldsymbol{Y}}\) denote the copulas of the respective pairs \(\boldsymbol{X}\) and \(\boldsymbol{Y}\).

Example 9.8 The parameter \(\alpha\) controls the degree of dependence expressed by most of the parametric copula families presented earlier. Specifically, we have \[ C_\alpha(\boldsymbol{u})\leq C_{\alpha '}(\boldsymbol{u})\text{ for all }\boldsymbol{u}\in[0,1]^2 \] when \(\alpha\leq \alpha '\) for the Gumbel copula, Clayton copula, and Frank copula.

Property \(\ref{PQDFuncInv}\) shows that positive quadrant dependence is induced by the copula, without reference to the marginal distributions. Indeed, if \(X_1\) and \(X_2\) have cumulative distribution functions \(F_1\) and \(F_2\), both continuous, and copula \(C\), \[\begin{eqnarray*} &&\boldsymbol{X}\text{ is positively quadrant dependent }\\ &\Leftrightarrow & \big(F_1(X_1),F_2(X_2)\big) \text{ is positively quadrant dependent }\\ &\Leftrightarrow &C(u_1,u_2)\geq C_I(u_1,u_2)=u_1u_2\text{ for all }u_1,u_2\in [0,1]. \end{eqnarray*}\] Thus, a pair is positively quadrant dependent if and only if the copula of that pair dominates the independence copula everywhere in the unit square.

Example 9.9 The copula \(C_W\) expresses positive quadrant dependence. The Gumbel copula and the Clayton copula both express positive quadrant dependence for all values of the parameter \(\alpha\). These copulas, therefore, only express positive dependence (and are not suitable for modeling situations where negative dependence between the variables is suspected).

The Frank copulas express positive quadrant dependence when \(\alpha\geq 0\).

Similarly, if \(X_1\) and \(X_2\) have cumulative distribution functions \(F_1\) and \(F_2\), both continuous, and copula \(C\), \[\begin{eqnarray*} &&\boldsymbol{X}\text{ is associated}\\ & \Rightarrow & \big(F_1(X_1),F_2(X_2)\big)\text{ is associated}\\ & \Rightarrow & \text{ the copula }C\text{ is associated}. \end{eqnarray*}\] However, there is no simple condition to check if a given copula is associated. For conditional monotonicity, we have \[\begin{eqnarray*} &&\boldsymbol{X}\text{ is conditionally increasing} \\ & \Rightarrow & \big(F_1(X_1),F_2(X_2)\big)\text{ is conditionally increasing}\\ & \Rightarrow & \text{ the copula }C\text{ is conditionally increasing.} \end{eqnarray*}\] In the case of conditional monotonicity, we can state the following result.

Proposition 9.11 The copula \(C\) is conditionally increasing if and only if the following two conditions are simultaneously satisfied:

  1. for all \(0\leq u_2\leq 1\), \[ u_1\longmapsto \frac{\partial}{\partial u_1} C\left(u_1,u_2\right) \] is non-increasing, i.e., \(u_1\mapsto C\left( u_1,u_2\right)\) is a concave function.
  2. for all \(0\leq u_1\leq 1\), \[ u_2\longmapsto \frac{\partial}{\partial u_2} C\left(u_1,u_2\right) \] is non-increasing, i.e., \(u_2\mapsto C\left( u_1,u_2\right)\) is a concave function.

9.4.5 Dependence Measures and Copulas

9.4.5.1 Pearson’s Correlation Coefficient

We can express Pearson’s linear correlation coefficient in the following form: \[\begin{eqnarray*} r(X_1,X_2)&=&\frac{1}{\sqrt{\mathbb{V}[X_1]\mathbb{V}[X_2]}}\\ &&\int\int_{(u_1,u_2)\in[0,1]^2}\big\{C(u_1,u_2)-u_1u_2\big\} dF_1^{-1}(u_1)dF_2^{-1}(u_2). \end{eqnarray*}\] Thus, we see that the linear correlation coefficient \(r\) depends not only on the copula but also on the marginal distributions. This explains some of the drawbacks of this dependence measure.
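This representation can be illustrated numerically. In the following Python sketch, the margins are taken to be unit exponentials (so that both variances equal 1 and \(dF_i^{-1}(u)=du/(1-u)\)) and the copula is Clayton; both are arbitrary choices. The double integral is compared with a Monte Carlo estimate of \(r\).

```python
import numpy as np
from scipy.integrate import dblquad

alpha = 2.0  # Clayton parameter

def clayton(u1, u2, a=alpha):
    return (u1 ** (-a) + u2 ** (-a) - 1.0) ** (-1.0 / a)

# Double integral of {C(u1,u2) - u1*u2} dF1^{-1}(u1) dF2^{-1}(u2) for Exp(1) margins.
integral, _ = dblquad(
    lambda u2, u1: (clayton(u1, u2) - u1 * u2) / ((1.0 - u1) * (1.0 - u2)),
    0.0, 1.0, lambda u1: 0.0, lambda u1: 1.0,
)

# Monte Carlo counterpart: simulate the Clayton copula, map to Exp(1) margins.
rng = np.random.default_rng(seed=1)
n = 100_000
u1 = rng.uniform(size=n)
w = rng.uniform(size=n) ** (-alpha / (1.0 + alpha))
u2 = (1.0 + (w - 1.0) * u1 ** (-alpha)) ** (-1.0 / alpha)
x1, x2 = -np.log(1.0 - u1), -np.log(1.0 - u2)
print(integral, np.corrcoef(x1, x2)[0, 1])   # the two values should be close
```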

Kendall’s tau and Spearman’s rho coefficients are based on the same concept: concordance. We can then generalize Kendall’s tau and Spearman’s rho as follows.

Definition 9.13 Let $( X_{1},X_{2}) $ and $( Y_{1},Y_{2}) $ be two independent pairs, with copulas \(C_{1}\) and \(C_{2}\), and their cumulative distribution functions \(F_{\boldsymbol{X}}\) and \(F_{\boldsymbol{Y}}\) are in \(\mathcal{F}(F_1,F_2)\). The concordance measure \(Q\) is defined as the difference between the probabilities of concordance and discordance of the vectors \(\boldsymbol{X}\) and \(\boldsymbol{Y}\), i.e., \[\begin{eqnarray*} Q\left( C_{1},C_{2}\right)& =&{\Pr}\big[ \left(X_{1}-Y_{1}\right) \left( X_{2}-Y_{2}\right) >0\big]\\ &&-{\Pr}\big[ \left(X_{1}-Y_{1}\right) \left( X_{2}-Y_{2}\right) <0\big]\\ &=&4\int\int_{\left[ 0,1\right]^{2}}C_{2}\left( u_1,u_2\right) dC_{1}\left( u_1,u_2\right) -1. \end{eqnarray*}\]%

Example 9.10 Taking the copulas of the Fréchet bounds \(C_M\) and \(C_W\) and the independence copula \(C_I\), we have the following table: \[\begin{equation*} \begin{tabular}{|c|ccc|} \hline $Q\left( C_{\cdot },C_{\cdot }\right) $ & $C_M$ & $C_I$ & $C_W$ \\ \hline $C_M$ & $-1$ & $-1/3$ & $0$ \\ $C_I$ & $-1/3$ & $0$ & $1/3$ \\ $C_W$ & $0$ & $1/3$ & $1$ \\ \hline \end{tabular}% \end{equation*}\]%
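The middle row of this table can be recovered by simulation: with \((U_1,U_2)\) drawn from \(C_I\), one has \(Q(C_I,C)=4\,\mathbb{E}[C(U_1,U_2)]-1\). A short Python sketch (sample size arbitrary):

```python
import numpy as np

rng = np.random.default_rng(seed=1)
u1, u2 = rng.uniform(size=(2, 200_000))       # a sample from the independence copula C_I

copulas = {
    "C_M": lambda a, b: np.maximum(a + b - 1.0, 0.0),   # lower Fréchet bound
    "C_I": lambda a, b: a * b,                           # independence
    "C_W": lambda a, b: np.minimum(a, b),                # upper Fréchet bound
}

# Q(C_I, C) = 4 E[C(U1, U2)] - 1; expected values: -1/3, 0 and 1/3.
for name, C in copulas.items():
    print(name, 4.0 * np.mean(C(u1, u2)) - 1.0)
```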

In general, for any copula \(C\), the following inequalities hold: \[\begin{eqnarray*} 0&\leq& Q\left(C,C_W\right) \leq 1,\\ -1&\leq& Q\left( C,C_M\right) \leq 0,\\ -1/3&\leq& Q\left( C,C_I\right) \leq 1/3. \end{eqnarray*}\]

Kendall’s tau can be expressed using the concordance measure \(Q\) introduced above, as shown in the following result.

In cases where the derivative exists, Kendall’s tau can also be expressed as \[\begin{eqnarray*} \tau \left( X_1,X_2\right)&=&4\int\int_{\left[ 0,1\right] ^{2}}C\left( u_1,u_2\right)\frac{\partial ^{2}}{% \partial u_1\partial u_2}C\left( u_1,u_2\right) du_1du_2-1\\ &=&1-4\int\int_{\left[ 0,1\right] ^{2}}\frac{\partial }{\partial u_1}C\left( u_1,u_2\right) \frac{\partial }{% \partial u_2}C\left( u_1,u_2\right) du_1du_2. \end{eqnarray*}\]
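The second expression lends itself to numerical verification. The Python sketch below applies it to the Clayton copula, whose partial derivative is available in closed form, and compares the result with \(\alpha/(\alpha+2)\) (parameter values arbitrary).

```python
import numpy as np
from scipy.integrate import dblquad

def tau_from_partials(alpha):
    """Kendall's tau of the Clayton copula via 1 - 4*int d1C * d2C du1 du2."""
    def d1C(u1, u2):   # partial derivative of the Clayton copula w.r.t. u1
        s = u1 ** (-alpha) + u2 ** (-alpha) - 1.0
        return u1 ** (-alpha - 1.0) * s ** (-1.0 / alpha - 1.0)

    integral, _ = dblquad(lambda u2, u1: d1C(u1, u2) * d1C(u2, u1),
                          0.0, 1.0, lambda u1: 0.0, lambda u1: 1.0)
    return 1.0 - 4.0 * integral

for alpha in (0.5, 1.0, 3.0):
    print(alpha, tau_from_partials(alpha), alpha / (alpha + 2.0))  # should agree
```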

Now let’s move on to Spearman’s rho. The latter can be expressed as follows using the concordance measure \(Q\).

In Figure \(\ref{FIG-KENDALL-SPEARMAN-2}\), the values of Kendall’s \(\tau\) and Spearman’s \(\rho\) are represented for common families of one-parameter copulas. It can be noted that for most copulas (Clayton, Gumbel, Gaussian, Frank, etc.), a large part of the region of admissibility of Corollary \(\ref{InegaliteRhoTau}\) represented in Figure \(\ref{FIG-KENDALL-SPEARMAN-1}\) is not reached.

Remark. It is entirely possible to have both \(\rho =0\) and \(\tau =0\) without having independence between the random variables. To see this, consider the “cubic” copula, with the following expression: \[ C\left( u_1,u_2\right) =u_1u_2+\theta \big( u_1\left( u_1-1\right) \left( 2u_1-1\right) % \big) \big( u_2\left( u_2-1\right) \left( 2u_2-1\right) \big) \] where \(\theta \in \left[ -1,2\right]\). The shape of this copula is shown in Figure \(\ref{copula-cubique1}\) for extreme values \(\theta=-1\) and \(\theta=2\). For any \(\theta \neq 0\), \(\rho =0\) and \(\tau =0\) without having independence.

In economic literature, the Gini coefficient is used to study income differences, for example, between two populations.

Definition 9.14 Let \(X_1\) and \(X_2\) be two random variables with cumulative distribution functions \(F_1\) and \(F_2\), both continuous, and copula \(C\). The Gini coefficient \(\gamma\) of the pair \(\left(X_1,X_2\right)\) is given by \[\begin{eqnarray*} \gamma \left( X_1,X_2\right)& =&2\mathbb{E}\Big[ \big| F_1(X_1)+F_2(X_2)-1\big|-\big| F_1(X_1)-F_2(X_2)\big| \Big] \\ &=&2\int\int_{\left[ 0,1\right] ^{2}}\big(\left| u_1+u_2-1\right| -\left| u_1-u_2\right| \big) dC\left( u_1,u_2\right). \end{eqnarray*}\]

The Gini coefficient can be expressed in various forms, including \[\begin{eqnarray} \gamma \left( X_1,X_2\right) &=&4 \int_{0}^{1}C\left( u,1-u\right) du\nonumber\\ &&-4\int_{0}^{1}\left\{ u-C\left( u,u\right) \right\} du \label{gini2} \\ &=&4\int\int_{\left[ 0,1\right] ^{2}}C\left( u_1,u_2\right) dC_W\left(u_1,u_2\right)\nonumber\\ &&+4\int\int_{\left[ 0,1\right] ^{2}}C\left( u_1,u_2\right)dC_M\left( u_1,u_2\right) -2, \nonumber \end{eqnarray}\]% where, in the right term of \(\eqref{gini2}\), the second integral corresponds to the distance between the diagonal of the upper Fréchet bound and that of the copula \(C\).

As was the case for Kendall’s tau and Spearman’s rho, certain specific values of the Gini coefficient characterize the copulas \(C_M\), \(C_W\), and, under positive quadrant dependence, \(C_I\).

Proposition 9.12 Consider \(\boldsymbol{X}\) with cumulative distribution function \(F_{\boldsymbol{X}}\) in \(\mathcal{F}(F_1,F_2)\). Then,

  1. \(\gamma \left( X_1,X_2\right) =-1\) if and only if \(\boldsymbol{X}\) is antimonotone;
  2. \(\gamma \left( X_1,X_2\right) =1\) if and only if \(\boldsymbol{X}\) is comonotone;
  3. if \(X_1\) and \(X_2\) are positively quadrant dependent, then \(\gamma \left(X_1,X_2\right) =0\) if and only if \(X_1\) and \(X_2\) are independent.

Proof. We prove (2); the proofs for (1) and (3) are similar and are therefore omitted. If \(\boldsymbol{X}\) is comonotone, then trivially \(\gamma =1\). Conversely, suppose \(\gamma =1\), i.e., \(\gamma(X_1,X_2)\) attains the value of a comonotone pair. Using \(\eqref{gini2}\), this gives \[ \int\int_{\left[ 0,1\right] ^{2}}\underset{\geq 0\text{ for all }u_1,u_2}{\underbrace{\Big\{ C_W\left( u_1,u_2\right) -C\left( u_1,u_2\right) \Big\}}} dC_W\left( u_1,u_2\right) \] \[ +\int\int_{\left[ 0,1\right]^2}\underset{\geq 0\text{ for all }u_1,u_2}{\underbrace{\Big\{C_W\left( u_1,u_2\right) -C\left( u_1,u_2\right) \Big\}}} dC_M\left( u_1,u_2\right) =0, \] meaning that \(C=C_W\), i.e., \(\boldsymbol{X}\) is comonotone.

9.5 Archimedean Copulas

9.5.1 Definition

Most one-parameter copulas presented in the previous section belong to a larger family with a functional parameter: Archimedean copulas.

Definition 9.15 Let \(\varphi :\left[ 0,1\right] \rightarrow {\mathbb{R}}^+\) be a convex and strictly decreasing function, such that \(\varphi \left( 1\right) =0\). Let us define \(\varphi ^{\left[ -1\right] }\), the pseudo-inverse of the function \(\varphi\), as \[\begin{equation*} \varphi ^{[-1]}\left( t\right) =\left\{ \begin{array}{ll} \varphi ^{-1}\left( t\right), & \text{if }0\leq t\leq \varphi \left( 0\right), \\ 0, & \text{if }\varphi \left( 0\right) \leq t\leq +\infty. \end{array}% \right. \end{equation*}\]% This pseudo-inverse defines a continuous function, decreasing on \({\mathbb{R}}^{+}\), and strictly decreasing on \([0,\varphi(0)]\). For simplicity, we will denote \(\varphi ^{-1}\) this pseudo-inverse.

The function \[\begin{equation*} C\left( u_1,u_2\right) =\varphi ^{-1}\left( \varphi \left( u_1\right) +\varphi \left( u_2\right) \right) \text{ for }0\leq u_1,u_2\leq 1, \end{equation*}\]% is a copula, called an Archimedean copula. The function \(\varphi\) is called the generator of the copula.

Remark. It is worth noting that there is uniqueness of the generator, up to a multiplicative constant: thus \(\varphi\) and \(\kappa\varphi\) generate the same copula, for any \(\kappa >0\), and conversely, if \[ \varphi_1 ^{-1}\left( \varphi_1 \left( u\right) +\varphi_1 \left( v\right) \right) =\varphi_2 ^{-1}\left( \varphi_2 \left( u\right) +\varphi_2 \left( v\right) \right) \] then there exists \(\kappa >0\) such that \(\varphi_2 =\kappa\varphi_1\). The proof is provided in (B. Schweizer and Sklar 1983) in the context of probability metric spaces and for Archimedean binary operations (Theorem 5.4.8), but the proof remains unchanged for Archimedean copulas.

From now on, we will refer to the generator of an Archimedean copula, even though it is defined up to a multiplicative constant.

Example 9.11 The independence copula \(C_I\) is an Archimedean copula with the generator \(\varphi\left( t\right) =\ln\left( 1/t\right)=-\ln t\). Similarly, the Gumbel copula is an Archimedean copula with the generator \(\varphi \left( t\right) =\left( -\log t\right) ^{\alpha }\).
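The construction of Definition 9.15 is straightforward to implement. The following Python sketch builds an Archimedean copula from a generator and its pseudo-inverse, and checks, for the Gumbel generator of Example 9.11, that it reproduces the Gumbel copula given earlier (parameter value arbitrary).

```python
import numpy as np

def archimedean(u1, u2, phi, phi_inv, phi0=np.inf):
    """C(u1,u2) = phi^{-1}(phi(u1) + phi(u2)), with the pseudo-inverse convention."""
    t = phi(u1) + phi(u2)
    return np.where(t <= phi0, phi_inv(np.minimum(t, phi0)), 0.0)

alpha = 2.5
phi = lambda t: (-np.log(t)) ** alpha          # Gumbel generator of Example 9.11
phi_inv = lambda s: np.exp(-s ** (1.0 / alpha))

u = np.linspace(0.05, 0.95, 19)
U1, U2 = np.meshgrid(u, u)
gumbel_direct = np.exp(-(((-np.log(U1)) ** alpha + (-np.log(U2)) ** alpha) ** (1.0 / alpha)))
print(np.max(np.abs(archimedean(U1, U2, phi, phi_inv) - gumbel_direct)))   # ~ 0
```

Here \(\varphi(0)=+\infty\), so the pseudo-inverse coincides with the ordinary inverse; for generators with \(\varphi(0)<\infty\) the argument `phi0` implements the truncation at 0.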

Of course, not all copulas are Archimedean, as the following example demonstrates.

Example 9.12 The copula of the upper Fréchet bound, denoted as \(C_W\), is not archimedean. Indeed, if there existed a generator \(\varphi\) such that the archimedean representation of Definition \(\ref{DefArchi}\) applied, we should have \[ C_\varphi(u,u)=u\text{ for all $0<u<1$,} \] which would imply \[ 2\varphi(u)=\varphi(u)\text{ for all $0<u<1$,} \] which is clearly impossible.

Remark. Archimedean copulas can easily generate copulas with two parameters. For this purpose, consider a family of one-parameter generators, denoted as \(\varphi_\theta\), and consider \[ \varphi_{\alpha,\theta}\left(t\right) = \varphi_\theta\left(t^\alpha\right) \text{ and } \varphi_{\theta,\beta}\left(t\right) = \left[\varphi_\theta\left(t\right)\right]^\beta. \] If \(\beta \geq 1\), then \(\varphi_{\theta,\beta}\) is a generator of an archimedean copula. If \(\alpha \in (0,1)\), then \(\varphi_{\alpha,\theta}\) is a generator of an archimedean copula (in fact, more generally, if \(\varphi_{\theta}\) is twice differentiable, and if \(t\varphi_{\theta}^{\prime}\left(t\right)\) is an increasing function on \((0,1)\), then \(\varphi_{\alpha,\theta}\) is a generator for any \(\alpha > 0\)). Moreover, if \(\varphi^{\prime}\left(1\right) \neq 0\), then, denoting \(C_{\alpha,\theta}\) and \(C_{\theta,\beta}\) as the archimedean copulas generated by \(\varphi_{\alpha,\theta}\) and \(\varphi_{\theta,\beta}\), respectively, we have \[ \lim_{\alpha \rightarrow 0}C_{\alpha,\theta}\left(u_1,u_2\right) = C_I\left(u_1,u_2\right) \] and \[ \lim_{\beta \rightarrow \infty}C_{\theta,\beta}\left(u_1,u_2\right) = C_W\left(u_1,u_2\right). \] One can also generate three-parameter archimedean copulas by considering generators of the form \[ \varphi_{\alpha,\beta,\theta}\left(t\right) = \left(\varphi_\theta\left(t^\alpha\right)\right)^\beta. \]

9.5.2 Frailty Models and Archimedean Copulas

Archimedean copulas are closely related to frailty models. Specifically, suppose that \(X_1\) and \(X_2\) are conditionally independent given a third unobservable variable \(Z\) (which represents, for example, a level of risk or the magnitude of a catastrophe). Suppose that \[ \Pr[X_1 \leq x_1 | Z=z] = \left\{B_1(x_1)\right\}^z \] and \[ \Pr[X_2 \leq x_2 | Z=z] = \left\{B_2(x_2)\right\}^z \] for distribution functions \(B_1\) and \(B_2\). The joint distribution function of the pair \(\boldsymbol{X}\) is then given by \[\begin{eqnarray*} F_{\boldsymbol{X}}(\boldsymbol{x}) &=& \mathbb{E}\left[\left\{B_1(x_1)\right\}^Z \left\{B_2(x_2)\right\}^Z\right]\\ &=& L_Z\left(-\ln B_1(x_1)-\ln B_2(x_2)\right), \quad \boldsymbol{x}\in{\mathbb{R}}^2, \end{eqnarray*}\] where \(L_Z\) denotes the Laplace transform of \(Z\). This distribution function satisfies the archimedean representation with \(\varphi^{-1}=L_Z\) and marginal distributions \[ F_i(x_i) = L_Z(-\ln B_i(x_i)), \quad i=1,2. \]

Example 9.13 For example, the Clayton family is obtained by considering variables \(Z\) with a Gamma distribution (i.e., \({L}_Z\left(t\right) =\left( 1+t\right) ^{-1/\alpha }\) where \(\alpha>0\)).
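This connection suggests a simple simulation recipe: draw \(Z\) from the Gamma distribution with the Laplace transform above, then draw \(X_1\) and \(X_2\) conditionally on \(Z\). The Python sketch below does this with the (arbitrary) choice \(B_i(x)=x\) on \([0,1]\) and checks that Kendall’s tau is close to \(\alpha/(\alpha+2)\), the value for the Clayton copula.

```python
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(seed=1)
alpha, n = 2.0, 20_000

# Gamma frailty with Laplace transform L_Z(t) = (1+t)^(-1/alpha):
z = rng.gamma(shape=1.0 / alpha, scale=1.0, size=n)

# Conditionally on Z = z, X_i has cdf {B_i(x)}^z; with B_i(x) = x on [0,1]
# (an arbitrary illustrative choice), inverse-transform sampling gives:
v1, v2 = rng.uniform(size=(2, n))
x1, x2 = v1 ** (1.0 / z), v2 ** (1.0 / z)

# The copula of (X1, X2) is Clayton, so Kendall's tau should be alpha/(alpha+2).
tau, _ = kendalltau(x1, x2)
print(tau, alpha / (alpha + 2.0))
```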

Example 9.14 Hougaard proposed in 1986 to take for \(Z\) the positive stable distribution with Laplace transform \(L_Z(t) = \exp(-t^\alpha)\), \(\alpha\in(0,1]\). Then we obtain \[ \overline{F}_{\boldsymbol{X}}(\boldsymbol{x})=\exp\left\{-\left(-\ln\overline{B}_1(x_1)-\ln\overline{B}_2(x_2)\right)^\alpha\right\}. \] In particular, by taking the functions \(\overline{B}_i\) of Weibull form, i.e., \[ \overline{B}_i(x)=\exp(-\alpha_ix^{\beta_i}), \quad x\in{\mathbb{R}}^+, \] we have \[ \overline{F}_{\boldsymbol{X}}(\boldsymbol{x})=\exp\left\{-\left(\alpha_1x_1^{\beta_1}+\alpha_2x_2^{\beta_2}\right)^\alpha\right\}. \] This choice ensures that both the marginal and conditional distributions are Weibull.

9.5.3 Survival Function

Archimedean copulas can also be obtained in a third way. Consider pairs $( X_1,X_2) $ such that their joint survival function satisfies \[ \overline{F}_{\boldsymbol{X}}(x_1,x_2) = \overline{G}(x_1+x_2) \] where \(\overline{G}\) is a convex survival function with \(\overline{G}(0)=1\). The marginal distributions of \(X_1\) and \(X_2\) must then be identical, i.e., \[ \overline{F}_1(x) = \overline{F}_2(x) = \overline{G}(x). \] In this case, the survival copula of the pair $( X_1,X_2) $ is \[\begin{eqnarray*} \overline{C}(u_1,u_2) &=& \overline{F}_{\boldsymbol{X}}\left(\overline{F}_1^{-1}(u_1),\overline{F}_2^{-1}(u_2)\right)\\ &=& \overline{G}\left(\overline{G}^{-1}(u_1)+\overline{G}^{-1}(u_2)\right), \end{eqnarray*}\]% which defines an archimedean copula with \(\varphi=\overline{G}^{-1}\).

9.5.4 Regression Function

The regression function \(x_1\mapsto\mathbb{E}[X_2|X_1=x_1]\) is one of the most appreciated tools to describe the dependence between \(X_1\) and \(X_2\). In the archimedean case, the conditional tail function of \(X_2\) is given by \[\begin{eqnarray*} \Pr[X_2>x_2|X_1=x_1] &=& 1-C_{2|1}\big(F_2(x_2)|F_1(x_1)\big)\\ &=& 1-\frac{\partial}{\partial u_1}C\big(F_1(x_1),F_2(x_2)\big)\\ &=& 1-\frac{\varphi^{(1)}(F_1(x_1))}{\varphi^{(1)}\big(C\big(F_1(x_1),F_2(x_2)\big)\big)}. \end{eqnarray*}\] We then obtain, for a positive risk \(X_2\), \[ \mathbb{E}[X_2|X_1=x_1]=\int_{x_2=0}^{+\infty}\left(1- \frac{\varphi^{(1)}(F_1(x_1))}{\varphi^{(1)}\big(C_\varphi(F_1(x_1),F_2(x_2))\big)}\right)dx_2. \]

::: {.example}[Frank Copula] In this case, we have \[ \frac{\partial}{\partial u_1}C(\boldsymbol{u})=\frac{\exp\big(-\alpha(u_1+u_2)\big)-\exp(-\alpha u_1)}{\exp\big(-\alpha(u_1+u_2)\big) -\exp(-\alpha u_1)-\exp(-\alpha u_2)+\exp(-\alpha)}. \] From which we can derive \[\begin{eqnarray*} \mathbb{E}[U_2|U_1=u_1] &=&\int_{u_2=0}^1\left\{1-\frac{\partial}{\partial u_1}C(u_1,u_2)\right\}du_2\\ &=&\frac{\exp(\alpha u_1)\Big(\big(\exp(\alpha)-1\big)u_1+1-\exp(\alpha u_1)\Big)} {\big(\exp(\alpha u_1)-1\big)\big(\exp(\alpha)-\exp(\alpha u_1)\big)}. \end{eqnarray*}\] :::
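This closed form can be cross-checked by numerical quadrature of \(\int_0^1\{1-\partial C/\partial u_1\}\,du_2\); a Python sketch (parameter and conditioning values arbitrary):

```python
import numpy as np
from scipy.integrate import quad

def dC_du1(u1, u2, a):
    """Partial derivative of the Frank copula with respect to u1."""
    num = np.exp(-a * (u1 + u2)) - np.exp(-a * u1)
    den = np.exp(-a * (u1 + u2)) - np.exp(-a * u1) - np.exp(-a * u2) + np.exp(-a)
    return num / den

def cond_mean_quad(u1, a):
    """E[U2 | U1 = u1] = int_0^1 (1 - dC/du1) du2, by numerical quadrature."""
    return quad(lambda u2: 1.0 - dC_du1(u1, u2, a), 0.0, 1.0)[0]

def cond_mean_formula(u1, a):
    e, e1 = np.exp(a), np.exp(a * u1)
    return e1 * ((e - 1.0) * u1 + 1.0 - e1) / ((e1 - 1.0) * (e - e1))

for a, u1 in [(1.0, 0.9), (4.0, 0.3), (-2.0, 0.7)]:
    print(cond_mean_quad(u1, a), cond_mean_formula(u1, a))   # should agree
```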

9.5.5 Bivariate Integral Transformation

Given a pair \(\boldsymbol{U}\) with joint distribution function \(C\), let \(K\) be the distribution function of the random variable \(C\left(U_1,U_2\right)\), i.e., \[ K\left(z\right) = {\Pr}[C\left(U_1,U_2\right) \leq z]. \] The following property is derived from (Genest and Rivest 1993).

The value \(K\left(t\right)\) has a geometric interpretation: the tangent to the curve \(y=\varphi\left(x\right)\) at the point \(\big(t,\varphi(t)\big)\) intersects the x-axis at the point \(K\left(t\right)\).

When the generator is a sufficiently regular function, it is possible to obtain the density of the copula in terms of this generator.

Proposition 9.13 Let \(C\) be an archimedean copula with generator \(\varphi\) that is twice differentiable. Then the copula \(C\) has the density \[\begin{equation*} c\left(u_1,u_2\right) = -\frac{\varphi^{\prime\prime}\left(C\left(u_1,u_2\right)\right)\varphi^{\prime}\left(u_1\right)\varphi^{\prime}\left(u_2\right)}{\left\{\varphi^{\prime}\left(C\left(u_1,u_2\right)\right)\right\}^3}, \text{ for }u_1,u_2\in\left[0,1\right]. \end{equation*}\]

Proof. Let’s start by taking the derivative of \(C\) with respect to \(u_1\), which yields \[\begin{equation} \label{ExpDens1} \varphi^{(1)}\left(C\left(\boldsymbol{u}\right)\right)\frac{\partial}{\partial u_1}C\left(\boldsymbol{u}\right)=\varphi^{(1)}\left(u_1\right). \end{equation}\] If we then take the derivative of this expression with respect to \(u_2\), we obtain \[ \varphi^{(2)}\left(C\left(\boldsymbol{u}\right)\right)\frac{\partial}{\partial u_2}C\left(\boldsymbol{u}\right) \frac{\partial}{\partial u_1}C\left(\boldsymbol{u}\right)+\varphi^{(1)}\left(C\left(\boldsymbol{u}\right)\right)\frac{\partial^2}{\partial u_1\partial u_2} C\left(\boldsymbol{u}\right)=0. \] By plugging in the expression obtained from \(\eqref{ExpDens1}\) for \(\frac{\partial}{\partial u_1}C\left(\boldsymbol{u}\right)\), and the analogous expression for \(\frac{\partial}{\partial u_2}C\left(\boldsymbol{u}\right)\), we get \[ c\left(\boldsymbol{u}\right)=\frac{\partial^2}{\partial u_1\partial u_2}C\left(\boldsymbol{u}\right)= -\frac{\varphi^{(2)}\left(C\left(\boldsymbol{u}\right)\right)\varphi^{(1)}\left(u_1\right)\varphi^{(1)}\left(u_2\right)} {\left\{\varphi^{(1)}\left(C\left(\boldsymbol{u}\right)\right)\right\}^3} \] which completes the proof.
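The formula of Proposition 9.13 can be checked numerically against the closed-form Clayton density given earlier, using the usual Clayton generator \(\varphi(t)=(t^{-\alpha}-1)/\alpha\) (a standard choice, not stated explicitly above):

```python
import numpy as np

alpha = 1.5

# Clayton generator derivatives for phi(t) = (t^-alpha - 1)/alpha.
phi1 = lambda t: -t ** (-alpha - 1.0)                    # phi'
phi2 = lambda t: (alpha + 1.0) * t ** (-alpha - 2.0)     # phi''
C = lambda u1, u2: (u1 ** (-alpha) + u2 ** (-alpha) - 1.0) ** (-1.0 / alpha)

def density_from_generator(u1, u2):
    c = C(u1, u2)
    return -phi2(c) * phi1(u1) * phi1(u2) / phi1(c) ** 3

def density_closed_form(u1, u2):
    # density of the Clayton copula given earlier in the text
    return (1.0 + alpha) / (u1 * u2) ** (alpha + 1.0) \
        * (u1 ** (-alpha) + u2 ** (-alpha) - 1.0) ** (-2.0 - 1.0 / alpha)

u = np.linspace(0.05, 0.95, 19)
U1, U2 = np.meshgrid(u, u)
print(np.max(np.abs(density_from_generator(U1, U2) - density_closed_form(U1, U2))))  # ~ 0
```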

We are now able to establish the expression of Kendall’s Tau associated with an archimedean copula.

Proof. Since \(C(\boldsymbol{u})=0\) for all \(\boldsymbol{u}\) such that \(\varphi(u_1)+\varphi(u_2)=\varphi(0)\), Proposition \(\ref{CopArchiDens}\) allows us to write \[ \tau_\varphi = 4\int\int_{\mathcal{D}}C(\boldsymbol{u}) \frac{\varphi^{(2)}(C(\boldsymbol{u}))\varphi^{(1)}(u_1)\varphi^{(1)}(u_2)} {\{\varphi^{(1)}(C(\boldsymbol{u}))\}^3}du_1du_2-1 \] where \[ \mathcal{D}=\big\{\boldsymbol{u}\in[0,1]^2\big|\varphi(u_1)+\varphi(u_2)<\varphi(0)\big\}. \] Now, let’s perform the change of variables \[ \left\{ \begin{array}{l} v_1=C(\boldsymbol{u})=\varphi^{-1}\big(\varphi(u_1)+\varphi(u_2)\big),\\ v_2=u_1, \end{array} \right. \] which ensures \(\boldsymbol{v}\in[0,1]^2\). For a fixed value of \(v_1\), it is easy to see that \(v_1\leq v_2\leq 1\). The Jacobian of this transformation is given by \[ \left\|\frac{\partial\boldsymbol{v}}{\partial\boldsymbol{u}}\right\| = \text{det}\left( \begin{array}{cc} \frac{\varphi '(u_1)}{\varphi '\big(C(\boldsymbol{u})\big)}&\frac{\varphi '(u_2)}{\varphi '\big(C(\boldsymbol{u})\big)}\\ 1&0 \end{array} \right) =-\frac{\varphi '(u_2)}{\varphi '\big(C(\boldsymbol{u})\big)}, \] and we easily obtain the announced result.
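The result being established here is, presumably, the classical Genest-MacKay expression \(\tau_\varphi=1+4\int_0^1\varphi(t)/\varphi^{(1)}(t)\,dt\). Assuming this form, the Python sketch below evaluates the integral for the Clayton generator (again \(\varphi(t)=(t^{-\alpha}-1)/\alpha\), an assumed standard choice) and recovers \(\alpha/(\alpha+2)\):

```python
import numpy as np
from scipy.integrate import quad

def tau_archimedean(phi, dphi):
    # Classical expression tau = 1 + 4 * int_0^1 phi(t)/phi'(t) dt (assumed form)
    return 1.0 + 4.0 * quad(lambda t: phi(t) / dphi(t), 0.0, 1.0)[0]

alpha = 3.0
phi = lambda t: (t ** (-alpha) - 1.0) / alpha
dphi = lambda t: -t ** (-alpha - 1.0)

print(tau_archimedean(phi, dphi), alpha / (alpha + 2.0))   # should agree
```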

9.5.6 Order Relations for Archimedean Copulas

We have seen that when marginal distribution functions are continuous, supermodular comparison relies exclusively on underlying copulas. In the case of archimedean copulas, we can reduce a supermodular comparison to conditions on the generators, as shown in the following results.

Proposition 9.14 Let \(C_{1}\) and \(C_{2}\) be two archimedean copulas with respective generators \(\varphi _{1}\) and \(\varphi _{2}\). Then \(C_{1}\preceq_{\text{sm}}C_{2}\) (i.e., \(C_{1}(\boldsymbol{u})\leq C_{2}(\boldsymbol{u})\) on \([0,1]^2\)) if, and only if, the function \(\varphi _{1}\circ \varphi _{2}^{-1}\) is subadditive, i.e., \[ \varphi _{1}\circ \varphi _{2}^{-1}\left( x+y\right) \leq \varphi _{1}\circ \varphi _{2}^{-1}\left( x\right) +\varphi _{1}\circ \varphi _{2}^{-1}\left( y\right)\text{ for all }x,y\geq 0 . \]

Proof. Let \(f=\varphi_1\circ\varphi_2^{-1}\). The function \(f\) thus defined is continuous, non-decreasing, and satisfies \(f(0)=0\). Clearly, \[ C_1(\boldsymbol{u})\leq C_2(\boldsymbol{u})\text{ for all }\boldsymbol{u}\in[0,1]^2 \] if, and only if, \[\begin{equation} \label{eqNels1} \varphi_1^{-1}\big(\varphi_1(u_1)+\varphi_1(u_2)\big)\leq \varphi_2^{-1}\big(\varphi_2(u_1)+\varphi_2(u_2)\big). \end{equation}\] Let \(x=\varphi_2(u_1)\) and \(y=\varphi_2(u_2)\). In this case, \(\eqref{eqNels1}\) is equivalent to \[\begin{equation} \label{eqNels2} \varphi_1^{-1}\big(f(x)+f(y)\big)\leq\varphi_2^{-1}(x+y) \end{equation}\] for all \(x,y\in[0,\varphi_2(0)]\). Furthermore, if \(x>\varphi_2(0)\) and \(y>\varphi_2(0)\), then each term in \(\eqref{eqNels2}\) vanishes.

Now suppose that \(C_1(\boldsymbol{u})\leq C_2(\boldsymbol{u})\) for all \(\boldsymbol{u}\in[0,1]^2\). Transform both sides of \(\eqref{eqNels2}\) by the generator \(\varphi_1\) and note that \(\varphi_1\circ\varphi_2^{-1}(u)\leq u\) for all \(u\geq 0\). This allows us to state that \(f(x+y)\leq f(x)+f(y)\) for all \(x,y\in{\mathbb{R}}^+\), so \(f\) is indeed subadditive. Conversely, if \(f\) is subadditive, then by applying \(\varphi_1^{-1}\) to each term of the inequality \(f(x+y)\leq f(x)+f(y)\) and noting that \(\varphi_1^{-1}\circ f=\varphi_2^{-1}\), we obtain \(\eqref{eqNels1}\), which concludes the proof.

However, verifying the subadditivity of \(f=\varphi_1\circ\varphi_2^{-1}\) may be as difficult as verifying directly that \(C_1\) and \(C_2\) are such that \(C_1(\boldsymbol{u})\leq C_2(\boldsymbol{u})\) for all \(\boldsymbol{u}\in[0,1]^2\). That’s why we present below several sufficient conditions for \(\varphi_1\circ\varphi_2^{-1}\) to be subadditive.

Example 9.15 As noted in Remark (\(\ref{archimedien alpha beta}\)), if \(\varphi\) is a generator, then \(\varphi _{\alpha ,1}\left( t\right) =\varphi\left(t^{\alpha }\right)\) and \(\varphi _{1,\beta}\left( t\right) =\left[ \varphi \left( t\right) \right] ^{\beta }\), for \(\alpha\in(0,1]\) and \(\beta \geq 1\), are also generators. Noting \(C_{\beta _{i}}\) as the archimedean copula with generator \(\varphi _{1,\beta _{i}}\), then if \(\beta _{1}\leq \beta _{2}\), \(C_{\beta _{1}}\preceq_{\text{sm}}C_{\beta _{2}}\). A similar relation exists for the archimedean copulas \(C_{\alpha _{i}}\) with generators \(\varphi_{\alpha_i,1}\).

9.5.7 Study of a Function of Two Correlated Risks

9.5.7.1 Problem Presentation

Given two possibly correlated risks, say \(X_1\) and \(X_2\), it is interesting to study the distribution of their sum, \(X_1+X_2\). Kolmogorov formulated this problem relatively early; however, it was not until (Makarov 1981) that an answer to this question was obtained.

Due to the technicality of the developments, most of the results in this section are stated without proof.

We introduced the convolution product \(\star\) in Section \(\ref{SecProdConv}\) to obtain the cumulative distribution function of the sum of independent risks. But what happens if the variables are no longer independent? In particular, one may wonder if the “worst” case corresponds to comonotonicity and the “best” case to antimonotonicity (here, “worst” and “best” are with respect to the \(\preceq_{\text{VaR}}\) relation). This result is generally false, as shown in the following example.

Example 9.16 Consider the risks \(X_1\sim\mathcal{E}xp(1)\) and \(X_2\sim\mathcal{E}xp(1)\), possibly correlated. The inequalities \[ \exp(-x)\leq\Pr[X_1+X_2> x]\leq\exp\big(-(x-2\ln 2)_+/2\big) \] hold for all \(x\in\mathbb{R}^+\). Figure \(\ref{Bounds}\) shows the bounds on the tail function of \(X_1+X_2\), as well as the values corresponding to independence (the tail function of the \(\mathcal{G}am(2,1)\) distribution) and comonotonicity (the tail function of \(2X_1\), i.e., \(\exp(-x/2)\)). The same figure shows the bounds on the Value-at-Risk (VaR).

It can be seen that the curves corresponding to independence and comonotonicity intersect only once, and the tail function or extreme VaR values can be considerably larger than those obtained based on comonotonicity.
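These bounds are easy to reproduce numerically. The following Python sketch evaluates them on a grid, together with the independent (\(\mathcal{G}am(2,1)\)) and comonotone (\(2X_1\)) tails, and checks that the latter two indeed lie between the bounds.

```python
import numpy as np
from scipy.stats import gamma

x = np.linspace(0.0, 10.0, 201)

lower = np.exp(-x)                                              # lower bound on Pr[X1+X2 > x]
upper = np.exp(-np.maximum(x - 2.0 * np.log(2.0), 0.0) / 2.0)   # upper bound
indep = gamma.sf(x, a=2, scale=1.0)                             # independence: Gam(2,1) tail
comon = np.exp(-x / 2.0)                                        # comonotonicity: tail of 2*X1

# The independent and comonotone tails stay between the two bounds.
assert np.all(lower <= indep + 1e-12) and np.all(indep <= upper + 1e-12)
assert np.all(lower <= comon + 1e-12) and np.all(comon <= upper + 1e-12)
print("bounds hold on the grid")
```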

More generally, it is interesting to study the bounds of the cumulative distribution function of \(\Psi(X_1,X_2)\) for functions \(\Psi:\mathbb{R}^{2}\rightarrow\mathbb{R}\) with certain properties. In the probabilistic literature, this problem is a problem of probabilistic arithmetic (a very good overview of which is given in (Robert Charles Williamson et al. 1989); also see (Robert C. Williamson and Downs 1990)): two continuous random variables, \(X_1\) and \(X_2\), with cumulative distribution functions belonging to \(\mathcal{F}(F_1,F_2)\) are considered, and one seeks to explain the behavior of \(\Psi(X_1,X_2)\). The cumulative distribution function of \(Z=\Psi(X_1,X_2)\) satisfies \[ F_Z(z) = \int\int_{\mathcal{D}_\Psi(z)} dC\left(F_1(x_1), F_2(x_2)\right) \] where \[ \mathcal{D}_\Psi(z) = \left\{(x_1, x_2)\in\mathbb{R}^2 \,|\, \Psi(x_1, x_2) < z\right\} \] and \(C\) is the copula associated with the pair \((X_1,X_2)\). In particular, the convolution product \(F_1\star F_2\) of \(F_1\) and \(F_2\) corresponds to the cumulative distribution function of \(X_1+X_2\) when \(X_1\) and \(X_2\) are independent. It can also be written in the form \[\begin{equation*} \left(F_1\star F_2\right)(z) = \int\int_{\mathcal{D}_+(z)} dC_I\left(F_1(x_1), F_2(x_2)\right), \end{equation*}\] where \[ \mathcal{D}_+(z) = \left\{(x_1, x_2)\in\mathbb{R}^2 \,|\, x_1 + x_2 < z\right\}. \] This convolution can then be generalized to other forms of dependence (changing the copula \(C_I\) to a more general copula \(C\)) and to other types of operations than addition. This leads to the following definition.

Definition 9.16 Given a copula \(C\) and a function \(\Psi\), the \(\sigma\)-convolution for \(\Psi\) is defined as \[ \sigma _{C,\Psi}\left(F_1,F_2\right)(z) = \int\int_{\mathcal{D}_\Psi(z)} dC\left(F_1(x_1), F_2(x_2)\right). \] The cumulative distribution function of \(Z=\Psi(X_1,X_2)\) is \(\sigma_{C,\Psi}\left(F_1,F_2\right)\).

By taking certain specific functions \(\Psi\), the following result is obtained.

(B. Schweizer and Sklar 1983) then introduced specific functions: the sup-convolution and inf-convolution.

Definition 9.17 Given a copula \(C\) and a function \(\Psi\), the sup-convolution is defined as \[\begin{equation} \tau_{C,\Psi}\left(F,G\right)(z) = \sup \left\{ C\left(F(x), G(y)\right) \,|\, \Psi(x,y) = z\right\} \label{borne sup Schweizer et Sklar (1981)} \end{equation}\] and the inf-convolution as \[\begin{equation} \rho_{C,\Psi}\left(F,G\right)(z) = \inf \left\{ C\left(F(x), G(y)\right) \,|\, \Psi(x,y) = z\right\} \label{borne inf Schweizer et Sklar (1981)} \end{equation}\]

(M. J. Frank, Nelsen, and Schweizer 1987) tackled Makarov’s problem using copula theory.

Proposition 9.15 Let \(X_1\) and \(X_2\) be two random variables with respective cumulative distribution functions \(F_1\) and \(F_2\), both continuous. Then, for any \(t\in\mathbb{R}\), the following inequalities hold: \[\begin{equation*} \tau_{C_M}(F_1,F_2)(t) \leq \Pr[X_1+X_2\leq t] \leq \rho_{C_M}(F_1,F_2)(t) \end{equation*}\] where \[\begin{equation*} \tau_{C}(F_1,F_2)(t) = \sup \left\{C\left(F_1(u),F_2(v)\right) \,|\, u+v=t\right\} \end{equation*}\] and \[\begin{equation*} \rho_{C}(F_1,F_2)(t) = \inf \left\{\widetilde{C}\left(F_1(u),F_2(v)\right) \,|\, u+v=t\right\}. \end{equation*}\]

Example 9.17 Let \({\cal SE}xp(\alpha , \theta )\) be the translated negative exponential distribution with cumulative distribution function \[ F(x) = 1 - \exp \left( -(x-\theta)/\alpha\right) \] for \(\alpha > 0\) and \(x \ge \theta\). If \(X_1\sim{\cal SE}xp(\alpha_1, \theta_1)\) and \(X_2\sim{\cal SE}xp(\alpha_2, \theta_2)\), a direct calculation shows that \(\tau _{C_M}\) corresponds to the cumulative distribution function of the \({\cal SE}xp \left( \alpha_1 + \alpha_2, {\tilde \theta} \right)\) distribution, where \[ {\tilde \theta} = \theta_1 + \theta_2 + (\alpha_1 + \alpha_2)\log (\alpha_1+\alpha_2) - \alpha_1 \log(\alpha_1) - \alpha_2 \log (\alpha_2) . \] Similarly, it can be shown that \(\rho _{C_M}\) corresponds to the cumulative distribution function of the \({\cal SE}xp ( \max (\alpha_1 , \alpha_2) ,\theta_1 + \theta_2 )\) distribution.
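The closed form for \(\tau_{C_M}\) can be checked by computing the sup-convolution directly on a grid; a Python sketch (with arbitrary parameter values):

```python
import numpy as np

def F_sexp(x, alpha, theta):
    """Cdf of the translated exponential SExp(alpha, theta)."""
    return np.where(x >= theta, 1.0 - np.exp(-np.maximum(x - theta, 0.0) / alpha), 0.0)

def tau_CM_grid(t, alpha1, theta1, alpha2, theta2, m=20_000):
    """sup over x1 + x2 = t of max(F1(x1) + F2(x2) - 1, 0), by grid search."""
    x1 = np.linspace(theta1, t - theta2, m)
    vals = F_sexp(x1, alpha1, theta1) + F_sexp(t - x1, alpha2, theta2) - 1.0
    return max(vals.max(), 0.0)

a1, th1, a2, th2 = 1.0, 0.5, 2.0, 1.0
th_tilde = th1 + th2 + (a1 + a2) * np.log(a1 + a2) - a1 * np.log(a1) - a2 * np.log(a2)

for t in (3.0, 5.0, 8.0):
    closed_form = F_sexp(t, a1 + a2, th_tilde)      # SExp(alpha1+alpha2, theta~) cdf
    print(t, tau_CM_grid(t, a1, th1, a2, th2), closed_form)   # should agree
```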

Example 9.18 Let \({\cal SP}ar (\alpha , \lambda , \theta)\) be the translated Pareto distribution with cumulative distribution function \[ F(x) = 1 - \left\{ \frac{\lambda}{\lambda + (x- \theta) } \right\}^{\alpha}, \quad x \ge \theta \] where \(\alpha, \lambda > 0\). If \(X_1\sim{\cal SP}ar (\alpha , \lambda_1,\theta_1)\) and \(X_2\sim{\cal SP}ar (\alpha , \lambda_2,\theta_2)\), it can be verified that \(\tau _{C_M}\) corresponds to the cumulative distribution function of the \({\cal SP}ar \left( \alpha, {\tilde \lambda}, \theta_1 + \theta_2 + {\tilde \lambda} - \lambda_1 - \lambda_2 \right)\) distribution, where \[ {\tilde \lambda} = \left( \lambda_1^{\beta} + \lambda_2^{\beta}\right)^{1/\beta}\text{ with }\beta = \alpha/(\alpha + 1). \] Similarly, \(\rho_{C_M}\) corresponds to the cumulative distribution function of the \({\cal SP}ar ( \alpha, \max (\lambda_1, \lambda_2 ),\theta_1 + \theta_2)\) distribution.

It is possible to refine these results by assuming that additional information is available.

Proposition 9.16 Suppose that two copulas \(C_{1}\) and \(C_{2}\) bounding \(C\) are known, i.e., such that \[ C_{1}\left( u_1,u_2\right) \leq C\left( u_1,u_2\right) \] and \[ \widetilde{C}\left(u_1,u_2\right) \leq \widetilde{C}_{2}\left( u_1,u_2\right) \] for all \(0\leq u_1,u_2\leq 1\). Then, \[\begin{equation*} \tau_{C_{1}}\left( F_1,F_2\right) \left( t\right) \leq \Pr[X_1+X_2\leq t] \leq \rho_{\widetilde{C}_{2}}\left( F_1,F_2\right) \left( t\right) \end{equation*}\]

In particular, if it is known that \(X_1\) and \(X_2\) are positively quadrant-dependent, then Proposition \(\ref{Embrechtsetal}\) applies with \(C_1=C_I\).

Example 9.19 (continuation of Example \(\ref{ExDGM1}\)) Suppose that \(X_i\sim {\cal SE}xp(\alpha_i, \theta_i)\), and denote \(f_i\) as the probability density function, \(i=1,2\). Suppose also that \(X_1\) and \(X_2\) are positively quadrant-dependent. In this case, it is not possible to improve the upper bound on \(\Pr[X_1+X_2\leq t]\). However, the lower bound \(\tau_{C_I}\) is implicitly defined for \(s \ge \theta_1 + \theta_2\) by \[ \tau _{C_I}(s) =F_1(t_s)F_2(s-t_s) \] where \(t_s \in [\theta_1, s-\theta_2]\) is the unique solution of the equation \[\begin{equation}\label{truc} \frac{f_1(t)}{F_1(t)} = \frac{f_2(s-t)}{F_2(s-t)}. \end{equation}\]

In the particular case where \(\alpha_1 = \alpha_2=\alpha\), we have \[ t_s = \theta_1+\frac{s-\theta_1-\theta_2}{2}, \] which allows us to explicitly obtain the lower bound, which is \[ \tau _{C_I}(s) = \left\{ 1 - \exp\left(-\frac{s-\theta_1-\theta_2}{2\alpha}\right) \right\}^2 \] for all \(s \ge \theta_1 + \theta_2\).

Example 9.20 (Continuation of Example \(\ref{ExDGM2}\)) Suppose that \(X_i\sim{\cal SP}ar(\alpha , \lambda_i, \theta_i)\), and denote \(f_i\) as the probability density functions, \(i=1,2\). If \(X_1\) and \(X_2\) are positively quadrant-dependent, it is not possible to improve the upper bound on the cumulative distribution function of \(X_1+X_2\). However, the lower bound becomes \[ \tau _{C_I}(s) = F_1(t_s) F_2(s-t_s)\text{ for }s \ge \theta_1 + \theta_2; \] with \(\theta_1 \le t_s \le s-\theta_2\) being the solution of the equation (\(\ref{truc}\)).

One can also be interested in bounds on the Value-at-Risk (VaR) of the sum \(X_1+X_2\). It can be shown that, for any probability level \(\alpha\), the following inequalities hold: \[\begin{equation*} \sup_{\widetilde{C}_M(u,v)=\alpha}\big\{ {\text{VaR}}[X_1;u]+{\text{VaR}}[X_2;v]\big\} \leq {\text{VaR}}[X_1+X_2;\alpha] \end{equation*}\] and \[\begin{equation*} {\text{VaR}}[X_1+X_2;\alpha] \leq \inf_{C_M(u,v)=\alpha}\big\{ {\text{VaR}}[X_1;u]+{\text{VaR}}[X_2;v]\big\}. \end{equation*}\]

The duality principle stated by (M. Frank and Schweizer 1979) says that if \(C_{0}\leq C\), \(\widetilde{C}\leq \widetilde{C}_{1}\), and if \(\Psi:[a,b]^{2}\rightarrow\mathbb{R}\) is a continuous increasing function (with \(-\infty\leq a<b\leq +\infty\)), then, for \(0\leq \alpha<1\), \[ {\text{VaR}}\big[\Psi(X_1,X_2);\alpha\big]\geq \sup_{\widetilde{C}_{1}\left( u,v\right)=\alpha} \Psi \Big( {\text{VaR}}[X_1;u],{\text{VaR}}[X_2;v]\Big), \] and \[ {\text{VaR}}\big[\Psi(X_1,X_2);\alpha\big]\leq \inf_{C_{0}\left( u,v\right)=\alpha} \Psi \Big( {\text{VaR}}[X_1;u],{\text{VaR}}[X_2;v]\Big). \]

9.6 Multivariate Discrete Distributions

9.6.1 Two Classes of Correlated Risks Model

9.6.1.1 Definition

Here we consider the case where the total loss amount \(S\) can be expressed as the sum of the losses of two classes of risks, \(S=S_{1}+S_{2}\), with \[\begin{equation} \label{DefTotalLoss} S_1=\sum_{j=1}^{N_1}X_{1,j}\text{ and }S_2=\sum_{j=1}^{N_2}X_{2,j}, \end{equation}\] where the random variables \(N_1\) and \(N_2\) represent the numbers of claims in classes 1 and 2. We then make the following assumptions:

  1. \(X_{1,1},X_{1,2},\ldots\) are independent positive random variables with the same cumulative distribution function \(F_{1}\);
  2. \(X_{2,1},X_{2,2},\ldots\) are independent positive random variables with the same cumulative distribution function \(F_{2}\);
  3. the random variable \(N_1\) is independent of the sequence \(X_{1,1},X_{1,2},\ldots\); likewise, \(N_2\) is independent of the sequence \(X_{2,1},X_{2,2},\ldots\).
  4. the sequences \(X_{1,1},X_{1,2},\ldots\) and \(X_{2,1},X_{2,2},\ldots\) are assumed to be mutually independent.

Example 9.21 Consider a portfolio consisting of individual auto insurance policies and multi-risk home insurance policies. In the event of a major flood, for example, both classes of risks can be affected, which leads to the rejection of the independence assumption between classes.

In this simple model, the dependence between \(S_1\) and \(S_2\) is induced by the correlation between \(N_1\) and \(N_2\), as shown in the following result.

Proposition 9.17 In the model \(\eqref{DefTotalLoss}\) based on the assumptions (1)-(4) stated above, \[\begin{equation*} \mathbb{C}[S_{1},S_{2}] =\mathbb{E}[X_{1,1}]\mathbb{E}[X_{2,1}]\mathbb{C}[N_{1},N_{2}] \end{equation*}\]

Proof. Since \[ \mathbb{E}[S_{1}S_{2}|N_{1},N_{2}]=N_{1}\mathbb{E}[X_{1,1}] \times N_{2}\mathbb{E}[X_{2,1}], \] we have \[ \mathbb{E}[S_{1}S_{2}] =\mathbb{E}[N_{1} N_{2}]\mathbb{E}[X_{1,1}]\mathbb{E}[X_{2,1}]. \] Subtracting \(\mathbb{E}[S_1]\mathbb{E}[S_2]=\mathbb{E}[N_1]\mathbb{E}[N_2]\mathbb{E}[X_{1,1}]\mathbb{E}[X_{2,1}]\) then gives \[ \mathbb{C}[S_1,S_2]=\big(\mathbb{E}[N_1N_2]-\mathbb{E}[N_1]\mathbb{E}[N_2]\big)\mathbb{E}[X_{1,1}]\mathbb{E}[X_{2,1}]=\mathbb{E}[X_{1,1}]\mathbb{E}[X_{2,1}]\mathbb{C}[N_1,N_2], \] which completes the proof.

To study this type of model, it is necessary to have distributions for the pair \((N_1,N_2)\). This is precisely the subject of the following sections.

9.6.2 Multivariate Bernoulli Distribution

Let \(N_1\sim\mathcal{B}er\left( \pi _1\right)\) and \(N_2\sim\mathcal{B}er\left( \pi _2\right)\). The joint distribution of the pair is described in the following table: \[\begin{equation*} \vline \begin{array}[t]{rccl} \hline N_1\backslash N_2\text{ }\vline & 0 & 1 & \vline\text{ } \\ \hline 0\text{ }\vline & p_{00} & p_{01} & \vline\text{ }\overline{\pi }_1=1-\pi_1 \\ 1\text{ }\vline & p_{10} & p_{11} & \vline\text{ }\pi _1 \\ \hline \text{ }\vline & \overline{\pi }_2=1-\pi _2& \pi _2 & \vline\text{ }1 \\ \hline \end{array} \vline \end{equation*}\]

The following result shows that the linear correlation coefficient between \(N_1\) and \(N_2\) cannot reach the bounds of -1 and +1.

Proposition 9.18 The correlation between \(N_1\) and \(N_2\) is bounded as follows, \[\begin{equation*} \max \left\{ -\sqrt{\frac{\pi _1\pi _2}{\overline{\pi }_1\overline{\pi }_2}},-\sqrt{\frac{\overline{\pi }_1\overline{\pi }_2}{\pi _1\pi _2 }}\right\} \leq r\left( N_1,N_2\right) \leq \sqrt{\frac{\pi _{\min }\left( 1-\pi _{\max }\right) }{\pi _{\max }\left( 1-\pi _{\min }\right) }}, \end{equation*}\]% where \[\begin{equation*} \left\{ \begin{array}{c} \pi _{\min }=\min \left\{ \pi _1,\pi _2\right\} \\ \pi _{\max }=\max \left\{ \pi _1,\pi _2\right\}% \end{array}% \right. \end{equation*}\]

The graphs in Figure \(\ref{FIG-BORNES-BERNOULLI}\) show how these bounds vary with \(\pi _1\) and \(\pi_2\), as well as their evolution as functions of \(\pi _1\) when \(\pi _2\) takes the values \(0.1\), \(0.4\), \(0.6\), and \(0.9\).

9.6.3 Common Shock Poisson Model: Bivariate Poisson Distribution

Given three independent Poisson random variables \(M_{1}\), \(M_{2}\), and \(L\), with respective parameters \(\lambda _{1}\), \(\lambda _{2}\), and \(\mu\), the bivariate Poisson distribution with parameters \(\left( \lambda _{1},\lambda _{2},\mu \right)\) is the distribution of the pair \(\left( N_{1},N_{2}\right)\), where \(N_{1}=M_{1}+L\) and \(N_{2}=M_{2}+L\).

Definition 9.18 The bivariate Poisson distribution with parameters \(\left( \lambda _{1},\lambda _{2},\mu \right)\) is defined by \[\begin{eqnarray*} &&{\Pr}[N_{1}=n_{1},N_{2}=n_{2}]\\ &=&\exp \left( -\lambda _{1}-\lambda _{2}-\mu \right) \sum_{j=0}^{\min \left( n_{1},n_{2}\right) }\frac{\mu ^{j}}{j!}\frac{\lambda _{1}^{n_{1}-j}}{\left( n_{1}-j\right) !}\frac{\lambda _{2}^{n_{2}-j}}{\left( n_{2}-j\right) !}, \end{eqnarray*}\] for \(n_{1},n_{2}\in {\mathbb{N}}\).

Since \(N_i=M_i+L\) is the sum of two independent Poisson random variables, the marginal distributions are again Poisson, so that \[\begin{equation*} \mathbb{E}[N_{i}] =\mathbb{V}[ N_{i}] =\lambda_{i}+\mu \text{ for }i=1,2. \end{equation*}\] We also know that \[ \mathbb{E}[N_{1}|N_{2}=n_{2}] =n_{2}\frac{\mu}{\lambda _{2}+\mu }+\lambda _{1}, \] and \[ \mathbb{V}[N_{1}|N_{2}=n_{2}] =n_{2}\frac{\mu \lambda _{2}}{\left( \lambda _{2}+\mu \right) ^{2}}+\lambda _{1}. \] Let us now study the correlation coefficient between \(N_1\) and \(N_2\).

Proposition 9.19 The correlation coefficient between \(N_{1}\) and \(N_{2}\) is given by \[\begin{equation*} r\left( N_{1},N_{2}\right) =\frac{\mu }{\sqrt{\left( \lambda _{1}+\mu \right) \left( \lambda _{2}+\mu \right) }}, \end{equation*}\] which satisfies \[\begin{equation*} 0\leq r\left( N_{1},N_{2}\right) \leq \min \left\{ \sqrt{\frac{\lambda _{1}+\mu }{\lambda _{2}+\mu }},\sqrt{\frac{\lambda _{2}+\mu }{\lambda _{1}+\mu }}\right\} . \end{equation*}\]

9.6.4 Common Shock Bernoulli Model: The Marceau Model

Consider a portfolio consisting of \(n\) policies where the claim costs follow the standard individual model, i.e., \[\begin{eqnarray*} X_{i}&=&\left\{ \begin{array}{cc} Y_{i} & \text{if\thinspace }I_{i}=1 \\ 0 & \text{if\thinspace }I_{i}=0 \end{array} \right. \\ &=&Y_{i}I_{i}\text{ where }I_{i}\thicksim \mathcal{B}er\left(q_{i}\right) \end{eqnarray*}\] for \(i=1,2,...,n\). Assume that the vectors \((Y_{1},...,Y_{n})\) and \((I_{1},...,I_{n})\) are independent of each other and that the \(Y_{i}\) are mutually independent. In contrast, the \(I_{i}\) are allowed to be correlated, with dependence modeled as follows: \[\begin{eqnarray*} I_{i}&=&\min \left\{ J_{i}+J_{0},1\right\}\\ &&\text{where }J_{i}\thicksim \mathcal{B}er\left( q_{i}^{\prime }\right) \text{ and }J_{0}\thicksim \mathcal{B}er\left( q_{0}\right) \\ && \text{are independent variables}. \end{eqnarray*}\] Writing \(p_{0}=1-q_{0}\) and \(p_{i}^{\prime }=1-q_{i}^{\prime }\), the probability generating function of \(I_{i}\) is then \[\begin{equation*} \mathbb{E}[s^{I_{i}}] =p_{0}\mathbb{E}[s^{J_{i}}] +q_{0}s=p_{0}p_{i}^{\prime }+\left( 1-p_{0}p_{i}^{\prime }\right) s, \end{equation*}\] meaning that \[ I_{i}\thicksim \mathcal{B}er\left( q_{i}\right)\text{ with }q_{i}=1-p_{0}p_{i}^{\prime }. \]

The variance of the cumulative cost \(S=X_{1}+...+X_{n}\) is then \[\begin{equation*} \mathbb{V}[S] =\sum_{i=1}^{n}\mathbb{V}[X_{i}]+2\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\mathbb{C}[X_{i},X_{j}] \end{equation*}\] where \[\begin{eqnarray*} \mathbb{C}[X_{i},X_{j}] &=&\mathbb{E}[ X_{i}X_{j}] -\mathbb{E}[X_{i}] \mathbb{E}[ X_{j}]\\ &=&\mathbb{E}[Y_{i}I_{i}Y_{j}I_{j}] -\mathbb{E}[ Y_{i}I_{i}] \mathbb{E}[ Y_{j}I_{j}] \\ &=&\mathbb{E}[ Y_{i}] \mathbb{E}[ Y_{j}] \mathbb{E}[ I_{i}I_{j}] -\mathbb{E}[ Y_{i}]\mathbb{E}[Y_{j}] \mathbb{E}[ I_{i}] \mathbb{E}[ I_{j}] \\ &=&\mathbb{E}[ Y_{i}] \mathbb{E}[ Y_{j}]\mathbb{C}[ I_{i},I_{j}] \end{eqnarray*}\] with \[\begin{eqnarray*} \mathbb{C}[ I_{i},I_{j}] &=&\mathbb{E}[ I_{i}I_{j}] -\mathbb{E}[ I_{i}] \mathbb{E}[ I_{j}] \\ &=&\mathbb{E}\big[ \mathbb{E}[ I_{i}I_{j}|J_{0}] \big] -q_{i}q_{j} \\ &=&q_{0}+p_{0}q_{i}^{\prime }q_{j}^{\prime }-q_{i}q_{j}\\ &=&\frac{q_{0}}{1-q_{0}}\left( 1-q_{i}\right) \left( 1-q_{j}\right), \end{eqnarray*}\] so that the correlation between \(I_{i}\) and \(I_{j}\) is positive and increases with \(q_{0}\). Writing \(p_{i}=1-q_{i}\), the variance of the sum \(S\) is then \[\begin{equation*} \mathbb{V}[ S] =\sum_{i=1}^{n}\mathbb{V}[ X_{i}] +2\frac{q_{0}}{1-q_{0}}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}p_{i}p_{j}\mathbb{E}[Y_{i}] \mathbb{E}[ Y_{j}]. \end{equation*}\]


9.7 Exercises


::: {.exercise}[Bivariate Gumbel Exponential Distribution] Consider the cumulative distribution function \[ F_{\boldsymbol{X}}\left(\boldsymbol{x}\right) =1-\exp \left( -x_1\right) -\exp \left(-x_2\right) +\exp \big( -x_1-x_2-\theta x_1x_2\big) \] where \(\theta \in \left[ 0,1\right]\). Show that

  1. the probability density function is given by \[\begin{equation*} f_{\boldsymbol{X}}\left( \boldsymbol{x}\right) =\exp \left( -x_1-x_2-\theta x_1x_2 \right) % \Big( \left( 1+\theta x_1\right) \left( 1+\theta x_2\right) -\theta\Big) . \end{equation*}\]%
  2. this distribution exhibits a lack of memory in the sense that \[ \mathbb{E}[X_1-x_1|X_1>x_1\text{ and }X_2>x_2] =\mathbb{E}[ X_1|X_2>x_2]. \]

:::

::: {.exercise}[Marshall & Olkin Bivariate Exponential Distribution]

Let \(Z_{1}\sim\mathcal{E}xp(\lambda_1)\), \(Z_{2}\sim\mathcal{E}xp(\lambda_2)\), and \(Z_{12}\sim\mathcal{E}xp(\lambda_{12})\) be three independent random variables, and define \[ X_1=\min \left\{ Z_{1},Z_{12}\right\}\text{ and }X_2=\min \left\{Z_{2},Z_{12}\right\}. \]

  1. Show that \[\begin{eqnarray*} \overline{F}_{\boldsymbol{X}}\left( \boldsymbol{x}\right) &=&\exp \big( -\lambda _{1}x_1-\lambda _{2}x_2-\lambda _{12}\max \left\{ x_1,x_2\right\} \big)\label{Marshall Olkin survie} \end{eqnarray*}\]%
  2. Show that \[ \overline{F}_1\left( x_1\right) =\exp \big( -\left( \lambda _{1}+\lambda _{12}\right) x_1\big) \] and \[ \overline{F}_2\left(x_2\right) =\exp \big( -\left( \lambda _{2}+\lambda _{12}\right)x_2\big). \]
  3. Show that this tail function satisfies another memoryless property, \[\begin{equation*} \overline{F}_{\boldsymbol{X}}\left( x_1+h,x_2+h\right) =\overline{F}_{\boldsymbol{X}}\left( x_1,x_2\right) \overline{F}_{\boldsymbol{X}}\left( h,h\right). \end{equation*}\]%
  4. By defining \[ \alpha =\frac{\lambda _{12}}{\lambda _{1}+\lambda _{12}}\text{ and }\beta =\frac{\lambda_{12}} { \lambda_{2}+\lambda _{12}}, \] show that the copula associated with this distribution can be written as \[\begin{eqnarray*} C\left( u_1,u_2\right) &=&\min \left\{ u_1^{1-\alpha }u_2,u_1u_2^{1-\beta}\right\}\\ &=&\left\{ \begin{array}{cc} u_1^{1-\alpha }u_2 & \text{ if }u_1^{\alpha }\geq u_2^{\beta } \\ u_1u_2^{1-\beta } & \text{ if }u_1^{\alpha }\leq u_2^{\beta } \end{array} \right. \end{eqnarray*}\] :::

::: {.exercise}[Basu & Block Bivariate Exponential Distribution]

Consider the following bivariate tail function: \[\begin{eqnarray*} \overline{F}_{\boldsymbol{X}}\left( \boldsymbol{x}\right) &=&\frac{\lambda _{1}+\lambda _{2}+\lambda _{12}}{\lambda _{1}+\lambda _{2}}\exp\big( -\lambda _{1}x_1-\lambda _{2}x_2-\lambda _{12}\max \left\{ x_1,x_2\right\}\big)\\ &&-\frac{\lambda _{12}}{% \lambda _{1}+\lambda _{2}}\exp \big( -\left( \lambda _{1}+\lambda _{2}+\lambda _{12}\right) \max \left\{ x_1,x_2\right\} \big). \end{eqnarray*}\]%

  1. Show that the tail function of \(X_1\) is given by \[\begin{eqnarray*} \overline{F}_1\left( x\right) &=&\frac{\lambda _{1}+\lambda _{2}+\lambda _{12}}{\lambda _{1}+\lambda _{2}}\exp \big( -\left( \lambda _{1}+\lambda_{12}\right) x\big)\\ &&-\frac{\lambda _{12}}{\lambda _{1}+\lambda _{2}}\exp \big( -\left( \lambda _{1}+\lambda _{2}+\lambda _{12}\right)x\big). \end{eqnarray*}\]
  2. Show that \(\min \left( X_1,X_2\right) \sim \mathcal{E}xp\left( \lambda _{1}+\lambda _{2}+\lambda _{12}\right) .\)

:::

::: {.exercise}[Cherian Distribution]

Consider three independent random variables \(Z_{i}\sim\mathcal{G}am\left( \theta _{i},1\right)\), \(i=1,2,3\), and define \[ X_1=Z_{1}+Z_{3}\text{ and }X_2=Z_{2}+Z_{3}. \] Show that the density of the pair \(\boldsymbol{X}\) is given by \[\begin{equation*} f_{\boldsymbol{X}}\left(\boldsymbol{x}\right) =\frac{\exp \left( -x_1-x_2 \right) }{ \Gamma \left( \theta _{1}\right) \Gamma \left( \theta _{2}\right) \Gamma \left( \theta _{3}\right) }\int_{0}^{\min \left\{x_1,x_2\right\} }\left( x_1-t\right) ^{\theta _{1}-1}\left( x_2-t\right) ^{\theta _{2}-1}t^{\theta _{3}-1}e^{t}\,dt. \end{equation*}\] :::

::: {.exercise}[Bivariate Beta Distribution]

Let \((X_1,X_2)\) be a pair of random variables with density \[\begin{equation*} f_{\boldsymbol{X}}\left( \boldsymbol{x}\right) =\frac{\Gamma \left( \theta _{1}+\theta _{2}+\theta _{3}\right) }{\Gamma \left( \theta _{1}\right) \Gamma \left( \theta _{2}\right) \Gamma \left( \theta _{3}\right) }x_1^{\theta _{1}-1}x_2^{\theta _{2}-1}(1-x_1-x_2) ^{\theta _{3}-1}, \end{equation*}\] for \(x_1,x_2\geq 0\) such that \(x_1+x_2\leq 1\).

  1. Show that \(X_1\sim\mathcal{B}et\left( \theta _{1},\theta _{2}+\theta _{3}\right)\) and \(X_2\sim\mathcal{B}et\left( \theta _{2},\theta _{1}+\theta _{3}\right)\).
  2. Show that conditionally on \(X_1=x_1\), \[ \frac{X_2}{1-x_1}\sim\mathcal{B}et\left( \theta _{2},\theta_{3}\right) . \]
  3. Consider three independent random variables \(Z_1\sim\mathcal{G}am\left( \theta _1,1\right)\), \(Z_2\sim\mathcal{G}am\left( \theta _2,1\right)\), and \(Z_3\sim\mathcal{G}am\left( \theta _3,1\right)\), and define \[ X_1=\frac{Z_{1}}{Z_{1}+Z_{2}+Z_{3}}\text{ and }X_{2}=\frac{Z_2}{Z_{1}+Z_{2}+Z_{3}}. \] Show that \(\boldsymbol{X}\) has the density given above. :::

::: {.exercise}[Bivariate Return Periods]

The return period of a risk \(X\) at level \(x^{\ast }\) (a “catastrophe” being defined as an event such that \(X>x^{\ast }\)) is the random variable \(T_{X}\left( x^{\ast }\right)\) measuring the time elapsed between two catastrophes. Let \(Z_{i}\) denote the time between two successive events, and \(N_{X}\left( x^{\ast }\right)\) the number of events between two catastrophes, so that \[ T_{X}\left( x^{\ast }\right) =Z_{1}+Z_{2}+...+Z_{N_{X}\left( x^{\ast }\right) } \] and \[ \mathbb{E}[T_{X}\left( x^{\ast }\right)] =\mathbb{E}[ N_{X}\left( x^{\ast }\right)]\mathbb{E}[ Z_1]. \]

  1. Show that \(N_{X}\left( x^{\ast }\right)\) follows a geometric distribution, i.e., \[\begin{equation*} \Pr[ N_{X}( x^{\ast })=n] =F_{X}\left( x^{\ast }\right) ^{n-1}\left[ 1-F_{X}\left( x^{\ast }\right) \right]. \end{equation*}\]
  2. Deduce that \[\begin{equation*} \mathbb{E}[ T_{X}\left( x^{\ast }\right)] =\frac{\mathbb{E}[ Z] }{1-F_{X}\left( x^{\ast }\right)}. \end{equation*}\]%
  3. In the case where both risks \(X\) and \(Y\) are based on the same events (i.e., the same \(Z_{i}\)), it is possible to consider the following two return periods: \(T_{XY}^{-}\left( x^{\ast },y^{\ast }\right)\) and \(T_{XY}^{+}\left( x^{\ast },y^{\ast }\right)\), corresponding respectively to the return periods of \(\{X>x^{\ast }\) or \(Y>y^{\ast }\}\), and of \(\{X>x^{\ast }\) and \(Y>y^{\ast }\}\). Show that \[\begin{equation*} \mathbb{E}[T_{XY}^{-}\left( x^{\ast },y^{\ast }\right)] =\frac{\mathbb{E}[ Z] }{1-F_{XY}\left( x^{\ast },y^{\ast}\right)}, \end{equation*}\] and \[\begin{equation*} \mathbb{E}[ T_{XY}^{+}\left( x^{\ast },y^{\ast }\right)] = \frac{\mathbb{E}[ Z] }{1-F_{X}\left( x^{\ast }\right) -F_{Y}\left( y^{\ast }\right) +F_{XY}\left( x^{\ast },y^{\ast }\right) }. \end{equation*}\]
  4. It is also possible to consider conditional return periods \(T_{X|Y}\left( x^{\ast }|y^{\ast }\right)\), for example, \(X>x^{\ast }\) given \(Y>y^{\ast }\). Show that \[\begin{eqnarray*} &&\mathbb{E}[ T_{X|Y}\left( x^{\ast }|y^{\ast }\right) ]\\ &=&\frac{\mathbb{E}[ Z] }{\big( 1-F_{Y}\left( y^{\ast }\right) \big) \big( 1-F_{X}\left( x^{\ast }\right) -F_{Y}\left( y^{\ast }\right) +F_{XY}\left( x^{\ast },y^{\ast }\right) \big) }. \end{eqnarray*}\]

:::

Exercise 9.1 Suppose that \(\boldsymbol{X}{\preceq_{\text{sm}}}\boldsymbol{Y}\). Show that in this case: \[\begin{eqnarray*} \mathbb{E}[X_2\big|X_1>x_1] &\leq&\mathbb{E}[Y_2\big|Y_1>x_1]. \end{eqnarray*}\]

Exercise 9.2 Let \(X_1\) and \(X_2\) be independent random variables. Prove that \(X_1\) and \(X_1+X_2\) are associated.

Exercise 9.3 Suppose \(I_1\sim\mathcal{B}er(q_1)\) and \(I_2\sim\mathcal{B}er(q_2)\). Show that the following conditions are equivalent:

  1. \(\mathbb{C}[I_1,I_2]\geq 0\);
  2. \(I_1\) and \(I_2\) are positively quadrant-dependent;
  3. \(I_1\) and \(I_2\) are associated;
  4. \(I_1\) and \(I_2\) are conditionally increasing.

Exercise 9.4 Consider the random variable: \[ Z=F_{\boldsymbol{X}}\left( X_1,X_2\right) -F_1\left(X_1\right) F_2\left( X_2\right). \] Show that: \[ \mathbb{E}[Z] =\frac{ 3\tau -\rho}{12}. \]

::: {.exercise}[Tail Decreasing]

(Esary and Proschan 1972) introduced the notions of left-tail decreasing (\(LTD\)) and right-tail increasing (\(RTI\)) as follows: \(Y\) is left-tail decreasing with respect to \(X\) (denoted \(LTD(Y|X)\)) if and only if, for all \(x,x^{\prime },y\): \[\begin{equation*} x<x^{\prime }\text{ \ }\Longrightarrow \text{\ \ }{\Pr}[Y\leq y|X\leq x^{\prime }] \leq {\Pr}[ Y\leq y|X\leq x] \end{equation*}\] and similarly, \(Y\) is right-tail increasing with respect to \(X\) (denoted \(RTI(Y|X)\)) if and only if, for all \(x,x^{\prime },y\): \[\begin{equation*} x<x^{\prime }\text{ \ }\Longrightarrow \text{\ \ }{\Pr}[Y\leq y|X>x^{\prime }] \leq {\Pr}[Y\leq y|X>x]. \end{equation*}\]

Show that:

  1. \(LTD(Y|X)\) if, and only if, for all \(0\leq v\leq 1\), \[ u\longmapsto \frac{C\left( u,v\right)}{u} \] is nonincreasing in \(u\).
  2. \(RTI(Y|X)\) if, and only if, for all \(0\leq v\leq 1\), \[ u\longmapsto \frac{ 1-u-v+C\left( u,v\right)}{1-u} \] is nondecreasing in \(u\), or equivalently, \[ u\longmapsto \frac{ v-C\left( u,v\right)}{ 1-u} \] is nonincreasing in \(u.\)
  3. \(LTD(Y|X)\) if, and only if, for all \(0\leq v\leq 1\), \[ \frac{\partial}{\partial u} C\left( u,v\right)\leq \frac{C\left( u,v\right)}{u} \] almost everywhere in \(u\).
  4. \(RTI(Y|X)\) if, and only if, for all \(0\leq v\leq 1\), \[ \frac{\partial}{\partial u} C\left( u,v\right)\geq \frac{ v-C\left( u,v\right)}{ 1-u} \] almost everywhere in \(u\).

:::

::: {.exercise}[Notions of Negative Dependence]

The concept of positive quadrant dependence can be reversed to define a notion of negative dependence. Specifically, \((X_1,X_2)\) is said to be negatively quadrant dependent if, for all \(x_1\) and \(x_2\): \[\begin{equation*} {\Pr}[X_1\leq x_1,X_2\leq x_2] \leq {\Pr}[X_1\leq x_1] {\Pr}[X_2\leq x_2]. \end{equation*}\] Show:

  1. Consider \(\boldsymbol{N}\sim\mathcal{M}ult(n,p_1,p_2)\). Show that \(N_1\) and \(N_2\) are negatively quadrant dependent.
  2. Show that the Frank copulas express negative quadrant dependence when \(\alpha\leq 0\).

:::

::: {.exercise}[Min-Max Copula]

Consider the pair \(\left( X_{1:n},X_{n:n}\right)\) where \[ X_{1:n}=\min\{X_1,\ldots,X_n\}\text{ and }X_{n:n}=\max\{X_1,\ldots,X_n\}, \] with \(X_1,\ldots,X_n\) being independent and identically distributed with cumulative distribution function \(F\). Show that:

  1. The cumulative distribution functions of \(X_{1:n}\) and \(X_{n:n}\) are given by: \[\begin{equation*} F_{1:n}\left( x\right) =1-(\overline{F}\left( x\right)) ^{n}\text{ and }% F_{n:n}\left( x\right) =( F\left( x\right)) ^{n},\hspace{2mm}x\in{\mathbb{R}}. \end{equation*}\]
  2. The cumulative distribution function of the pair \(\left( X_{1:n}, X_{n:n}\right)\) is given by \[\begin{equation*} F_{1,n}\left( x_1, x_2\right) =\left\{ \begin{array}{ll} \big(F\left( x_2\right)\big) ^{n} - \left( F\left( x_2\right) - F\left( x_1\right)\right) ^{n}, & \text{ if }x_1 < x_2, \\ \big(F\left( x_2\right)\big) ^{n}, & \text{ if }x_1 \geq x_2. \end{array} \right. \end{equation*}\]
  3. The copula of the pair \(\left(X_{1:n}, X_{n:n}\right)\), denoted \(C_{n}\), is of the form \[\begin{equation*} C_{n}\left( u_1, u_2\right) =\left\{ \begin{array}{l} u_2 - \left( u_2^{1/n} + \left( 1-u_1\right) ^{1/n}-1\right) ^{n} ,\\ \hspace{15mm}\text{ if } 1-\left( 1-u_1\right) ^{1/n} < u_2^{1/n}, \\ u_2, \text{ if }1-\left( 1-u_1\right) ^{1/n} \geq u_2^{1/n}. \end{array} \right. \end{equation*}\]
  4. If \(n\rightarrow\infty\), then \(C_{n}\rightarrow C_I\).
  5. The Kendall’s tau and Spearman’s rho associated with the min-max copula are given respectively by \[\begin{equation*} \tau =\frac{1}{2n-1}, \end{equation*}\] and \[\begin{equation*} \rho =3-12\left( \begin{array}{c} 2n \\ n \end{array} \right) ^{-1}\sum_{k=0}^{n}\frac{\left( -1\right) ^{k}}{2n-k}\left( \begin{array}{c} 2n \\ n+k \end{array} \right) +12\frac{\left( n!\right) ^{3}}{\left( 3n\right) !}\left( -1\right) ^{n} \end{equation*}\] (a numerical check of \(\tau\) is sketched after this exercise). :::

Exercise 9.5 Consider the normal copula given in \(\eqref{CopNorm}\). Show that

  1. \(\tau=\frac{2}{\pi}\arcsin(\alpha)\).
  2. When \(\alpha\leq\alpha '\), \[ C_\alpha(\boldsymbol{u})\leq C_{\alpha '}(\boldsymbol{u}),\text{ for all }\boldsymbol{u}\in[0,1]^2, \] so that \(C_\alpha\) increases with \(\alpha\) in the sense of \(\preceq_{\text{sm}}\).
  3. \(C_1=C_W\), \(C_0=C_I\), and \(C_{-1}=C_M\).
  4. Show that \(C_\alpha\) expresses positive quadrant dependence whenever \(\alpha\geq 0\).
  5. \(\rho =\frac{6}{\pi} \arcsin \left( \alpha/2\right) .\)

Exercise 9.6 In 1956, Morgenstern introduced the family \[ C_\alpha(u_1,u_2)=u_1u_2\big(1+\alpha(1-u_1)(1-u_2)\big),\hspace{2mm} \alpha\in[-1,1]. \] A small numerical check of the values below is sketched after this exercise. Show that

  1. The associated density is \[ c_\alpha(\boldsymbol{u})=1+\alpha(1-2u_1)(1-2u_2). \]
  2. The Kendall’s tau associated is \[ \tau=\frac{2}{9}\alpha\in\left[-\frac{2}{9},\frac{2}{9}\right]. \]
  3. The linear correlation coefficient for a pair of random variables with joint distribution function \(C_\alpha\) satisfies \(r\in [-1/3,1/3]\).
  4. For any pair \(\boldsymbol{X}\) with \(C_\alpha\) as its copula, \(r(X_1,X_2)\leq 1/3\).
  5. If \(X_1\) and \(X_2\) are exponentially distributed with rate parameter 1 and have \(C_\alpha\) as their copula, \(r(X_1,X_2)\in[-1/4,1/4]\).

Exercise 9.7 The family of copulas \[ C(u,v,\theta )=uv/\left[ 1-\theta (1-u)(1-v)\right] \] where \(-1\leq \theta\leq +1\) is known as the Ali-Mikhail-Haq copula. Show that

  1. The associated density is \[\begin{equation*} c(u,v,\theta )=\frac{1+\theta \left[ \left( u+1\right) \left( v+1\right) -3% \right] +\theta ^{2}\left[ \left( u-1\right) \left( v-1\right) \right] }{% \left[ 1-\theta (1-u)(1-v)\right] ^{3}} \end{equation*}\]
  2. Kendall’s tau is given by% \[\begin{eqnarray*} \tau &=&\frac{3\theta -2}{3\theta }-\frac{2\left( 1-\theta \right) ^{2}}{% 3\theta ^{2}}\log \left( 1-\theta \right) . \end{eqnarray*}\]
  3. Spearman’s rho is given by% \[\begin{eqnarray*} \rho &=&-\frac{12\left( 1+\theta \right) }{\theta ^{2}}\int_{0}^{\theta }\frac{% \log \left( 1-x\right) }{x}dx-\frac{3\left( 12+\theta \right) }{\theta }\\ &&-% \frac{24\left( 1-\theta \right) }{\theta ^{2}}\log \left( 1-\theta \right) . \end{eqnarray*}\]
  4. This copula is an Archimedean copula with generator  \[ \varphi \left( t\right) =\log \left( 1-\theta \left[ 1-t\right] \right) -\log t. \]

Exercise 9.8 (Joe 1993) introduced the family \[ C(u,v,\theta )=1-\left( [1-u]^{\theta }+[1-v]^{\theta }-[1-u]^{\theta }[1-v]^{\theta }\right) ^{1/\theta } \] where \(\theta \geq 1\). Show that

  1. This copula has the density \[\begin{eqnarray*} c(u,v,\theta )&=&\Big([1-u]^{\theta }+[1-v]^{\theta }-[1-u]^{\theta}[1-v]^{\theta }\Big)^{-2+1/\theta }\\ &&[1-u]^{\theta -1}[1-v]^{\theta-1}\\ &&\Big(\theta -1+[1-u]^{\theta }+[1-v]^{\theta }-[1-u]^{\theta }[1-v]^{\theta }\Big). \end{eqnarray*}\]
  2. This copula is Archimedean with generator \[ \varphi \left( t\right) =-\log \left[ 1-\left( 1-t\right) ^{\theta }\right]. \]

Exercise 9.9 Consider the family of copulas \[\begin{equation*} C(u,v,\theta )=\frac{uv}{\left[ 1+(1-u^{\theta })(1-v^{\theta })\right] ^{1/\theta }} \end{equation*}\] where \(0<\theta \leq 1\). Show that

  1. The associated density is \[\begin{equation*} c(u,v,\theta )=\frac{4-2\left( u^{\theta }+v^{\theta }\right) +\left( 1-\theta \right) u^{\theta }v^{\theta }}{\left( 2-u^{\theta }-v^{\theta }+u^{\theta }v^{\theta }\right) ^{2+1/\theta }}. \end{equation*}\]
  2. This copula is Archimedean with generator \[ \varphi \left( t\right) =\log \left( 2t^{-\theta }-1\right). \]

::: {.exercise}[Mardia Copula]

Show that the copulas associated with the bivariate distribution functions defined in Remark \(\ref{FrechetNonVide}\) are \[\begin{equation*} C\left( u,v\right) =\theta C_M\left( u,v\right) +\left( 1-\theta \right) C_W\left( u,v\right) \text{ for all }\theta \in \left[ 0,1\right] \text{.% } \end{equation*}\]% and \[\begin{eqnarray*} C\left( u,v\right) &=&\frac{\theta ^{2}\left( 1-\theta \right) }{2}C_M\left( u,v\right) +\left( 1-\theta ^{2}\right) C_I\left( u,v\right) \\ &&+\frac{% \theta ^{2}\left( 1+\theta \right) }{2}C_W\left( u,v\right), \end{eqnarray*}\] for \(\theta \in \left[ 0,1\right]\). :::

Exercise 9.10 Show that the Gumbel-Barnett copula, given by \[ C\left( u,v,\theta \right) =uv\exp \left( -\theta \log u\log v\right), \] is Archimedean with generator \(\varphi \left( t\right) =\log \left( 1-\theta \log t\right)\).

9.8 Bibliographical Notes

References on dependence modeling in statistics and probability are numerous, starting with (Lehmann 1966) and (Esary, Proschan, and Walkup 1967).

A comprehensive reference on the study of dependence is undoubtedly (Joe 1997). This highly detailed book covers all the concepts developed in this chapter, with a focus on temporal dependence through Markov chains, as well as longitudinal and categorical data. The properties of dependence measures originate from (Scarsini 1984). Other bibliographical references on the study of multiple risks include (Müller and Stoyan 2002) and (Shaked and Shanthikumar 2007); these essential reference works on risk comparison address both single random variables and random vectors. Several chapters of (Denuit et al. 2006) are also dedicated to dependence concepts, dependence relationships, and copulas. The connections between dependence concepts and stochastic orders were developed, for example, in (Dhaene and Goovaerts 1996).

Comonotonicity has seen significant development in actuarial science in recent years. See the two review articles by (Dhaene et al. 2002b) and (Dhaene et al. 2002a). Antimonotonicity is discussed in (Dhaene and Denuit 1999).

Readers interested in further studying the copulas introduced by (Sklar 1959) can refer to (Nelsen 1998). Among the pioneering contributions, one can also mention (Deheuvels 1979). Those interested in a historical approach to copulas (and in particular, the connection with \(t\)-norms in probabilistic metric spaces) can consult (Berthold Schweizer 1991). Also noteworthy are the article by (Frees and Valdez 1998), which introduced the concept of copula in actuarial science, and the comprehensive study by (Wang 1998).

Finally, on risk aggregation, note (Robert C. Williamson and Downs 1990), which deals mathematically with the sum of non-independent random variables or, more generally, transformations of multiple random variables. Another quality reference is (M. Frank 1991). Applications to actuarial science can be found in (Helene Cossette et al. 2001) and (Hélène Cossette, Denuit, and Marceau 2002).

For readers wishing to delve deeper into the mathematical problems associated with multiple risks, the reference work is (B. Schweizer and Sklar 1983).

References

Clayton, David G. 1978. “A Model for Association in Bivariate Life Tables and Its Application in Epidemiological Studies of Familial Tendency in Chronic Disease Incidence.” Biometrika 65 (1): 141–51.
Cossette, Helene, Michel Denuit, Jan Dhaene, and Etienne Marceau. 2001. “Stochastic Approximations of Present Value Functions.” Bulletin of the Swiss Association of Actuaries 1: 15–28.
Cossette, Hélène, Michel Denuit, and Etienne Marceau. 2002. “Distributional Bounds for Functions of Dependent Risks.” Schweiz. Aktuarver. Mitt 1 (45-65): 69.
Deheuvels, Paul. 1979. “Propriétés d’existence Et Propriétés Topologiques Des Fonctions de dépendance Avec Applications à La Convergence Des Types Pour Des Lois Multivariées.” Comptes Rendus de l’Académie Des Sciences de Paris, Série A-B 288 (2): A145–48.
Denuit, Michel, Jan Dhaene, Marc Goovaerts, and Rob Kaas. 2006. Actuarial Theory for Dependent Risks: Measures, Orders and Models. John Wiley & Sons.
Dhaene, Jan, and Michel Denuit. 1999. “The Safest Dependence Structure Among Risks.” Insurance: Mathematics and Economics 25 (1): 11–21.
Dhaene, Jan, Michel Denuit, Marc J Goovaerts, Rob Kaas, and David Vyncke. 2002a. “The Concept of Comonotonicity in Actuarial Science and Finance: Applications.” Insurance: Mathematics and Economics 31 (2): 133–61.
———. 2002b. “The Concept of Comonotonicity in Actuarial Science and Finance: Theory.” Insurance: Mathematics and Economics 31 (1): 3–33.
Dhaene, Jan, and Marc J Goovaerts. 1996. “Dependency of Risks and Stop-Loss Order1.” ASTIN Bulletin: The Journal of the IAA 26 (2): 201–12.
Esary, James D, and Frank Proschan. 1972. “Relationships Among Some Concepts of Bivariate Dependence.” The Annals of Mathematical Statistics, 651–55.
Esary, James D, Frank Proschan, and David W Walkup. 1967. “Association of Random Variables, with Applications.” The Annals of Mathematical Statistics 38 (5): 1466–74.
Frank, Maurice J. 1979. “On the Simultaneous Associativity of \(F (x, y)\) and \(x+ y- F (x, y)\).” Aequationes Mathematicae 19: 194–226.
Frank, Maurice J, Roger B Nelsen, and Berthold Schweizer. 1987. “Best-Possible Bounds for the Distribution of a Sum—a Problem of Kolmogorov.” Probability Theory and Related Fields 74 (2): 199–211.
Frank, MJ. 1991. “Convolutions for Dependent Random Variables.” In Advances in Probability Distributions with Given Marginals: Beyond the Copulas, 75–93. Springer.
Frank, MJ, and B Schweizer. 1979. “On the Duality of Generalized Infimal and Supremal Convolutions.” Rendiconti Di Matematica 12 (1): 1–23.
Frees, Edward W, and Emiliano A Valdez. 1998. “Understanding Relationships Using Copulas.” North American Actuarial Journal 2 (1): 1–25.
Genest, Christian, and Louis-Paul Rivest. 1993. “Statistical Inference Procedures for Bivariate Archimedean Copulas.” Journal of the American Statistical Association 88 (423): 1034–43.
Gumbel, Emil Julius. 1960. “Multivariate Extremal Distributions.” Annals of Mathematical Statistics 31 (4): 1216–16.
Hutchinson, TP, and Chin Diew Lai. 1991. The Engineering Statistician’s Guide to Continuous Bivariate Distributions. Rumsby Scientific Publisher.
Joe, Harry. 1993. “Parametric Families of Multivariate Distributions with Given Margins.” Journal of Multivariate Analysis 46 (2): 262–82.
———. 1997. Multivariate Models and Multivariate Dependence Concepts. CRC press.
Kimeldorf, George, and Allan Sampson. 1975. “Uniform Representations of Bivariate Distributions.” Communications in Statistics–Theory and Methods 4 (7): 617–27.
Kimeldorf, George, and Allan R Sampson. 1978. “Monotone Dependence.” The Annals of Statistics, 895–903.
Kruskal, William H. 1958. “Ordinal Measures of Association.” Journal of the American Statistical Association 53 (284): 814–61.
Lehmann, Erich Leo. 1966. “Some Concepts of Dependence.” The Annals of Mathematical Statistics 37 (5): 1137–53.
Makarov, GD. 1981. “Inequalities for a Distribution Function of a Sum of Two Random Variables When Marginals Are Fixed.” Teoriya Veroyatnostei i Ee Primeneniya 26 (4): 815–17.
Müller, Alfred, and Dietrich Stoyan. 2002. Comparison Methods for Stochastic Models and Risks. Wiley.
Nelsen, Roger B. 1998. An Introduction to Copulas. Springer.
Scarsini, Marco. 1984. “On Measures of Concordance.” Stochastica 8 (3): 201–18.
Schweizer, Berthold. 1991. “Thirty Years of Copulas.” In Advances in Probability Distributions with Given Marginals: Beyond the Copulas, 13–50. Springer.
Schweizer, B., and A. Sklar. 1983. Probabilistic Metric Spaces. North-Holland.
Shaked, Moshe, and J George Shanthikumar. 2007. Stochastic Orders. Springer.
Sklar, M. 1959. “Fonctions de répartition à \(n\) Dimensions Et Leurs Marges.” Annales de l’Institut de Statistique de l’Université de Paris 8 (3): 229–31.
Wang, Shaun. 1998. “Aggregation of Correlated Risk Portfolios: Models and Algorithms.” In Proceedings of the Casualty Actuarial Society, 85:848–939. 163. Citeseer.
Williamson, Robert C, and Tom Downs. 1990. “Probabilistic Arithmetic. I. Numerical Methods for Calculating Convolutions and Dependency Bounds.” International Journal of Approximate Reasoning 4 (2): 89–158.
Williamson, Robert Charles et al. 1989. “Probabilistic Arithmetic.” PhD thesis, University of Queensland Brisbane.