
In this paper, we develop statistical inference for an important health inequality index for ordinal data proposed by Lv, Wang and Xu [1]. We establish the asymptotic distributions of the indices, which allows us to make inference on them. Generalizations of the indices to the multiple-population setting are also studied. We demonstrate the effectiveness of our procedure using health inequality data from several areas in Switzerland; our results classify these areas into three classes based on their health inequalities.

The qualitative nature of SRHS data prevents the straightforward use of conventionally developed indices for measuring income inequality. A reasonable index for SRHS data should be invariant to rescalings of variables which preserve the order of categories.

Assessment on health inequality for ordered data has received attention in the last ten years, [

Recently, [

The remainder of the paper is organized as follows. In Section 2, we review the indices developed by [

According to [

Let $d = (d_1, \dots, d_m)^T$ and $f = (f_1, \dots, f_m)^T$ be the empirical and population frequencies of the health categories, respectively, where $d_i = n^{-1} \sum_{l=1}^{n} I(V_l = h_i)$ is the relative frequency of individuals whose health status equals $h_i$. Further, let

$$\Omega_0(f) = \operatorname{diag}(f) - f f^T, \tag{1}$$

and let $G$ be an $m \times m$ matrix whose $(i, j)$th entry is $2 g(|i - j|)$, where $g(\cdot)$ is a function on the nonnegative integers such that $0 = g(0) < g(1) < g(2) < \dots < g(m - 1)$. Lv, Wang and Xu [1] propose the index

$$I(f) = \sum_{i=1}^{m} \sum_{j \ne i} g(|i - j|) f_i f_j. \tag{2}$$

Two typical choices for $g(\cdot)$ are the following:

$$g(i) = \frac{2i}{m - 1}, \quad i = 1, \dots, m - 1; \qquad g(i) = 2\alpha^{m - 1 - i}, \quad i = 1, \dots, m - 1, \quad 0 < \alpha < 1. \tag{3}$$

The index is naturally estimated by the empirical plug-in estimator $I(d)$.
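The plug-in estimator can be sketched in a few lines of Python. This is only an illustration: the function names, the default `alpha = 0.5`, and the counts are hypothetical, and the two weight functions assume that (3) reads $g(i) = 2i/(m-1)$ and $g(i) = 2\alpha^{m-1-i}$.

```python
from typing import Callable, Sequence


def g_linear(i: int, m: int) -> float:
    """First choice in (3), read as g(i) = 2i / (m - 1); note g(0) = 0."""
    return 2.0 * i / (m - 1)


def g_power(i: int, m: int, alpha: float = 0.5) -> float:
    """Second choice in (3), read as g(i) = 2 * alpha**(m - 1 - i), with g(0) = 0.

    The default alpha = 0.5 is an arbitrary illustrative value.
    """
    return 0.0 if i == 0 else 2.0 * alpha ** (m - 1 - i)


def inequality_index(freq: Sequence[float], g: Callable[[int, int], float]) -> float:
    """I(f) = sum_i sum_{j != i} g(|i - j|) f_i f_j, as in (2)."""
    m = len(freq)
    return sum(
        g(abs(i - j), m) * freq[i] * freq[j]
        for i in range(m)
        for j in range(m)
        if i != j
    )


# Hypothetical counts for a five-category health scale (not survey data).
counts = [20, 60, 250, 450, 220]
n = sum(counts)
d = [c / n for c in counts]            # empirical frequencies d
print(inequality_index(d, g_linear))   # plug-in estimate I(d), linear weights
print(inequality_index(d, g_power))    # plug-in estimate I(d), power weights
```

Under this reading of (3), both choices give $I(f) = 1$ when the population is split evenly between the two extreme categories and $I(f) = 0$ when everyone falls in a single category.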

Based on the above index, we establish the following asymptotic distribution.

Theorem 1. Using the delta method, we can establish that

$$\sqrt{n}\,(I(d) - I(f)) \Rightarrow N(0, \sigma^2),$$

where $\sigma^2 = f^T G^T \Omega_0(f) G f$.

In practice, $\sigma^2$ is unknown and must be estimated. Since $d$ is a consistent estimator of $f$, the asymptotic variance can be estimated by $\hat\sigma^2 = d^T G^T \Omega_0(d) G d$. Based on the asymptotic result, a two-sided symmetric $100(1 - \alpha)\%$ asymptotic confidence interval for the health inequality index $I(f)$ is

$$\left( I(d) - z_{1-\alpha/2} \sqrt{\hat\sigma^2 / n},\; I(d) + z_{1-\alpha/2} \sqrt{\hat\sigma^2 / n} \right),$$

where $z_{1-\alpha/2}$ is the $1 - \alpha/2$ quantile of the standard normal distribution.
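As a rough sketch, the variance estimate and the confidence interval can be computed as follows. The function names and the counts are hypothetical, and the linear weight function assumes the first choice of $g$ is $g(i) = 2i/(m-1)$. The code uses the identity $J^T \Omega_0(d) J = \sum_i d_i J_i^2 - (\sum_i d_i J_i)^2$ with $J = G d$, which is the same quadratic form written without building the matrices.

```python
import math
from statistics import NormalDist


def g_linear(k: int, m: int) -> float:
    """Assumed first choice of g: g(k) = 2k / (m - 1)."""
    return 2.0 * k / (m - 1)


def index_and_variance(freq, g):
    """Return I(freq) and sigma^2 = J' Omega_0(freq) J, where J = G freq."""
    m = len(freq)
    # J_i = 2 * sum_{l != i} g(|l - i|) * freq_l, i.e. the i-th entry of G freq.
    J = [2.0 * sum(g(abs(l - i), m) * freq[l] for l in range(m) if l != i)
         for i in range(m)]
    mean_J = sum(freq[i] * J[i] for i in range(m))
    index = 0.5 * mean_J  # I(f) = f' G f / 2
    # J' (diag(f) - f f') J is the variance of J under the distribution f.
    var = sum(freq[i] * J[i] ** 2 for i in range(m)) - mean_J ** 2
    return index, max(var, 0.0)  # guard against tiny negative rounding error


def confidence_interval(counts, g, level=0.95):
    """Asymptotic two-sided interval I(d) -/+ z * sqrt(sigma_hat^2 / n)."""
    n = sum(counts)
    d = [c / n for c in counts]
    estimate, var = index_and_variance(d, g)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * math.sqrt(var / n)
    return estimate - half_width, estimate, estimate + half_width


# Hypothetical counts for a five-category scale (not survey data).
lower, estimate, upper = confidence_interval([20, 60, 250, 450, 220], g_linear)
print(lower, estimate, upper)
```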

We first consider two populations with $f_i = (f_{i1}, \dots, f_{im})^T$, $d_i = (d_{i1}, \dots, d_{im})^T$ and $V_i = (V_{i1}, \dots, V_{in_i})^T$, $i = 1, 2$. Our analysis covers both mutually dependent and independent samples. The former are relevant when examining the evolution of health inequality in a single group (e.g., changes in health inequality over time), while the latter are relevant when comparing health inequality between two groups (e.g., across countries). Sampling is performed independently within each group.

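Lemma 1. Using the delta method, we have

$$\sqrt{n}\,(I(d) - I(f)) = \sqrt{n}\, f^T G^T (d - f) + o_p(1) = \frac{1}{\sqrt{n}} \sum_{l=1}^{n} f^T G^T \left( I(V_l = h) - f \right) + o_p(1),$$

where $I(V_l = h) = (I(V_l = h_1), \dots, I(V_l = h_m))^T$ is the vector of category indicators for the $l$th individual.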

Theorem 2. Let $\sigma_{ij}^2$ be the $(i, j)$th entry of the covariance matrix of the two populations. Denote $\tilde n = \max\{n_1, n_2\}$ and $\lambda_i = \lim_{\tilde n \to \infty} (n_i / \tilde n)$, $0 < \lambda_i \le 1$, $i = 1, 2$. The asymptotic distribution of $I(d_1) - I(d_2)$ is

$$\sqrt{\tilde n}\, \left[ (I(d_1) - I(d_2)) - (I(f_1) - I(f_2)) \right] \Rightarrow N\left( 0,\; \sigma_{11}^2 / \lambda_1 + \sigma_{22}^2 / \lambda_2 - 2 \sigma_{12}^2 / \sqrt{\lambda_1 \lambda_2} \right).$$

Now we consider the hypothesis testing problem

$$H_{0,1}: I(f_1) = I(f_2) \quad \text{vs.} \quad H_{a,1}: I(f_1) \ne I(f_2).$$

We introduce the following Wald statistic:

$$T_n = \frac{\tilde n\, (I(d_1) - I(d_2))^2}{\sigma_{11}^2 / \lambda_1 + \sigma_{22}^2 / \lambda_2 - 2 \sigma_{12}^2 / \sqrt{\lambda_1 \lambda_2}}. \tag{4}$$

Then, under the null hypothesis $H_{0,1}$, $T_n \Rightarrow \chi_1^2$ as $\tilde n \to \infty$. The corresponding p-value can be computed by the following formula:

$$p = \Pr(T_n \ge t_{obs}) = 1 - F_{\chi_1^2}(t_{obs}), \tag{5}$$

where $F_{\chi_1^2}(\cdot)$ is the cumulative distribution function of the chi-squared distribution with one degree of freedom.
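The test can be sketched for the simplest case of two independent samples. Three assumptions are made here that go beyond the text: the cross term is set to zero ($\sigma_{12}^2 = 0$), each $\lambda_i$ is replaced by its finite-sample analogue $n_i / \tilde n$, and the unknown variances are replaced by plug-in estimates. The statistic in (4) then reduces to $(I(d_1) - I(d_2))^2 / (\hat\sigma_{11}^2 / n_1 + \hat\sigma_{22}^2 / n_2)$. Function names and counts are hypothetical, and the weight function assumes $g(i) = 2i/(m-1)$.

```python
import math


def g_linear(k: int, m: int) -> float:
    """Assumed first choice of g: g(k) = 2k / (m - 1)."""
    return 2.0 * k / (m - 1)


def index_and_variance(freq, g):
    """Return I(freq) and sigma^2 = J' Omega_0(freq) J, where J = G freq."""
    m = len(freq)
    J = [2.0 * sum(g(abs(l - i), m) * freq[l] for l in range(m) if l != i)
         for i in range(m)]
    mean_J = sum(freq[i] * J[i] for i in range(m))
    var = sum(freq[i] * J[i] ** 2 for i in range(m)) - mean_J ** 2
    return 0.5 * mean_J, max(var, 0.0)


def two_sample_test(counts1, counts2, g):
    """Wald test of I(f_1) = I(f_2) for two independent samples."""
    n1, n2 = sum(counts1), sum(counts2)
    i1, v1 = index_and_variance([c / n1 for c in counts1], g)
    i2, v2 = index_and_variance([c / n2 for c in counts2], g)
    # Finite-sample form of (4) with sigma_12^2 = 0 and lambda_i = n_i / n_tilde.
    stat = (i1 - i2) ** 2 / (v1 / n1 + v2 / n2)
    # For one degree of freedom, P(chi2_1 >= t) = erfc(sqrt(t / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts for two groups on a five-category scale.
group_a = [20, 60, 250, 450, 220]
group_b = [100, 150, 250, 300, 200]
print(two_sample_test(group_a, group_b, g_linear))
```

For dependent samples the covariance term $\sigma_{12}^2$ would also have to be estimated, which this sketch does not attempt.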

These results are general: neither independence of the two populations nor equal sample sizes is assumed, so the test also covers unbalanced designs. If the two populations are independent, then $\operatorname{Cov}\{ I(V_{1l} = h), I^T(V_{2l} = h) \} = 0$ and thus $\sigma_{12}^2 = 0$. In the particular case of equal sample sizes, $n_1 = n_2 = n$, we have $\lambda_1 = \lambda_2 = 1$, and the asymptotic distribution in Theorem 2 reduces to

$$\sqrt{n}\, \left[ (I(d_1) - I(d_2)) - (I(f_1) - I(f_2)) \right] \Rightarrow N\left( 0,\; \sigma_{11}^2 + \sigma_{22}^2 - 2 \sigma_{12}^2 \right). \tag{6}$$

We propose statistical inference procedures to test the equality of health inequality indices across samples. This question often arises when checking whether health inequalities are similar across a whole country or within a specified region. For example, China consists of many administrative regions, such as Eastern China, North China, and Central Region, each of which contains several provinces. Provinces in the same region have similar economic and/or social characteristics, so they are often assumed to have the same health inequality. We also examine whether the health inequality index of a province equals the average index of the entire region. These two testing problems lead to a further application. If the preceding analysis reveals that the provinces within each region have equal indices, then we can check whether the common means of two regions are also the same. Accordingly, we cluster the regions based on the test results: if several regions have the same health inequality, we can view them as one cluster.

Suppose there are $r$ ($r \ge 3$) populations with $f_i = (f_{i1}, \dots, f_{im})^T$, $d_i = (d_{i1}, \dots, d_{im})^T$ and $V_i = (V_{i1}, \dots, V_{in_i})^T$, $i = 1, \dots, r$. For dependent samples, we can obtain results similar to those presented in Section 2. However, the covariance structure becomes too complex to be practical as the number of samples grows, so for simplicity we consider only independent samples. A global test can be constructed as

$$H_{0,2}: I(f_1) = \dots = I(f_r) \quad \text{vs.} \quad H_{a,2}: I(f_i) \ne I(f_j) \text{ for some } i \ne j \ (i, j = 1, 2, \dots, r). \tag{7}$$

Define the matrix $R = [I_{(r-1) \times (r-1)}, -L_{(r-1) \times 1}]$, where $I_{(r-1) \times (r-1)}$ is the identity matrix of dimension $r - 1$ and $L_{(r-1) \times 1}$ is an $(r - 1)$-dimensional vector with all elements equal to 1. Then the hypothesis in (7) can be rewritten as follows:

$$H_{0,3}: R\, I(f) = 0 \quad \text{vs.} \quad H_{a,3}: R\, I(f) \ne 0,$$

where $I(f) = (I(f_1), \dots, I(f_r))^T$ and, analogously, $I(d) = (I(d_1), \dots, I(d_r))^T$.

Define $\tilde n = \max\{n_i, i = 1, \dots, r\}$ and $\lambda_i = \lim_{\tilde n \to \infty} (n_i / \tilde n)$, $0 < \lambda_i \le 1$, $i = 1, \dots, r$. Given the independence of the $r$ groups of samples, we obtain

$$\sqrt{\tilde n}\, (I(d) - I(f)) \Rightarrow N(0, \Sigma), \tag{8}$$

where $\Sigma = \operatorname{diag}(\sigma_{11}^2 / \lambda_1, \dots, \sigma_{rr}^2 / \lambda_r)$. Therefore,

$$\sqrt{\tilde n}\, (R\, I(d) - R\, I(f)) \Rightarrow N(0, R \Sigma R^T)$$

and

$$\tilde n\, (R\, I(d) - R\, I(f))^T (R \Sigma R^T)^{-1} (R\, I(d) - R\, I(f)) \Rightarrow \chi_{r-1}^2. \tag{9}$$

Note that under the null hypothesis, $R\, I(f) = 0$ in (9). Consequently, a Wald-type test statistic can be defined as

$$T_r = \tilde n\, I^T(d) R^T (R \hat\Sigma R^T)^{-1} R\, I(d), \tag{10}$$

where $\hat\Sigma = \operatorname{diag}(\tilde n \hat\sigma_{11}^2 / n_1, \dots, \tilde n \hat\sigma_{rr}^2 / n_r)$ is an estimator of $\Sigma$. Given the central role of the test statistic $T_r$, we state its asymptotic behavior under the null hypothesis in the following theorem.

Theorem 3. Let $\tilde n = \max\{n_i, i = 1, \dots, r\}$ and $\lambda_i = \lim_{\tilde n \to \infty} (n_i / \tilde n)$, $0 < \lambda_i \le 1$, $i = 1, \dots, r$. Then, under the null hypothesis $H_{0,2}: I(f_1) = \dots = I(f_r)$ in (7), we have $T_r \Rightarrow \chi_{r-1}^2$.

The corresponding p-value can be computed by

$$p = \Pr(T_r \ge t_{r,obs}) = 1 - F_{\chi_{r-1}^2}(t_{r,obs}), \tag{11}$$

where $F_{\chi_{r-1}^2}(\cdot)$ is the cumulative distribution function of the chi-squared distribution with $r - 1$ degrees of freedom. The equality hypothesis (7) can be regarded as a generalization of the two-sample comparison. The usefulness of this test is illustrated in our empirical application.
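A minimal sketch of the global test for independent samples follows. It takes as inputs the estimated indices $I(d_i)$ and their estimated variances $v_i = \hat\sigma_{ii}^2 / n_i$. Because $\hat\Sigma = \tilde n \operatorname{diag}(v_1, \dots, v_r)$, the factor $\tilde n$ cancels in (10) and $T_r = x^T (R V R^T)^{-1} x$ with $x = R\, I(d)$ and $V = \operatorname{diag}(v_i)$. The function names and the numbers are hypothetical, and the chi-squared tail probability is computed with a standard recurrence for integer degrees of freedom rather than a statistics library.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """P(chi2_df >= x) for integer df, via Q(a + 1, h) = Q(a, h) + h^a e^-h / Gamma(a + 1)."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)                 # Q(1, h) = exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))      # Q(1/2, h) = erfc(sqrt(h))
    while a < df / 2.0 - 1e-9:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q


def solve(a, b):
    """Solve the linear system a y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(col + 1, n):
            factor = aug[k][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[k][c] -= factor * aug[col][c]
    y = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][c] * y[c] for c in range(i + 1, n))
        y[i] = (aug[i][n] - tail) / aug[i][i]
    return y


def global_wald_test(estimates, variances):
    """T_r in (10) and its p-value in (11); variances[i] = sigma_hat_ii^2 / n_i."""
    r = len(estimates)
    # x = R I(d): differences between each of the first r - 1 indices and the last.
    x = [estimates[i] - estimates[-1] for i in range(r - 1)]
    # R V R' has entries v_i * 1{i = j} + v_r.
    rvr = [[(variances[i] if i == j else 0.0) + variances[-1]
            for j in range(r - 1)] for i in range(r - 1)]
    y = solve(rvr, x)
    stat = sum(xi * yi for xi, yi in zip(x, y))
    return stat, chi2_sf(stat, r - 1)


# Hypothetical index estimates and variances for three independent groups.
print(global_wald_test([0.30, 0.32, 0.36], [1e-4, 2e-4, 4e-4]))
```

With $r = 2$ the statistic reduces to the squared difference of the two indices divided by the sum of their variances, which is the independent two-sample case.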

Another interesting problem in the multiple-sample case is whether the health inequality of a specified population is the same as the average health inequality of all populations. For instance, one may be interested in whether the health inequality level in Hebei province differs from the average level of all provinces in the North China region. Accordingly, we propose the following hypothesis:

$$H_{0,4}: I(f_j) = I(f_0) \quad \text{vs.} \quad H_{a,4}: I(f_j) \ne I(f_0) \tag{12}$$

for a given $j$ ($j = 1, 2, \dots, r$), where $I(f_0) = r^{-1} \sum_{i=1}^{r} I(f_i)$ denotes the average index over the $r$ populations. If the null hypothesis $H_{0,2}$ in (7) holds, then the null hypothesis $H_{0,4}$ holds automatically. In other words, hypothesis $H_{0,4}$ is only of interest when $H_{0,2}$ is not true.

Define $a_j = (-1/r, \dots, -1/r, 1 - 1/r, -1/r, \dots, -1/r)^T$, that is, $a_j$ is an $r \times 1$ vector whose $j$th element is $1 - 1/r$ and whose other elements are all $-1/r$. Hypothesis (12) can be rewritten as

$$H_{0,4}: a_j^T I(f) = 0 \quad \text{vs.} \quad H_{a,4}: a_j^T I(f) \ne 0.$$

Since $\sqrt{\tilde n}\, (I(d) - I(f)) \Rightarrow N(0, \Sigma)$ by (8), we obtain

$$\sqrt{\tilde n}\, (a_j^T I(d) - a_j^T I(f)) \Rightarrow N(0, a_j^T \Sigma a_j).$$

Similar to the derivation of $T_r$, we can construct the following test statistic:

$$T_r^{a_j} = \tilde n\, I^T(d)\, a_j (a_j^T \Sigma a_j)^{-1} a_j^T I(d). \tag{13}$$

Under the null hypothesis in (12), $T_r^{a_j} \Rightarrow \chi_1^2$. In practice, $\Sigma$ is replaced by its estimator $\hat\Sigma$, and the p-value is determined in the same way as for $T_r$.

Further, we discuss hypothesis testing between clusters. Assume now that our preliminary analysis reveals that the provinces within a given region (cluster), such as the Eastern China region, have the same health inequality index. We may then examine whether the health inequalities of two regions are similar. One option is to choose representative provinces from the two regions and compare their health inequality indices using the approaches proposed in Section 2. However, this method does not use all the information in these groups. To use all the underlying information, we compare the common means of the two regions. We consider the following hypothesis:

$$H_{0,5}: I(f_{01}) = I(f_{02}) \quad \text{vs.} \quad H_{a,5}: I(f_{01}) \ne I(f_{02}), \tag{14}$$

where $I(f_{01}) = \sum_{i=1}^{r_1} I(f_i) / r_1$ and $I(f_{02}) = \sum_{i=r_1+1}^{r_1+r_2} I(f_i) / r_2$.

Without loss of generality, we assume that the first $r_1$ populations are clustered in one group with a common health inequality $I(f_{01})$, while populations $r_1 + 1$ to $r_1 + r_2$ are clustered in another group with common health inequality $I(f_{02})$. $H_{0,5}$ is only of interest when the null hypothesis $H_{0,2}$ in (7) is not true. Define

$$b = (1/r_1, \dots, 1/r_1, -1/r_2, \dots, -1/r_2, 0, \dots, 0)^T.$$

That is, $b$ is an $r \times 1$ vector whose first $r_1$ elements are $1/r_1$, whose elements $r_1 + 1$ to $r_1 + r_2$ are $-1/r_2$, and whose remaining elements are 0. Similar to the derivation of $T_r^{a_j}$, we can construct the test statistic

$$T_r^{b} = \tilde n\, I^T(d)\, b\, (b^T \Sigma b)^{-1} b^T I(d). \tag{15}$$

Under the null hypothesis in (14), $T_r^{b} \Rightarrow \chi_1^2$, so the p-value can be determined in the same way as for $T_r$.
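Both one-degree-of-freedom tests, (13) and (15), have the form $\tilde n\, (c^T I(d))^2 / (c^T \Sigma c)$ for a contrast vector $c$. The sketch below builds $a_j$ and $b$ and evaluates the statistic with $\Sigma$ replaced by $\hat\Sigma = \tilde n \operatorname{diag}(v_1, \dots, v_r)$, $v_i = \hat\sigma_{ii}^2 / n_i$, so that $\tilde n$ cancels. The function names, the zero-based index `j`, and the numbers are hypothetical, and independence of the samples is assumed.

```python
import math


def average_contrast(r: int, j: int) -> list:
    """a_j: 1 - 1/r in position j (zero-based here) and -1/r elsewhere."""
    return [(1.0 if i == j else 0.0) - 1.0 / r for i in range(r)]


def cluster_contrast(r: int, r1: int, r2: int) -> list:
    """b: 1/r1 for the first r1 entries, -1/r2 for the next r2, and 0 afterwards."""
    return [1.0 / r1] * r1 + [-1.0 / r2] * r2 + [0.0] * (r - r1 - r2)


def contrast_test(estimates, variances, c):
    """Wald test of c' I(f) = 0; variances[i] = sigma_hat_ii^2 / n_i."""
    value = sum(ci * ei for ci, ei in zip(c, estimates))
    # c' diag(v) c, valid because the samples are assumed independent.
    var = sum(ci ** 2 * vi for ci, vi in zip(c, variances))
    stat = value ** 2 / var
    # For one degree of freedom, P(chi2_1 >= t) = erfc(sqrt(t / 2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Hypothetical index estimates and variances for r = 5 independent populations.
estimates = [0.39, 0.36, 0.37, 0.32, 0.31]
variances = [6e-5, 8e-5, 4e-5, 5e-5, 4e-5]

# Is population 1 (index 0) different from the average of all five?
print(contrast_test(estimates, variances, average_contrast(5, 0)))
# Do populations 2-3 and populations 4-5 share the same cluster mean?
print(contrast_test(estimates[1:], variances[1:], cluster_contrast(4, 2, 2)))
```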

To illustrate our proposed procedures, we present a real application by using the data of the Swiss Health Survey [SHS] in 2002, conducted by Switzerland's Federal Statistical Office. A total of 19,706 observations were collected from seven areas in Switzerland. The survey respondents were asked to rate their health statuses on a five-point scale ranging from very bad to very good. This dataset was also analyzed by [

Area | F1 | F2-1 | F2-2 | F2-3 |
---|---|---|---|---|
Leman | 0.3934 (0.0073) 1 | 0.8985 (0.0102) 1 | 0.3226 (0.0058) 1 | 0.0750 (0.0034) 1 |
North-West | 0.3589 (0.0090) 3 | 0.8277 (0.0142) 3 | 0.2937 (0.0068) 3 | 0.0651 (0.0033) 3 |
Central | 0.3151 (0.0084) 6 | 0.7930 (0.0159) 4 | 0.2601 (0.0065) 6 | 0.0439 (0.0018) 7 |
Middle-Land | 0.3665 (0.0064) 2 | 0.8572 (0.0100) 2 | 0.3013 (0.0051) 2 | 0.0662 (0.0027) 2 |
East | 0.3211 (0.0070) 4 | 0.7899 (0.0131) 5 | 0.2637 (0.0054) 4 | 0.0467 (0.0016) 5 |
Ticino | 0.3205 (0.0170) 5 | 0.7245 (0.0303) 7 | 0.2605 (0.0134) 5 | 0.0583 (0.0056) 4 |
Zurich | 0.3138 (0.0066) 7 | 0.7735 (0.0125) 6 | 0.2579 (0.0051) 7 | 0.0456 (0.0015) 6 |

Because the data are a random sample, it is natural to ask whether, for example, East and Ticino truly have different health inequalities, or whether Central and Zurich truly have the same health inequality. We use statistical inference to address these questions. To this end, several two-sample comparison tests are carried out; the resulting p-values are reported in the table of pairwise comparisons below.

Comparison | F1 | F2-1 | F2-2 | F2-3 |
---|---|---|---|---|
Leman vs Middle-Land | 0.0054 | 0.004 | 0.0056 | 0.0378 |
North-West vs Middle-Land | 0.4765 | 0.0902 | 0.3723 | 0.7969 |
North-West vs East | 0.0006 | 0.0494 | 0.0005 | 4.04E−07 |
North-West vs Ticino | 0.0436 | 0.002 | 0.0273 | 0.3002 |
Central vs Zurich | 0.9032 | 0.3349 | 0.7904 | 0.4494 |
East vs Ticino | 0.9739 | 0.0476 | 0.8275 | 0.0473 |

Based on the above analysis, we classify North-West and Middle-Land, East and Ticino, and Central and Zurich into three groups. However, can we combine two of these groups, such as the East and Ticino group with the Central and Zurich group? This question is equivalent to asking whether the average health inequality of the East and Ticino group is the same as that of the Central and Zurich group. The p-values of the tests using the above four measures are 0.5505, 0.1778, 0.7105 and 0.0140, respectively, which are all larger than 5% except for F2-3. Therefore, East, Ticino, Central, and Zurich may be clustered into one group. We also check whether these four regions have the same health inequality levels. The p-values for this global equality test are 0.8805, 0.1824, 0.8946 and 0.0942, respectively, which suggests that these regions have the same inequality levels.

We then examine whether this four-member group can be enlarged by including the North-West and Middle-Land group. We consider two hypotheses to investigate this question. First, is the average inequality of the North-West and Middle-Land group the same as that of the four-member group? Second, do these six regions have the same health inequality levels? For both hypotheses, the p-values from the tests with all four measures are well below 5%, which indicates that the average health inequality of the North-West and Middle-Land group differs from that of the four-member group. We then examine whether the health inequality level of Leman is the same as the average level of the North-West and Middle-Land group. The p-values for all four measures are well below 5%, which indicates that the health inequality level of Leman differs from the average level of the North-West and Middle-Land group. In sum, we classify these seven regions into three groups: Leman, North-West and Middle-Land, and the other four regions.
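As an informal check on how the reported numbers fit together, a pairwise p-value can be recomputed from the first table above. The sketch rests on two assumptions that the text does not state: that the value in parentheses after each index is its standard error, and that the two areas are independent samples. The function name is hypothetical. Under these assumptions, the F1 comparison of Central and Zurich comes out close to the reported 0.9032, with the small gap attributable to rounding of the tabulated inputs.

```python
import math

# F1 entries read from the first table above: (index estimate, value in parentheses).
central = (0.3151, 0.0084)
zurich = (0.3138, 0.0066)


def pairwise_p_value(a, b):
    """Two-sided p-value for equal indices.

    Assumes the second entry of each pair is a standard error and that the
    two samples are independent; neither assumption is confirmed by the text.
    """
    stat = (a[0] - b[0]) ** 2 / (a[1] ** 2 + b[1] ** 2)
    # For one degree of freedom, P(chi2_1 >= t) = erfc(sqrt(t / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


print(round(pairwise_p_value(central, zurich), 4))
```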

In this paper, we propose several statistical inference procedures for the novel health inequality indices introduced in [

This research was funded by the Fundamental Research Funds for the Central Universities, China Postdoctoral Science Foundation (2016M600951), National Natural Science Foundation of China (11101432) and Natural Science Foundation of Guangdong Province, China (2016A030313856). The authors thank the editor, the associate editor and the anonymous referee for their constructive comments and suggestions, which led to a substantial improvement of an earlier manuscript. All correspondence should be addressed to Xuejun Jiang, Department of Mathematics, Southern University of Science and Technology, Shenzhen, China, E-mail: jiangxj@sustc.edu.cn.

Niu, C.Z., Hong, S.X. and Jiang, X.J. (2017) Statistical Inference for a Novel Health Inequality Index. Theoretical Economics Letters, 7, 251-262. https://doi.org/10.4236/tel.2017.72021

Proof of Theorem 1.

Since $n d$ follows a multinomial distribution with parameters $n$ and $f$, the multivariate central limit theorem gives, for the empirical frequencies of the health categories,

$$\sqrt{n}\, (d - f) \Rightarrow N(0, \Omega_0(f)), \tag{A.1}$$

where

$$\Omega_0(f) = \operatorname{diag}(f) - f f^T.$$

Note that $g(|i - j|) f_i f_j = g(|j - i|) f_j f_i$. Therefore,

$$I(f) = 2 \sum_{i=1}^{m-1} \sum_{j=i+1}^{m} g(|i - j|) f_i f_j.$$

Define $J = \left( \frac{\partial I(f)}{\partial f_1}, \dots, \frac{\partial I(f)}{\partial f_m} \right)^T$. It can be easily shown that

$$\frac{\partial I(f)}{\partial f_i} = 2 \sum_{l \ne i} g(|l - i|) f_l. \tag{A.2}$$

Equivalently, $J = G f$. Keeping only the first two terms of the Taylor expansion of $I(d)$ around $f$, we have

$$I(d) \approx I(f) + J^T (d - f).$$

Then the variance of $I(d) - I(f)$ is approximated by

$$\operatorname{Var}(J^T (d - f)) = J^T \operatorname{Cov}(d - f, d - f)\, J = f^T G^T (\Omega_0(f) / n)\, G f.$$

Since the remainder of the expansion is $o_p(n^{-1/2})$, combining (A.1) with the delta method yields

$$\sqrt{n}\, (I(d) - I(f)) \Rightarrow N(0, \sigma^2),$$

where $\sigma^2 = f^T G^T \Omega_0(f) G f$.

Proof of Theorem 2.

Define $\tilde n = \max\{n_1, n_2\}$ and $\lambda_i = \lim_{\tilde n \to \infty} (n_i / \tilde n)$, $0 < \lambda_i \le 1$, $i = 1, 2$. From Lemma 1, we have

$$\sqrt{n}\,(I(d) - I(f)) = \sqrt{n}\, f^T G^T (d - f) + o_p(1) = \frac{1}{\sqrt{n}} \sum_{l=1}^{n} f^T G^T \left( I(V_l = h) - f \right) + o_p(1).$$

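Let $\sigma_h^2 = \operatorname{Cov}\left( I(V_{1l} = h), I^T(V_{2l} = h) \right)$. Then, similarly to the proof of Theorem 1, it can be derived that

$$
\sqrt{\tilde n} \begin{pmatrix} I(d_1) - I(f_1) \\ I(d_2) - I(f_2) \end{pmatrix}
= \begin{pmatrix} \sqrt{\tilde n / n_1} \times \sqrt{n_1}\, (I(d_1) - I(f_1)) \\ \sqrt{\tilde n / n_2} \times \sqrt{n_2}\, (I(d_2) - I(f_2)) \end{pmatrix}
= \begin{pmatrix} \sqrt{\tilde n / n_1} \times \frac{1}{\sqrt{n_1}} \sum_{l=1}^{n_1} f_1^T G^T \left( I(V_{1l} = h) - f_1 \right) \\ \sqrt{\tilde n / n_2} \times \frac{1}{\sqrt{n_2}} \sum_{k=1}^{n_2} f_2^T G^T \left( I(V_{2k} = h) - f_2 \right) \end{pmatrix} + o_p(1)
\Rightarrow N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_{11}^2 / \lambda_1 & \sigma_{12}^2 / \sqrt{\lambda_1 \lambda_2} \\ \sigma_{21}^2 / \sqrt{\lambda_1 \lambda_2} & \sigma_{22}^2 / \lambda_2 \end{pmatrix} \right). \tag{A.3}
$$

Here,

$$\sigma_{11}^2 = f_1^T G^T \Omega_0(f_1) G f_1, \qquad \sigma_{12}^2 = \sigma_{21}^2 = f_1^T G^T \operatorname{Cov}\left( I(V_{1l} = h), I^T(V_{2l} = h) \right) G f_2 = f_1^T G^T \sigma_h^2 G f_2, \qquad \sigma_{22}^2 = f_2^T G^T \Omega_0(f_2) G f_2.$$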

Replacing $f_i$ and $\sigma_h^2$ by their consistent estimators $d_i$ and

$$\hat\sigma_h^2 = \{\min(n_1, n_2)\}^{-1} \sum_{l=1}^{\min\{n_1, n_2\}} I(V_{1l} = h)\, I^T(V_{2l} = h) - d_1 d_2^T,$$

respectively, we can consistently estimate $\sigma_{ij}^2$, $i, j = 1, 2$.

Combining this with the result in (A.3), we obtain

$$\sqrt{\tilde n}\, \left[ (I(d_1) - I(d_2)) - (I(f_1) - I(f_2)) \right] = \sqrt{\tilde n}\, \left[ (I(d_1) - I(f_1)) - (I(d_2) - I(f_2)) \right] \Rightarrow N\left( 0,\; \sigma_{11}^2 / \lambda_1 + \sigma_{22}^2 / \lambda_2 - 2 \sigma_{12}^2 / \sqrt{\lambda_1 \lambda_2} \right).$$