Depth-based multivariate Wilcoxon test for a scale difference.
mWilcoxonTest(x, y, alternative = "two.sided", depth_params = list())
Argument | Description |
---|---|
x | data matrix |
y | data matrix |
alternative | a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less". |
depth_params | list of parameters for function depth (method, threads, ndir, la, lb, pdim, mean, cov, exact). |
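
The arguments above can be combined as in the following minimal sketch. It assumes that the DepthProc and MASS packages are installed, and that `"Mahalanobis"` is an accepted value of the `method` entry passed on to `depth`; neither the seed nor the simulated data come from the package documentation.

```r
# Sketch only: runs when DepthProc and MASS are available.
if (requireNamespace("DepthProc", quietly = TRUE) &&
    requireNamespace("MASS", quietly = TRUE)) {
  set.seed(123)  # arbitrary seed, for reproducibility of the illustration

  # Two bivariate normal samples with the same centre and different scale.
  x <- MASS::mvrnorm(100, c(0, 0), diag(2))
  y <- MASS::mvrnorm(100, c(0, 0), diag(2) * 1.4)

  # depth_params is forwarded to depth(); the method name is an assumption.
  res <- DepthProc::mWilcoxonTest(
    x, y,
    alternative = "two.sided",
    depth_params = list(method = "Mahalanobis")
  )
  print(res)
}
```

Leaving `depth_params` as an empty list keeps the defaults of `depth`.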
Given two samples \( {X} ^ {n} \) and \( {Y} ^ {m} \) and any depth function, we can compute depth values in the combined sample \( {Z} ^ {n + m} = {X} ^ {n} \cup {Y} ^ {m} \), with the empirical distribution calculated from all observations, or only from the observations belonging to one of the samples \( {X} ^ {n} \) or \( {Y} ^ {m} \).
For example, if we observe that the depths of the \( {X}_{l} \)'s are more likely to cluster tightly around the center of the combined sample, while the depths of the \( {Y}_{l} \)'s are more likely to be scattered at outlying positions, then we conclude that \( {Y} ^ {m} \) was drawn from a distribution with a larger scale.
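
The idea can be illustrated with a small base R sketch. The Mahalanobis depth, the helper `maha_depth`, and the simulation settings are assumptions chosen for illustration; they are not taken from DepthProc.

```r
set.seed(1)  # arbitrary seed for the illustration

# Two bivariate normal samples with the same centre; Y has the larger scale.
n <- 200
m <- 200
x <- matrix(rnorm(n * 2), ncol = 2)          # X^n, unit scale
y <- matrix(rnorm(m * 2, sd = 2), ncol = 2)  # Y^m, scale doubled

# Combined sample Z = X^n U Y^m.
z <- rbind(x, y)

# Hypothetical helper: Mahalanobis depth of the rows of u w.r.t. the sample ref,
# D(u, ref) = 1 / (1 + squared Mahalanobis distance from the mean of ref).
maha_depth <- function(u, ref) {
  1 / (1 + mahalanobis(u, colMeans(ref), cov(ref)))
}

dep_x <- maha_depth(x, z)
dep_y <- maha_depth(y, z)

# Points of the more dispersed sample lie further out, so their depths
# with respect to Z tend to be smaller.
print(summary(dep_x))
print(summary(dep_y))
print(median(dep_x) > median(dep_y))
```

Comparing the two depth summaries shows the pattern described above: the sample with the larger scale has systematically lower depth in the combined sample.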
Properties of DD-plot-based statistics in the i.i.d. setting were studied by Li & Liu (2004). The authors proposed several DD-plot-based statistics and presented bootstrap arguments for their consistency and good performance compared with Hotelling's \( T ^ 2 \) and multivariate analogues of the Ansari-Bradley and Siegel-Tukey statistics. Asymptotic distributions of the depth-based multivariate Wilcoxon rank-sum test statistic under the null and general alternative hypotheses were obtained by Zuo & He (2006). Several properties of the depth-based rank test, including its unbiasedness, were critically discussed by Jureckova & Kalina (2012). Based on the DD-plot object available in DepthProc, it is possible to define several multivariate generalizations of one-dimensional rank and order statistics in an easy way. These generalizations include the well-known Wilcoxon rank-sum statistic.
The depth-based multivariate Wilcoxon rank-sum test is especially useful for detecting multivariate scale changes. It was introduced by, among others, Liu & Singh (1993) and studied intensively by Jureckova & Kalina (2012) and Zuo & He (2006) in the i.i.d. setting.
For the samples \( {X} ^ {n} = \{ {X}_{1}, ..., {X}_{n} \} \) and \( {Y} ^ {m} = \{ {Y}_{1}, ..., {Y}_{m} \} \), with depths \( d_{1} ^ {X}, ..., d_{n} ^ {X} \) and \( d_{1} ^ {Y}, ..., d_{m} ^ {Y} \) computed w.r.t. the combined sample \( {Z} = {X} ^ {n} \cup {Y} ^ {m} \), the Wilcoxon statistic is defined as \( S = \sum\limits_{i = 1} ^ {m} {R}_{i} \), where \( {R}_{i} \) denotes the rank of the i-th observation of \( {Y} ^ {m} \), \( i = 1, ..., m \), in the combined sample: \( {R}_{l} = R({y}_{l}) = \sharp\left\{ {z}_{j} \in {Z} : D({z}_{j}, {Z}) \le D({y}_{l}, {Z}) \right\}, l = 1, ..., m \).
Under the null hypothesis, the distribution of \( S \) is symmetric about \( E(S) = \frac{ 1 }{ 2 }m(m + n + 1) \), and its variance is \( {D} ^ {2}(S) = \frac{ 1 }{ 12 }mn(m + n + 1) \).
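
The statistic and its moments can be computed by hand, as in the base R sketch below. The Mahalanobis depth, the simulated data, and the normal approximation for the p-value are illustrative assumptions; this is not the package's implementation.

```r
set.seed(2)  # arbitrary seed for the illustration

n <- 150  # size of X^n
m <- 100  # size of Y^m
x <- matrix(rnorm(n * 2), ncol = 2)
y <- matrix(rnorm(m * 2, sd = 1.5), ncol = 2)  # larger scale than X
z <- rbind(x, y)                               # combined sample Z

# Depth of every point of Z w.r.t. Z (Mahalanobis depth, assumed for illustration).
dep_z <- 1 / (1 + mahalanobis(z, colMeans(z), cov(z)))

# R(z_j) = #{z_k in Z : D(z_k, Z) <= D(z_j, Z)}.
# ties.method = "max" gives exactly this count when ties occur.
r <- rank(dep_z, ties.method = "max")

# S is the sum of the ranks of the Y observations (rows n + 1, ..., n + m of Z).
S <- sum(r[(n + 1):(n + m)])

# Null mean and variance of S.
ES <- m * (m + n + 1) / 2
VS <- m * n * (m + n + 1) / 12

# Standardised statistic and a two-sided p-value from the normal approximation.
zstat <- (S - ES) / sqrt(VS)
p_two_sided <- 2 * pnorm(-abs(zstat))

# Cross-check with base R: wilcox.test() reports the Mann-Whitney form
# W = S - m (m + 1) / 2 for the first sample passed to it.
W <- unname(wilcox.test(dep_z[(n + 1):(n + m)], dep_z[1:n])$statistic)

print(c(S = S, ES = ES, z = zstat, p = p_two_sided, W = W))
```

A value of \( S \) well below \( E(S) \) means that the \( {Y} ^ {m} \) observations receive low depth ranks, which points to a larger scale for \( {Y} ^ {m} \).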
Jureckova J, Kalina J (2012). Nonparametric multivariate rank tests and their unbiasedness. Bernoulli, 18(1), 229--251.
Li J, Liu RY (2004). New nonparametric tests of multivariate locations and scales using data depth. Statistical Science, 19(4), 686--696.
Liu RY, Singh K (1993). A quality index based on data depth and multivariate rank tests. Journal of the American Statistical Association, 88, 252--260.
Zuo Y, He X (2006). On the limiting distributions of multivariate depth-based rank sum statistics and related tests. The Annals of Statistics, 34, 2879--2896.
library(DepthProc)
library(MASS)  # provides mvrnorm()

# EXAMPLE 1
x <- mvrnorm(100, c(0, 0), diag(2))
y <- mvrnorm(100, c(0, 0), diag(2) * 1.4)
mWilcoxonTest(x, y)
#>
#> Multivariate Wilcoxon test for equality of dispersion
#>
#> data: dep_x and dep_y
#> W = 6120, p-value = 0.006231
#> alternative hypothesis: true dispersion ratio is not equal to 1
#>
#>
#> Multivariate Wilcoxon test for equality of dispersion
#>
#> data: dep_x and dep_y
#> W = 6108, p-value = 0.006809
#> alternative hypothesis: true dispersion ratio is not equal to 1
#>

# EXAMPLE 2
data(under5.mort)
data(inf.mort)
data(maesles.imm)
data2011 <- na.omit(cbind(under5.mort[, 22], inf.mort[, 22], maesles.imm[, 22]))
data1990 <- na.omit(cbind(under5.mort[, 1], inf.mort[, 1], maesles.imm[, 1]))
mWilcoxonTest(data2011, data1990)
#>
#> Multivariate Wilcoxon test for equality of dispersion
#>
#> data: dep_x and dep_y
#> W = 22212, p-value = 3.855e-11
#> alternative hypothesis: true dispersion ratio is not equal to 1
#>