Stratified Multivariate Mann–Whitney Estimators for the Comparison of Two Treatments with Randomization Based Covariance Adjustment

Publisher: Taylor & Francis. Statistics in Biopharmaceutical Research. Published online: 01 Jan 2012.

To cite this article: Atsushi Kawaguchi, Gary G. Koch & Xiaofei Wang (2011) Stratified Multivariate Mann–Whitney Estimators for the Comparison of Two Treatments with Randomization Based Covariance Adjustment, Statistics in Biopharmaceutical Research, 3:2, 217-231, DOI: 10.1198/sbr.2010.10007

To link to this article: http://dx.doi.org/10.1198/sbr.2010.10007
Atsushi KAWAGUCHI, Gary G. KOCH, and Xiaofei WANG

Methodology for comparing two randomly assigned treatments for strictly ordinal response variables has been discussed throughout the literature on multivariate Mann–Whitney estimators with stratification adjustment. Although such estimators can be computed directly as weighted linear combinations of within-stratum Mann–Whitney estimators, consistent estimation of their covariance matrix is done using methods for multivariate U-statistics. The scope of these methods includes ways of managing randomly missing data and ways to invoke randomization-based covariance adjustment for no differences between treatments for background or baseline covariables. The assessment of treatment differences can be done using confidence intervals or statistical tests for the adjusted Mann–Whitney estimators. The methods in this article are illustrated using three examples. The first example is a randomized clinical trial with eight strata and a univariate ordinal response variable. The second example is a randomized clinical trial with four strata, two covariables, and four ordinal response variables.
The third example is a randomized two-period crossover clinical trial with four strata, three covariables (as age, screening, first baseline), three response variables (as first period response, second baseline, second period response), and missing data. For these examples, the results are interpretable through the probability of better outcomes for one treatment over the other.

Key Words: Randomization-based nonparametric covariance adjustment; Rank measures of association for pairwise comparisons; Strictly ordinal response variable; U-statistics; Weighted least squares estimation.

© American Statistical Association. Statistics in Biopharmaceutical Research 2011, Vol. 3, No. 2. DOI: 10.1198/sbr.2010.10007

1. Introduction

Many confirmatory randomized clinical trials for the comparison of two treatments have stratified designs for which there is separate and independent random assignment of patients within their corresponding strata. For multicenter studies, the centers are often the strata. In other cases, the strata can correspond to one or more background or baseline characteristics such as gender or categories for disease severity, and analytical adjustment for such strata can enable the comparison of the two treatments to have better power when the strata have strong associations with the response variables. Regardless of its nature, stratification is applicable to both multivisit clinical trials, which have random assignment of treatments to separate parallel groups, and to crossover clinical trials, which have random assignment of sequences of one or more treatments to successive periods. An additional consideration is that one or more post-stratified background or baseline characteristics can be factors for stratification (through their conditional distributions subsequent to randomization of patients to treatments).

The response variables for many clinical trials have a strictly ordinal measurement scale. The values for such strictly ordinal response variables can be ordered categories such as pain severity according to none, mild, moderate, or severe, or they can be essentially continuous determinations for which an interval scale is not applicable (e.g., a narrow range for better outcomes at one end versus a wide range for poorer outcomes at the opposite end); examples of the latter type include visual analogue scales for assessments of pain from osteoarthritis and number of painful joints for rheumatoid arthritis. For strictly ordinal response variables, arithmetic operations such as addition and subtraction are not applicable, and so their comparison for the two treatments needs to be through corresponding rankings.

The primary analyses for confirmatory clinical trials (and particularly those with regulatory objectives) should have protocol specified methods that have minimal assumptions. For clinical trials without stratified designs (i.e., with only one stratum for all patients), the Wilcoxon (or Mann–Whitney) test statistic for the comparison between two treatments through the ranking of a response variable for all patients (in the pooled treatment groups) has essentially no assumptions (under the null hypothesis of no treatment differences) beyond valid randomization as its basis.
For a randomized clinical trial with stratification, the van Elteren (1960) extension of the Wilcoxon rank sum statistic similarly has essentially no assumptions, mainly because it addresses a weighted linear combination of the within-stratum Wilcoxon rank sum statistics with the weights being the reciprocals of the corresponding within-stratum sample sizes plus one; and thereby, it addresses a corresponding linear combination of Mann–Whitney statistics as discussed in Section 2.1. Moreover, the van Elteren test statistic is applicable regardless of the sample sizes within the strata (as long as each treatment has at least one patient), and essentially exact p-values for it can be determined (by enumeration or simulation); also, it is identical to McNemar's test statistic when each treatment has exactly one patient within each stratum as would apply when the strata correspond to matched pairs of patients for the treatments. However, the power of the van Elteren test statistic can have limitations when the sample sizes within the strata are relatively small (e.g., < 6), and so the number of factors for stratification usually cannot exceed three. For this reason, other baseline factors for adjustment under minimal assumptions need management by randomization-based covariance methods, as reviewed in LaVange, Durham, and Koch (2005). These methods have essentially no assumptions through the invocation of constraints for no differences between treatments for means of covariables (as an implied consequence of randomization).

Although the van Elteren test statistic and its counterpart with randomization-based covariance adjustment have the advantages of no assumptions (beyond valid randomization) and the availability of exact p-values, they have the limitation of not having a structure for the determination of a corresponding confidence interval for the description of the extent to which a strictly ordinal response variable has different distributions for the two treatment groups. Another noteworthy limitation for these methods is that they do not have a convenient multivariate extension for situations with possibly missing data for one or more of the response variables.

Some potential ways to address confidence intervals with respect to Mann–Whitney statistics have been discussed by Brunner, Puri, and Sun (1995) and Ryu and Agresti (2008). In this regard, Brunner et al. (1995) mainly consider a univariate response variable in a stratified setting through an essentially equally weighted average of the Mann–Whitney statistics for the respective strata (as opposed to the estimator corresponding to the van Elteren test statistic), although they do not emphasize the availability of confidence intervals from what they present. They also discussed related methods for a mixed model with random center effects, both with and without a baseline variable. The construction of confidence intervals corresponding to the Mann–Whitney statistic is a major focus of attention in Ryu and Agresti (2008), particularly for an ordered categorical response variable. Their article shows that a method involving inversion of a score test to obtain a confidence interval has good statistical properties (and better behavior for small samples than a Wald confidence interval); and it additionally provides extensions to address strata and covariables through a logistic model for Mann–Whitney estimators.

Another method which does not have the previously noted limitations of the van Elteren test statistic and its counterpart with randomization-based covariance adjustment is the proportional odds logistic regression model together with its extension to multivariate response variables via generalized estimating equations (GEE); see Stokes, Davis, and Koch (2000, chap. 9, 15). Although the description of the treatment comparison for a response variable with a confidence interval for an odds ratio is an advantage of this method, a noteworthy limitation for confirmatory clinical trials with regulatory objectives is uncertainty about the applicability of this model, particularly the assumption of proportional odds (as it jointly involves treatments, strata, and covariables). Additional difficulties for the use of this model can come from adjusting for many strata with some having very small sample sizes (e.g., < 8) or dealing with many potential outcomes of a strictly ordinal response variable (as can occur for a visual analogue scale) when many of them have very small sample sizes (e.g., < 2). Issues pertaining to proportional odds and strata with small sample size can have some resolution through the randomization-based methods for covariance and stratification adjustment in Tangen and Koch (1999), mainly because they can work with differences between treatments for stratified means for covariables and for log odds of stratified cumulative proportions for response variables, and they only invoke proportional odds for the comparison between the two treatments.
However, the current scope for these methods is restricted to ordered categorical response variables (with perhaps at most seven categories), multivariate response variables with no missing data, and sufficient sample size for each treatment within each stratum (e.g., ≥ 8). Otherwise, as noted by Agresti (2002, p. 301) as well as other references, the score test for comparing treatments for an ordered categorical response variable through a proportional odds model that has no strata nor covariables is equivalent to the Mann–Whitney test. Such equivalence would seem applicable as well in some sense for the van Elteren test and the score test to compare treatments from a proportional odds model that provides adjustment for strata (but has no inclusion of covariables).

The objective of this article is to provide and to illustrate the application of a method which does not have the limitations of the previously described methods. It is based on the Mann–Whitney estimator for the probability that a randomly selected patient from one treatment group has better status for a response variable within a stratum than a randomly selected patient from the other treatment group (with ties being randomly broken with probability 0.5). Such Mann–Whitney estimators can be combined across the strata to provide a stratified estimator that addresses the same comparison between the two treatment groups as the van Elteren test statistic. The multivariate vector of such stratified Mann–Whitney estimators for multivariate response variables is discussed in Section 2.1 and the Appendix for situations that can have missing completely at random (MCAR) data for one or more response variables. A consistent estimator for the covariance matrix for this multivariate vector can be constructed through the methods for ratios of multivariate U-statistics in the Appendix.
As shown in Section 2.2 and the Appendix, randomization-based covariance adjustment can be invoked for the multivariate vector of stratified Mann–Whitney estimators by initially expanding this vector to include the stratified differences between means of covariables, producing a consistent estimator of the corresponding covariance matrix by the methods in the Appendix, and then constraining the differences for covariables to 0's. Estimation of the covariance and stratification adjusted Mann–Whitney estimators is by the application of weighted least squares to the previously specified expanded vector, with the weighting based on the inverse of the corresponding estimated covariance matrix; see Stokes et al. (2000, chap. 13) and Koch et al. (1998). With sufficiently large overall sample sizes for each treatment group and at least minimal sample sizes within each stratum (as outlined in Section 2 and the Appendix), the resulting vector of adjusted estimators has an approximately multivariate normal distribution with a consistent estimator of the corresponding covariance matrix. This structure can then enable determination of confidence intervals for linear functions of the covariance and stratification adjusted Mann–Whitney estimators. Section 2 describes the methods outlined here and Section 3 illustrates those methods for three examples.

2. Methods

As outlined in Section 1, a stratified Mann–Whitney estimator and corresponding confidence interval can be constructed so as to address the same comparison between two randomized groups as the van Elteren test statistic. The specifications for the formal structure for this method are in Section 2.1 for a multivariate set of $r$ response variables. Section 2.2 discusses how to invoke randomization-based covariance adjustment for stratified multivariate Mann–Whitney estimators through constraints for no expected differences between randomized groups for stratified means of covariables.

2.1 Formal Structure for a Stratified Multivariate Mann–Whitney Estimator

Consider a study that has $q$ strata within which patients are randomized to two groups. In the subsequent discussion, an underlying assumption for the patients in this study is that they represent a corresponding target population for the treatments within the strata in the sense of simple random samples (with this structure being random sampling from multinomial distributions for ordered categorical response variables); see the Appendix. The two groups can correspond to test and control treatments in a parallel groups design or to sequences of one or more treatments in a crossover design with two or more periods.

For this study, let $h = 1, 2, \ldots, q$ index the strata, and let $i = 1, 2$ index the groups. Let $k = 1, 2, \ldots, r$ index the strictly ordinal response variables which have observation for the $n_{hi}$ patients in the $i$th group for the $h$th stratum. These response variables could correspond to one or more criteria at one or more visits, and some of them could be baselines at times prior to any post-treatment visit, or in crossover studies, they could pertain to washout periods prior to the treatments which are subsequent to the first treatment. Also, for convenience, some of the response variables could be strictly ordinal covariables with observation prior to randomization since the methods here are comparably applicable to both.

Let $\ell = 1, 2, \ldots, n_h$, where $n_h = (n_{h1} + n_{h2})$, index the patients in the $h$th stratum, and let $R_{hik\ell}$ denote the rank (with larger ranks for better response and midranks for ties) of the $k$th response variable among the $n_{h*k} = (n_{h1k} + n_{h2k})$ patients with observed values for the $k$th response variable in the $h$th stratum, with $n_{hik}$ being the number of such patients among the $n_{hi}$ patients for the $i$th group in the $h$th stratum. Also, any missing values for the $k$th response variable are assumed to be missing completely at random. Let

$$\bar{R}_{hik} = \left( \sum_{\ell=1}^{n_{hik}} R_{hik\ell} / n_{hik} \right)$$

denote the mean rank for the $k$th response variable in the $i$th group of the $h$th stratum (where $\sum$ ignores patients who have missing $R_{hik\ell}$), and let $d_{hk} = (\bar{R}_{h1k} - \bar{R}_{h2k}) / n_{h*k}$. Then, the Mann–Whitney estimator for the comparison between groups 1 and 2 for the $h$th stratum and the $k$th response variable is $\hat{\xi}_{hk} = (d_{hk} + 0.5)$. In this regard, $\hat{\xi}_{hk}$ is interpretable for the $h$th stratum as estimating the probability that the $k$th response variable is better for a randomly selected patient from group 1 than such a patient from group 2 (where ties are randomly broken with probability 0.5).

The usual test statistic for the stratification adjusted comparison between groups 1 and 2 with respect to the $\{d_{hk}\}$ for the $k$th response variable is the van Elteren (1960) extension of the Wilcoxon rank sum statistic. This test addresses $d_k = \sum_{h=1}^{q} w_{hk} d_{hk}$ where $w_{hk} = (n_{h1k} n_{h2k})/(n_{h*k} + 1)$, and so it correspondingly addresses $(\hat{\xi}_k - 0.5)$ where

$$\hat{\xi}_k = \left( \sum_{h=1}^{q} w_{hk} \hat{\xi}_{hk} \Big/ \sum_{h=1}^{q} w_{hk} \right)$$

is the stratified Mann–Whitney estimator for the $k$th response variable. Under the global null hypothesis $H_0$ of no differences between the two groups in all of the strata, the randomization based variance for $d_k$ is

$$v_{d,k,0} = \sum_{h=1}^{q} \left( n_{h1k} n_{h2k} v_{hk,0} / n_{h*k} \right) \quad \text{where} \quad v_{hk,0} = \left\{ \sum_{i=1}^{2} \sum_{\ell=1}^{n_{hik}} \left( R_{hik\ell}/(n_{h*k} + 1) - 0.5 \right)^2 \Big/ (n_{h*k} - 1) \right\}.$$
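
The definitions above can be sketched in a few lines of code. The following is a minimal illustration, not the authors' implementation: the function names (`midranks`, `stratum_mann_whitney`, `stratified_mann_whitney`) and the small data set are hypothetical, missing values are represented by `None` and simply dropped (which presumes they are missing completely at random), and only one response variable is handled.

```python
def midranks(values):
    """Return 1-based ranks for a list of numbers, with tied values sharing the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0  # average of the 1-based positions in the block
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def stratum_mann_whitney(group1, group2):
    """Within-stratum estimate xi_hat = d + 0.5 and van Elteren weight, skipping None values."""
    y1 = [y for y in group1 if y is not None]
    y2 = [y for y in group2 if y is not None]
    n1, n2 = len(y1), len(y2)
    ranks = midranks(y1 + y2)            # larger value = better response
    mean1 = sum(ranks[:n1]) / n1
    mean2 = sum(ranks[n1:]) / n2
    d = (mean1 - mean2) / (n1 + n2)      # d_hk
    weight = n1 * n2 / (n1 + n2 + 1)     # w_hk
    return d + 0.5, weight


def stratified_mann_whitney(strata):
    """Weighted average of within-stratum estimates over (group1, group2) pairs."""
    pairs = [stratum_mann_whitney(g1, g2) for g1, g2 in strata]
    total_weight = sum(w for _, w in pairs)
    return sum(w * xi for xi, w in pairs) / total_weight


# Hypothetical data: two strata, one response variable, one missing value.
strata = [
    ([3, 5, 4], [1, 2, 4]),
    ([7, None, 9, 8], [6, 8, 5]),
]
xi_first, w_first = stratum_mann_whitney(*strata[0])
xi_all = stratified_mann_whitney(strata)
```

In the first hypothetical stratum, 7.5 of the 9 between-group pairs favor group 1 (counting the single tie as one half), so the rank-based formula returns 7.5/9, the same value that direct pair counting gives.
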
With sufficiently large overall sample sizes for the two groups (e.g., $n_{+ik} = \sum_{h=1}^{q} n_{hik} \ge 30$ with at least minimal sample sizes $n_{hik} \ge 2$ within each stratum), the van Elteren statistic $Q_{VE,k} = (d_k^2 / v_{d,k,0})$ for the $k$th response variable has approximately the chi-squared distribution with one degree of freedom (d.f. = 1) under $H_0$. The van Elteren statistic $Q_{VE,k}$ would typically be the primary method for the stratification adjusted comparison between two randomly assigned groups for a strictly ordinal response variable, particularly if it had an essentially continuous distribution (with no ties among the $R_{hik\ell}$ within the respective strata). One reason for $Q_{VE,k}$ having this primary role is that it can have exact assessment (via simulation or enumeration) from its randomization distribution with no assumptions (beyond valid stratified randomization). A second reason is its locally most powerful property, as noted in references such as Lehmann (1975). The principal limitation of $Q_{VE,k}$ is that it does not have an inherent counterpart for confidence interval estimation for a strictly ordinal response variable, mainly because the randomization based variance $v_{d,k,0}$ for $d_k$ is only applicable under the global null hypothesis $H_0$. A second limitation is that it does not have a convenient multivariate extension to a set of $r$ strictly ordinal response variables in situations with possibly missing data for one or more of the response variables.

As noted in the Appendix, the stratified multivariate Mann–Whitney estimators $\hat{\boldsymbol{\xi}} = (\hat{\xi}_1, \hat{\xi}_2, \ldots, \hat{\xi}_r)'$ do not have either of the previously noted limitations of the $Q_{VE,k}$ since $\mathbf{V}_{\hat{\boldsymbol{\xi}}}$ from (A.4) provides a consistent estimator for their covariance matrix regardless of whether $H_0$ applies, given that any missing data are missing completely at random.
Accordingly, with sufficient sample size for $\hat{\boldsymbol{\xi}}$ to have an approximately multivariate normal distribution (e.g., $n_{+ik} \ge 50$ and all $n_{hik} \ge 4$, with these guidelines being somewhat larger than those for $Q_{VE,k}$), a two-sided $100(1 - 2\alpha)\%$ confidence interval for the linear statistic $\mathbf{c}'\hat{\boldsymbol{\xi}}$ for the comparison between the two groups is

$$\left( \mathbf{c}'\hat{\boldsymbol{\xi}} \pm z_{\alpha} \sqrt{\mathbf{c}' \mathbf{V}_{\hat{\boldsymbol{\xi}}} \mathbf{c}} \right)$$

where $z_{\alpha}$ is the $100(1 - \alpha)$ percentile of the standard normal distribution with mean 0 and variance 1. The vector $\mathbf{c}$ can correspond to the separate univariate response variables via $\mathbf{c} = \boldsymbol{\delta}_j$ where $\boldsymbol{\delta}_j$ is a vector with the $j$th element equal to 1 and all others equal to 0, the average across all response variables via $\mathbf{c} = (\mathbf{1}_r / r)$ where $\mathbf{1}_a$ is the $(a \times 1)$ vector of 1's, or for contrasts among the $\hat{\xi}_k$ so as to address their homogeneity for the comparisons between the two groups. In this latter context, when the $r$ response variables correspond to $r$ visits for the same response criterion,

$$Q_C = \hat{\boldsymbol{\xi}}' \mathbf{C}' (\mathbf{C} \mathbf{V}_{\hat{\boldsymbol{\xi}}} \mathbf{C}')^{-1} \mathbf{C} \hat{\boldsymbol{\xi}} \quad \text{for} \quad \mathbf{C} = [\mathbf{I}_{(r-1)}, -\mathbf{1}_{(r-1)}],$$

with $\mathbf{I}_a$ as the $(a \times a)$ identity matrix, provides an overall test statistic for homogeneity (or group × visit interaction) with approximately the chi-squared distribution with d.f. = $(r - 1)$. Also, when $Q_C$ does not suggest any departure of the $\hat{\xi}_k$ from homogeneity across the $r$ response variables (or visits), a common estimator

$$\hat{\xi} = (\mathbf{1}' \mathbf{V}_{\hat{\boldsymbol{\xi}}}^{-1} \hat{\boldsymbol{\xi}}) / (\mathbf{1}' \mathbf{V}_{\hat{\boldsymbol{\xi}}}^{-1} \mathbf{1})$$

can be determined by weighted least squares, together with the consistent estimator $v_{\hat{\xi}} = (\mathbf{1}' \mathbf{V}_{\hat{\boldsymbol{\xi}}}^{-1} \mathbf{1})^{-1}$ for its variance on the basis of the methods in Stokes et al. (2000, chap. 13). The corresponding two-sided $100(1 - 2\alpha)\%$ confidence interval is $(\hat{\xi} \pm z_{\alpha} \sqrt{v_{\hat{\xi}}})$.
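
The three calculations in this paragraph (the interval for a linear statistic, the homogeneity statistic $Q_C$, and the common estimator from weighted least squares) can be sketched as follows. This is a minimal sketch under stated assumptions: the estimates `xi` and the covariance matrix `V` are hypothetical numbers rather than output of the Appendix methods, the helper names are illustrative, and a small Gauss-Jordan routine stands in for a matrix library.

```python
import math


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(map(float, matrix[i])) + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * p for a, p in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def linear_interval(c, xi, V, z=1.959964):
    """Interval c'xi +/- z * sqrt(c'Vc); z = 1.959964 gives a two-sided 0.95 interval."""
    estimate = sum(ci * x for ci, x in zip(c, xi))
    variance = sum(c[i] * V[i][j] * c[j] for i in range(len(c)) for j in range(len(c)))
    half = z * math.sqrt(variance)
    return estimate - half, estimate + half


def homogeneity_statistic(xi, V):
    """Q_C = (C xi)' (C V C')^{-1} (C xi) for C = [I_(r-1), -1_(r-1)]."""
    r = len(xi)
    contrast = [xi[i] - xi[r - 1] for i in range(r - 1)]
    cov = [[V[i][j] - V[i][r - 1] - V[r - 1][j] + V[r - 1][r - 1]
            for j in range(r - 1)] for i in range(r - 1)]
    return sum(c * s for c, s in zip(contrast, solve(cov, contrast)))


def common_estimator(xi, V):
    """Weighted least squares estimate 1'V^{-1}xi / 1'V^{-1}1 and its variance."""
    v_inv_one = solve(V, [1.0] * len(xi))   # V^{-1} 1
    denom = sum(v_inv_one)                  # 1' V^{-1} 1
    # 1'V^{-1}xi equals (V^{-1}1)'xi because V is symmetric.
    estimate = sum(a * x for a, x in zip(v_inv_one, xi)) / denom
    return estimate, 1.0 / denom


# Hypothetical estimates for r = 2 visits with a hypothetical covariance matrix.
xi = [0.60, 0.64]
V = [[0.0016, 0.0008], [0.0008, 0.0025]]
lower, upper = linear_interval([0.5, 0.5], xi, V)   # average across the two visits
q_c = homogeneity_statistic(xi, V)                  # compare with chi-squared, d.f. = 1
xi_common, v_common = common_estimator(xi, V)
```

With these hypothetical inputs the common estimator weights the first visit more heavily (0.68 versus 0.32) because its estimate has the smaller variance.
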
The estimator $\hat{\xi}$ represents the stratification adjusted extent of homogeneously better response for one group than the other for all response variables (or visits).

An additional consideration of interest for both the $Q_{VE,k}$ and the $\hat{\xi}_k$ is that an essentially continuous response variable (with no ties among the $R_{hik\ell}$ within the respective strata) has $v_{hk,0} = n_{h*k} / \{12(n_{h*k} + 1)\}$ under $H_0$ and so

$$v_{d,k,0} = \left[ \sum_{h=1}^{q} \left\{ n_{h1k} n_{h2k} / 12(n_{h*k} + 1) \right\} \right] = \left[ \sum_{h=1}^{q} \{ w_{hk} / 12 \} \right]$$

under $H_0$. Since the $\hat{\xi}_{hk}$ have variances $(n_{h*k} + 1) / (12 n_{h1k} n_{h2k}) = 1/(12 w_{hk})$ under $H_0$, $\hat{\xi}_k$ is a weighted average of the $\hat{\xi}_{hk}$ with the reciprocals of the variances of the $\hat{\xi}_{hk}$ under $H_0$ as the weights. Accordingly, the variance of $\hat{\xi}_k$ under $H_0$ is $\left[ 1 / \left\{ 12 \sum_{h=1}^{q} w_{hk} \right\} \right]$; and this quantity is also the variance of $\bar{d}_k = \left( d_k / \sum_{h=1}^{q} w_{hk} \right) = (\hat{\xi}_k - 0.5)$. When the two groups have equal sample sizes within each of the strata and none of the response variables have any missing data, the $n_{hik} = n_{hi} = (n_h / 2)$, the $w_{hk} = n_h^2 / \{4(n_h + 1)\}$; and so the variance of $\hat{\xi}_k$ and $\bar{d}_k$ under $H_0$ is $\left( 1 / \left[ 3 \sum_{h=1}^{q} \left\{ n_h^2 / (n_h + 1) \right\} \right] \right)$.

Another estimator of interest from the $d_{hk}$ is $\tilde{d}_k = \left( \sum_{h=1}^{q} \tilde{w}_{hk} d_{hk} \right) / \left( \sum_{h=1}^{q} \tilde{w}_{hk} \right)$ where $\tilde{w}_{hk} = (n_{h1k} n_{h2k} / n_{h*k})$. This estimator is identical to the estimator from two-way analysis of variance for the difference between the stratification adjusted means for the two groups with respect to the within stratum standardized ranks $\{R_{hik\ell} / (n_{h*k} + 1)\}$ for the $k$th response variable [with these quantities also being called modified ridit scores in the use of the cmh option in the FREQ Procedure of the SAS System to produce $Q_{VE,k}$ in the way discussed by Stokes et al. (2000, chap. 4)]. Accordingly, $(\hat{\xi}_k - 0.5) = \bar{d}_k = \tilde{d}_k \left( \sum_{h=1}^{q} \tilde{w}_{hk} \right) / \left( \sum_{h=1}^{q} w_{hk} \right)$; and when all $n_{hik} = (n_h / 2)$, $(\hat{\xi}_k - 0.5) = \bar{d}_k = \tilde{d}_k \left[ N / \sum_{h=1}^{q} \left\{ n_h^2 / (n_h + 1) \right\} \right]$ where $N = \sum_{h=1}^{q} n_h$. With all $n_h \ge 10$, $n_h^2 / (n_h + 1) \approx (n_h - 1)$, and so $(\hat{\xi}_k - 0.5) = \bar{d}_k \approx \tilde{d}_k (N / (N - q))$, and the variance of $\hat{\xi}_k$ and $\bar{d}_k$ under $H_0$ is about $\{1 / 3(N - q)\}$.
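
The approximation $n_h^2 / (n_h + 1) \approx (n_h - 1)$ rests on an exact identity. The short derivation below is added for illustration only; it uses the symbols already defined ($n_h$, $N$, $q$) and the same assumptions of equal allocation within strata and no missing data.

```latex
\frac{n_h^2}{n_h + 1} = (n_h - 1) + \frac{1}{n_h + 1},
\qquad\text{so}\qquad
\sum_{h=1}^{q} \frac{n_h^2}{n_h + 1} = (N - q) + \sum_{h=1}^{q} \frac{1}{n_h + 1}.
```

With all $n_h \ge 10$, each remainder term is at most $1/11$, so the sum exceeds $(N - q)$ by less than $q/11$, and the exact null variance $1 / \left[ 3 \sum_{h=1}^{q} \{ n_h^2 / (n_h + 1) \} \right]$ is slightly smaller than the approximation $1 / \{3(N - q)\}$.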
Of course, $\mathbf{V}_{\hat{\boldsymbol{\xi}}}$ provides more robust estimators for the variances of the $\hat{\xi}_k$.

2.2 Randomization-Based Covariance Adjustment

Let $m = 1, 2, \ldots, M$ index a set of $M$ numeric covariables that have observation prior to the randomization of the patients to the two groups. These covariables can include dichotomous covariables as 1 versus 0 and categorical covariables (via sets of dichotomous covariables that correspond to all except one of the categories with the excluded category serving as a reference category). Any strictly ordinal covariables would have inclusion for convenience as response variables so as to be taken into account among the elements of $\hat{\boldsymbol{\xi}}$. Also, all of the $M$ covariables are assumed to have no missing data as would usually be the case for baseline data prior to the randomization of the patients to the two groups.

Let $x_{him\ell}$ denote the observed value of the $m$th covariable for the $\ell$th patient in the $i$th group of the $h$th stratum. Let $\bar{x}_{him} = \sum_{\ell=1}^{n_{hi}} x_{him\ell} / n_{hi}$ denote the mean of the $x_{him\ell}$ for the $i$th group of the $h$th stratum. Let

$$g_m = \left( \sum_{h=1}^{q} \tilde{w}_h (\bar{x}_{h1m} - \bar{x}_{h2m}) \Big/ \sum_{h=1}^{q} \tilde{w}_h \right) \quad \text{with} \quad \tilde{w}_h = n_{h1} n_{h2} / (n_{h1} + n_{h2}).$$

Thus, $g_m$ is the estimator from two-way analysis of variance for the difference between the stratification adjusted means of the $m$th covariable for the two groups. Let $\mathbf{g} = (g_1, g_2, \ldots, g_M)'$ and let $\mathbf{f} = (\hat{\boldsymbol{\xi}}', \mathbf{g}')'$. A consistent estimator for the covariance matrix of $\mathbf{f}$ from methods for multivariate U-statistics is $\mathbf{V}_{\mathbf{f}} = \mathbf{H} \mathbf{V}_{\mathbf{G}} \mathbf{H}'$ where $\mathbf{V}_{\mathbf{G}}$ and $\mathbf{H}$ are, respectively, from (A.7) and (A.8) of the Appendix. By construction, the first $r$ elements of $\mathbf{f}$ address comparisons between the two randomized groups for the $r$ response variables with stratification adjusted Mann–Whitney estimators, and the last $M$ elements address comparisons between the two randomized groups for the $M$ covariables with stratification adjusted differences between the corresponding means.
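
Once $\mathbf{f}$ and $\mathbf{V}_{\mathbf{f}}$ are in hand, constraining the covariable differences to zero by weighted least squares, as this section goes on to describe, reduces to the closed forms $\mathbf{b} = \hat{\boldsymbol{\xi}} - \mathbf{V}_{\hat{\boldsymbol{\xi}}\mathbf{g}} \mathbf{V}_{\mathbf{g}}^{-1} \mathbf{g}$ and $\mathbf{V}_{\mathbf{b}} = \mathbf{V}_{\hat{\boldsymbol{\xi}}} - \mathbf{V}_{\hat{\boldsymbol{\xi}}\mathbf{g}} \mathbf{V}_{\mathbf{g}}^{-1} \mathbf{V}_{\hat{\boldsymbol{\xi}}\mathbf{g}}'$. The sketch below illustrates that arithmetic only. The blocks of $\mathbf{V}_{\mathbf{f}}$ are hypothetical numbers rather than output of (A.7) and (A.8), and the function names are illustrative.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(map(float, matrix[i])) + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * p for a, p in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def covariance_adjust(xi, g, V_xi, V_xi_g, V_g):
    """Return b = xi - V_xi_g V_g^{-1} g and V_b = V_xi - V_xi_g V_g^{-1} V_xi_g'.

    xi has length r, g has length M, V_xi is r x r, V_xi_g is r x M, V_g is M x M.
    """
    r, M = len(xi), len(g)
    vg_inv_g = solve(V_g, g)                                   # V_g^{-1} g
    b = [xi[k] - sum(V_xi_g[k][m] * vg_inv_g[m] for m in range(M)) for k in range(r)]
    # Column j of V_g^{-1} V_xi_g' comes from solving against row j of V_xi_g.
    vg_inv_cols = [solve(V_g, V_xi_g[j]) for j in range(r)]
    V_b = [[V_xi[k][j] - sum(V_xi_g[k][m] * vg_inv_cols[j][m] for m in range(M))
            for j in range(r)] for k in range(r)]
    return b, V_b


# Hypothetical inputs: r = 2 response variables and M = 1 covariable.
xi = [0.58, 0.62]                 # stratified Mann-Whitney estimators
g = [0.5]                         # stratified difference in covariable means
V_xi = [[0.0016, 0.0008], [0.0008, 0.0025]]
V_xi_g = [[0.004], [0.002]]
V_g = [[0.04]]
b, V_b = covariance_adjust(xi, g, V_xi, V_xi_g, V_g)
```

In this hypothetical case the random imbalance in the covariable favors group 1 and is positively correlated with both estimators, so both are shifted toward 0.5 and their variances shrink.
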

Since $\mathbf{g}$ would be expected to be null on the basis of randomization of patients to the two groups, randomization-based covariance adjustment of $\hat{\boldsymbol{\xi}}$ is possible by fitting the model $\mathbf{P} = [\mathbf{I}_r, \mathbf{0}_{rM}]'$ to $\mathbf{f}$ by weighted least squares; see Koch et al. (1998) and LaVange et al. (2005). The resulting adjusted counterpart $\mathbf{b}$ for $\hat{\boldsymbol{\xi}}$ is

$$\mathbf{b} = (\mathbf{P}' \mathbf{V}_{\mathbf{f}}^{-1} \mathbf{P})^{-1} \mathbf{P}' \mathbf{V}_{\mathbf{f}}^{-1} \mathbf{f} = (\hat{\boldsymbol{\xi}} - \mathbf{V}_{\hat{\boldsymbol{\xi}}\mathbf{g}} \mathbf{V}_{\mathbf{g}}^{-1} \mathbf{g}),$$

where $\mathbf{V}_{\hat{\boldsymbol{\xi}}\mathbf{g}}$ corresponds to the covariances of $\hat{\boldsymbol{\xi}}$ with $\mathbf{g}$ and $\mathbf{V}_{\mathbf{g}}$ corresponds to the covariance matrix of $\mathbf{g}$. A consistent estimator for the covariance matrix of $\mathbf{b}$ is

$$\mathbf{V}_{\mathbf{b}} = (\mathbf{P}' \mathbf{V}_{\mathbf{f}}^{-1} \mathbf{P})^{-1} = \mathbf{V}_{\hat{\boldsymbol{\xi}}} - \mathbf{V}_{\hat{\boldsymbol{\xi}}\mathbf{g}} \mathbf{V}_{\mathbf{g}}^{-1} \mathbf{V}_{\hat{\boldsymbol{\xi}}\mathbf{g}}'.$$

Additional models that address the variation of the elements of $\mathbf{b}$ that corresponds to the response variables can be fit by weighted least squares. One such model can invoke randomization based covariance adjustment for any strictly ordinal covariables among the response variables by having a structure like $\mathbf{P}$ with rows of 0's corresponding to the strictly ordinal covariables and rows of an identity matrix corresponding to the strictly ordinal response variables. Through the resulting estimator $\mathbf{b}_{\mathrm{adj}}$ with full adjustment for all covariables and the strata as well, confidence intervals can have determination for linear statistics $\mathbf{c}'\mathbf{b}_{\mathrm{adj}}$ in a manner like that previously described for $\mathbf{c}'\hat{\boldsymbol{\xi}}$ in Section 2.1. Similarly, test statistics like $Q_C$ in Section 2.1 are applicable to assessments of homogeneity across the response variables for the adjusted estimators $\mathbf{b}_{\mathrm{adj}}$ for the differences between the two groups. Aspects of the scope of such analyses and the corresponding models have more specific discussion through the examples in Section 3.

Table 1. Distributions of pain status after treatment for four weeks according to center, diagnosis, and treatment (Example 1)

                                          Pain status after treatment for four weeks
Stratum  Center  Diagnosis  Treatment  Excellent  Good  Moderate  Fair  Poor
1        I       A          Test       1          3     2         5     1
1        I       A          Control    2          4     3         4     3
2        I       B          Test       3          10    1         4     2
2        I       B          Control    2          4     1         5     2
3        I       C          Test       6          1     1         1     0
3        I       C          Control    0          5     1         1     3
4        I       D          Test       3          5     1         6     1
4        I       D          Control    3          3     2         4     5
5        II      A          Test       0          4     3         1     8
5        II      A          Control    0          3     3         0     5
6        II      B          Test       2          3     3         0     2
6        II      B          Control    1          8     0         0     5
7        II      C          Test       2          2     1         0     1
7        II      C          Control    1          1     0         1     1
8        II      D          Test       0          1     2         2     3
8        II      D          Control    1          1     1         0     7

3. Examples

Three examples illustrate the methods in this article. In Section 3.1, the first example has a univariate ordered categorical response variable for the comparison of two randomized treatments for the management of chronic pain, and it has eight strata according to the cross-classification of two centers and four diagnoses. There are four ordered categorical response variables at four post-baseline visits for the second example in Section 3.2 which pertains to the comparison of two randomized treatments for a respiratory disorder; and there are four strata according to the cross-classification of two centers and two genders, two covariables as age and baseline response, and no missing data (because the no longer available original data had any missing values for a patient replaced by their last observation carried forward (LOCF)).
A crossover design with two randomized sequence groups to compare two treatments for pain from osteoarthritis during two periods is discussed in Section 3.3; it has four strata according to baseline pain severity, five ordinal response variables for pain at five visits according to a visual analogue scale with the first two visits being covariables prior to treatment, age as an additional covariable, and substantial missing data during the last two visits with occurrence after the end of the first treatment period.

3.1 Randomized Clinical Trial of Chronic Pain with Univariate Ordinal Response Variable

The data in Table 1 are from a multicenter randomized clinical trial to compare test and control treatments for the management of chronic pain, and they have had previous consideration in Stokes et al. (2000, chap. 13). This clinical trial has 8 strata for which the range of sample sizes is 10 to 34 and a univariate ordinal response variable with 5 categories (as excellent, good, moderate, fair, poor) for pain status after treatment for 4 weeks. The 8 strata correspond to the cross-classification of two centers (as I, II) and four diagnoses (as A, B, C, D).

As noted by Stokes et al. (2000, chap. 13), the Mann–Whitney estimator for the probability of better response with test treatment than control for each stratum separately can have determination through the MEASURES option of the FREQ Procedure in the SAS System via {(Somers' D C|R) + 1}/2. A summary of these Mann–Whitney estimators with their standard errors and sample sizes is in Table 2. For six of the eight strata, the Mann–Whitney estimators support better response for the test treatment by exceeding 0.50, whereas only two strata have the Mann–Whitney estimator supporting better response for control by being lower than 0.50.
The usual method for the comparison of the test and control treatments for the combined strata is the van Elteren (1960) extension of the Wilcoxon rank sum statistic; see van Elteren (1960) and Lehmann (1975).

Table 2. Mann–Whitney estimators with standard errors and sample sizes for strata according to center and diagnosis (Example 1)

| Stratum | Center | Diagnosis | Sample size | Mann–Whitney estimator | Standard error |
|---|---|---|---|---|---|
| 1 | I | A | 28 | 0.492 | 0.106 |
| 2 | I | B | 34 | 0.595 | 0.096 |
| 3 | I | C | 19 | 0.839 | 0.092 |
| 4 | I | D | 33 | 0.601 | 0.096 |
| 5 | II | A | 27 | 0.469 | 0.105 |
| 6 | II | B | 24 | 0.529 | 0.115 |
| 7 | II | C | 10 | 0.604 | 0.181 |
| 8 | II | D | 18 | 0.600 | 0.126 |

As noted by Stokes et al. (2000, chap. 4), determination of the van Elteren statistic is possible with the FREQ Procedure in the SAS System through specifications of "tables strata * treatments * response / cmh2 scores=modridit". For the data in Table 1, such computation produces $Q_{VE} = 3.89$ for the version of the van Elteren statistic with an approximately chi-squared distribution with one degree of freedom, and thereby two-sided $p = 0.0486$. Thus, pain status after treatment for 4 weeks is significantly better for patients with test treatment than for patients with control.

The van Elteren statistic addresses the comparison between test and control through $d$ in (1),

$$d = \sum_{h=1}^{8} \tilde{w}_h (\bar{R}_{hT} - \bar{R}_{hC})/(n_h + 1), \qquad (1)$$

where $h = 1, 2, \ldots, 8$ denotes the strata, $i = T, C$ denotes the treatments, $\bar{R}_{hi}$ denotes the mean of the ranks for the $i$th treatment within the $h$th stratum (with higher ranks for better response), $n_h = (n_{hT} + n_{hC})$ is the sample size for the $h$th stratum (with $n_{hi}$ as the sample size for the $i$th treatment in the $h$th stratum), and $\tilde{w}_h = (n_{hT} n_{hC}/n_h)$. Since $(\bar{R}_{hT} - \bar{R}_{hC})/n_h = (\hat{\xi}_h - 0.5)$ where $\hat{\xi}_h$ is the Mann–Whitney estimator for the $h$th stratum, $d = \sum_{h=1}^{8} \{\tilde{w}_h n_h/(n_h + 1)\}\{\hat{\xi}_h - 0.5\}$.
With the methods in this article, the difference between the test and control treatments has description with $\hat{\xi}$ as shown in (2),

$$(\hat{\xi} - 0.5) = \sum_{h=1}^{8} w_h (\hat{\xi}_h - 0.5) = \sum_{h=1}^{8} w_h (\bar{R}_{hT} - \bar{R}_{hC})/n_h, \qquad (2)$$

where $w_h = \{n_{hT} n_{hC}/(n_h + 1)\} / \sum_{h'=1}^{8} \{n_{h'T} n_{h'C}/(n_{h'} + 1)\}$. Also of interest for such description is $\tilde{d} = (d / \sum_{h=1}^{q} \tilde{w}_h)$ through two-way analysis of variance (ANOVA) for the $\{R_{hi\ell}/(n_h + 1)\}$ where $R_{hi\ell}$ denotes the rank within the $h$th stratum for the $\ell$th patient in the $i$th treatment group. The result from this computation is $\tilde{d} = 0.0772$, and its standard error under the global null hypothesis $H_0$ of no treatment differences in all of the strata is $\widetilde{\mathrm{s.e.}}(\tilde{d}) = \sqrt{\tilde{d}^2/Q_{VE}} = 0.0392$; but this result does not enable the determination of a confidence interval based on $\tilde{d}$ because $\widetilde{\mathrm{s.e.}}(\tilde{d})$ is applicable only under $H_0$.

Alternatively, $\hat{\xi} = 0.5804$ and its robust standard error from the methods in the Appendix is $\mathrm{s.e.}(\hat{\xi}) = 0.0417$. The resulting two-sided 0.95 confidence interval for $\xi$ is (0.4988, 0.6621). Since this confidence interval includes 0.50 as the value of $\xi$ for no difference between the test and control treatments, it is slightly in conflict with its counterpart for this comparison of $p = 0.0486$ from the van Elteren statistic. A possible reason for this slight disagreement in conclusions from the van Elteren statistic and the confidence interval for $\xi$ is that $\mathrm{s.e.}(\hat{\xi})$ may slightly overestimate the standard error of $\hat{\xi}$, perhaps because the $n_{hi}$ are managed in the Appendix as random rather than fixed. In this regard, when the $n_{hi}$ are sufficiently large (e.g., all $n_{hi} \geq 20$), an alternative estimator for the standard error of $\hat{\xi}$ is $\mathrm{s.e.}_w(\hat{\xi}) = \{\sum_{h=1}^{q} w_h^2 [\mathrm{s.e.}(\hat{\xi}_h)]^2\}^{1/2}$ where the $\mathrm{s.e.}(\hat{\xi}_h)$ are from the methods described for Table 2. Since $\mathrm{s.e.}_w(\hat{\xi}) = 0.0390$ for the data in Table 1 (and the results shown for it in Table 2), the corresponding 0.95 confidence interval for $\xi$ is (0.5040, 0.6568), for which the lower limit exceeds 0.50.
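The stratified quantities in this passage can be recomputed from Table 1 and the standard errors in Table 2. The sketch below is illustrative only: the variable names are assumptions, the Table 2 standard errors are taken as given (they are not rederived), and 1.96 is used as the normal quantile for the 0.95 interval.

```python
import math

# (test counts, control counts) per stratum, Excellent..Poor, from Table 1.
COUNTS = [
    ([1, 3, 2, 5, 1], [2, 4, 3, 4, 3]),
    ([3, 10, 1, 4, 2], [2, 4, 1, 5, 2]),
    ([6, 1, 1, 1, 0], [0, 5, 1, 1, 3]),
    ([3, 5, 1, 6, 1], [3, 3, 2, 4, 5]),
    ([0, 4, 3, 1, 8], [0, 3, 3, 0, 5]),
    ([2, 3, 3, 0, 2], [1, 8, 0, 0, 5]),
    ([2, 2, 1, 0, 1], [1, 1, 0, 1, 1]),
    ([0, 1, 2, 2, 3], [1, 1, 1, 0, 7]),
]
# Within-stratum standard errors as reported in Table 2 (not recomputed).
SE_TABLE2 = [0.106, 0.096, 0.092, 0.096, 0.105, 0.115, 0.181, 0.126]


def mann_whitney(test, control):
    """P(test better) + 0.5 * P(tie); index 0 is the best category."""
    better = sum(nt * nc for a, nt in enumerate(test)
                 for b, nc in enumerate(control) if a < b)
    ties = sum(nt * nc for nt, nc in zip(test, control))
    return (better + 0.5 * ties) / (sum(test) * sum(control))


xi_h, w_raw, w_tilde = [], [], []
for test, control in COUNTS:
    n_t, n_c = sum(test), sum(control)
    n = n_t + n_c
    xi_h.append(mann_whitney(test, control))
    w_raw.append(n_t * n_c / (n + 1))   # numerator of w_h in (2)
    w_tilde.append(n_t * n_c / n)       # van Elteren weight in (1)

w = [x / sum(w_raw) for x in w_raw]     # normalized weights w_h
xi_hat = sum(wh * x for wh, x in zip(w, xi_h))

# d from (1), written in terms of the xi_h, and its rescaled version.
d = sum(wr * (x - 0.5) for wr, x in zip(w_raw, xi_h))
d_tilde = d / sum(w_tilde)

# Alternative standard error: square root of the weighted sum of variances.
se_w = math.sqrt(sum((wh * se) ** 2 for wh, se in zip(w, SE_TABLE2)))
ci_w = (xi_hat - 1.96 * se_w, xi_hat + 1.96 * se_w)

print(round(xi_hat, 4), round(d_tilde, 4), round(se_w, 4))
print(tuple(round(x, 4) for x in ci_w))
```

The robust standard error 0.0417 requires the U-statistic covariance estimator of the Appendix and is not computed in this sketch.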
However, $\mathrm{s.e.}_w(\hat{\xi})$ tends to be somewhat smaller than appropriate to support 0.95 coverage for this type of confidence interval when many of the $n_{hi}$ are only minimally large (e.g., $4 \leq n_{hi} \leq 10$), particularly for ordered categorical response variables with many ties in their within stratum rankings. Otherwise, as previously noted, the van Elteren statistic would usually be the primary method for testing the global null hypothesis $H_0$, and the methods in this article would have a supportive role in terms of providing a correspondingly related confidence interval.

Counterparts to $\hat{\xi}$ can have separate determination for each center and for each diagnosis, together with corresponding standard errors. These results for centers I, II and for diagnoses A, B, C, D are shown in Table 3.

Table 3. Mann–Whitney estimators for centers adjusted for strata by diagnosis and Mann–Whitney estimators for diagnoses adjusted for strata by center, with standard errors and sample sizes (Example 1)

| Subpopulation | Sample size | Mann–Whitney estimator | Standard error |
|---|---|---|---|
| Center I | 114 | 0.6121 | 0.0544 |
| Center II | 79 | 0.5335 | 0.0638 |
| Diagnosis A | 55 | 0.4808 | 0.0753 |
| Diagnosis B | 58 | 0.5674 | 0.0750 |
| Diagnosis C | 29 | 0.7622 | 0.0965 |
| Diagnosis D | 51 | 0.6007 | 0.0770 |

Homogeneity of the estimators for the two centers or of the estimators for the four diagnoses can have assessment with $Q_H = \sum_{\tilde{h}} \{(\hat{\xi}_{\tilde{h}} - \bar{\xi})/\mathrm{s.e.}(\hat{\xi}_{\tilde{h}})\}^2$ where $\tilde{h}$ corresponds to I and II for centers or A, B, C, D for diagnoses and $\bar{\xi}$ is the weighted mean of the $\hat{\xi}_{\tilde{h}}$ with $(1/\mathrm{s.e.}(\hat{\xi}_{\tilde{h}}))^2$ as the weights. For centers, $Q_H = 0.88$ with $p = 0.3483$ from an approximate chi-squared distribution with d.f. = 1; and for diagnoses $Q_H = 5.39$ with $p = 0.1455$ from an approximate chi-squared distribution with d.f. = 3. Thus, there is no suggestion of departures from homogeneity across centers or across diagnoses.
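The homogeneity criterion $Q_H$ is simple enough to check by hand. The following sketch applies it to the (estimate, standard error) pairs of Table 3; the function names are illustrative, and the chi-squared upper tail is evaluated with closed forms that hold only for 1 and 3 degrees of freedom, which are the two cases needed here.

```python
import math


def homogeneity_q(estimates, std_errors):
    """Q_H: squared standardized deviations from the inverse-variance weighted mean."""
    weights = [1.0 / se ** 2 for se in std_errors]
    mean = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    return sum(((e - mean) / se) ** 2 for e, se in zip(estimates, std_errors))


def chi2_upper_tail(x, df):
    """Upper-tail chi-squared probability; closed forms for df = 1 and df = 3 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 3:
        return (math.erfc(math.sqrt(x / 2.0))
                + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))
    raise ValueError("only df = 1 or df = 3 is handled in this sketch")


# (estimate, standard error) pairs from Table 3.
centers = [(0.6121, 0.0544), (0.5335, 0.0638)]
diagnoses = [(0.4808, 0.0753), (0.5674, 0.0750), (0.7622, 0.0965), (0.6007, 0.0770)]

q_centers = homogeneity_q(*zip(*centers))
p_centers = chi2_upper_tail(q_centers, df=1)
q_diag = homogeneity_q(*zip(*diagnoses))
p_diag = chi2_upper_tail(q_diag, df=3)

print(round(q_centers, 2), round(p_centers, 3))
print(round(q_diag, 2), round(p_diag, 3))
```

Because the inputs are the rounded values from Table 3, the results can differ from the reported $Q_H$ and $p$ in the last digit.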
For homogeneity of all the $\hat{\xi}_{\tilde{h}}$ across centers*diagnoses, $Q_H = 9.31$ with $p = 0.2314$ from an approximately chi-squared distribution with d.f. = 7, although sample sizes are not really large enough for this application. Nevertheless, there is no suggestion of departure from homogeneity.

Another method of interest for this example is the proportional odds logistic regression model for the cumulative probabilities of better response; see Stokes et al. (2000, chap. 9). For the specification of this model with main effects for treatments, centers, and diagnoses (as a categorical variable with four classes), the score test for the proportional odds assumption from PROC LOGISTIC in the SAS System has $p = 0.072$ from a chi-squared statistic $Q_S = 23.61$ with d.f. = 15, and so the applicability of this model has this limitation. Nevertheless, the estimated odds ratio from this model for better response with test treatment is 1.82 with (1.09, 3.05) as the corresponding 0.95 confidence interval and two-sided $p = 0.022$. As a further consideration, the use of this method suggests treatment*diagnosis interaction to some extent via $p = 0.032$ when it has additional inclusion in the model; but the score test for the proportional odds assumption for this expanded model has $p = 0.018$ for $Q_S = 40.60$ with d.f. = 24 (although this result might exaggerate departures from the model because relatively small counts for the diagnosis*treatment*pain status cross-classification could undermine its chi-squared approximation). Thus, how to use the proportional odds model for this example and how to interpret its results are unclear from the perspective of robustness.

3.2 Randomized Clinical Trial with Ordinal Responses at Four Visits for a Respiratory Disorder

This example addresses a randomized clinical trial to compare a test treatment to placebo for a respiratory disorder, and listings of the data appear in Stokes et al. (2000, chap. 15, pp. 495–496) and Koch et al.
(1990). This clinical trial had 111 patients from two centers, and it had four post-baseline visits with corresponding ordinal response variables for patient global ratings of symptom control according to 5 categories (as excellent, good, fair, poor, terrible). This example has gender (female or male) as an additional factor for stratification, and so there are four strata for center*gender; and it has age and the baseline rating of symptom control (with the same ordinal categories as the response variables) as two covariables.

For this example, the methods in this article have invocation with four strata according to center*gender, age as a numeric covariable, the baseline rating of symptom control as an ordinal covariable, and the four ratings of symptom control at the four post-baseline visits as the four ordinal response variables. The resulting vector of stratification adjusted estimators for the comparisons between the test treatment and placebo contains $\hat{\xi}_0, \hat{\xi}_1, \hat{\xi}_2, \hat{\xi}_3, \hat{\xi}_4$ in (3) for the Mann–Whitney estimators that correspond to the baseline visit and visits 1, 2, 3, 4 and for the difference between mean ages $g$ (with $\hat{\xi}_0$ and $g$ representing covariables). The corresponding estimated covariance matrix from the methods in the Appendix is in (4).

$$\mathbf{f} = [\hat{\xi}_0, \hat{\xi}_1, \hat{\xi}_2, \hat{\xi}_3, \hat{\xi}_4, g]' = [0.4799, 0.6005, 0.7139, 0.6535, 0.6155, 1.0501]' \qquad (3)$$

$$\mathbf{V}_{\mathbf{f}} = \begin{bmatrix} 31.90 & 15.23 & 8.76 & 8.76 & 8.82 & 171.74 \\ 15.23 & 28.52 & 14.10 & 14.08 & 13.21 & -4.36 \\ 8.76 & 14.10 & 23.36 & 16.59 & 16.38 & -163.27 \\ 8.76 & 14.08 & 16.59 & 28.34 & 20.84 & -157.79 \\ 8.82 & 13.21 & 16.38 & 20.84 & 27.83 & -4.44 \\ 171.74 & -4.36 & -163.27 & -157.79 & -4.44 & 68220.05 \end{bmatrix} \times 10^{-4} \qquad (4)$$

Since both $(\hat{\xi}_0 - 0.5)$ and $g$ have null expected values on the basis of randomization of patients to the two treatment groups (and $Q_{\mathbf{C}} = 0.33$ with d.f. = 2 and $p = 0.850$ for the corresponding assessment of random imbalance), randomization-based covariance adjustment of $(\hat{\xi}_1 - 0.5)$, $(\hat{\xi}_2 - 0.5)$, $(\hat{\xi}_3 - 0.5)$, $(\hat{\xi}_4 - 0.5)$ is possible by fitting the model $\mathbf{P} = [\mathbf{0}_4, \mathbf{I}_4, \mathbf{0}_4]'$ to $\mathbf{f}$ (after subtraction of 0.5 from each of the $\hat{\xi}_k$) by weighted least squares; here $\mathbf{0}_4$ is the $4 \times 1$ vector of 0's and $\mathbf{I}_4$ is the $4 \times 4$ identity matrix. The resulting adjusted estimates $\mathbf{b}$ from the methods in Section 2.2 are in (5), and the corresponding estimated covariance matrix $\mathbf{V}_{\mathbf{b}}$ is in (6):

$$\mathbf{b} = [0.1116, 0.2230, 0.1625, 0.1219]' \qquad (5)$$

$$\mathbf{V}_{\mathbf{b}} = \begin{bmatrix} 21.14 & 9.65 & 9.63 & 8.93 \\ 9.65 & 20.30 & 13.55 & 13.80 \\ 9.63 & 13.55 & 25.31 & 18.26 \\ 8.93 & 13.80 & 18.26 & 25.35 \end{bmatrix} \times 10^{-4}. \qquad (6)$$

The statistics $\{b_k/\mathrm{s.e.}(b_k)\}^2$ have approximately chi-squared distributions with d.f. = 1 under the null hypotheses $H_{0k}$: $\xi_k = 0.5$. The respective p-values are 0.0152, < 0.0001, 0.0012, 0.0155 for visits 1, 2, 3, 4. Homogeneity of the $\xi_k$ across the four visits can have assessment with $Q_{C_1} = \mathbf{b}'\mathbf{C}_1'(\mathbf{C}_1 \mathbf{V}_{\mathbf{b}} \mathbf{C}_1')^{-1}\mathbf{C}_1\mathbf{b}$ where $\mathbf{C}_1 = [\mathbf{I}_3, -\mathbf{1}_3]$ and $\mathbf{1}_3$ is the $(3 \times 1)$ vector of 1's. Since $Q_{C_1} = 8.93$ has $p = 0.0302$ with respect to its approximate chi-squared distribution with d.f. = 3, there is some suggestion of departures of the $\hat{\xi}_k$ from homogeneity; in this regard, the difference between test treatment and placebo tends to be larger at Visit 2 and Visit 3 than at Visit 1 and Visit 4. A comparison between treatments for the average of the $\hat{\xi}_k$ across the 4 visits is possible with $Q_{C_2} = \mathbf{b}'\mathbf{C}_2'(\mathbf{C}_2 \mathbf{V}_{\mathbf{b}} \mathbf{C}_2')^{-1}\mathbf{C}_2\mathbf{b}$ with $\mathbf{C}_2 = \mathbf{1}_4'$; its result is $Q_{C_2} = 15.99$ for which two-sided $p < 0.0001$ with respect to the approximate chi-squared distribution with d.f. = 1.

The results summarized here for this example are similar to their counterparts in Table VIII and its discussion in Koch et al. (1998) for which there was stratification for only centers with gender, integer scores for baseline, and age having management as numeric covariables.
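The adjustment from (3) and (4) to (5) and (6) can be retraced numerically. The sketch below does not build the design matrix $\mathbf{P}$ explicitly; it assumes the standard partitioned form of the same weighted least squares fit, in which the responses are corrected by their regression on the covariable differences that are constrained to 0. All variable names are illustrative.

```python
# f from (3) and V_f from (4); V_RAW is in units of 1e-4.
f = [0.4799, 0.6005, 0.7139, 0.6535, 0.6155, 1.0501]
V_RAW = [
    [31.90, 15.23, 8.76, 8.76, 8.82, 171.74],
    [15.23, 28.52, 14.10, 14.08, 13.21, -4.36],
    [8.76, 14.10, 23.36, 16.59, 16.38, -163.27],
    [8.76, 14.08, 16.59, 28.34, 20.84, -157.79],
    [8.82, 13.21, 16.38, 20.84, 27.83, -4.44],
    [171.74, -4.36, -163.27, -157.79, -4.44, 68220.05],
]
V = [[v * 1e-4 for v in row] for row in V_RAW]

cov = [0, 5]          # positions of the covariables (baseline xi_0 and age g)
resp = [1, 2, 3, 4]   # positions of the four post-baseline responses

# Quantities with null expectation under randomization.
fx = [f[0] - 0.5, f[5]]
fy = [f[k] - 0.5 for k in resp]

# Explicit inverse of the 2x2 covariable block V_xx.
a11, a12, a21, a22 = V[0][0], V[0][5], V[5][0], V[5][5]
det = a11 * a22 - a12 * a21
Vxx_inv = [[a22 / det, -a12 / det], [-a21 / det, a11 / det]]

# gain[k] = V_yx[k, :] @ V_xx^{-1} for each response position k.
gain = {
    k: [sum(V[k][cov[i]] * Vxx_inv[i][j] for i in range(2)) for j in range(2)]
    for k in resp
}

# Adjusted estimates b = f_y - V_yx V_xx^{-1} f_x.
b = [fy[i] - sum(gain[k][j] * fx[j] for j in range(2)) for i, k in enumerate(resp)]

# Adjusted covariance V_b = V_yy - V_yx V_xx^{-1} V_xy.
Vb = [[V[k][l] - sum(gain[k][j] * V[cov[j]][l] for j in range(2)) for l in resp]
      for k in resp]

# Random imbalance criterion for the covariables: f_x' V_xx^{-1} f_x.
q_imbalance = sum(fx[i] * Vxx_inv[i][j] * fx[j] for i in range(2) for j in range(2))

print([round(x, 4) for x in b])
print([round(Vb[i][i] * 1e4, 2) for i in range(4)])
print(round(q_imbalance, 2))
```

Since the inputs are the rounded values printed in (3) and (4), agreement with (5), (6), and $Q_{\mathbf{C}} = 0.33$ should be expected only up to rounding in the last reported digit.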
An important advantage of the methods in this article is their better applicability to situations with smaller sample sizes in the strata.

The previously described application of Mann–Whitney estimators (with both stratification adjustment and randomization based covariance adjustment) is in harmony with other methods for this example. In this regard, Koch et al. (1990) provided results from the van Elteren statistic and an extension with rank analysis of covariance. As noted there, an advantage of these methods is that they can have exact determination of p-values for the comparison of two treatments, but a limitation is that they do not conveniently provide a corresponding confidence interval for a measure of the difference between the two treatments. Results from proportional odds logistic models for each visit separately also have presentation in Koch et al. (1990), and the discussion there indicates that the proportional odds assumption has reasonable applicability for Visits 1, 2, 3, but not Visit 4. An illustration of the proportional odds logistic model for all four visits simultaneously is in Stokes et al. (2000, chap. 15), and it identifies treatment*visit interaction for this example to a similar extent as $Q_{C_1}$. Tangen and Koch (1999) considered applications of randomization based covariance adjustment in the setting of a proportional odds logistic model for this example, both for each visit separately and all visits simultaneously. An advantage of the methods in Tangen and Koch (1999) is that they only invoke the proportional odds assumption for the population average odds ratio for the comparisons between the two treatments, but a potential limitation is their need for at least moderately large sample sizes within the strata.
The main advantages of the methods in this article for Mann–Whitney estimators are that their scope can be multivariate, they do not require assumptions like proportional odds, they do not require assumptions about the relationship of ordinal response variables to strata or covariables, and they have applicability when the overall sample sizes of each treatment group for the combined strata are at least moderately large with the sample sizes within the strata not being overly small.

3.3 Randomized Two-Period Crossover Clinical Trial for Osteoarthritis with Ordinal Responses

The data for this example are from a two period crossover clinical trial reported by Pincus, Koch, Sokka et al. (2001) for the comparison of a test treatment and a control treatment for the management of pain for patients with osteoarthritis of the hip or knee. This clinical trial included 227 patients with 112 patients in the T:C group for test treatment T during the first period and control treatment C during the second period and 115 patients in the C:T group for the opposite order of the two treatments; and these patients had assessments of pain according to an ordinal visual analogue scale (with a range of 0 to 100 for no pain to very severe pain) at five visits as follows: screening (Visit 1), baseline prior to period 1 of treatment (Visit 2), end of period 1 (Visit 3), baseline prior to period 2 of treatment (Visit 4), and end of period 2 (Visit 5). To illustrate the methods in this article, this example has four strata according to the pain severity index in Table 4 of Pincus et al., and age serves as a numeric covariable. For each of the four strata, the sample sizes that apply at the respective visits and for age are in Table 4 for each of the two sequence groups of the two treatments.

Table 4. Sample sizes for ordinal response variables and age for two sequence groups of two treatments within four strata according to pain severity index (Example 3)

| Pain severity index | Treatment sequence | Visit 1 | Visit 2 | Visit 3 | Visit 4 | Visit 5 | Age |
|---|---|---|---|---|---|---|---|
| 0 | T:C | 11 | 11 | 10 | 8 | 8 | 11 |
| 0 | C:T | 15 | 15 | 15 | 12 | 12 | 15 |
| 1 | T:C | 26 | 26 | 26 | 22 | 22 | 26 |
| 1 | C:T | 24 | 24 | 22 | 20 | 19 | 24 |
| 2 | T:C | 39 | 39 | 37 | 32 | 32 | 39 |
| 2 | C:T | 35 | 35 | 34 | 27 | 26 | 35 |
| 3 | T:C | 36 | 36 | 33 | 33 | 31 | 36 |
| 3 | C:T | 41 | 41 | 41 | 33 | 32 | 41 |

On the basis of the methods in this article, the vector of stratification adjusted estimators for the comparisons between the T:C and C:T sequence groups is in (7) for the Mann–Whitney estimators $\hat{\xi}_1, \hat{\xi}_2, \hat{\xi}_3, \hat{\xi}_4, \hat{\xi}_5$ that correspond to visits 1, 2, 3, 4, 5 and for the difference between mean ages $g$ (with $\hat{\xi}_1$, $\hat{\xi}_2$, and $g$ representing covariables). The corresponding estimated covariance matrix $\mathbf{V}_{\mathbf{f}}$ from the methods in the Appendix is in (8).
$$\mathbf{f} = [\hat{\xi}_1, \hat{\xi}_2, \hat{\xi}_3, \hat{\xi}_4, \hat{\xi}_5, g]' = [0.5184, 0.5943, 0.6467, 0.6058, 0.3302, -0.3847]' \qquad (7)$$

$$\mathbf{V}_{\mathbf{f}} = \begin{bmatrix} 16.08 & 5.48 & 2.06 & 3.54 & 1.67 & -82.24 \\ 5.48 & 14.65 & 4.21 & 6.87 & 2.81 & -1.44 \\ 2.06 & 4.21 & 15.50 & 9.89 & 2.96 & 30.22 \\ 3.54 & 6.87 & 9.89 & 19.55 & 6.30 & -20.89 \\ 1.67 & 2.81 & 2.96 & 6.30 & 18.01 & -27.84 \\ -82.24 & -1.44 & 30.22 & -20.89 & -27.84 & 20405.11 \end{bmatrix} \times 10^{-4} \qquad (8)$$

Since $(\hat{\xi}_1 - 0.5)$, $(\hat{\xi}_2 - 0.5)$, and $g$ have null expected values on the basis of randomization of patients to the T:C and C:T sequence groups (and $Q_{\mathbf{C}} = 6.38$ with d.f. = 3 and $p = 0.095$ for the corresponding assessment of random imbalance), randomization based covariance adjustment of $(\hat{\xi}_3 - 0.5)$, $(\hat{\xi}_4 - 0.5)$, $(\hat{\xi}_5 - 0.5)$ is possible by fitting the model $\mathbf{P} = [\mathbf{0}_3, \mathbf{0}_3, \mathbf{I}_3, \mathbf{0}_3]'$ to $\mathbf{f}$ (after subtraction of 0.5 from each of the $\hat{\xi}_k$) by weighted least squares. The resulting adjusted estimators from the methods in Section 2.2 are shown in (9) with the corresponding estimated covariance matrix.

$$\mathbf{b} = [0.1210, 0.0624, -0.1877]', \qquad \mathbf{V}_{\mathbf{b}} = \begin{bmatrix} 14.23 & 7.91 & 2.17 \\ 7.91 & 16.25 & 4.92 \\ 2.17 & 4.92 & 17.42 \end{bmatrix} \times 10^{-4}. \qquad (9)$$

The statistics $\{b_k/\mathrm{s.e.}(b_k)\}^2$ have approximately chi-squared distributions with d.f. = 1 under the null hypotheses $H_{0k}$: $\xi_k = 0.5$. The respective p-values are 0.0013, 0.1217, [...] $Q_{\mathbf{C}} = 5.20$ with $p = 0.0225$ is contrary to homogeneity, the extent of better response for T than C is larger at Visit 5 for the second period than at Visit 3 for the first period. Nevertheless, both of the treatment comparisons for Visit 3 and Visit 5 favor T over C, and so their average as $(b_{3,\mathrm{adj}} - b_{5,\mathrm{adj}})/2$ is of interest through the estimator $((0.5, -0.5)\mathbf{b}_{\mathrm{adj}} + 0.5) = 0.649$. The corresponding 0.95 confidence interval is (0.598, 0.699) and $p < 0.0001$.

For this example, homogeneity across the four strata is of interest for the counterparts of the fully adjusted Mann–Whitney estimator within the four strata.
These estimators $b_{h*,\mathrm{adj}} = (b_{h3,\mathrm{adj}} - b_{h5,\mathrm{adj}})/2$ and their corresponding standard errors for the strata $h = 0, 1, 2, 3$ are as follows: $b_{0*,\mathrm{adj}} = 0.5021$ with $\mathrm{s.e.}(b_{0*,\mathrm{adj}}) = 0.0774$, $b_{1*,\mathrm{adj}} = 0.6490$ with $\mathrm{s.e.}(b_{1*,\mathrm{adj}}) = 0.0466$, $b_{2*,\mathrm{adj}} = 0.6410$ with $\mathrm{s.e.}(b_{2*,\mathrm{adj}}) = 0.0458$, $b_{3*,\mathrm{adj}} = 0.7112$ with $\mathrm{s.e.}(b_{3*,\mathrm{adj}}) = 0.0372$. Their homogeneity can have assessment with the criterion indicated in Section 3.1 for the first example. Since $Q_H = 6.28$ with d.f. = 3 has $p = 0.0987$, there is some suggestion of heterogeneity across the strata for the differences between T and C with such differences being larger for the more severe strata than the less severe strata.

The assumption of missing completely at random (MCAR) may not be realistic for Visit 4 and Visit 5 of this example, and so some caution is necessary for the interpretation of the results that involve the comparison between T and C during the second period. More positively, such results are still worthy of attention because they have the rankings of observed data at Visit 4 and Visit 5 as their principal basis; moreover, they have the rankings of the observed data at Visits 1, 2, 3 and the covariance structure of the rankings of the observed data at all visits serving as a supporting basis.

4. Discussion

This article provides methodology and applications for a stratified Mann–Whitney estimator that addresses the same comparison between two randomized groups for a strictly ordinal response variable as the van Elteren test statistic. Through its structure as the ratio of two U-statistics, this estimator can have a consistent estimator for its corresponding variance constructed in the manner shown in the Appendix.
Moreover, the methods in the Appendix provide a consistent estimator of the covariance matrix for a vector of stratified Mann–Whitney estimators that correspond to a vector of strictly ordinal response variables that may have some missing completely at random (MCAR) values for some patients, and their viability only needs the sample sizes for each group within each stratum to be at least minimal (e.g., $\geq 4$) in combination with at least moderate overall sample sizes for each group (e.g., $\geq 50$). As noted for the example in Section 3.1, the availability of at least moderately large sample sizes for each group in each stratum (e.g., all $n_{hi} \geq 30$) can support the determination of consistent estimators of the covariance matrices of multivariate Mann–Whitney estimators on a within stratum basis by the methods in the Appendix. Accordingly, the estimated covariance matrix for stratified multivariate Mann–Whitney estimators can then have construction from these within stratum estimated covariance matrices by the usual methods for linear functions of estimators. Also, when the $n_{hi}$ are clearly large (e.g., all $n_{hi} \geq 100$), weighted least squares methods can have use as described in Stokes et al. (2000, chap. 13.8) to fit models which describe the variation of the within stratum Mann–Whitney estimators across both strata and response variables with these models including that which invokes homogeneity across strata.

Randomization-based covariance adjustment is possible for stratified multivariate Mann–Whitney estimators by expanding the vector of such estimators to include stratified differences between means of covariables. The latter estimators for the covariables then have constraints to 0's invoked by weighted least squares methods with weights based on the estimated covariance matrix for the expanded vector from the methods for ratios of multivariate U-statistics in the Appendix.
The resulting estimators are stratified multivariate Mann–Whitney estimators with randomization based covariance adjustment, and they address the comparison between two randomized groups in terms of the probability of better response for a patient in one group than a patient in the other group (with ties randomly broken with probability 0.5). With sufficiently large sample sizes as discussed previously, such estimators have an approximately multivariate normal distribution with the covariance matrix being essentially known through its corresponding consistent estimator from the methods in the Appendix. Accordingly, confidence intervals can have construction for linear functions of such fully adjusted Mann–Whitney estimators (with respect to both stratification and covariables), and the scope of such linear functions can include the separate response variables, averages across response variables, and contrasts among response variables. As discussed by Koch et al. (1998), however, randomization based methods for covariance adjustment have the limitation of not providing estimates for the covariables or the strata; and so they do not have an underlying structure for addressing interactions of treatments with covariables or strata. Nevertheless, their separate invocation within the strata can enable assessment of homogeneity of treatment effects across the strata in the ways illustrated in Sections 3.1 and 3.3 (i.e., treatments*strata interaction). Otherwise, since the methods in this article have random allocation of patients to treatments as their basis, they are not applicable to observational studies without any type of randomization.
When there is some uncertainty as to whether sample sizes are large enough to support approximately normal distributions for the adjusted Mann–Whitney estimators, there are some strategies that can improve their statistical behavior in terms of Type I error for test statistics and coverage of confidence intervals. One such strategy is to use logit transformations $\hat{\lambda}_k = \log_e\{\hat{\xi}_k/(1 - \hat{\xi}_k)\}$ for which the corresponding covariance matrix for the vector $\hat{\boldsymbol{\lambda}} = (\hat{\lambda}_1, \ldots, \hat{\lambda}_r)'$ is $\mathbf{V}_{\hat{\boldsymbol{\lambda}}} = \mathbf{D}_{\hat{\boldsymbol{\eta}}}^{-1} \mathbf{V}_{\hat{\boldsymbol{\xi}}} \mathbf{D}_{\hat{\boldsymbol{\eta}}}^{-1}$ for which $\hat{\boldsymbol{\eta}} = \mathbf{D}_{\hat{\boldsymbol{\xi}}}(\mathbf{1} - \hat{\boldsymbol{\xi}})$. As noted by Carr, Hafner, and Koch (1989), Jung and Koch (1998, 1999), and Kawaguchi and Koch (2010), such logit transformations correspond to the Fisher (1925) transformation of Somers' version of the Kendall Tau rank correlation coefficient. Also, simulation studies of Carr et al. (1989) and Kawaguchi and Koch (2010) support somewhat better statistical properties for the use of this logit transformation than the actual Mann–Whitney estimator when the actual Mann–Whitney estimator is further from its null value of 0.5. Another strategy that can improve the statistical properties of the methods in this article for adjusted Mann–Whitney estimators is to multiply the applicable covariance matrices by $(N - 1)/(N - q - r - M)$ and then to use t distributions with d.f. = $(N - q - r - M)$ for confidence intervals and F distributions with d.f. = $(c, N - q - r - M)$ for testing contrasts with rank $c$. Simulation studies concerning these modifications were discussed by Kawaguchi and Koch (2010) for a version of the methods in this article for a crossover study with four groups and no stratification.

The methods in this article are applicable to studies with missing completely at random (MCAR) data through the use of the vector of Mann–Whitney estimators that pertain to the observed data and their corresponding covariance structure. The Appendix identifies some alternative methods for missing data.
One of these methods manages pairs of patients as having tied status for a response if data for either patient are missing for that response. Another method addresses pairs of patients in whom one or both have missing response by either carrying forward the status of their paired comparison at baseline or from the most recent prior visit at which neither patient had missing response. A potentially appealing way to address missing data for the methods in this article is to use multiple imputation, but how to invoke multiple imputation most appropriately for a multivariate set of strictly ordinal response variables is beyond the scope of this article. Nevertheless, with the invocation of multiple imputation to replace the missing values, the methods in this article can then have straightforward application for an appropriate number of imputed datasets, and then the variability of the corresponding results across the imputed datasets can have incorporation in the covariance structure of the adjusted multivariate Mann–Whitney estimators.

In summary, stratified multivariate Mann–Whitney estimators provide a useful structure for the analysis of strictly ordinal response variables from randomized clinical trials to compare two groups. Their scope can address strata with small (but at least minimal) sample size, and it can have randomization-based covariance adjustment.

Appendix. Methods

Let $h = 1, 2, \ldots, q$ index a set of strata within which patients have randomization to two groups indexed by $i = 1, 2$. Let $k = 1, 2, \ldots, r$ index the response variables with observation for the $n_{hi}$ patients in the $i$th group of the $h$th stratum; and some of these response variables can be baselines at times prior to any treatment for a group; or in crossover studies, they can be at washout times prior to the treatments which are subsequent to the first treatment. Let $j = 1, 2, \ldots, N$ index the pooling of all patients in the clinical trial regardless of their groups or strata. Let $S_j$ denote the stratum for the $j$th patient, and let $t_j$ correspond to the group for the $j$th patient with $t_j = 1$ if $i = 1$ for patient $j$ and $t_j = -1$ if $i = 2$ for patient $j$. Let $\mathbf{Y}_j = (Y_{j1}, \ldots, Y_{jr})'$ denote the response vector for the $j$th patient with $Y_{jk}$ denoting the $k$th strictly ordinal response variable for the $j$th patient; that is, the $Y_{jk}$ do not have an interval measurement scale, and so arithmetic operations such as addition or subtraction are not applicable to them. Also, any strictly ordinal covariable with observation prior to the randomization of the patients to the two groups can have inclusion in the $\mathbf{Y}_j$.

Since some of the $Y_{jk}$ may be missing (by a missing completely at random (MCAR) process), let $Z_{jk} = 1$ if $Y_{jk}$ is not missing and let $Z_{jk} = 0$ if $Y_{jk}$ is missing; let $\mathbf{Z}_j = (Z_{j1}, \ldots, Z_{jr})'$. The vectors $(\mathbf{Y}_1', \mathbf{Z}_1')', (\mathbf{Y}_2', \mathbf{Z}_2')', \ldots, (\mathbf{Y}_N', \mathbf{Z}_N')'$ for the respective patients are assumed to be mutually independent and to represent target populations for the treatments $i = 1, 2$ within the respective strata $h = 1, 2, \ldots, q$ in a sense comparable to simple random samples (with this structure corresponding to independent random sampling from multinomial distributions for an ordered categorical response variable). In the subsequent discussion, any missing $Y_{jk}$ operationally has replacement by the corresponding median of the $Y_{jk}$ for the $k$th visit (although the use of any arbitrary value such as 0 is possible since the value used for such replacement has no role in the subsequently described processes for estimation).
The comparisons between two groups for the strictly ordinal response variables $\{Y_{jk}\}$ at the respective visits can have stratification adjusted evaluation through the ratios $\xi_k = (\theta_{1k}/\theta_{2k})$. Here, $\theta_{1k}$ pertains to the probability that a random pair of patients is from the same stratum and has a patient in group 1 with larger value for the $k$th response variable than the other patient in group 2 after randomly breaking ties on response with probability 0.5 (with pairs having at least one missing value for the $k$th visit ignored); and $\theta_{2k}$ pertains to the probability that a random pair of patients is from the same stratum and has nonmissing values of the response at the $k$th visit for both one patient in group 1 and one patient in group 2. Estimators for $\theta_{1k}$ and $\theta_{2k}$ can have construction through the U-statistics $\hat{\theta}_{1k}$ and $\hat{\theta}_{2k}$ in (A.1).

$$\hat{\theta}_{1k} = \frac{1}{N(N-1)} \sum_{j=1}^{N} \sum_{j' \neq j}^{N} U_{1jj'k} \quad \text{and} \quad \hat{\theta}_{2k} = \frac{1}{N(N-1)} \sum_{j=1}^{N} \sum_{j' \neq j}^{N} U_{2jj'k} \qquad (A.1)$$

Here, the $U_{1jj'k}$ and $U_{2jj'k}$ have the structure shown in (A.2),

$$U_{1jj'k} = I\{(S_j - S_{j'}) = 0\} \times \left[ I\{(t_j - t_{j'})(Y_{jk} - Y_{j'k}) Z_{jk} Z_{j'k} > 0\} + 0.5 \times I\{(t_j - t_{j'})^2 Z_{jk} Z_{j'k} > 0\} \times I\{(Y_{jk} - Y_{j'k}) = 0\} \right] / (n_{jk} + n_{j'k} + 1)$$

$$U_{2jj'k} = \left[ I\{(S_j - S_{j'}) = 0\} \times I\{(t_j - t_{j'})^2 Z_{jk} Z_{j'k} > 0\} \right] / (n_{jk} + n_{j'k} + 1) \qquad (A.2)$$

for which $I(\Psi)$ is an indicator function that has the value 1 if the condition $\Psi$ is satisfied or the value 0 if otherwise, and $n_{jk}$ denotes the sample size for the $k$th response variable for patients with the same stratum and group as the $j$th patient. In this regard, if $n_{hik}$ denotes the number of nonmissing $Y_{jk}$ for the $i$th group in the $h$th stratum for the $k$th response variable, then $n_{jk} = n_{hik}$ if patient $j$ is from the $h$th stratum and is in the $i$th group.
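To make the pairwise structure of (A.1) and (A.2) concrete, the sketch below evaluates the kernels for a single response variable on a small hypothetical dataset and compares the ratio $\hat{\theta}_{1}/\hat{\theta}_{2}$ with a weighted average of within-stratum Mann–Whitney estimators that uses weights $n_{h1} n_{h2}/(n_{h1} + n_{h2} + 1)$, as in (2). The data, the tuple layout, and the function names are all assumptions made for illustration.

```python
# Hypothetical patients: (stratum, group, response, observed flag Z).
# Group 1 has t = +1 and group 2 has t = -1; a response with Z = 0 is a placeholder.
PATIENTS = [
    ("a", 1, 3, 1), ("a", 1, 2, 1), ("a", 1, 5, 0),
    ("a", 2, 2, 1), ("a", 2, 1, 1),
    ("b", 1, 4, 1), ("b", 1, 4, 1),
    ("b", 2, 4, 1), ("b", 2, 2, 1), ("b", 2, 5, 1),
]


def kernel_estimates(patients):
    """Return (theta1, theta2, theta1 / theta2) from the kernels in (A.2)."""
    n_total = len(patients)
    # n_jk: number of nonmissing responses per (stratum, group) cell.
    n_obs = {}
    for s, g, _, z in patients:
        n_obs[(s, g)] = n_obs.get((s, g), 0) + z
    theta1 = theta2 = 0.0
    for j, (sj, gj, yj, zj) in enumerate(patients):
        tj = 1 if gj == 1 else -1
        for jp, (sp, gp, yp, zp) in enumerate(patients):
            if jp == j or sj != sp:      # kernels vanish across strata
                continue
            tp = 1 if gp == 1 else -1
            denom = n_obs[(sj, gj)] + n_obs[(sp, gp)] + 1
            both_observed = zj * zp
            u1 = 0.0
            if (tj - tp) * (yj - yp) * both_observed > 0:
                u1 += 1.0                # group 1 patient has the larger value
            if (tj - tp) ** 2 * both_observed > 0 and yj == yp:
                u1 += 0.5                # tie between the two groups
            u2 = 1.0 if (tj - tp) ** 2 * both_observed > 0 else 0.0
            theta1 += u1 / denom
            theta2 += u2 / denom
    theta1 /= n_total * (n_total - 1)
    theta2 /= n_total * (n_total - 1)
    return theta1, theta2, theta1 / theta2


def weighted_stratum_average(patients):
    """Weighted mean of within-stratum Mann-Whitney estimators (observed data only)."""
    num = den = 0.0
    for s in sorted({p[0] for p in patients}):
        y1 = [y for st, g, y, z in patients if st == s and g == 1 and z == 1]
        y2 = [y for st, g, y, z in patients if st == s and g == 2 and z == 1]
        wins = sum((a > b) + 0.5 * (a == b) for a in y1 for b in y2)
        weight = len(y1) * len(y2) / (len(y1) + len(y2) + 1)
        num += weight * wins / (len(y1) * len(y2))
        den += weight
    return num / den


theta1, theta2, xi_kernel = kernel_estimates(PATIENTS)
xi_closed = weighted_stratum_average(PATIENTS)
print(round(xi_kernel, 6), round(xi_closed, 6))
```

The two routes give the same value because each cross-group pair enters both double sums twice with the same denominator, so the common factors cancel in the ratio; the placeholder response for the unobserved patient has no effect on either result.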
Also, $I\{(S_j - S_{j'}) = 0\}$ has the value 1 when $S_j = S_{j'}$ for patients $j$ and $j'$ being from the same stratum, and it is 0 if patients $j$ and $j'$ are from different strata, regardless of the categorical nature of the strata. For purposes of interpretation, the $U_{1jj'k}$ pertain to whether patients $j$ and $j'$ are from the same stratum such that the $k$th response variable for one of them in group 1 is larger than that of the other in group 2 after randomly breaking ties (and ignoring pairs where either $Y_{jk}$ or $Y_{j'k}$ is missing); and the $U_{2jj'k}$ pertain to whether patients $j$ and $j'$ are from the same stratum with one of them being in group 1 and the other being in group 2 with both of them having nonmissing responses $Y_{jk}$ and $Y_{j'k}$. The $U_{2jj'k}$ are identical for all $k$ as $U_{2jj'}$ when no patient has any missing $Y_{jk}$; in this case, $U_{2jj'} = 1/(n_{h1} + n_{h2} + 1)$ when $S_j = S_{j'}$ and $t_j \neq t_{j'}$ and $U_{2jj'} = 0$ if otherwise, where $n_{hi}$ denotes the number of patients in the $i$th group for the $h$th stratum.

An alternative strategy for missing $Y_{jk}$ or $Y_{j'k}$ has removal of $Z_{jk} Z_{j'k}$ from the definition of the $U_{2jj'k}$ so that the previously noted $U_{2jj'}$ for situations with no missing $Y_{jk}$ apply. The $U_{1jj'k}$ then have imputation when $Y_{jk}$ or $Y_{j'k}$ is missing with either 0.5 so as to represent tied status, its counterpart $U_{1jj'(k-1)}$ for the previous response (or visit), or its counterpart for the baseline response (or visit) $U_{1jj'1}$. For this type of strategy, one correspondingly either adds $0.5(1 - Z_{jk} Z_{j'k})$, $U_{1jj'(k-1)}(1 - Z_{jk} Z_{j'k})$, or $U_{1jj'1}(1 - Z_{jk} Z_{j'k})$ to the $U_{1jj'k}$ in (A.2).

Let $U_{1jk} = \sum_{j' \neq j}^{N} U_{1jj'k}/(N - 1)$ and $U_{2jk} = \sum_{j' \neq j}^{N} U_{2jj'k}/(N - 1)$. Let $\mathbf{F}_j = (\mathbf{U}_{1j}', \mathbf{U}_{2j}')'$ denote a compound vector for the $j$th patient where $\mathbf{U}_{1j} = (U_{1j1}, \ldots, U_{1jr})'$ and $\mathbf{U}_{2j} = (U_{2j1}, \ldots, U_{2jr})'$. Let $\bar{\mathbf{F}} = \sum_{j=1}^{N} \mathbf{F}_j/N$ denote the sample mean vector for the $\mathbf{F}_j$. As noted in Davis and Quade (1968), Puri and Sen (1971), Quade (1974), Carr et al.
(1989), and Jung and Koch (1998, 1999), a consistent estimator for the covariance matrix for FFF is given in (A.3). VVV FFF = 4 N(N−1) N ∑ j=1 (FFF j−FFF)(FFF j−FFF) ′. (A.3) The Mann–Whitney estimator for ξξξ = (ξ1,ξ2, . . . ,ξr)′ is ξ̂ξξ = DDD −1 θ̂θθ 2 θ̂θθ 1 = (ξ̂1, ξ̂2, . . . , ξ̂r)′ where DDDaaa denotes a diagonal matrix with the elements of vec- tor aaa on the main diagonal. Here θ̂θθ 1 = (θ̂11, θ̂12, . . . , θ̂1r)′ corresponds to the first r elements of FFF (as they represent comparisons between the two groups for the r response variables); and θ̂θθ 2 = (θ̂21, θ̂22, . . . , θ̂2r)′ correspond to the remaining r elements of FFF ; that is, FFF = (θ̂θθ ′ 1, θ̂θθ ′ 2) ′. Based on the Taylor series linearization, a consistent estimator VVV ξ̂ξξ for the covariance matrix for ξ̂ξξ is given in (A.4). VVV ξ̂ξξ = DDD ξ̂ξξ [DDD−1 θ̂θθ 1 ,−DDD−1 θ̂θθ 2 ]VVV FFF [DDD −1 θ̂θθ 1 ,−DDD−1 θ̂θθ 2 ]′DDD ξ̂ξξ . (A.4) Because of (A.1)–(A.4), the Mann–Whitney estimators ξ̂ξξ and their estimated covariance matrix VVV ξ̂ξξ for the compar- isons between the two groups for the respective response variables have the corresponding observed data as their basis; and so their soundness depends on the applicabil- ity of this way for managing randomly missing data (via MCAR). Since U1 j j′k = 0 and U2 j j′k = 0 when j and j′ are from different strata, both θ̂θθ 1k and θ̂θθ 2k are based on sums across the strata for sums over j 6= j′ within the respec- tive strata. In this regard, the sum of the U1 j j′k for j 6= j′ in stratum h can have expression as (nh1knh2kξ̂hk)/(nh1k + 229 D ow nl oa de d by [ U m eå U ni ve rs ity L ib ra ry ] at 2 3: 23 2 0 N ov em be r 20 14 Statistics in Biopharmaceutical Research: Vol. 3, No. 
2 nh2k + 1)N(N− 1) where ξ̂hk is the Mann–Whitney esti- mator for the probability that a randomly selected patient in group 1 of the hth stratum has a larger value for the kth response variable than such a patient in group 2 after ran- domly breaking ties; and the sum of the U2 j j′k for j 6= j′ in stratum h can have expression as (nh1knh2k)/(nh1k + nh2k + 1)N(N−1). Accordingly, ξ̂k = { q ∑ h=1 [ nh1knh2kξ̂hk/(nh1k + nh2k + 1) ] } /{ q ∑ h=1 [nh1knh2k/(nh1k + nh2k + 1)] } . From (ξ̂hk − 0.5) = (Rh1k − Rh2k)/(nh1k + nh2k) where Rh1k and Rh2k are the mean ranks for patients in group 1 and group 2 in the hth stratum for the kth response variable, it follows that (ξ̂k − 0.5) = { ∑ q h=1 whk(Rh1k−Rh2k)/(nh1k + nh2k) }/ ∑ q h=1 whk where whk = nh1knh2k/(nh1k + nh2k + 1); and so it is comparable to the stratified extension of the Wilcoxon rank sum statistic of van Elteren (1960), which corresponds to ∑ q h=1{nh1knh2k/(nh1k + nh2k) ×(Rh1k−Rh2k)/(nh1k + nh2k + 1)}. Let xxx j = (x j1, . . . , x jM)′ denote the vector of M co- variables for the jth patient with x jm denoting the nu- meric value of the mth covariable for the jth patient. All of the M covariables have observation prior to the ran- domization of the patients to the two groups. Also, any categorical covariable has expression as a set of indicator variables that correspond to all except one of its cate- gories with the excluded category serving as a reference category. All of the covariables are assumed to have no missing data. Let ϕ̂1m and ϕ̂2 be defined as in (A.5). ϕ̂1m = 1 N(N−1) N ∑ j=1 N ∑ j′ 6= j Ũ1 j j′m, ϕ̂2 = 1 N(N−1) N ∑ j=1 N ∑ j′ 6= j Ũ2 j j′ (A.5) In (A.5), the Ũ1 j j′m and Ũ2 j j′ have the structure shown in (A.6) Ũ1 j j′m = [ I { (S j−S j′) = 0 } ×0.5(t j− t j′)× (x jm− x j′m)/(n j + n j′) Ũ2 j j′ = [ I { (S j−S j′) = 0 } ×I { (t j− t j′) 6= 0 } /(n j + n j′), (A.6) where n j = nhi if patient j is from the hth stratum and is in the ith group. 
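As a numerical illustration of the closed form for $\hat{\xi}_k$ and of the mean rank identity $(\hat{\xi}_{hk} - 0.5) = (\bar{R}_{h1k} - \bar{R}_{h2k})/(n_{h1k} + n_{h2k})$ given earlier in this appendix, the following Python sketch computes the stratum-specific Mann–Whitney estimates, combines them with the weights $w_{hk} = n_{h1k} n_{h2k}/(n_{h1k} + n_{h2k} + 1)$, and checks the identity using midranks. The function names and toy data are hypothetical, each stratum is supplied as a pair of lists holding only the nonmissing responses, and the code is a sketch rather than the authors' software.

```python
def mw_estimate(y1, y2):
    """Estimate P(Y1 > Y2) + 0.5 * P(Y1 = Y2) within one stratum."""
    score = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in y1
        for b in y2
    )
    return score / (len(y1) * len(y2))


def mean_rank_difference(y1, y2):
    """Difference of mean midranks (group 1 minus group 2) in the pooled sample."""
    pooled = list(y1) + list(y2)

    def midrank(v):
        below = sum(1 for u in pooled if u < v)
        equal = sum(1 for u in pooled if u == v)
        return below + (equal + 1) / 2.0

    r1 = sum(midrank(v) for v in y1) / len(y1)
    r2 = sum(midrank(v) for v in y2) / len(y2)
    return r1 - r2


def stratified_xi(strata):
    """Weighted average of stratum estimates with w = n1 * n2 / (n1 + n2 + 1)."""
    num = 0.0
    den = 0.0
    for y1, y2 in strata:
        n1, n2 = len(y1), len(y2)
        w = n1 * n2 / (n1 + n2 + 1.0)
        num += w * mw_estimate(y1, y2)
        den += w
    return num / den


# Hypothetical toy data: (group 1 responses, group 2 responses) per stratum.
strata = [([3, 2], [1, 2]), ([4], [1, 3])]
print(stratified_xi(strata))

for y1, y2 in strata:
    lhs = mw_estimate(y1, y2) - 0.5
    rhs = mean_rank_difference(y1, y2) / (len(y1) + len(y2))
    print(lhs, rhs)  # the two columns agree for every stratum
```

Working from per-stratum lists avoids the quadratic pass over all $N(N-1)$ pairs, because pairs from different strata contribute zero to both sums.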
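Let $\tilde{U}_{1jm} = \sum_{j' \neq j} \tilde{U}_{1jj'm}/(N-1)$ and $\tilde{U}_{2j} = \sum_{j' \neq j} \tilde{U}_{2jj'}/(N-1)$. Let $\mathbf{G}_j = (\mathbf{U}'_{1j}, \tilde{\mathbf{U}}'_{1j}, \mathbf{U}'_{2j}, \tilde{U}_{2j})'$ where $\tilde{\mathbf{U}}_{1j} = (\tilde{U}_{1j1}, \ldots, \tilde{U}_{1jM})'$. Let $\bar{\mathbf{G}} = \sum_{j=1}^{N} \mathbf{G}_j / N$ denote the sample mean vector for the $\mathbf{G}_j$. A consistent estimator for the covariance matrix of $\bar{\mathbf{G}}$ is given in (A.7).

$$\mathbf{V}_{\bar{\mathbf{G}}} = \frac{4}{N(N-1)} \sum_{j=1}^{N} (\mathbf{G}_j - \bar{\mathbf{G}})(\mathbf{G}_j - \bar{\mathbf{G}})'. \tag{A.7}$$

Let $\tilde{\mathbf{G}} = (\hat{\boldsymbol{\theta}}'_1, \hat{\boldsymbol{\varphi}}'_1, \hat{\boldsymbol{\theta}}'_2, \hat{\varphi}_2)'$ where $\hat{\boldsymbol{\varphi}}_1 = (\hat{\varphi}_{11}, \ldots, \hat{\varphi}_{1M})'$. Let $\mathbf{f} = (\hat{\boldsymbol{\xi}}', \mathbf{g}')'$ where $\hat{\boldsymbol{\xi}} = \mathbf{D}^{-1}_{\hat{\boldsymbol{\theta}}_2} \hat{\boldsymbol{\theta}}_1$ and $\mathbf{g} = \hat{\boldsymbol{\varphi}}_1 / \hat{\varphi}_2$. A consistent estimator for the covariance matrix of $\mathbf{f}$ is $\mathbf{V}_{\mathbf{f}} = \mathbf{H} \mathbf{V}_{\bar{\mathbf{G}}} \mathbf{H}'$ where $\mathbf{H}$ as shown in (A.8) is from Taylor series linearization.

$$\mathbf{H} = \mathbf{D}_{\mathbf{f}} \begin{bmatrix} \mathbf{I}_r & \mathbf{0}_{rM} & -\mathbf{I}_r & \mathbf{0}_{r1} \\ \mathbf{0}_{Mr} & \mathbf{I}_M & \mathbf{0}_{Mr} & -\mathbf{1}_M \end{bmatrix} \mathbf{D}^{-1}_{\tilde{\mathbf{G}}}. \tag{A.8}$$

Since $\hat{\varphi}_{1m} = \sum_{h=1}^{q} n_{h1} n_{h2} (\bar{x}_{h1m} - \bar{x}_{h2m}) / \{(n_{h1} + n_{h2}) N(N-1)\}$ and $\hat{\varphi}_2 = \sum_{h=1}^{q} n_{h1} n_{h2} / \{(n_{h1} + n_{h2}) N(N-1)\}$ where $\bar{x}_{him}$ represents the mean of the $x_{jm}$ for patients in the $h$th stratum and the $i$th group, it follows that $(\hat{\varphi}_{1m}/\hat{\varphi}_2) = \sum_{h=1}^{q} \tilde{w}_h (\bar{x}_{h1m} - \bar{x}_{h2m}) / \sum_{h=1}^{q} \tilde{w}_h$ where $\tilde{w}_h = n_{h1} n_{h2}/(n_{h1} + n_{h2})$. Thus, the first $r$ elements of $\mathbf{f}$ address comparisons between the two groups for the $r$ response variables with stratification adjusted Mann–Whitney estimators, and the last $M$ elements address comparisons between the two groups for the $M$ covariables with stratification adjusted differences between their corresponding means.

Since $\mathbf{g}$ would be expected to be null on the basis of randomization of patients to the two groups, randomization-based covariance adjustment of $\hat{\boldsymbol{\xi}}$ is possible by fitting the model $\mathbf{P} = [\mathbf{I}_r, \mathbf{0}_{rM}]'$ to $\mathbf{f}$ by weighted least squares. The resulting adjusted counterparts $\mathbf{b}$ for $\hat{\boldsymbol{\xi}}$ are shown in (A.9):

$$\mathbf{b} = (\mathbf{P}' \mathbf{V}^{-1}_{\mathbf{f}} \mathbf{P})^{-1} \mathbf{P}' \mathbf{V}^{-1}_{\mathbf{f}} \mathbf{f} = (\hat{\boldsymbol{\xi}} - \mathbf{V}_{\hat{\boldsymbol{\xi}} \mathbf{g}} \mathbf{V}^{-1}_{\mathbf{g}} \mathbf{g}), \tag{A.9}$$

where $\mathbf{V}_{\hat{\boldsymbol{\xi}} \mathbf{g}}$ corresponds to the covariances of $\hat{\boldsymbol{\xi}}$ with $\mathbf{g}$ and $\mathbf{V}_{\mathbf{g}}$ corresponds to the covariance matrix of $\mathbf{g}$.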
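The weighted least squares fit in (A.9) can be sketched numerically as follows. This Python example, which assumes NumPy is available, builds $\mathbf{P} = [\mathbf{I}_r, \mathbf{0}_{rM}]'$, computes $\mathbf{b}$ from the weighted least squares expression, and compares it with the partitioned form $\hat{\boldsymbol{\xi}} - \mathbf{V}_{\hat{\boldsymbol{\xi}} \mathbf{g}} \mathbf{V}^{-1}_{\mathbf{g}} \mathbf{g}$; it also returns the usual weighted least squares covariance $(\mathbf{P}' \mathbf{V}^{-1}_{\mathbf{f}} \mathbf{P})^{-1}$. The function name and the numeric inputs are hypothetical values chosen only for illustration, not results from the article.

```python
import numpy as np


def covariance_adjust(f, V_f, r):
    """Fit the model P = [I_r, 0_rM]' to f by weighted least squares.

    f   : vector (xi_hat', g')' of length r + M
    V_f : estimated covariance matrix of f, shape (r + M, r + M)
    r   : number of response variables
    Returns (b, V_b, b_partitioned).
    """
    f = np.asarray(f, dtype=float)
    V_f = np.asarray(V_f, dtype=float)
    M = f.size - r

    # Design matrix that constrains the covariable differences g to zero.
    P = np.vstack([np.eye(r), np.zeros((M, r))])

    V_inv = np.linalg.inv(V_f)
    V_b = np.linalg.inv(P.T @ V_inv @ P)   # weighted least squares covariance
    b = V_b @ P.T @ V_inv @ f              # weighted least squares estimate

    # Equivalent partitioned expression: xi_hat - V_xg V_g^{-1} g.
    xi_hat, g = f[:r], f[r:]
    V_xg = V_f[:r, r:]
    V_g = V_f[r:, r:]
    b_partitioned = xi_hat - V_xg @ np.linalg.solve(V_g, g)

    return b, V_b, b_partitioned


# Hypothetical inputs: one response (r = 1) and one covariable (M = 1).
f = [0.60, 0.10]
V_f = [[0.004, 0.001],
       [0.001, 0.010]]
b, V_b, b_partitioned = covariance_adjust(f, V_f, r=1)
print(b, V_b, b_partitioned)
```

With these inputs the adjustment moves the estimate from 0.60 to 0.59 and shrinks its variance from 0.004 to 0.0039, which shows how a covariable that is correlated with the response estimator reduces variance under randomization.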
A consistent estimator for the covariance matrix of $\mathbf{b}$ is $\mathbf{V}_{\mathbf{b}}$ in (A.10).

$$\mathbf{V}_{\mathbf{b}} = (\mathbf{P}' \mathbf{V}^{-1}_{\mathbf{f}} \mathbf{P})^{-1} = (\mathbf{V}_{\hat{\boldsymbol{\xi}}} - \mathbf{V}_{\hat{\boldsymbol{\xi}} \mathbf{g}} \mathbf{V}^{-1}_{\mathbf{g}} \mathbf{V}'_{\hat{\boldsymbol{\xi}} \mathbf{g}}). \tag{A.10}$$

Additional models that address the variation of the elements of $\mathbf{b}$ across the $r$ response variables can be fit by weighted least squares methods.

Acknowledgments

The first author's research was partly supported by a grant from the Ishibashi Foundation for the Promotion of Science.

[Received February 2010. Revised April 2010.]

References

Agresti, A. (2002), Categorical Data Analysis (2nd ed.), Hoboken: Wiley.

Brunner, E., Puri, M. L., and Sun, S. S. (1995), “Nonparametric Methods for Stratified Two-Sample Designs with Application to Multiclinic Trials,” Journal of the American Statistical Association, 90, 1004–1014.

Carr, G. J., Hafner, K. B., and Koch, G. G. (1989), “Analysis of Rank Measures of Association for Ordinal Data from Longitudinal Studies,” Journal of the American Statistical Association, 84, 797–804.

Davis, C. E., and Quade, D. (1968), “On Comparing the Correlations Within Two Pairs of Variables,” Biometrics, 24, 987–995.

Fisher, R. A. (1925), Statistical Methods for Research Workers, London: Oliver and Boyd.

Jung, J. W., and Koch, G. G. (1998), “A Linear Model Method for Rank Measures of Association from Longitudinal Studies with Fixed Conditions (visits) for Data Collection and More Than Two Groups,” Journal of Biopharmaceutical Statistics, 8, 299–316.

Jung, J. W., and Koch, G. G. (1999), “Multivariate Non-parametric Methods for Mann-Whitney Estimators to Analyse Cross-over Studies with Two Treatment Sequences,” Statistics in Medicine, 18, 989–1017.

Kawaguchi, A., and Koch, G. G. (2010), “Multivariate Mann-Whitney Estimators for the Comparison of Two Treatments in a Three Period Crossover Study with Randomly Missing Data,” Journal of Biopharmaceutical Statistics, 20, 720–744.

Koch, G. G., Carr, G. J., Amara, I. A., Stokes, M. E., and Uryniak, T. J. (1990), “Categorical Data Analysis,” in Statistical Methodology in Pharmaceutical Sciences, ed. D. A. Berry, New York: Marcel Dekker, pp. 291–475.

Koch, G. G., Tangen, C. M., Jung, J. W., and Amara, I. A. (1998), “Issues for Covariance Analysis of Dichotomous and Ordered Categorical Data from Randomized Clinical Trials and Non-parametric Strategies for Addressing Them,” Statistics in Medicine, 17, 1863–1892.

LaVange, L. M., Durham, T. A., and Koch, G. G. (2005), “Randomization-Based Nonparametric Methods for the Analysis of Multicentre Trials,” Statistical Methods in Medical Research, 14, 281–301.

Lehmann, E. L. (1975), Nonparametrics: Statistical Methods Based on Ranks, San Francisco: Holden-Day.

Pincus, T., Koch, G. G., Sokka, T., Lefkowith, J., Wolfe, F., Jordan, J. M., Luta, G., Callahan, L. F., Wang, X., Schwartz, T., Abramson, S. B., Caldwell, J. R., Harrell, R. A., Kremer, J. M., Lautzenheiser, R. L., Markenson, J. A., Schnitzer, T. J., Weaver, A., Cummins, P., Wilson, A., Morant, S., and Fort, J. (2001), “A Randomized, Double-blind, Crossover Clinical Trial of Diclofenac plus Misoprostol versus Acetaminophen in Patients with Osteoarthritis of the Hip or Knee,” Arthritis & Rheumatism, 44, 1587–1598.

Puri, M. L., and Sen, P. K. (1971), Nonparametric Methods in Multivariate Analysis, New York: Wiley.

Quade, D. (1974), “Nonparametric Partial Correlations,” in Measurement in the Social Sciences, ed. H. M. Blalock, Jr., Chicago: Aldine Publishing, pp. 369–398.

Ryu, E., and Agresti, A. (2008), “Modeling and Inference for an Ordinal Effect Size Measure,” Statistics in Medicine, 27, 1703–1717.

Stokes, M. E., Davis, C. S., and Koch, G. G. (2000), Categorical Data Analysis using the SAS System, Cary: SAS Publishing.

Tangen, C. M., and Koch, G. G. (1999), “Complementary Nonparametric Analysis of Covariance for Logistic Regression in a Randomized Clinical Trial Setting,” Journal of Biopharmaceutical Statistics, 9, 45–66.

van Elteren, P. H. (1960), “On the Combination of Independent Two-Sample Tests of Wilcoxon,” Bulletin of the International Statistical Institute, 37, 351–361.

About the Authors

Atsushi Kawaguchi, Biostatistics Center, Kurume University, 67 Asahi-Machi, Kurume-City, Fukuoka 830-0011, Japan (E-mail for correspondence: kawaguchi [email protected]). Gary G. Koch, Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina 27599-7420. Xiaofei Wang, Department of Biostatistics and Bioinformatics, Duke University Medical Center, Box 2721, 2424 Erwin Road, Suite 1102, Durham, NC 27710.

