You have collected data from a clinical trial evaluating the effect of lumbar spinal surgery compared to exercise therapy. As a secondary aim you have decided to look at how to interpret the Oswestry Disability Index (ODI) and want to calculate the MIC-predictive.

Recall from the introductory course that the ODI has 10 items, each item has 6 response options, and the scale range is 0-100 (high score equals high disability). It is based on a reflective model. You can view the ODI in full by clicking here: ODI

For this you need to download the dataset: data-interpretation.zip (unpack the zip-file). The dataset can be read by Stata 12-16.

Use the summarize command to answer the question.

*Question*

1.1 What is the proportion of improved patients in the population and why is it important?

*Answer*

We first look at the proportion of improved patients to see whether it deviates from 50%. If it is close to 50%, we do not need to apply the adjustment to the MIC-predictive.

```
. summarize anc
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
anc | 168 .5357143 .5002138 0 1
```

The proportion of improved patients is 0.54.

Use the logit command with the anchor (anc) as the dependent variable and the ODI change score (odich) as the independent variable.

*Question*

1.2 What are the intercept C, the regression coefficient Bx, and their standard errors (SEs)?

*Answer*

We fit the following logistic regression model:

```
. logit anc odich
Iteration 0: log likelihood = -115.39304
Iteration 1: log likelihood = -87.337139
Iteration 2: log likelihood = -86.540531
Iteration 3: log likelihood = -86.538959
Iteration 4: log likelihood = -86.538959
Logistic regression Number of obs = 167
LR chi2(1) = 57.71
Prob > chi2 = 0.0000
Log likelihood = -86.538959 Pseudo R2 = 0.2501
------------------------------------------------------------------------------
anc | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
odich | .1181484 .0206487 5.72 0.000 .0776776 .1586192
_cons | -1.009753 .2530678 -3.99 0.000 -1.505757 -.513749
------------------------------------------------------------------------------
```

The values are:

- Intercept C = -1.01
- Bx = 0.12
- SE of Bx = 0.02
- SE of intercept C = 0.25

Right after the logistic regression, run the postestimation command estat vce, corr.

*Question*

1.3 What is the correlation between C and Bx?

*Answer*

We run the postestimation command estat vce, corr:

```
. estat vce, corr
Correlation matrix of coefficients of logit model
| anc
e(V) | odich _cons
-------------+--------------------
anc |
odich | 1.0000
_cons | -0.6810 1.0000
```

The correlation between C and Bx = -0.68

- Download the Excel spreadsheet.
- Then find the prevalence of improved patients as this is needed by the Excel spreadsheet. This is found by using the summarize command on the anchor variable.
- In addition, you need to use the 5 coefficients established in questions 1.2 and 1.3.

*Question*

1.4 What is the MIC-predictive and its CI?

*Answer*

The important coefficients we need are:

- Proportion improved = 0.54
- Intercept C = -1.01
- Bx = 0.12
- SE of Bx = 0.02
- SE of intercept C = 0.25
- Correlation between C and Bx = -0.68

If we enter these values in the spreadsheet, we get:

MIC-predictive (CI) = 9.76 (6.629; 13.263)
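The spreadsheet's internal formulas are not shown in this exercise. As a rough cross-check, the sketch below assumes that the MIC-predictive is the change score at which the predicted probability of improvement equals the observed proportion improved, i.e. (log-odds of improvement − C) / Bx, and that the CI follows from Fieller's theorem for a ratio of two estimates, treating the log-odds as fixed. Both assumptions and the function name `mic_predictive` are illustrative, not taken from the spreadsheet.

```python
import math

def mic_predictive(p_improved, c, bx, se_c, se_bx, corr_c_bx, z=1.96):
    """Return (MIC-predictive, lower, upper) from logistic regression output.

    Assumed point estimate: (log-odds of improvement - C) / Bx.
    Assumed interval: Fieller's theorem for the ratio (log-odds - C) / Bx,
    with the log-odds of improvement treated as a fixed number.
    """
    log_odds = math.log(p_improved / (1 - p_improved))
    mic = (log_odds - c) / bx

    var_num = se_c ** 2                  # variance of (log_odds - C)
    var_den = se_bx ** 2                 # variance of Bx
    cov = -corr_c_bx * se_c * se_bx      # cov(log_odds - C, Bx) = -cov(C, Bx)

    g = (z ** 2) * var_den / bx ** 2
    if g >= 1:
        raise ValueError("Bx is too imprecise: the Fieller interval is unbounded")

    centre = mic - g * cov / var_den
    half = (z / abs(bx)) * math.sqrt(
        var_num - 2 * mic * cov + mic ** 2 * var_den
        - g * (var_num - cov ** 2 / var_den)
    )
    return mic, (centre - half) / (1 - g), (centre + half) / (1 - g)

# Unrounded estimates copied from the Stata output above
mic, lower, upper = mic_predictive(
    p_improved=0.5357143, c=-1.009753, bx=0.1181484,
    se_c=0.2530678, se_bx=0.0206487, corr_c_bx=-0.6810,
)
print(f"MIC-predictive = {mic:.2f} ({lower:.2f}; {upper:.2f})")
```

Under these assumptions the printed values should agree with the spreadsheet result to about two decimals. The interval is not symmetric around the point estimate, which is typical of a Fieller-type interval.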

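Because the MIC-predictive depends on the proportion of improved patients, it can be adjusted when that proportion deviates from 50%. The formula is as follows: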

MIC-adjusted = MIC-predictive − (0.09 + 0.103 x cor) x SD-change x log-odds(imp)

Where

- cor = point biserial correlation between instrument change score and anchor
- SD-change = standard deviation of the instrument change score
- log-odds(imp) = log-odds of improvement = natural logarithm of [proportion improved/(1-proportion improved)]

These coefficients are found by using the following commands:

```
esize twosample odich, by(anc) pbcorr // Point biserial correlation
sum odich // SD of the change score
sum anc // Proportion of improved patients
```

*Questions*

1.5.1 What is the adjusted MIC-predictive?

*Answer*

We need to evaluate the formula:

MIC-adjusted = MIC-predictive − (0.09 + 0.103 x cor) x SD-change x log-odds(imp)

To do this, we need to find the point biserial correlation (cor) and SD-change:

```
. esize twosample odich, by(anc) pbcorr // Point biserial correlation
Effect size based on mean comparison
Obs per group:
No improvement = 78
Improvement = 89
---------------------------------------------------------
Effect Size | Estimate [95% Conf. Interval]
--------------------+------------------------------------
Point-Biserial r | -.5214815 -.6133492 -.4059314
---------------------------------------------------------
. sum odich // SD of the change score
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
odich | 190 9.953843 14.98928 -30 62
```

MIC-adjusted = 9.758 − (0.09 + 0.103 x (-0.521)) x 14.989 x ln(0.536/(1-0.536))

MIC-adjusted = 9.68. This value is very close to the MIC-predictive because the proportion of improved patients is close to 50%.
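The arithmetic can be checked with a few lines of Python. This is a minimal sketch of the formula as stated above; the function name `mic_adjusted` is hypothetical, and cor is taken as the point-biserial r exactly as printed in the esize output (rounded to -0.521). The sign of that estimate depends on the order in which esize compares the two groups, so check the sign convention before reusing the sketch with other data.

```python
import math

def mic_adjusted(mic_predictive, cor, sd_change, p_improved):
    """MIC-predictive - (0.09 + 0.103 * cor) * SD-change * log-odds(imp)."""
    log_odds = math.log(p_improved / (1 - p_improved))
    return mic_predictive - (0.09 + 0.103 * cor) * sd_change * log_odds

# cor: point-biserial r as printed by esize; sd_change: SD of odich
value = mic_adjusted(mic_predictive=9.758, cor=-0.521,
                     sd_change=14.989, p_improved=0.536)
print(round(value, 2))
```

When the proportion improved is exactly 0.5, the log-odds term is zero and the function returns the MIC-predictive unchanged.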

1.5.2 Describe what the MIC-predictive value and its CI mean. How can you use them?

*Answer*

The MIC-predictive reflects the gMIC, which is the mean of all iMICs in a group (see Terluin et al. (2017)). It is determined using logistic modelling and is more precise than the MIC estimated using ROC analysis. The width of the CI indicates how precise our estimate is and whether we can trust it. The MIC-predictive and NNT can be used to interpret clinical trials in a more user-friendly and clinically relevant way.

This part requires that you have read the article:

van der Windt DA, van der Heijden GJ, de Winter AF, Koes BW, Deville W, Bouter LM. The responsiveness of the Shoulder Disability Questionnaire. Ann Rheum Dis. 1998;57(2):82-7. You can find it here.

*Article summary*

Some years ago, the responsiveness of the Shoulder Disability Questionnaire (SDQ) was evaluated in a general practice setting in patients with shoulder pain. The SDQ was compared with the Pain Severity Score (PSS) and Functional Status Questionnaire (FSQ).

**The SDQ (Shoulder Disability Questionnaire)** consists of 16 items and is scored on a scale from 0–100, with higher scores indicating more severe disability (see *Appendix 1* at the end of the page).

**The PSS (Pain Severity Score)** is a single question about the severity of pain, scored on a scale of 0–10. The PSS is also converted to a scale of 0–100, with higher scores indicating more severe pain.

**The FSQ (Functional Status Questionnaire)** consists of a three-point scale (1: little discomfort during daily activities; 2: much discomfort during daily activities; 3: unable to perform daily activities).

Clinical improvements and deterioration were documented through self-reported changes since the beginning of the episode. No change and little improvement were considered clinical stability. Measurements were taken upon entrance into the study and at one and six months’ follow-up.

In the study, responsiveness was assessed in terms of Guyatt’s responsiveness ratio and a ROC curve. The most important findings are presented in Table 3 and Figure 1 (see below).

**Table 3.** Mean change scores (SD) and responsiveness ratios for the SDQ and PSS after 1 and 6 months.

* Substitution of missing values was conducted for patients reporting complete recovery: for 74 patients at one month (PSS only) and for 157 patients at six months (PSS and SDQ).

† Responsiveness ratio: the ratio of the mean change score in clinically improved patients to the variability (SD) of change in scores in clinically stable patients.

**Figure 1.** ROC curves for change scores of the SDQ, PSS and FSQ at 1 month.

Note: True positive rate (sensitivity) and false positive rate (100 − specificity) are for discriminating between patients reporting clinical improvement and patients reporting clinical stability. Potential cut-off points: SDQ change score = 18.75 (optimal trade-off): sensitivity 74%, specificity 77%; SDQ change score = 40 (mean change in clinically improved patients): sensitivity 46%, specificity 98%.

*Questions*

2.1 How were the responsiveness ratios for the SDQ and PSS calculated? Can you check these calculations using the presented data?

*Answer*

The mean change scores for the Shoulder Disability Questionnaire (SDQ) and the Pain Severity Score (PSS) for patients with ‘clinical improvement’ after one and six months appear in the first row of Table 3. These are divided by the standard deviation of the change scores (SD-change) of stable patients, found in the next row.

For example, for the SDQ after six months, Guyatt’s responsiveness ratio was calculated as 51/27 = 1.89. Note that the authors did NOT include the MINIMAL important change (MIC) in the numerator. Thus, Guyatt’s responsiveness ratio is overestimated. Moreover, this ratio is more a measure of interpretability than a measure of responsiveness.
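The check asked for in the question amounts to one division. A minimal sketch, with a hypothetical function name and the six-month SDQ values quoted in the answer:

```python
def responsiveness_ratio(mean_change_improved, sd_change_stable):
    """Mean change in improved patients divided by SD of change in stable patients."""
    return mean_change_improved / sd_change_stable

# SDQ after six months: mean change 51 (improved), SD-change 27 (stable)
print(round(responsiveness_ratio(51, 27), 2))
```

The same function can be applied to the other columns of Table 3 to verify the remaining ratios.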

2.2 Calculate the smallest detectable change (SDC) and the limits of agreement (LOA) for the SDQ after the one-month follow-up. Clarify the possible difference between these two measures and explain which one you prefer in this case.

*Answer*

Because the SEM is not given, the smallest detectable change (SDC) must be calculated with the SD-change:

SDC = 1.96 x SD-change = 1.96 x 18 = 35.3

Limits of agreement (LOA) = Mean change in stable group ± 1.96 x SD-change = 4 ± 1.96 x 18 = (-31.3; 39.3)

These measures give a different outcome because, in the stable group, there was a mean difference of 4 points between the first and second measurements. If there were no change, the SDC (so calculated) and the limits of agreement would arrive at the same outcome. But if there is a systematic difference (as expected), preference should be given to the limits of agreement.

(NB: these calculations are actually based on SEM-consistency, because it is part of SD-change. LOA cannot be deduced from SEM-agreement.)
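The two calculations differ only in whether the interval is centred on zero or on the mean change in the stable group. A minimal sketch with hypothetical function names, using the one-month SDQ values for stable patients (mean change 4, SD-change 18):

```python
def sdc_from_sd_change(sd_change, z=1.96):
    """Smallest detectable change computed directly from SD-change."""
    return z * sd_change

def limits_of_agreement(mean_change, sd_change, z=1.96):
    """Mean change in the stable group plus/minus z * SD-change."""
    half_width = z * sd_change
    return mean_change - half_width, mean_change + half_width

sdc = sdc_from_sd_change(18)
loa_low, loa_high = limits_of_agreement(4, 18)
print(f"SDC = {sdc:.1f}")
print(f"LOA = ({loa_low:.1f}; {loa_high:.1f})")
```

Setting the mean change to 0 makes the limits of agreement equal to ± SDC, which is the special case described in the answer.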

2.3 What do you think about the chosen external criterion? How would the SDC and responsiveness ratio change if the category ‘little improvement’ was considered ‘clinical improvement’?

*Answer*

The chosen external criterion for clinical stability, ‘recovery as experienced by the patient’, is subjective. The literature expresses doubt about the reliability and validity of this criterion, because it is derived from only one question and often correlates better with the last than with the first measurement. Self-reported recovery (‘much improved’) corresponds, as expected, insufficiently with the smallest clinically relevant change. The chosen criterion may classify a number of patients who did have clinically relevant improvements as stable.

If the category ‘little improvement’ is counted as ‘clinically relevant improvement’, the SD of the stable group would become smaller, thus giving the SDC a smaller value. The denominator of the responsiveness ratio is reduced, but so is the numerator: this is because patients with slight improvements are also counted as improved. What effect that has on the responsiveness ratio is hard to predict.

2.4 On the basis of the data in Table 3, draw an anchor-based MIC distribution after one month for the SDQ: distribution of scores for the group without clinically relevant improvement or deterioration and distribution of scores for the group with clinically relevant improvement.

*Answer*

Relevant numbers needed to draw the graph:

- Group with important improvement (n = 142): mean ± SD = (40 ± 30)
- Group with no change (n = 156): mean ± SD = (4 ± 18)

**Figure 2.** Anchor-based MIC distribution graph
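If you want to draw the two curves yourself, one option is to assume that the change scores in each group are normally distributed with the means and SDs listed above. That is an assumption made for this sketch; Table 3 only reports means and SDs. The code tabulates both densities on a grid of SDQ change scores, which can then be plotted with any tool. The function name `density_table` is hypothetical.

```python
from statistics import NormalDist

# Normal curves assumed from the Table 3 means and SDs (SDQ, one month)
improved = NormalDist(mu=40, sigma=30)   # important improvement, n = 142
stable = NormalDist(mu=4, sigma=18)      # no change, n = 156

def density_table(start=-50, stop=130, step=10):
    """Return (change score, density stable, density improved) rows."""
    rows = []
    for x in range(start, stop + 1, step):
        rows.append((x, stable.pdf(x), improved.pdf(x)))
    return rows

print("score  stable  improved")
for x, d_stable, d_improved in density_table():
    print(f"{x:>5}  {d_stable:.4f}  {d_improved:.4f}")
```

The densities are not weighted by group size; multiply each column by its n if you want curves whose areas reflect the number of patients.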

2.5 Estimate the MIC value with optimal sensitivity and specificity for the SDQ using the ROC curve.

*Answer*

The optimal sensitivity and specificity are 74% and 77%, respectively. This gives a ROC cut-off value of 18.75 points.
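As a plausibility check, the sensitivity and specificity at a given cut-off can be approximated from the Table 3 means and SDs if both groups are assumed to be normally distributed. This is only a sketch under that assumption; the article's ROC curve is based on the observed change scores, so the figures will not match exactly. The function name `sens_spec` is hypothetical.

```python
from statistics import NormalDist

# Normal approximation from the Table 3 means and SDs (SDQ, one month)
improved = NormalDist(mu=40, sigma=30)
stable = NormalDist(mu=4, sigma=18)

def sens_spec(cutoff):
    """Approximate sensitivity and specificity for 'change score > cutoff'."""
    sensitivity = 1 - improved.cdf(cutoff)   # improved patients above the cut-off
    specificity = stable.cdf(cutoff)         # stable patients at or below it
    return sensitivity, specificity

sens, spec = sens_spec(18.75)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

At 18.75 points the approximation gives about 0.76 and 0.79, reasonably close to the reported 74% and 77%.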

2.6 How would the MIC value change if the category ‘little improvement’ was considered ‘clinical improvement’?

*Answer*

If the category ‘little improvement’ is counted as ‘clinically relevant improvement’, some of the people who were classified as not importantly changed in the right-hand curve in the anchor-based MIC distribution will move to the left-hand curve. Specifically, the upper part of the right-hand curve moves to the lowest part of the left-hand curve. The MIC based on optimal sensitivity and specificity will become smaller.