Group work - interpretation & responsiveness answers

This exercise requires that you have read the article:

van der Windt DA, van der Heijden GJ, de Winter AF, Koes BW, Deville W, Bouter LM. The responsiveness of the Shoulder Disability Questionnaire. Ann Rheum Dis. 1998;57(2):82-7.

Article summary

Some years ago, the responsiveness of the Shoulder Disability Questionnaire (SDQ) was evaluated in a general practice setting in patients with shoulder pain. The SDQ was compared with the Pain Severity Score (PSS) and Functional Status Questionnaire (FSQ).

The SDQ (Shoulder Disability Questionnaire) consists of 16 items and is scored on a scale from 0–100, with higher scores indicating more severe disability (see Appendix 1 at the end of the page).

The PSS (Pain Severity Score) is a single question about the severity of pain, scored on a scale of 0–10. This score is then converted to a scale of 0–100, with higher scores indicating more severe pain.

The FSQ (Functional Status Questionnaire) consists of a three-point scale (1: little discomfort during daily activities; 2: much discomfort during daily activities; 3: unable to perform daily activities).

Clinical improvements and deterioration were documented through self-reported changes since the beginning of the episode. No change and little improvement were considered clinical stability. Measurements were taken upon entrance into the study and at one and six months’ follow-up.

In the study, responsiveness was assessed in terms of Guyatt’s responsiveness ratio and a ROC curve. The most important findings are presented in Table 3 and Figure 1 (see below).

Table 3. Mean change scores (SD) and responsiveness ratios for the SDQ and PSS after 1 and 6 months.

* Substitution of missing values was conducted for patients reporting complete recovery: for 74 patients at one month (PSS only) and for 157 patients at six months (PSS and SDQ).
† Responsiveness ratio: the ratio of the mean change score in clinically improved patients to the variability (SD) of change scores in clinically stable patients.

Figure 1. ROC curves for change scores of the SDQ, PSS and FSQ at 1 month.

Note: True positive rate (sensitivity) and false positive rate (100 − specificity) are for discriminating between patients reporting clinical improvement and patients reporting clinical stability. Potential cut-off points: SDQ change score = 18.75: sensitivity 74%, specificity 77% (optimal trade-off); SDQ change score = 40 (mean change in clinically improved patients): sensitivity 46%, specificity 98%.


1. How were the responsiveness ratios for the SDQ and PSS calculated? Can you check these calculations using the presented data?


Remember that Guyatt’s responsiveness statistic (GRS) is defined as:

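$$
\mathrm{GRS} = \frac{(\text{Score}_{\text{pretest}} - \text{Score}_{\text{posttest}})_{\text{improved patients}}}{SD_{\text{change, stable patients}}} = \frac{(\text{mean change score})_{\text{improved patients}}}{SD_{\text{change, stable patients}}}
$$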


The mean change scores for the Shoulder Disability Questionnaire (SDQ) and the Pain Severity Score (PSS) for patients with ‘clinical improvement’ after one and six months appear in the first row of Table 3. These are divided by the standard deviation of the change scores (SD-change) of stable patients, found in the next row.

For example, for the SDQ after six months, Guyatt’s responsiveness ratio was calculated as 51/27 = 1.89. Note that the authors used the mean change score of improved patients in the numerator and NOT the minimal important change (MIC). Guyatt’s responsiveness ratio is therefore overestimated. Moreover, this ratio is more a measure of interpretability than a measure of responsiveness.
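The check of the six-month SDQ ratio can be reproduced with a minimal sketch. The function name `guyatt_responsiveness_ratio` is a hypothetical helper introduced only for this illustration; the two input values (51 and 27) are the ones quoted in the answer above.

```python
def guyatt_responsiveness_ratio(mean_change_improved, sd_change_stable):
    """Mean change score in improved patients divided by the SD of change in stable patients."""
    if sd_change_stable <= 0:
        raise ValueError("SD of change in stable patients must be positive")
    return mean_change_improved / sd_change_stable


# SDQ after six months: mean change 51 (improved), SD of change 27 (stable)
sdq_six_months = guyatt_responsiveness_ratio(51, 27)
print(round(sdq_six_months, 2))  # 1.89
```

The same call can be repeated with the other columns of Table 3 to check the remaining ratios.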

2. What do you think about the chosen external criterion? How would the responsiveness ratio change if the category ‘little improvement’ was considered ‘clinical improvement’?


Think about the validity of the external criterion (anchor).

Think about what would happen to the GRS if the dichotomisation of the anchor changed.


The chosen external criterion for clinical stability, ‘recovery as experienced by the patient’, is subjective. The literature expresses doubt about the reliability and validity of this criterion, because it is derived from a single question and often correlates better with the last measurement than with the first. As might be expected, self-reported recovery (‘much improved’) corresponds poorly with the smallest clinically relevant change. The chosen criterion may therefore classify as stable a number of patients who did have clinically relevant improvements.

If the category ‘little improvement’ is counted as ‘clinically relevant improvement’, the SD of change in the stable group becomes smaller, which also gives the smallest detectable change (SDC) a smaller value. The denominator of the responsiveness ratio is reduced, but so is the numerator, because patients with slight improvements are now also counted as improved. The net effect on the responsiveness ratio is hard to predict.
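The argument above can be made concrete with a small sketch. The change scores below are invented for illustration and are NOT taken from the article; the helper `ratio_parts` is likewise hypothetical. The point is only that both the numerator and the denominator shrink when ‘little improvement’ moves to the improved group, so the direction of the ratio depends on the data.

```python
from statistics import mean, stdev

# Hypothetical SDQ change scores (NOT from the article), grouped by anchor answer
unchanged = [-10, -5, 0, 0, 5, 10]
little_improved = [10, 15, 20, 25]
much_improved = [20, 40, 50, 60]


def ratio_parts(stable, improved):
    """Return (numerator, denominator, ratio) of the responsiveness ratio."""
    numerator = mean(improved)   # mean change score in improved patients
    denominator = stdev(stable)  # sample SD of change scores in stable patients
    return numerator, denominator, numerator / denominator


# Dichotomisation used in the study: 'little improvement' counts as stable
original = ratio_parts(unchanged + little_improved, much_improved)
# Alternative dichotomisation: 'little improvement' counts as improved
alternative = ratio_parts(unchanged, little_improved + much_improved)

for label, (num, den, ratio) in [("original", original), ("alternative", alternative)]:
    print(f"{label}: numerator={num:.1f}, denominator={den:.1f}, ratio={ratio:.2f}")
```

With other invented data the ratio could just as well move in the opposite direction, which is why the answer calls the net effect hard to predict.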

3. On the basis of the data in Table 3, draw an anchor-based MIC distribution after one month for the SDQ: distribution of scores for the group without clinically relevant improvement (stability) and distribution of scores for the group with clinically relevant improvement.


Relevant numbers needed to draw the graph:

Figure 2. Anchor-based MIC distribution graph

4. Estimate the MIC value with optimal sensitivity and specificity for the SDQ using the ROC curve.


Look carefully at Figure 1 and find the point closest to the left upper corner. Then you can estimate the sensitivity and specificity.


The optimal combination of sensitivity and specificity is 74% and 77%, respectively. This corresponds to a ROC cut-off value of 18.75 points (Figure 1).
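The ‘closest to the upper left corner’ rule can be sketched as follows. Only the two cut-off points reported in the note to Figure 1 are used here; the full ROC curve contains more points, so this is an illustration of the rule rather than a reconstruction of the authors’ analysis. The helper name `distance_to_upper_left` is an assumption made for this example.

```python
import math

# (cut-off, sensitivity, specificity) as reported in the note to Figure 1
candidates = [(18.75, 0.74, 0.77), (40.0, 0.46, 0.98)]


def distance_to_upper_left(sensitivity, specificity):
    """Euclidean distance from a ROC point to the corner (FPR = 0, TPR = 1)."""
    return math.hypot(1 - specificity, 1 - sensitivity)


for cutoff, sens, spec in candidates:
    print(cutoff, round(distance_to_upper_left(sens, spec), 3))

# The cut-off with the smallest distance is the ROC-based MIC estimate
best_cutoff = min(candidates, key=lambda c: distance_to_upper_left(c[1], c[2]))[0]
print(best_cutoff)  # 18.75
```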

5. How would the MIC value change if the category ‘little improvement’ was considered ‘clinical improvement’?


Use the graph you drew in question 3. Some of the patients would move from one curve to the other, but how?


If the category ‘little improvement’ is counted as ‘clinically relevant improvement’, some of the people who were classified as not importantly changed in the right-hand curve in the anchor-based MIC distribution will move to the left-hand curve. Specifically: the upper part of the right-hand curve moves to the lowest part of the left-hand curve. The MIC based on optimal sensitivity and specificity will become smaller.
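The shift in the ROC-based MIC can be illustrated with a small sketch. The change scores are invented and are NOT taken from the article, and `roc_cutoff` is a hypothetical helper that applies the ‘closest to the upper left corner’ rule to every observed change score.

```python
import math

# Hypothetical SDQ change scores (NOT from the article), grouped by anchor answer
unchanged = [-10, -5, 0, 0, 5, 10]
little_improved = [10, 15, 20, 25]
much_improved = [20, 40, 50, 60]


def roc_cutoff(stable, improved):
    """Return the change score whose ROC point lies closest to the upper left corner."""
    def distance(threshold):
        # A patient is classified as improved if the change score is >= threshold
        sensitivity = sum(x >= threshold for x in improved) / len(improved)
        specificity = sum(x < threshold for x in stable) / len(stable)
        return math.hypot(1 - specificity, 1 - sensitivity)

    return min(sorted(set(stable + improved)), key=distance)


# 'Little improvement' counted as stable (as in the study)
mic_original = roc_cutoff(unchanged + little_improved, much_improved)
# 'Little improvement' counted as improved
mic_alternative = roc_cutoff(unchanged, little_improved + much_improved)
print(mic_original, mic_alternative)  # 20 15
```

In this invented data set the cut-off drops from 20 to 15 points once the patients with little improvement are moved to the improved group, in line with the answer above.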

Appendix 1. The Shoulder Disability Questionnaire