The Journal of Bone and Joint Surgery. British volume, Vol. 87-B, No. 2
Annotations

The use of outcome scores in surgery of the shoulder

P. Harvie, T. C. B. Pollard, R. J. Chennagiri, A. J. Carr

P. Harvie, Specialist Registrar in Trauma and Orthopaedics, The John Radcliffe Hospital, Headington, Oxford OX3 9DU, UK.
T. C. B. Pollard, Clinical Research Fellow, Department of Orthopaedics, Frenchay Hospital, Park Road, Frenchay, Bristol BS16 1LE, UK.
R. J. Chennagiri, Specialist Registrar in Trauma and Orthopaedics, The John Radcliffe Hospital, Headington, Oxford OX3 9DU, UK.
A. J. Carr, Nuffield Professor of Orthopaedic Surgery, Nuffield Department of Orthopaedic Surgery, University of Oxford, Nuffield Orthopaedic Centre, Windmill Road, Headington, Oxford OX3 7LD, UK.

Published online: 1 Feb 2005. https://doi.org/10.1302/0301-620X.87B2.15305

The pursuit of ‘best practice’, health economic planning, the increasing awareness and expectations of patients, pressure from politicians and the media, and the emergence of league tables for surgeons are some of the reasons why orthopaedic surgeons are encouraged to adopt evidence-based strategies for managing their patients. Levels of evidence have been devised which allow publications to be ranked or given a grade of recommendation.1,2 The highest levels are assigned to well-designed, randomised, controlled trials and systematic reviews of such trials. Lower levels are offered by cohort studies in which patients are compared with a control group treated at the same time and in the same institution.
Such studies are ranked higher than randomised trials of poor quality, retrospective cohort studies or case-control studies. Individual case series and poorly designed cohort studies are lower still, while the lowest level comprises expert opinion without critical appraisal, descriptive studies and reports from expert committees (Table I). Proper studies require good design and the use of validated outcome measures. We have carried out a systematic review of the use of outcome scores and research methods in surgery of the shoulder to establish whether the literature provides suitable evidence on which to establish best practice.

Review of the literature

A systematic review was undertaken of all articles relating to the shoulder published in the Journal of Shoulder and Elbow Surgery, the Journal of Bone and Joint Surgery [Br] and the Journal of Bone and Joint Surgery [Am] between January 1992 and December 2002. After manual searching, all papers which documented any form of clinical outcome were included for more detailed review.3–53 Those relating to anatomy, pathology, biomechanics, engineering design or technical aspects which did not involve a clinical outcome were excluded. Each paper chosen was placed into one of 16 broad categories according to its subspecialty. The exact number of patients studied as well as the minimum, maximum and mean periods of follow-up were recorded. A ‘grade of recommendation’ and ‘level of evidence’ were assigned to each paper in accordance with the standards shown in Table I. All criteria used to describe a clinical outcome were recorded, whether in the form of observations such as power or range of movement, or by the use of a recognised scoring system. Each paper was reviewed to ascertain whether a description of the outcome method used and the reasons for its selection were included in the text. In particular, we looked for details of the original group of patients on which any outcome score was based.
An outcome score was regarded as appropriate if it was used unmodified in a group of patients for which it had been validated.

Results

We reviewed 1106 articles relating to surgery on the shoulder. Of these, 496 were excluded on the basis of non-clinical content. The remaining 610 underwent more detailed review. There were 198 case reports and 379 cohort studies, the latter including 19 RCTs, but no systematic reviews (Table I). The mean sample size was 42 (1 to 1063). The overall mean follow-up was 27 months (1 to 540) with a minimum of 12 (1 to 540) and a maximum of 68 months (1 to 540). A formal outcome was described in 569 (93.3%) articles. Of these, 271 (47.6%) used clinical assessment, 217 (38.1%) an outcome score and 81 (14.2%) both. A total of 44 different outcome scores were encountered: 22 clinician-based (50.0%), 21 patient-based (47.7%) and one clinician- and patient-based (2.3%). Of 439 applications of an outcome score, 266 (60.6%) were clinician-based, 105 (23.9%) patient-based and 68 (15.5%) clinician- and patient-based. Trends in the use of the different types of score are shown in Figure 1. Of 298 articles using outcome scores, 126 (42.3%) described the details of the score within the text, but only eight (2.7%) made clear the reasons for the choice of the particular score.

Closer scrutiny of the use of clinical assessment in 352 articles showed a mean of 2.3 observations (1 to 6) per article. Those used were range of movement (208), pain (202), function/activities of daily living (129), power (88), radiological appearance (83), patient satisfaction (67) and stability (47). In the 298 articles using a formal outcome score a mean of 1.5 outcomes (1 to 6) was used per article. Overall, of the 439 applications of an outcome score, 282 (64.2%) were regarded as being appropriate (Fig. 2).
All formal outcome measures identified during the course of this review are listed.

Discussion

The proposal that clinical outcome in orthopaedic surgery could be analysed systematically so that patients would receive increased benefits from their treatment was first introduced by Codman et al3 in the second decade of the 20th century, and is the basis of his concept of the “End Result”. Unfortunately, his peers did not share his enthusiasm. Codman’s frustration culminated at a meeting on January 6, 1915 in which he ridiculed his colleagues and members of the hospital board, portraying them in a large cartoon as an ostrich burying its head in the sand and choosing to ignore what was happening around it. Codman’s career declined thereafter and he died in relative anonymity.

Systematic reviews of randomised, controlled trials offer the maximum levels of evidence upon which clinical decisions can be based. No such reviews were found in the course of this investigation. Although 19 randomised, controlled trials (3.1%) were identified, 538 papers (88.2%) described case series offering low levels of evidence. The undertaking of a randomised, controlled trial for a surgical procedure is costly and time-consuming. Nevertheless, increased use of cohort or case-control studies would considerably improve the level of evidence available.

The use of validated outcome scores allows comparisons to be made between studies. If scores are modified or used on inappropriate groups of patients, such comparisons are flawed. The European Society for Surgery of the Shoulder and Elbow and the Japanese Orthopaedic Association have each given guidance on the preferred use of outcome scores. However, such recommendations are not uniformly accepted. Our review has shown that study cohorts are generally small, periods of follow-up short and levels of evidence low. The overall pattern of the application of an outcome score is highly variable and at times inappropriate.
We have identified changes made to outcome scores, often without proper testing of the modification and without justification. For example, the Neer rating4 was initially used to assess the outcome of displaced fractures of the proximal humerus, but was modified to assess total shoulder arthroplasty5 and, more recently, repair of the rotator cuff,6 although its formal statistical validation for use with these differing groups has not been undertaken.

The score of Constant and Murley7 is widely used, but large variations occur in how it is formulated. Pain is often assessed using separate visual analogue scales, the methods of measuring power vary and, most importantly, the fact that scores should be normalised for age and gender is selectively ignored.8 Objective clinical assessment of pain, range of movement, power and stability is an acceptable means of measuring outcome. However, the means by which such assessments are measured and documented and the number of such criteria used in studies are variable. Scores may be patient-based, such as the Oxford shoulder score,9 clinician-based, such as the Constant-Murley score, or a combination of both, as in the modified American Shoulder and Elbow Surgeons form.10 There are condition-specific scores such as the Oxford shoulder instability score11 and non-condition-specific scores such as the simple shoulder test.12 In recent years there has been a proliferation of patient-based outcome scores, recognising the benefits of such scores compared with clinician-based assessments. The latter are susceptible to bias and error, and may not represent the view of the patient.13 Patient-based scores are designed for use in clinical trials and are valid for comparing and aggregating cohort studies.14–16 Their use will directly improve levels of evidence. Despite the trend to move away from the application of clinician-based outcome scores, our review has shown that in practice the magnitude of this shift is highly variable.
Over the last decade the use of clinician-based scores has remained high. An overall understanding of the initial population upon which scores were first based is lacking. The shoulder pain and disability index (SPADI),17 one of the newer scores, was initially based on a cohort of 37 male patients with shoulder pain which was either musculoskeletal, neurogenic or of unknown aetiology. The patient self-reporting section of the modified American Shoulder and Elbow Surgeons assessment form (M-ASES) has undergone validation. However, this was based on only 63 patients, 25 of whom had impingement, while only one had undergone hemiarthroplasty and two had tears of the rotator cuff.18 The use of outcome scores on cohorts for which they have not been validated casts doubt on the validity of the results.

Conclusion

Forty-four different outcome scores were encountered in the course of this review, many being applied inappropriately. There is a trend towards the increased use of validated patient-based scores, but many have not been properly tested for validity, repeatability and sensitivity to change. Scores are not valid when used in a modified form and such use should be discouraged. Levels of evidence were generally low, with 88.2% of papers at level 4 and only a small number of RCTs. Improvement in the design of the studies and the use of appropriately validated outcome scores would substantially increase the levels of evidence on which to base best practice in surgery of the shoulder.

Supplementary material

A table showing the list of outcome scores identified in the course of this review is available with the electronic version of this article, on our website at www.jbjs.org.uk.

Table I.
Hierarchy of evidence of reviewed papers with sample sizes and minimum, maximum and mean follow-up (range)

| Grade of recommendation | Level of evidence | Study design | Number | Sample size | Minimum follow-up (mths) | Maximum follow-up (mths) | Mean follow-up (mths) |
|---|---|---|---|---|---|---|---|
| A | 1a | Systematic review (with homogeneity) of randomised, controlled trials | 0 | NA | NA | NA | NA |
| A | 1b | Individual randomised, controlled trials with independent blinding | 12 | 85 (29 to 245) | 24 (3 to 120) | 32 (3 to 120) | 24 (3 to 120) |
| A | 1c | ‘All-or-none’ case series | 1 | 62 | 12 | 49 | 25 |
| B | 2a | Systematic review (with homogeneity) of cohort studies | 0 | NA | NA | NA | NA |
| B | 2b | Individual cohort studies and low-quality randomised, controlled trials | 25 | 73 (6 to 300) | 24 (1 to 180) | 66 (1 to 120) | 36 (1 to 194) |
| B | 2c | Outcomes research | 29 | 184 (8 to 1063) | NA | NA | NA |
| B | 3a | Systematic review of case-control studies | 0 | NA | NA | NA | NA |
| B | 3b | Individual case-control studies | 5 | 154 (29 to 538) | 30 (5 to 120) | 32 (12 to 120) | 37 (9 to 120) |
| C | 4 | Case series and poor-quality cohort and case-control studies | 538 | 131 (1 to 667) | 22 (1 to 540) | 68 (1 to 540) | 27 (1 to 540) |
| D | 5 | Expert opinion without explicit critical appraisal | 0 | NA | NA | NA | NA |

Fig. 1 Proportion of combined and clinician- and patient-based scores used to assess outcome over the period of study.

Fig. 2 Manner of application of frequently encountered outcome scores (CMS, Constant-Murley shoulder score; ASES, American Shoulder and Elbow Surgeons standardised shoulder assessment form; UCLA, University of California Los Angeles shoulder rating scale; Neer, Neer shoulder rating; Rowe, Rowe instability score; SST, simple shoulder test; SF-36, 36-item short-form health survey; HSS, Hospital for Special Surgery shoulder assessment; MSTS, Musculo-skeletal tumour score; DASH, Disabilities of the arm shoulder and hand questionnaire; SPADI, shoulder pain and disability index).

We wish to thank Mrs Pat Deeley, Academic Secretary to Professor A. J. Carr, for her assistance in collating the review of the literature.

References

1 Phillips B, Ball C, Sackett D, et al.
Oxford centre for evidence-based medicine levels of evidence (May 2001). http://www.cebm.net/levels_of_evidence.asp#levels (accessed 18/11/04).
2 Sackett D, Straus S, Richardson WS, Haynes RB. Evidence-based medicine: how to practice and teach. Second ed. London: Churchill Livingstone, 2000.
3 Codman EA, Chipman WW, Clark JG, Kanavel AB, Mayo WJ. Standardisation of hospitals: report of the committee appointed by the Clinical Congress of Surgeons of North America. Trans Clin Cong Surg North Am 1913;4:2–8.
4 Neer CS II. Displaced proximal humeral fractures: I. classification and evaluation. J Bone Joint Surg [Am] 1970;52-A:1077–89.
5 Neer CS II, Watson KC, Stanton FJ. Recent experience in total shoulder replacement. J Bone Joint Surg [Am] 1982;64-A:319–37.
6 Ellman H, Hanker G, Bayer M. Repair of the rotator cuff: end-result study of factors influencing reconstruction. J Bone Joint Surg [Am] 1986;68-A:1136–44.
7 Constant CR, Murley AHG. A clinical method of functional assessment of the shoulder. Clin Orthop 1987;214:160–4.
8 Ware JE, Sherbourne CD. The MOS 36-item short-form health survey (SF-36): I. conceptual framework and item selection. Med Care 1992;30:473–83.
9 Dawson J, Fitzpatrick R, Carr A. Questionnaire on the perception of patients about shoulder surgery. J Bone Joint Surg [Br] 1996;78-B:593–600.
10 King GJW, Richards RR, Zuckerman JD, et al. A standardised method for the assessment of shoulder function. J Shoulder Elbow Surg 1999;3:351–4.
11 Dawson J, Fitzpatrick R, Carr A. The assessment of shoulder stability: the development and validation of a questionnaire. J Bone Joint Surg [Br] 1999;81-B:420–6.
12 Lippitt SB, Harryman DT, Matsen FA III. A practical tool for evaluating function: the simple shoulder test. In: Matsen FA III, Fu FH, Hawkins RJ, eds.
The shoulder: a balance of mobility and stability. Rosemont: American Academy of Orthopaedic Surgeons, 1993:519–29.
13 Conboy VB, Morris RW, Kiss J, Carr AJ. An evaluation of the Constant-Murley shoulder assessment. J Bone Joint Surg [Br] 1996;78-B:229–32.
14 Pynsent PB. Choosing an outcome measure. J Bone Joint Surg [Br] 2001;83-B:792–4.
15 Dawson J, Hill G, Fitzpatrick R, Carr A. The benefits of using patient-based methods of assessment: medium-term results of an observational study of shoulder surgery. J Bone Joint Surg [Br] 2001;83-B:877–82.
16 Dawson J, Carr A. Outcomes evaluation in orthopaedics. J Bone Joint Surg [Br] 2001;83-B:313–15.
17 Roach KE, Budiman-Mak E, Songsiridej N, Lertratanakul Y. Development of a shoulder pain and disability index. Arthritis Care Res 1991;4:143–9.
18 Michener LA, McClure PW, Sennett BJ. American shoulder and elbow surgeons standardized shoulder assessment form, patient self-report section: reliability, validity, and responsiveness. J Shoulder Elbow Surg 2002;11:587–94.
19 Rowe CR, Patel D, Southmayd WW. The Bankart procedure: a long-term end-result study. J Bone Joint Surg [Am] 1978;60-A:1–16.
20 Amstutz HC, Sew Hoy AL, Clarke IC. UCLA anatomic total shoulder arthroplasty. Clin Orthop 1981;155:7–20.
21 Enneking WF, Spanier SS, Goodman MA. A system for surgical staging of musculoskeletal sarcoma. Clin Orthop 1980;153:106–60.
22 Tajima T, Takagishi N. Evaluation system for the shoulder joint disorders. J Jpn Orthop Assoc 1987;61:623–9 (in Japanese).
23 Weber ER, Daube JR, Coventry MB. Peripheral neuropathies associated with total hip arthroplasty. J Bone Joint Surg [Am] 1976;58-A:66–9.
24 Matsen FA III, Smith KL. Effectiveness evaluation of the shoulder.
In: Rockwood CA, Matsen FA III, eds. The shoulder. Second ed. Philadelphia: W.B. Saunders, 1998:1313–39.
25 Patte D. Directions for the use of the index severity for painful and/or chronic disabled shoulders. The first open congress of the European Society of Surgery of the Shoulder and Elbow, Paris, 1987:36–41.
26 Imatani RJ, Hanlon JJ, Cady GW. Acute, completed, acromioclavicular separation. J Bone Joint Surg [Am] 1975;57-A:328–32.
27 Rockwood CA Jr, Groh GI, Wirth MA, Grassi FA. Resection arthroplasty of the sternoclavicular joint. J Bone Joint Surg [Am] 1997;79-A:387–93.
28 Warren RF, Ranawat CS, Inglis AE. Total shoulder replacement indications and results of the Neer Nonconstrained prosthesis. In: Inglis AE, ed. AAOS symposium on total joint replacement of the upper extremity. St Louis: CV Mosby, 1982:56–67.
29 L’Insalata JC, Warren RF, Cohen SB, Altchek DW, Peterson MGE. A self-administered questionnaire for the assessment of symptoms and function of the shoulder. J Bone Joint Surg [Am] 1997;79-A:738–48.
30 Mancuso CA, Altchek DW, Craig EV, et al. expectations of shoulder surgery. J Shoulder Elbow Surg
31 C. Development of an upper outcome the of the shoulder and Am J Med
32 D, M. The shoulder rating system. Orthop Trauma Surg
33 AB, G, AB, et al. shoulder Clin Orthop
34 R, Development of a Res
35 J. in the evaluation of Clin Orthop
36 Jr, Smith A method for treatment of acromioclavicular Orthop Clin North Am
37 Enneking WF, A system for the functional evaluation of surgical treatment of of the musculoskeletal system. Clin Orthop
38 JG, et al. Development of a of function for patients with and sarcoma. Res
39 G, In: R, ed. et ed.
40 of the head of Clin Orthop
41 J.
of the of for the treatment of the shoulder: method for the of Orthop (in
42 Jr, The shoulder: and J Bone Joint Surg [Br]
43 G, S, A self-administered index for and with of Arthritis
44 PW, The of health the disability and pain J
45 Patte D, D. in the painful shoulder by Orthop (in
46 B, J. treatment of by the critical study and results Orthop (in
47 Smith A. of the shoulder in in a randomised study of and J Bone Joint Surg [Br]
48 Bone in with to on the to J Bone Joint Surg [Br]
49 of the a retrospective study of J Trauma
50 S, of open and J Shoulder Elbow Surg
51 J, H, G, for Orthop
52 Table Orthop (in
53 follow-up studies of large and rotator of outcome measures. J Shoulder Elbow Surg