In this systematic review, we evaluated the reliability and reproducibility of measurements of shortening in MSCF. The literature on this topic yielded only three studies of fair quality and one of poor quality. Since shortening plays an increasingly important role in the decision for surgical treatment of MSCF, a reliable and accurate method of measuring it is important. Despite the lack of high-quality studies, the available knowledge and literature should not be discarded.
Smekal et al. published a paper validating the accuracy and reliability of measurements obtained with different imaging modalities and techniques. They found that the posterior-anterior (PA) thorax radiograph most closely approximated the measurements on CT. Measurements on the 15° tilted caudo-cranial radiograph of the clavicle and clinical measurements showed the lowest agreement with CT measurements. However, they did not report the reproducibility of the measurements. The measurements were performed in healed, malunited clavicle fractures and not in the acute phase, in order to ensure static conditions over time. This is a strength of the study, since Plocher et al. described progressive shortening over time in acute MSCF.
The PA thorax radiograph exposes the patient to a higher radiation dose: 0.1 mSv compared with 0.02 mSv for an AP radiograph of the clavicle. It also relies on symmetry of the clavicles, because the unfractured side is used for comparison. A study by Cunningham et al. reported asymmetry of the intact clavicles of more than 5 mm in almost 30% of patients. Measuring shortening of the MSCF relative to the unfractured side may therefore be less reliable than assumed.
Archer et al. also relied on the assumption of symmetry, which may compromise reliability. They found a limit of agreement of 3.48 cm, indicating that a plain AP radiograph of the fractured clavicle does not reliably predict the shortening measured on CT. However, they found an ICC of 0.90. Their use of the paired t test to assess intrarater variability is debatable, but they reported no significant differences in measurements for five of six observers.
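A high ICC and wide limits of agreement can coexist, because the two statistics answer different questions: the ICC reflects how well measurements rank patients relative to between-patient variation, whereas limits of agreement express the expected size of the difference between two methods in the unit of measurement. The following is a minimal sketch of the conventional Bland-Altman computation (bias ± 1.96 SD of the paired differences). It is an assumption that this matches the definition used by Archer et al.; the function name and all measurements are hypothetical and serve only as illustration.

```python
from statistics import mean, stdev


def limits_of_agreement(method_a, method_b, z=1.96):
    """Return (bias, lower, upper) for paired measurements.

    bias is the mean of the paired differences (a - b); the limits are
    bias -/+ z times the sample standard deviation of those differences.
    """
    if len(method_a) != len(method_b):
        raise ValueError("paired measurements must have equal length")
    if len(method_a) < 2:
        raise ValueError("at least two pairs are needed")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = z * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical shortening in mm, NOT data from any cited study.
radiograph_mm = [12.0, 18.0, 25.0, 9.0, 30.0, 15.0]
ct_mm = [10.0, 21.0, 22.0, 13.0, 27.0, 19.0]

bias, lower, upper = limits_of_agreement(radiograph_mm, ct_mm)
print(f"bias {bias:.2f} mm, limits of agreement {lower:.2f} to {upper:.2f} mm")
```

In this made-up sample the two methods differ by only half a millimetre on average, yet individual differences of several millimetres fall within the limits, which is the kind of discrepancy that matters when a fixed shortening threshold is used for a treatment decision.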
Jones et al. reported weak to no inter- and intrarater agreement for radiological shortening using AP and 30° caudo-cranial views. They did not report a standardized method of measuring the shortening on these views. They also reported minimal to moderate interrater agreement for displacement and comminution. Intrarater agreement was strong for comminution, moderate for displacement, and minimal for shortening.
In contrast to current standard practice, in which AP and 15° caudo-cranial views are made, papers have been published that support the use of a 15–30° cranio-caudal AP or PA view or a PA thorax view as the most accurate for measuring the shortening of MSCF [20, 25, 26, 27]. Although commenting on accuracy, these studies did not report the reproducibility of these views. Silva et al. proposed a standardized method of measuring shortening in MSCF. Their paper focused on adolescents, not adults, and did not report the imaging modality or technique used. After contacting the corresponding author, it was verified that measurements were performed on standard AP and 15° caudo-cranial views. They reported no difference in inter- and intraobserver variability between the standardized measurement and the observers' method of choice. More recent studies report both moderate and excellent interrater agreement using a standardized method of measuring [28, 29].
Two studies were excluded from the review because they reported only interrater agreement and not intrarater agreement, and therefore did not meet the inclusion criteria. However, we believe these studies are worth mentioning here. Stegeman et al. found an intraclass correlation coefficient of 0.97 (CI 0.95–0.99) between two observers measuring shortening in a standardized way on 32 AP X-rays of the fractured clavicle. Interestingly, they found only moderate agreement (0.45, CI 0.12–0.69) for measuring absolute shortening on the AP panoramic view after consolidation, indicating that the imaging technique may also influence the reliability of measurements. Malik et al. reported an ICC of 0.926 (CI 0.909–0.941) between four observers using a standardized method of measuring shortening of the fractured clavicle in 196 AP chest X-rays. These images were made with the patient in supine, semi-upright, or upright position. The goal of that study was to evaluate differences in measured shortening between the different patient positions. No additional information on statistical analysis or interrater agreement per subgroup was reported.
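To make the interrater coefficients above more concrete, the following sketch computes one common form of the intraclass correlation coefficient: the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1). Which ICC form the cited studies used is not stated here, so this choice is an assumption, and the function name and ratings are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per patient and one column per observer."""
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a rectangular table with >= 2 rows and >= 2 columns")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: patients (rows), observers (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denom = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denom == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (ms_rows - ms_error) / denom


# Hypothetical shortening (mm) measured by two observers in three patients.
identical = [[10.0, 10.0], [20.0, 20.0], [30.0, 30.0]]
offset = [[10.0, 20.0], [20.0, 30.0], [30.0, 40.0]]  # observer 2 reads 10 mm higher

print(round(icc_2_1(identical), 3))
print(round(icc_2_1(offset), 3))
```

Because this form measures absolute agreement, a systematic offset between observers lowers the coefficient even when the observers rank the patients identically, which is why the second table scores lower than the first.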
Other factors reported to influence the reliability and reproducibility of measurements are variation in magnification due to X-ray positioning and possibly the positioning of the patient [18, 28, 30]. Backus et al. reported a statistically significant difference in shortening and displacement between upright and supine patient positioning. Malik et al. found a significant stepwise progression of measured shortening between supine, semi-upright, and upright positioning of the patient.
Some limitations of this study have to be discussed. First, the available literature on measuring the fractured clavicle is limited. Since only four studies were included and none of them was rated as good or excellent quality according to the COSMIN checklist, it was not possible to draw definite conclusions or make definite recommendations. Second, although the COSMIN checklist is considered the best available option to evaluate the methodological quality of studies on measurement properties, the “worst score counts” algorithm might underestimate the overall quality of a paper (e.g., one poor score out of a total of 11 items results in a poor overall score). For that reason, we provided the scores for all items using the 4-point scale. Third, the inclusion criteria used might have been too strict. Two papers that did not meet the inclusion criteria were identified that could nevertheless be of value on the topic. Including these papers [28, 29], however, does not influence the final conclusion regarding the lack of evidence on the subject. Finally, other limitations of this study include the possibility of publication bias and language restrictions.
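The effect of the “worst score counts” rule can be illustrated with a short sketch. It assumes the 4-point COSMIN scale runs from poor to excellent and that the overall rating is simply the lowest item rating; the function name and the example ratings are hypothetical and do not reproduce the scores of any included study.

```python
# Ratings ordered from worst to best on the 4-point scale.
RATING_ORDER = ["poor", "fair", "good", "excellent"]


def worst_score_counts(item_ratings):
    """Return the overall rating: the lowest rating among all checklist items."""
    if not item_ratings:
        raise ValueError("at least one item rating is required")
    unknown = [r for r in item_ratings if r not in RATING_ORDER]
    if unknown:
        raise ValueError(f"unknown rating(s): {unknown}")
    return min(item_ratings, key=RATING_ORDER.index)


# Hypothetical paper: 10 items rated excellent and a single item rated poor.
ratings = ["excellent"] * 10 + ["poor"]
print(worst_score_counts(ratings))  # prints "poor"
```

A single poorly rated item determines the overall score regardless of the other ten, which is why the item-level scores are more informative than the summary rating alone.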
To optimize future studies and obtain comparable results, a standardized method of imaging and measuring is of great importance. When considering the optimal method of imaging and measuring the fractured clavicle, one should consider the following: imaging modality and technique, patient positioning, radiation exposure, costs, and the method for measuring shortening and/or displacement. To identify a standardized method, a compromise between these factors should be made based on further research.
CT and the PA thorax radiograph seem more accurate, but CT is more expensive and both expose the patient to a much higher radiation dose. Supine positioning of the patient may underestimate the actual shortening and displacement, which in turn can negatively influence the decision to surgically reduce and fix the MSCF. Calibrated views will prevent magnification errors while measuring. Although not proven better, it may be worth considering measuring shortening and displacement in a standardized and possibly proportional way to optimize consistency, as proposed by other authors [9, 13, 19, 30, 31].
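The two ideas in this paragraph, calibration and proportional measurement, can be sketched as follows. The sketch assumes calibration by a marker of known size placed in the image, and it assumes proportional shortening is defined as the length difference relative to the uninjured clavicle; the cited authors may define these differently. All names and values are hypothetical.

```python
def calibrate_mm(measured_mm, marker_true_mm, marker_measured_mm):
    """Correct an on-image length for magnification using a marker of known size."""
    if marker_true_mm <= 0 or marker_measured_mm <= 0:
        raise ValueError("marker sizes must be positive")
    return measured_mm * marker_true_mm / marker_measured_mm


def proportional_shortening(uninjured_mm, fractured_mm):
    """Shortening as a percentage of the uninjured clavicle length."""
    if uninjured_mm <= 0:
        raise ValueError("uninjured length must be positive")
    return 100.0 * (uninjured_mm - fractured_mm) / uninjured_mm


# Hypothetical radiograph: a 25 mm marker measures 27.5 mm on the image,
# i.e. the image is magnified by 10%.
uninjured = calibrate_mm(165.0, marker_true_mm=25.0, marker_measured_mm=27.5)
fractured = calibrate_mm(143.0, marker_true_mm=25.0, marker_measured_mm=27.5)

absolute_mm = uninjured - fractured
relative_pct = proportional_shortening(uninjured, fractured)
print(f"absolute shortening {absolute_mm:.1f} mm, proportional {relative_pct:.1f}%")
```

Under this definition the proportional value is unchanged by magnification as long as both clavicles are magnified equally, whereas the absolute value in millimetres depends on calibration.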