# Applying the right statistics: analyses of measurement studies

@article{Bland2003ApplyingTR, title={Applying the right statistics: analyses of measurement studies}, author={J Martin Bland and D. G. Altman}, journal={Ultrasound in Obstetrics and Gynecology}, year={2003}, volume={22} }

The study of measurement error, observer variation and agreement between different methods of measurement are frequent topics in the imaging literature. We describe the problems of some applications of correlation and regression methods to these studies, using recent examples from this literature. We use a simulated example to show how these problems and misinterpretations arise. We describe the 95% limits of agreement approach and a similar, appropriate, regression technique. We discuss the…

#### 1,169 Citations

Agreed statistics: measurement method comparison.

- Medicine
- Anesthesiology
- 2012

An alternative approach, based on graphical techniques and simple calculations, is described, together with the relation between this analysis and the assessment of repeatability.

Conducting correlation analysis: important limitations and pitfalls

- Medicine
- Clinical Kidney Journal
- 2021

Why the coefficient is invalid when used to assess agreement of two methods aiming to measure a certain value, and better alternatives, such as the intraclass coefficient and Bland–Altman’s limits of agreement, are discussed.

Measurement consistency from magnetic resonance images.

- Computer Science, Medicine
- Academic radiology
- 2008

In almost all cases, using only one method is insufficient and it is recommended that several methods be used simultaneously, and in general, ANOVA performs the best.

Measurement Consistency from Magnetic Resonance Images

- 2008

Rationale and Objectives. In quantifying medical images, length-based measurements are still obtained manually. Due to possible human error, a measurement protocol is required to guarantee the…

Reliability, repeatability and reproducibility: analysis of measurement errors in continuous variables

- Medicine
- Ultrasound in obstetrics & gynecology : the official journal of the International Society of Ultrasound in Obstetrics and Gynecology
- 2008

The general concepts of agreement and reliability are distinguished to aid researchers in considering which are relevant for their particular application, and the fact that reliability depends on the population in which measurements are made, and not just on the measurement errors of the measurement method, is highlighted.

Statistical Methods: Reliability Assessment and Method Comparison

- Computer Science
- 2017

Common statistical methods for assessing reliability and agreement between methods, including the intraclass correlation coefficient, coefficient of variation, Bland-Altman plot, limits of agreement, percent agreement, and the kappa statistic, are described.

How Replicates Can Inform Potential Users of a Measurement Procedure about Measurement Error: Basic Concepts and Methods

- Medicine
- Diagnostics
- 2021

This paper encourages clearly conceptually distinguishing between investigations of the measurement error of a single measurement procedure and the comparison between different measurement procedures or observers and describes the link to the existing general statistical methodology.

Improvements in the application and reporting of advanced Bland–Altman methods of comparison

- Computer Science, Medicine
- Journal of Clinical Monitoring and Computing
- 2014

A freely available implementation, accompanied by a formal description of the more advanced Bland–Altman comparison methods, is provided, and a standard reporting format is proposed that would improve the analysis and interpretation of comparison studies.

Longitudinal MRI data analysis in presence of measurement error but absence of replicates

- Computer Science
- 2018

This article proposes a novel method for the analysis of unreplicated longitudinal data under the presence of measurement errors using mixed-effect regression and develops a new EM-Variogram technique to estimate regression coefficients as well as variance components.

The Case for Using the Repeatability Coefficient When Calculating Test–Retest Reliability

- Medicine
- PloS one
- 2013

A case is made for clinicians to consider the measurement error (ME) indices Coefficient of Repeatability (CR) or Smallest Real Difference (SRD) over relative reliability coefficients like Pearson’s r and the Intraclass Correlation Coefficient (ICC) when selecting tools to measure change and inferring that a change is true.

#### References

Statistics Notes: Measurement error and correlation coefficients

- Mathematics, Medicine
- BMJ
- 1996

This work considers the use of correlation coefficients to quantify measurement error, a variation between measurements of the same quantity on the same individual which has a simple clinical interpretation.

Measuring agreement in method comparison studies

- Mathematics, Medicine
- Statistical methods in medical research
- 1999

The 95% limits of agreement, estimated by mean difference ± 1.96 standard deviations of the differences, provide an interval within which 95% of differences between measurements by the two methods are expected to lie.
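
The calculation summarized above (mean difference ± 1.96 standard deviations of the differences) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' own code: the function name `limits_of_agreement` and the paired readings are hypothetical, and the sketch assumes one pair of measurements per subject with roughly normally distributed differences.

```python
from statistics import mean, stdev


def limits_of_agreement(method_a, method_b, z=1.96):
    """Return (bias, lower, upper) for paired measurements by two methods.

    Hypothetical helper for illustration; assumes one pair per subject.
    """
    if len(method_a) != len(method_b):
        raise ValueError("paired measurements must have equal length")
    # Differences between the two methods, subject by subject.
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)   # mean difference
    sd = stdev(diffs)    # sample SD of the differences (n - 1 denominator)
    return bias, bias - z * sd, bias + z * sd


# Made-up paired readings, purely for illustration.
a = [10.0, 12.0, 11.0, 14.0, 13.0]
b = [9.0, 12.5, 10.0, 13.0, 13.5]
bias, lower, upper = limits_of_agreement(a, b)
print(f"bias={bias:.2f}, limits=({lower:.2f}, {upper:.2f})")
```

The limits are symmetric around the mean difference by construction, so a wide interval signals poor agreement even when the mean difference itself is close to zero.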

Measurement in Medicine: The Analysis of Method Comparison Studies

- Computer Science
- 1983

This paper shall describe what is usually done, show why this is inappropriate, suggest a better approach, and ask why such studies are done so badly.

Comparing methods of measurement: why plotting difference against standard method is misleading

- Mathematics, Medicine
- The Lancet
- 1995

A plot of the difference against the average of the standard and new measurements is unlikely to mislead in this way, as is shown theoretically and by a practical example.
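
A small simulation can illustrate the point in the summary above. It is a sketch under stated assumptions, not the paper's own example: both methods are simulated as the same true value plus independent errors of equal variance, and the seed, sample size and variable names are arbitrary choices. Under those assumptions the difference shares an error term with the standard method, so it correlates with the standard measurement even though neither method is biased, while it is nearly uncorrelated with the average.

```python
import random


def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 2000

# Assumed data-generating model: same true value, independent unit-variance errors.
true_values = [rng.gauss(0.0, 1.0) for _ in range(n)]
standard = [t + rng.gauss(0.0, 1.0) for t in true_values]
new = [t + rng.gauss(0.0, 1.0) for t in true_values]

differences = [b - a for a, b in zip(standard, new)]      # new minus standard
averages = [(a + b) / 2 for a, b in zip(standard, new)]

r_vs_standard = pearson(differences, standard)  # clearly negative: shared error term
r_vs_average = pearson(differences, averages)   # close to zero

print(f"difference vs standard: r = {r_vs_standard:.2f}")
print(f"difference vs average:  r = {r_vs_average:.2f}")
```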

STATISTICAL METHODS FOR ASSESSING AGREEMENT BETWEEN TWO METHODS OF CLINICAL MEASUREMENT

- Medicine
- The Lancet
- 1986

An alternative approach, based on graphical techniques and simple calculations, is described, together with the relation between this analysis and the assessment of repeatability.

Renal volume measurements: accuracy and repeatability of US compared with that of MR imaging.

- Medicine
- Radiology
- 1999

Renal volume was underestimated with US and a comparable underestimation was found when the ellipsoid formula was applied to MR images, indicating that the inaccuracy of US renal volume measurements occurred because the kidney does not resemble an ellipsoid and was not primarily related to the imaging modality.

Statistics Notes: Some examples of regression towards the mean

- Mathematics, Medicine
- BMJ
- 1994

It is shown that regression towards the mean occurs whenever the authors select an extreme group based on one variable and then measure another variable for that group, and that the mean of the extreme group is now closer to the mean of the whole population.
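
The effect described above can be reproduced with a short simulation. This is only an illustrative sketch: the model (two noisy measurements of the same underlying value), the top-10% selection rule, the seed and all variable names are assumptions made for the example and are not taken from the cited note.

```python
import random
from statistics import mean

rng = random.Random(1)  # fixed seed so the sketch is reproducible
n = 5000

# Assumed model: two measurements of the same true value with independent errors.
true_values = [rng.gauss(0.0, 1.0) for _ in range(n)]
first = [t + rng.gauss(0.0, 1.0) for t in true_values]
second = [t + rng.gauss(0.0, 1.0) for t in true_values]

# Select an extreme group using the first variable only (roughly the top 10%).
cutoff = sorted(first)[int(0.9 * n)]
selected = [i for i in range(n) if first[i] >= cutoff]

mean_first_selected = mean(first[i] for i in selected)
mean_second_selected = mean(second[i] for i in selected)
population_mean_second = mean(second)

# The selected group is still above average on the second variable,
# but less extreme than it was on the variable used for selection.
print(f"selected group, first variable:  {mean_first_selected:.2f}")
print(f"selected group, second variable: {mean_second_selected:.2f}")
print(f"whole population, second variable: {population_mean_second:.2f}")
```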

Single X-ray absorptiometry: Performance characteristics and comparison with single photon absorptiometry

- Medicine
- Osteoporosis International
- 2005

The new single X-ray absorptiometry forearm bone densitometer described in this paper has performance characteristics which allow it to be used both for diagnostic purposes and for the follow-up of treatment.

Statistics Notes: Regression towards the mean

- Mathematics, Medicine
- BMJ
- 1994

The statistical term “regression,” from a Latin root meaning “going back,” was first used by Francis Galton in his paper “Regression towards Mediocrity in Hereditary Stature.” Galton related the…

Pulmonary emphysema: subjective visual grading versus objective quantification with macroscopic morphometry and thin-section CT densitometry.

- Medicine
- Radiology
- 1999

Systematic overestimation and moderate interobserver agreement may compromise subjective visual grading of emphysema, which suggests that subjective visual grading should be supplemented with objective methods to achieve precise, reader-independent quantification of emphysema.