Longer online version. This is a draft and awaits peer review.

Reporting statistics right

HAVE YOU ever wanted to report medical trials but been too worried about the numbers to know where to start? Dr Susan Mayor, an award-winning medical journalist, addressed LFB's May meeting to demystify the discipline and show members how to interpret press releases from drugs companies.

Video (MP4 file): Susan Mayor explains statistics. © Rob Emmett

Susan herself undertook post-doctoral research for some years, before retraining as a journalist. She has been news editor of the British Medical Journal as well as managing editor of both the British Journal of Primary Care Nursing and the Primary Care Cardiovascular Journal. She has won the Medical Journalists' Association's Medical Journalist of the Year award twice. Her book Finding the Right Molecule, which traces the development of the drug Pradaxa, won the Communiqué award.

But what hope is there for freelances who have not chosen to benefit from a scientific training? Quite a lot, Mayor insists. The first medical research was not a high-powered jargon-infested affair. In fact, the first documented medical trial is recorded in the Bible: Nebuchadnezzar II ordered Daniel's elite students to eat meat to improve their scholastic performance; Daniel arranged for a comparative trial of his vegetable diet against servants eating the king's rich food. Daniel's boys did best, as he expected, and the results were written up in the Book of Daniel. Whether or not the report was adequately peer-reviewed is perhaps a question best left to theologians.

A while later, in 1537, we find the first recorded investigation of a novel therapy. The French surgeon Ambroise Paré compared two forms of remedial treatment upon the battlefield. Unsurprisingly, a mixture of egg yolk, oil of roses and turpentine was found to be more efficacious than boiling oil - for all purposes, presumably, other than killing enemies.

By 1747 we had the first controlled clinical trial: James Lind set out to find a solution to the perennial problem of scurvy at sea. However, although he published his findings in 1753, the British Navy did not change its dietary policies for nearly fifty years after the trial, because oranges and lemons were too expensive. Eventually it served up limes to its sailors, who were thereafter known as "limeys".

The first use of a placebo was recorded in 1863, and in 1948 the first randomised double-blind trial was conducted: the Medical Research Council's trial of the antibiotic streptomycin for curing pulmonary tuberculosis. The results, written up in the British Medical Journal, were designed to be extrapolable to patients with the same condition who were not part of the trial. In order to do this the researchers needed to compare the test subjects with a control group, so as to eliminate chance or bias as much as possible.

Photo: Susan Mayor explains statistics. © Hazel Dunlop

Dr Mayor elucidated some key terms relating to types of trials:

  • Controlled trial - one undertaken for comparative purposes, so that any difference in results can be attributed to the new treatment. For example, in the scurvy trial the standard sailors' diet was compared with a diet containing vitamin C-rich fruit.
  • Placebo - a dummy tablet or injection. Placebo trials compare the effect of this with that of the treatment being investigated. The "placebo effect" is the observation that people often do better with a placebo "treatment" than with nothing at all: this may be due to better care from health service professionals or to the psychological effect of subjects believing they are trying beneficial new treatments.
  • Ethical standards often require that a new treatment only be compared with existing treatments, not against a dummy. Less benefit may be seen, but the results will be less damaging and more realistic, as sufferers would typically be treated with existing drugs anyway.
  • Non-inferiority trials ensure that a new treatment is not worse than existing treatment, although it may be just the same. All clinical trials, however, are ethically supposed to benefit the people taking part.
  • Randomised trials allocate people randomly to different treatment groups, to eliminate biases (such as people who are more sick being given different treatment). Randomness is nowadays achieved by computer-generated random number sequences.
  • Blinded trials - single-blinded trials are studies in which participants do not know which treatment they are receiving. In double-blinded trials, neither the participants nor the researchers know which is which while the research is being conducted, to eliminate bias that may arise if researchers know who is getting what. Blinded assessments, such as assessing scans of tumours, do not reveal identities or other characteristics of subjects. Treatments with noticeable side effects can be difficult or impossible to blind.
  • Open-label study - all subjects know which treatment they are testing. This may be unavoidable where the treatments compared differ in type, for example tablet versus an injection.
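
As an illustration of the computer-generated allocation mentioned above, here is a minimal Python sketch. It is not how any particular trial was run: the participant labels, the allocate function and the seed are all invented for the example, and real trials typically use more elaborate schemes such as block or stratified randomisation.

```python
import random

def allocate(participants, seed=None):
    """Shuffle participants and split them into two arms of (near) equal size.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)      # a fixed seed makes the allocation repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)          # the order is now random, not chosen by a person
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

# Twenty invented participant labels, P01 to P20
groups = allocate([f"P{i:02d}" for i in range(1, 21)], seed=2015)
print(len(groups["treatment"]), len(groups["control"]))  # 10 10
```

Because the shuffle decides who goes where, nobody can steer sicker patients towards one arm or the other.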

Some types of trial yield more reliable data than others. In decreasing order of reliability are:

  • Systematic review, or meta-analysis, which combines the results of many trials
  • Randomised controlled trial
  • Other controlled trial
  • Observational studies
  • Cohort studies
  • Case-control studies
  • Case studies, anecdotes and personal opinion

Making sense of the numbers

There are three key numbers which you should check when looking at the report of any trial:

  1. Size of the sample: how big is it? Small samples reduce the power of a study, so a power calculation is made at the beginning to work out how many subjects are needed. The number can be low for trials involving rare diseases. For instance, 150 subjects may be acceptable for a study into a rare cancer, whereas 4000 might be expected for a cardiovascular study.
  2. How long was the trial run for? Follow-up research may be necessary.
  3. Completeness of follow-up. Quite a few subjects drop out.
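
To give a feel for why sample sizes vary so much, the sketch below uses one common textbook approximation for comparing event rates in two groups. Everything in it is an assumption made for illustration: the function names, the event rates, the conventional 5 per cent significance level and 80 per cent power. Trial statisticians use more careful methods.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(rate_control, rate_treatment, alpha=0.05, power=0.8):
    """Approximate subjects needed per group to detect a difference in two event rates.

    Normal-approximation formula; hypothetical helper for illustration only.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_power = NormalDist().inv_cdf(power)          # allowance for the desired power
    spread = rate_control * (1 - rate_control) + rate_treatment * (1 - rate_treatment)
    effect = rate_control - rate_treatment
    return ceil((z_alpha + z_power) ** 2 * spread / effect ** 2)

def allow_for_dropouts(subjects, dropout_rate):
    """Inflate recruitment so enough subjects remain after the expected drop-outs."""
    return ceil(subjects / (1 - dropout_rate))

# A large difference between common events needs far fewer subjects ...
print(sample_size_per_group(0.45, 0.32))
# ... than a small difference between rarer events.
print(sample_size_per_group(0.10, 0.08))
# If a quarter of subjects are expected to drop out, recruit more at the start.
print(allow_for_dropouts(150, 0.25))  # 200
```

The smaller the difference a trial hopes to detect, the more subjects it needs, which is one reason to be wary of small trials claiming modest benefits.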

How results are calculated

Key concepts to use when checking reports of studies include:

  • Was the study carried out "per protocol" - that is, was the planned method followed?
  • Mean: the average value (add all numbers and divide by the number of numbers)
  • Median: the middle value of a range of numbers. [If 5 people have incomes of 20k, 21k, 22k, 25k and 100k the median is 22k but the mean is 37.6k which is less representative!]
  • Hazard ratio - often more useful than measures such as median survival, as it uses data from all participants, including those who fail to complete the trial. It is defined as:
    (chance of the hazard in treatment group)
    divided by
    (chance of the hazard in control group).
    For example, in a trial of a new treatment for cancer the hazard ratio would be (rate at which cancer recurs in the treatment group) divided by (rate at which cancer recurs in the control group). A hazard ratio of 0.66 means there is a 34 per cent reduction of hazard with the treatment.
  • Confidence interval: this is a measure of how precisely the trial has estimated the effect of a treatment, and so of how far the measured benefit can be extrapolated to people outside the trial. A 95 per cent confidence interval is a range calculated so that, if the trial were repeated many times, about 95 per cent of such ranges would contain the true effect. The narrower the interval, the more precise the estimate.
  • P-value: how likely is it that the results of the study are just chance? The calculated "p-value" is the probability of seeing an effect at least as large as the one observed if the treatment in fact made no difference. If a study has a p-value of 0.5 it is no use, as that simply means there are even odds (50:50) of such a result turning up at random when the treatment does nothing. A result is often taken to be "significant" if the p-value is less than 0.05 - that is, there is less than a one-in-twenty chance of randomness alone producing it.
  • Absolute risk reduction: measures the incidence of hazard (for example death) in subjects receiving the new treatment compared to that for regular treatment, expressed as a percentage. (So if 32 per cent of the subjects on the new treatment die and 45 per cent of those on the old treatment die, the absolute risk reduction is 45 minus 32 = 13 per cent.)
  • Relative risk reduction: the proportion by which an event rate is reduced. Often more useful. (Express the risks as proportions, not percentages, and divide to get the relative risk. In the above example the relative risk is 0.32 divided by 0.45 = 0.71 - in plain English people on the new treatment are about seven-tenths as likely to die. The relative risk reduction is 1 minus 0.71 = 0.29, or 29 per cent.)
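
The mean, median and hazard ratio figures quoted in the list above can be checked in a few lines of Python. This is only a sketch: the percent_reduction helper is an invented name, and the incomes are the ones from the example in the list.

```python
from statistics import mean, median

# The five incomes from the example: one high earner drags the mean upwards
incomes = [20_000, 21_000, 22_000, 25_000, 100_000]
print(mean(incomes))    # 37600
print(median(incomes))  # 22000

def percent_reduction(hazard_ratio):
    """Convert a hazard ratio into the percentage reduction in hazard it implies."""
    return round((1 - hazard_ratio) * 100, 1)

print(percent_reduction(0.66))  # 34.0
```

A hazard ratio above 1 would give a negative figure here, meaning the hazard was higher with the treatment than without it.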

It is often good to give the relative risk reduction at the start of a health story, but the absolute risk reduction is now more often reported.
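
A short sketch shows why it matters which risk figure a story leads with. The first set of death rates (45 per cent and 32 per cent) is the example used above. The second set, ten times rarer, is invented purely for comparison, as is the risk_summary function name. Note that the relative risk is the new rate divided by the old, and the relative risk reduction is 1 minus that.

```python
def risk_summary(risk_old, risk_new):
    """Return absolute risk reduction, relative risk and relative risk reduction.

    Risks are given as proportions (0.45 rather than 45 per cent).
    """
    relative_risk = risk_new / risk_old
    return {
        "absolute_risk_reduction": round(risk_old - risk_new, 3),
        "relative_risk": round(relative_risk, 2),
        "relative_risk_reduction": round(1 - relative_risk, 2),
    }

common = risk_summary(0.45, 0.32)    # the example from the text
rare = risk_summary(0.045, 0.032)    # hypothetical: same treatment effect, rarer event

print(common)  # absolute 0.13, relative risk 0.71, relative reduction 0.29
print(rare)    # absolute 0.013, relative risk 0.71, relative reduction 0.29
```

Both cases can honestly be reported as "cuts deaths by 29 per cent", yet in the second only about one extra person in a hundred benefits, which is why the absolute figure belongs in the story too.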

To summarise: check the mean and median, hazard ratio, confidence interval and p-value of any study upon which you are basing a story.
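
For readers who want to see where a confidence interval and p-value come from, here is a rough sketch using the death rates from the example above (45 per cent on the old treatment, 32 per cent on the new). The group sizes of 100 and 400 are invented, the function name is hypothetical, and the method is a simple normal approximation rather than whatever a given paper actually used.

```python
from math import sqrt
from statistics import NormalDist

def compare_death_rates(deaths_old, n_old, deaths_new, n_new, level=0.95):
    """Return (difference in death rates, confidence interval, two-sided p-value)."""
    p_old = deaths_old / n_old
    p_new = deaths_new / n_new
    difference = p_old - p_new  # the absolute risk reduction

    # Confidence interval for the difference (normal approximation)
    se = sqrt(p_old * (1 - p_old) / n_old + p_new * (1 - p_new) / n_new)
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    interval = (difference - z_crit * se, difference + z_crit * se)

    # P-value: how often a gap this large would arise if both treatments were equal
    pooled = (deaths_old + deaths_new) / (n_old + n_new)
    se_null = sqrt(pooled * (1 - pooled) * (1 / n_old + 1 / n_new))
    z = difference / se_null
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return difference, interval, p_value

small = compare_death_rates(45, 100, 32, 100)    # hypothetical trial, 100 per group
large = compare_death_rates(180, 400, 128, 400)  # same percentages, 400 per group

for label, (diff, (low, high), p) in (("small", small), ("large", large)):
    print(f"{label}: difference {diff:.2f}, 95% CI {low:.3f} to {high:.3f}, p = {p:.4f}")
```

Under these assumptions the smaller trial's interval stretches just below zero and its p-value sits just above 0.05, while the larger trial, with exactly the same percentages, is clearly significant. The headline figures alone do not tell you which situation you are in.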

A checklist

Finally, Dr Mayor offered prospective medical journalists a checklist for analysing research papers:

  • What question is the research asking? Is it useful?
  • What were the main findings?
  • How meaningful are these findings? (Look at the confidence interval and p-value.)
  • Are the findings credible?
  • Who carried out the research? Are they reputable?
  • Where was the research published? Was it peer-reviewed?
  • Who funded the research? This is typically disclosed at the end of a study. Could funder identity have affected interpretation of results?
  • What is the significance of the results for health care professionals and patients with the disease in question?

For those with a further interest, Dr Mayor recommended a small booklet, What is a p-value anyway? and the Sense About Science website www.senseaboutscience.org. She also recommended the book Medical Statistics Made Easy.

Last modified: 30 May 2015 - © 2015 contributors
The Freelance editor is elected by London Freelance Branch and responsibility for content lies solely with the editor of the time
Send comments to the editor: editor@londonfreelance.org