07. Evaluating the Evidence

What are the results of the study? (Therapeutic)

Therapeutic Studies: What is the treatment effect?

 

|              | Yes Outcome | No Outcome | Total   |
|--------------|-------------|------------|---------|
| Experimental | a           | b          | a+b     |
| Control      | c           | d          | c+d     |
| Total        | a+c         | b+d        | a+b+c+d |

 

  • Risk of outcome in experimental group = a/(a+b)
  • Risk of outcome in control group = c/(c+d)
  • Odds of outcome in experimental group = [a/(a+b)]/[b/(a+b)] = a/b
  • Odds of outcome in control group = [c/(c+d)]/[d/(c+d)] = c/d
  • Odds ratio (OR): odds of the outcome in the experimental vs control group = (a/b)/(c/d) = ad/bc
  • Relative risk (RR): the risk in the experimental group divided by the risk in the control group = [a/(a+b)]/[c/(c+d)]
  • Relative risk reduction (RRR): proportional reduction in risk of the outcome of interest = 1-RR
  • Absolute risk reduction (ARR) or risk difference: difference in risk between groups (not proportional) = [c/(c+d)]-[a/(a+b)]
    • Takes into account the underlying risk of the outcome in the population (control group)
    • Even a large RRR may be less clinically relevant if the underlying risk in the population is minimal
    • e.g. if a treatment has an RRR of 50% for preventing MI but the risk of MI in the population is only 1%, then ARR = 0.5%
  • Absolute risk increase (ARI): difference in risk for an unfavorable outcome or adverse event
  • Number needed to treat (NNT): number of patients that need to be treated to lead to one favorable outcome or to prevent one additional bad outcome =1/ARR
    • Example: NNT = 100 if we prevent 1 case of colon cancer for every 100 patients who undergo colonoscopy
  • Number needed to harm (NNH): number of patients that need to be treated to lead to one additional unfavorable outcome =1/ARI
    • Example: NNH = 1000 if we cause 1 case of perforation for every 1000 patients who undergo colonoscopy
  • Time to benefit (TTB): duration of treatment required to detect a measurable benefit. May help decide when chronic treatments may no longer be indicated.
    • Example: Treatment may not be indicated if a patient’s life expectancy is less than TTB.
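The measures above can be sketched in a few lines of code. This is a minimal illustration using hypothetical trial counts (the function name and numbers are not from the source):

```python
# Effect measures from a 2x2 table laid out as in the table above:
# rows = experimental/control groups, columns = outcome yes/no.

def two_by_two_measures(a, b, c, d):
    """Return risk-based effect measures from 2x2 table counts."""
    risk_exp = a / (a + b)          # risk of outcome, experimental group
    risk_ctl = c / (c + d)          # risk of outcome, control group
    rr = risk_exp / risk_ctl        # relative risk
    or_ = (a / b) / (c / d)         # odds ratio = ad/bc
    rrr = 1 - rr                    # relative risk reduction
    arr = risk_ctl - risk_exp       # absolute risk reduction
    nnt = 1 / arr if arr != 0 else float("inf")  # number needed to treat
    return {"RR": rr, "OR": or_, "RRR": rrr, "ARR": arr, "NNT": nnt}

# Hypothetical trial: 10/100 outcomes on treatment vs 20/100 on control.
m = two_by_two_measures(a=10, b=90, c=20, d=80)
print(m)  # RR = 0.5, RRR = 0.5, ARR = 0.1, NNT = 10
```

Note how the 50% relative risk reduction translates into an absolute risk reduction of only 10%, which is what drives the NNT.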

What are the results of the study? (Diagnostic)

See Sections on Measures of Diagnostic Test Performance and Likelihood Ratios.

How Precise are the Results?

P-value is a statistical measure of the probability of observing a result at least as extreme as the one measured, assuming the null hypothesis is true. Most commonly, a cutoff of p=0.05 is used, meaning that there is a <5% chance that the measured result occurred by chance alone.

Multiple hypothesis testing describes the phenomenon that when many comparisons are drawn from a single dataset, applying the usual significance threshold to each comparison individually inflates the chance that at least one "significant" result arises purely by chance, leading to an increased false positive rate.

A Bonferroni correction adjusts for the increased likelihood of falsely rejecting the null hypothesis by dividing the desired level of significance by the number of hypotheses being tested.
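A minimal sketch of the Bonferroni correction, using hypothetical p-values (the numbers are illustrative, not from the source):

```python
# Bonferroni correction: divide the desired significance level by the
# number of hypotheses tested, giving a stricter per-test threshold.

alpha = 0.05    # desired family-wise significance level
p_values = [0.001, 0.004, 0.012, 0.03, 0.20, 0.45, 0.60, 0.75, 0.80, 0.90]

alpha_corrected = alpha / len(p_values)   # 0.05 / 10 = 0.005

# Without correction, four comparisons would look "significant" at 0.05;
# after correction, only two survive.
significant = [p for p in p_values if p < alpha_corrected]
print(alpha_corrected, significant)  # 0.005 [0.001, 0.004]
```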

Confidence interval gives an estimated range of sampled values that is likely to include a true population value since we cannot measure the entire population.

  • A 95% CI means that if the study were repeated many times, 95% of the computed intervals would contain the true population value.
  • The CI narrows with:
    • Larger sample size
    • Reduced data variability (e.g. more precise measurement)
    • A less stringent significance level (e.g. p=0.1 instead of 0.05), which corresponds to a lower CI percentage (e.g. a 90% CI instead of a 95% CI)
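The factors above can be seen in the normal-approximation formula for the half-width of a CI for a mean, z·sd/√n. A small sketch with hypothetical numbers:

```python
import math

# Half-width of a normal-approximation CI for a mean: z * sd / sqrt(n).
# Larger n, smaller sd, or a smaller z (lower CI percentage) all narrow it.

def ci_half_width(sd, n, z=1.96):
    """Half-width of a CI for a mean (z = 1.96 for 95%, 1.645 for 90%)."""
    return z * sd / math.sqrt(n)

base = ci_half_width(sd=10, n=25)                 # baseline 95% CI
larger_n = ci_half_width(sd=10, n=100)            # larger sample: narrower
less_spread = ci_half_width(sd=5, n=25)           # less variability: narrower
ninety_pct = ci_half_width(sd=10, n=25, z=1.645)  # 90% CI: narrower
print(base, larger_n, less_spread, ninety_pct)
```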

Are the Results Valid?

Biases are systematic errors that affect data. Subtypes of bias include:

  • Selection bias—systematic differences between individuals who participate in a study versus those who do not. Examples include: sampling bias, susceptibility bias, attrition bias.
  • Measurement bias—systematic errors in data collection. Examples include: information bias, recall bias.
  • Bias may be difficult to manage and does not necessarily invalidate a study, but you should account for it in your analysis.
    • Assess type of bias and direction of the bias
    • What was the impact of the bias on the outcomes and conclusions?
    • Were the researchers aware of potential bias and did they attempt to minimize impact?
  • Basic methods to minimize bias:
    • Rigorous evaluation of study design (e.g. review committee, ethics board).
    • Blinding or allocation concealment preserves study design efforts to minimize the introduction of bias.
    • Intention to treat: Analyzing participants according to group assigned regardless of drop out, loss to follow-up, crossover.

Confounding happens when a variable is associated with both the exposure and the outcome, but is not a part of the causal pathway.

Effect Modification occurs when the magnitude of an exposure on an outcome depends upon a third variable (e.g. interaction, synergy, or antagonism).

Basic methods to minimize confounding at design stage:

  • Randomization: Randomly allocate participants to treatment groups to equally distribute known and unknown confounders and minimize the systematic differences in groups.
  • Matching: If randomization is not possible, attempt to match characteristics of participants in the different groups, e.g. in case-control studies.
  • Restriction: Limits participation in study to individuals who are similar in relation to a confounder.

Basic methods to minimize confounding at analysis stage:

  • Stratification: Examine results by strata of the confounding variable.
  • Standardization: Commonly used to control for effects of age and sex.
  • Multivariate analysis: Identify significant confounders and adjust the outcome measure according to their effects.
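The value of stratification can be illustrated with a small sketch (the counts are hypothetical): within each stratum of a confounder the risk ratio is 1.0, yet pooling the strata produces a large crude risk ratio.

```python
# Stratification: stratum-specific risk ratios can differ sharply from
# the crude (pooled) ratio when a confounder is unevenly distributed
# across groups.

def risk_ratio(a, b, c, d):
    """RR from 2x2 counts: (a/(a+b)) / (c/(c+d))."""
    return (a / (a + b)) / (c / (c + d))

# Two strata of a confounder (e.g. age); within each, RR = 1.0.
strata = [(90, 10, 9, 1),    # high-risk stratum
          (1, 9, 10, 90)]    # low-risk stratum

stratum_rrs = [risk_ratio(*s) for s in strata]

# A crude analysis pools the strata and suggests a strong effect.
a, b, c, d = (sum(col) for col in zip(*strata))
crude_rr = risk_ratio(a, b, c, d)
print(stratum_rrs, crude_rr)  # RR = 1.0 in each stratum, crude RR ~ 4.8
```

Examining the results by strata reveals that the apparent association is driven entirely by the confounder, not by the exposure.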

Watch Out! Common pitfalls include:

  • Conflicts of interests
  • High rates of loss-to-follow up or short follow-up that may have missed outcomes
  • Lack of intention-to-treat analysis, e.g. ignoring all dropouts and non-responders
  • Performing multiple comparisons on the data
  • Extensively analyzing subgroups or hypotheses that were not originally intended in study design
  • Not adjusting for baseline differences between groups
  • Not reporting measures that are equivocal or “not positive”, e.g. confidence intervals