Evaluation criteria
A different checklist is used for each type of study. The checklists
serve as reminders of the issues that each reviewer must evaluate. No
point-scoring system is used. Every item on the checklists is
classified as "yes", "no", "don't know" or "not applicable".
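As a minimal sketch of how such a checklist could be recorded, the
Python below stores one of the four answers per item and tallies them
without turning them into a score. The names `Answer` and
`ChecklistResult` are illustrative assumptions, not part of any tool
described here; the item texts are taken from the checklists on this
page.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum


class Answer(Enum):
    """The four classifications allowed for a checklist item."""
    YES = "yes"
    NO = "no"
    DONT_KNOW = "don't know"
    NOT_APPLICABLE = "not applicable"


@dataclass
class ChecklistResult:
    """Answers one reviewer gave for one article (hypothetical structure)."""
    answers: dict[str, Answer] = field(default_factory=dict)

    def record(self, item: str, answer: Answer) -> None:
        self.answers[item] = answer

    def tally(self) -> Counter:
        # Counts per category only; they are deliberately not combined
        # into a numeric score.
        return Counter(self.answers.values())

    def items_answered(self, answer: Answer) -> list[str]:
        return [item for item, given in self.answers.items() if given is answer]


result = ChecklistResult()
result.record("Has the study objective been clearly described?", Answer.YES)
result.record("If applicable, is the follow-up period adequate?",
              Answer.NOT_APPLICABLE)
result.record("Is the sample size adequate for the analysis performed?",
              Answer.DONT_KNOW)
print(result.tally())
```

Listing the items answered "no" or "don't know" keeps the reviewer's
attention on specific issues rather than on a total.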
The checklists used for each type of study are:
For articles on studies of frequency, case-series, risk factors,
and prognostic factors:
(Developed by the Editorial Committee of the Professional Site
of the Web of the Back taking into account the recommendations of
the University of York)
(http://www.york.ac.uk/inst/crd)
- Does the study incur any of the following biases in a way
that calls the validity of its results into question?
- Reading Bias: For example, bibliographic reference bias, in
which authors only include references supportive of their
point of view.
- Sample Definition or Selection Bias: For example, bias of
diagnostic accessibility, hospitalization rate, Berkson's
bias or non-respondent bias.
- Exposure Bias: For example, adherence to treatment bias.
- Measuring Bias: For example, bias involving memory, instruments,
or non-sensitive measurements.
- Analysis Bias: For example, exhaustive search for correlations
without a previous hypothesis.
- Interpretation Bias: For example, confusing statistical
and biological significance or equating correlation to causality.
- Are the subjects appropriate for the study objective? (If there
are several groups, this question applies to each of them)
- Are the variables appropriate for the study objective?
- Have the variables been measured adequately?
- If applicable, is the follow-up period adequate?
- If the study includes follow-up of subjects, and there is loss
to follow-up, are the results still valid despite this?
- Has the study objective been clearly described?
- Are the studied population's characteristics adequate for the
established objective? (Temporal and geographic characteristics,
sampling technique, etc.)
- If more than one group is studied, are the inclusion criteria
adequate for each group?
- Is the sample size adequate for the analysis performed?
- Are the described subjects' characteristics relevant and given
in enough detail?
- Are the described sample population's characteristics relevant
and sufficient?
- If subjective variables were used (for example, interpretation
of a radiology test or clinical records), is there a reference
to their inter-observer variability, or is it analyzed?
- If a measurement system or a rating scale was used on the relevant
variables, is there a comment or reference to its accuracy? (If
all are considered to be standard, the answer is "yes").
- If applicable, have the interpretations of the different evaluations
and measurements been performed under blind conditions?
- Was the performed statistical analysis appropriate for the established
objective? (For example, in association studies, have the potential
confounding variables been accounted for?).
- Are the results presented in adequate detail?
- Are conclusions consistent with the presented results?
Articles concerning treatments:
(Modified by the Editorial Committee of the Professional Site
of the Web of the Back, based on Koes BW, Scholten RJ, Mens JM,
Bouter LM. Efficacy of non-steroidal anti-inflammatory drugs for
low back pain: a systematic review of randomized clinical trials.
Ann Rheum Dis. 1997; 56(4):214-23)
- Are the patients' selection criteria specified?
- Treatment Allocation:
- Has treatment been assigned under random conditions?
- Has treatment allocation been performed under blind conditions?
- Were treatment groups sufficiently similar concerning the most
relevant prognostic factors? If the potential difference was taken
into consideration, answer "yes".
- Are the study and control interventions explicitly described?
- Was the treatment provider blind to the intervention?
- Are co-interventions sufficiently comparable? If they were avoided,
answer "yes".
- Is there a sufficient degree of baseline homogeneity between
all groups?
- Were patients blinded to the assigned intervention?
- Was outcome assessment blinded to interventions?
- Are the measured variables the relevant ones for the study?
- Are adverse events described?
- Is the study dropout rate described and acceptable?
- Timing of the follow-up measurements:
- Was a short-term follow-up measurement performed?
- Was a long-term follow-up measurement performed?
- Was the time of evaluation of results comparable for all groups?
- Is the sample size of each group described?
- Was an intention-to-treat analysis performed?
- Are estimators of dispersion or variability for the main outcomes
presented?
For articles of clinical-diagnostic correlation:
(Prepared by the Editorial Committee of the Professional Site
of the Web of the Back taking into account the recommendations of
the University of York)
(http://www.york.ac.uk/inst/crd)
- Are there at least two established groups to allow comparison
(symptomatic/asymptomatic, with/without findings)?
- Have the tests and/or assessments been performed appropriately?
- Is there an appropriate reference standard?
- Was the follow-up period appropriate?
- In case of withdrawals, are results still valid despite them?
- Is the proposed diagnostic test to be used properly described?
- Are the studied population's characteristics appropriate?
- Were inclusion and exclusion criteria adequate? (If they have
not been described, answer "no").
- Was the sample size of the groups adequate for the analysis
performed?
- Are patients' characteristics described?
- Is the population source described?
- Were the test results and evaluations clearly classified?
- If a specific measurement/classification was used, was the test
precision described or referenced? (If all are standards, answer
"yes").
- Were interpretations of reference standards and tests blinded?
- Was the statistical analysis performed appropriate for the established
objective?
- Are results presented in adequate detail?
- Were results interpreted correctly?
For Diagnostic Studies:
(Modified by the Editorial Committee of the Professional Site
of the Web of the Back, based on Mulrow et al. Assessing quality
of a diagnostic test evaluation. Journal of General Internal Medicine.
1989;4:288-95)
- Were diseased and nondiseased patients included?
- Was the test appropriately performed?
- Was there an appropriate reference standard?
- Was the proposed use described?
- Was the appropriate population studied?
- Were inclusion and exclusion criteria described?
- Was a wide spectrum of diseased patients included?
- Were control (nondiseased) patients with comorbid diseases included?
- Was the sample size adequate?
- Were patients' characteristics described?
- Were case (diseased) patients with comorbid diseases included?
- Was the population source described?
- For the diagnostic test, were the terms normal/abnormal clearly
defined?
- Was test precision described?
- Were interpretations of reference standards and tests blinded?
- Was the reference standard appropriately performed?
- As far as the reference standard is concerned, were the terms
normal/abnormal defined?
- Are data presented in adequate detail?
- Were uninterpretable results enumerated?
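To illustrate why the checklist asks for both diseased and nondiseased
patients and for an appropriate reference standard, here is a minimal
sketch of the usual 2x2 calculation. The function name
`diagnostic_accuracy` and the counts are hypothetical; they do not come
from any study assessed here.

```python
def diagnostic_accuracy(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Sensitivity and specificity from a 2x2 table.

    tp/fn: diseased patients (per the reference standard) with a
           positive/negative test.
    fp/tn: nondiseased patients with a positive/negative test.
    """
    diseased = tp + fn
    nondiseased = fp + tn
    if diseased == 0 or nondiseased == 0:
        # Without both groups, one of the two measures cannot be estimated.
        raise ValueError("both diseased and nondiseased patients are required")
    return {
        "sensitivity": tp / diseased,
        "specificity": tn / nondiseased,
    }


# Made-up counts: 50 diseased and 100 nondiseased patients.
example = diagnostic_accuracy(tp=45, fp=10, fn=5, tn=90)
print(example)
```

Uninterpretable results fall outside the four cells, which is one
reason the checklist asks for them to be enumerated separately.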
For systematic literature reviews and meta-analyses:
(Modified by the Editorial Committee of the Professional Site
of the Web of the Back, based on Oxman AD, Guyatt GH. Guidelines
for reading literature reviews. CMAJ. 1988;138:698-703)
- Is the objective of the review precisely stated?
- Does the source selection guarantee an exhaustive review?
- Were electronic search strategies used appropriately to locate
relevant articles?
- Among the located articles, were explicit methods used to determine
which articles to include in the review and which to exclude?
- Are these explicit methods appropriate for the objective of
the review? (Answer "don't know" if methods are not described).
- Was publication bias estimated?
- Was methodological quality of the included studies assessed?
- Was methodological assessment correct? (For example using a
previously validated, or at least a complete, quality rating scale).
- Were authors consulted for doubts about their articles?
- Is the study replicable? (Concerning the method used, independently
of whether it would obtain the same results or not).
- Have measures been taken to control bias during the review process
of the included articles? (For example, various independent reviewers
for each article).
- Are results of each of the included studies appropriately presented?
- Was variation in the findings of the relevant studies analyzed?
- If there is no evidence of an effect, is this distinguished
from "evidence of no effect"?
- In meta-analyses, was heterogeneity of the relevant studies
evaluated? (Concerning methodology, study population, variables,
measuring devices, etc).
- If the studies included were heterogeneous, are results presented
in subgroups?
- In meta-analyses, were the findings of the primary studies combined
appropriately?
- If applicable, were sensitivity analyses performed?
- Are guidelines for further reviews recommended?
- Were the reviewers' conclusions supported by the data cited?
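One common way a meta-analysis evaluates statistical heterogeneity is
Cochran's Q together with the I² statistic. The sketch below assumes
inverse-variance (fixed-effect) weights; it is an illustration of what
a reviewer might look for, not a method prescribed by this checklist,
and the effect sizes are invented.

```python
def heterogeneity(effects: list[float],
                  variances: list[float]) -> tuple[float, float, float]:
    """Return (pooled effect, Cochran's Q, I^2 in percent).

    Uses inverse-variance weights, i.e. a fixed-effect pooled estimate.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    # Q: weighted squared deviations of each study from the pooled effect.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I^2: share of total variation attributed to heterogeneity, floored at 0.
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i_squared


# Three hypothetical studies with equal variance.
pooled, q, i_squared = heterogeneity([0.2, 0.5, 0.8], [0.04, 0.04, 0.04])
print(f"pooled={pooled:.2f} Q={q:.2f} I2={i_squared:.1f}%")
```

This only addresses statistical heterogeneity; differences in
methodology, study population, variables, or measuring devices still
have to be judged from the review itself.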
For studies on economic aspects:
(Prepared by the Editorial Committee of the Professional Site
of the Web of the Back based on M. F. Drummond, "Principles of economic
appraisal in health care", and McMaster Health Science Center, "How
to read clinical journals: VII. To understand an economic evaluation
(Parts A and B)", Can. Med. Assoc. J. 1984).
- Is the study objective clearly stated?
- Are objectives quantifiable?
- Is it clearly specified whether the study operates at a micro or
a macro level?
- Does it clearly mention from whose (hospital, insurance company,
society, government, etc.) perspective the study was carried out?
- Was an important alternative to the proposed technology omitted?
(For example, results without applying the technology, placebo,
other technology, changes in lifestyle, no intervention).
- Were costs and consequences of the proposed technology analyzed?
- Were several relevant alternatives compared?
- Was economic assessment thoroughly performed?
- Were relevant levels at which costs and outcomes are produced
clearly identified? If costs and outcomes were estimated, were
the assumptions clearly presented?
- Was the calculation method explicitly stated?
- Were measurement units clearly defined?
- Were appropriate quantitative (or qualitative) values assigned
to the different analysis points?
- Were average and marginal costs and consequences differentiated?
- Was the effect of time on costs and benefits considered?
- Was the discount rate applied for this purpose justified?
- Were the value ranges of the relevant variables in the sensitivity
analysis justified?
- Was sensitivity of results to changes in the variables considered?
- Are the conclusions consistent with the results?
- Are the results discussed in a useful manner for decision-makers?
- Are the results discussed exhaustively?
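Three of the items above (average versus marginal costs, the time
factor, and the discount rate) rest on simple arithmetic, sketched
below. The function names, the cost figures, and the 5% rate are
assumptions chosen for illustration; the checklist does not prescribe
any particular rate.

```python
def present_value(amounts: list[float], rate: float) -> float:
    """Discount a stream of yearly amounts; amounts[0] occurs now (year 0)."""
    return sum(a / (1.0 + rate) ** year for year, a in enumerate(amounts))


def average_cost(total_cost: float, units: float) -> float:
    """Cost per unit over everything produced."""
    return total_cost / units


def marginal_cost(cost_before: float, cost_after: float,
                  units_before: float, units_after: float) -> float:
    """Extra cost per extra unit when output changes."""
    return (cost_after - cost_before) / (units_after - units_before)


# Hypothetical cost of 1000 per year for three years.
yearly_costs = [1000.0, 1000.0, 1000.0]
undiscounted = present_value(yearly_costs, 0.0)
discounted = present_value(yearly_costs, 0.05)   # assumed 5% rate

# Hypothetical programme: 100 patients cost 50000, 110 patients cost 52000.
avg = average_cost(50000.0, 100)
marg = marginal_cost(50000.0, 52000.0, 100, 110)
print(undiscounted, round(discounted, 2), avg, marg)
```

Re-running `present_value` over a range of rates is the simplest form
of the sensitivity analysis the checklist asks about.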
Quality Control Mechanisms:
- 20% of all articles are automatically submitted to at least two
reviewers, and the consistency of their evaluations is assessed.
- Along with this systematic method, articles are sent to at least
two reviewers when:
- The Editorial Committee foresees there may be a risk of
inconsistent evaluations.
- Any reviewer considers that a review should be performed
by more than one evaluator.
- All reviewers meet periodically to guarantee the best possible
homogeneity during the review process.
- The assessment and methodology criteria are submitted to the
scrutiny of the international scientific community through
publication in scientific journals and discussion in forums involved
in systems of scientific evidence assessment. Review criteria are
revised periodically according to the reviewers' input and to updates
in the criteria of the international scientific community.
- The Editorial Committee systematically reviews 20% of the articles
assessed, regardless of whether they were approved or discarded.