Finding the balance in clinical trial eligibility between too strict, not strict enough, and “just right”
To assess a drug or device’s safety and efficacy, researchers and regulators often rely on randomized, controlled clinical trials. And, to ensure that those trials provide reproducible, valuable results with minimal risk to the trial participants, investigators designing the protocols incorporate strict inclusion and exclusion criteria. Many have called into question whether those criteria are too strict, and whether the populations enrolled in randomized clinical trials accurately reflect their real-world counterparts.
“Investigators imagine a question and apply it to a very focused patient population that they believe has the greatest chance of showing an effect size and the least chance of having any complication or harm from the intervention,” Robert Fowler, MD, MS, a senior scientist at Sunnybrook Health Sciences Centre in Toronto, explained to ASH Clinical News. “Often that means that investigators end up excluding large groups of the general patient population who might be potentially eligible for a clinical trial.”
When designing clinical trials, it can be difficult to balance internal validity (whether a treatment will work within the defined study population) with external validity (whether the results of a trial can be generalized to other populations or settings). “The challenge and tradeoff is that the generalizability of the trial often takes a hit because the intervention is studied on patients [who] may not represent the patient population to whom you would ideally want to apply the trial results,” Dr. Fowler noted.
Internal validity is often determined by controlling variables so that only the tested intervention can affect outcomes. “Investigators typically have many patients who are potentially eligible for a trial – they meet the inclusion criteria – but then also meet one of the many exclusion criteria,” Dr. Fowler said. “In these cases, investigators end up with a population of eligible participants that is less than 10 percent of the size of the original population. It is one of the most common challenges of eventually being able to apply a trial finding back to the patient population.”
ASH Clinical News recently spoke with Dr. Fowler and other health-care professionals about clinical trial inclusion and exclusion criteria and the challenges of designing a clinical trial that is both internally and externally valid.
Common Exclusion Criteria
Every clinical trial has an established set of inclusion and exclusion criteria that has been approved by scientific and institutional review boards, as part of the study protocol. The selection of these criteria is influenced by multiple stakeholders.
First, the U.S. Food and Drug Administration (FDA) requires that any clinical study report its criteria for exclusion at study entry and provide a rationale for those criteria. Investigators also are required to discuss the effect of these exclusion criteria on the generalizability of the study. “The FDA has to make sure that a clinical investigation is enrolling patients who will adequately test the efficacy and safety of a drug,” said Abby Statler, MPH, MA, from the Taussig Cancer Institute at Cleveland Clinic.
Second, the clinical trial investigators, who serve as key opinion leaders, work with the clinical trial’s sponsor to establish appropriate exclusion criteria for a study, Ms. Statler added.
The last piece of the protocol puzzle is often related to the economics of running a clinical trial. “Clinical trials are very expensive to run, and some studies cost hundreds of thousands of dollars a day to keep open, so researchers look for ways to accrue patients quickly,” Ms. Statler said.
However, despite the thousands of ongoing clinical trials that are testing a wide variety of drugs for a wide variety of diseases, many clinical trials use a set of criteria that is fairly universal, according to Bradford Hirsch, MD, MBA, senior medical director at Flatiron Health in New York. In 2015, Dr. Hirsch and colleagues published a study that evaluated whether patients with metastatic renal cell carcinoma being treated with certain drugs in a routine oncology practice were similar to the patients enrolled in the clinical trials evaluating those drugs.1 They found that almost 40 percent of the real-world patients would have been excluded from phase III clinical trials of the drug they were receiving. Furthermore, patients in the real-world setting were sicker: They were more likely to have poor-risk disease and to have impaired performance status compared with the patients enrolled in clinical trials.
“Common exclusion criteria include things like adequate organ function, white blood cell count, kidney function, creatinine, HIV status, brain metastases, or age,” Dr. Hirsch explained.
When researchers conducted a systematic review of randomized clinical trials published in major medical journals, they found that common medical conditions formed the basis for excluding patients in 81 percent of the 283 trials included in the analysis.2 More than half of the trials (54.1%) excluded patients receiving commonly prescribed medications, and about 40 percent excluded women with conditions related to female sex. Overall, less than half of the exclusion criteria used (47.2%) were graded as “strongly justified” by the investigators.
“Drug intervention trials or large multicenter trials were more likely to have greater numbers of exclusion criteria than studies that did not investigate drug interventions or were single-center studies,” said Dr. Fowler, who was an investigator on the study. Industry-sponsored trials and drug intervention trials were more likely to exclude individuals due to concomitant medication use and medical comorbidities than non–industry-sponsored and non-drug intervention trials.
“Most exclusion criteria for medical conditions or comorbid conditions were poorly justified,” Dr. Fowler added, “and there was a signal that if a trial was supported by industry it was more likely to have poorly justified exclusion criteria.”
Loosening the Reins
Another review of randomized clinical trials in hematologic malignancies, conducted by Ms. Statler and colleagues and presented at the 2015 ASH Annual Meeting, showed that hematologic malignancy trials tend to exclude patients regardless of the adverse events (AEs) that would be expected (or ultimately observed) with the drug class being investigated.3 The researchers looked at 98 phase II and III trials published between January 2010 and January 2015, including 12 trials that contributed data leading to a label change or a new FDA approval. Then, they collected AE data from package inserts or published manuscripts and compared them with corresponding trial exclusion criteria.
Using the threshold applied in medication labels (reported AEs that occurred in ≥10% of trial participants), Ms. Statler and colleagues also calculated the binomial probability for each AE, particularly those relevant to the most commonly used organ function exclusion criteria (hepatic, kidney, cardiac, and neurologic).
Similar to the findings published by Dr. Hirsch and colleagues, most trials applied exclusion criteria related to:
- medical comorbidities (e.g., active or prior cancer, previous cardiac conditions, HIV infection, hepatitis, or psychiatric disease): 97% of studies
- inadequate organ function: 89%
- poor performance status: 67%
“Exclusion criteria relevant to baseline organ function did not reflect established safety profiles,” the authors noted, with the proportion of studies excluding patients with specific organ dysfunction far outweighing the proportion of drug classes with known toxicities in these realms.
For example, 73.5 percent of studies had exclusion criteria for renal toxicities, while only 50 percent of the drug classes being investigated had known renal toxicities (p<0.0001). This also was true for cardiac toxicities (74.5% vs. 62.5%; p=0.02) and hepatic toxicities (87% vs. 75%; p=0.007). “For hepatic eligibility criteria, there were 14 studies that did not include any drugs with hepatic toxicities, but excluded patients with hepatic abnormalities at baseline,” Ms. Statler said.
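The comparisons above can be illustrated with a small binomial calculation. The sketch below is not the authors' analysis, since the abstract summary here does not spell out their exact method. It simply asks how likely it would be to see at least the observed number of trials with a renal exclusion criterion if the true rate matched the 50 percent of drug classes with known renal toxicity. The count of 72 trials is an assumption derived from 73.5 percent of 98 trials, and the one-sided test is an illustrative choice, so the result should not be expected to reproduce the reported p values.

```python
from math import comb


def binomial_upper_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    # Sum the exact binomial probabilities from k through n.
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_TRIALS = 98  # number of phase II and III trials in the review

# Assumption: 73.5 percent of 98 trials corresponds to 72 trials.
renal_excluding = round(0.735 * N_TRIALS)

# Reference rate: 50 percent of drug classes had known renal toxicities.
renal_tail = binomial_upper_tail(N_TRIALS, renal_excluding, 0.50)

print(f"Trials with a renal exclusion criterion: {renal_excluding} of {N_TRIALS}")
print(f"One-sided binomial tail probability: {renal_tail:.2e}")
```

A very small tail probability is consistent with the authors' conclusion that organ-function exclusions were applied more often than the known toxicity profiles would suggest.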
Burdensome eligibility criteria of this kind, which severely limit the number of patients who are eligible for a study, are a widespread problem for the clinical trial enterprise. A 2015 study found that almost one in five National Clinical Trials Network-sponsored trials initiated between 2000 and 2011 were closed because of low accrual.4
“We often call patients on clinical trials ‘the Olympic athletes of patients’ because they are the most clinically fit of all patients. They have very few comorbidities, and very high performance levels,” Ms. Statler said. “They are the best of the best and that does not reflect the patients walking through the doors of our clinics every day.”
Randomized clinical trials may be the benchmark for drug evaluation, but the true effectiveness of a drug cannot be determined until after its approval, once it has been used in real-world patients, according to Sonal Singh, MBBS, MPH, associate director of the Methods Program at the Center for Public Health and Human Rights at Johns Hopkins Bloomberg School of Public Health.
Prior to drug approval, it is important to know that the drug works in a small subset of the population and what its effects are in the most vulnerable populations, he noted.
“Robust observational studies provide high-quality evidence on intended effects in the real world – where patients may be sicker and less adherent – and effect sizes are generally smaller,” Dr. Singh said. “Regarding safety, observational studies can provide even more information than clinical trials and can complement information from clinical trials and their meta-analysis.”
He offered the following example in the area of heart failure research where real-world, practical information was discovered too late: Results from the randomized, double-blind RALES trial that compared spironolactone (an aldosterone-receptor blocker) with placebo in more than 1,600 patients with heart failure were published in the New England Journal of Medicine in 1999. The trial results showed that spironolactone in combination with standard therapy significantly reduced the risk of mortality and morbidity in patients with severe heart failure.5
According to the published results, patients who received spironolactone had fewer symptoms and were hospitalized less frequently. A 2005 Canadian study examining prescription trends of spironolactone for heart failure found that, after the RALES results were published, prescriptions rapidly increased; however, by late 2001, rates of hospitalization for hyperkalemia (abnormally high potassium levels in the blood) among spironolactone-treated patients also increased.6 The rate of in-hospital death associated with hyperkalemia also increased nearly threefold during this time.
A closer look at the RALES trial revealed that the exclusion criteria barred patients with serum creatinine >2.5 mg/dL and serum potassium >5.0 mmol/L from participating. “Patients most prone to this complication were not studied because the trial excluded them due to renal dysfunction or because of medications that might do the same thing,” Dr. Singh noted. Only when observational studies were conducted did the consequences of applying clinical trial data to populations that differ from those studied, or those outside the close scrutiny of a clinical trial setting, become apparent.
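To make the effect of such thresholds concrete, here is a minimal sketch of an eligibility screen that applies the two RALES laboratory cutoffs described above. The patient records, field names, and helper function are hypothetical and exist only for illustration; the actual trial protocol included other criteria that are not modeled here.

```python
from dataclasses import dataclass

# Laboratory cutoffs described in the text for the RALES trial.
CREATININE_LIMIT_MG_DL = 2.5
POTASSIUM_LIMIT_MMOL_L = 5.0


@dataclass(frozen=True)
class Patient:
    """Hypothetical patient record used only for this illustration."""
    patient_id: str
    serum_creatinine_mg_dl: float
    serum_potassium_mmol_l: float


def exclusion_reasons(patient: Patient) -> list[str]:
    """Return every cutoff the patient exceeds (empty list means eligible)."""
    reasons = []
    if patient.serum_creatinine_mg_dl > CREATININE_LIMIT_MG_DL:
        reasons.append("serum creatinine > 2.5 mg/dL")
    if patient.serum_potassium_mmol_l > POTASSIUM_LIMIT_MMOL_L:
        reasons.append("serum potassium > 5.0 mmol/L")
    return reasons


# Invented values, not data from any study.
patients = [
    Patient("A", 1.1, 4.2),
    Patient("B", 2.8, 4.6),
    Patient("C", 1.9, 5.4),
    Patient("D", 3.1, 5.6),
    Patient("E", 2.5, 5.0),  # exactly at both limits, so not excluded
]

eligible = [p for p in patients if not exclusion_reasons(p)]

for p in patients:
    reasons = exclusion_reasons(p)
    status = "eligible" if not reasons else "excluded: " + "; ".join(reasons)
    print(f"Patient {p.patient_id}: {status}")

print(f"{len(eligible)} of {len(patients)} hypothetical patients pass the screen")
```

In this invented cohort, the patients screened out are exactly those with impaired renal function or elevated potassium, the group Dr. Singh described as most prone to hyperkalemia.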
In Search of Pragmatic Trials
Attempting to avoid costly, impractical trials, some members of the medical research world have started a growing movement to make clinical trials more pragmatic. Rather than conducting explanatory trials to test whether a treatment could work in ideal circumstances and confirm a hypothesis, pragmatic trials look at multiple heterogeneous settings and diverse patient populations. They also use comparison conditions seen in real-world settings. In other words, pragmatic trials look at typical patients with typical conditions rather than the so-called Olympic athletes of patients.
Investigators curious to know where a trial falls on the continuum from “explanatory” to “pragmatic” can visit precis-2.org, a website designed as a training resource and database of trials that have been scored using PRECIS-2 (Pragmatic Explanatory Continuum Indicator Summary-2). The tool was developed in 2009, and modified in 2013, to “help trialists to think more carefully about the impact their design decisions would have on applicability.”7 PRECIS-2 is a nine-spoked wheel that contains nine domains of trial design decisions, including eligibility criteria and recruitment (see FIGURE). The wheel visually represents how explanatory or pragmatic a trial is; trials that take an explanatory approach produce wheels nearer the hub and those with a pragmatic approach are closer to the rim.
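As a rough illustration of how a PRECIS-2 wheel encodes a trial, the sketch below stores one score per domain and prints a text version of the spokes. The article names only eligibility and recruitment; the remaining seven domain names and the 1-to-5 scale (1 nearest the hub and most explanatory, 5 nearest the rim and most pragmatic) are taken from the published PRECIS-2 tool rather than from the text above. The example scores are invented, and the function names are hypothetical.

```python
# The nine PRECIS-2 domains, as named in the published tool.
PRECIS2_DOMAINS = (
    "Eligibility",
    "Recruitment",
    "Setting",
    "Organisation",
    "Flexibility (delivery)",
    "Flexibility (adherence)",
    "Follow-up",
    "Primary outcome",
    "Primary analysis",
)

MIN_SCORE, MAX_SCORE = 1, 5  # 1 = very explanatory, 5 = very pragmatic


def validate_scores(scores: dict[str, int]) -> None:
    """Raise ValueError unless every domain has an integer score from 1 to 5."""
    missing = [d for d in PRECIS2_DOMAINS if d not in scores]
    if missing:
        raise ValueError(f"missing domains: {missing}")
    for domain in PRECIS2_DOMAINS:
        score = scores[domain]
        if not isinstance(score, int) or not MIN_SCORE <= score <= MAX_SCORE:
            raise ValueError(f"{domain}: score must be an integer from 1 to 5")


def render_spokes(scores: dict[str, int]) -> str:
    """Draw each spoke as a bar: longer bars sit closer to the rim."""
    validate_scores(scores)
    lines = []
    for domain in PRECIS2_DOMAINS:
        score = scores[domain]
        bar = "#" * score + "." * (MAX_SCORE - score)
        lines.append(f"{domain:<24} {bar} {score}")
    return "\n".join(lines)


# Invented scores for a hypothetical trial with strict eligibility criteria.
example_scores = {domain: 4 for domain in PRECIS2_DOMAINS}
example_scores["Eligibility"] = 1
example_scores["Recruitment"] = 2

print(render_spokes(example_scores))
```

A trial like this hypothetical one would plot near the rim on most spokes but dip toward the hub on eligibility and recruitment, which is the pattern the wheel is meant to make visible.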
Umbrella and basket trials are two types of trial designs that would score “closer to the rim.” Umbrella trials are designed to test the effect of a group of different drugs on different mutations in a single disease type. These trials are designed to be more flexible and allow for randomized comparisons and come equipped with the ability to add or drop biomarker subgroups. The BATTLE (Biomarker-integrated Approaches of Targeted Therapy for Lung Cancer Elimination) trial is an example of an adaptively randomized, umbrella study; it included 255 patients pretreated for lung cancer who were adaptively randomized to erlotinib, vandetanib, erlotinib plus bexarotene, or sorafenib based on their relevant molecular biomarkers found during “real-time” biopsies.8 Initial trial results showed that there was an eight-week overall disease control rate of 46 percent.
In contrast, basket trials test the effect of one drug on a single mutation that is present in a variety of cancer types. The phase II NCI-MATCH study, for example, looks for actionable mutations in a patient’s tumor and assigns treatment based on the abnormality; it is designed with the ability to add new treatments or drop treatments over time.9
Striking a Balance
The road toward more pragmatic trials is not without obstacles. A 2016 review of pragmatic trials noted several challenges, particularly in the area of trial recruitment.10 These challenges include:
- the difficulty in selecting participants similar to those who would receive a drug if it became usual care
- the fact that volunteers for trials are often healthier than the average person
- the possibility of financial incentives to recruit to industry-sponsored trials rather than academic trials
- the need for informed consent in unselected participants
Also, in an attempt to increase the external validity of a trial, trialists run the risk of over-correcting and jeopardizing the trial’s internal validity.
To achieve a balance between internal and external validity in trial design, Ms. Statler recommended first tackling the low-hanging fruit, like the requirement that certain medical tests be performed within an appropriate timeframe in order for patients to meet trial eligibility criteria.
“Many studies in blood cancers require bone marrow testing be performed within 14 days of the trial initiation and within three or four weeks for certain patients,” Ms. Statler said. “The disease is not likely to change in that time, but these patients will still not be eligible for the trial.”
Investigators should also look more closely at the toxicity profiles of each investigational agent and revise trial criteria based on the specific toxicity profile, she added. “If a drug being investigated does not have cardiac toxicities, it does not make sense to exclude patients with benign cardiac abnormalities at baseline. If the drug is not going to exacerbate the problem, there is no justification for including those criteria.”
In addition to changing exclusion criteria on a trial-by-trial basis, there also are changes that could be made to the regulatory process to address generalizability, Dr. Fowler said.
There will always be a need for tightly controlled trials with a multitude of eligibility criteria at the beginning of an investigational agent’s trial life. As researchers achieve positive results and move toward a phase III study, they should also begin to address generalizability in parallel or soon after, Dr. Fowler said, noting, though, that those aspects typically are not budgeted or planned for.
“Right now, the only incentive to demonstrate efficacy is to get a drug through regulatory mechanisms and into clinical practice,” Dr. Fowler said. “There is less incentive to look at the influence of a drug in the usual patient population. This is something that could be requested or demanded in follow-up phases.” For instance, a drug might be allowed to come to market with the condition that investigators ensure that there are no adverse effects or a decrease in efficacy when indications for the drug are loosened, he explained.
Dr. Singh agreed with this premise, suggesting that one approach to solving the challenge of balancing internal and external validity might be to use a tool like the Centers for Medicare and Medicaid Services’ Coverage with Evidence Development,11 which would “provide approval based on internally valid clinical trials, and then assess generalizability with evidence generation in high-quality observational studies or subsequent trials.”
In the end, everyone ASH Clinical News spoke with acknowledged that there will never be an absolute wrong or an absolute right way to balance internal and external validity of clinical trials, while bringing the most effective drugs to market quickly and safely.
“There will always be a push and pull with internal and external validity,” Dr. Hirsch said. “The challenge will be finding the middle ground.” —By Leah Lawrence
1. Mitchell AP, Harrison MR, Walker MS, et al. Clinical trial participants with metastatic renal cell carcinoma differ from patients treated in real-world practice. J Oncol Pract. 2015;11:491-7.
2. Van Spall HGC, Toren A, Kiss A, et al. Eligibility criteria of randomized controlled trials published in high-impact general medical journals: a systematic sampling review. JAMA. 2007;297:1233-40.
3. Statler A, Radivoyevitch T, Siebenaller C, et al. Eligibility criteria are not associated with expected or observed adverse events in randomized controlled trials (RCTs) of hematologic malignancies. Abstract #635. Presented at the 2015 ASH Annual Meeting, December 7, 2015; Orlando, FL.
4. Bennette CS, Ramsey SD, McDermott CL, et al. Predicting low accrual in the National Cancer Institute’s Cooperative Group Clinical Trials. J Natl Cancer Inst. 2015;108:djv324.
5. Pitt B, Zannad F, Remme WJ, et al. The effect of spironolactone on morbidity and mortality in patients with severe heart failure. N Engl J Med. 1999;341:709-17.
6. Juurlink DN, Mamdani MM, Lee DS, et al. Hyperkalemia associated with spironolactone therapy. Can Fam Physician. 2005;51:357-60.
7. PRECIS-2. “About PRECIS-2.” Accessed September 26, 2016 from https://precis-2.org/Help/Documentation/Help.
8. Kim ES, Herbst RS, Wistuba II, et al. The BATTLE trial: personalizing therapy for lung cancer. Cancer Discov. 2011;1:44-53.
9. National Cancer Institute. NCI-Molecular Analysis for Therapy Choice (NCI-MATCH) Trial. Accessed September 21, 2016 from https://www.cancer.gov/about-cancer/treatment/clinical-trials/nci-supported/nci-match.
10. Ford I, Norrie J. Pragmatic trials. N Engl J Med. 2016;375:454-63.
11. Centers for Medicare and Medicaid Services. Coverage with Evidence Development. Accessed September 26, 2016 from https://www.cms.gov/Medicare/Coverage/Coverage-with-Evidence-Development/.
FIGURE. The PRECIS-2 Wheel of Trial Design
A visual example of how “explanatory” or “pragmatic” a trial is
Source: PRECIS-2. “About PRECIS-2.” Accessed September 26, 2016 from https://precis-2.org/Help/Documentation/Help.