Measuring Response in GVHD: Clinician Assessment, Patient-Reported Outcomes Predict Survival

In patients who have undergone allogeneic hematopoietic cell transplantation (alloHCT), chronic graft-versus-host disease (GVHD) is a significant cause of morbidity and mortality. A major hurdle in developing new treatments has been the lack of validated methods of measuring response in clinical trials.

“Due to the long-time course of chronic GVHD, standard endpoints such as overall survival (OS) and non-relapse mortality (NRM) require longer-term follow-up than might be desired in most early-phase chronic GVHD trials,” Jeanne Palmer, MD, from the Division of Hematology/Oncology at Mayo Clinic in Scottsdale, Arizona, and colleagues wrote. Dr. Palmer and co-authors sought to identify measurements at three or six months that could predict subsequent long-term OS, NRM, and failure-free survival (FFS).

FFS, defined as continued disease-free survival without the addition of a new systemic immunosuppressive medication, is “easy to document … but also has the disadvantage of relying on the clinician’s treatment approach, which is subject to bias and variation in management styles,” the authors noted.

In this trial, overall response was measured in three ways:

  • National Institutes of Health-calculated response according to both the 2005 and 2014 Consensus Criteria algorithms, which use changes in skin, mouth, eye, lungs, joints, gastrointestinal, and liver measures to assign patients to the categories of complete response (CR), partial response (PR), stable disease (SD), and progressive disease (PD)
  • Clinician-reported response of CR, PR, SD, and PD as reported on clinician-completed surveys
  • Patient-reported response (i.e., whether their chronic GVHD was improving, stable, or worsening on a 7-point Likert scale)

Of the 575 patients included in the observational trial, 451 had evaluations at six months, and 307 of those patients had no recurrent malignancy or prior treatment change.

At three months, clinician-reported, patient-reported, and 2014 NIH-calculated responses were each associated with longer subsequent FFS, but not with NRM or OS. For FFS:

  • Clinician-reported response: HR=0.34 (95% CI 0.22-0.52); p<0.001
  • Patient-reported response: overall p<0.001
  • NIH-calculated response: HR=0.60 (95% CI 0.41-0.89); p=0.01

At six months, clinician-reported and 2014 NIH-calculated responses were associated with longer subsequent FFS (but not NRM):

  • Clinician-reported response: HR=0.61 (95% CI 0.44-0.85); p=0.004
  • NIH-calculated response: HR=0.58 (95% CI 0.42-0.80); p=0.001

Clinician-reported response at six months also predicted longer OS (HR=0.55; 95% CI 0.36-0.85; p=0.007).
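
For readers less used to hazard ratios, the short Python sketch below restates the reported values as an approximate percentage reduction in hazard and checks that each confidence interval is roughly consistent with its reported p value. It is an illustration only, not the authors' analysis: it assumes the intervals are Wald-type (symmetric on the log scale), and the function names are hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def hazard_reduction_pct(hr):
    """Percent reduction in hazard implied by a hazard ratio below 1."""
    return (1.0 - hr) * 100.0


def approx_p_from_ci(hr, lower, upper):
    """Approximate two-sided p value recovered from an HR and its 95% CI.

    Assumption: the CI is symmetric on the log scale (Wald-type).
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * Z_95)
    z = math.log(hr) / se
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hazard ratios and intervals as reported in the article
reported = [
    ("3 mo, clinician-reported, FFS", 0.34, 0.22, 0.52),
    ("3 mo, NIH-calculated, FFS", 0.60, 0.41, 0.89),
    ("6 mo, clinician-reported, FFS", 0.61, 0.44, 0.85),
    ("6 mo, NIH-calculated, FFS", 0.58, 0.42, 0.80),
    ("6 mo, clinician-reported, OS", 0.55, 0.36, 0.85),
]

for label, hr, lower, upper in reported:
    p = approx_p_from_ci(hr, lower, upper)
    print(f"{label}: about {hazard_reduction_pct(hr):.0f}% lower hazard, "
          f"approximate p = {p:.4f}")
```

Read this way, an HR of 0.34 corresponds to roughly a 66% lower hazard of treatment failure among responders, and the p value recovered for the three-month NIH-calculated response comes out close to the reported 0.01.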

Dr. Palmer and colleagues also tested whether changes in individual organ assessments, laboratory values, or patient-reported symptoms were predictive of FFS, OS, and NRM. At six months, improvements in the NIH 0-3 clinician-reported skin score and 0-10 patient-reported itching score predicted longer subsequent FFS. Additionally, improvements in the Lee skin symptom score predicted longer subsequent OS and lower NRM, and the FACT-BMT TOI score predicted longer subsequent OS (TABLE).

Although the 2014 NIH response criteria were not associated with subsequent OS or NRM, the authors added, “It is important to remember that the NIH response measures were never designed to predict survival.” Instead, they were designed to capture relevant changes in chronic GVHD disease activity as a result of chronic GVHD-directed therapy.

“Based on these data, we recommend that, for now, the 2014 NIH response measures, clinician-reported responses, and patient-reported outcomes be collected in therapeutic trials of chronic GVHD to ensure that relevant data are available once the best algorithm to capture a meaningful objective response is determined,” the authors concluded.

Several of the findings were unexpected, Dr. Palmer and colleagues noted, particularly that patient-reported measures were most strongly associated with FFS, OS, and NRM. “Patient-reported symptoms and quality of life may be more sensitive to overall health than clinician-reported chronic GVHD measures,” they explained.

However, this analysis was conducted as a discovery exercise, and, “while the results are informative, they will need to be validated in a separate independent cohort prior to drawing definitive conclusions.” It also would be important to validate these assessments in patients receiving identical therapies, to assess whether they can be used for regulatory purposes.


Palmer J, Chai X, Pidala J, et al. Predictors of survival, non-relapse mortality and failure-free survival in patients treated for chronic graft-versus-host disease. Blood. 2015 November 2. [Epub ahead of print]

TABLE. Multivariate Landmark Analyses at Six Months for Subsequent FFS, OS, and NRM

Outcome  Parameter                            Events/at risk  Hazard ratio (95% CI)  p Value
FFS      Change in 2005 NIH 0-3 skin score    112/211         1.53 (1.19-1.96)       0.001
         Change in patient 0-10 skin itching                  1.15 (1.06-1.24)       0.002
OS       Change in Lee symptom score          64/308          1.02 (1.01-1.04)       0.005
         FACT-BMT total score                                 0.98 (0.97-0.99)       0.04
NRM      Change in Lee symptom score          48/326          1.03 (1.01-1.04)       0.001
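
Hazard ratios close to 1, such as 1.02, can look negligible when the scale they apply to spans many points. The sketch below shows how such a ratio compounds over a larger change. It assumes that each hazard ratio in the table is expressed per one-unit change in the parameter and that the effect is log-linear, neither of which is stated here; the ten-point change and the function name are hypothetical.

```python
def scaled_hr(hr_per_unit, units):
    """Hazard ratio implied for a change of `units`, assuming log-linearity."""
    return hr_per_unit ** units


# Per-unit hazard ratios taken from the table; the change sizes are illustrative
examples = [
    ("NIH 0-3 skin score, FFS", 1.53, 2),
    ("Lee symptom score, OS", 1.02, 10),
    ("Lee symptom score, NRM", 1.03, 10),
]

for label, hr, units in examples:
    print(f"{label}: HR per unit = {hr}, "
          f"HR for a {units}-unit change = {scaled_hr(hr, units):.2f}")
```

Under these assumptions, a two-point change in the skin score corresponds to a hazard ratio of about 2.34, and a ten-point change in the Lee symptom score to about 1.22 for OS.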