 
 
Correlation of FIBROSpect II With Histologic and Morphometric Evaluation of Liver Fibrosis in Chronic Hepatitis C
 
 
  Clinical Gastroenterology and Hepatology, Feb 2008
 
Keyur Patel, David R. Nelson, Don C. Rockey, Nezam H. Afdhal, Katie M. Smith, Esther Oh, Keith Hettinger, Marc Vallee, Anouk Dev, Margaret Smith-Riggs, John G. McHutchison
 
Division of Gastroenterology, Duke Clinical Research Institute, Duke University Medical Center, Durham, North Carolina; Section of Hepatobiliary Diseases, University of Florida College of Medicine, Gainesville, Florida; Division of Digestive and Liver Diseases, University of Texas Southwestern Medical Center, Dallas, Texas; The Liver Center, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts; Prometheus Laboratories Inc, San Diego, California
 
Chronic hepatitis C (CHC) infection is characterized by varying degrees of inflammation and hepatic fibrosis, affecting an estimated 2.7 million persons in the United States and more than 170 million people worldwide.1 A proportion of patients develop progressive liver fibrosis, and, ultimately, cirrhosis with complications of end-stage liver disease, typically over 20 to 40 years. CHC infection is the leading indication for liver transplantation in developed nations and will continue to pose significant health and economic burdens during the next 10 to 20 years.2
 
Assessment of disease activity with histology obtained by a liver biopsy helps guide treatment and management decisions in CHC patients.3 However, a liver biopsy is invasive, costly, and associated with a small but finite risk of complications.4, 5 Furthermore, accurate disease staging is limited by issues relating to sampling and observer variability.6, 7, 8, 9 The semiquantitative grading systems used for histopathologic analysis in CHC do not reflect actual matrix burden, but were developed to standardize and improve observer variability, and to determine thresholds for therapy in CHC.10 Computer-aided image analysis may provide a more objective measurement of fibrous tissue in a liver biopsy specimen, but this modality also is associated with a high coefficient of variation, even in good quality samples.9, 11
 
Given the invasive nature of liver biopsy, there has been significant progress in recent years in the development of noninvasive biomarkers of fibrosis that provide an alternative to disease staging by a liver biopsy.12 One such approach has been to develop panels of specific markers such as FIBROSpect II (FSII) (Prometheus Laboratories Inc., San Diego, CA), which uses a predictive algorithm based on 3 serum markers: hyaluronic acid, serum tissue inhibitor of metalloproteinase-1, and α-2-macroglobulin. This panel has been shown to differentiate mild (METAVIR stages F0 to F1) from moderate-to-severe (METAVIR stages F2 to F4) hepatic fibrosis with an accuracy of 75% in a previous retrospective study of 696 CHC patients.13
 
In this study, we hypothesized that a serum fibrosis panel such as FSII has the ability to stage fibrosis accurately, as assessed by standard histology and as quantified by morphometry. In addition, because few studies have evaluated the utility of fibrosis markers in relation to image analysis or recurrent disease in the posttransplant setting,14, 15 we aimed to evaluate prospectively the diagnostic utility of FSII in a broad group of CHC patients, including a posttransplant cohort. We further aimed to assess the observer variability for staging fibrosis and to compare the diagnostic utility of FSII for METAVIR stages F2 to F4 with that of the aspartate aminotransferase to platelet ratio index (APRI).
 
ABSTRACT
 
Background & Aims:
Accurate disease staging in chronic hepatitis C (CHC) infection helps guide treatment and may provide prognostic information. Liver biopsies are invasive, costly, and associated with morbidity. We hypothesized that a noninvasive test of liver fibrosis can accurately stage liver fibrosis. We prospectively evaluated the FIBROSpect II (FSII) biomarker panel versus pathology assessment and a quantitative measure of fibrosis.
 
Methods: Liver biopsy specimens and serum were obtained from 252 CHC patients, including 50 posttransplant, from 3 tertiary centers. Biopsy specimens were scored centrally and independently at each site, along with central quantification of fibrosis by digitized morphometry. Serum tests were performed blinded to clinical or histologic evaluation.
 
Results: The mean biopsy specimen length was 1.95 ± 0.87 cm; prevalence of stage F2 through F4 fibrosis was 77%. Agreement between central and site readings for individual stages was modest (κ = 0.674), with concordant readings in 106 of 248 (43%) biopsy specimens. The area under the receiver operating characteristic curve for FSII and morphometry for stages F2 through F4 for concordant biopsy specimens were 0.823 and 0.728, respectively. Sensitivity and specificity for FSII were 83.5% and 66.7%, respectively, with an accuracy of 80.2%. The aspartate aminotransferase to platelet ratio index sensitivity and specificity for predicting F2 through F4 were 30.4% and 100%, respectively, the indeterminate rate was 40.4%, and the accuracy rate was 48.4%. The accuracy of FSII in concordant biopsy specimens in the posttransplant cohort was 73%.
 
Conclusions: Serum biomarkers can differentiate mild from moderate-to-severe fibrosis. This prospective study validates the performance characteristics of FSII in CHC patients and a posttransplant cohort. Assessing the diagnostic utility of biomarkers is limited by variability in methods to quantify fibrosis and poor interobserver agreement for histologic staging.
 
Discussion
 
Our results highlight several important points. We identified significant discordance between observers in terms of histologic assessment of fibrosis. In addition, quantitative morphometric analysis correlated poorly with semiquantitative METAVIR biopsy scores. These findings further highlight apparent limitations of percutaneous liver biopsy. Although this prospective cohort study validated the performance characteristics for the FSII panel, the earlier-described limitations appear to undermine the predictive utility of this index in relation to both standard biopsy assessment and morphometry.
 
Histologic assessment of fibrosis using liver biopsy has several limitations. There appears to be substantial sampling error in heterogeneously distributed diseases such as chronic hepatitis C, with significant 1-stage discordance even with good quality biopsy specimens.6 Another important limitation of histologic assessment of fibrosis is interobserver and intraobserver variation among pathologists. Standardized scoring systems such as METAVIR, Knodell, and Ishak were developed to improve agreement among pathologists, but concordance rates still are around 70% to 80% even among experienced observers. The level of experience may have a greater influence on agreement for staging than biopsy specimen quality.18 In our study, the hepatopathologists at all 3 sites were experienced observers based at tertiary-care referral centers for chronic hepatitis C patients. The central biopsy scores were dependent on achieving a consensus between 3 experienced observers who previously had shown good agreement for METAVIR staging (κ = 0.8) (data not shown). Although small biopsy specimens may lead to inaccuracies in fibrosis staging,9 we made a specific effort to obtain good quality biopsy specimens (biopsy length of 1.95 cm and at least 6 portal tracts available for evaluation). There were no differences in length between concordant and discordant biopsy specimens. Thus, we do not believe poor biopsy quality alone could account for the discordant fibrosis staging observed in this study. We did not evaluate interobserver variation between site-based pathologists in this study, but there were no significant differences between individual sites in terms of quality of biopsy or discordance compared with the central reading. A potential limitation of our study was that serum samples were collected up to 1 month after biopsy. However, any interval changes in fibrogenesis and matrix turnover are expected to be minimal over such a short period, and thus unlikely to affect performance characteristics of this marker panel significantly.
 
As might be expected, the FSII panel showed the greatest utility to discriminate stages F2 to F4 fibrosis (AUROC = 0.823) in the 106 concordant biopsy specimens (ie, with agreement between central and site-based scoring), and was comparable with previous observations for this marker panel.13, 19, 20 Of the 85 patients with stages F2 to F4 fibrosis, 71 (83.5%) were identified correctly by the marker panel. More than 80% of the discordance for central assessment compared with site-based readings were for stages F2 and F3, with less than 5% disagreement for stage F0 or F4. This is in keeping with prior observations and also partly accounts for the relatively poor predictive performance of fibrosis marker panels in the intermediate range of fibrosis.21 Clearly, the semiquantitative scoring systems for fibrosis allow for greater variability in interpretation for moderately severe fibrosis. For example, a 1-stage discordance for fibrosis stage F2 may result in a misclassification as either stages F1 or F3, but only a single directional change is possible for fibrosis stages F0 or F4.
 
Computer-aided image analysis may provide a more accurate quantitation of fibrosis by accounting for observer subjectivity that is associated with conventional fibrosis staging.14, 22 However, the coefficient of variation for image analysis is unacceptably high, at approximately 45% even for good quality biopsy specimens.9 In our study, the AUROC for FSII to detect moderate-to-severe fibrosis was lower for image analysis compared with standard histologic assessment. These comparatively poor performance characteristics for the marker panel in relation to image analysis were observed irrespective of central or site concordance. These differences between standard histologic assessment and image analysis could reflect laboratory sample handling and staining procedures, but likely were caused by an inherent bias in the methodology used to assess fibrosis. METAVIR staging has been used as the comparative gold standard in several noninvasive marker studies. It is possible that earlier fibrosis stages are represented by thinner fibrous septae that may not accurately reflect the global disease process, and a qualitative assessment by an experienced pathologist, of the distribution and integrity of portal tracts, plays an important role in determining the designated fibrosis stage.
 
Although marker panels now have been studied in chronic liver diseases of varying etiology, there have been few studies evaluating their utility for fibrosis assessment in the posttransplant setting.15 Simple markers of inflammatory activity such as transaminases may be influenced by many posttransplant issues, including allograft rejection, immunosuppression, and viral infection. This study evaluated the FSII panel in a small cohort of patients with recurrent HCV infection in the allograft. Performance characteristics for the marker panel were similar to the nontransplant patients and dependent on concordance with biopsy reading. Although these markers could guide the need for protocol biopsy in the posttransplant setting, larger prospective studies certainly are required before determining the true clinical utility of these noninvasive indices in following up changes in recurrent HCV infection-related fibrosis after liver transplantation.
 
Emerging noninvasive approaches for fibrosis assessment such as transient elastography and high-throughput protein profiling are promising modalities that likely will be integrated into the clinical setting in the future.23, 24, 25 However, a significant limitation to ongoing progress in developing more accurate indices of fibrosis relates to inherent limitations of our current disease staging by liver biopsy. This study highlights some of the limitations posed by interobserver differences and morphometry in this regard. Noninvasive biomarkers may provide a more accurate reflection of the dynamic nature of fibrogenesis, and perhaps in combination with newer indices, provide increased confidence in the level of actual disease severity.26, 27 Although no single test or algorithm currently can be recommended as a true alternative to liver biopsy for fibrosis quantification, further improvements in our understanding of the fibrogenesis cascade, along with the development of alternative methods of assessing fibrosis, certainly should result in more accurate and viable disease-staging options in the future.
 
Patients and Methods
 
Patient Population

 
Adult CHC patients undergoing a liver biopsy as standard of care were eligible for this prospective study conducted at 3 tertiary care institutions from December 2002 to August 2003. Eligible patients were either treatment-naive or had received no therapy for at least 6 months. Each patient required confirmation of CHC on a liver biopsy and serologic evidence of hepatitis C virus (HCV) RNA by a polymerase chain reaction assay. A separate cohort that underwent liver transplant for end-stage liver disease as a result of CHC also was included in the study. These patients were at least 3 months post-liver transplant and all had evidence of recurrent HCV infection by polymerase chain reaction assay. Up to 10% of the total enrollment could include patients co-infected with hepatitis B virus or human immunodeficiency virus-1 infection. Patients with other chronic liver diseases, connective tissue disease, extrahepatic infectious or inflammatory diseases, and for the transplant subset evidence of acute or chronic cellular rejection, were excluded from this study. Serum samples were collected before antiviral therapy, and within 1 month of liver biopsy, and stored at −70°C until analysis. All samples were evaluated by personnel at a central laboratory blinded to clinical or histologic findings for the FSII panel. The Institutional Review Boards at Duke University Medical Center, University of Florida Gainesville, and Beth Israel Deaconess Medical Center approved the study protocol and all subjects gave written informed consent. The study met all standards for good clinical research according to the ethical guidelines outlined in the Declaration of Helsinki.
 
Liver Histology
 
Percutaneous liver biopsies were performed at each site as standard of care for assessment of histologic activity before antiviral therapy for nontransplant patients, and for clinical indications, or per protocol, as assessed by the transplant team for the subset of posttransplant patients with HCV recurrence. Liver biopsy specimens were fixed in 10% buffered formalin, embedded in paraffin, and stained with H&E and Masson's trichrome for routine histopathologic assessment both centrally and at each site using the METAVIR system.10, 16 Biopsy specimens were considered adequate for evaluation if they were at least 15 mm in length and/or contained greater than 6 portal tracts. The biopsy specimens were read centrally by 3 experienced hepatologists who were blinded to clinical details, cross-trained together, and subsequently read the biopsy specimens independently. Initial concordance across the 3 readers for METAVIR was estimated by a κ coefficient of 0.86. Differences in scoring between the 3 readers were resolved by re-evaluation of the biopsy specimen as a consensus. Biopsy specimens also were evaluated independently by a single experienced hepatopathologist at each site. In addition, unstained liver sections were prepared and sent for quantitative morphometric analysis at Duke University Medical Center.
 
Serum Biomarker Assay
 
The FSII panel includes serum hyaluronic acid, serum tissue inhibitor of metalloproteinase-1, and α-2-macroglobulin measured at a central laboratory as outlined previously.13 A derived regression index greater than 0.36 is concordant with METAVIR stages F2 to F4, with an accuracy of 75%.
 
Quantitative Analysis of Liver Fibrosis
 
Morphometric quantitation of hepatic fibrosis was performed using Sirius red staining at a single laboratory by an experienced technician blinded to the histology score or clinical details. In brief, sections were incubated for 30 minutes in 0.1% Sirius red F3B containing saturated picric acid and 0.1% Fast Green (Sigma Chemical Co, St. Louis, MO). After rinsing twice with distilled water, sections then briefly were dehydrated with 70% ethanol. Thereafter, a morphometric score was derived by computerized image analysis using a photomicroscope (Nikon TE300 photomicroscope; Nikon Co, Tokyo, Japan) and MetaView software (Universal Imaging Corp, Downingtown, PA). Collagen stained with Sirius red was quantified at 20× magnification. All fields from each liver biopsy were photographed, and fields not filling the entire viewed area were omitted. Included fields yielded a score that was reported as a mean aggregate score for each liver biopsy, and repeated measurements had a variability of less than 10%.
 
Statistical Analysis
 
Patient demographic and clinical laboratory characteristics were summarized descriptively and reported as mean ± SD and range (minimum/maximum). Statistical significance was assessed at the 0.05 level. Correlation analysis was performed by Pearson and Spearman rho tests. Differences between continuous variables were assessed by the Student t test, and observer agreement by the κ coefficient. The diagnostic accuracy of FSII relative to histologic or morphometric assessment of liver fibrosis was evaluated by receiver operating characteristic curve analysis (ROC) (Statistica Software, version 6.1; Statsoft, Tulsa, OK). Confidence intervals were based on a binomial distribution. The appropriate morphometric classifications to distinguish F0 to F1 from F2 to F4 liver fibrosis were established using classification and regression trees analysis, using the lowest error rate.
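As an illustration of the ROC analysis described above, the following minimal Python sketch computes an area under the ROC curve from a continuous index and a binary reference standard. It uses the rank-based (Mann-Whitney) formulation, which is an assumption made for illustration; the study itself used Statistica, and the function name `auroc` and the example values are hypothetical, not study data.

```python
def auroc(scores, labels):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case.
    Tied scores count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical index values and biopsy labels (1 = F2-F4, 0 = F0-F1).
example_scores = [0.10, 0.25, 0.40, 0.35, 0.80, 0.90]
example_labels = [0, 0, 0, 1, 1, 1]
print(f"AUROC = {auroc(example_scores, example_labels):.3f}")
```

An AUROC of 0.5 corresponds to a marker no better than chance, and 1.0 to perfect separation of the two fibrosis groups.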
 
Results
 
Patient Characteristics

 
Of the 252 CHC patients enrolled in the study, 196 (78%) had CHC infection only, 6 (2%) had co-infection with human immunodeficiency virus-1, and 50 (20%) were posttransplant. The majority of patients had genotype 1 infection (169 of 252, 67.1%). Other differences in the baseline demographic and laboratory characteristics for the transplant and nontransplant groups are shown in Table 1. The mean biopsy (±SD) length was 1.95 ± 0.87 cm, and all biopsy specimens had at least 6 portal tracts suitable for evaluation.
 
Fibrosis Staging
 
Overall, 58 of 252 (23%) of the patients had minimal stage liver fibrosis (F0-F1) and 194 of 252 (77%) had moderate-to-severe fibrosis (F2-F4) by central biopsy staging (Figure 1). This indicated a relatively high prevalence of significant fibrosis among the CHC population evaluated at the 3 tertiary centers in this study. The prevalence of stage F2 to F4 fibrosis was 137 of 248 (55%) for site-based readings (results were not available for 4 patients in the nontransplant group). The agreement for individual stages between central and site-based reading was modest (Cohen's κ = 0.674), with agreement for individual stages in only 106 of 248 (43%) biopsy specimens. As might be predicted, there was better agreement between site and central readings in 172 of 248 (69%) biopsy specimens that were not scored as individual stages, but as either minimal (F0-F1) or moderate-to-severe stage fibrosis (F2-F4); for biopsy specimens graded as either stage F0 to F3 or F4 there was agreement in 237 of 248 (95%) of the cases.
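For readers unfamiliar with the agreement statistic used here, the following Python sketch shows how an unweighted Cohen's kappa is computed from two sets of stage readings. The article does not state whether a weighted variant was used, so the unweighted form is an assumption; the function name `cohen_kappa` and the six example readings are hypothetical and are not study data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same specimens."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set")
    n = len(rater_a)
    # Observed proportion of specimens given the same stage by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal stage frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1.0 - expected)


# Hypothetical METAVIR stages for six specimens.
central = ["F1", "F2", "F2", "F3", "F4", "F4"]
site = ["F1", "F1", "F2", "F3", "F3", "F4"]
print(f"kappa = {cohen_kappa(central, site):.3f}")
```

Kappa corrects the raw agreement rate for agreement expected by chance, which is why it is reported alongside the simple proportion of concordant specimens.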
 
One-stage discordance was noted for 118 of 248 (47%) biopsy specimens that were mostly stages F2 (23%) and F3 (12%) by the central reading. Two-stage or more discordance was noted in 25 (10%) cases (Figure 2). There was no significant difference between mean (±SD) concordant and discordant biopsy length (1.94 ± 0.99 cm vs 1.97 ± 0.76 cm; P = .78). There were no significant differences in the quality of the biopsy specimen between the sites, or concordance between central reading and the cohort of transplant patients or an individual site (data not shown).
 
FIBROSpect II Performance Characteristics
 
The ability of FSII to differentiate stages F2 to F4 from F0 to F1 was assessed relative to the central and site-based staging of study biopsy specimens for each patient. The results indicated a sensitivity of 70.6% and 81%, a specificity of 65.5% and 62.2%, and an area under the ROC (AUROC) curve of 0.757 (95% confidence interval [CI], 0.686-0.827) and 0.776 (95% CI, 0.717-0.834) for central and site-based staging, respectively (Table 2). For the 106 patients with agreement for each individual stage between central and site-based readings the performance of FSII was improved, with an AUROC of 0.823 (95% CI, 0.720-0.927). At a lower prevalence for stage F2 to F4 of 30%, the positive and negative predictive values for FSII in concordant biopsy specimens were 53% and 91%, respectively (Figure 3).
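The performance figures for the concordant specimens can be retraced from counts given elsewhere in the article: 71 of 85 patients with stages F2 to F4 were identified correctly (14 false-negative results), and 7 false-positive results were reported. The Python sketch below assumes the remaining 21 of the 106 specimens were F0 to F1, giving 14 true-negative results by subtraction. It then applies Bayes' rule to estimate predictive values at a 30% prevalence. The function names are hypothetical, and the calculation is an illustration rather than the authors' method.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity, and accuracy from a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return sensitivity, specificity, accuracy


def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


# 106 concordant specimens: 85 with F2-F4 (71 detected, 14 missed) and
# 21 with F0-F1 (7 false positives; 14 true negatives derived as 21 - 7).
sens, spec, acc = diagnostic_metrics(tp=71, fn=14, fp=7, tn=14)
ppv, npv = predictive_values(sens, spec, prevalence=0.30)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
print(f"PPV={ppv:.3f} NPV={npv:.3f} at 30% prevalence")
```

Under these assumptions the sketch reproduces the reported sensitivity (83.5%), specificity (66.7%), and accuracy (80.2%). The predictive values come out at roughly 52% and 90%, slightly below the reported 53% and 91%; the text does not give enough detail to explain the small difference.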
 
For the transplant cohort, at prevalence of stage F2 to F4 of 41.7% and 78% for site and central readings, respectively, FSII had a sensitivity of 75% and 71.8%, a specificity of 42.9% and 54.5%, and an accuracy of 56% and 68% for the detection of moderate-to-severe disease. For the 30% of biopsy specimens concordant between site and central readings, and a prevalence of stage F2 to F4 of 73.3%, the sensitivity and specificity of FSII was 81.8% and 50%, respectively, with an accuracy of 73%.
 
Sirius Red Morphometry Biopsy Assessment
 
The diagnostic utility of Sirius red morphometry for differentiating individual and stage F2 to F4 fibrosis was assessed in 247 patient biopsy specimens (specimens not available for assessment in 5 subjects). The AUROC values for F2 to F4 were 0.622 (95% CI, 0.545-0.698) and 0.687 (95% CI, 0.623-0.752) for central and site-based readings, respectively. For 105 biopsy specimens with agreement between central and site readings (morphometry results were not available in 1 patient) the AUROC for morphometry was 0.728 (95% CI, 0.625-0.832) (Figure 4). For these concordant biopsy specimens, mean ± SD morphometry units increased with fibrosis stage: F0 (n = 3; 1.25 ± 0.42), F1 (n = 19; 1.89 ± 1.37), F2 (n = 31; 3.23 ± 2.87), F3 (n = 18; 4.15 ± 3.99), and F4 (n = 34; 5.91 ± 4.40) (chi-square P = .001). However, there was significant overlap in morphometry values between individual stages, with differences apparent only for F0 vs F2 (P = .02), F3 (P = .008), F4 (P < .001); F1 vs F2 (P = .03), F3 (P = .03), F4 (P < .001), and F2 vs F4 (P = .005) (Figure 5).
 
Aspartate Aminotransferase to Platelet Ratio Index in Concordant Biopsy Specimens
 
The APRI was calculated according to the formula: APRI = ([AST / upper limit of normal] ÷ platelet count [10⁹/L]) × 100. The lower and higher cut-off levels for stages F2 to F4 were 0.5 or less and greater than 1.5 as established previously.17 Results were available in 104 of the concordant biopsy specimens with a prevalence of stage F2 to F4 of 74.2%. Performance characteristics of APRI indicated a sensitivity of 30.4%, a specificity of 100%, with indeterminate results in 42 (40.4%) patients, and an overall accuracy for predicting stages F2 to F4 disease of 48.4%. Excluding the posttransplant cohort with concordant biopsy specimens did not significantly improve the predictive accuracy of APRI for stage F2 to F4 disease.
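The APRI calculation and its two cut-offs can be sketched in a few lines of Python. The function names, the example laboratory values, and the AST upper limit of normal of 40 U/L are hypothetical choices for illustration; only the formula and the cut-offs of 0.5 and 1.5 come from the text.

```python
def apri(ast, ast_upper_limit, platelets_10e9_per_l):
    """APRI = (AST / upper limit of normal) / platelet count [10^9/L] x 100."""
    if ast_upper_limit <= 0 or platelets_10e9_per_l <= 0:
        raise ValueError("upper limit of normal and platelet count must be positive")
    return (ast / ast_upper_limit) / platelets_10e9_per_l * 100


def classify_apri(value, low=0.5, high=1.5):
    """Apply the two cut-offs; values between them are indeterminate."""
    if value <= low:
        return "F0-F1"
    if value > high:
        return "F2-F4"
    return "indeterminate"


# Hypothetical patients: (AST in U/L, assumed upper limit of normal, platelets).
for ast, uln, platelets in [(30, 40, 250), (60, 40, 150), (80, 40, 100)]:
    score = apri(ast, uln, platelets)
    print(f"APRI={score:.2f} -> {classify_apri(score)}")
```

The band between the two cut-offs is what produces the indeterminate results reported for APRI in this cohort.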
 
APRI results were available in 13 of 14 patients with stage F2 to F4 disease on concordant biopsy specimens that were classified as false-negative results by FSII; 12 patients (F2 = 10 and F3 = 2) had an APRI of less than 0.5, and 1 patient with stage F4 was classified as being indeterminate. Thus, nearly all patients with a false-negative result by FSII also would have been classified incorrectly using APRI. However, for the 7 patients with false-positive results by FSII, 3 were identified correctly with an APRI cut-off level of less than 0.5 (F0 = 1, F1 = 2), with indeterminate results in the remaining 4 patients.
 
 
 
 