Comparison of eight diagnostic algorithms for liver fibrosis in hepatitis C: new algorithms are more precise and entirely noninvasive
Hepatology January 2012
Jerome Boursier,1,2 Victor de Ledinghen,3,4 Jean-Pierre Zarski,5,6 Isabelle Fouchard-Hubert,1,2 Yves Gallois,2,7 Frédéric Oberti,1,2 Paul Calès,1,2 and multicentric groups from SNIFF 32, VINDIAG 7, and ANRS/HC/EP23 FIBROSTAR studies
From the 1Department of Hepatogastroenterology, University Hospital, Angers, France; 2HIFIH Laboratory, UPRES 3859, Institut Federatif de Recherche (IFR) 132,
University of Angers, Pole de Recherche et d'Enseignement Superieur Universite Nantes Angers Le Mans (PRES UNAM), France; 3Department of
Hepatogastroenterology, Haut-Leveque University Hospital, Pessac, France; 4Institut National de la Sante et de la Recherche Medicale (INSERM) U889, Victor
Segalen University, Bordeaux, France; 5Department of Liver-Gastroenterology, University Hospital, Grenoble, France; 6Institut National de la Sante et de la Recherche
Medicale (INSERM)/UJF U823, IAPC, IAB, Grenoble, France; and 7Department of Biochemistry, University Hospital, Angers, France.
Abstract
The sequential algorithm for fibrosis evaluation (SAFE) and the Bordeaux algorithm (BA), which cross-check FibroTest with the aspartate aminotransferase-to-platelet ratio index (APRI) or FibroScan, are very accurate but provide only a binary diagnosis of significant fibrosis (SAFE or BA for Metavir F ≥ 2) or cirrhosis (SAFE or BA for F4). Therefore, in clinical practice, physicians have to apply the algorithm for F ≥ 2, and then, when needed, the algorithm for F4 ("successive algorithms"). We aimed to evaluate successive SAFE, successive BA, and a new, noninvasive, detailed classification of fibrosis. The study included 1785 patients with chronic hepatitis C, liver biopsy, blood fibrosis tests, and FibroScan (the latter in 729 patients). The most accurate synchronous combination of FibroScan with a blood test (FibroMeter) provided a new detailed (six classes) classification (FM+FS). Successive SAFE had a significantly (P < 10⁻³) lower diagnostic accuracy (87.3%) than individual SAFE for F ≥ 2 (94.6%) or SAFE for F4 (89.5%), and required significantly more biopsies (70.8% versus 64.0% or 6.4%, respectively, P < 10⁻³). Similarly, successive BA had significantly (P ≤ 10⁻³) lower diagnostic accuracy (84.7%) than individual BA for F ≥ 2 (88.3%) or BA for F4 (94.2%), and required significantly more biopsies (49.8% versus 34.6% or 24.6%, respectively, P < 10⁻³). The diagnostic accuracy of the FM+FS classification (86.7%) was not significantly different from those of successive SAFE or BA. However, this new classification required no biopsy. Conclusion: SAFE and BA for significant fibrosis or cirrhosis are very accurate. However, their successive use induces a significant decrease in diagnostic accuracy and a significant increase in required liver biopsy. A new fibrosis classification that synchronously combines two fibrosis tests was as accurate as successive SAFE or BA, while providing an entirely noninvasive (0% liver biopsy) and more precise (six versus two or three fibrosis classes) fibrosis diagnosis.
Several fibrosis algorithms combining different fibrosis tests have been proposed to improve the accuracy of the noninvasive diagnosis of liver fibrosis in chronic hepatitis C.1-5 These decision-making algorithms were developed to provide an accurate diagnosis of liver fibrosis and limit liver biopsy to indeterminate cases. They use either two blood tests in a sequential procedure, as in the sequential algorithm for fibrosis evaluation (SAFE),6 or are based on agreement between a blood test and FibroScan (Echosens, Paris, France) results, as in the Bordeaux algorithm (BA).1
Although the accuracy of SAFE and BA has been shown to be excellent for the diagnosis of significant fibrosis or cirrhosis,1, 2, 6-8 they have some limitations in clinical practice. First, SAFE uses the aspartate aminotransferase-to-platelet ratio index (APRI) as a first-line fibrosis test, then FibroTest as a second-line test, and, if necessary, liver biopsy when the diagnosis remains undetermined. This implies several diagnostic steps.9 Second, the rate of required liver biopsy in SAFE and BA ranges from 30% to 50% for the diagnosis of significant fibrosis and from 20% to 30% for cirrhosis.6, 7 These rates seem inconsistent with a "noninvasive" diagnostic procedure for liver fibrosis screening. Third, SAFE was developed for a binary diagnosis of significant fibrosis or cirrhosis, which is insufficient for patient management in clinical practice. Indeed, a noninvasive diagnosis of significant fibrosis could indicate either moderate/severe fibrosis or cirrhosis. Thus, to achieve an accurate diagnosis, physicians have to use the SAFE for significant fibrosis first, and then, if significant fibrosis is diagnosed, the SAFE for cirrhosis. This adds a diagnostic step and increases the rate of misclassified patients and the rate of required liver biopsy. Finally, the BA was presented in the pivotal study as a three-diagnostic-class algorithm,1 but further evaluation focused only on the binary diagnosis of significant fibrosis or cirrhosis.7
We have developed several statistical techniques to improve the noninvasive diagnosis of liver fibrosis. These include blood tests adapted to a diagnostic target,10 synchronous combinations of fibrosis tests8, 11 to improve diagnostic accuracy, and reliable diagnosis intervals for fibrosis tests to improve diagnostic precision.12, 13 Finally, a synchronous combination of FibroScan and FibroMeter using these methods in a one-step procedure resulted in an accurate noninvasive classification of fibrosis.14 This classification provided a precise diagnosis (six diagnostic classes), with robust and high diagnostic accuracy, and eliminated the need for liver biopsy.
The aim of the present study was to evaluate the accuracy of SAFE and BA for the noninvasive diagnosis of liver fibrosis in clinical practice and compare them with our new noninvasive classification of fibrosis, which synchronously combines fibrosis tests.
Abbreviations: BA, Bordeaux algorithm; SAFE, sequential algorithm for fibrosis evaluation.
Discussion
This study evaluated the accuracy of two published fibrosis algorithms, the SAFE and the BA, in a large cohort of chronic hepatitis C patients. Castera et al. recently compared SAFE and BA constructed for the binary diagnoses of significant fibrosis or cirrhosis.7 However, their work had two limitations: first, it included a relatively small subset of 302 patients from two centers, and second, the prevalence of significant fibrosis or cirrhosis in the population studied (76% and 24%, respectively) was higher than observed in a reference population (48% and 12%, respectively) including more than 33,000 patients with chronic hepatitis C.22 This epidemiological limit induced a misevaluation of the overall accuracy, the predictive values, and probably the sensitivity and specificity of the algorithms studied.
Our study has several noteworthy points. First, we performed a direct and independent comparison of SAFE and BA in a large cohort of 729 patients with chronic hepatitis C. Second, our study had a multicenter design.13-15 Third, the prevalence of fibrosis stages in our population was very close to that of the reference population22 described above. In this setting, compared with the study by Castera et al.,7 we found a higher negative predictive value for BA for F ≥ 2 and lower positive predictive values for SAFE for F4 and BA for F4. Our results are probably more representative of the real accuracy of SAFE and BA in clinical practice.
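The effect of prevalence on predictive values discussed above can be illustrated with a short sketch. It assumes, purely for illustration, a hypothetical test with 90% sensitivity and 90% specificity (values not taken from the study) and holds them fixed while the prevalence of cirrhosis moves from 24% to 12%, the two figures quoted above. In practice sensitivity and specificity may also shift with the case mix, so this is a simplification.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) from Bayes' rule for a binary diagnostic test."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical test characteristics (assumption, not study data).
SENS, SPEC = 0.90, 0.90

# Cirrhosis prevalence quoted in the text: 24% (Castera et al.) vs 12% (reference population).
ppv_high, npv_high = predictive_values(SENS, SPEC, 0.24)
ppv_low, npv_low = predictive_values(SENS, SPEC, 0.12)

print(f"prevalence 24%: PPV={ppv_high:.3f}, NPV={npv_high:.3f}")
print(f"prevalence 12%: PPV={ppv_low:.3f}, NPV={npv_low:.3f}")
```

Under these assumptions the positive predictive value falls and the negative predictive value rises as prevalence decreases, which is the direction of the differences reported here relative to Castera et al.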
All the studies that have evaluated SAFE and BA have demonstrated their excellent accuracy for the diagnosis of significant fibrosis or cirrhosis.2, 6-8 Two caveats should, however, be kept in mind. First, SAFE for F ≥ 2 was impaired in all these studies by a very high rate of required liver biopsy (>50%). Indeed, because it considers the negative predictive values of APRI and FibroTest insufficient, SAFE for F ≥ 2 recommends the use of liver biopsy when the blood fibrosis tests suggest no/mild fibrosis.2 It should be noted that this implies performing liver biopsy in the subgroup of patients with a good prognosis.
Second, fibrosis algorithms intended for a simple binary diagnosis of fibrosis do not provide sufficient information for the management of patients in clinical practice. Indeed, physicians have to answer two questions: (1) whether the patient needs antiviral therapy (i.e., is there any significant fibrosis in genotype 1 chronic hepatitis C); and (2) whether the patient needs screening for hepatocellular carcinoma and esophageal varices (i.e., is there any cirrhosis). Thus, physicians first have to apply the algorithm for the diagnosis of significant fibrosis and then, if the noninvasive diagnosis is F ≥ 2, apply the algorithm for the diagnosis of cirrhosis. This successive use of algorithms for binary diagnosis leads to greater rates of misclassified patients and of liver biopsy. In this setting, our results clearly demonstrated that Successive SAFE (Fig. 1C) and Successive BA (Fig. 2C) had significantly lower diagnostic accuracies and required significantly higher rates of liver biopsy than single algorithms (respectively SAFE for F ≥ 2 or SAFE for F4, and BA for F ≥ 2 or BA for F4) (Table 3). Moreover, Successive SAFE required a significantly higher rate of FibroTest use, compared with SAFE for F ≥ 2 or SAFE for F4. It is also of note that the results for Successive SAFE or Successive BA were the same when the algorithm for the binary diagnosis of cirrhosis was performed first, and then followed, if necessary, by the algorithm for the binary diagnosis of significant fibrosis (data not shown). Taken together, these results show that the accuracy of SAFE and BA for the diagnosis of fibrosis in clinical practice has been overestimated in published studies.
Also, Sebastiani et al. proposed an algorithm for the simultaneous detection of significant fibrosis and cirrhosis (Supporting Fig. 3).6 Despite very high diagnostic accuracy (97.0%), this algorithm required liver biopsy in almost all patients (85.2%), thus greatly limiting its interest for the noninvasive diagnosis of fibrosis.
The association of FibroMeter and FibroScan, which was shown to be the best combination among six noninvasive fibrosis tests,14 serves as the foundation of the FM+FS classification (Supporting Fig. 4). This new noninvasive classification of fibrosis had several advantages compared with SAFE and BA. First, the FM+FS classification required no liver biopsy. Second, the FM+FS classification provided a more precise diagnosis (six diagnostic classes) than Successive SAFE (two classes) or Successive BA (three classes) (Supporting Table 1). Third, despite the absence of liver biopsy requirement, the diagnostic accuracy of the FM+FS classification was not significantly different from those of Successive SAFE or Successive BA. Finally, the FM+FS classification provided the best performance profile compared with Successive SAFE or Successive BA (Fig. 3). It should be noted that the reference for liver fibrosis in our study was liver biopsy, which should be considered as a "best standard" but not a "gold standard."23 Thus, the diagnostic accuracy of the FM+FS classification was probably underestimated in our study.24
Finally, the FM+FS classification significantly improved the noninvasive diagnosis of liver fibrosis by avoiding liver biopsy and refined the precision of fibrosis diagnosis while maintaining very high accuracy. Thus, between published decision-making algorithms and our new noninvasive classification of fibrosis, the FM+FS classification appears to be the most appropriate for clinical use. Because it requires several steps and calculations (Supporting Fig. 4), the use of the FM+FS classification in clinical practice may, at first glance, seem complex. However, once all these steps are computerized, physicians need only provide the results of FibroScan and FibroMeter.
The length of liver biopsy had no influence on the diagnostic accuracy of Successive SAFE, Successive BA, or the FM+FS classification. The accuracy of Successive SAFE was independently influenced by age, sex, and ALT level. Indeed, Successive SAFE was quite inaccurate at high ALT levels, especially in older men (Fig. 4A). The success rate of FibroScan had no influence on the diagnostic accuracy of Successive BA or the FM+FS classification. In this setting, it has already been shown that IQR/M was the only FibroScan characteristic that had a significant impact on its accuracy.20 In our study, IQR/M influenced the accuracy of the FM+FS classification but not that of Successive BA. This was due to the high rate of liver biopsy (49.8%) required by Successive BA. In fact, the FM+FS classification is more sensitive to the influence of IQR/M than Successive BA because its diagnosis depends on the FibroScan results in all patients (0% liver biopsy required). In addition to IQR/M, the accuracy of the FM+FS classification was independently influenced by age and ALT level. However, the accuracy of the FM+FS classification remained higher than 80% in the various subgroups resulting from the combination of these three parameters (Fig. 4B).
In conclusion, SAFE and BA for binary diagnoses of significant fibrosis or cirrhosis have excellent diagnostic accuracy in chronic hepatitis C. However, in clinical practice, the significant fibrosis algorithm and the cirrhosis algorithm have to be used successively, which induces a significant decrease in diagnostic accuracy and a significant increase in the rate of required liver biopsy. A new noninvasive classification of fibrosis synchronously combining FibroScan and FibroMeter results allows for an entirely noninvasive (0% liver biopsy required) and precise (six fibrosis classes) diagnosis of liver fibrosis with a diagnostic accuracy (87%) that is not significantly different from those of SAFE and BA.
Results
Patients
The characteristics of the patients included in the three populations have been described13-15 and are summarized in Table 1. More detailed characteristics are presented in Supporting Table 2. The prevalence of significant fibrosis (Metavir F ≥ 2) was 52.2% in population #1, 67.8% in population #2a, and 49.4% in population #2b (P < 10⁻³). The prevalence of Metavir F stages over the entire population of 1785 patients was: F0: 4.2%, F1: 41.1%, F2: 26.5%, F3: 15.5%, and F4: 12.7%.
Decision-Making Algorithms
Binary Diagnosis of Significant Fibrosis.
In the entire population, SAFE for F ≥ 2 provided 94.6% diagnostic accuracy but required liver biopsy in 64.0% of patients (Table 2). Because FibroScan was not available in population #1, SAFE and BA were compared in population #2: BA for F ≥ 2 provided significantly lower diagnostic accuracy than SAFE for F ≥ 2 (88.3% versus 92.5%, P = 0.010) but required a significantly lower rate of liver biopsy (34.6% versus 57.0%, P < 10⁻³).
Binary Diagnosis of Cirrhosis.
In the entire population, SAFE for F4 provided 89.5% diagnostic accuracy and required liver biopsy in 6.4% of patients (Table 2). In population #2, BA for F4 provided significantly higher diagnostic accuracy than SAFE for F4 (94.2% versus 87.6%, P < 10⁻³) but required a significantly higher rate of liver biopsy (24.6% versus 6.7%, P < 10⁻³).
SAFE for F ≥ 2 and F4.
The SAFE for F ≥ 2 and F4 published by Sebastiani et al.6 provided excellent diagnostic accuracy (97.0%) but required a very high rate of liver biopsy (85.2%) (Table 3).
Successive Algorithms
Successive SAFE.
In the entire population, Successive SAFE provided significantly lower diagnostic accuracy (87.3%) than individual SAFE for F ≥ 2 (94.6%, P < 10⁻³) or SAFE for F4 (89.5%, P < 10⁻³) (Table 3). Moreover, Successive SAFE required a significantly higher rate of liver biopsy (70.8%) than SAFE for F ≥ 2 (64.0%, P < 10⁻³) or SAFE for F4 (6.4%, P < 10⁻³). The use of FibroTest was required in 49.2% of patients with Successive SAFE versus 35.8% with SAFE for F ≥ 2 (P < 10⁻³) or 22.2% with SAFE for F4 (P < 10⁻³). Finally, the accuracy of the noninvasive diagnosis (i.e., the rate of correctly classified patients by noninvasive tests in the subgroup of patients without liver biopsy) was 56.5% with Successive SAFE, whereas it was 85.1% and 88.7% with SAFE for F ≥ 2 and SAFE for F4, respectively.
Successive BA.
In population #2, Successive BA had significantly lower diagnostic accuracy (84.7%) than individual BA for F ≥ 2 (88.3%, P = 10⁻³) or BA for F4 (94.2%, P < 10⁻³) (Table 3). Also, Successive BA required a significantly higher rate of liver biopsy (49.8%) than BA for F ≥ 2 (34.6%, P < 10⁻³) or BA for F4 (24.6%, P < 10⁻³). Finally, the accuracy of the noninvasive diagnosis was 69.6% with Successive BA compared with 82.1% and 92.2% with, respectively, BA for F ≥ 2 and BA for F4.
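The three figures reported for each algorithm (overall diagnostic accuracy, rate of required liver biopsy, and accuracy of the noninvasive diagnosis) appear to be linked by simple arithmetic. The sketch below assumes that biopsied patients count as correctly classified, because biopsy is the reference. The text does not state this explicitly, but the reported numbers are consistent with it to within rounding.

```python
def overall_accuracy(biopsy_rate_pct: float, noninvasive_accuracy_pct: float) -> float:
    """Overall accuracy (%) assuming every biopsied patient is correctly classified.

    overall = biopsy rate + (1 - biopsy rate) * accuracy among non-biopsied patients
    """
    non_biopsied_pct = 100.0 - biopsy_rate_pct
    return biopsy_rate_pct + non_biopsied_pct * noninvasive_accuracy_pct / 100.0


# (biopsy rate %, noninvasive accuracy %, reported overall accuracy %) from the Results text.
reported = {
    "SAFE for F>=2":   (64.0, 85.1, 94.6),
    "SAFE for F4":     (6.4, 88.7, 89.5),
    "Successive SAFE": (70.8, 56.5, 87.3),
    "BA for F>=2":     (34.6, 82.1, 88.3),
    "BA for F4":       (24.6, 92.2, 94.2),
    "Successive BA":   (49.8, 69.6, 84.7),
}

for name, (biopsy, noninvasive, published) in reported.items():
    computed = overall_accuracy(biopsy, noninvasive)
    print(f"{name}: computed {computed:.1f}% vs reported {published:.1f}%")
```

Read this way, the high overall accuracy of Successive SAFE rests largely on its 70.8% biopsy rate: among the patients it classifies without biopsy, only 56.5% are correct.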
New Noninvasive Classifications of Fibrosis
There was no discrepancy between the reliable diagnoses of the CSF index and SF index (Supporting Fig. 4); the rate of required liver biopsy was therefore 0%. The diagnostic accuracy of the FM+FS classification was not significantly different between populations #2a (derivation) and #2b (validation): 87.7% and 85.8%, respectively (P = 0.461). Despite the absence of required liver biopsy, the FM+FS classification provided high diagnostic accuracy (86.7%), with no significant difference from that of Successive SAFE or Successive BA (Table 3). The FM+FS classification provided a lower rate of large discrepancies (≥2 F stages: 1.2%; Table 4) compared with Successive SAFE (3.1%, P = 0.015) or Successive BA (2.4%, P = 0.115). The rate of correctly classified patients was >85% in all diagnostic classes of the FM+FS classification (except for the F2/3 class: 74.2%), whereas it was <73% in all diagnostic classes of Successive SAFE or Successive BA (Supporting Fig. 5). The FM+FS classification provided the best performance profile,21 especially in F ≥ 2 stages: the rate of correctly classified patients was the highest (>80%) and the most homogeneous over the fibrosis stages, compared with the other algorithms (Fig. 3).
Sensitivity Analysis
We evaluated the influence of age, sex, biopsy length, Metavir F, and alanine aminotransferase (ALT) level on the diagnostic accuracy of successive algorithms and FM+FS classification. The influence of FibroScan examination characteristics (success rate, IQR/M) was also evaluated for Successive BA and FM+FS classification.
Successive SAFE.
By stepwise forward binary logistic regression, the rate of well-classified patients by Successive SAFE was independently associated with ALT (first step), age (second step), Metavir F (third step), and sex (fourth step; Supporting Table 3). The diagnostic accuracy of Successive SAFE as a function of each of these influencing factors is detailed in Supporting Table 4. The combination of age, sex, and ALT level showed that the diagnostic accuracy of Successive SAFE decreased in patients with a high ALT level, especially in the subgroup of men ≥50 years old, in which only 64.6% were well classified (Fig. 4A).
Successive BA.
The rate of well-classified patients by Successive BA was only independently associated with Metavir F (Supporting Table 3). The diagnostic accuracy of Successive BA was significantly lower in Metavir F2 or F3 stages compared with F0/1 or F4 (Supporting Table 4).
FM+FS Classification.
The rate of well-classified patients by FM+FS classification was independently associated with Metavir F (first step), IQR/M (second step), ALT (third step), and age (fourth step, Supporting Table 3). Diagnostic accuracy of FM+FS classification as a function of each of these influencing factors is detailed in Supporting Table 4. Diagnostic accuracy of FM+FS classification was always higher than 80%, whatever the combination of age, ALT, and IQR/M (Fig. 4B).
Patients and Methods
Patients.
We pooled the populations of three published studies, SNIFF 32,13 VINDIAG 7,14 and FIBROSTAR ANRS/HC/EP23,15 all of which had very similar inclusion and exclusion criteria. Patients were included if they had chronic hepatitis C, defined as both positive anti-hepatitis C virus antibodies and hepatitis C virus RNA in serum. Exclusion criteria were other causes of chronic hepatitis (hepatitis B or HIV coinfection, alcohol consumption > 30 g/day in men or >20 g/day in women in the 5 years before inclusion, hemochromatosis, or autoimmune hepatitis), cirrhosis complications (ascites, variceal bleeding, systemic infection, hepatocellular carcinoma), and antifibrotic treatment in the preceding 6 months. Patients were included from nine centers for SNIFF 32, three centers for VINDIAG 7, and 19 centers for FIBROSTAR, all located in France. Patients included in both VINDIAG 7 and FIBROSTAR were excluded from the FIBROSTAR population for the statistical analysis of the present study. All patients gave informed consent. Study protocols conformed to the ethical guidelines of the current Declaration of Helsinki and received approval from local ethics committees.
Liver Biopsy.
Liver fibrosis was evaluated according to Metavir fibrosis (F) staging. Significant fibrosis was defined as Metavir F ≥ 2, severe fibrosis as Metavir F ≥ 3, and cirrhosis as Metavir F4. Histological liver fibrosis evaluation was performed by blinded senior pathologists in each center. In the FIBROSTAR study, liver fibrosis was centrally evaluated by two senior experts with a consensus reading in cases of discordance. All pathologists involved in the three studies were hepatology specialists. Histological results were used as reference for the evaluation of noninvasive tests.
Blood Fibrosis Tests.
Fasting blood samples were collected immediately before or no more than 3 months after liver biopsy. Blood samples were processed independently in each center, except for hyaluronic acid, α2-macroglobulin, haptoglobin, and apolipoprotein A1, which were tested centrally in the FIBROSTAR study. FibroTest,1 FibroMeter2G,16 and APRI17 were calculated according to published or patented formulas. We have demonstrated the excellent interlaboratory reproducibility of these tests.18
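Of the three blood tests, only APRI has a simple, openly published formula; the text above does not restate it, so the sketch below relies on the commonly cited definition from the original APRI publication: the AST level divided by its upper limit of normal, multiplied by 100, and divided by the platelet count in 10⁹/L. Function and parameter names are illustrative only.

```python
def apri(ast_iu_per_l: float, ast_upper_limit_normal: float, platelets_10e9_per_l: float) -> float:
    """AST-to-platelet ratio index, using the commonly cited published definition.

    ast_iu_per_l            -- patient's AST level (IU/L)
    ast_upper_limit_normal  -- laboratory upper limit of normal for AST (IU/L)
    platelets_10e9_per_l    -- platelet count (10^9/L)
    """
    if ast_upper_limit_normal <= 0 or platelets_10e9_per_l <= 0:
        raise ValueError("upper limit of normal and platelet count must be positive")
    return (ast_iu_per_l / ast_upper_limit_normal) * 100.0 / platelets_10e9_per_l


# Illustrative values only (not study data): AST at twice the upper limit, platelets 100 x 10^9/L.
print(apri(80.0, 40.0, 100.0))
```

Because the result depends on each laboratory's upper limit of normal for AST, that value has to be supplied explicitly rather than hard-coded.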
Liver Stiffness Evaluation.
FibroScan was available in the VINDIAG 7 and FIBROSTAR studies. FibroScan examinations were performed under fasting conditions by an experienced observer (>50 examinations before the study), blinded for patient data. Examination conditions were those recommended by the manufacturer.19 FibroScan examinations were stopped when 10 valid measurements were recorded. Results (in kilopascals) were expressed as the median of all valid measurements. A FibroScan result was considered reliable when the interquartile range (IQR)/median ratio (IQR/M) was <0.21.20
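The reliability rule above can be sketched in a few lines. This is a minimal illustration, assuming the valid measurements are available as a list of values in kilopascals; the quartile method (`inclusive`) is an assumption, because the device's own IQR calculation is not described in the text and may differ slightly.

```python
from statistics import median, quantiles

IQR_M_CUTOFF = 0.21  # reliability threshold quoted in the text


def fibroscan_summary(valid_measurements_kpa):
    """Return (median in kPa, IQR/M ratio, reliable?) for a list of valid measurements."""
    result = median(valid_measurements_kpa)
    # Quartile interpolation method is an assumption, not specified by the source.
    q1, _, q3 = quantiles(valid_measurements_kpa, n=4, method="inclusive")
    iqr_over_median = (q3 - q1) / result
    return result, iqr_over_median, iqr_over_median < IQR_M_CUTOFF


# Illustrative measurements only (not study data).
tight = [7.0, 7.2, 7.4, 7.5, 7.6, 7.8, 8.0, 8.1, 8.3, 8.5]
spread = [4.0, 5.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 18.0, 20.0]

for label, values in (("tight", tight), ("spread", spread)):
    m, ratio, reliable = fibroscan_summary(values)
    print(f"{label}: median={m:.1f} kPa, IQR/M={ratio:.2f}, reliable={reliable}")
```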
Fibrosis Algorithms
Characteristics of the eight fibrosis algorithms evaluated in the present study are detailed in the glossary and summarized in Supporting Table 1 in the Supporting Material.
Decision-Making Algorithms
SAFE
SAFE for the diagnosis of significant fibrosis (SAFE for F ≥ 2; Fig. 1A), SAFE for the diagnosis of cirrhosis (SAFE for F4; Fig. 1B), and SAFE for the simultaneous diagnosis of significant fibrosis and cirrhosis (SAFE for F ≥ 2 and F4; Supporting Fig. 3) were determined according to data published by Sebastiani et al.6
Bordeaux Algorithm
BA for the diagnosis of significant fibrosis (BA for F ≥ 2; Fig. 2A) and BA for the diagnosis of cirrhosis (BA for F4; Fig. 2B) were determined according to data published by Castera et al.7
Successive Algorithms
An algorithm constructed for a binary diagnosis of liver fibrosis, such as SAFE or BA, provides only limited data for the management of patients in clinical practice. Indeed, when the noninvasive diagnosis provided by the algorithm specific to significant fibrosis is F ≥ 2, the physician has to apply the cirrhosis-specific algorithm in a second step to determine whether the patient has cirrhosis. We used the term "successive algorithms" to describe this consecutive use of algorithms in clinical practice. In the present study, we evaluated:
· Successive SAFE, which corresponds to the use of SAFE for F ≥ 2 followed by SAFE for F4 when necessary (Fig. 1C); and
· Successive Bordeaux algorithms (Successive BA), which corresponds to the use of BA for F ≥ 2 followed by BA for F4 when necessary. Successive BA presents as a three-diagnostic-class algorithm (Fig. 2C).
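The control flow of a successive algorithm, as described above, can be sketched as follows. This is an illustration only: the decision rules inside SAFE and BA (Figs. 1 and 2) are not reproduced, each binary algorithm is represented by a hypothetical callable that returns one of three outcomes, and the class labels are assumptions chosen to match a three-class output.

```python
from typing import Callable, Dict

# Hypothetical outcome codes for a binary algorithm (not taken from the source figures).
NEGATIVE, POSITIVE, BIOPSY = "negative", "positive", "biopsy"

Patient = Dict[str, float]
BinaryAlgorithm = Callable[[Patient], str]


def successive_algorithm(for_f2: BinaryAlgorithm, for_f4: BinaryAlgorithm, patient: Patient) -> str:
    """Apply the F >= 2 algorithm first, then the F4 algorithm only when F >= 2 is diagnosed."""
    first = for_f2(patient)
    if first == BIOPSY:
        return "liver biopsy required"
    if first == NEGATIVE:
        return "F0/1"
    # Noninvasive diagnosis is F >= 2: the cirrhosis-specific algorithm is needed.
    second = for_f4(patient)
    if second == BIOPSY:
        return "liver biopsy required"
    return "F4" if second == POSITIVE else "F2/3"


# Stub algorithms standing in for the published decision rules (illustration only).
always_positive = lambda patient: POSITIVE
always_negative = lambda patient: NEGATIVE
always_biopsy = lambda patient: BIOPSY

print(successive_algorithm(always_positive, always_negative, {}))  # F2/3
print(successive_algorithm(always_positive, always_biopsy, {}))    # liver biopsy required
```

The sketch shows why successive use can only add biopsies and misclassifications: a patient needs a biopsy if either step is indeterminate, and is correctly classified only if both steps are right.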
New Noninvasive Classification of Fibrosis
The new noninvasive classification of fibrosis was derived from the synchronous combination of FibroMeter and FibroScan results as described.14 The method is detailed in the glossary in the Supporting Information and summarized in Supporting Fig. 4. Briefly, two fibrosis indexes combining FibroMeter and FibroScan are derived by binary logistic regression: the clinically significant fibrosis (CSF) index (diagnostic target: Metavir F ≥ 2) and the severe fibrosis (SF) index (diagnostic target: Metavir F ≥ 3). The reliable diagnosis intervals (see the glossary in the Supporting Information for a precise definition) of these two indexes are then determined according to a method that has been described.13 Finally, the association of the reliable diagnoses from the CSF index and SF index determines the FM+FS classification, which includes six diagnostic classes of fibrosis stages (F0/1, F1/2, F2±1, F2/3, F3±1, and F4) and eliminates the need for liver biopsy.
Statistical Analysis
All FibroScan examinations, reliable or not, were included in the initial statistical analysis. Then sensitivity analysis was performed in patients with reliable FibroScan results.
SAFE was evaluated in the SNIFF32 (called population #1 in the present study), VINDIAG 7 (population #2a), and FIBROSTAR (population #2b) cohorts. Because FibroScan was not available in the SNIFF32 study, BA was evaluated only in the VINDIAG 7 and FIBROSTAR populations (i.e., population #2). The VINDIAG 7 study provided the exploratory population of the FM+FS classification,14 which was then validated in the FIBROSTAR population.
Performance for the evaluated fibrosis algorithms was expressed as the rate of correctly classified patients according to liver biopsy results, the rate of required liver biopsy, and the performance profile as described21; comparisons were done with the paired McNemar test. Statistical software was SPSS, version 11.5 (SPSS Inc., Chicago, IL).
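A paired comparison of two algorithms applied to the same patients can be sketched with an exact McNemar test, which uses only the discordant pairs (patients correctly classified by one algorithm but not the other). The text does not say whether the exact or the chi-square version was used in SPSS, so the exact binomial form below is an assumption, and the function names are illustrative.

```python
from math import comb


def discordant_pairs(correct_a, correct_b):
    """Count patients correct with A only (b) and correct with B only (c)."""
    b = sum(1 for a_ok, b_ok in zip(correct_a, correct_b) if a_ok and not b_ok)
    c = sum(1 for a_ok, b_ok in zip(correct_a, correct_b) if b_ok and not a_ok)
    return b, c


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value: binomial test of b vs c with p = 0.5."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    smaller = min(b, c)
    tail = sum(comb(n, i) for i in range(smaller + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Illustrative per-patient outcomes (True = correctly classified); not study data.
algorithm_a = [True, True, False, True, True, True, False, True, True, True]
algorithm_b = [True, False, False, False, True, False, False, True, False, True]

b, c = discordant_pairs(algorithm_a, algorithm_b)
print(f"discordant pairs: {b} vs {c}, p = {mcnemar_exact(b, c):.4f}")
```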