Performance of Transient Elastography for the Staging of Liver Fibrosis: A Meta-Analysis
 
 
 
Gastroenterology April 2008
 
Mireen Friedrich-Rust*, Mei-Fang Ong, Swantje Martens, Christoph Sarrazin*, Joerg Bojunga*, Stefan Zeuzem*, Eva Herrmann

* Department of Internal Medicine I, J. W. Goethe-University Hospital, Frankfurt, Germany
 
Faculty of Medicine, Internal Medicine-Biomathematics, Saarland University, Homburg, Germany
 
We performed a meta-analysis to assess the overall performance of transient elastography for the diagnosis of liver fibrosis and to analyze the heterogeneity between the available studies.
 
"Transient elastography can be performed with excellent diagnostic accuracy and independent of the underlying liver disease for the diagnosis of cirrhosis. However, for the diagnosis of significant fibrosis, a high variation of the AUROC was found that is dependent on the underlying liver disease."
 
Transient Elastography

 
Transient elastography is a novel method. The first clinical data from transient elastography were published in 2002. Transient elastography is performed with an ultrasound transducer probe mounted on the axis of a vibrator. A vibration transmitted from the vibrator toward the tissue induces an elastic shear wave that propagates through the tissue. These propagations are followed by pulse-echo ultrasound acquisitions and their velocity is measured, which is related directly to tissue stiffness. The harder the tissue, the faster the shear wave propagates.6 Up to 10 successful acquisitions are performed routinely on each patient and the examination lasts about 5-10 minutes. The success rate is calculated automatically by the machine as the ratio of the number of successful acquisitions over the total number of acquisitions. According to the manufacturer's recommendations, only transient elastography results obtained with 10 valid measurements and with a success rate of at least 60% are considered reliable. However, recent publications have suggested that 3 valid measurements could be performed with the same results as 10 valid measurements for cirrhosis diagnosis, but the minimum number for significant and advanced fibrosis is unknown.7 The quality assessment using the success rates varied between studies, with a range from 30% to 65%. Ten valid measurements and a success rate of at least 60% can be achieved in 90%-96% of examinations. Transient elastography can be learned easily and has a high intraobserver (96%-98%) and interobserver (89%-98%) agreement.8
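The reliability rule described above (10 valid measurements and a success rate of at least 60%) can be written as a short check. The following Python sketch is illustrative only: the function names and the default thresholds as parameters are assumptions introduced here, not part of the device software or of the paper.

```python
def success_rate(valid: int, total: int) -> float:
    """Ratio of successful acquisitions to all attempted acquisitions."""
    if total <= 0:
        raise ValueError("total acquisitions must be positive")
    if not 0 <= valid <= total:
        raise ValueError("valid acquisitions must lie between 0 and total")
    return valid / total


def is_reliable(valid: int, total: int,
                min_valid: int = 10, min_rate: float = 0.60) -> bool:
    """Apply the manufacturer's rule quoted in the text.

    min_valid and min_rate default to the values stated above; they are
    exposed as parameters (an assumption of this sketch) because the
    included studies used success-rate limits between 30% and 65%.
    """
    return valid >= min_valid and success_rate(valid, total) >= min_rate
```

For instance, 10 valid acquisitions out of 15 attempts pass the check, whereas 10 out of 20 do not, because the success rate falls to 50%.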
 
To analyze whether the underlying liver disease has an influence on the AUROC values, the studies were divided into 3 groups: studies examining hepatitis C virus (HCV)-infected patients only, studies examining a patient population of different liver diseases including HCV, and studies without HCV patients. This group selection was chosen because most studies examining a single liver disease considered HCV.
 
Because the fibrosis staging system used to classify the histology varied, scoring systems using scores from 0 to 4 for fibrosis staging (METAVIR, Desmet and Scheuer, Knodell, Brunt, Ludwig's) were pooled for the overall calculation of the mean AUROC. The influence of the different staging systems on the mean AUROC was analyzed separately. The Ishak score, using a scale from 0 to 6, was transferred into METAVIR with Ishak F ≥ 3 assigned to METAVIR F ≥ 2, Ishak F ≥ 4 assigned to METAVIR F ≥ 3, and Ishak F ≥ 5 assigned to METAVIR F = 4, respectively.
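The Ishak-to-METAVIR assignment stated above is defined only at the three diagnostic thresholds, so a minimal Python sketch can express it as three yes/no classifications rather than as a stage-by-stage conversion. The function name and dictionary keys are hypothetical labels chosen for this illustration.

```python
def ishak_to_metavir_classes(ishak: int) -> dict:
    """Dichotomize an Ishak fibrosis score (0-6) at the METAVIR-equivalent
    thresholds given in the text. No mapping is implied for individual
    stages below the first threshold."""
    if not 0 <= ishak <= 6:
        raise ValueError("Ishak fibrosis score must be between 0 and 6")
    return {
        "significant_fibrosis": ishak >= 3,  # treated as METAVIR F >= 2
        "severe_fibrosis": ishak >= 4,       # treated as METAVIR F >= 3
        "cirrhosis": ishak >= 5,             # treated as METAVIR F = 4
    }
```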
 
The meta-analysis was performed using the random-effects model (DerSimonian and Laird estimator)11 for the AUROC with straightforward extensions to meta-regression and summary ROC (SROC) techniques. The AUROC was known in all included studies (see inclusion criteria) and the standard error of the single studies could be determined or approximated from the available data, especially using the 95% confidence intervals (CIs). The random-effects model incorporated heterogeneity of studies in the analysis of the overall efficacy of transient elastography in the different studies. The method estimated the magnitude of the heterogeneity and assigned a greater variability to the estimate of the overall mean AUROC. Studies with a larger sample size and therefore a smaller standard error received more weight when calculating the mean AUROC. The reason for heterogeneity between studies was analyzed in regard to the effect of different factors (underlying liver disease, staging system used, country where the study was performed, publication as abstract vs full-length article, mean body mass index [BMI], mean age, fibrosis stage, sex distribution, mean or median length of liver biopsy specimen, proportion of liver biopsy failure, proportion of FibroScan failure, as well as the quality criteria described later) on the AUROC. Nevertheless, in contrast to testing continuous factors, the asymptotic foundation of testing categoric factors may become problematic if only part of the heterogeneity can be explained by the respective factor. Therefore, we interpret significant test results as a reduction of heterogeneity and also provide CIs from a random-effects model for different categories. Furthermore, we studied the influence of the difference of the mean of advanced and the mean of nonadvanced fibrosis stages (DANA) on the mean AUROC and the adjusted AUROC according to the quality of liver biopsy.12, 13
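As an illustration of the pooling step, the sketch below implements the standard DerSimonian and Laird random-effects estimator in plain Python. It is not the authors' code. It assumes that each study's standard error can be approximated from a symmetric 95% CI on the AUROC scale (half-width divided by 1.96), which is one plausible reading of "approximated from the available data"; the function names are hypothetical.

```python
import math


def se_from_ci(lower: float, upper: float, z: float = 1.96) -> float:
    """Approximate a standard error from a symmetric 95% CI (assumption)."""
    return (upper - lower) / (2 * z)


def dersimonian_laird(estimates, std_errors):
    """Return (pooled mean, its standard error, tau^2) for study estimates."""
    k = len(estimates)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least two studies with matching inputs")
    variances = [se ** 2 for se in std_errors]
    w = [1.0 / v for v in variances]          # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sw
    # Cochran's Q measures between-study heterogeneity
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)        # between-study variance
    # Random-effects weights add tau^2 to every within-study variance,
    # which widens the CI of the pooled mean when studies disagree.
    w_star = [1.0 / (v + tau2) for v in variances]
    sw_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sw_star
    return pooled, math.sqrt(1.0 / sw_star), tau2
```

A study reporting a 95% CI of 0.80 to 0.90 (hypothetical values) would be assigned a standard error of about 0.026 under this approximation. Studies with smaller standard errors receive larger weights, as described in the text.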
 
To assess the quality of the studies included in the meta-analysis, the Quality Assessment of Studies of Diagnostic Accuracy Included in Systematic Review (QUADAS) questionnaire was used (see Supplementary Table 1 online at www.gastrojournal.org). Items were rated as yes, no, or unclear. The impact of the fulfillment of the individual QUADAS items on the diagnosis of liver fibrosis was analyzed.14 Item 3 (appropriate reference standard) was rated as unclear if no data on the length of the liver biopsy specimen or portal tracts were given. Item 9 was rated as unclear if the staging system was given but no inclusion criteria concerning the length of the liver biopsy or the number of portal tracts were reported.
 
Furthermore, a SROC was calculated from all studies in which sensitivity and specificity were known for at least one cut-off level using a weighted linear model according to Littenberg and Moses.15 The weights were chosen according to sample size. Such a weighting scheme also was used for the assessment of the influence of the chosen cut-off levels for liver stiffness on sensitivity and specificity (where reported). In general, sensitivity should decrease and specificity should increase with increasing cut-off levels. Nevertheless, heterogeneity between the studies may disturb this general trend.
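The weighted linear SROC model can be sketched as follows. This is a generic Python rendering of the Littenberg and Moses approach, not the authors' implementation: each study contributes D = logit(sensitivity) - logit(1 - specificity), the log diagnostic odds ratio, and S = logit(sensitivity) + logit(1 - specificity), a proxy for the threshold, and D is regressed on S with weights equal to sample size. Function names are hypothetical, and no continuity correction is applied, so sensitivities or specificities of exactly 0 or 1 are not handled.

```python
import math


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def moses_littenberg(sens, spec, sizes):
    """Weighted least-squares fit of D = a + b * S; returns (a, b)."""
    d = [logit(se) - logit(1.0 - sp) for se, sp in zip(sens, spec)]
    s = [logit(se) + logit(1.0 - sp) for se, sp in zip(sens, spec)]
    w_sum = sum(sizes)
    s_bar = sum(w * si for w, si in zip(sizes, s)) / w_sum
    d_bar = sum(w * di for w, di in zip(sizes, d)) / w_sum
    sxx = sum(w * (si - s_bar) ** 2 for w, si in zip(sizes, s))
    if sxx == 0:
        raise ValueError("all studies share the same S; slope is undefined")
    sxy = sum(w * (si - s_bar) * (di - d_bar)
              for w, si, di in zip(sizes, s, d))
    b = sxy / sxx
    a = d_bar - b * s_bar
    return a, b


def sroc_sensitivity(fpr: float, a: float, b: float) -> float:
    """Sensitivity on the fitted SROC curve at a given false-positive rate.

    Solving D = a + b * S for logit(sensitivity) gives
    logit(TPR) = a / (1 - b) + (1 + b) / (1 - b) * logit(FPR).
    """
    x = a / (1.0 - b) + (1.0 + b) / (1.0 - b) * logit(fpr)
    return 1.0 / (1.0 + math.exp(-x))
```

A slope b close to 0 indicates a symmetric curve with a diagnostic odds ratio that does not depend on the threshold; a slope clearly different from 0 indicates that the odds ratio varies with the cut-off.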
 
ABSTRACT
 
Background & Aims: Transient elastography has been studied in a multitude of liver diseases for the staging of liver fibrosis with variable results. A meta-analysis was performed to assess the overall performance of transient elastography for the diagnosis of liver fibrosis and to analyze factors influencing the diagnostic accuracy.
 
Methods: Literature databases and international conference abstracts were searched. Inclusion criteria were as follows: evaluation of transient elastography, liver biopsy as reference, and assessment of the area under the receiver operating characteristic curve (AUROC). The meta-analysis was performed using the random-effects model for the AUROC, summary receiver operating curve techniques, as well as meta-regression approaches.
 
Results: Fifty studies were included in the analysis. The mean AUROCs for the diagnosis of significant fibrosis, severe fibrosis, and cirrhosis were 0.84 (95% confidence interval [CI], 0.82-0.86), 0.89 (95% CI, 0.88-0.91), and 0.94 (95% CI, 0.93-0.95), respectively. For the diagnosis of significant fibrosis a significant reduction of heterogeneity of the AUROC was found when differentiating between the underlying liver diseases (P < .001). Other factors influencing the AUROC were the scoring system used and the country in which the study was performed. Age, body mass index, and biopsy quality did not have a significant effect on the AUROC.
 
Conclusions: Transient elastography can be performed with excellent diagnostic accuracy and independent of the underlying liver disease for the diagnosis of cirrhosis. However, for the diagnosis of significant fibrosis, a high variation of the AUROC was found that is dependent on the underlying liver disease.
 
Discussion
 
The systematic literature search revealed 50 studies evaluating the diagnostic performance of transient elastography for the staging of liver fibrosis, which fulfilled the inclusion criteria and reported enough data to perform a meta-analysis. The aim of the systematic literature search was to include all relevant publications (including abstracts) with the main focus on the meta-analysis of the AUROC. A meta-analysis based on individual data was not the scope of the present study. Therefore, the power of this meta-analysis is certainly lower in comparison with large studies and studies including individual data.
 
Transient elastography performed best at differentiating cirrhosis vs no cirrhosis with a mean AUROC of 94% (95% CI, 0.93-0.95) and an adjusted AUROC of 99%. A diagnostic tool is defined as perfect if the AUROC is 100%, excellent if the AUROC is greater than 90%, and good if the AUROC is greater than 80%.64, 65 According to these results, transient elastography can be used in clinical practice as an excellent tool for the confirmation of cirrhosis when other clinical signs and examinations are nondecisive. In our view, a liver biopsy is not essential anymore to answer this question. Unfortunately, not enough information from the single studies was available to analyze in what percentage of patients the diagnosis of cirrhosis could have been made owing to overt clinical and biochemical signs of cirrhosis (low platelet count, low albumin level, increased international normalized ratio, sonographic signs of cirrhosis). The optimal cut-off value for the diagnosis of cirrhosis suggested from the SROC was 13.01 kPa.
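The grading scale quoted above (perfect at 100%, excellent above 90%, good above 80%) can be stated as a small Python helper. The function name and the "ungraded" label for values at or below 80% are assumptions of this sketch; the text gives no label for that range.

```python
def grade_auroc(auroc: float) -> str:
    """Label an AUROC (as a fraction between 0 and 1) using the scale
    cited in the text."""
    if not 0.0 <= auroc <= 1.0:
        raise ValueError("AUROC must be between 0 and 1")
    if auroc == 1.0:
        return "perfect"
    if auroc > 0.90:
        return "excellent"
    if auroc > 0.80:
        return "good"
    return "ungraded"  # no label is given in the text for this range
```

On this scale the mean AUROC of 0.94 for cirrhosis is graded excellent and the mean AUROC of 0.84 for significant fibrosis is graded good.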
 
The presence of significant fibrosis (F ≥ 2) is considered a hallmark of a progressive liver disease. The highest aim of treatment is to cure the patient by resolving the underlying cause of liver disease (viral elimination in viral hepatitis, alcohol abstinence in ASH, weight loss in NASH, and immunosuppressant treatment in autoimmune hepatitis). Studies have shown that antiviral treatment of patients with chronic hepatitis C prolongs life, improves quality of life, and is cost effective.66, 67 However, treatment may be associated with severe side effects and the decision for treatment needs to be made on an individual basis. Patients with present fibrosis stage F2 and more already have shown a great progression of their liver disease and are at increased risk of developing cirrhosis with its sequelae (ie, esophageal varices, ascites, hepatic encephalopathy, and hepatocellular carcinoma). Therefore, patients with fibrosis stage F2 and more have a stronger indication for treatment as compared with patients with no or mild fibrosis (F0/1).2, 3, 66
 
The AUROC for F ≥ 2 varied between the different studies with a range of 68%-100% and a mean AUROC of 84% (95% CI, 0.82-0.86) and an adjusted AUROC of 91%. For this indication, transient elastography alone cannot be used sufficiently in clinical practice. However, taking into account other clinical and diagnostic results, transient elastography can be a helpful tool for directing treatment decisions. The optimal cut-off value for the diagnosis of significant fibrosis suggested from the SROC was 7.65 kPa. However, because of the high heterogeneity caution must be taken when interpreting the results of different populations.
 
Compared with fibrosis biomarkers the disadvantage of transient elastography is the absence of a large control group to assess the limit of normal value (ie, blood donors). In addition, in studies using liver biopsy as a reference method, the number of patients without fibrosis (F0) is very small. Although transient elastography shows the best diagnostic accuracy for the differentiation of F0/1/2/3 and F4, the validated biomarkers are superior in differentiating F0 vs F1 vs F2. Studies thus have shown that the combination of transient elastography with biomarkers can further improve the diagnostic accuracy, especially for the diagnosis of significant fibrosis.17, 41
 
Recently, a series of algorithms based on a sequential combination of noninvasive serum markers showed 93%-95% accuracy in the detection or exclusion of significant liver fibrosis and a reduction of 50% of liver biopsies in this subset of patients with HCV.68 Further studies are needed to investigate if the inclusion of transient elastography in an algorithm with a combination of noninvasive serum markers may further reduce the number of liver biopsies needed. Transient elastography and the serum fibrosis marker FibroTest (BioPredictive, Paris, France) currently have been approved after an independent systematic review by the French Health authorities for the diagnosis of advanced fibrosis and cirrhosis in patients with HCV.
 
Significant heterogeneity was found between the single studies. Different possible reasons (qualitative and quantitative factors) for this heterogeneity were analyzed.
 
Discriminating between the underlying liver diseases led to a reduction of heterogeneity of AUROC for the differentiation of F0/1 vs F2/3/4. These results again support the use of transient elastography for the differentiation of cirrhosis vs no cirrhosis independent of the underlying liver disease, whereas caution needs to be taken for the interpretation of the differentiation of no/mild fibrosis from significant fibrosis.
 
The different scoring systems seem to have an impact on the heterogeneity of the studies and might be partially explained by the different underlying liver diseases that use different scoring systems. Not enough data were available to perform a multivariate analysis to analyze these coherences further.
 
For the diagnosis of significant fibrosis and cirrhosis a significant reduction of heterogeneity was observed when differentiating between the different countries where the studies were performed. This may be explained by different population groups and the quality criteria with respect to study conduction and result reporting. Because most of the studies were abstracts only, detailed information rarely was available. The mean/median length of the liver biopsy specimen was reported in 16 studies only. It ranged from 12 to 35 mm. However, in a subanalysis there was no significant influence of the length of the liver biopsy specimen on the AUROC. Most studies lack further information on the quality of the liver biopsy, ie, the number of fragmentations, the blinding of the pathologist, and the use of a central pathologist, and so forth. This certainly accounts for the heterogeneity between the studies. Nevertheless, assessment of quality by QUADAS items could not explain the heterogeneity between the studies sufficiently. Large international studies with satisfying high-quality criteria with respect to study conduction and result reporting are awaited to overcome these discrepancies.
 
The predictive values of tests are known to be affected by disease prevalence and the distribution of fibrosis stages. However, the prevalence of extreme fibrosis stages described by DANA showed no or only slight influence on the AUROC in the present study (Figure 2, Supplementary Table 4; see Supplementary material online at www.gastrojournal.org). The correlation of DANA with the AUROC in our meta-analysis was not as strong as in previous studies of FibroTest13, 69 or as could be expected here. There may be several reasons for this, in particular additional sources of heterogeneity in our meta-analysis of FibroScan compared with the published meta-analyses of FibroTest. Furthermore, the range of DANA, which can vary between 1 and 4, is limited in our meta-regression (Figure 2), and details on the prevalence of extreme fibrosis stages were not available in all included studies. Therefore, a multivariate analysis (eg, by the analysis of the DANA-adjusted AUROC with a reliable adjustment for DANA) was not possible here. This was a limitation of the present meta-analysis, and the influence of differences in the prevalence of the fibrosis stages on AUROC should be examined in future analyses based on individual data.
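To make the DANA measure concrete, the following Python sketch computes it from the number of patients in each fibrosis stage. It assumes a 0 to 4 staging scale with F0/F1 counted as nonadvanced and F2 to F4 as advanced; this split is an assumption of the example, chosen because it reproduces the range of 1 to 4 mentioned above. The function name and parameters are hypothetical.

```python
def dana(stage_counts, first_advanced_stage: int = 2) -> float:
    """Difference between the mean stage of advanced and nonadvanced patients.

    stage_counts[i] is the number of patients with fibrosis stage F_i.
    """
    def mean_stage(stages):
        n = sum(stage_counts[s] for s in stages)
        if n == 0:
            raise ValueError("each group needs at least one patient")
        return sum(s * stage_counts[s] for s in stages) / n

    all_stages = range(len(stage_counts))
    nonadvanced = [s for s in all_stages if s < first_advanced_stage]
    advanced = [s for s in all_stages if s >= first_advanced_stage]
    return mean_stage(advanced) - mean_stage(nonadvanced)
```

A population made up only of F1 and F2 patients gives the minimum value of 1, a population made up only of F0 and F4 patients gives the maximum of 4, and equal numbers in every stage give 2.5.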
 
Most studies presented the AUROC as a measure of test performance. However, the AUROC has limitations and may not be the best way to present the diagnostic performance of a test. Unfortunately, SROC analysis showed significant dependence of the diagnostic odds ratios on the chosen threshold because of significant deviations from symmetry and different thresholds used in the single studies. Therefore, we did not perform a meta-analysis of diagnostic odds ratio.
 
The use of liver biopsy as a reference standard for the evaluation of noninvasive methods and markers has methodologic limitations that may influence the performance of these tests. The accuracy of liver biopsy is limited because of intraobserver and interobserver variability and sampling errors.5 In a study on more than 10,000 virtual biopsy specimens Bedossa et al5 showed that liver fibrosis stage is diagnosed correctly in only 65% of cases if the biopsy is at least 15 mm long, in 75% of cases if it is at least 25 mm long, and that the optimal size should be 40 mm. However, most biopsy specimens even at hepatology centers do not fulfill these optimal criteria.70 Nevertheless, transient elastography cannot replace liver biopsy. Liver biopsy as compared with transient elastography gives additional information on the cause of liver injury (viral, hereditary, autoimmune liver disease), necroinflammatory activity, and steatosis. Also, it must be noted that transient elastography cannot be used for the staging of liver fibrosis in patients with acute hepatitis or hepatitis exacerbation because transient elastography measurements significantly overestimate the stage of liver fibrosis during alanine aminotransferase flare.71
 
Data analyzing the discordance of liver biopsy and the panel marker FibroTest showed that this discordance was highly attributed to biopsy in 5% and to the panel marker in 2% (P = .03).70 The investigators concluded that these shortcomings of liver biopsy lead to underestimation of the diagnostic accuracy of noninvasive markers. That this also might apply to the underestimation of transient elastography was shown in another study analyzing the discordance of the panel marker FibroTest and transient elastography compared with liver biopsy. The investigators showed that this discordance was attributable to FibroTest failure in 12.4% and to transient elastography failure in 6.8%.72 At present, a perfect gold standard for the evaluation of liver fibrosis is not available. Liver biopsy, FibroTest, and transient elastography remain imperfect reference methods. Therefore, specific methodology that is independent of a gold standard could be recommended to overcome these limitations at this time point.73 Another possibility would be an optimization of the reference standard (eg, laparoscopic liver biopsy with a biopsy specimen from the left and right lobes of 20-mm length each). Only with an improved, comparable, and standardized reference standard can the true diagnostic performance of transient elastography be evaluated.
 
The ultimate validation of liver fibrosis as a marker of liver injury is its prognostic value in terms of morbidity and mortality. In a recently published study, the biomarker FibroTest was shown to have a 5-year prognostic value similar to that of liver biopsy.74 However, transient elastography is still a novel method and 5-year follow-up studies are not available yet. Large, well-conducted, randomized trials with clearly defined end points (eg, assessing 5-year survival without HCV-related cirrhosis or complications related to liver disease such as liver-related death, liver transplantation, hepatic decompensation, variceal bleeding, hepatocellular carcinoma) are needed to compare transient elastography with liver biopsy and biochemical markers.
 
 
 
 