Conference Reports for NATAP
 
  Fatty Liver Disease

 
 
 
 
Validity criteria for the diagnosis of fatty liver by M probe-based controlled attenuation parameter
 
 
 
"CAP is performed together with LSM and can be used to detect fatty liver and assess the degree of liver injury simultaneously and conveniently. However, compared with LSM, the overall accuracy of CAP is lower, with a significant proportion of patients being misclassified.[9], [10], [11], [12], [13], [17] Head-to-head comparisons also showed that CAP is inferior to magnetic resonance imaging-based proton density fat fraction.24 Nevertheless, because of cost and availability, TE will likely remain a commonly used non-invasive test in clinical practice, and it is all the more important for identifying factors associated with the accuracy of CAP."
 
"while the diagnosis of fatty liver based on ≥5% steatosis by liver biopsy was in accordance with international recommendations, newer magnetic resonance imaging (MRI)-based techniques such as MRI-proton density fat fraction (MRI-PDFF) and proton-magnetic resonance spectroscopy have been shown to be highly accurate and even more sensitive than liver biopsy in detecting changes in liver fat over time.35 In head-to-head comparison, MRI-PDFF has shown higher applicability and accuracy than CAP in the United States and Japan.[24], [36] It would be important to validate the validity criteria of CAP against MRI in future studies. Similarly, the performance of XL probe-based CAP measurement should also be validated against MRI. Quantitative ultrasound has also shown good correlation with MRI-PDFF and deserves further evaluation.37
 
In conclusion, the validity of CAP for the diagnosis of fatty liver is lower if the IQR of CAP is ≥40 dB/m. Traditional factors affecting the performance of LSM have little impact on the validity of CAP. Our findings provide guidance on the interpretation of CAP results in routine clinical practice.
 
Lay summary: Controlled attenuation parameter (CAP) is measured by transient elastography (TE) for the detection of fatty liver. In this large study, using liver biopsy as a reference, we show that the variability of CAP measurements based on its interquartile range can reflect the accuracy of fatty liver diagnosis. In contrast, other clinical factors such as adiposity and liver enzyme levels do not affect the performance of CAP."
 
---------------------
 
Validity criteria for the diagnosis of fatty liver by M probe-based controlled attenuation parameter
 
Journal of Hepatology

Vincent Wai-Sun Wong1,2,†, Salvatore Petta3,†, Jean-Baptiste Hiriart4, Calogero Camma3, Grace Lai-Hung Wong1,2, Fabio Marra5, Julien Vergniol4, Anthony Wing-Hung Chan6, Antonino Tuttolomondo7, Wassil Merrouche4, Henry Lik-Yuen Chan1,2, Brigitte Le Bail8,9, Umberto Arena5, Antonio Craxi3, Victor de Ledinghen4,8

1Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Hong Kong
2State Key Laboratory of Digestive Disease, The Chinese University of Hong Kong, Hong Kong
3Sezione di Gastroenterologia, Di.Bi.M.I.S., University of Palermo, Palermo, Italy
4Centre d'Investigation de la Fibrose Hepatique, Hopital Haut-Leveque, Bordeaux University Hospital, Pessac, France
5Dipartimento di Medicina Sperimentale e Clinica, Universita degli Studi di Firenze, Florence, Italy
6Department of Anatomical and Cellular Pathology, The Chinese University of Hong Kong, Hong Kong
7Sezione di Medicina Interna e Cardioangiologia, Di.Bi.M.I.S., University of Palermo, Palermo, Italy
8INSERM U1053, Bordeaux University, Bordeaux, France
9Service de Pathologie, Hopital Pellegrin, Bordeaux University Hospital, Bordeaux, France
 
Highlights
 
⋅Controlled attenuation parameter (CAP) can detect fatty liver with moderate accuracy.
⋅The interquartile range (IQR) of CAP reflects its measurement variability.
⋅The accuracy of CAP declines when its IQR exceeds 40 dB/m.
⋅The accuracy of CAP is not affected by high transaminase or bilirubin levels.
 
Background & Aims
 
Controlled attenuation parameter (CAP) can be performed together with liver stiffness measurement (LSM) by transient elastography (TE) and is often used to diagnose fatty liver. We aimed to define the validity criteria of CAP.
 
Methods
 
CAP was measured by the M probe prior to liver biopsy in 754 consecutive patients with different liver diseases at three centers in Europe and Hong Kong (derivation cohort, n = 340; validation cohort, n = 414; 101 chronic hepatitis B, 154 chronic hepatitis C, 349 non-alcoholic fatty liver disease, 37 autoimmune hepatitis, 49 cholestatic liver disease, 64 others; 277 F3-4; age 52 ± 14; body mass index 27.2 ± 5.3 kg/m2). The primary outcome was the diagnosis of fatty liver, defined as steatosis involving ≥5% of hepatocytes.
 
Results
 
The area under the receiver-operating characteristics curve (AUROC) for CAP diagnosis of fatty liver was 0.85 (95% CI 0.82-0.88). The interquartile range (IQR) of CAP had a negative correlation with CAP (r = -0.32, p <0.001), suggesting the IQR-to-median ratio of CAP would be an inappropriate validity parameter. In the derivation cohort, the IQR of CAP was associated with the accuracy of CAP (AUROC 0.86, 0.89 and 0.76 in patients with IQR of CAP <20 [15% of patients], 20-39 [51%], and ≥40 dB/m [33%], respectively). Likewise, the AUROC of CAP in the validation cohort was 0.90 and 0.77 in patients with IQR of CAP <40 and ≥40 dB/m, respectively (p = 0.004). The accuracy of CAP in detecting grade 2 and 3 steatosis was lower among patients with body mass index ≥30 kg/m2 and F3-4 fibrosis.
 
Conclusions
 
The validity of CAP for the diagnosis of fatty liver is lower if the IQR of CAP is ≥40 dB/m.
 
Lay summary: Controlled attenuation parameter (CAP) is measured by transient elastography (TE) for the detection of fatty liver. In this large study, using liver biopsy as a reference, we show that the variability of CAP measurements based on its interquartile range can reflect the accuracy of fatty liver diagnosis. In contrast, other clinical factors such as adiposity and liver enzyme levels do not affect the performance of CAP.
 
Introduction
 
Non-alcoholic fatty liver disease (NAFLD) is currently the most common chronic liver disease worldwide and has become an important cause of end-stage liver disease and hepatocellular carcinoma.[1], [2], [3], [4] The presence of fatty liver and metabolic syndrome in patients with chronic viral hepatitis is also associated with increased risk of cirrhosis and hepatocellular carcinoma.[5], [6], [7], [8] Abdominal ultrasonography is commonly used to diagnose fatty liver, but it cannot reliably diagnose mild steatosis, and its performance is suboptimal in obese patients. Recently, the controlled attenuation parameter (CAP) was developed as a new test for fatty liver. It is based on the physical phenomenon that the amplitude of ultrasound waves is attenuated more rapidly when they traverse a steatotic liver. In previous studies, CAP had moderate to good accuracy for fatty liver detection when compared with histology or magnetic resonance spectroscopy.[9], [10], [11], [12], [13] CAP is measured simultaneously with liver stiffness measurement (LSM) using transient elastography (TE). It is thus possible to diagnose fatty liver and assess disease severity at the same time.
 
A TE examination involves 10 measurements, with the median CAP and LSM taken as the estimates for liver fat and fibrosis, respectively, and the interquartile range (IQR) as the dispersion or fluctuation of the measurements. For LSM, the IQR-to-median ratio is a well-recognized parameter for determining the validity of measurement.[14] A high IQR-to-median ratio reflects inconsistent results from the 10 measurements and is associated with less accurate results. However, the validity criteria for CAP are undefined. It is therefore difficult for clinicians to interpret CAP results.
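The summary statistics described above can be sketched as follows. This is a minimal illustration, not the device's own algorithm: the readings are hypothetical, the function name `summarize_exam` is invented for this example, and the quartile interpolation method (`method="inclusive"`) is an assumption, since the text does not say how the IQR is computed.

```python
from statistics import median, quantiles


def summarize_exam(values):
    """Return (median, IQR, IQR-to-median ratio) for one set of readings."""
    if len(values) < 10:
        raise ValueError("expected at least 10 successful measurements")
    # Assumption: linear interpolation that includes the extreme values.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    med = median(values)
    iqr = q3 - q1
    return med, iqr, iqr / med


# Hypothetical CAP readings in dB/m from a single examination.
cap_readings = [280, 290, 300, 310, 285, 295, 305, 315, 275, 320]
cap_median, cap_iqr, cap_ratio = summarize_exam(cap_readings)
print(cap_median, cap_iqr, round(cap_ratio, 3))
```

The same function applies to LSM readings, where the ratio (third element) is the quantity used as the validity indicator.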
 
In this study, we aim to determine factors associated with less accurate CAP measurements and define the validity criteria for CAP.
 
Discussion
 
CAP is performed together with LSM and can be used to detect fatty liver and assess the degree of liver injury simultaneously and conveniently. However, compared with LSM, the overall accuracy of CAP is lower, with a significant proportion of patients being misclassified.[9], [10], [11], [12], [13], [17] Head-to-head comparisons also showed that CAP is inferior to magnetic resonance imaging-based proton density fat fraction.[24] Nevertheless, because of cost and availability, TE will likely remain a commonly used non-invasive test in clinical practice, which makes it all the more important to identify factors associated with the accuracy of CAP.
 
The IQR-to-median ratio has been the most important validity indicator for LSM.[14], [20] Normalization for the median LSM is important because the IQR of LSM increases with median LSM values. In stark contrast with LSM, we found that the IQR of CAP declined slightly with increasing median CAP values in this large biopsy cohort (Fig. 3). Thus, the IQR-to-median ratio of CAP cannot be used for normalization. Instead, it potentiates the negative correlation with median CAP and would only spuriously make CAP examination appear more accurate in patients with high CAP values. For this reason, a previous study involving 153 patients with BMI ≥28 kg/m2 failed to show any impact of the IQR-to-median ratio on the accuracy of CAP.13
 
Among the quality indicators, the IQR of CAP turns out to be the most significant factor associated with the validity of CAP. The IQR of CAP reflects the variability of measurements and how well the examination is performed. An IQR of CAP of ≥40 dB/m was consistently associated with inferior accuracy in both the derivation and validation cohorts. Importantly, it also affected the accuracy of diagnosing NAFLD, which is the main clinical use of CAP.15 Besides, the new validity criteria performed similarly across different subgroups.
 
Although this study adopted the CAP cut-offs suggested by Sasso and colleagues,9 we acknowledge the selection of cut-offs is arbitrary and represents a compromise between sensitivity and specificity. As such, we further tested the validity criteria across a wide range of suggested cut-offs in the literature and confirmed that the validity criteria worked well regardless of the cut-offs (Table S5).
 
Furthermore, a few factors are well known to confound LSM. False-positive LSM can occur in patients with high ALT,[25] biliary obstruction,[26] congested liver,[27] and extreme BMI.[28] Patients with some of these characteristics were included in this study, but ALT, bilirubin and BMI did not affect the accuracy of CAP to detect fatty liver (Fig. 2). On the other hand, the accuracy of CAP to detect moderate to severe steatosis was lower in NAFLD patients with BMI ≥30 kg/m2 and F3-4 fibrosis (Table S2). Since fibrotic stage did not influence the CAP values after adjusting for steatosis grade, the apparently poorer accuracy of CAP in patients with advanced fibrosis was probably due to the small number of subjects in this subgroup. In contrast, high BMI was consistently associated with higher CAP values after adjusting for steatosis grade. Previously, the skin capsular distance has also been shown to be associated with increased CAP values.[29] Since BMI and skin capsular distance are both surrogate markers of adiposity, it is difficult to determine the mechanism underlying the association. In any case, while CAP remains robust in the diagnosis of fatty liver across different subgroups, one should exercise caution in using CAP to determine steatosis grading in obese patients.
 
Our study has the strength of a large sample size, the use of histological correlation and the inclusion of patients from different ethnic backgrounds. The consistent findings across subgroups and participating centers add weight to our conclusions. Nonetheless, there are also a few limitations. Firstly, although the histological slides were scored by one experienced pathologist at each center, there was no central pathologist to review all of the cases. However, compared with other histological features, the assessment of hepatic steatosis tends to be robust.30 We also included only satisfactory liver samples to minimize bias. Secondly, the study included patients with different etiologies. However, previous studies did not show a significant impact of specific liver diseases on the CAP value,[10], [11], [12], [13] and our subgroup analysis in the NAFLD cohort yielded consistent results. Thirdly, for historical reasons, CAP was measured using the M probe. Since patients with fatty liver have higher BMI, the applicability of the M probe is lower. To this end, the XL probe has been developed to cater for obese patients.[31], [32], [33] CAP can also be measured by the XL probe now; its performance and validity criteria warrant further studies.34 Finally, NAFLD was overrepresented in this cohort. Although consistent findings were observed across subgroups, one should exercise caution when extrapolating the results to other liver diseases. That said, in clinical practice, the diagnosis of NAFLD is the most important use of CAP, and we have specifically evaluated CAP and the validity criteria in the NAFLD population.
 
Furthermore, while the diagnosis of fatty liver based on ≥5% steatosis by liver biopsy was in accordance with international recommendations, newer magnetic resonance imaging (MRI)-based techniques such as MRI-proton density fat fraction (MRI-PDFF) and proton-magnetic resonance spectroscopy have been shown to be highly accurate and even more sensitive than liver biopsy in detecting changes in liver fat over time.35 In head-to-head comparison, MRI-PDFF has shown higher applicability and accuracy than CAP in the United States and Japan.[24], [36] It would be important to validate the validity criteria of CAP against MRI in future studies. Similarly, the performance of XL probe-based CAP measurement should also be validated against MRI. Quantitative ultrasound has also shown good correlation with MRI-PDFF and deserves further evaluation.37
 
In conclusion, the validity of CAP for the diagnosis of fatty liver is lower if the IQR of CAP is ≥40 dB/m. Traditional factors affecting the performance of LSM have little impact on the validity of CAP. Our findings provide guidance on the interpretation of CAP results in routine clinical practice.
 
Patients and methods
 
Patients

 
This was a cross-sectional study of a prospective cohort of adult patients aged 18 years or above who underwent liver biopsy for the evaluation of chronic liver diseases at two European centers and one Hong Kong center. We excluded patients with fewer than 10 successful CAP/LSM acquisitions and those with liver biopsy specimens shorter than 15 mm. The study protocol was approved by all participating centers. All patients provided informed written consent. Part of this cohort (94 patients from Hong Kong and 124 patients from Bordeaux) was described in previous publications [15], [16] and was included in a recent meta-analysis on the performance of CAP.[17]
 
To determine factors associated with the accuracy of CAP and to test the validity criteria, we divided the patients into the derivation (France) and validation (Italy and Hong Kong) cohorts. The latter allowed validation of the study findings in a multi-ethnic cohort.
 
Since the main application of CAP is to diagnose NAFLD, we further evaluated the performance of CAP and the validity criteria by comparing NAFLD patients with healthy controls, who participated in a population screening project using proton-magnetic resonance spectroscopy and TE.18 The controls had no fatty liver, as evidenced by an intrahepatic triglyceride content of <5% on two occasions, three to five years apart.
 
Clinical assessment
 
Medical history was obtained and physical examination was performed for all patients. Body mass index (BMI) was calculated as body weight (kg) divided by body height (m) squared. The etiology of liver disease was determined by history, viral hepatitis serology (hepatitis B surface antigen and anti-hepatitis C virus antibody), autoimmune markers and liver histology. Liver biochemistry, complete blood count and international normalized ratio were checked less than one week before liver biopsy.
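The BMI definition above can be written as a short function. This is a sketch only: the function names are invented for illustration, and the 30 kg/m2 threshold is taken from the subgroup analyses reported in this article rather than from any device or guideline API.

```python
def body_mass_index(weight_kg, height_m):
    """BMI in kg/m2: weight divided by the square of height."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def in_high_bmi_subgroup(bmi, threshold=30.0):
    """True for the BMI >= 30 kg/m2 subgroup used in the analyses."""
    return bmi >= threshold


# Hypothetical patient: 80 kg, 1.75 m.
example_bmi = body_mass_index(80.0, 1.75)
print(round(example_bmi, 1), in_high_bmi_subgroup(example_bmi))
```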
 
Histological assessment
 
Liver biopsy was performed using 16G trucut needles and the specimens were evaluated by experienced pathologists. Steatosis grade was determined according to the Non-alcoholic Steatohepatitis Clinical Research Network (NASH CRN) scoring system, based on parenchymal involvement by steatosis under low- to medium-power evaluation: S0 = <5%, S1 = 5-33%, S2 = >33-66%, and S3 = >66%.[19] Patients with ≥5% steatosis were considered to have fatty liver. Because it would be inappropriate to use the same fibrosis staging system in patients with different liver diseases, we classified the patients as having no to moderate fibrosis (F0-2 by the METAVIR or NASH CRN systems), bridging fibrosis (F3) and cirrhosis (F4).
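The grading thresholds stated above map directly onto a small lookup. The sketch below uses only the percentages given in the text; the function names are hypothetical.

```python
def steatosis_grade(percent_involved):
    """Map % of parenchyma involved by steatosis to grade S0-S3."""
    if not 0 <= percent_involved <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_involved < 5:
        return "S0"
    if percent_involved <= 33:
        return "S1"
    if percent_involved <= 66:
        return "S2"
    return "S3"


def has_fatty_liver(percent_involved):
    """Primary outcome: steatosis involving >= 5% (grade S1 or above)."""
    return steatosis_grade(percent_involved) != "S0"


for pct in (3, 5, 40, 80):
    print(pct, steatosis_grade(pct), has_fatty_liver(pct))
```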
 
Transient elastography (TE)
 
TE (FibroScan, Echosens, Paris, France) was performed less than one week before liver biopsy, after fasting for at least 6 h. All operators had undergone formal training and performed at least 100 examinations before this study. Measurements were performed using the M probe on the right lobe of the liver, through intercostal spaces, with the patient lying in a dorsal position with the right arm in maximal abduction.20 Only cases with 10 or more successful acquisitions were evaluated. The success rate was calculated as the number of successful measurements divided by the total number of measurements. The operators were blinded to the clinical data and diagnosis.
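As a minimal sketch of the two acquisition rules in this paragraph (at least 10 successful acquisitions, and the success rate as successful over total measurements), with function names invented for illustration:

```python
def success_rate(successful, total):
    """Successful measurements divided by total measurements."""
    if total <= 0 or not 0 <= successful <= total:
        raise ValueError("need 0 <= successful <= total and total > 0")
    return successful / total


def is_evaluable(successful, minimum=10):
    """Only examinations with 10 or more successful acquisitions count."""
    return successful >= minimum


# Hypothetical examination: 10 successful shots out of 12 attempts.
print(round(success_rate(10, 12), 2), is_evaluable(10))
```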
 
Statistical analysis
 
Statistical tests were performed using IBM SPSS Statistics 23 (IBM Corporation, Armonk, NY). Continuous variables were expressed as mean ± standard deviation, or median (interquartile range) if the data were not normally distributed. The primary outcome was fatty liver, defined as hepatic steatosis of 5% or more by histology. Receiver-operating characteristics curves were constructed to assess the accuracy of CAP for detecting fatty liver. This study focused on the detection of fatty liver rather than individual steatosis grades, because the severity of steatosis correlates poorly with liver injury and adverse clinical outcomes,[21], [22] and previous studies showed considerable overlap in CAP values across different steatosis grades.[9], [10], [11], [12], [13] The areas under the receiver-operating characteristics curves (AUROC) were compared using the method of Hanley and McNeil.[23] The correlation between CAP and the potential quality indicators of TE examination was determined using Pearson's correlation coefficient. All statistical tests were two-sided. Significance was taken as p <0.05. With a sample size of 750 patients, the AUROC of CAP could be determined at a standard error of 0.01 to 0.04.
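To give a feel for the sample-size statement, the sketch below evaluates the widely used Hanley and McNeil (1982) approximation for the standard error of an AUROC. Assumptions: the text does not say which Hanley and McNeil variant or which case mix was used for the calculation, so the formula choice and the 500/250 split of patients with and without fatty liver are illustrative only.

```python
from math import sqrt


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate standard error of an AUROC (Hanley & McNeil, 1982)."""
    if not 0 < auc < 1 or n_pos < 1 or n_neg < 1:
        raise ValueError("need 0 < auc < 1 and at least one case per group")
    q1 = auc / (2 - auc)           # two positives both outrank one negative
    q2 = 2 * auc ** 2 / (1 + auc)  # one positive outranks two negatives
    variance = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return sqrt(variance)


# Hypothetical split of 750 patients; AUROC of 0.85 as reported overall.
se = hanley_mcneil_se(0.85, n_pos=500, n_neg=250)
print(round(se, 3))
```

Smaller subgroups or AUROC values closer to 0.5 give larger standard errors under the same formula.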
 
For further details regarding the materials used, please refer to the Supplementary material and the CTAT table.
 
Results
 
From May 2009 to September 2016, 1,036 patients underwent paired TE and liver biopsy. After excluding patients with suboptimal liver biopsy specimens and TE examination, 754 patients were included in the final analysis (340 from the French cohort, 203 from the Italian cohort, and 211 from the Hong Kong cohort) (Table 1, Fig. S1). The excluded patients were older, more likely to have diabetes, hypertension and high BMI, but had lower albumin, alanine aminotransferase (ALT) and platelet count (Table S1). The mean age of the patients included was 52 ± 14 years, and the BMI was 27.2 ± 5.3 kg/m2 (Table S1). One hundred and thirty-six (18%) and 141 (19%) patients had bridging fibrosis and cirrhosis, respectively. The Hong Kong cohort was enriched with patients with NAFLD and diabetes. In contrast, patients from the Italian cohort had less severe hepatic steatosis and fibrosis. Overall, 101 patients had chronic hepatitis B, 154 had chronic hepatitis C, 349 had NAFLD, 37 had autoimmune hepatitis, 49 had cholestatic liver disease, and 64 had other liver diseases. Table S2 shows the clinical characteristics of patients with different liver diseases.
 
Performance of CAP
 
CAP had good overall accuracy in detecting fatty liver, with an AUROC of 0.85 (Fig. S2). The accuracy of CAP was highest in Hong Kong (p = 0.008 vs. France and p = 0.002 vs. Italy), likely reflecting the difference in the proportion of patients with moderate to severe steatosis. Using previously published cut-offs, CAP had a sensitivity of 87.8% at 222 dB/m and a specificity of 90.5% at 290 dB/m for detecting fatty liver (Table 2).
 
Among 435 patients with recorded abdominal ultrasonography results, the AUROC of ultrasonography for diagnosing fatty liver was 0.76 (95% confidence interval [CI] 0.72-0.81). The sensitivity, specificity, positive predictive value and negative predictive value were 65%, 88%, 92% and 54%, respectively. Using ultrasonography as the reference standard, the AUROC of CAP for detecting radiological fatty liver was 0.79 (95% CI 0.75-0.83).
 
Because the main clinical application of CAP is to diagnose NAFLD, we also tested the performance of CAP in patients with biopsy-proven NAFLD, with reference to healthy controls with intrahepatic triglyceride content of <5%, by proton-magnetic resonance spectroscopy (Table S3). As expected, patients with NAFLD had higher BMI values and were more likely to have diabetes and hypertension. They also had higher LSM and CAP values. Overall, the AUROC of CAP for diagnosing NAFLD was 0.88 (95% CI 0.85-0.90). At the cut-off of 222 dB/m, the sensitivity, specificity, positive predictive value and negative predictive value of CAP, for diagnosing NAFLD, were 93.1%, 45.1%, 57.5% and 89.1%, respectively. The corresponding values at the cut-off of 290 dB/m were 73.6%, 89.9%, 85.4% and 81.0%, respectively.
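The four accuracy measures quoted in this section all derive from a 2 x 2 table of test result against reference standard. The following sketch shows the definitions; the counts are hypothetical and are not the study's data, and the function name is invented for illustration.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from a 2 x 2 table."""
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("counts must be non-negative")
    return {
        "sensitivity": tp / (tp + fn),  # diseased correctly detected
        "specificity": tn / (tn + fp),  # non-diseased correctly excluded
        "ppv": tp / (tp + fp),          # positive tests that are correct
        "npv": tn / (tn + fn),          # negative tests that are correct
    }


# Hypothetical counts for CAP at a single cut-off.
metrics = diagnostic_metrics(tp=90, fp=20, fn=10, tn=80)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Unlike sensitivity and specificity, the predictive values depend on the proportion of patients with disease in the sample, which is worth keeping in mind when comparing the biopsy cohort with the NAFLD-versus-controls analysis.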
 
Factors affecting the performance of CAP
 
The performance of CAP was consistent across various subgroups. The AUROC of CAP for detecting fatty liver was similar in patients of different ages and genders (Fig. 1). In addition, traditional factors affecting the performance of LSM, such as BMI, bilirubin and serum alanine aminotransferase (ALT) levels, did not affect the performance of CAP. The performance of CAP was also stable across fibrosis stages. On the other hand, the accuracy of CAP appeared lower in patients with chronic hepatitis C and cholestatic liver disease, though the number of subjects was small in the latter group.
 
In patients with chronic hepatitis B and C, age, gender, BMI, bilirubin, ALT and fibrosis stage did not affect the performance of CAP (Table S4). While the same factors did not affect the diagnosis of NAFLD, using healthy subjects with an intrahepatic triglyceride content of <5% as controls, the accuracy of CAP in detecting grade 2 and 3 steatosis was lower among patients with BMI ≥30 kg/m2 and F3-4 fibrosis (Table S5). At each steatosis grade, patients with BMI ≥30 kg/m2 had higher CAP values than those with BMI <30 kg/m2 (S1: 321 [272-352] vs. 278 [231-302], p = 0.003; S2: 335 [302-361] vs. 320 [296-343], p = 0.060; S3: 350 [326-365] vs. 327 [311-348], p = 0.002). On the other hand, fibrosis stage did not affect CAP values after adjusting for steatosis grade (data not shown).
 
The IQR-to-median ratio of LSM is the most important validity indicator of LSM.14 We therefore evaluated the meaning of the IQR of CAP. In the overall population, CAP had a mild negative correlation with the IQR of CAP (Fig. 2A). As a result, adjustment using the IQR-to-median ratio of CAP potentiated the negative correlation with CAP (Fig. 2B). This contrasted with LSM, in which the IQR of LSM increased with increasing LSM (Fig. 2C). Adjustment using the IQR-to-median ratio of LSM normalized the association with LSM, making it a reasonable quality indicator across the entire range of LSM (Fig. 2D). Thus, the IQR of CAP instead of its IQR-to-median ratio would be a better marker of measurement variability.
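The effect described above, where dividing by the median strengthens a negative association, can be reproduced on a toy data set. The numbers below are synthetic and chosen only so that the IQR has a mild negative correlation with the median; they are not the study's data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Synthetic examinations: median CAP and IQR of CAP, both in dB/m.
medians = [150, 200, 250, 300, 350, 400]
iqrs = [50, 30, 50, 30, 50, 30]
ratios = [iqr / med for iqr, med in zip(iqrs, medians)]

r_iqr = pearson_r(medians, iqrs)
r_ratio = pearson_r(medians, ratios)
print(round(r_iqr, 2), round(r_ratio, 2))
```

Because the median sits in the denominator, the ratio falls as the median rises even when the IQR itself barely changes, so a ratio-based criterion would tend to label high-CAP examinations as more reliable regardless of their actual variability.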
 
In the derivation cohort, the accuracy of CAP was lower in patients with IQR of CAP ≥40 dB/m (AUROC 0.76 vs. 0.86 in those with IQR of CAP <20 dB/m and 0.89 in those 20-39 dB/m; Table 3). In contrast, the success rate of measurements did not affect the accuracy of CAP. Because the IQR-to-median ratio of LSM is used to determine the validity of LSM, we also tested if it could reflect the validity of the entire examination, including CAP measurements. However, a higher IQR-to-median ratio of LSM did not affect the validity of CAP.
 
Using the cut-off of 222 dB/m, the sensitivity, specificity, positive and negative predictive values of CAP for fatty liver detection, in the derivation cohort, were 78.1%, 65.0%, 80.3% and 61.9%, respectively, in patients with an IQR of CAP ≥40 dB/m (Table 4). The corresponding figures increased to 89.5%, 65.2%, 91.0% and 61.2%, respectively, in patients with an IQR of CAP <40 dB/m. Likewise, using the cut-off of 290 dB/m, the corresponding figures were 46.6%, 87.5%, 87.2% and 47.3% in patients with IQR of CAP ≥40 dB/m; and 60.8%, 93.5%, 97.3% and 37.7% in patients with IQR of CAP <40 dB/m, respectively.
 
In the validation cohort, the AUROC of CAP for detecting fatty liver was 0.90 (95% CI 0.86-0.94) in 275 patients with an IQR of CAP of <40 dB/m, and 0.77 (95% CI 0.70-0.85) in 139 patients with an IQR of CAP of ≥40 dB/m (p = 0.004). The sensitivity of CAP values increased from 75.0% in patients with an IQR of CAP ≥40 dB/m, to 94.2% in patients with an IQR of CAP <40 dB/m, at a CAP cut-off of 222 dB/m (Table 4). The positive predictive value also increased from 72.0% to 85.5% at the same cut-off, and from 81.8% to 95.5% at the high cut-off of 290 dB/m.
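A rough way to see how two AUROCs from separate patient groups are compared is a z-test on their difference. The sketch below back-calculates standard errors from the rounded 95% confidence intervals quoted above, which is an assumption made for illustration; the authors' exact computation is not given in the text, so the result is only expected to be of the same order as the reported p value, not identical to it.

```python
from math import erfc, sqrt

Z_95 = 1.959964  # normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper):
    """Approximate standard error from a symmetric 95% CI."""
    return (upper - lower) / (2 * Z_95)


def compare_independent_aurocs(auc_a, se_a, auc_b, se_b):
    """z statistic and two-sided p value for two independent AUROCs."""
    z = (auc_a - auc_b) / sqrt(se_a ** 2 + se_b ** 2)
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Validation cohort figures as quoted in the text (rounded).
se_low_iqr = se_from_ci(0.86, 0.94)   # IQR of CAP < 40 dB/m
se_high_iqr = se_from_ci(0.70, 0.85)  # IQR of CAP >= 40 dB/m
z, p_value = compare_independent_aurocs(0.90, se_low_iqr, 0.77, se_high_iqr)
print(round(z, 2), round(p_value, 4))
```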
 
As shown in Fig. 3, the AUROC of CAP decreased with increasing IQR of CAP, with the AUROC dropping below 0.80 when the IQR of CAP was over 40 dB/m.
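Taken together, the results suggest a simple reading rule for a single examination: apply a CAP cut-off to the median, and treat the result with more caution when the IQR of CAP is 40 dB/m or above. The sketch below is one possible way to encode that rule, not a clinical tool; the function name and return format are invented, and the default cut-off of 222 dB/m is simply one of the two published cut-offs used in this article.

```python
def assess_cap(median_cap, iqr_cap, cutoff=222.0, iqr_limit=40.0):
    """Return (suggests_fatty_liver, lower_validity) for one examination."""
    if median_cap < 0 or iqr_cap < 0:
        raise ValueError("CAP values must be non-negative")
    suggests_fatty_liver = median_cap >= cutoff
    lower_validity = iqr_cap >= iqr_limit  # criterion proposed in this study
    return suggests_fatty_liver, lower_validity


# Hypothetical examinations: (median CAP, IQR of CAP) in dB/m.
for median_cap, iqr_cap in [(310, 25), (310, 55), (200, 30)]:
    print(median_cap, iqr_cap, assess_cap(median_cap, iqr_cap))
```

Passing `cutoff=290.0` gives the higher-specificity reading discussed in the text.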
 
Consistent findings were observed in the individual centers of the validation cohort. In the Italian cohort, the AUROC of CAP for detecting fatty liver was 0.83 (95% CI 0.75-0.90) in patients with an IQR of CAP of <40 dB/m, and 0.75 (95% CI 0.65-0.86) in those with an IQR of CAP of ≥40 dB/m. Likewise, in the Hong Kong cohort, the AUROC of CAP for detecting fatty liver was 0.96 (95% CI 0.92-0.99) in patients with an IQR of CAP of <40 dB/m, and 0.80 (95% CI 0.68-0.91) in those with an IQR of CAP of ≥40 dB/m.
 
The IQR of CAP did not correlate with fibrosis (rho = -0.010, p = 0.79) in the overall population. Among NAFLD patients, the IQR of CAP had weak negative correlations with lobular inflammation (rho = -0.103, p = 0.052; borderline significance) and ballooning (rho = -0.149, p = 0.001).
 
Validation of the validity criteria in NAFLD
 
Similar to the main analysis, the AUROC of CAP for diagnosing NAFLD, with reference to healthy controls with intrahepatic triglyceride content of <5%, was higher among patients with an IQR of CAP <40 dB/m (0.89 [95% CI 0.86-0.92] vs. 0.81 [95% CI 0.75-0.87]). At the CAP cut-off of 222 dB/m, the sensitivity of CAP increased from 84.9% in patients with an IQR of CAP ≥40 dB/m to 95.3% in patients with an IQR of CAP <40 dB/m (Table S6). The positive predictive value increased from 44.3% to 61.9% at the same cut-off, and from 76.5% to 87.2% at the high cut-off of 290 dB/m.
 
Sensitivity analysis
 
The validity criteria applied similarly across subgroups, except in those with small patient numbers (Table 5). Using different CAP cut-offs in the literature, the diagnostic accuracy of CAP was consistently better among patients with an IQR of CAP <40 dB/m (Table S7). Moreover, if all 845 patients with liver biopsy length of ≥10 mm were included, an IQR of CAP <40 dB/m remained predictive of better diagnostic accuracy (AUROC 0.85 [95% CI 0.81-0.88] vs. 0.79 [95% CI 0.73-0.84]).