Prediction of HIV-1 coreceptor usage based on structural descriptors of the gp120 V3 loop
Reported by Jules Levin
5th European HIV Drug Resistance Workshop
Cascais, Portugal
March 29, 2007
O. Sander1, T. Sing1, I. Sommer1, A.J. Low2, P.K. Cheung2, P.R. Harrigan2, T. Lengauer1, F.S. Domingues1
1Max-Planck-Institute for Informatics, Computational Biology and Applied Algorithmics, Saarbruecken, Germany, 2British Columbia Centre for Excellence in HIV/AIDS, Vancouver, Canada
Background: HIV cell entry requires, in addition to the cell surface receptor CD4, one of the chemokine receptors CCR5 or CXCR4 as a coreceptor. Monitoring coreceptor usage is important because it is related to disease progression towards AIDS and is relevant for therapeutic decisions regarding coreceptor inhibitors. Established prediction methods such as the 11/25 charge rule, and newer methods based on statistical learning techniques or position-specific scoring matrices (PSSMs), predict coreceptor tropism from the V3 loop region of the viral envelope protein gp120 without requiring expensive phenotype testing. However, all predictive methods in current use rely on sequence information only and neglect structural information. The structural basis of coreceptor specificity is still unclear.
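For orientation, the 11/25 charge rule mentioned above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the commonly stated form of the rule (a variant is called CXCR4-using if position 11 or 25 of the V3 loop carries arginine or lysine) and assumes the input is an ungapped V3 sequence whose numbering matches the reference, as for the usual 35-residue loop. The function name and the example sequence are hypothetical.

```python
POSITIVE_RESIDUES = {"R", "K"}  # assumption: only Arg and Lys count as positively charged


def predict_tropism_11_25(v3_sequence):
    """Apply the 11/25 charge rule to an ungapped V3 loop sequence.

    Returns "X4" if position 11 or 25 (1-based) is R or K, otherwise "R5".
    """
    if len(v3_sequence) < 25:
        raise ValueError("V3 sequence is too short to contain position 25")
    residue_11 = v3_sequence[10]  # 1-based position 11
    residue_25 = v3_sequence[24]  # 1-based position 25
    if residue_11 in POSITIVE_RESIDUES or residue_25 in POSITIVE_RESIDUES:
        return "X4"
    return "R5"


# Illustrative 35-residue V3 loop with S at position 11 and E at position 25.
example_v3 = "CTRPNNNTRKSIHIGPGRAFYTTGEIIGDIRQAHC"
print(predict_tropism_11_25(example_v3))
```

Sequences with insertions or deletions would need to be aligned to the reference numbering first, which this sketch does not attempt.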
Material & Methods: Recently, a crystal structure of gp120 including its V3 loop was solved by Huang et al. We use this V3 loop structure as a template to model the V3 loop of viral variants. The backbone conformation is kept rigid, and the SCWRL method is used to predict side-chain conformations. Each model of a viral variant is then represented by a structural descriptor built from pairwise distance distributions between functional atoms in the loop (donor, acceptor, ambivalent donor/acceptor, aliphatic, aromatic). This vectorial representation captures the spatial arrangement of physico-chemical properties in the V3 loop and allows statistical learning methods such as SVMs and random forests to be applied to discriminate between CCR5- and CXCR4-using variants. The method is evaluated by 10 replicates of 10-fold cross-validation on a data set of 432 non-identical V3 sequences from clonal samples, 97 of which are X4 variants.
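One way such a descriptor could be computed is sketched below. The abstract does not specify the binning, normalization, or atom typing details, so everything here is an assumption made for illustration: atoms arrive already labeled with one of the five functional types, distances are counted in fixed half-open bins, and raw counts are concatenated over all unordered type pairs. The names distance_descriptor and ATOM_TYPES are hypothetical.

```python
import itertools
import math

# The five functional atom types named in the abstract.
ATOM_TYPES = ["donor", "acceptor", "ambivalent", "aliphatic", "aromatic"]


def distance_descriptor(atoms, bin_edges):
    """Build a fixed-length vector of pairwise distance histograms.

    atoms: list of (atom_type, (x, y, z)) tuples for one modeled V3 loop.
    bin_edges: increasing distance cutoffs; bin k covers
               [bin_edges[k], bin_edges[k + 1]) (an assumed convention).
    Returns one histogram per unordered type pair, concatenated.
    """
    rank = {atom_type: i for i, atom_type in enumerate(ATOM_TYPES)}
    pair_keys = list(itertools.combinations_with_replacement(ATOM_TYPES, 2))
    n_bins = len(bin_edges) - 1
    counts = {key: [0] * n_bins for key in pair_keys}

    for (type_a, xyz_a), (type_b, xyz_b) in itertools.combinations(atoms, 2):
        # Order the two types canonically so (a, b) and (b, a) share a histogram.
        key = tuple(sorted((type_a, type_b), key=rank.__getitem__))
        distance = math.dist(xyz_a, xyz_b)
        for k in range(n_bins):
            if bin_edges[k] <= distance < bin_edges[k + 1]:
                counts[key][k] += 1
                break  # distances outside all bins are ignored

    return [count for key in pair_keys for count in counts[key]]


# Toy example with three atoms and three distance bins.
toy_atoms = [
    ("donor", (0.0, 0.0, 0.0)),
    ("acceptor", (3.0, 0.0, 0.0)),
    ("donor", (0.0, 4.0, 0.0)),
]
vector = distance_descriptor(toy_atoms, bin_edges=[0.0, 2.0, 4.0, 6.0])
print(len(vector))
```

With five atom types there are 15 unordered type pairs, so the vector has 15 times the number of bins entries regardless of sequence, which is what makes it usable as input to an SVM or random forest.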
Results: The structural descriptor significantly improved prediction of coreceptor usage compared to a linear support vector machine trained on sequence data. At a specificity of 0.95, a sensitivity of 0.77 was achieved, improving further to 0.80 when the descriptor was combined with a sequence-based representation using amino acid indicators. This compares favorably to the sensitivity of 0.73 for purely sequence-based prediction. The increase in performance is observed consistently across other performance measures, such as accuracy, area under the ROC curve (AUC), and positive predictive value (PPV).
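Reporting sensitivity at a fixed specificity means choosing the decision threshold from the negative (R5) class and then counting how many positives (X4) are recovered. The sketch below shows one simple way to do this; it is not the authors' evaluation code, the function name is hypothetical, and it assumes higher scores indicate X4 and labels are 1 for X4 and 0 for R5.

```python
def sensitivity_at_specificity(scores, labels, target_specificity=0.95):
    """Return (sensitivity, threshold) at the lowest threshold that reaches
    the target specificity. A sample is called X4 if its score > threshold.
    """
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    positives = [s for s, y in zip(scores, labels) if y == 1]
    if not negatives or not positives:
        raise ValueError("both classes must be present")

    # Specificity rises and sensitivity falls as the threshold increases,
    # so the first threshold that meets the target gives the best sensitivity.
    for threshold in [float("-inf")] + sorted(set(scores)):
        specificity = sum(s <= threshold for s in negatives) / len(negatives)
        if specificity >= target_specificity:
            sensitivity = sum(s > threshold for s in positives) / len(positives)
            return sensitivity, threshold


# Toy scores: four R5 samples (label 0) and three X4 samples (label 1).
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.35, 0.8, 0.9]
toy_labels = [0, 0, 0, 0, 1, 1, 1]
print(sensitivity_at_specificity(toy_scores, toy_labels, target_specificity=0.75))
```

In a cross-validation setting the scores would come from held-out folds, so that the threshold is not tuned on data the classifier was trained on.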
Conclusion: Using statistical importance measures, structural features relevant for coreceptor usage can be mapped onto the structure, allowing for visual and quantitative interpretation. Future developments will address relaxation of the backbone rigidity and improved modeling of V3 variants with indels.