Conference Reports for NATAP
  Reported by Jules Levin
IDWeek Oct 26-30
New Orleans 2016
Algorithm Sifting Medical Records May Predict Best PrEP Candidates - Automated Identification of Potential Candidates for HIV Pre-Exposure Prophylaxis using Electronic Health Record Data
  IDWeek 2016, October 26-30, 2016, New Orleans
Mark Mascolini
Automated identification of preexposure prophylaxis (PrEP) candidates may be possible by sorting HIV risk factors that can be retrieved from electronic medical records, according to a 14,000-person analysis of a large Boston healthcare system [1]. The researchers estimated that more than 8000 people in the healthcare system may be unidentified PrEP candidates.
The CDC estimates that 1.2 million US residents may benefit from PrEP, including 1 in 4 sexually active men who have sex with men, 1 in 5 people who inject drugs, and 1 in 200 people with heterosexual HIV risk. But identifying good PrEP candidates can be challenging in a busy medical practice with other pressing concerns.
Researchers at Boston's Beth Israel Deaconess Medical Center, Harvard, and other Boston sites asked whether clinicians can use electronic health records to single out people at high HIV risk who could benefit from PrEP. They aimed to develop an automated algorithm to identify such patients.
The study involved clients of Atrius Health, an 800,000-person, 27-site ambulatory practice in the Boston area. First the researchers extracted four sets of potentially relevant data on people who had become infected with HIV (cases) and people who had not (controls): (1) demographics (age, sex), (2) lab results (gonorrhea tests per year, HCV status), (3) diagnoses (anorectal ulcers, opioid dependence), and (4) prescriptions (bicillin, Suboxone [buprenorphine plus naloxone for opioid addiction]).
Next, the Boston team matched each case to 100 controls on gender and on duration of Atrius Health affiliation as of the year of the case's HIV diagnosis. Matching did not include other demographic criteria such as age or race because they might predict HIV risk.
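The 1:100 matching step can be pictured with a short sketch. This is only an illustration under assumptions: the field names (`gender`, `years_affiliated`, `year`), the exact-match rule, and the random draw are hypothetical, and the report does not say how the investigators handled ties or whether a control could serve more than one case.

```python
import random
from collections import defaultdict

# Hypothetical field names; the report names the matching factors but not the data layout.
MATCH_KEYS = ("gender", "years_affiliated", "year")

def match_controls(cases, pool, n_controls=100, seed=0):
    """Draw up to n_controls HIV-negative controls per case from the same stratum."""
    rng = random.Random(seed)
    # Group candidate controls by their values on the matching factors.
    strata = defaultdict(list)
    for person in pool:
        strata[tuple(person[k] for k in MATCH_KEYS)].append(person)
    matched = {}
    for case in cases:
        candidates = strata.get(tuple(case[k] for k in MATCH_KEYS), [])
        # Sample without replacement within a case; controls may repeat across cases.
        matched[case["id"]] = rng.sample(candidates, min(n_controls, len(candidates)))
    return matched

# Synthetic pool of 1200 HIV-negative patients, for illustration only.
pool = [
    {"id": f"ctrl{i}", "gender": "M" if i % 2 else "F",
     "years_affiliated": 1 + i % 3, "year": 2010}
    for i in range(1200)
]
cases = [{"id": "case1", "gender": "M", "years_affiliated": 2, "year": 2010}]
matched = match_controls(cases, pool)
print(len(matched["case1"]), "controls matched to case1")
```

Age and race are deliberately absent from `MATCH_KEYS`, which mirrors the decision to keep them available as candidate predictors.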
The analysis involved 138 incident HIV cases matched to 13,800 HIV-negative controls. Comparison of potentially predictive variables between cases and controls showed much higher rates in people diagnosed with HIV, for example, an anal cytology procedure code (6.5% versus <0.1%), bicillin prescription in the past year (3.6% versus <0.1%), and ever having a positive gonorrhea test (5.8% versus 0.1%).
Finally, the investigators built models to predict new HIV infection, using both logistic regression and machine learning tools. The machine learning Ridge regression method [2] proved the best predictor, with an area under the receiver operating characteristic curve (which plots the true-positive rate against the false-positive rate) of 0.76.
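An area under the curve can be read as the chance that a randomly chosen case gets a higher prediction score than a randomly chosen control, so a value of 0.76 means that happens about 76% of the time. The sketch below computes that quantity directly on made-up scores; the function name and the numbers are illustrative and do not come from the study.

```python
def auc(case_scores, control_scores):
    """Share of case-control pairs in which the case scores higher (ties count half)."""
    wins = 0.0
    for s_case in case_scores:
        for s_ctrl in control_scores:
            if s_case > s_ctrl:
                wins += 1.0
            elif s_case == s_ctrl:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Made-up prediction scores for three cases and four controls.
toy_cases = [0.9, 0.6, 0.4]
toy_controls = [0.5, 0.3, 0.2, 0.1]
print(round(auc(toy_cases, toy_controls), 3))
```

A model that scores people at random would land near 0.5, and a perfect ranking would give 1.0.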
Most people in the Atrius Health system had low or very low HIV prediction scores. But 8414 Atrius clients (1.1% of 800,000) had a score indicating high risk. Only 249 Atrius clients currently use PrEP (0.03%), so the analysis suggests a substantial number of individuals who may benefit from PrEP do not use it.
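The gap between high-scoring clients and current PrEP users is simple arithmetic, reproduced here as a quick check. The 800,000 denominator is the rounded system size quoted in the report, so the percentages are approximate, and the variable names are chosen for this sketch only.

```python
population = 800_000   # approximate Atrius Health population, as reported
high_risk = 8414       # clients with a high-risk prediction score
on_prep = 249          # clients currently using PrEP

pct_high_risk = 100 * high_risk / population
pct_on_prep = 100 * on_prep / population
ratio = high_risk / on_prep

print(f"High-risk score: {pct_high_risk:.1f}% of clients")
print(f"Using PrEP:      {pct_on_prep:.2f}% of clients")
print(f"High-risk clients per current PrEP user: {ratio:.0f}")
```

The report does not say how many of the 249 PrEP users are also among the 8414 high scorers, so the ratio is a rough gauge of unmet need rather than an exact count.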
The investigators plan to validate the algorithm in Boston's Fenway Health system, which includes 2200 people with HIV and a higher proportion of clients already using PrEP (about 1500 of 30,000, or 5%). Ultimately, the researchers hope to launch a pilot test of the health record-based algorithm.
1. Krakower D, Gruber S, Menchaca JT, et al. Automated identification of potential candidates for HIV pre-exposure prophylaxis using electronic health record data. IDWeek 2016, October 26-30, 2016, New Orleans. Abstract 860.
2. Jain A. A complete tutorial on Ridge and Lasso regression in Python. January 2016. https://www.analyticsvidhya.com/blog/2016/01/complete-tutorial-ridge-lasso-regression-python/


Program abstract:
Background: HIV pre-exposure prophylaxis (PrEP) decreases HIV transmission but uptake of PrEP has been limited. We developed automated algorithms to identify persons at increased risk for acquiring HIV using routine structured electronic health record (EHR) data.
Methods: We extracted potentially relevant demographics (e.g. age, sex, race), diagnoses (e.g. anal dysplasia, substance use disorders), prescriptions (e.g. suboxone, sildenafil), laboratory tests (e.g. syphilis, gonorrhea, chlamydia, hepatitis C), and procedures (e.g. anal pap smear) from the EHR repository of Atrius Health, a large ambulatory practice group in Massachusetts with ~800,000 patients. We matched each patient with incident HIV during 2006-2015 to 100 HIV-uninfected controls with similar gender and duration of affiliation with Atrius Health as of the year of HIV diagnosis. We developed logistic regression models and machine learning algorithms to predict incident HIV infection in cases vs controls. Machine learning methods included LASSO, Ridge, SVM, and super learner. We assessed area-under-the-curve (AUC), sensitivity, and positive predictive value (PPV) of each algorithm and compared prediction scores amongst 45 patients independently started on PrEP by Atrius clinicians versus the general population.
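For readers unfamiliar with Ridge, the idea is ordinary logistic regression with an added penalty on the squared size of the coefficients, which keeps rare predictors from producing extreme weights. The following is a minimal sketch assuming plain gradient descent, two made-up binary predictors, and arbitrary penalty values; the abstract does not report the software, features, or tuning the investigators actually used.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_ridge_logistic(X, y, penalty=1.0, lr=0.1, steps=2000):
    """Minimize mean log-loss + (penalty / 2n) * sum(w_j^2); the intercept is not penalized."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        for j in range(p):
            # The penalty term pulls each weight back toward zero.
            w[j] -= lr * (grad_w[j] + penalty * w[j]) / n
    return w, b

# Made-up data: columns are [ever positive gonorrhea test, anorectal ulcer diagnosis].
X = [[1, 1], [1, 0], [0, 1], [1, 0], [0, 0], [0, 0],
     [0, 0], [0, 1], [0, 0], [0, 0], [1, 0], [0, 0]]
y = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0]  # 1 = incident HIV (case), 0 = control

w_weak, _ = fit_ridge_logistic(X, y, penalty=0.1)
w_strong, _ = fit_ridge_logistic(X, y, penalty=10.0)
print("weak penalty weights:  ", [round(v, 2) for v in w_weak])
print("strong penalty weights:", [round(v, 2) for v in w_strong])
```

Raising `penalty` shrinks the weights toward zero; in practice the penalty strength is usually chosen by cross-validation, which fits the ten-fold cross-validated AUC the abstract reports.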
Results: There were 138 incident HIV infections in the population. Ten-fold cross-validated AUC was highest with the Ridge method (AUC 0.77, sensitivity 59%, PPV 0.34% for patients with risk scores in the top quintile). Strong predictors included a prior diagnosis of anorectal ulcer, total number of positive gonorrhea tests, and acute HIV testing in the past two years. Median HIV prediction scores for patients independently started on PrEP by Atrius clinicians were significantly higher than those of the general population (median 5.43 x 10^-4 vs 9.05 x 10^-5, P<.0001). There were 3,229 patients in the general population (0.37%) with risk scores above the median scores of patients started on PrEP by Atrius clinicians. These are promising potential candidates to evaluate for interest and suitability for PrEP.
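Sensitivity and positive predictive value at a top-quintile cutoff can be worked out as in the sketch below. The scores and labels are invented, and the cutoff rule (flag the highest-scoring fifth, ignoring ties at the boundary) is an assumption, since the abstract does not describe how the quintile was defined.

```python
def top_quintile_metrics(scores, labels):
    """Flag the highest-scoring 20% of patients; return (sensitivity, PPV)."""
    n_flagged = max(1, len(scores) // 5)
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_pos = sum(label for _, label in ranked[:n_flagged])
    sensitivity = true_pos / sum(labels)   # share of all cases that were flagged
    ppv = true_pos / n_flagged             # share of flagged patients who were cases
    return sensitivity, ppv

# Ten invented patients: prediction score and outcome (1 = incident HIV).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
sens, ppv = top_quintile_metrics(toy_scores, toy_labels)
print(f"sensitivity={sens:.2f}, PPV={ppv:.2f}")
```

In the study itself new HIV infection was rare (138 cases in a system of about 800,000 patients), so a low PPV can coexist with a sensitivity of 59%: most people flagged in the top quintile will not acquire HIV even when most cases fall inside it.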
Conclusion: Automated analysis of data routinely stored in EHRs can identify patients at increased risk for incident HIV who are potential candidates for PrEP.
Clinicians cite challenges to identifying persons most likely to benefit from using PrEP.