Data Scientist

Personal and professional website of Priyanka Oberoi


Data Scientist - 01/2017-present
Deep Learning Analytics
·  Computer vision, machine translation, style transfer, and related work, mostly in TensorFlow, Caffe, and Keras

Director of Data - 01/2018-present
Friends of Ashwani Jain, Democrat for Montgomery County Council At-Large

Data Scientist - 02/2016-01/2017
Commerce Data Service, US Department of Commerce
·  Developed predictive models to identify businesses with the highest probability of partnering with the Department of Commerce, hierarchical clustering models to build profiles of DOC partners, and photovoltaic energy prediction models
·  Led the data science portfolio at NIST, which involved managing teams, domain research, project scoping, communicating statistical methods and findings, and facilitating data product implementation into existing processes
·  Developed open methodology guidelines around data cleaning and data science methods
·  Taught Data Science and R classes to 100+ employees across DOC 

Pro Bono Data Science
Whitman Walker Health - 07/2016-present
· Worked with WWH to formulate goals for a data product and scope a multi-part data project; conducted domain research, cleaned and merged data from disparate sources, and developed statistical models to 1) cluster clients, 2) understand patterns in Medicare Part D usage, 3) manage caseworker loads, and 4) model the impact of legal services on health outcomes

Sexual Orientation and Gender Identity Interagency (SOGI) Working Group - 07/2016-present
·  Used natural language processing topic modeling (latent Dirichlet allocation) to find latent topics in research on surveying sexual orientation and gender identity, in order to inform survey guidelines and research on SOGI

Data Scientist (Lead Scientist) - 07/2015-02/2016
Data Scientist (Staff Scientist) - 01/2014-06/2015
Booz Allen Hamilton
· Managed a team of 5 data scientists at the Food and Drug Administration to make data collected by the agency more readily available through data visualization and stats dashboards to promote transparency, manage workload and resources, encourage platform adoption and facilitate internal and congressional reporting
· Built trust with clients at the FDA by understanding the drug application process, identifying pain points, developing user stories, and releasing minimum viable products early to gather user feedback
· Prototyped binary classification models to flag applications at risk of missing PDUFA/GDUFA goal dates
· Leveraged core groups of invested FDA users and rapid iterative development cycles to release more than 25 dashboards over 8 months to multiple offices, which promoted platform adoption and retired standalone databases and Excel spreadsheets
· Coded a web app with interactive data visualization for a genome variant storage and analysis platform using R and R Shiny; stood up instances in Amazon Web Services so Hadoop clusters could query the variant database
· Acted as a liaison for Principal Investigators and other intramural researchers at the NIH’s National Heart, Lung, and Blood Institute to meet their IT and application needs
· Visualized patient data in D3.js (JavaScript) and Tableau dashboards from wearable tech to allow healthcare professionals to track patient health post-discharge and flag indicators for high readmission risk

Accreditation Manager - 07/2012-12/2013
National Committee for Quality Assurance

·  Led a cross-functional, multi-department effort to collect data, conduct analyses, and prepare reports on the performance of health plans and clinical measure data for the Centers for Medicare & Medicaid Services (CMS)
·  Collected, managed and conducted statistical analyses on health plan performance datasets to find gaps in performance, identify areas for reviewer improvement, reevaluate standards of care, and prepare reports for CMS
·  Independently developed and maintained SQL databases to track national health plans

SNP Assessment Analyst - 10/2010-06/2012
National Committee for Quality Assurance

· Collected, managed and conducted statistical analyses on health plan performance datasets to find gaps in performance, identify areas for reviewer improvement, reevaluate standards of care, and prepare reports for CMS

Bioinformatics Intern - 11/2010-02/2011
Smithsonian Institution, National Museum of Natural History

· Handled museum holdings, collected project material, and researched taxonomic literature to populate and organize an integrated, taxonomic, specimen, nomenclatural, and image database for Tabanidae
· Developed data processing scripts and Unix shell scripts to parse large data files

SNP Assessment Coordinator - 01/2009-09/2010
National Committee for Quality Assurance

· Coordinated efforts between NCQA and external surveyors, internal reviewers, and national health insurance plans regarding eligibility, product-related inquiries, process updates, and timely completion of reviews


Organizer, Women in Data Science
Booz Allen Hamilton - 05/2015-02/2016

Recruiting Chair, GLOBE (LGBT) Forum
Booz Allen Hamilton - 03/2014-02/2016

Chair of Mentors, Advisors and Peers (MAPs) Program
Women in Bio - 04/2014-02/2015

Vice Chair of Mentors, Advisors and Peers (MAPs) Program
Women in Bio - 11/2012-04/2014

Speaker at Open Data Day DC (03/2016), University of Chicago Data Science Convening (04/2016), Lesbians Who Tech (09/2016)


MS in Biotechnology, concentration in Bioinformatics
Johns Hopkins University
Aug 2013

BA in Biology
Bard College
Dec 2009

Awards & Publications

Article on the USAJOBS Data Science page (Medium article)

Featured in Booz Allen Hamilton's Innovo Magazine - 2015

NCQA Annual Award for Initiative - 2011

Keesing, F., P. Oberoi, R. Vaicekonyte, K. Gowen, L. Henry, S. Mount, P. Johns, and R.S. Ostfeld. 2011. Effects of Garlic Mustard (Alliaria petiolata) on Entomopathogenic Fungi. Ecoscience 18(2): 164-168.