Contact: prefix@suffix where prefix=ssaria and suffix=cs.jhu.edu
Twitter: Follow @suchisaria
Brief Bio: Broadly, my interests span machine learning, computational statistics, and their applications
to domains where one has to draw inferences from observing a complex, real-world system evolve over time. The emphasis of my research is on Bayesian and
probabilistic graphical modeling approaches for addressing challenges associated with modeling and prediction in real-world temporal systems. In the last seven years, I have been particularly drawn to problems that involve modeling data from sensing platforms and electronic health records, as I think these present a tremendous opportunity for high-impact work.
See my recent article on this topic. Also, this (undeservedly) generous article by the ACM's XRDS Crossroads (the ACM Magazine for Students) highlights some of the work in our lab.
Prior to joining Johns Hopkins, I did my PhD at Stanford with Dr. Daphne Koller. I also spent a year at Harvard University collaborating with Dr. Ken Mandl and Dr. Zak Kohane as
an NSF Computing Innovation Fellow. While in the valley, I also spent time as an early employee at Aster Data Systems, a big data startup acquired by Teradata. I am an advisor to Patient Ping. I'm also an advisor on data quality and analysis to CancerLinQ, a learning health system by the American Society of Clinical Oncology.
I'm originally from Darjeeling, India. I can be bribed with good tea.
PhD applicants: You will likely find my FAQ below useful. Please read before you send a note. Regarding specific areas
of study, we're looking to accept students interested in probabilistic modeling, scalable inference, causal inference and sequential decision making. If you're interested
in a program that allows you to get training in both computer science and statistics, our PhD students have the flexibility to do so. Apply here.
POS: Postdoctoral and Research Scientist Openings: Email me a copy of your CV.
We are especially interested in candidates with experience in large-scale modeling with Bayesian methods, approximate inference, non-parametric methods, or causal inference. We also have
a new position for a candidate interested in the intersection of genetics and machine learning with Dr. Chakravarti and me.
Excited to speak on AI and Healthcare at the White House Frontiers Meeting in the National Track. More here.
Selected to Popular Science's "Brilliant 10". More here and here.
Excited to speak at the CCC, AAAI and White House's Office of Science and Technology Policy (OSTP) workshops on the Future of Artificial Intelligence. I gave a talk at the AI for Social Good meeting held in DC on making "meaningful use" of healthcare data using machine learning. More here.
AI's 10 to Watch. Selected by IEEE Intelligent Systems once every two years to celebrate "young stars" in the field of artificial intelligence (AI). Selected for research on "Reasoning Engine for Individualizing Healthcare" here.
IJCAI's Early Career Spotlight. Invited by IJCAI to the "early career spotlight". Here are the other spotlight presenters.
Science Translational Medicine Cover article for work on early detection of patients at high risk for septic shock using routinely collected EHR data.
Discovery Award. Our work received two (!) of the Hopkins Discovery awards, the first for a new computational framework for large-scale discovery of autoimmune regulators in rheumatic diseases and the second for translating
our models for sepsis. These are highly competitive awards and ours were 2 of the 23 that were selected from a pool of 230 submissions.
National Science Foundation Smart and Connected Health Research Grant award for developing computational models for prediction in complex, chronic conditions. More here and here.
Google Research Award for developing machine learning tools for extracting information from electronic health records. More here.
Annual Scientific Award given to the top submission by the Society of Critical Care for our work on early detection of sepsis (selected from 1000+ submissions).
Gordon and Betty Moore Foundation Research award on building safer ICUs. More here.
National Science Foundation Computing Innovation Fellowship; 17 awarded nationally.
Science Translational Medicine Cover article. More here and here.
American Medical Informatics Association Best Paper Finalist for work on automated annotation of outcomes from electronic health record data.
Uncertainty in Artificial Intelligence Best Student Paper for work on inference for continuous time discrete space models.
[ML] Y. Xu, Y. Xu, S. Saria. A Bayesian Nonparametric Approach for Estimating Individualized Treatment-Response Curves. pdf NEW
[ML] Q. Liu, K. Henry, Y. Xu, S. Saria. Using Causal Inference to Estimate What-if Outcomes for Targeting Treatments. NIPS workshop on "What if" Reasoning, 2016. pdf.
[ML] P. Schulam, S. Saria. Integrative Analysis Using Coupled Latent Variable Models for Individualizing Prognoses. Journal of Machine Learning Research 17 (2016) 1-35. pdf. NEW
[ML] D. Robinson*, S. Saria*. Trading-Off Cost of Deployment Versus Accuracy in Learning Predictive Models. International Joint Conference of Artificial Intelligence (IJCAI), 2016. pdf NEW (*equal contribution)
[ML] P. Schulam, S. Saria. A Framework for Individualizing Predictions of Disease Trajectories by Exploiting Multi-resolution Structure. Neural Information Processing Systems (NIPS), 2015. pdf NEW
[ML] K. Dyagilev, S. Saria. Learning (Predictive) Risk Scores in the Presence of Censoring due to Interventions. Machine Learning, March 2016, Volume 102, Issue 3, pp 323-348. pdf, ArXiv NEW
[ML] P. Schulam, F. Wigley, S. Saria. Clustering Longitudinal Clinical Marker Trajectories from Electronic Health Data: Applications to Phenotyping and Endotype Discovery. American Association for Artificial Intelligence, January 2015.
[ML] S. Saria, A. Duchi, D. Koller. Learning Deformable Motifs in Continuous Time Series data. International Joint Conference on
Artificial Intelligence (IJCAI), 2011. pdf
[ML] S. Saria, D. Koller, A. Penn. Learning individual and population level traits from clinical temporal data. NIPS Predictive Models in Personalized Medicine, 2010.
pdf. (Other versions: short, long)
[ML] S. Saria, U. Nodelman, D. Koller. Reasoning at the Right Time Granularity. Uncertainty in Artificial Intelligence (UAI), July 2007. pdf (Best student paper award)
[ML] V. Jojic, S. Saria, D. Koller. Convex envelopes of complexity controlling penalties: the case against premature envelopment. Artificial Intelligence and Statistics, 2011.
[HI] K. Henry, D. Hager, P. Pronovost, S. Saria. A Targeted Real-time Early Warning Score (TREWScore) for Septic Shock. Science Translational Medicine 2015. Vol. 7, Issue 299. pdf (Cover article) NEW
[HI] S. Saria, A. Goldenberg. Subtyping: What Is It and Its Role in Precision Medicine. IEEE Intelligent Systems, 2015. Vol. 30, Issue 4.
[HI] S. Saria, A. Rajani, J. Gould, D. Koller, A. Penn. Integration of Early Physiological Responses Predicts Later Illness Severity in Preterm Infants. Science Translational Medicine,
September 2010. Vol. 2, Issue 48. Link (Cover article)
[HI] C. Paxton, A. Niculescu-Mizil, S. Saria. Developing Predictive Algorithms Using Electronic Medical Records: Challenges and Pitfalls. American Medical Informatics Association, 2013. pdf
[HI] S. Saria, G. McElvain, A.K. Rajani, A.A. Penn, D.L. Koller. Combining Structured and Free-text Data for Automatic Coding of Patient Outcomes. American Medical Informatics Association, 2010. (Best student paper finalist)
[Perspective] S. Saria. A $3 Trillion Challenge to Computational Scientists: Transforming Healthcare Delivery, August 2014. IEEE Intelligent Systems. Vol. 29, Issue 4. Link (Invited article)
[Perspective] D.W. Bates, S. Saria, L. Ohno-Machado, A. Shah, G. Escobar. Big data in health care: using analytics to identify and manage high-risk and high-cost patients, July 2014. Health Affairs. Vol. 33, Issue 7. Link (Short presentation made to an audience of policy makers at the National Press Club, Washington D.C., here.)
National Science Foundation on our work in modeling complex, chronic diseases such as scleroderma. More here.
TEDxBoston talk on Better Medicine Through Machine Learning. More.
NIPS 2016 Tutorial on ML Methods for Personalization with Application to Medicine. Video to come. More here.
Mu-Hsin Wei (2013-2014; Data Science @ Bloomberg)
Andy Ma (2014-2015; Research Scientist @ HKBU)
Gunnar Atli Sigurdsson (2013-2014; PhD student @ Carnegie Mellon)
Chris Paxton (2012-2013; PhD student @ Johns Hopkins)
Zhou Ye (2013-2014; PhD student @ UCI)
Antonia Oprescu (Summer 2014; Undergraduate @ Harvard)
Phillip Oh (Summer 2014; Undergraduate @ Johns Hopkins)
Riashat Islam (Summer 2014; Undergraduate @ UCL)
Ethan Pronovost (St. Paul's High School)
- Student news: Peter Schulam wins the Centennial Fellowship (August 2013). Miruna Oprescu wins second prize at the JHU Summer Research Expeditions program for her work with my lab on modeling health data (Aug. 2014). Ethan Pronovost selected as one of the finalists at the American Medical Informatics Association HSSP for his work with my lab on measuring harms due to false alarms in the ICU (Oct. 2014). Zach Barnes won second prize at the JHU Summer Research Expeditions for his work on deploying a tool for prognosticating lung fibrosis in scleroderma (Aug. 2015).
- I'm the workshop co-chair for NIPS 2017 with Ralf Herbrich.
- Invited talks at the Big Data in Biomedicine, Advanced Pharma Analytics conference, Gordon Conference in Health Informatics, and the Technology, Biology and Data Science symposium by Cell Press.
- Interview with Katherine Gorman of Talking Machines
- Invited talks at the iBRIGHT Symposium, University of Oxford, Princeton, Wireless Health, Informatics Meeting at Penn, Emory University, University of Washington
- Area chair for ICML 2016, SPC for KDD 2016
- Check out our workshop on Machine Learning in Healthcare @ NIPS
- Spring-Summer '15: Invited talks at Duke, ENAR, University of Wisconsin, Google DeepMind, University of Pennsylvania, Fred Hutchinson (Data Science to Data Sense Symposium), University of Washington, the VA.
- Daniel Robinson and I won an IDIES seed grant for cost-sensitive predictive tools for preventing in-hospital adverse events (Summer 2015).
- Co-editing a special topics issue for the Journal of Machine Learning Research on "Learning from Electronic Health Data".
- Senior Program Committee for KDD 2015, IJCAI 2015.
- I led an invited panel on Predictive Analytics @ the American Medical Informatics Association annual symposium (Nov. 2014)
- I was a Keynote speaker on Big data approaches in Health @ the Big Data + Healthcare Analytics Forum by HIMSS (Nov. 2014)
- I presented work on opportunities for big data approaches to improve healthcare at the National Press Club in D.C. (July 2014), at the inaugural event on Big Data by Health Affairs.
- I was invited to the experts' panel at the Moore Predictive Analytics Symposium (Sept. 2013) to discuss predictive models from EMR and sensing data
- I recently gave an invited talk at the Data Science for Social Good program in Chicago (August 2013)
- I gave an invited panel talk at the National Science Foundation and National Institutes of Health joint meeting on Computing and Health; I spoke with three other invited panelists on the 'Exploiting Data in Abundance' panel (Oct. 2012)
- I gave an invited presentation at the DARPA Defense Science Office workshop on opportunities in healthcare computing (Nov. 2012)
- I gave an invited talk at INFORMS Healthcare in the big data in healthcare session (July 2013). INFORMS is the largest meeting in Operations Research. INFORMS Healthcare is a new meeting focused entirely on healthcare applications. There were ~600 attendees at the meeting in its 2nd year.
- I co-chaired the ICML workshop on Role of Machine Learning in Transforming Healthcare (July 2013)
- I co-chaired the Meaningful Use of Complex Medical Data (MUCMD) Symposium at the Children's Hospital LA (August 2012)
- Other selected invited talks: Google (Oct. 2013), Carnegie Mellon University (Oct. 2013), Institute for Computational and Experimental Research in Mathematics at Brown University (Nov. 2012), Vanderbilt University Grand Rounds in Informatics (2012), University of Maryland Machine Learning Seminar (2012), International Society for Bayesian Analysis (ISBA) (July 2012).
Previous: 600.476/676 Machine Learning in Complex Domains, 600.775 Seminar in Machine Learning and Data-Intensive Computing
Q00. I am an international student and I want to apply to your PhD program. Are you taking students?
We get 5+ emails of this type per week through the fall. As a former international student, I understand the anxiety of being on the other side. First, these emails are not effective unless you've read the faculty member's papers and have something intelligent to say, so otherwise don't waste your time. Second, let me explain a common issue with PhD admissions for international students. Typically, CS programs fund their PhD students for the length of the program (5 years), which makes faculty risk averse. Having sat on PhD admissions committees, I have seen that most faculty find it challenging to assess the background of an international student because they often don't know your school or your advisor. It's also difficult to gauge whether your grades are highly competitive. As a result, most committees pass on international students unless they have an *obviously* strong application. If you're serious about research and getting a PhD, and don't have a strong research background (i.e. published papers in top conferences and strong recommendation letters), apply to the masters program. Very often, we take our strong masters students as research assistants after a semester or two. This gives you a chance to build credibility. You can also often recover the cost of your masters through industry internships, which pay quite well. At a place like Hopkins, there are also many faculty outside of computer science who are looking for strong programmers for a research project. That funding can tide you over until you find a lab. In the long run, it's more fruitful to apply to a strong masters program with the goal of switching to a good PhD program than to go to a PhD program that is a poor fit.
Q0. I'm primarily interested in machine learning. But, I'm unsure of the application area. Do I need to have determined this ahead of time?
No. There are a number of faculty, including me, who work on machine learning problems applicable to multiple domains. Look through ML@JHU. Also, look through application areas at Human Language Center of Excellence, and IDIES.
Q1. I'm a student at Hopkins and I'm interested in working with you. How can I get involved?
Please take a look at my papers. If you still remain interested, please send me an email. It's often also helpful to speak with the students in the research group to get a flavor of the problems you could get involved in.
Q2. I'm not at Hopkins currently. Can I apply to your lab for a PhD?
Yes, we are looking for creative and brilliant students to join us. However, you must formally apply to the PhD program for me to
be able to consider you. It might be helpful to read through this site on how to put together a strong graduate school application. To gain a better
understanding of the types of problems I work on, please read my papers. I plan to put up an active projects page soon, but if you're interested, feel free to send my students or me a note.
Q3. I'm an undergraduate and I am looking for internship opportunities. Can I visit your lab?
Yes, we started a new internship program called the Summer Research Expeditions (SRE) in 2013. The program brings together faculty from multiple departments in engineering and
is a great opportunity to gain exposure to multidisciplinary applications of computing.
Q4. I'm looking for postdoctoral or research scientist positions. Are there positions in your lab?
We are always looking for great people to join our group. There is flexibility in terms of the projects you can get involved with. Please send me a copy of your CV if you'd like to learn more.
Q5. I'm interested in machine learning and your work but I have never worked in medicine/biology/healthcare. Do I need a medical background to work on healthcare projects?
No. In my own work, we've made significant progress by bringing a fresh machine learning perspective to existing problems in healthcare. See my recent article to
get a flavor of the kinds of interesting computational problems that machine learning researchers can help solve in healthcare. You can learn most of what you need to know about the domain through your readings and interactions with your collaborators.
Our healthcare expenses are upwards of 2.5 trillion dollars and we're in desperate need of better approaches for improving outcomes and lowering cost. Our health system produces vast amounts of messy and heterogeneous data that we need smarter
modelers to look at and glean insights from.