Machine Learning for your Research


US-RSE Education & Training Seminar Series

US-RSE periodically presents technical talks and tutorials related to Education & Training for RSEs.


The next technical talk of the US-RSE Education & Training Seminar Series will feature Savannah Thais, who will speak about “Machine Learning for your Research.” This event will take place on Thursday, June 22nd, from 2 to 4 PM ET (1-3 PM CT, 12-2 PM MT, 11 AM-1 PM PT).

Abstract: This technical talk will give an overview of modern supervised and unsupervised machine learning (ML) methods. We will discuss the advantages and limitations of each and explore what types of problems each is best suited to address. We will also discuss best practices for using ML methods in scientific and social impact research including appropriate model evaluation/testing and reproducibility and generalizability considerations.

Learning Objectives: Attendees will leave with an understanding of common ML algorithms, the types of data they require, and what types of problems they are best suited for. Attendees will also be presented with a framework for evaluating ML models within the context of their work. If time allows, we will spend time discussing and brainstorming specific project ideas from participants’ individual research.

Intended Audience: This workshop will be most useful for people whose research has (or could have) at least some quantitative elements and who are interested in incorporating ML into their work. It might also be interesting for people not currently involved in such research but curious about how ML can be used in research more generally. A “big picture” concept of what ML entails (namely, selecting an algorithm with a mathematically defined learning goal and then using data examples to adjust that algorithm’s parameters in order to move towards this goal) is very useful but not explicitly required, as we will cover these topics at the beginning of the class. Participants should also have an understanding of what sorts of data exist in their field or project and what kinds of questions they might want to answer with ML.

Attendees are expected to follow the US-RSE Code of Conduct.


Savannah Thais is a Research Scientist at the Columbia University Data Science Institute, where she focuses on ML. She is interested in complex system modeling and in understanding what types of information are measurable or modelable and what impacts designing and performing measurements have on systems and societies. She is the founder and Research Director of Community Insight and Impact, a non-profit organization focused on data-driven community needs assessments for vulnerable populations and effective resource allocation. She was the ML Knowledge Convener for the CMS Experiment from 2020 to 2022, currently serves on the Executive Board of Women in Machine Learning and the Executive Committee of the APS Group on Data Science, and is a Founding Editor of the Springer AI Ethics journal. She received her PhD in Physics from Yale University in 2019 and was a postdoctoral researcher at Princeton University from 2019 to 2022.


You can register on Zoom for this technical talk.