
Duke Machine Learning Virtual Summer School 2021

The Duke+Data Science program (+DS) is pleased to announce a virtual offering of the Duke Machine Learning School for summer 2021, which will be held June 14-17.

The 3.5-day curriculum of the Machine Learning Virtual Summer School (MLvSS) is aimed at individuals interested in learning about machine learning, with a focus on recent deep learning methodology. The MLvSS will introduce the mathematics and statistics at the foundation of modern machine learning, and provide context for the methods that have driven rapid growth in artificial intelligence (AI). Additionally, the MLvSS will provide hands-on training in the latest machine learning software, using the widely used (and free) PyTorch framework.

Eight Duke Machine Learning Schools have been presented since 2017, reaching hundreds of participants from academia and industry and including international audiences at the SingHealth/Duke NUS Medical School and the Duke Kunshan University campus.

The 2021 MLvSS will be led by a trio of machine learning experts at Duke University: Professors Ricardo Henao, David Carlson, and Timothy Dunn. Lectures will also be given by other Duke professors and by the founding director of the Duke machine learning schools, Lawrence Carin. Hands-on training with software will be provided by Duke graduate students who have extensive experience with these tools, and teaching assistants from the Duke AI Health Fellowship program will be available for assistance throughout the course.

Register for the MLvSS at

Who Should Attend

The MLvSS is particularly well-suited to members of academia and industry, including students and trainees, who seek a thorough introduction to the methods of machine learning, including interpretation and commentary by respected leaders in the field.

The MLvSS is meant to provide value to students at multiple levels of mathematical sophistication, including those with limited mathematical background. On each day, an initial emphasis will be placed on presenting the concepts as intuitively as possible, with minimal math and technical detail. As the concepts are developed further, more math will be introduced, but only the minimum necessary to explain the concepts. Then, case studies will show how the technology is used in practice, and these discussions should be accessible to most students (concepts emphasized over detailed math). Strength in mathematics and statistics is a significant plus, and will make all MLvSS material more accessible; however, it is not required to benefit from much of the program. Finally, the class will also introduce participants to the software used to make such technology work in practice.


The broad areas of emphasis for the three-and-a-half day class are as follows:

Monday, June 14 (9:00 AM – 4:00 PM Eastern Time):

  • Basic concepts in machine learning
  • Introduction to model building and the multi-layered perceptron (MLP)
  • Scaling to “big data” with stochastic gradient descent
  • Backpropagation as an efficient computation method
  • Case study applications of machine learning to digital health

Tuesday, June 15 (9:00 AM – 4:00 PM):

  • Image analysis with convolutional neural networks (CNNs)
  • Deep convolutional neural networks
  • Image segmentation, object detection, and object localization
  • Case study in image analysis and adversarial techniques with deep neural network (DNN) models

Wednesday, June 16 (9:00 AM – 4:00 PM):

  • Methods for natural language processing
  • Word embeddings
  • Recurrent neural networks
  • Temporal convolutional neural networks
  • Transformer networks
  • Case study of ethical and policy challenges in machine learning, including bias, transparency, and accountability

Thursday, June 17 (half day, 9:00 AM – 12:00 PM):

  • Data synthesis, with an emphasis on images
  • Generative adversarial network (GAN)
  • Deep networks for GAN
  • Learning and applications for GAN

Teaching assistants from the Duke AI Health Fellowship program will be present throughout the program to support the virtual format and will be easily available for assistance and consultation.

Program Format

The 3.5-day class will provide lectures on the mathematics and statistics at the heart of machine learning, hands-on training in implementing machine learning tools with the PyTorch software platform, and case studies of the methods applied to specific application areas.

Each day of the MLvSS will be arranged as follows (Eastern Time):

  • 9:00-10:15am  Lecture 1: Mathematically light introduction to the focus of the day
  • 10:45am-noon  Lecture 2: Mathematically rigorous discussion of the focus of the day
  • 1:00-2:30pm  Software discussion and hands-on training with PyTorch
  • 3:00-4:00pm  Case study of machine learning in practice

At the end of the MLvSS, each student will have a deeper understanding of fundamental machine learning concepts, including context for the rapidly evolving field of artificial intelligence. Students with sufficient mathematical background will also learn the underlying methodology of machine learning. Each student should be able to use PyTorch to implement the latest machine learning methods for analysis of images, video, and natural language (text).

Program Details: Location, Registration and Cost

Students (with a valid ID, at Duke or other universities) will pay a course fee of $50; the fee for non-students is $175, payable through the registration site. All fees are non-refundable. Once we reach maximum registration, we will maintain a waitlist, and will contact those on the waitlist as spots become available. We also have a small number of scholarships available for those who would be otherwise unable to join.

Each participant will receive a personal link for the virtual webinars, which will be held live and provide opportunities for questions and engagement with each lecturer. We strongly encourage live participation, but every participant will also have access to the video recordings to use for their personal reference.

Register for the MLvSS at


David Carlson, PhD

Lawrence Carin, PhD

Jessilyn Dunn, PhD

Timothy Dunn, PhD

Ricardo Henao, PhD

Hai (Helen) Li, PhD

Sarah Rispin Sedlak, JD

with PhD graduate students providing hands-on training with software:

Serge Assaad

Billy Carson

Nikhil Mehta

Relevance and Context

Machine learning is a field characterized by the development of algorithms that are implemented in software and run on a machine (e.g., a computer or mobile device). Each such algorithm is characterized by a set of parameters, and particular parameter settings yield associated algorithm characteristics. The algorithms have the capacity to learn from observed data: that is, the algorithm can infer which parameter settings are best matched to the data of interest. Once the parameters are learned, the associated model ideally captures the underlying characteristics of the data. The algorithm, with learned parameters, may subsequently be applied to new data, with the goal of making predictions or gaining insights. Machine learning methodology is primarily concerned with designing appropriate models/algorithms for datasets and problems of interest, and with learning the model parameters from data (which becomes challenging when the data is of massive scale).
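
As a rough illustration of what "learning parameters from data" can mean, here is a minimal sketch in plain Python. It is not taken from the course materials: the one-feature linear model, the toy data, the learning rate, and the function names `predict` and `fit` are all assumptions chosen for illustration. The model has two parameters, `w` and `b`, which are adjusted by gradient descent until the model matches the observed data, and the learned model is then applied to a new input.

```python
# Illustrative sketch only: a one-feature linear model fit by gradient descent.
# The data, learning rate, and step count are made-up assumptions.

def predict(w, b, x):
    """Apply the model with parameters (w, b) to an input x."""
    return w * x + b

def fit(xs, ys, lr=0.05, steps=2000):
    """Learn parameters (w, b) that minimize mean squared error on the data."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (predict(w, b, x) - y) for x, y in zip(xs, ys)) / n
        # Move each parameter a small step against its gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical observed data, generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit(xs, ys)                    # learn the parameters from the data
new_prediction = predict(w, b, 10.0)  # apply the learned model to new data
```

On this toy data the learned parameters should end up close to the values used to generate it. Deep learning models follow the same pattern with many more parameters, and frameworks such as PyTorch compute the gradients automatically.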

In the context of prediction, one may be interested in developing algorithms that are capable of automatically interpreting data in a healthcare setting, to improve clinical care. In this case, the healthcare data may be radiological images, doctor notes, and/or a history of patient care (e.g., previous diagnoses, medications taken, etc.). In healthcare, the goal is to use machine learning to make improved diagnoses and recommendations for care. Similar concepts are of interest in business, where one may be interested in tailoring advertising and products to individuals. In education, machine learning may be used to tailor educational material to the level and interests of each student. Machine learning is increasingly making an impact in almost all areas of personal and professional life.

Recently, with increasing access to massive datasets and significant advances in computing resources, the quality of machine learning performance (e.g., prediction accuracy) has improved markedly. Further, over the last five years, significant advances have been made in a subfield of machine learning called “deep learning.”

This class will focus on the areas of machine learning that have made the biggest advances in utility over the last several years, including deep learning. The class will concentrate on methods that allow machine-learning algorithms to train effectively on massive datasets, i.e., “big data.” Emphasis will be placed on the latest methods for image and video analysis, natural language processing, representation learning, and data synthesis/modeling.
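
One common way to train on datasets too large to process in a single pass is mini-batch stochastic gradient descent, which appears in the Monday curriculum. The sketch below is a hedged illustration in plain Python rather than the course's own code: the function name `sgd_fit`, the toy data, the batch size, and the learning rate are assumptions. Each update uses only a small random batch of examples, so the cost of a step does not grow with the size of the dataset.

```python
import random

def sgd_fit(xs, ys, lr=0.1, batch_size=8, epochs=300, seed=0):
    """Fit y = w*x + b with mini-batch stochastic gradient descent.

    Illustrative sketch: all hyperparameters are arbitrary assumptions.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the examples in a new random order
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Estimate the gradient of the squared error from this batch only.
            grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

# Hypothetical noiseless data generated from y = 3x - 2.
xs = [i / 100 for i in range(100)]
ys = [3 * x - 2 for x in xs]

w, b = sgd_fit(xs, ys)
```

In PyTorch the same idea is typically expressed with a data loader that yields batches and an optimizer that applies the update, but the underlying loop has this shape.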

The 2021 MLvSS is presented by the Duke+Data Science (+DS) program, which is one of the partner programs supporting the mission of the Duke Center for Computational Thinking (CCT).

If you have any questions, please send an e-mail to