Learn

+DS provides training modules and learning experiences grounded in generalizable data science content, while partnering with individual units or groups to develop additional specialized content.

Online Learning

Coursera Online Modules

The online +DS modules introduce the basics of data science across multiple important application domains. These modules serve as prerequisites for the in-person learning experiences listed below. Together, the online content supports the in-person "flipped" learning experiences.

Module 1: Simple Introduction to Machine Learning

This module introduces the concepts of machine learning with as little mathematics as possible. It covers basic concepts including logistic regression, a simple but widely employed machine learning (ML) method, and the multilayer perceptron (MLP), a fundamental neural network. The concept of deep learning is also discussed and related to these simpler models.
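As a rough illustration of the logistic regression idea mentioned above (not taken from the course materials), the sketch below turns a weighted sum of input features into a probability. The weights, bias, and input values are hypothetical and chosen only for the example.

```python
import math

def sigmoid(z):
    """Map any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, features):
    """Logistic regression: sigmoid of a weighted sum of the features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

# Hypothetical weights and input, chosen only for illustration.
weights = [0.8, -0.4]
bias = 0.1
p = predict_proba(weights, bias, [1.0, 2.0])
```

A multilayer perceptron can be viewed as stacking several such weighted-sum-plus-nonlinearity units in layers.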

Module 2: Basics of Model Learning

In this module we will discuss the mathematical basis of learning deep networks. We'll first work through how to frame the learning of a deep network as the minimization of a mathematical function. After defining this mathematical goal, we will introduce validation methods for estimating the real-world performance of the learned networks. We will then discuss how gradient descent, a classical optimization technique, can be used to achieve the goal. Finally, we will discuss both why and how stochastic gradient descent is used in practice to learn deep networks.
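To make the contrast between gradient descent and stochastic gradient descent concrete, here is a minimal sketch that fits a single weight by minimizing mean squared error. The toy data, learning rates, and step counts are assumptions for illustration, not values from the course.

```python
import random

# Toy data generated from y = 2x, an assumption made for this illustration.
data = [(x, 2.0 * x) for x in [0.0, 1.0, 2.0, 3.0, 4.0]]

def gradient(w, batch):
    """Gradient of the mean squared error (w*x - y)^2 with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def gradient_descent(steps=200, lr=0.05):
    """Each step uses the gradient computed over the full data set."""
    w = 0.0
    for _ in range(steps):
        w -= lr * gradient(w, data)
    return w

def stochastic_gradient_descent(steps=2000, lr=0.01, seed=0):
    """Each step uses the gradient of one randomly chosen example."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        w -= lr * gradient(w, [rng.choice(data)])
    return w

w_gd = gradient_descent()
w_sgd = stochastic_gradient_descent()
```

Both variants recover a weight close to 2 here. The stochastic version touches only one example per step, which is what makes it practical when the data set is too large to process in full at every update.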

Module 3: Image Analysis with Convolutional Neural Networks (CNNs)

This week will cover model training, as well as transfer learning and fine-tuning. In addition to learning the fundamentals of a CNN and how it is applied, careful discussion is provided on the intuition of the CNN, with the goal of providing a conceptual understanding.

Module 4: Recurrent Neural Networks for Natural Language Processing

This week will cover the application of neural networks to natural language processing (NLP), from simple neural models to the more complex. The fundamental concept of word embeddings is discussed, as well as how such methods are employed within model learning and usage for several NLP applications. A wide range of neural NLP models are also discussed, including recurrent neural networks, and specifically long short-term memory (LSTM) models.

Module 5: The Transformer Network for Natural Language Processing

This week we'll cover an introduction to the Transformer network, a deep machine learning model designed to be more flexible and robust than a recurrent neural network (RNN). We'll start by reviewing several machine learning building blocks of a Transformer network: inner products of word vectors, attention mechanisms, and sequence-to-sequence encoders and decoders. Then, we'll put all of these components together to explore the complete Transformer network.
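The link between inner products of word vectors and attention can be sketched as follows. This is a simplified, single-query version of scaled dot-product attention with no learned projections; the 2-dimensional vectors are hypothetical stand-ins for word embeddings.

```python
import math

def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Weight each value by the softmax of scaled query-key inner products."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Hypothetical 2-dimensional word vectors, for illustration only.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, output = attend(vectors[0], vectors, vectors)
```

Vectors with a larger inner product against the query receive a larger share of the attention weight, so the output is a blend of the values dominated by the most similar words.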

Module 6: Introduction to Reinforcement Learning

This week will cover reinforcement learning, a fundamental concept in machine learning that is concerned with taking suitable actions to maximize rewards in a particular situation. After learning the initial steps of reinforcement learning, we'll move to Q-learning, as well as deep Q-learning. We'll discuss the difference between the concepts of exploration and exploitation and why they are important.
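A minimal sketch of tabular Q-learning with epsilon-greedy action selection is shown below. The five-state corridor environment, reward scheme, and hyperparameters are all assumptions invented for this example; they are not drawn from the course.

```python
import random

N_STATES = 5          # states 0..4; reaching state 4 ends the episode
ACTIONS = [-1, +1]    # move left or move right

def step(state, action):
    """Hypothetical corridor: reward 1 only on arrival at the last state."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(2)                      # explore: random action
            else:
                best = max(q[state])                      # exploit: best known action,
                a = rng.choice([i for i in range(2)       # ties broken at random
                                if q[state][i] == best])
            nxt, reward, done = step(state, ACTIONS[a])
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])  # Q-learning update
            state = nxt
    return q

q_table = train()
# Greedy action in each non-terminal state after training.
greedy_policy = [ACTIONS[max(range(2), key=lambda i: row[i])] for row in q_table[:-1]]
```

The `epsilon` parameter controls the trade-off: exploration tries random actions to discover rewards, while exploitation follows the current best estimate. Deep Q-learning replaces the table `q` with a neural network that estimates the same values.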

In-Person and Virtual Learning Experiences

In addition to the online content, +DS offers in-person and virtual opportunities to dive deeper into the material introduced in the online modules. These learning experiences will be developed to target diverse units at Duke: those that desire a broad understanding of what is possible with data science, those who wish to use data science tools (software) without a deep understanding of the underlying methodology, and those who desire rigorous technical proficiency in the details and methodology of data science.

See past learning experiences

+DS will begin its fall programming in September 2021. Please email plus-datascience@duke.edu if you would like to be added to our announcements listserv.

Upcoming Learning Experiences
Tuesday, November 2 - 4:00pm to 5:00pm
Location: Virtual, Classroom
Instructor: Fabricio Lopes Sanchez

Presented by:
Fabricio Lopes Sanchez, Sr. Cloud Solution Architect at Microsoft
John Brown, Sr. Cloud Solution Architect at Microsoft

Thursday, November 18 - 11:00am to 12:00pm
Location: Virtual, Classroom
Instructor: Mauricio Álvarez

In this session, Prof. Álvarez will define a Gaussian process (GP) model and describe how it is used to tackle (non-linear) regression problems. This includes defining the kernel function, the key function that specifies the Gaussian process.
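As a rough sketch of how a kernel function drives GP regression (not material from the session itself), the example below computes the posterior mean of a zero-mean GP. The squared-exponential kernel, its hyperparameters, the noise level, and the sine-based training data are all assumptions chosen for illustration.

```python
import math

def rbf_kernel(x1, x2, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel: similarity decays with distance."""
    return variance * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def gp_posterior_mean(x_train, y_train, x_test, noise=1e-6):
    """Posterior mean of a zero-mean GP: k_*^T (K + noise * I)^{-1} y."""
    k = [[rbf_kernel(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(x_train)] for i, a in enumerate(x_train)]
    alpha = solve(k, y_train)
    return [sum(rbf_kernel(x, xt) * a for xt, a in zip(x_train, alpha))
            for x in x_test]

# Hypothetical training data sampled from a sine curve.
x_train = [0.0, 1.0, 2.0, 3.0]
y_train = [math.sin(x) for x in x_train]
mean_at_train = gp_posterior_mean(x_train, y_train, x_train)
mean_between = gp_posterior_mean(x_train, y_train, [1.5])[0]
```

With a very small noise term the posterior mean passes almost exactly through the training points. Changing the kernel or its length scale changes how the model interpolates between them, which is why the kernel is described as the function that specifies the GP.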

Friday, November 19 - 12:00pm to 1:00pm
Location: Virtual, Classroom
Instructor: Helge Rhodin

Presented by: Helge Rhodin, PhD; Department of Computer Science; University of British Columbia