Curriculum

Dual Focus on Theory and Practice

Academic Coursework Sample

Coursework schedule subject to change.

STATS 200AP: Intermediate Probability and Statistical Theory I

COMPSCI 220P: Databases and Data Management

COMPSCI 271P: Introduction to Artificial Intelligence

COMPSCI 201P: Computer Security

STATS 200BP: Intermediate Probability and Statistical Theory II

STATS 210P: Statistical Methods I (Regression Modeling Strategies)

COMPSCI 273P: Machine Learning and Data Mining

STATS 211P: Statistical Methods II (Modeling and Data Visualization)

COMPSCI 260P: Fundamentals of Algorithms with Applications

COMPSCI 274P: Neural Networks and Deep Learning

DATA 298: Curricular Practical Training
(OR)
DATA 299: Individual Study

DS 299P: Summer Non-Internship Course

DS 296P: Capstone I – Professional Writing and Communication for Data Science Careers

DS 297P: Capstone II – Design Project for Data Science

Elective

Course Descriptions

COMPSCI 220P: Databases and Data Management

Introduction to the design of databases and the use of database management systems (DBMS) for managing and utilizing data. Topics include entity-relationship modeling for design, relational data model, relational algebra, relational schema design, and use of SQL (Structured Query Language). The course will also touch on topics such as data wrangling and dataframes for data analysis and new technologies for semi-structured and/or scalable data analysis.

COMPSCI 260P: Fundamentals of Algorithms with Applications

Covers fundamental concepts in the design and analysis of algorithms and is geared toward practical application and implementation. Topics include: greedy algorithms, deterministic and randomized graph algorithms, models of network flow, fundamental algorithmic techniques and NP-completeness.

COMPSCI 273P: Machine Learning and Data Mining

Introduction to principles of machine learning and data mining. Learning algorithms for classification, regression, and clustering. Emphasis is on discriminative classification methods such as decision trees, rules, nearest neighbor, linear models, and naive Bayes.

STATS 200AP: Intermediate Probability and Statistical Theory I

Fundamental probability and distribution theory needed for statistical inference. Topics include axiomatic foundations of probability theory, discrete and continuous distributions, expectation and moment generating functions, multivariate distributions, transformations, sampling distributions, and limit theorems.

STATS 200BP: Intermediate Probability and Statistical Theory II

Fundamental theory and methods for making statistical inference. Topics include principles of data reduction (sufficient, ancillary, and complete statistics), methods of finding point estimators (method of moments, maximum likelihood estimators, Bayes estimators), methods of evaluating estimators (mean squared error, best unbiased estimators, asymptotic evaluations), hypothesis testing, and confidence intervals.

STATS 210P: Statistical Methods I

Statistical methods for analyzing data from multivariable observational studies and experiments. Topics include model selection and model diagnostics for simple and multiple linear regression and generalized linear models.

STATS 211P: Statistical Methods II

Statistical methods for designing experiments, visualizing, and analyzing experimental and observational data using generalized regression models, multivariate analysis, and methods suitable for dependent data.

DS 298P: Curricular Practical Training

Internship in which students work individually at an outside organization to gain experience with the challenges involved in data-related work. Materials fee.

DS 299P: Individual Study

Supervised individual study in data science.

DS 296P: Capstone I: Professional Writing and Communication for Data Science Careers

Written and oral communication for data science careers. Production of a detailed document describing the design, methods, analytic strategy, interpretation, and conclusions of the concurrent capstone design and analysis class, and refinement of the written documents and oral communication skills needed for a successful job search. Co-requisite: DS 297P.

DS 297P: Capstone II: Design Project for Data Science

Complete implementation of a data science analytic strategy for obtaining empirically driven solutions to problems from science and industry. Focuses on problem definition and analysis, data representation, algorithm selection, solution validation, and presentation of results. Co-requisite: DS 296P.

COMPSCI 201P: Computer Security

Introduction to computer security, including systems, technology, and management. Topics include authorization, authentication, data integrity, malware, operating systems security, network security, web security, and basic cryptography.

COMPSCI 222P: Principles of Data Management

Covers fundamental principles underlying data management systems. Understanding and implementation of key techniques including storage management, buffer management, record-oriented file system, access methods, query optimization, and query processing.

COMPSCI 223P: Transaction Processing and Distributed Data Management

Introduction to fundamental principles underlying transaction processing systems, including database consistency, atomicity, concurrency control, database recovery, replication, commit protocols, and fault tolerance. Includes transaction processing in centralized, distributed, parallel, and client-server environments.

COMPSCI 224P: Big Data Management

A technical overview of emerging technologies for large-scale data management. The course will focus on Big Data management frameworks such as Hadoop and Spark. The course will also cover relational and non-relational database technologies, including document (“NoSQL”) databases as well as emerging cloud data management solutions. The underlying storage and security properties of these systems will also be covered.

COMPSCI 261P: Data Structures with Applications

Data structures and their associated management algorithms with analysis and examination of practical applications and implementations.

COMPSCI 271P: Introduction to Artificial Intelligence

The study of theories and computational models for systems that behave and act in an intelligent manner. Covers fundamental subdisciplines of artificial intelligence, including knowledge representation, search, deduction, planning, probabilistic reasoning, natural language parsing and comprehension, knowledge-based systems, and learning.

COMPSCI 274P: Neural Networks and Deep Learning

Introduction to principles of machine learning and neural networks. Architecture design. Feedforward and recurrent networks. Learning models and algorithms. Applications to data analysis and prediction problems in a wide range of areas such as machine vision, natural language processing, biomedicine, and finance.

COMPSCI 275P: Graphical Models and Statistical Learning

Introduction to principles of statistical machine learning with probabilistic graphical models. We will study efficient inference algorithms based on optimization-based variational methods and simulation-based Monte Carlo methods. Several approaches to learning from data will be covered, including conditional models for discriminative learning and Bayesian methods for controlling model complexity. Methods will be motivated by applications including image and video analysis, text and language processing, sensor networks, autonomous robotics, computational biology, and social networks.

STATS 205P: Bayesian Data Analysis

Covers basic Bayesian concepts and methods with emphasis on data analysis. Special emphasis on specification of prior distributions. Development of methods and theory for one and two samples, and binary, Poisson and linear regression.

STATS 240P: Multivariate Statistical Methods

Theory and application of multivariate statistical methods. Topics include statistical inference for the multivariate normal model and its extensions to multiple samples and regression, use of statistical packages for data visualization and dimension reduction, discriminant analysis, cluster analysis, and factor analysis.

STATS 245P: Time Series Analysis

Statistical models for the analysis of time series from the time- and frequency-domain perspectives, with particular emphasis on applications in economics, finance, climatology, engineering, and ecology. Topics include: linear time series models for trends; models for stationary time series, ARMA models; non-stationary time series, ARIMA models; forecasting and Kalman filtering; time series smoothing; seasonal models; ARCH, GARCH, and stochastic volatility models; multivariate time series; vector autoregressive models; spectral analysis; case studies. The statistical software R will be used throughout this course.

STATS 262P: Theory and Practice of Sample Survey

This course covers the basic techniques and statistical methods used in designing surveys and analyzing collected survey data. Topics to be covered include: simple random sampling, ratio and regression estimates, stratified sampling, cluster sampling, sampling with unequal probabilities, multistage sampling, and methods to handle nonresponse.

STATS 270P: Stochastic Processes

Introduction to the theory and application of stochastic processes. Topics include Poisson processes, Markov chains, continuous-time Markov processes, and Brownian motion. Applications include Markov chain Monte Carlo methods and financial modeling (for example, option pricing).

DATA 295P: Special Topics in Data Science

Covers one or more emerging topics in data science. Course content may vary.

DS 299P: Summer Non-Internship Course

The objective of the Non-Internship Course is to support the career development of MDS students who are not engaged in formal internship experiences and help them plan and execute activities to navigate their post-graduation employment search during the summer.