Program Codes:
MSDS
Master of Science
Availability: campus
Program Philosophy
As a result of globalization and advances in technology during the 21st century, the complexity and variety of data have evolved, while the volume of data continues to increase daily. This phenomenon has been dubbed “The Data Revolution.” The world is inundated with data stemming from, but not limited to, social media platforms, business transactions, Internet sources, cellular data usage and file sharing. Industry and government organizations collect, organize and analyze data and information for several reasons, from maintaining their competitive edge, to altering business strategies and increasing sales, to enhancing national security.
Data science is one of the most important disciplines of the future, and it will intersect with every area as the reservoir of the world’s data continues to grow. According to a McKinsey Global Institute report, “the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions.”
The Master of Science in Data Science is a 30-credit program. This program will train students as data scientists who will serve as key informants for decision makers in both the public and private sectors. It will serve as a cornerstone in cross-disciplinary learning. The comprehensive, challenging curriculum emphasizes programming, data visualization, machine learning, database skills and quantitative analysis to produce graduates who are innovators in producing, visualizing and communicating actionable new insights about the meaning of data for decision-makers in businesses, public agencies and nonprofits.
The ideal candidate for the Data Science program possesses an inquiring mind, an interest in the world around them, an ability to communicate with others effectively, and quantitative knowledge, skills, and abilities. The individual should be self-motivated and committed to personal and professional development. Individuals from a variety of academic and professional backgrounds are encouraged to apply; however, faculty of the Data Science program may request an interview to determine if the program will meet the applicant’s goals.
In just five years, students can earn both a bachelor’s and a master’s degree at Mercyhurst University through the new 4+1 Data Science program. Undergraduate students in almost any major may apply for the 4+1 program. Interested students must apply by April 1 of their sophomore year or after they have 30 credits completed on their Mercyhurst transcripts. Students in the 4+1 Data Science program take four graduate-level courses (12 credits) during their junior and senior years. These credits are billed at the undergraduate flat rate but count only toward the Master of Science degree, and students must complete at least 121 undergraduate credits to earn a bachelor’s degree. During the fifth year, students complete 18 credits as graduate students, charged at the graduate rate.
The program chair shall conduct an annual review of the academic progress of all students enrolled in the program. Students whose GPA falls below 3.0 or who otherwise exhibit behavior that is not conducive to ensuring employment in this field will be placed on probation or removed from the program, depending on the outcome of the review.
Mission Statement
It is the mission of the data science program at Mercyhurst University to produce graduates, through a variety of delivery modalities, who are skilled in utilizing a variety of sources of data and analytic techniques to lead the collaborative development of high-quality written and oral analytic data products that, in service toward a just world, inform decision-makers, thereby fostering an appreciation for the dignity of work and commitment to serving others.
Graduates of the Master of Science in Data Science will be able to:
This course introduces students to concepts in probability and statistics including sampling distributions, normal theory estimation and hypothesis testing, regression and correlation, exploratory data analysis, logistic regression, discriminant analysis, resampling methods and linear model selection. Learning to do statistical analysis on a personal computer is an integral part of the course.
This course introduces students to fundamental concepts of computer programming. Topics include: algorithms, abstract data types, linear and non-linear data structures, and software engineering. Students will get hands-on experience using a high-level programming language to search and sort data.
This course introduces students to traditional relational databases as well as the newer non-relational databases that have become increasingly common in data science applications. Topics include: conceptual data modeling, physical data modeling, computing on data, designing schemas, querying and manipulating databases, SQL, NoSQL, the needs that drive the development and use of each, and the criteria that decision makers should consider when choosing between relational and non-relational databases.
This course introduces students to various machine learning techniques and tools. Topics include: supervised learning (linear and quadratic discriminant function analysis, logistic regression, kernel and k-nearest neighbor, naive Bayes, support vector machines, tree classification methods, and ensemble methods such as bagging, boosting, and random forests), unsupervised learning (k-means, hierarchical, and model-based clustering), and techniques for evaluating learning algorithms.
A hands-on course in data analysis and visualization based on key design principles and techniques for interactively visualizing data, drawing on the fields of statistics, perception, graphic design, cognition, communication, and data mining. Through lecture, case studies, and design studios, students will work individually and collaboratively to visualize complex datasets using software applications to identify patterns, trends, and variation across categories, space, and time. Students will obtain practical experience with the visualization of complex data including multivariate data, geospatial data, textual data, time series, and network data.
Machine learning algorithms depend heavily upon the method of gradient descent, which is an application of multivariable calculus. In this course, students will be introduced to the concepts in calculus related to gradient descent including functions, limits, derivatives, partial derivatives, and the chain rule. The students will then apply these concepts to problems in regression and classification.
This course will provide students pursuing a graduate degree in data science with essential tools of linear algebra. The course will focus on applications of linear algebra to machine learning. Topics will include systems of linear equations, matrix and vector operations, norms, inverses, determinants, matrix decomposition, tensors and tensor operations, eigenvalues and eigenvectors, and principal component analysis. Additional topics may be included.
Students review a selection of readings pertaining to a particular topic, study, period, and/or geographic region as directed, and prepare written annotations and critical reviews.
Three (3) electives, at least one above the level of 600 and not an internship.
Short Title : Data Structures & Algorithms
Active Term : Spring / Randomly
Course Code : CIS 511
Course Description :
This course reviews data structures, algorithms, and algorithm analysis. The data structures include linear and non-linear structures such as lists, stacks, queues, trees, and graphs. The algorithms include sorting and searching through various data structures. Techniques for analyzing various algorithms to assess their performance will also be studied.
Short Title : Data Wrangling
Active Term : Spring / Randomly
Course Code : CIS 512
Course Description :
This course teaches the hands-on skills needed to acquire, transform, and manipulate real-world data so that it can be analyzed and modeled. Students will learn how to read structured and unstructured information into NumPy and pandas data structures.
Short Title : Big Data Analytics
Active Term : Spring / All
Course Code : CIS 551
Course Description :
This course is an overview of Hadoop, MapReduce, and Hadoop tools. Considerable attention will be given first to Hadoop installation, both on the desktop and in the cloud. Students will learn how to navigate the Hadoop Distributed File System, and they will develop an understanding of the MapReduce algorithm. Practice in writing MapReduce programs will be provided. The second half of the course will be devoted to essential Hadoop tools, including Pig, Hive, Flume, Sqoop, and HBase. Programming experience is a prerequisite, and experience with Java and Unix will be helpful.
Short Title : Artificial Intelligence
Active Term : Spring / Randomly
Course Code : CIS 570
Course Description :
This course explores the topic of intelligent software agents with an emphasis on hands-on design of adaptive problem-solving agents for environments of increasing complexity ranging from single-agent computer games to complex real-world multi-agent environments.
Short Title : Social Media Text Mining
Active Term : Spring / Odd
Course Code : CIS 572
Course Description :
This course provides an introduction to social media mining and methods. The course provides hands-on experience mining social data for social meaning extraction (focus on natural language processing and sentiment analysis) using automated methods and machine learning technologies.
Short Title : Cyber Analytics
Active Term : Spring / Odd
Course Code : CIS 573
Course Description :
This course overviews various techniques in the field of cyber analytics. Cyber analytics is the discovery of meaningful patterns from data to increase the cybersecurity of computer systems and networks. Topics include: an overview of machine learning, an overview of cybersecurity threats, and case studies of machine learning techniques for addressing various cybersecurity threats.
Short Title : Social Media Analytics
Course Code : CIS 574
Course Description :
Social media platforms archive and deliver large quantities of social data, creating new analysis and research opportunities for media researchers. This course introduces data analysis methods such as time series, sentiment, network, and geospatial analysis, which can be applied to the information gathered from various social media platforms. The class will culminate with a project in which students will apply these methods to study a question of their interest. On completion, students will be prepared to investigate future industrial and academic problems.
Short Title : Computational Intelligence
Course Code : CIS 575
Course Description :
This course introduces the concepts and algorithms for developing intelligent systems using computation. It provides exposure to a wide range of topics including neural networks, fuzzy systems, evolutionary algorithms, explainable artificial intelligence, and game AI. Students will have hands-on experience developing and optimizing these models in Python.
Short Title : Adv Data Visualization
Active Term : Randomly / Randomly
Course Code : CIS 581
Course Description :
Students will learn how to create interactive data visualizations for the web. These will include not only basic charts and graphs, but dashboards and more advanced visualizations of hierarchical, network, and geographic data as well.
Short Title : Research Project
Active Term : Spring / All
Course Code : CIS 599
Course Description :
The capstone course experience is designed to allow students to work under the supervision of a computing and information science faculty member to solve a real-world problem and present their findings to the faculty before graduation.