Master of Science in Data Science


The University of Guam Master of Science in Data Science program is a comprehensive, cohort-based course of study requiring 30 credit hours. Delivered in a face-to-face format, the curriculum places a strong emphasis on the practical applications of statistical methodology and computational science across diverse domains. It covers a range of topics, including statistical modeling, machine learning, optimization, data management, analysis of large datasets, and data acquisition.

Throughout the program, students will explore reproducible data analysis, practice collaborative problem-solving, and hone their visualization and communication skills. The curriculum also addresses ethical and security issues intrinsic to data science. By the end of the program, students will have developed expertise in applying data science techniques to solve real-world problems across various domains.

OBJECTIVES

The Master of Science in Data Science program objectives are:

  1. To establish the first regional graduate program in Data Science that provides affordable education options to local students. The program will offer lower resident tuition rates than similar off-island programs, along with access to financial aid and support programs. Additionally, the program will leverage existing UOG grants such as U54, EPSCoR, and NASA to support students.
  2. To equip students with the necessary skills to work as data analysts in both academic and industry settings. Graduates can contribute to existing research initiatives at UOG, further enhancing research capabilities at the university.

PROGRAM LEARNING OUTCOMES

Students completing the Master of Science in Data Science Program at UOG will be able to:

  1. Design and execute statistical experiments and hypothesis tests to extract meaningful insights from data.
  2. Analyze and interpret complex statistical data using advanced statistical methodologies and tools.
  3. Visualize data for exploration, analysis, and communication.
  4. Develop and implement predictive models and machine learning algorithms to make data-driven decisions.
  5. Communicate statistical analyses, findings, and recommendations to both technical and non-technical audiences effectively.
  6. Collaborate with interdisciplinary teams to design, implement, and evaluate statistical projects.

ADMISSION REQUIREMENTS

To be eligible to apply to the program, applicants must meet the following minimum requirements:

  1. Earned baccalaureate degree in mathematics, computer science, biology, chemistry, statistics, psychology, or public health from an accredited college or university.
  2. Graduate admission application and application fee.
  3. Official transcripts of all coursework completed.
  4. At least two letters of recommendation.
  5. Current resume.
  6. Minimum cumulative undergraduate grade point average of 3.0.

In addition, undergraduate students must complete the following prerequisites or equivalent before entering the program:

  1. Multivariate Calculus, MA-205
  2. Linear Algebra, MA-341
  3. Statistics course, MA-387+MA-387L, or BI-412+BI-412L

Or Bridge Course (no credit toward degree)

The bridge course will cover calculus, linear algebra and statistics topics necessary for data science courses. The Bridge Course will take place during the UOG summer Session C, preceding the program's start.

Calculus topics and learning outcomes:

Linear algebra learning outcomes:

Statistics topics:

All Data Science classes take place on campus in a face-to-face format, with the exception of MA-500 and MA-505, which are eight-week online courses. All required math courses will be held at 4 p.m. Elective courses may take place in the morning or other times of day.

COURSE REQUIREMENTS (30 credit hours)

Required Courses (17-20 credit hours)

This course covers linear models, including t-tests, ANOVA, regression, and multiple regression. Residual analyses, transformations, goodness of fit, interaction and confounding. Introduction to generalized linear models: mixed, hierarchical and repeated measures. Binary regression, extensions to nominal and ordinal multicategory responses, count data, Poisson and negative binomial regression, log-linear models. Prerequisites: MA-341 and MA-387, BI-412 or BI-507.

This course covers probability spaces; combinatorial analysis; independence and conditional probability; discrete and continuous random variables including binomial, Poisson, exponential and normal distributions; expectations; joint, marginal and conditional distribution functions; moment generating functions; law of large numbers; central limit theorems. Prerequisite: MA-205.

This course covers the theory and practical applications of sampling and statistical inference, including sufficiency, estimation, and testing. Topics include common statistical distributions, sampling, maximum likelihood and moment estimators, unbiased estimators, hypothesis testing, and Bayesian inference. Prerequisites: MA-551 and instructor's consent.

An introduction to multivariate statistical analysis, including multivariate ANOVA, principal component analysis, factor analysis, cluster analysis, discriminant analysis, and possibly structural equation modeling (SEM). Prerequisites: MA-541 and instructor's consent.

This course is designed to teach students the skills and techniques needed to conduct statistical research and provide statistical consulting services. Students will learn how to design studies, collect and analyze data, and communicate results effectively to clients. Through a campus-wide consulting program, students will work with researchers from various disciplines: recommending statistical methodologies appropriate for their research, analyzing client data, and preparing written reports and manuscripts.

This course focuses on the practical applications of machine learning techniques to real-world problems. Students will learn how to apply and evaluate different machine learning algorithms, including linear models, k-means, support vector machines, decision trees, random forests, neural networks, and more. They will also learn how to analyze and manipulate real-world datasets, design learning algorithms, and train and assess machine learning models. Prerequisite: MA-541.

Elective Courses (10-13 credit hours)

Complete at least 16 credit hours

This is a 3-credit course that explores the complex relationships between diet and the major diseases of Western civilization, such as cancer and atherosclerosis. Topics that will be covered include: research strategies in nutritional epidemiology; methods of dietary assessment (using data on food intake, biochemical indicators of diet, and measures of body size and composition); reproducibility and validity of dietary assessment methods; nutrition surveillance; and diet-disease associations. Prerequisites: BI/EV507.

This course focuses on applications of geospatial technologies, including geographic information systems (GIS), remote sensing, and the global positioning system (GPS). It emphasizes applications of geospatial technologies to environmental science and related fields. Topics include geospatial data collection and processing, visualization, analysis, and modeling; geospatial statistical analysis; mobile cloud-based geospatial applications; and integration of geospatial technologies. Students will gain an understanding of Advanced Geospatial Techniques; demonstrate the ability to collect, process, and analyze geospatial data by means of GIS, remote sensing, and GPS; and be able to solve practical problems in environmental science and related fields using geospatial technologies. The course aims to equip students with an understanding of, and experience with, the practical use of geospatial technologies in the natural sciences, particularly environmental science. Prerequisites: Recommended prerequisites for Environmental Science Graduate Program, and fundamentals of GIS or equivalent, or consent of instructor. Undergraduate students may enroll in the course with the permission of the instructor.

The course begins with the basic concepts and methods of management science, which relies on statistical analysis techniques as well as the art of decision-making under constrained optimization. It introduces statistical ideas as they apply to managers. Two ideas dominate: describing data, and modeling variability and randomness using probability models. The course provides tools and data analysis models for decision-making that use hypothesis testing, linear programming, and simulation. It also provides an understanding of the definitions and limitations of a variety of standard econometric measures.

This course introduces students to basic knowledge in programming, data management, and exploratory data analysis using SAS software. Topics covered include data import and export, data cleaning and validation, basic statistical analysis, and data visualization.

This course will help build an understanding of the basic syntax and structure of the R language for statistical analysis and graphics.

This course examines how social, cultural, historic, biologic, economic, environmental, and lifestyle factors contribute to differences in outcomes across the cancer continuum among racial and ethnic minorities and the medically underserved. It provides core grounding in understanding why cancer health disparities exist and persist, and explores approaches for advancing cancer health equity. The course draws heavily from the experience of clinical practitioners, researchers, administrators, public health professionals, and community-based organizations to provide real-world examples of prioritizing cancer health equity in both research and practice.

The master's program offers flexibility by not requiring a thesis. Instead, students can pursue alternative capstone projects or practical experiences aligned with their interests and goals.


SAMPLE SCHEDULE FOR COVERING REQUIREMENTS

Below is a sample schedule of the program across four semesters (two years), offering flexibility in personalizing your educational journey: