Unit of study

DATA2902: Data Analytics: Learning from Data (Adv)

Technological advances in science, business, and engineering have given rise to a proliferation of data from all aspects of our lives. Understanding the information presented in these data is critical, as it enables informed decision making in many areas, including market intelligence and science. DATA2902 is an intermediate unit in statistics and data sciences, focusing on learning advanced data analytic skills for a wide range of problems and data. In this unit, you will learn how to ingest, combine and summarise data from a variety of data models typically encountered in data science projects, and you will reinforce your programming skills through experience with a statistical programming language. You will also be exposed to the concept of statistical machine learning and develop the skills to analyse various types of data in order to answer a scientific question. From this unit, you will develop knowledge and skills that will enable you to embrace data analytic challenges stemming from everyday problems.

Code DATA2902
Academic unit Mathematics and Statistics Academic Operations
Credit points 6
Prerequisites:
6 cp of DATA1901 or STAT2911 or (MATH1905 and MATH1XXX) or a mark of 65 or above in (DATA1001 or ENVX1001 or ENVX1002 or BUSS1020 or ECMT1010 or STAT1021 or STAT2011) or an average mark of 65 or above in (MATH10X5 and MATH1115)
Prohibitions:
STAT2012 or STAT2912 or DATA2002
Assumed knowledge:
Basic linear algebra and some coding, for example (MATH1014 or MATH1002 or MATH1902) and (DATA1001 or DATA1901)

At the completion of this unit, you should be able to:

  • LO1. formulate domain/context specific questions and deduce appropriate statistical analysis
  • LO2. extract and combine data from multiple data resources
  • LO3. construct, analyse and evaluate numerical and graphical summaries of different data types including large and/or complex data sets
  • LO4. demonstrate expertise in the use of a software version control system
  • LO5. identify, justify and implement appropriate parametric or non-parametric statistical tests
  • LO6. formulate, evaluate and interpret appropriate linear models to describe the relationships between multiple factors
  • LO7. demonstrate statistical machine learning using a given classifier and design a cross-validation scheme to calculate the prediction accuracy
  • LO8. create a reproducible report to communicate outcomes using a programming language.

Unit outlines

Unit outlines will be available 2 weeks before the first day of teaching for the relevant session.