# Data Science

Study in the discipline of Data Science is jointly offered by the School of Mathematics and Statistics in the Faculty of Science and the School of Computer Science in the Faculty of Engineering and Information Technologies. Units of study in this major are available at standard and advanced level.

## About the major

Data is an essential asset in many organisations, as it enables informed decision-making in many areas, including market intelligence and science. In the major in Data Science, you will learn computational and analytical skills that stem from statistics and computer science, to manage, interpret, understand, analyse and derive key knowledge from data.

You will develop critical thinking about data and its use, a deep understanding of the core technical skills required and an appreciation for the context in which that data was collected. At the 3000-level of study and beyond, you will develop the ability to understand problems from many disciplines and place a data-driven problem into an analytical framework, solve the problem through computational means, interpret the results and communicate them to clients or collaborators.

## Requirements for completion

A major in Data Science requires 48 credit points, consisting of:

(i) 6 credit points of 1000-level core units

(ii) 6 credit points of 1000-level units according to the following rules*:

(a) 6 credit points of selective units OR

(b) 3 credit points of statistics units and 3 credit points of computation units OR

(c) 3 credit points of advanced statistics units and 3 credit points of mathematics units OR

(d) 3 credit points of advanced statistics units and 3 credit points of linear algebra units for students in the Mathematical Sciences program^

(iii) 12 credit points of 2000-level core units

(iv) 6 credit points of 2000-level selective units

(v) 6 credit points of 3000-level core interdisciplinary project units

(vi) 6 credit points of 3000-level methodology units

(vii) 6 credit points of 3000-level methodology or application or interdisciplinary project selective units

*Students not enrolled in the BSc may substitute ECMT1010 or BUSS1020

^If elective space allows, students may substitute DATA1001/1901 for the advanced statistics unit

A minor in Data Science requires 36 credit points, consisting of:

(i) 6 credit points of 1000-level core units

(ii) 6 credit points of 1000-level units according to the following rules*:

(a) 6 credit points of selective units OR

(b) 3 credit points of statistics units and 3 credit points of computation units OR

(c) 3 credit points of advanced statistics units and 3 credit points of calculus and linear algebra units

(iii) 12 credit points of 2000-level core units

(iv) 6 credit points of 2000-level selective units

(v) 6 credit points of 3000-level methodology units
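The component lists above can be checked against the stated totals with a short tally. This is only an illustrative sketch: the dictionary labels and the function name `total_credit_points` are hypothetical shorthand, not official handbook terms, and the credit-point values are copied from the lists above.

```python
# Credit-point components copied from the requirement lists above.
# The labels are illustrative shorthand, not official handbook terminology.
MAJOR_COMPONENTS = {
    "1000-level core": 6,
    "1000-level units (rules a-d)": 6,
    "2000-level core": 12,
    "2000-level selective": 6,
    "3000-level core interdisciplinary project": 6,
    "3000-level methodology": 6,
    "3000-level methodology/application/project selective": 6,
}

MINOR_COMPONENTS = {
    "1000-level core": 6,
    "1000-level units (rules a-c)": 6,
    "2000-level core": 12,
    "2000-level selective": 6,
    "3000-level methodology": 6,
}


def total_credit_points(components):
    """Sum the credit points across all listed components."""
    return sum(components.values())


major_total = total_credit_points(MAJOR_COMPONENTS)
minor_total = total_credit_points(MINOR_COMPONENTS)

print(f"Major: {major_total} credit points")
print(f"Minor: {minor_total} credit points")
```

The tallies should match the 48 and 36 credit points stated for the major and minor respectively. Table A remains the authoritative source for which units satisfy each component.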

## First year

DATA1001/1901 Foundations of Data Science is a foundational unit in the Data Science major. The unit focuses on developing critical and statistical thinking skills for all students.

DATA1002/1902 Informatics: Data and Computation is a foundational unit in the Data Science major. This unit covers computation and data handling, integrating sophisticated use of existing productivity software, e.g. spreadsheets, with the development of custom software using the general-purpose Python language.

Students are strongly encouraged to take DATA1001/1901 Foundations of Data Science and DATA1002/1902 Informatics: Data and Computation for this major.

However, some selective units can be taken in place of DATA1001. Students can choose from: ENVX1002 Introduction to Statistical Methods, MATH1005 Statistical Thinking with Data, MATH1015 Biostatistics, MATH1115 Interrogating Data, MATH1905 Statistical Thinking with Data (Advanced), MATH1021 Calculus Of One Variable, MATH1921 Calculus Of One Variable (Advanced), MATH1931 Calculus Of One Variable (SSP), MATH1023 Multivariable Calculus and Modelling, MATH1923 Multivariable Calculus and Modelling (Adv), MATH1933 Multivariable Calculus and Modelling (SSP), MATH1002 Linear Algebra, MATH1902 Linear Algebra (Advanced).

Students should refer to Table A for specific 1000-level requirements.

## Second year

DATA2001/2901 – Data Science: Scale and Data Diversity focuses on methods and techniques to efficiently explore and analyse large data collections.

DATA2002/2902 – Data Analytics: Learning from Data focuses on developing data analytic skills for a wide range of problems and data.

Students also complete one unit from a selection: COMP2123 Data Structures and Algorithms, COMP2823 Data Structures and Algorithms (Adv), COSC2002/2902 Computational Modelling, STAT2011/2911 Probability and Estimation Theory, QBUS2830 Actuarial Data Analytics.

## Third year

DATA3888 – Interdisciplinary Data Science Project is the capstone 3000-level unit for the major and includes both the disciplinary and the interdisciplinary project. The main component of the unit is a nine-week project in which candidates apply their skills and knowledge to analyse a real, messy dataset from a knowledge domain outside data science and statistics.

Students will also select 6 credit points from a selection of DATA and STAT units focusing on methodology, and 6 credit points from a selection of methodology or application and discipline-focussed units. Students should refer to Table A for all discipline-focussed units.

Note the following units are available at 3000-level: COMP3308/COMP3608 Introduction to Artificial Intelligence, COMP3027/COMP3927 Algorithm Design.

## Fourth year

The fourth year is only offered within the combined Bachelor of Science/Bachelor of Advanced Studies course.

**Advanced coursework**

The Bachelor of Advanced Studies advanced coursework option consists of 48 credit points, with a minimum of 24 credit points at 4000-level or above. Of these 24 credit points, you must complete a project unit of study worth at least 12 credit points. Advanced coursework will be included in the table for 2020.

**Honours**

Meritorious students in the Bachelor of Science/Bachelor of Advanced Studies may apply for admission to Honours within a subject area of the Bachelor of Advanced Studies. Admission to Honours requires the prior completion of all requirements of the Bachelor of Science, including Open Learning Environment (OLE) units. If you are considering applying for admission to Honours, ensure your degree planning takes into account the completion of a second major and all OLE requirements prior to Honours commencement.

Unit of study requirements for Honours in the area of Data Science: completion of 24 credit points of project work and 24 credit points of coursework. Honours units of study will be available in 2020.

## Contact and further information

W sydney.edu.au/science/schools/school-of-mathematics-and-statistics

All enquiries phone: +61 2 9351 5804 or +61 2 9351 5787

Address:

**School of Mathematics and Statistics**

Level 5, Carslaw Building F07

University of Sydney NSW 2006

Professor Jean Yang

T +61 2 9351 3012

## Learning Outcomes

Students who graduate from Data Science will be able to:

- Exhibit a broad and coherent body of knowledge in data science, describe the relationships between context-specific knowledge and data, and evaluate how these can guide data analytics.
- Exhibit deep knowledge of the underlying concepts and principles of experimental design, analysis and data outputs, of the relationships between these concepts, and of potential pitfalls.
- Use quantitative models or visualisation methods on multiple types of data.
- Manage data, metadata and derived knowledge, using appropriate storage, access and administration tools.
- Communicate concepts and findings in data science through a range of modes for a variety of purposes and audiences, using evidence-based arguments that are robust to critique.
- Identify analytical approaches appropriate to a specific problem in data analysis, simulation-based modelling or equation-based modelling.
- Create and use databases and graphical information systems using programming skills.
- Address authentic problems in data science, working professionally and ethically and with consideration of cross-cultural perspectives, within collaborative, interdisciplinary teams.