We are aiming for an incremental return to campus in accordance with guidelines provided by NSW Health and the Australian Government. Until this time, learning activities and assessments will be planned and scheduled for online delivery where possible, and unit-specific details about face-to-face teaching will be provided on Canvas as the opportunities for face-to-face learning become clear.

Unit of study

DATA1902: Informatics: Data and Computation (Advanced)

This unit covers computation and data handling, integrating sophisticated use of existing productivity software (e.g. spreadsheets) with the development of custom software in the general-purpose Python language. It focuses on skills directly applicable to data-driven decision-making. Students will see examples from many domains and will be able to write code to automate the common processes of data science, such as data ingestion, format conversion, cleaning, summarization, and the creation and application of a predictive model. This unit includes the content of DATA1002, along with additional, more sophisticated topics suited to students with high academic achievement.
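
As a rough illustration of the kind of automation described above, the following sketch ingests a small CSV data set, cleans it, summarises it, and converts it to JSON using only the Python standard library. The sample data, column names, and function names are hypothetical and are not taken from the unit materials.

```python
import csv
import io
import json
import statistics

# Hypothetical sample data; a real task would usually read from a file.
RAW_CSV = """city,temperature
Sydney,22.5
Perth,
Hobart,15.0
Darwin,n/a
Sydney,24.5
"""


def ingest(text):
    """Parse CSV text into a list of dictionaries, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Keep only the rows whose temperature parses as a number."""
    cleaned = []
    for row in rows:
        try:
            value = float(row["temperature"])
        except ValueError:
            continue  # skip missing or malformed readings
        cleaned.append({"city": row["city"], "temperature": value})
    return cleaned


def summarise(rows):
    """Compute a few summary statistics over the cleaned temperatures."""
    temps = [row["temperature"] for row in rows]
    return {
        "count": len(temps),
        "mean": statistics.mean(temps),
        "max": max(temps),
    }


rows = clean(ingest(RAW_CSV))
summary = summarise(rows)
as_json = json.dumps(rows)  # format conversion: CSV rows to a JSON array

print(summary)
print(as_json)
```

Keeping each stage in its own function is one possible design; it lets the cleaning rule or the output format change without touching the other stages.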

Code: DATA1902
Academic unit: Computer Science
Credit points: 6
INFO1903 OR DATA1002
Assumed knowledge:
This unit is intended for students with an ATAR at least sufficient for entry to the BSc/BAdvStudies(Advanced) stream, or for those who gained Distinction results or better in some unit in Data Science, Mathematics, or Computer Science. Students with a portfolio of high-quality relevant prior work can also be admitted.

At the completion of this unit, you should be able to:

  • LO1. automate a computational process, when given a clear account of the algorithm to be applied, by writing Python programs that use the core techniques of procedural programming
  • LO2. demonstrate knowledge of Python syntax and semantics sufficient to trace and understand idiomatic code typical of data science activities, including features such as user-defined functions and exception raising and handling
  • LO3. understand how to automate the computational processes needed for the various activities in the data science pipeline: data ingestion and cleaning, data format conversion, data summarization, visual and tabular presentation of the results of summarization, creation of a predictive model of a given form, application of a predictive model to new data, and evaluation of a predictive model (and also how to automate a pipeline that scripts the use of existing tools for these activities)
  • LO4. understand how both spreadsheets and Python programs can automatically perform the computational processes of data science, and be aware of the similarities and differences between these tools
  • LO5. understand the main issues for data management in connection with data science activities, including the value of data, the importance of metadata, and issues that arise when sharing data across time and users
  • LO6. understand how data sets are represented in computer files, in particular the many-to-many relationship between the physical representation and the logical representation, and the advantages and disadvantages of different representations
  • LO7. understand the principles of charting and information presentation, produce good charts using both Python libraries and spreadsheets, and evaluate charts for effectiveness in communication
  • LO8. use and understand some more sophisticated tools for computation or data handling
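
To make the modelling outcomes above more concrete, here is a minimal sketch of creating, applying, and evaluating a predictive model of a given form (a straight line fitted by ordinary least squares), written with user-defined functions and with exception raising and handling. The data and all function names are hypothetical; the unit itself may use different libraries or model forms.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new x value."""
    slope, intercept = model
    return slope * x + intercept


def mean_absolute_error(model, xs, ys):
    """Evaluate a model as the average absolute prediction error."""
    errors = [abs(predict(model, x) - y) for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


# Hypothetical training and held-out data (training points lie on y = 2x + 1).
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]
test_x = [5.0, 6.0]
test_y = [11.5, 12.5]

model = fit_line(train_x, train_y)                        # create the model
error = mean_absolute_error(model, test_x, test_y)        # evaluate on new data

# Handling the exception raised for unusable input.
try:
    fit_line([1.0], [2.0])
except ValueError as exc:
    failure_message = str(exc)

print(model, error, failure_message)
```

Evaluating on data that was not used for fitting, as done here with the held-out points, gives a more honest picture of how the model behaves on new data than reusing the training set.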

Unit outlines

Unit outlines will be available 2 weeks before the first day of teaching for 1000-level and 5000-level units, or one week before the first day of teaching for all other units.

There are no unit outlines available online for previous years.