This unit of study covers the data engineering issues involved in building robust and scalable data processing pipelines. While data engineers may not perform data analysis directly, they must have the technical knowledge and skills to provide data analysts with appropriate data analytics architectures and with reliable, well-formed data that is ready to be analysed. Topics range from data ingestion from various sources, including databases, text files and web services, to data cleaning and transformation approaches, and the system architectures that allow a pipeline to run efficiently and automatically. Special consideration is given to building scalable data analysis solutions using a blend of Big Data processing techniques, including data stream processing and distributed data processing platforms such as Apache Spark.
Code | OCMP5339 |
---|---|
Academic unit | Computer Science |
Credit points | 6 |
Prerequisites | COMP5310 or OCMP5310 |
Corequisites | None |
Prohibitions | COMP5329 or COMP4329 |
Assumed knowledge | Proficiency in programming, especially Python, and in database querying with SQL; basic Unix scripting |
The learning outcomes for this unit will be available two weeks before the first day of teaching.
Unit outlines will be available one week before the first day of teaching for the relevant session.