Unit of study

BSTA5004: Data Management and Stats Computing (DMC)

2024 unit information

This unit aims to give students the knowledge and skills required to undertake moderate to high level data manipulation and management in preparation for statistical analysis of data typically arising in health and medical research. Specific objectives are for students to: gain experience in data manipulation and management using two major statistical software packages (Stata and R); learn how to display and summarise data using statistical software; become familiar with the checking and cleaning of data; learn how to link files through use of unique and non-unique identifiers; acquire fundamental programming skills for efficient use of software packages; and learn key principles of confidentiality and privacy in data storage, management and analysis.

The topics covered are:

  • Module 1 - Stata and R: the basics (importing and exporting data; recoding data; formatting data; labelling variable names and data values; using dates; data display and summary presentation) and creating programs.
  • Module 2 - Stata and R: graphs, data management and statistical quality assurance methods (including advanced graphics to produce publication-quality graphs).
  • Module 3 - Data management using Stata and R (using functions to generate new variables; appending, merging and transposing longitudinal data; programming skills for efficient and reproducible use of these packages, including loops and arguments).

Unit details and rules

Managing faculty or University school:

Public Health

Code: BSTA5004
Academic unit: Public Health
Credit points: 6
Prerequisites: None
Corequisites: None
Prohibitions: None
Assumed knowledge: None

At the completion of this unit, you should be able to:

  • LO1. undertake data manipulation and management using two major statistical software packages (Stata and R)
  • LO2. appropriately display and summarise data using statistical software
  • LO3. understand how to check and clean data
  • LO4. link data files through unique and non-unique identifiers
  • LO5. apply fundamental programming skills for efficient use of statistical software
  • LO6. understand key principles of confidentiality and privacy in data storage, management, and analysis

Unit availability

This section lists the session, attendance modes and locations the unit is available in. There is a unit outline for each of the unit availabilities, which gives you information about the unit including assessment details and a schedule of weekly activities.

The outline is published 2 weeks before the first day of teaching. You can look at previous outlines for a guide to the details of a unit.

Session | MoA | Location | Outline
Semester 1 2024 | Online | Camperdown/Darlington, Sydney |
Semester 2 2024 | Online | Camperdown/Darlington, Sydney | Outline unavailable
Session | MoA | Location | Outline
Semester 1 2020 | Online | Camperdown/Darlington, Sydney |
Semester 2 Early 2020 | Online | Camperdown/Darlington, Sydney | Outline unavailable
Semester 1 2021 | Online | Camperdown/Darlington, Sydney |
Semester 2 2021 | Online | Camperdown/Darlington, Sydney |
Semester 1 2022 | Online | Camperdown/Darlington, Sydney |
Semester 2 2022 | Online | Camperdown/Darlington, Sydney |
Semester 1 2023 | Online | Camperdown/Darlington, Sydney |
Semester 2 2023 | Online | Camperdown/Darlington, Sydney |

Modes of attendance (MoA)

This refers to the mode of attendance (MoA) for the unit as it appears when you’re selecting your units in Sydney Student. Find more information about modes of attendance on our website.