Unit of study

STAT4027: Advanced Statistical Modelling

Applied Statistics fundamentally brings statistical learning to the wider world. Some data sets are complex because of the nature of their responses or predictors, or because they are high-dimensional. These types of data pose theoretical, methodological and computational challenges that require knowledge of advanced modelling techniques, estimation methodologies and model selection skills. In this unit you will investigate contemporary model building, estimation and selection approaches for linear and generalised linear regression models. You will learn about two scenarios in model building: when an extensive search of the model space is possible, and when the dimension is large and either stepwise algorithms or regularisation techniques have to be employed to identify good models. These data analysis skills have been foundational in developing modern ideas about science, medicine, economics and society, and in the development of new technology, and should be in the toolkit of all applied statisticians. This unit will give you a strong foundation in critical thinking about statistical modelling and technology, and the opportunity to apply these methods across a wide range of settings and to prepare for research or further study.

Code: STAT4027
Academic unit: Mathematics and Statistics Academic Operations
Credit points: 6
Prerequisites: (STAT3X12 or STAT3X22 or STAT4022) and (STAT3X13 or STAT3X23 or STAT4023)
Assumed knowledge: A three-year major in statistics or equivalent, including familiarity with material in DATA2X02 and STAT3X22 (applied statistics and linear models) or equivalent

At the completion of this unit, you should be able to:

  • LO1. Apply inference methods to estimate model parameters. These methods include maximum likelihood, expectation maximisation, iteratively reweighted least squares, M-estimation, quasi-likelihood and generalised estimating equations.
  • LO2. Understand generalised linear models and the exponential family, and use them to model counts, binary data and positive-valued data.
  • LO3. Apply different modelling strategies to describe the location of a data distribution, including generalised additive, regime-switching, quantile, mixture and state space models.
  • LO4. Analyse survival data with censoring using the Kaplan-Meier estimator, and perform regression using proportional hazards models with Weibull and piecewise exponential hazards, and Cox's proportional hazards model.
  • LO5. Perform regression for count data, allowing for different levels of dispersion using mixture models and the Poisson, negative binomial and generalised Poisson distributions, and allowing for zero inflation using zero-inflated and hurdle models.
  • LO6. Perform regression for binary data using logit, probit and complementary log-log link functions. Understand the properties of these models and their goodness of fit. Apply Fisher's exact test to 2x2 contingency tables and measure the association between two binary variables.
  • LO7. Perform regression for multinomial data in contingency tables with different experimental designs using log-linear models and two logit structures: multinomial and hierarchical. Explore the relationship with Poisson and binomial regression. Interpret the types of association for different log-linear models. Study the special cases of collapsing tables, decomposable tables, incomplete tables, symmetric (and quasi-symmetric) tables and tables with marginal homogeneity.
  • LO8. Perform regression for ordinal data using an ordered logit link.
  • LO9. Perform beta regression for rate data.

Unit outlines

Unit outlines will be available 2 weeks before the first day of teaching for the relevant session.