Please use this identifier to cite or link to this item: https://hdl.handle.net/2440/119934
Type: Thesis
Title: Application of Expectation Maximisation Algorithm on Mixed Distributions
Author: Diassinas, Christopher Luke
Issue Date: 2019
School/Discipline: School of Physical Sciences
Abstract: Mixed distributions are a statistical tool used for modelling a range of phenomena in fields as diverse as marketing, genetics, medicine, artificial intelligence, and finance. A mixture model is capable of describing quite a complex distribution of data, often in situations where a single parametric distribution is unable to provide a satisfactory result. The Expectation Maximisation (EM) algorithm is an iterative maximum likelihood method typically used to estimate parameters in incomplete data problems, such as mixtures. This thesis presents a thorough analysis of mixture modelling and estimation via the EM algorithm for normal, Weibull, exponential, gamma, log-logistic, and uniform component distributions. Full derivations of the relevant EM equations are provided, including censored EM equations for exponential and Weibull component distributions. Goodness-of-fit tests assess how well a hypothesised statistical model fits a set of observations. This thesis considers two goodness-of-fit testing frameworks: the first is formal hypothesis-based testing, and the second is model selection via information criteria. It is shown empirically that, for mixtures, critical values for the Kolmogorov-Smirnov, Kuiper, Cramér-von Mises, and Anderson-Darling goodness-of-fit tests do not exhibit the parameter-independent properties they have for single distributions. Critical values are in fact parameter-dependent, as well as being dependent on sample size, significance level, and truncation level. A comprehensive analysis is also provided of model selection via information criteria, namely the Akaike information criterion and the Bayesian information criterion. Goodness-of-fit testing in this manner was found to be more appropriate for mixture modelling. The work culminates with the application of the previously discussed statistical methodology to an analysis of limit-order inter-arrival times and mid-price waiting times on the London Stock Exchange. It is reasoned that censored mixtures which include a Weibull component most appropriately model this data.
Advisor: Kizilersu, Ayse
Thomas, Anthony W.
Dissertation Note: Thesis (MPhil) -- University of Adelaide, School of Physical Sciences, 2019
Keywords: EM algorithm
mixed distributions
statistics
goodness-of-fit
econophysics
Monte Carlo
Provenance: This electronic version is made publicly available by the University of Adelaide in accordance with its open access policy for student theses. Copyright in this thesis remains with the author. This thesis may incorporate third party material which has been used by the author pursuant to Fair Dealing exceptions. If you are the owner of any included third party copyright material you wish to be removed from this electronic version, please complete the take down form located at: http://www.adelaide.edu.au/legals
Appears in Collections: Research Theses

Files in This Item:
File: Diassinas2019_MPhil.pdf
Description: Thesis
Size: 7.85 MB
Format: Adobe PDF

