## Sunday, April 22, 2007

### Maximum likelihood method for missing data

• ML is a model-based estimation procedure.
• Two methods, direct maximum likelihood (also called full information maximum likelihood, FIML) and the expectation-maximization (EM) algorithm, can be used to obtain ML parameter estimates for structural equation models (SEMs) with missing data.
• There are three approaches to ML estimation of model parameters when some data are missing: factoring the likelihood, the EM algorithm, and direct ML.
• EM algorithm: many SEM analysts have used the means and covariance matrix produced by the EM algorithm as input to SEM software. However, this two-step approach is less than ideal, for two reasons. First, the resulting parameter estimates are true ML estimates only when the SEM to be estimated is just-identified (i.e., the model implies no restrictions on the covariance matrix). In the more usual case, when the SEM is overidentified, the resulting estimates are not true ML estimates and are generally less efficient (although the loss of efficiency is likely to be small). Second, the standard errors reported by SEM software using this two-step method will not be consistent estimates of the true standard errors.
• Direct ML: also called “raw” ML (because it requires raw data as input) or “full information” ML, direct ML solves the problems that arise in the two-step EM method. Arbuckle (1996) proposed the use of direct ML for general missing data patterns and implemented the method in the Amos program. Since then, other SEM programs have also introduced direct ML for missing data, including LISREL, Mplus, and Mx. In Amos, the default is to use direct ML whenever the data set has missing data. Direct ML appears to be the best method for handling missing data for most SEM applications.
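
To make the two ideas above concrete, here is a minimal sketch in Python with NumPy. It assumes multivariate normal data with missing values coded as NaN, and it covers only the saturated model (unrestricted means and covariance matrix), which is the first step of the two-step EM approach. The function names `observed_loglik` and `em_step` and the simulated data are illustrative assumptions, not the routines used by Amos, LISREL, Mplus, or Mx. `observed_loglik` is the casewise log-likelihood that direct ML maximizes: each case contributes the normal log-density of whichever variables it has observed.

```python
import numpy as np

def observed_loglik(data, mu, sigma):
    """Casewise (direct ML) log-likelihood under multivariate normality.

    Each row contributes the log-density of its observed variables only.
    NaN marks a missing value.
    """
    total = 0.0
    for row in data:
        obs = ~np.isnan(row)
        k = int(obs.sum())
        if k == 0:
            continue  # a fully missing row carries no information
        diff = row[obs] - mu[obs]
        s_oo = sigma[np.ix_(obs, obs)]
        _, logdet = np.linalg.slogdet(s_oo)
        quad = diff @ np.linalg.solve(s_oo, diff)
        total += -0.5 * (k * np.log(2 * np.pi) + logdet + quad)
    return total

def em_step(data, mu, sigma):
    """One EM iteration for the saturated model (free means and covariances)."""
    n, p = data.shape
    sum_x = np.zeros(p)
    sum_xx = np.zeros((p, p))
    for row in data:
        obs = ~np.isnan(row)
        mis = ~obs
        x = row.copy()
        cond_cov = np.zeros((p, p))
        if mis.any() and obs.any():
            # E-step: regress the missing variables on the observed ones.
            s_oo = sigma[np.ix_(obs, obs)]
            s_mo = sigma[np.ix_(mis, obs)]
            coef = np.linalg.solve(s_oo, s_mo.T).T  # equals s_mo @ inv(s_oo)
            x[mis] = mu[mis] + coef @ (row[obs] - mu[obs])
            cond_cov[np.ix_(mis, mis)] = sigma[np.ix_(mis, mis)] - coef @ s_mo.T
        elif mis.any():
            # Nothing observed: fall back to the current mean and covariance.
            x = mu.copy()
            cond_cov = sigma.copy()
        sum_x += x
        sum_xx += np.outer(x, x) + cond_cov
    # M-step: update the moments from the expected sufficient statistics.
    new_mu = sum_x / n
    new_sigma = sum_xx / n - np.outer(new_mu, new_mu)
    return new_mu, new_sigma

# Simulated example (assumed values): two correlated variables,
# with roughly 30% of the second variable set to missing.
rng = np.random.default_rng(0)
true_sigma = np.array([[1.0, 0.6], [0.6, 1.0]])
data = rng.multivariate_normal([0.0, 1.0], true_sigma, size=200)
data[rng.random(200) < 0.3, 1] = np.nan

# Start from available-case means and variances, then iterate EM.
mu = np.nanmean(data, axis=0)
sigma = np.diag(np.nanvar(data, axis=0))
history = [observed_loglik(data, mu, sigma)]
for _ in range(50):
    mu, sigma = em_step(data, mu, sigma)
    history.append(observed_loglik(data, mu, sigma))
```

Each EM iteration should leave `observed_loglik` no lower than before, so `history` is non-decreasing. In this sketch, a direct ML fit of an overidentified SEM would instead write `mu` and `sigma` as functions of the model parameters and maximize `observed_loglik` over those parameters in a single step, rather than feeding the EM moments to SEM software.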