## Wednesday, March 15, 2017

If you click Save, then choose the following four options:

Predicted probabilities – This creates a new variable that tells us for each case the predicted probability that the outcome will occur (that fiveem will be achieved) based on the model.

Predicted Group Membership – This new variable estimates the outcome for each participant based on their predicted probability. If the predicted probability is greater than .5, they are predicted to achieve the outcome; if it is less than .5, they are predicted not to achieve it. This .5 cut-point can be changed, but it is sensible to leave it at the default. The predicted classification is useful for comparison with the actual outcome!
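As a rough illustration of how the cut-point turns probabilities into a predicted group, here is a minimal Python sketch. The function name `predicted_group`, the five probabilities and the actual outcomes are all made up for the example, and treating a probability of exactly .5 as "not achieved" is an assumption, since the description above only covers values above and below the cut-point.

```python
def predicted_group(probabilities, cut_point=0.5):
    """Return 1 where the predicted probability exceeds the cut-point, else 0.

    Assumption: a probability exactly equal to the cut-point is classed as 0.
    """
    return [1 if p > cut_point else 0 for p in probabilities]


# Hypothetical predicted probabilities and actual outcomes for five cases.
predicted_prob = [0.91, 0.42, 0.67, 0.08, 0.55]
actual = [1, 0, 0, 0, 1]

predicted = predicted_group(predicted_prob)

# Proportion of cases where the predicted group matches the actual outcome.
accuracy = sum(p == a for p, a in zip(predicted, actual)) / len(actual)
```

Comparing `predicted` with `actual` case by case is the same comparison that the classification table summarises.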

Residuals (standardised) – This provides the residual for each participant, expressed in standard deviations for ease of interpretation. The residual is the difference between the actual outcome (0 or 1) and the predicted probability of that outcome, so it is a useful measure of error.
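To make the idea concrete, the sketch below computes a standardised residual by hand. It assumes the standardisation divides the raw residual by the standard deviation of a 0/1 outcome with probability p, that is sqrt(p(1 - p)), which is the usual Pearson form; the function name and the example cases are hypothetical.

```python
import math


def standardised_residual(actual, prob):
    """Raw residual (actual - prob) scaled by an assumed SD of sqrt(p(1 - p))."""
    return (actual - prob) / math.sqrt(prob * (1.0 - prob))


# Hypothetical (actual outcome, predicted probability) pairs.
cases = [(1, 0.91), (0, 0.42), (0, 0.67), (1, 0.08)]

residuals = [standardised_residual(y, p) for y, p in cases]
```

The last case achieved the outcome despite a predicted probability of only .08, so its residual is large and positive: the kind of case worth a closer look.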

Cook’s – We’ve come across this in our travels before. This generates a statistic called Cook’s distance for each participant which is useful for spotting cases which unduly influence the model (a value greater than ‘1’ usually warrants further investigation).

http://www.restore.ac.uk/srme/www/fac/soc/wie/research-new/srme/modules/mod4/11/

If you click Options, then choose the following:

Classification plots – Checking this option requests a chart which shows the distribution of outcomes over the probability range (classification plot). This is useful for visually identifying where the model makes most incorrect categorizations.

Hosmer-Lemeshow Goodness of fit – This option provides a χ² (chi-square) test of whether or not the model is an adequate fit to the data. The null hypothesis is that the model is a ‘good enough’ fit to the data, and we will only reject this null hypothesis (i.e. decide it is a ‘poor’ fit) if there are sufficiently strong grounds to do so (conventionally if p<.05). We will see that with very large samples, as we have here, there can be problems with this level of significance, but more on that later.
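For intuition about what the test compares, here is a hedged Python sketch of a Hosmer-Lemeshow style statistic. It sorts cases by predicted probability, splits them into groups (ten by default), and compares the observed and expected number of outcomes in each group. The function name, the grouping rule and the toy data are assumptions for illustration and may not match the software's handling of ties exactly. The sketch returns only the statistic and its degrees of freedom; turning that into a p-value needs the chi-square distribution.

```python
def hosmer_lemeshow(actual, probs, groups=10):
    """Return (chi_square, degrees_of_freedom) for a grouped fit comparison."""
    # Sort cases by predicted probability only; ties keep their original order.
    ordered = sorted(zip(probs, actual), key=lambda pair: pair[0])
    n = len(ordered)
    chi_square = 0.0
    used_groups = 0
    for g in range(groups):
        chunk = ordered[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        observed = sum(y for _, y in chunk)   # outcomes that actually occurred
        expected = sum(p for p, _ in chunk)   # outcomes the model expects
        mean_p = expected / size
        if mean_p <= 0.0 or mean_p >= 1.0:
            continue  # skip degenerate groups to avoid dividing by zero
        chi_square += (observed - expected) ** 2 / (size * mean_p * (1.0 - mean_p))
        used_groups += 1
    return chi_square, used_groups - 2


# Toy data: every case predicted at .5, and exactly half achieve the outcome.
well_fitted = hosmer_lemeshow([0, 1] * 10, [0.5] * 20)

# Toy data: every case predicted at .1, yet every case achieves the outcome.
badly_fitted = hosmer_lemeshow([1] * 20, [0.1] * 20)
```

A statistic near zero means observed and expected counts agree in every group; a large statistic relative to its degrees of freedom is what leads to rejecting the ‘good enough’ fit.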

CI for exp(B) – CI stands for confidence interval, and this option requests the range of values within which we are confident that each odds ratio lies. The setting of 95% means that there is only a probability of less than .05 (p < .05) that the value of the odds ratio, exp(B), lies outside the calculated range (you can change the 95% confidence level if you are a control freak!).
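The interval can be reproduced by hand from a coefficient and its standard error. The sketch below assumes the common Wald approach: build the interval on the B scale as B ± 1.96 × SE, then exponentiate both ends. The function name and the values of B and SE are invented for the example.

```python
import math


def odds_ratio_ci(b, se, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for a coefficient b.

    Assumption: a Wald interval, with z = 1.96 for the 95% level.
    """
    lower = math.exp(b - z * se)
    upper = math.exp(b + z * se)
    return math.exp(b), lower, upper


# Hypothetical coefficient B = 0.40 with standard error 0.10.
odds_ratio, lower, upper = odds_ratio_ci(b=0.40, se=0.10)
```

Here the whole interval sits above 1, which is the pattern you would see for a predictor that is significantly associated with higher odds of the outcome at the .05 level.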

Casewise listing of residuals