Matrix Ill-Conditioned – What went wrong?
For those working with general linear models, this warning message may look familiar:
A lot is going on in this message. What it often boils down to is that the units and scale of the data are getting in the way of a good model. I think of this message as occurring when one X variable has a wide range, and a correspondingly large variance, while another X variable has a much smaller scale and range, and therefore a small variance. That mismatch in scale gets in the way of the math behind the scenes and can produce regression models that are unstable.
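To see how scale alone can affect numerical stability, here is a minimal sketch using NumPy. It uses the condition number of the design matrix as a general diagnostic; this is not necessarily the exact check the GLM software performs. The data, the column names `big` and `small`, and the helper `design_matrix` are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical predictors on very different scales (made-up data)
big = rng.uniform(100_000, 5_000_000, size=n)
small = rng.uniform(0.00025, 0.001, size=n)

def design_matrix(*columns):
    """Stack an intercept column of ones with the given predictor columns."""
    return np.column_stack([np.ones(n), *columns])

# Condition number of the raw design matrix versus a rescaled one
cond_raw = np.linalg.cond(design_matrix(big, small))
cond_scaled = np.linalg.cond(design_matrix(big / 10_000, small * 100))

print(f"condition number, raw columns:      {cond_raw:.3e}")
print(f"condition number, rescaled columns: {cond_scaled:.3e}")
```

A larger condition number means small changes in the data can lead to larger changes in the fitted coefficients, so the rescaled matrix should be the better behaved of the two.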
When this message is displayed, my first consideration is a simple rescaling of the data. Say I have a variable on a scale of 100,000 to 5,000,000. Dividing this column by 1,000 or 10,000 may be all I need to avoid the message and get a good, stable regression model. Likewise, a variable measured on a scale of, for example, 0.00025 to 0.001 could be multiplied by 100 to even out the scales. This article should help you get started in creating a variable and using spreadsheet functions.
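The rescaling described above can be sketched in a few lines of Python. The data and the names `big`, `small`, and `variance_ratio` are hypothetical. The ratio of smallest to largest variance is used here only as a rough gauge of how unbalanced the scales are.

```python
import random
import statistics

random.seed(0)
n = 200

# Hypothetical predictors on very different scales (made-up data)
big = [random.uniform(100_000, 5_000_000) for _ in range(n)]
small = [random.uniform(0.00025, 0.001) for _ in range(n)]

def variance_ratio(*columns):
    """Smallest sample variance divided by the largest."""
    variances = [statistics.variance(c) for c in columns]
    return min(variances) / max(variances)

ratio_before = variance_ratio(big, small)

# Rescale: divide the large column, multiply the small one
big_scaled = [x / 10_000 for x in big]
small_scaled = [x * 100 for x in small]
ratio_after = variance_ratio(big_scaled, small_scaled)

print(f"variance ratio before rescaling: {ratio_before:.3e}")
print(f"variance ratio after rescaling:  {ratio_after:.3e}")
```

Because variance scales with the square of a constant multiplier, dividing one column by 10,000 and multiplying the other by 100 moves the ratio by many orders of magnitude. Remember that the fitted coefficients are then expressed in the rescaled units.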
Not sure which variable is the offender? Use descriptive statistics to compute the variance of each of your X predictor variables to find out. The check performed by GLM looks at the ratio of the smallest variance to the largest. Sorting the descriptive statistics output by the variance column quickly shows not only the variables with the highest and lowest variance, but also any others with a similar variance that might need to be scaled.
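Outside the spreadsheet, the same variance comparison could be sketched as follows. The predictor names and values are invented for illustration, and the ratio printed at the end is only a rough stand-in for whatever the GLM check computes internally.

```python
import statistics

# Hypothetical predictor columns (made-up values for illustration)
predictors = {
    "revenue": [120_000, 850_000, 2_400_000, 4_900_000, 3_100_000],
    "rate": [0.00025, 0.0004, 0.0007, 0.001, 0.00055],
    "age": [23, 35, 41, 58, 62],
}

# Sample variance of each predictor, sorted from smallest to largest
variances = sorted(
    ((name, statistics.variance(values)) for name, values in predictors.items()),
    key=lambda item: item[1],
)

for name, var in variances:
    print(f"{name:>8}  {var:.4g}")

# The two ends of the sorted list are the candidates for rescaling
smallest_name, smallest_var = variances[0]
largest_name, largest_var = variances[-1]
ratio = smallest_var / largest_var
print(f"ratio {smallest_name}/{largest_name}: {ratio:.3e}")
```

The first and last rows of the sorted output identify the variables to rescale first.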
Of course, you can go the other route the message suggests: increase the sweep delta. This changes the threshold used by the test that checks for a variance issue in the set of X predictor variables. Making this change does not affect the computations that produce the regression equation. It only affects the threshold for the variance stability check.