When creating segments from Numeric Questions, it can be useful to standardize (normalize) the variables before running the analysis. For example, if one question is on a 10-point scale and another is on a 5-point scale, the data on the 10-point scale will usually dominate a cluster analysis, all else being equal.
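The dominance of the wider scale can be illustrated with a minimal Python sketch. It is not part of Q; the function name `squared_distance_shares` and the two respondents are hypothetical, and squared Euclidean distance is assumed here only as a typical cluster analysis distance measure. Both respondents sit at opposite ends of each scale, yet the 10-point question accounts for most of the distance between them.

```python
def squared_distance_shares(a, b):
    """Return each variable's share of the squared Euclidean distance between a and b."""
    squared_diffs = [(x - y) ** 2 for x, y in zip(a, b)]
    total = sum(squared_diffs)
    if total == 0:
        # Identical respondents: no distance to apportion.
        return [0.0 for _ in squared_diffs]
    return [d / total for d in squared_diffs]


# Hypothetical respondents: (answer on the 10-point scale, answer on the 5-point scale).
low = (1, 1)    # lowest possible answer on both questions
high = (10, 5)  # highest possible answer on both questions

shares = squared_distance_shares(low, high)
print([round(s, 3) for s in shares])  # [0.835, 0.165]
```

Although the two respondents differ by the full range on both questions, roughly 84% of the squared distance comes from the 10-point question.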
The two main ways of creating segments in Q are:
- Latent Class Analysis (recommended).
- Cluster Analysis.
To follow the steps below, you need the numeric variables that you want to standardize when creating segments.
Standardizing data with latent class analysis
By default, Q's latent class algorithms automatically normalize data between questions. For example, if you have one question with a 5-point scale and another with a 10-point scale, the mathematics of latent class analysis implicitly treats both questions as if they were on the same scale. You can change how much importance a particular question has in the analysis using Question weights.
However, the data is not automatically normalized within a question, so variables with higher standard deviations will, all else being equal, be more influential. You can also get Q to standardize within questions: in Segments, select Advanced and change the Distribution of segments to Multivariate Normal - Diagonal.
Standardizing data with cluster analysis
Unlike latent class analysis, cluster analysis requires the variables themselves to be standardized before the analysis is run.
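One common form of standardization is the z-score, which rescales each variable to a mean of 0 and a standard deviation of 1. The Python sketch below is an illustration only, not Q's own implementation: the function name `standardize`, the sample data, and the choice of the population standard deviation are all assumptions.

```python
from statistics import mean, pstdev


def standardize(values):
    """Convert values to z-scores (mean 0, standard deviation 1)."""
    m = mean(values)
    sd = pstdev(values)  # population standard deviation (an assumption)
    if sd == 0:
        raise ValueError("cannot standardize a variable with no variation")
    return [(v - m) / sd for v in values]


# Hypothetical answers from five respondents to two questions.
ten_point = [2, 4, 6, 8, 10]   # question on a 10-point scale
five_point = [1, 2, 3, 4, 5]   # question on a 5-point scale

z_ten = standardize(ten_point)
z_five = standardize(five_point)

print([round(z, 2) for z in z_ten])   # [-1.41, -0.71, 0.0, 0.71, 1.41]
print([round(z, 2) for z in z_five])  # [-1.41, -0.71, 0.0, 0.71, 1.41]
```

After standardization the two variables are on the same scale, so neither dominates the distance calculations in a cluster analysis purely because of its original range.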