What is Latent Factor Analysis?

Latent Factor Analysis is a statistical method used to uncover hidden, unobservable factors that explain the relationships among a set of observed variables.

It's a technique primarily used to reduce a large number of observed variables to a smaller number of underlying constructs, known as latent variables or factors. Imagine you have many questions on a survey (observed variables); factor analysis helps determine whether these questions are measuring a few core ideas (latent factors).

Understanding the Core Concept

At its heart, latent factor analysis models the structure of relationships among observed variables. It assumes that these variables are correlated because they share variance, and that this shared variance is explained by one or more underlying latent factors.

Observed Variables: These are the variables you can directly measure (e.g., responses to survey questions, test scores, observable behaviors).
Latent Variables (Factors): These are unobservable constructs or dimensions that are inferred from the relationships among the observed variables (e.g., "intelligence," "extroversion," "customer satisfaction").

Latent variables created by factor analysis generally represent "shared" variance, that is, the degree to which variables "move" together. If several observed variables are highly correlated, factor analysis can infer that they are influenced by a common underlying latent factor.
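To make "shared variance" concrete, the sketch below simulates three standardized items that all depend on one latent score and then compares their observed correlations with the values a one-factor model implies (the product of the two loadings). The loadings, sample size, and function names are assumptions chosen for illustration, not values from any real study.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def simulate_one_factor(loadings, n_people=5000, seed=0):
    """Simulate standardized items driven by a single latent factor.

    Each item is loading * factor + unique noise, with the noise scaled
    so that every item has variance 1.
    """
    rng = random.Random(seed)
    items = [[] for _ in loadings]
    for _ in range(n_people):
        factor = rng.gauss(0, 1)  # the unobserved latent score
        for column, loading in zip(items, loadings):
            unique_sd = math.sqrt(1 - loading ** 2)
            column.append(loading * factor + unique_sd * rng.gauss(0, 1))
    return items


# Hypothetical loadings, chosen only for illustration.
loadings = [0.8, 0.7, 0.6]
items = simulate_one_factor(loadings)

for i in range(len(loadings)):
    for j in range(i + 1, len(loadings)):
        observed = pearson(items[i], items[j])
        implied = loadings[i] * loadings[j]
        print(f"items {i + 1},{j + 1}: observed r = {observed:.2f}, implied = {implied:.2f}")
```

The observed correlations should land close to the implied ones, differing only by sampling noise. The items never influence each other directly; they correlate only because each one carries part of the same latent score.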

Key Principle: Correlation is Crucial

A fundamental principle of latent factor analysis is that, under the common factor model, variables that have no correlation cannot result in a latent construct. If observed variables don't show any tendency to vary together, there's no shared variance for a latent factor to explain. The technique relies on the relationships (correlations) between your observed variables to infer the presence and nature of the latent factors.
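A short derivation shows why this holds in the simplest case. It assumes a single factor and standardized variables; the symbols (x_i for an observed variable, f for the factor, \lambda_i for a loading, \varepsilon_i for unique variance) are notation introduced here for illustration.

```latex
% One-factor common factor model for standardized variables x_i
x_i = \lambda_i f + \varepsilon_i, \qquad
\operatorname{Var}(f) = 1, \quad
\operatorname{Cov}(f, \varepsilon_i) = 0, \quad
\operatorname{Cov}(\varepsilon_i, \varepsilon_j) = 0 \ (i \neq j)

% Implied correlation between two different variables
\operatorname{Corr}(x_i, x_j) = \lambda_i \lambda_j \qquad (i \neq j)

% If every observed correlation is zero:
\lambda_i \lambda_j = 0 \ \text{for all } i \neq j
\;\Longrightarrow\; \text{at most one } \lambda_i \neq 0
```

A factor that loads on at most one variable is not shared by anything, so it cannot be told apart from that variable's unique variance. In that situation the data give no evidence of a common factor.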

Why Use Latent Factor Analysis?

Researchers and analysts use this technique for several reasons:

Dimensionality Reduction: Simplifying complex datasets by replacing numerous correlated observed variables with fewer latent factors.
Identifying Underlying Structures: Discovering the hidden dimensions or constructs that explain observed phenomena.
Data Summarization: Creating composite scores for latent factors for use in further analysis.
Scale Development: Validating whether survey questions or test items reliably measure specific theoretical constructs.

Example

Consider a psychologist studying personality. They might give participants a survey with many questions about social behavior, feelings, and activities. Using latent factor analysis, they might find that the responses to questions like "I enjoy parties," "I make friends easily," and "I feel comfortable in groups" are highly correlated. Factor analysis could identify a latent factor, perhaps called "Extroversion," which explains why these observed variables tend to move together.

Observed Variable	Sample Correlation with "Extroversion"
I enjoy parties	High
I make friends easily	High
I prefer quiet evenings	Low (or negative)
I worry about the future	Low

In this simplified example, the first two items correlate highly with the factor while the last two do not. This pattern would suggest that the first two items tap into the latent factor of Extroversion, which explains their shared variance.
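As a rough illustration of how such values could be derived, the sketch below takes made-up correlations among the three extroversion items named in the example and solves for their loadings. It uses the closed-form solution that exists when one factor has exactly three indicators; statistical software typically relies on iterative estimation (for example, maximum likelihood) instead. The correlation values and the function name are hypothetical.

```python
import math


def one_factor_loadings(r12, r13, r23):
    """Loadings for a one-factor model with exactly three indicators.

    Under the model, each correlation equals the product of two loadings
    (r12 = l1 * l2, and so on), which can be solved in closed form.
    Assumes all three correlations are positive.
    """
    l1 = math.sqrt(r12 * r13 / r23)
    l2 = math.sqrt(r12 * r23 / r13)
    l3 = math.sqrt(r13 * r23 / r12)
    return [l1, l2, l3]


# Made-up correlations among three extroversion items, for illustration only.
item_names = [
    "I enjoy parties",
    "I make friends easily",
    "I feel comfortable in groups",
]
loadings = one_factor_loadings(r12=0.56, r13=0.48, r23=0.42)

for name, loading in zip(item_names, loadings):
    communality = loading ** 2  # share of item variance explained by the factor
    print(f"{name}: loading = {loading:.2f}, communality = {communality:.2f}")
```

With these inputs the loadings come out at 0.80, 0.70, and 0.60, so the factor accounts for 64%, 49%, and 36% of each item's variance respectively. The remainder is variance unique to each item.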

In Summary

Latent factor analysis is a method for finding underlying patterns in data by explaining the correlations among observed variables through unobservable latent factors. It hinges on the idea that observed variables that vary together (shared variance) are influenced by common hidden constructs. Notably, latent factors can be derived only when the observed variables are actually correlated.