Overview
PCA (Principal Component Analysis) allows for a better understanding of the multi-dimensional relationships among samples and attributes. The visual representation allows for an easier interpretation of these relationships.
The PCA is generated by analyzing the correlation structure of a group of multivariate observations and identifying the axis along which the maximum variability in the data occurs (Factor 1). Each subsequent factor is uncorrelated with the previous ones and captures the greatest amount of the remaining variability.
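As a rough illustration of this idea, the following Python sketch runs a PCA on a small table of sample means. The data values, the variable names, and the use of NumPy are assumptions made for this example only; this is not the code the PCA workbook uses internally.

```python
import numpy as np

# Hypothetical mean scores: one row per sample, one column per attribute
# (4 samples x 6 attributes, which meets the minimum requirements).
means = np.array([
    [6.1, 2.3, 4.8, 7.0, 3.2, 5.5],
    [4.2, 5.1, 6.9, 5.6, 4.8, 3.1],
    [7.3, 1.8, 5.5, 3.9, 2.5, 6.2],
    [3.5, 6.0, 3.1, 4.4, 5.9, 4.6],
])

# Correlation structure between the attributes (6 x 6 matrix).
corr = np.corrcoef(means, rowvar=False)

# Eigen-decomposition of the symmetric correlation matrix.
eigenvalues, eigenvectors = np.linalg.eigh(corr)

# eigh returns eigenvalues in ascending order; put the largest first
# so that index 0 corresponds to Factor 1.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Share of the total variability captured by each factor.
explained = eigenvalues / eigenvalues.sum()
for number, share in enumerate(explained[:3], start=1):
    print(f"Factor {number}: {share:.1%} of total variability")
```

Factor 1 always has the largest share, and each later factor accounts for part of what remains.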
Requirements
For a valid PCA, the following minimum requirements must be met:
-
Samples: 3+
-
Attributes: 6+
This workflow focuses on the typical test setup that uses only Category questions, only Line Scale questions, or another scale question type that is not mixed with others. For information about PCA on CATA data, see the bottom of this page.
Generate the PCA
Workbook
-
In the test, click Results.
If you need to exclude specific samples and/or sample sets from the analysis, click Filters.
-
Click the Reports menu and select Create report.
-
Under 1 Select Report Type, select PCA workbook.
-
Under 2 Select Options, specify the options you wish to include in the report.
-
Threshold as percentage of average eigenvalues.
This option controls how many factors are presented in the workbook. A higher threshold includes only the factors that explain more of the variability; lowering the threshold may show more factors. The default is 0.95, but it can be adjusted as needed.
The eigenvalue is the variance captured by each factor.
-
Data label text size (px).
This is the font size in the report. The default is 10 pixels.
-
Appearance of loadings.
Attribute loadings are the relationship (sign and degree) between the new factor and the original sensory attribute. You can specify how you want the attributes to be marked. The available options are:
-
Diamond
-
Square
-
Circle (default selection)
-
Vector
-
Display Labels.
Select this option to display attribute names or attribute numbers. When the option is unchecked, no attribute information will be displayed.
-
Covariance or Correlation.
PCA can be run on either the covariance matrix or the correlation matrix.
-
Covariance.
Retains the original variance structure of the attributes. Situations where you would likely use the covariance option:
-
You want the variation in the actual scores to be used.
-
The panel has been properly calibrated with reference standards.
-
Correlation.
Standardizes the data before running the PCA computations. This amounts to re-scaling the attributes to a standard scale. Situations where you would likely use the correlation option:
-
Variables that are measured on different scales.
-
Variables with unequal variance.
-
Using an untrained panel with little knowledge of the product.
-
Include Only Significant Attributes.
This option allows you to remove the attributes that are not statistically significant.
You may wish to simplify the map by removing the attributes that do not show a significant difference. However, be mindful not to reduce it to the point where it is no longer meaningful.
The significant difference between attributes is based on the Alpha value and, for scaled attributes, on the ANOVA type. Before deciding whether to include only significant attributes, please refer to the Alpha and Analysis Options from Defaults points below for further details.
We recommend running the Summary Report on scaled attributes using your preferred analysis options to see whether there is a significant difference in any of them before running the PCA. That can help you determine whether to use this checkbox or to manually deselect any of the attributes (see step 5) before generating the report.
-
Use zero for missing data.
When checked, this option substitutes zeros for missing data points. When it is unchecked, the missing data points are not included in the analysis.
If you know that there are missing data points (use the Data tab to check for missing data) and you do not wish to include that panelist's data (their sample set) at all, we recommend leaving this option unchecked and then filtering out the incomplete sample sets.
To filter sample sets:
-
In the top right-hand corner of the Results page, click Filters.
-
In the top right-hand corner of the Filters window, click Sample Sets.
-
Select Complete and click Apply Filters.
-
Alpha. This option will allow you to set the desired alpha for determining significant difference.
-
Analysis Options from Defaults. Click Change advanced analysis options to select the ANOVA type that suits your analysis needs.
-
Under 3 Select Questions, select the questions and attributes that you wish to include in the report. All scaled attributes from the same question type, such as Line Scale, will be combined and reported on two sheets (PCA and Biplot).
-
4 Select Export Type has only one export type. Click Create my report.
-
Click the download arrow, save the report to a location on your computer or network drive and open it.
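Two of the options in 2 Select Options can be illustrated numerically: the choice between covariance and correlation, and the eigenvalue threshold. The sketch below is a minimal Python example under stated assumptions. The function names and data are hypothetical, and the retention rule (keep a factor when its eigenvalue is at least the threshold times the average eigenvalue) is one plausible reading of "Threshold as percentage of average eigenvalues", not a confirmed description of the workbook's calculation.

```python
import numpy as np

def pca_eigenvalues(means, matrix="correlation"):
    """Return the eigenvalues (largest first) of the chosen matrix."""
    data = np.asarray(means, dtype=float)
    if matrix == "correlation":
        m = np.corrcoef(data, rowvar=False)   # attributes re-scaled to unit variance
    elif matrix == "covariance":
        m = np.cov(data, rowvar=False)        # original variances retained
    else:
        raise ValueError("matrix must be 'correlation' or 'covariance'")
    return np.sort(np.linalg.eigvalsh(m))[::-1]

def factors_to_report(eigenvalues, threshold=0.95):
    """Assumed rule: keep factors with eigenvalue >= threshold * average eigenvalue."""
    values = np.asarray(eigenvalues, dtype=float)
    cutoff = threshold * values.mean()
    return int(np.sum(values >= cutoff))

# Hypothetical sample means; the last attribute uses a much wider scale.
means = [
    [6.1, 2.3, 4.8, 7.0, 3.2, 55.0],
    [4.2, 5.1, 6.9, 5.6, 4.8, 31.0],
    [7.3, 1.8, 5.5, 3.9, 2.5, 62.0],
    [3.5, 6.0, 3.1, 4.4, 5.9, 46.0],
]

for matrix in ("covariance", "correlation"):
    ev = pca_eigenvalues(means, matrix)
    print(matrix, "Factor 1 share:", round(ev[0] / ev.sum(), 3),
          "factors kept:", factors_to_report(ev))

# Illustrative eigenvalues with an average of 1.0.
illustrative = [3.2, 1.5, 0.9, 0.3, 0.1, 0.0]
print(factors_to_report(illustrative, threshold=0.95))
print(factors_to_report(illustrative, threshold=0.50))
```

With the illustrative eigenvalues, a threshold of 0.95 keeps two factors and a threshold of 0.50 keeps three, which mirrors the behavior described for the threshold option. Under the covariance option, an attribute with a much larger spread tends to dominate Factor 1, which is why correlation is suggested when attributes are measured on different scales.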
Report Details - PCA/Biplots
There are two main sheets in this workbook:
-
PCA (means)
-
Biplot (means)
If a combination of different question types compatible with the PCA Workbook is used in the test, each question type will have its own pair of sheets. Please see the PCA on CATA Data section below.
Both sheets consist of graphs and two tables:
-
Loadings.
Attribute loadings are the relationship (sign and degree) between the new factor and the original sensory attribute.
-
Scores.
Sample scores are the value of each sample on the new factor.
Each factor is presented with the percentage of variance it explains, which is derived from its eigenvalue.
The attribute loadings and sample scores on the PCA sheet are graphed separately, while on the Biplot sheet, the attribute loadings and sample scores are graphed together.
The biplot visualizes the relationship between attributes, and the relationship between attributes and samples.
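To make the Loadings and Scores tables more concrete, here is a minimal sketch of how such values can be computed from sample means with a correlation-based PCA. The sample names, attribute names, and data are hypothetical. Scaling the loadings by the square root of the eigenvalue is a common convention assumed here; the workbook may scale its values differently.

```python
import numpy as np

samples = ["A", "B", "C", "D"]                                        # hypothetical
attributes = ["Sweet", "Sour", "Bitter", "Salty", "Crunchy", "Firm"]  # hypothetical
means = np.array([
    [6.1, 2.3, 4.8, 7.0, 3.2, 5.5],
    [4.2, 5.1, 6.9, 5.6, 4.8, 3.1],
    [7.3, 1.8, 5.5, 3.9, 2.5, 6.2],
    [3.5, 6.0, 3.1, 4.4, 5.9, 4.6],
])

# Standardize each attribute (equivalent to using the correlation matrix).
z = (means - means.mean(axis=0)) / means.std(axis=0, ddof=1)

# Singular value decomposition of the standardized data.
u, s, vt = np.linalg.svd(z, full_matrices=False)

# Eigenvalue = variance captured by each factor.
eigenvalues = s**2 / (len(samples) - 1)
percent_explained = 100 * eigenvalues / eigenvalues.sum()

n_factors = 2
# Scores: position of each sample on each factor.
scores = (u * s)[:, :n_factors]
# Loadings: sign and degree of the link between attribute and factor.
loadings = (vt.T * np.sqrt(eigenvalues))[:, :n_factors]

print("Percent of variance explained:", np.round(percent_explained[:n_factors], 1))
for name, row in zip(samples, scores):
    print("score  ", name, np.round(row, 2))
for name, row in zip(attributes, loadings):
    print("loading", name, np.round(row, 2))
```

With this scaling, each loading equals the correlation between the attribute and the factor scores. Plotting the rows of `scores` and `loadings` on the same pair of axes gives the kind of combined view that the Biplot sheet presents.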
PCA on CATA Data
Considerations for the Test Setup
The PCA report combines attributes from multiple scale questions. For example, all attributes from all Line Scale questions in a test are combined in the PCA analysis.
This is not the case for Check All That Apply (CATA) attributes spread across multiple CATA questions. Instead, a separate pair of PCA and Biplot sheets is generated for each individual CATA question.
If your objective is to generate the PCA across all CATA attributes, you will need to list all attributes in one CATA question.
Analysis Considerations
Before running the PCA on CATA data, if you would like to identify whether there is a significant difference between attributes, we recommend running the Standard Report using the Cochran's Q Test. Based on the result, you can decide whether to include attributes that do not show a significant difference, as described in the Generate the PCA section of this workflow.
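For reference, Cochran's Q can be computed by hand for a single CATA attribute. The sketch below is a minimal pure-Python version, assuming a table with one row per panelist and one column per sample (1 = attribute checked, 0 = not checked). The function name and data are hypothetical, and the Standard Report's own implementation may differ in details such as how missing responses are handled.

```python
def cochrans_q(responses):
    """Return (Q statistic, degrees of freedom) for a panelist x sample 0/1 table."""
    n_samples = len(responses[0])
    if n_samples < 2 or any(len(row) != n_samples for row in responses):
        raise ValueError("each panelist needs one 0/1 response per sample")

    # How many panelists checked the attribute for each sample.
    column_totals = [sum(row[j] for row in responses) for j in range(n_samples)]
    # How many samples each panelist checked the attribute for.
    row_totals = [sum(row) for row in responses]
    grand_total = sum(row_totals)

    numerator = (n_samples - 1) * (
        n_samples * sum(c * c for c in column_totals) - grand_total ** 2
    )
    denominator = n_samples * grand_total - sum(r * r for r in row_totals)
    if denominator == 0:
        raise ValueError("Q is undefined when every panelist checks all or no samples")
    return numerator / denominator, n_samples - 1

# Hypothetical responses for one attribute: 6 panelists x 3 samples.
checked = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
]

q, df = cochrans_q(checked)
print(f"Q = {q:.2f}, degrees of freedom = {df}")
```

Q is compared against a chi-square distribution with the returned degrees of freedom. For this data, Q is 7.0 with 2 degrees of freedom, which exceeds the chi-square critical value of about 5.99 at an alpha of 0.05, so this attribute would be treated as differing significantly between samples.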