PCA Workbook

Overview

Principal Component Analysis (PCA) helps you understand the multi-dimensional relationships among samples and attributes. Its visual representation makes these relationships easier to interpret.

The PCA is generated by analyzing the correlation structure of a group of multivariate observations and identifying the axis along which the maximum variability in the data occurs (Factor 1). Each subsequent factor captures the greatest amount of the remaining variability and is uncorrelated with the factors before it.
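
As a rough illustration of the idea above, the following Python sketch finds the factors from the correlation matrix of a small table of sample means. The data values and variable names are hypothetical, NumPy is assumed to be available, and this is not necessarily how the report itself performs the computation.

```python
import numpy as np

# Hypothetical table of sample means: 4 samples (rows) x 6 attributes (columns).
means = np.array([
    [6.1, 3.2, 4.8, 2.0, 5.5, 1.2],
    [5.4, 4.1, 4.1, 2.9, 4.9, 2.0],
    [2.3, 6.8, 1.9, 5.5, 2.2, 4.7],
    [3.0, 6.0, 2.4, 5.1, 3.1, 4.0],
])

# Correlation matrix of the attributes (columns).
corr = np.corrcoef(means, rowvar=False)

# eigh is intended for symmetric matrices and returns eigenvalues in
# ascending order, so reorder them to put the largest first.
eigenvalues, eigenvectors = np.linalg.eigh(corr)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Factor 1 is the axis with the largest eigenvalue (the most variability).
factor_1 = eigenvectors[:, 0]
print("Eigenvalues:", np.round(eigenvalues, 3))
```

With a correlation matrix, the eigenvalues add up to the number of attributes, so each eigenvalue can be read as the amount of attribute variance carried by that factor.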




Requirements

For a valid PCA, the following minimum requirements must be met:  
  1. Samples: 3+  
  2. Attributes: 6+  
The PCA reports are compatible with sample-related Line Scale, Category Scale, Choose, CATA, LMS, and Numeric question types.  
This workflow focuses on the typical test setup, in which a single scale question type is used (for example, only Category Scale or only Line Scale questions) rather than a mix of types. See the PCA on CATA Data section at the bottom of this page for information about PCA on CATA data.


Generate the PCA Workbook

  1. In the test, click Results. To exclude specific samples and/or sample sets from the analysis, click Filters.

  2. Click the Reports menu and select Create report.

  3. Under 1 Select Report Type, select PCA workbook.

  4. Under 2 Select Options, specify the options you wish to include in the report.


    1. Threshold as percentage of average eigenvalues. This option controls how many factors are presented in the workbook. A higher threshold includes only factors that explain more of the variability; a lower threshold may show more factors. The default is 0.95, but it can be adjusted as needed.
      The eigenvalue is the variance captured by each factor.

    2. Data label text size (px). This is the font size in the report. The default is 10 pixels.

    3. Appearance of loadings. Attribute loadings describe the relationship (sign and degree) between the new factor and the original sensory attribute. You can specify how the attributes are marked on the plots. The available options are:
      1. Diamond
      2. Square
      3. Circle (default selection)
      4. Vector

    4. Display Labels. Select this option to display attribute names or attribute numbers. When the option is unchecked, no attribute information will be displayed.

    5. Covariance or Correlation. PCA can be run on either the covariance matrix or the correlation matrix.
      1. Covariance. Retains the original variance structure of the attributes. You would likely use the covariance option when:
        1. You want the variation in the actual scores to be used.
        2. The panel has been properly calibrated with reference standards.

      2. Correlation. Standardizes the data before running the PCA computations. This amounts to re-scaling the attributes to a standard scale. You would likely use the correlation option when:
        1. Variables are measured on different scales.
        2. Variables have unequal variance.
        3. The panel is untrained and has little knowledge of the product.


    6. Include Only Significant Attributes. This option allows you to remove the attributes which are not statistically significant.
      You may wish to simplify the map by removing the attributes that do not show a significant difference. However, be careful not to reduce the map to the point where it is no longer meaningful.

      Whether an attribute is significant depends on the Alpha value and, for scaled attributes, on the ANOVA type. Refer to the Alpha and Analysis Options from Defaults points below for further details before deciding whether to include only significant attributes.

      We recommend running the Summary Report on scaled attributes using your preferred analysis options to see whether there is a significant difference in any of them before running the PCA. That can help you determine if you want to use this checkbox, or manually deselect any of the attributes (see step 5) before generating the report.

    7. Use zero for missing data. When checked, this option replaces missing data points with zeros. When this option is unchecked, the missing data points are not included in the analysis.

      If there are missing data points (use the Data tab to check) and you do not wish to include the affected panelist's sample set at all, we recommend leaving this option unchecked and then filtering out the incomplete sample sets.


      To filter sample sets:
      1. In the top right-hand corner of the Results page, click Filters.
      2. In the top right-hand corner of the Filters window, click Sample Sets.
      3. Select Complete and click Apply Filters.

    8. Alpha. This option will allow you to set the desired alpha for determining significant difference.

    9. Analysis Options from Defaults. Click Change advanced analysis options to select the ANOVA type that suits your analysis needs.


  5. Under 3 Select Questions, select the questions and attributes that you wish to include in the report. All scaled attributes from the same question type (for example, Line Scale) will be combined and reported on two sheets (PCA and Biplot).

  6. 4 Select Export Type offers only one export type. Click Create my report.

  7. Click the download arrow, save the report to a location on your computer or network drive, and open it.
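
To make the Covariance or Correlation option in step 4 more concrete, here is a minimal Python sketch comparing the two on the same data. The values are hypothetical, with one attribute deliberately recorded on a much wider scale than the others; `pca_eigenvalues` is an illustrative helper, not part of the software, and NumPy is assumed to be available.

```python
import numpy as np

def pca_eigenvalues(data, use_correlation):
    """Return eigenvalues (largest first) of the covariance or correlation matrix."""
    data = np.asarray(data, dtype=float)
    if use_correlation:
        matrix = np.corrcoef(data, rowvar=False)
    else:
        matrix = np.cov(data, rowvar=False)
    # eigvalsh returns ascending eigenvalues of a symmetric matrix.
    return np.linalg.eigvalsh(matrix)[::-1]

# Hypothetical means: the first attribute uses a 0-100 scale, the others 0-10.
means = np.array([
    [62.0, 3.1, 4.0],
    [35.0, 5.0, 4.4],
    [80.0, 2.2, 3.9],
    [48.0, 4.6, 4.6],
])

cov_eigs = pca_eigenvalues(means, use_correlation=False)
corr_eigs = pca_eigenvalues(means, use_correlation=True)

# Share of the total variance carried by Factor 1 under each option.
cov_share = cov_eigs[0] / cov_eigs.sum()
corr_share = corr_eigs[0] / corr_eigs.sum()
print(f"Factor 1 share, covariance:  {cov_share:.3f}")
print(f"Factor 1 share, correlation: {corr_share:.3f}")
```

Under the covariance option, the wide-scale attribute dominates Factor 1 almost entirely. Under the correlation option, every attribute contributes one unit of variance, which is why it suits variables measured on different scales.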


Report Details - PCA/Biplots

There are two main sheets in this workbook:
  1. PCA (means)
  2. Biplot (means)
If the test uses a combination of different question types compatible with the PCA Workbook, each question type will have its own pair of sheets. See the PCA on CATA Data section below for how CATA questions are handled.


Both sheets consist of graphs and two tables:
  1. Loadings. Attribute loadings describe the relationship (sign and degree) between the new factor and the original sensory attribute.

  2. Scores. Sample scores are the value of each sample on the new factor.

Each factor is presented with the percentage of variance it explains, which is derived from its eigenvalue.
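
The arithmetic behind these percentages, and behind the Threshold option described under Generate the PCA Workbook, can be sketched in a few lines of Python. The eigenvalues are hypothetical, and the retention rule used here (keep a factor when its eigenvalue is at least the threshold times the average eigenvalue) is an assumption based on the option's name; the software's exact rule may differ.

```python
def summarize_factors(eigenvalues, threshold=0.95):
    """Percent of variance per factor, plus the factors that pass the threshold.

    Assumption: a factor is kept when its eigenvalue is at least
    threshold * (average eigenvalue). The software's exact rule may differ.
    """
    total = sum(eigenvalues)
    cutoff = threshold * total / len(eigenvalues)
    percent = [100.0 * value / total for value in eigenvalues]
    kept = [i + 1 for i, value in enumerate(eigenvalues) if value >= cutoff]
    return percent, kept

# Hypothetical eigenvalues for six attributes (correlation option).
eigenvalues = [3.6, 1.2, 0.9, 0.2, 0.1, 0.0]

percent, kept = summarize_factors(eigenvalues)
print("Percent of variance:", [round(p, 1) for p in percent])
print("Factors kept at 0.95:", kept)
print("Factors kept at 0.50:", summarize_factors(eigenvalues, threshold=0.5)[1])
```

In this example the average eigenvalue is 1, so the third factor (0.9) falls just below the default cutoff and only appears once the threshold is lowered.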

The attribute loadings and sample scores on the PCA sheet are graphed separately, while on the Biplot sheet, the attribute loadings and sample scores are graphed together.

The biplot visualizes the relationships among attributes, and between attributes and samples.
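
For readers who want to see where the two tables come from, the sketch below computes loadings and scores for the first two factors of a correlation-based PCA. The data and the `loadings_and_scores` helper are hypothetical, NumPy is assumed to be available, and the scaling shown is one common convention; the values in the workbook may be scaled differently.

```python
import numpy as np

def loadings_and_scores(means, n_factors=2):
    """Correlation-based PCA returning attribute loadings and sample scores."""
    means = np.asarray(means, dtype=float)
    # Standardize each attribute (column) to mean 0 and unit variance.
    z = (means - means.mean(axis=0)) / means.std(axis=0, ddof=1)
    corr = (z.T @ z) / (len(z) - 1)
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1][:n_factors]
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    # Loadings: correlation between each attribute and each factor.
    loadings = eigenvectors * np.sqrt(eigenvalues)
    # Scores: position of each sample on each factor.
    scores = z @ eigenvectors
    return loadings, scores

# Hypothetical table of sample means: 4 samples (rows) x 6 attributes (columns).
means = np.array([
    [6.1, 3.2, 4.8, 2.0, 5.5, 1.2],
    [5.4, 4.1, 4.1, 2.9, 4.9, 2.0],
    [2.3, 6.8, 1.9, 5.5, 2.2, 4.7],
    [3.0, 6.0, 2.4, 5.1, 3.1, 4.0],
])

loadings, scores = loadings_and_scores(means)
print("Loadings (attributes x factors):\n", np.round(loadings, 2))
print("Scores (samples x factors):\n", np.round(scores, 2))
```

Plotting the rows of `loadings` and `scores` on the same pair of axes gives a basic biplot: attributes that point in the same direction are positively related, and samples lying in that direction tend to score higher on those attributes.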


PCA on CATA Data

Considerations for the Test Setup
The PCA report combines attributes from multiple scale questions. For example, all attributes from all Line Scale questions in a test will be combined in the PCA analysis.

This is not the case for Check All That Apply (CATA) attributes spread across multiple CATA questions. The report generates separate PCA and Biplot sheets for each individual CATA question.

If your objective is to generate the PCA across all CATA attributes, you will need to list all attributes in one CATA question.

Analysis Considerations
Before running the PCA on CATA data, if you would like to identify which attributes show a significant difference, we recommend running the Standard Report using the Cochran's Q Test. Based on the results, you can decide whether to include attributes that do not show a significant difference, as described in the Generate the PCA Workbook section of this workflow.
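
For background on what Cochran's Q measures, here is a minimal Python sketch of the statistic for a single CATA attribute. The response data and the `cochrans_q` function are hypothetical and use only the standard library; this is the textbook formula, not a description of how the Standard Report computes it.

```python
def cochrans_q(responses):
    """Cochran's Q statistic for one CATA attribute.

    responses: one row per panelist, one column per sample,
    with 1 = attribute checked and 0 = not checked.
    Returns (Q, degrees_of_freedom).
    """
    k = len(responses[0])  # number of samples
    row_totals = [sum(row) for row in responses]
    col_totals = [sum(col) for col in zip(*responses)]
    grand_total = sum(row_totals)

    denominator = k * grand_total - sum(r * r for r in row_totals)
    if denominator == 0:
        # Every panelist checked the attribute for all samples or for none.
        raise ValueError("Q is undefined: no panelist distinguishes between samples.")
    numerator = (k - 1) * (k * sum(c * c for c in col_totals) - grand_total ** 2)
    return numerator / denominator, k - 1

# Hypothetical responses: 6 panelists (rows) x 3 samples (columns).
checked = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 0],
]

q, df = cochrans_q(checked)
print(f"Q = {q:.3f} with {df} degrees of freedom")
```

Q is compared against a chi-square distribution with the returned degrees of freedom. With 2 degrees of freedom the critical value at an alpha of 0.05 is about 5.99, so a larger Q suggests the attribute was checked at different rates across the samples.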


