4  Multivariate

Check out the user manual for more details about this module’s features.

PCA

Use the Multivariate >> PCA module to get an overview of the major modes of variance in the data.

Methods

Select calculate to run the analysis, then view the analysis methods and results.

Explore and plot

Create a cumulative scree plot to view the cumulative (summed) variance explained by the selected components. Note that the first two principal components, which define the principal plane, explain the major modes of variance in the data. However, to better reconstruct the full data, select enough components to reproduce at least 80% of the variance (as denoted by the eigenvalues or cross-validated q2).
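The 80% rule of thumb above can be sketched outside the module as follows. This is a minimal illustration using NumPy on simulated data, not the module's own implementation. The function names `cumulative_variance` and `n_components_for` are hypothetical, and the sketch uses the eigenvalues only (it does not compute cross-validated q2).

```python
import numpy as np

def cumulative_variance(X):
    """Cumulative fraction of variance explained by successive principal components."""
    Xc = X - X.mean(axis=0)                   # mean-center each variable (column)
    s = np.linalg.svd(Xc, compute_uv=False)   # singular values, largest first
    eigvals = s ** 2 / (X.shape[0] - 1)       # eigenvalues of the covariance matrix
    return np.cumsum(eigvals) / eigvals.sum()

def n_components_for(cum_var, threshold=0.80):
    """Smallest number of components whose cumulative variance reaches the threshold."""
    n = int(np.searchsorted(cum_var, threshold)) + 1
    return min(n, len(cum_var))

# Simulated data (assumption): 30 samples, 6 variables with unequal spread.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 6)) * np.array([5.0, 3.0, 1.0, 1.0, 0.5, 0.5])

cum_var = cumulative_variance(X)
n_keep = n_components_for(cum_var, threshold=0.80)
print(np.round(cum_var, 3), n_keep)
```

The values in `cum_var` correspond to the heights of the points in a cumulative scree plot, and `n_keep` is the first component at which the curve crosses the chosen threshold.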

Create a plot of sample (row) scores to view how the samples are positioned along the selected components.

Create a diagnostic plot to overview extreme (leverage) and moderate (DmodX) sample outliers. Leverage identifies samples that are outliers within the principal plane, while DmodX identifies samples that lie far from the principal plane in the orthogonal direction. Any samples with high leverage and DmodX may need to be removed to improve the model fit of projection methods like PCA.
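To make the two diagnostics concrete, here is a rough sketch of how they can be computed with NumPy. It is an assumption-laden simplification: leverage is taken as each sample's share of the squared scores, and DmodX as the residual standard deviation per sample, without the extra normalization that some software applies. The function name `pca_diagnostics` and the simulated data are hypothetical.

```python
import numpy as np

def pca_diagnostics(X, n_components=2):
    """Return (leverage, dmodx) for each sample of a mean-centered PCA model."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    T = U[:, :n_components] * s[:n_components]    # sample scores
    P = Vt[:n_components].T                       # variable loadings
    # Leverage: each sample's share of the squared scores, summed over components.
    leverage = np.sum(T ** 2 / np.sum(T ** 2, axis=0), axis=1)
    # DmodX (simplified): residual standard deviation orthogonal to the model plane.
    E = Xc - T @ P.T
    dmodx = np.sqrt(np.sum(E ** 2, axis=1) / (X.shape[1] - n_components))
    return leverage, dmodx

# Simulated data (assumption): 20 samples, 4 variables, with sample 0 made extreme.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
X[0] = 15.0 * np.array([1.0, 1.0, 0.0, 0.0])

leverage, dmodx = pca_diagnostics(X, n_components=2)
print(int(np.argmax(leverage)))   # index of the sample with the highest leverage
```

Plotting `leverage` against `dmodx` gives the same kind of view as the diagnostic plot: samples far along either axis are candidates for closer inspection.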

Create a loadings plot to view variable contributions to sample scores. Use the labels to specify variable names and text size. Variables with large loadings on the x-axis or y-axis contribute the most to defining sample scores on the x-axis and y-axis, respectively. For example, the following plot suggests that as sample scores move left on the x-axis, the samples contain larger amounts of 1,5-anhydroglucitol. Variable loadings can be colored based on variable metadata. For example, we can view which variables displayed significant differences between classes based on the statistical test (shown in green).
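The idea that variables with large loadings drive the scores on a given axis can be illustrated with a short NumPy sketch. The variable names, the simulated data, and the helper functions `pca_loadings` and `top_variables` are all hypothetical, and the data are only mean-centered here (no scaling to unit variance), which may differ from the module's settings.

```python
import numpy as np

def pca_loadings(X, n_components=2):
    """Loadings matrix of shape (n_variables, n_components) for mean-centered X."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:n_components].T

def top_variables(loadings, names, component=0, k=3):
    """Variables with the largest absolute loading on one component."""
    order = np.argsort(-np.abs(loadings[:, component]))[:k]
    return [(names[i], float(loadings[i, component])) for i in order]

# Simulated data (assumption): var_c is given a much larger spread than the rest,
# so it is expected to dominate the first component.
rng = np.random.default_rng(1)
names = ["var_a", "var_b", "var_c", "var_d"]
X = rng.normal(size=(40, 4))
X[:, 2] *= 5.0

loadings = pca_loadings(X, n_components=2)
ranked = top_variables(loadings, names, component=0, k=2)
print(ranked)
```

The sign of a loading indicates the direction: samples with scores on the same side of the axis as a variable's loading tend to have higher values of that variable.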

Create a biplot, which overlays sample scores and variable loadings. This plot can be used to get an overview of the variable abundance among samples as encoded by the PCA. For example, diabetic samples generally have high amounts of 1,5-anhydroglucitol (the reverse can be said of non-diabetic samples) and diabetic samples have high amounts of 2-hydroxybutanoic acid (the reverse can be said of non-diabetic samples).

Customize plots by specifying which components to view, which groups to base point colors and borders on, and the point size and transparency. The labels menu can be used to modify the point label text and size, as well as the legend title.

Report

Create a report to save methods and results.

Save

Save the results for later analyses. This will save the principal component scores, sample leverage, and DmodX.