1  Preprocessing

Check out the user manual for more details about this module’s features.

Merge

Metabolomic data from a real-world experiment comparing diabetic to non-diabetic samples will be loaded and preprocessed for analysis.

Load the data

Use the data module to load the example data. Select the load menu, set Data type = example, then select the load button. Next, select dave from the data menu.

Prepare the data

Use the preprocess >> merge module to format the data set and merge it with the variable meta data. Select the blue question mark icon at the bottom right for detailed instructions.

Add non-numeric variables to the row meta data.

Merge the data index with meta data describing the numeric variables (e.g. KEGG and CID identifiers).

Select calculate and overview the results.
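
For readers who want to see what this preparation amounts to outside the interface, the steps above can be sketched in Python with pandas. This is only an illustration under stated assumptions, not the module's implementation: the toy table, the `split_and_merge` helper, the `name` key and the placeholder KEGG and CID values are all hypothetical.

```python
import pandas as pd

# Hypothetical toy table: one row per sample, mixed numeric and non-numeric columns.
raw = pd.DataFrame(
    {
        "sample": ["s1", "s2", "s3"],
        "class": ["diabetic", "non-diabetic", "diabetic"],
        "glucose": [9.1, 5.2, 8.7],
        "alanine": [1.4, 1.1, None],
    }
)

# Hypothetical annotations for the numeric variables (identifier values are placeholders).
annotations = pd.DataFrame(
    {
        "name": ["glucose", "alanine"],
        "KEGG": ["kegg_a", "kegg_b"],
        "CID": ["cid_a", "cid_b"],
    }
)


def split_and_merge(raw, annotations, key="name"):
    """Split a table into numeric data, row meta data and column meta data."""
    # Numeric columns form the data matrix.
    numeric = raw.select_dtypes(include="number")
    # Non-numeric columns describe the samples (row meta data).
    row_meta = raw.drop(columns=numeric.columns)
    # One row per numeric variable, merged with its annotations (column meta data).
    index = pd.DataFrame({key: list(numeric.columns)})
    col_meta = index.merge(annotations, on=key, how="left")
    return numeric, row_meta, col_meta


data, row_meta, col_meta = split_and_merge(raw, annotations)
print(data.shape, row_meta.shape, col_meta.shape)
```

A left merge keeps every numeric variable in the column meta data even when no annotation is available for it.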

Explore and plot

View the dimensions for the created data assets.

Review the missing values for each sample (row) and variable. Connected lines link missing values that occur in the same sample across multiple variables.
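
The same overview can be approximated numerically. The sketch below, assuming a small hypothetical pandas table and a hypothetical `missing_overview` helper, counts missing values per variable and per sample, and lists the samples that are missing values in more than one variable (the cases the connected lines highlight).

```python
import numpy as np
import pandas as pd

# Hypothetical data: rows are samples, columns are variables.
data = pd.DataFrame(
    {
        "v1": [1.0, np.nan, 3.0],
        "v2": [np.nan, np.nan, 6.0],
        "v3": [7.0, 8.0, 9.0],
    },
    index=["s1", "s2", "s3"],
)


def missing_overview(data):
    """Count missing values per variable (column) and per sample (row)."""
    is_missing = data.isna()
    per_variable = is_missing.sum(axis=0)
    per_sample = is_missing.sum(axis=1)
    return per_variable, per_sample


per_variable, per_sample = missing_overview(data)

# Samples with missing values in more than one variable.
multi_missing = per_sample[per_sample > 1].index.tolist()
print(per_variable.to_dict(), per_sample.to_dict(), multi_missing)
```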

Report

Create a report capturing the methods, results, and figures.

Save the results for later analyses.

The created sample and variable meta data can be viewed in the data menu as dave__row_meta and dave__col_meta, respectively.

From here on every module’s analysis will follow the same basic workflow.

Missing

Methods

Next, use the preprocess >> missing module to remove and/or impute missing values. This is an important step to make sure downstream analyses are compatible with the data.

Use the filter menu to drop variables whose percentage of missing values exceeds the missing cutoff within a factor of interest (group). For example, the example selection will remove any variable that has more than 50% missing values in any level of the class factor.
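
The filtering rule can be sketched as follows. This is an illustration of the rule as described above rather than the module's own code; the `drop_by_group_missingness` helper, its parameter names and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: four samples in two classes, three variables.
data = pd.DataFrame(
    {
        "v1": [np.nan, np.nan, 1.0, 2.0],  # 100% missing in class "a"
        "v2": [1.0, np.nan, 3.0, 4.0],     # 50% missing in class "a"
        "v3": [5.0, 6.0, 7.0, 8.0],        # complete
    }
)
groups = pd.Series(["a", "a", "b", "b"], index=data.index, name="class")


def drop_by_group_missingness(data, groups, cutoff=50.0):
    """Drop variables with more than `cutoff` percent missing in any group level."""
    # Percent missing per variable within each level of the grouping factor.
    pct_missing = data.isna().astype(float).groupby(groups).mean() * 100
    # The worst (highest) percentage across levels decides whether a variable stays.
    worst = pct_missing.max(axis=0)
    keep = worst[worst <= cutoff].index
    removed = worst[worst > cutoff].index.tolist()
    return data[keep], removed


filtered, removed = drop_by_group_missingness(data, groups, cutoff=50.0)
print(list(filtered.columns), removed)
```

Because the comparison is strict, a variable with exactly 50% missing values in a level is kept, which matches the "more than 50%" wording.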

Use the impute menu to impute any remaining missing values. For example, the example options will impute the remaining missing values as 2/3 of the minimum of each variable. This is a reasonable assumption if values are missing because they are below the limit of detection.
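
As a worked illustration of this imputation rule, the sketch below replaces each missing value with 2/3 of the observed minimum of its variable. The `impute_fraction_of_min` helper and the toy values are hypothetical; the module may differ in its details.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a few remaining missing values.
data = pd.DataFrame(
    {
        "v1": [3.0, np.nan, 6.0],
        "v2": [np.nan, 9.0, 12.0],
    }
)


def impute_fraction_of_min(data, fraction=2 / 3):
    """Fill missing values with `fraction` times each variable's observed minimum."""
    # Column minima ignore missing values by default.
    fill_values = data.min(axis=0) * fraction
    # A Series passed to fillna is matched to columns by label.
    return data.fillna(fill_values)


imputed = impute_fraction_of_min(data)
print(imputed)
```

Here the observed minima are 3.0 and 9.0, so the missing entries are filled with approximately 2.0 and 6.0, while observed values are left unchanged.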

Select calculate and overview the results.

Explore and plot

Review missing values for each row and any removed variables based on the missingness filter.

Review any remaining missing values in the data.

Next create a report and save the analysis results.