q2-longitudinal

longitudinal nmit¶

Performs a nonparametric microbial interdependence test (NMIT) to determine longitudinal sample similarity as a function of temporal microbial composition. For details and the citation, see doi.org/10.1002/gepi.22065

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table to use for microbial interdependence test.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
corr_method: Str % Choices('kendall', 'pearson', 'spearman'): The temporal correlation test to be applied.[default: 'kendall']
dist_method: Str % Choices('fro', 'nuc'): Temporal distance method, see numpy.linalg.norm for details.[default: 'fro']

Outputs¶

distance_matrix: DistanceMatrix: The resulting distance matrix.[required]
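
As a rough illustration of the idea behind nmit, the sketch below builds one feature-by-feature correlation matrix per subject from that subject's time series and then measures the distance between two subjects as the Frobenius norm of the difference between their matrices. It is a simplified stand-in and not the plugin's code: it uses Pearson correlation for brevity (the plugin defaults to 'kendall'), it treats undefined correlations as zero (an assumption), and the function names and subject data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: treat undefined correlations as zero
    return sxy / sqrt(sxx * syy)


def correlation_matrix(features):
    """features: list of per-feature time series for one subject."""
    return [[pearson(fi, fj) for fj in features] for fi in features]


def frobenius_distance(m1, m2):
    """Frobenius norm of the element-wise difference of two matrices."""
    return sqrt(sum((a - b) ** 2
                    for r1, r2 in zip(m1, m2)
                    for a, b in zip(r1, r2)))


def nmit_like_distances(subjects):
    """subjects: dict of subject ID -> list of feature time series."""
    corr = {s: correlation_matrix(f) for s, f in subjects.items()}
    return {(a, b): frobenius_distance(corr[a], corr[b])
            for a, b in combinations(sorted(corr), 2)}


# Hypothetical data: two features measured at three time points per subject.
subjects = {
    "subj_A": [[1, 2, 3], [2, 4, 6]],  # features rise together
    "subj_B": [[1, 2, 3], [6, 4, 2]],  # features move in opposition
}
distances = nmit_like_distances(subjects)
```

Subjects whose features co-vary in the same way over time end up close together, regardless of the absolute abundances involved.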

longitudinal first-differences¶

Calculates first differences in "metric" between sequential states for samples collected from individual subjects sampled repeatedly at two or more states. First differences can be performed on a metadata column (including artifacts that can be input as metadata) or a feature in a feature table. Outputs a data series of first differences for each individual subject at each sequential pair of states, labeled by the SampleID of the second state (e.g., paired differences between time 0 and time 1 would be labeled by the SampleIDs at time 1). This file can be used as input to linear mixed effects models or other longitudinal or diversity methods to compare changes in first differences across time or among groups of subjects. Also supports differences from baseline (or other static comparison state) by setting the "baseline" parameter.

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table to optionally use for computing first differences.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
metric: Str: Numerical metadata or artifact column to test.[required]
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']
baseline: Float: A value listed in the state_column metadata column against which all other states should be compared. Toggles calculation of static differences instead of first differences (which are calculated if no value is given for baseline). If a "baseline" value is provided, sample differences at each state are compared against the baseline state, instead of the previous state. Must be a value listed in the state_column.[optional]

Outputs¶

first_differences: SampleData[FirstDifferences]: Series of first differences.[required]
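
The labeling and baseline behavior described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the plugin's implementation: the tuple layout, the function name, and the choice to skip subjects without a baseline sample are all hypothetical.

```python
from collections import defaultdict


def first_differences(samples, baseline=None):
    """samples: list of (sample_id, subject_id, state, value) tuples.

    Returns {sample_id: difference}, keyed by the sample at the later state.
    """
    by_subject = defaultdict(list)
    for sample_id, subject, state, value in samples:
        by_subject[subject].append((state, sample_id, value))

    diffs = {}
    for subject, rows in by_subject.items():
        rows.sort()  # order each subject's samples by state
        if baseline is None:
            # Compare each state with the one immediately before it.
            for (_, _, prev), (_, sid, cur) in zip(rows, rows[1:]):
                diffs[sid] = cur - prev
        else:
            # Compare every other state with the fixed baseline state.
            base = [v for state, _, v in rows if state == baseline]
            if not base:
                continue  # assumption: skip subjects lacking a baseline sample
            for state, sid, value in rows:
                if state != baseline:
                    diffs[sid] = value - base[0]
    return diffs


# Hypothetical samples: (sample ID, subject ID, state, metric value).
samples = [
    ("a0", "S1", 0, 1.0), ("a1", "S1", 1, 3.0), ("a2", "S1", 2, 6.0),
    ("b0", "S2", 0, 2.0), ("b1", "S2", 1, 1.5),
]
sequential = first_differences(samples)
from_baseline = first_differences(samples, baseline=0)
```

With no baseline, sample a2 is compared with a1; with baseline=0, it is compared with a0 instead.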

longitudinal first-distances¶

Calculates first distances between sequential states for samples collected from individual subjects sampled repeatedly at two or more states. This method is similar to the "first differences" method, except that it requires a distance matrix as input and calculates first differences as distances between successive states. Outputs a data series of first distances for each individual subject at each sequential pair of states, labeled by the SampleID of the second state (e.g., paired distances between time 0 and time 1 would be labeled by the SampleIDs at time 1). This file can be used as input to linear mixed effects models or other longitudinal or diversity methods to compare changes in first distances across time or among groups of subjects. Also supports distance from baseline (or other static comparison state) by setting the "baseline" parameter.

Citations¶

Inputs¶

distance_matrix: DistanceMatrix: Matrix of distances between pairs of samples.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
baseline: Float: A value listed in the state_column metadata column against which all other states should be compared. Toggles calculation of static distances instead of first distances (which are calculated if no value is given for baseline). If a "baseline" value is provided, sample distances at each state are compared against the baseline state, instead of the previous state. Must be a value listed in the state_column.[optional]
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']

Outputs¶

first_distances: SampleData[FirstDifferences]: Series of first distances.[required]

longitudinal pairwise-differences¶

Performs paired difference testing between samples from each subject. Sample pairs may represent a typical intervention study (e.g., samples collected pre- and post-treatment), paired samples from two different timepoints (e.g., in a longitudinal study design), or identical samples receiving different treatments. This action tests whether the change in a numeric metadata value "metric" differs from zero and differs between groups (e.g., groups of subjects receiving different treatments), and produces boxplots of paired difference distributions for each group. Note that "metric" can be derived from a feature table or metadata.

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table to optionally use for paired comparisons.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
metric: Str: Numerical metadata or artifact column to test.[required]
state_column: Str: Metadata column containing state (e.g., Time) across which samples are paired.[required]
state_1: Str: Baseline state column value.[required]
state_2: Str: State column value to pair with baseline.[required]
individual_id_column: Str: Metadata column containing subject IDs to use for pairing samples. WARNING: if replicates exist for an individual ID at either state_1 or state_2, that subject will be dropped and reported in standard output by default. Set replicate_handling="random" to instead randomly select one member.[required]
group_column: Str: Metadata column on which to separate groups for comparison.[optional]
parametric: Bool: Perform parametric (ANOVA and t-tests) or non-parametric (Kruskal-Wallis, Wilcoxon, and Mann-Whitney U tests) statistical tests.[default: False]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow', 'cividis'): Color palette to use for generating boxplots.[default: 'Set1']
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']

Outputs¶

visualization: Visualization: <no description>[required]
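
To make the paired comparison concrete, here is a minimal sketch that pairs each subject's samples at state_1 and state_2, collects the differences per group, and computes a one-sample t statistic for the hypothesis that the mean difference is zero. It illustrates only the parametric variant; the plugin's default is non-parametric, and no p-value is computed here. The data layout, function names, and example values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_differences(samples, state_1, state_2):
    """samples: list of (subject_id, group, state, value) tuples.

    Returns {group: [value at state_2 - value at state_1, ...]}.
    """
    values = {}
    for subject, group, state, value in samples:
        values[(subject, state)] = (group, value)

    diffs = {}
    for (subject, state), (group, before) in values.items():
        if state != state_1 or (subject, state_2) not in values:
            continue  # keep only subjects sampled at both states
        after = values[(subject, state_2)][1]
        diffs.setdefault(group, []).append(after - before)
    return diffs


def one_sample_t(diffs):
    """t statistic for the null hypothesis that the mean difference is zero."""
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Hypothetical samples: (subject ID, group, state, metric value).
samples = [
    ("s1", "treated", "pre", 1.0), ("s1", "treated", "post", 3.0),
    ("s2", "treated", "pre", 2.0), ("s2", "treated", "post", 5.0),
    ("s3", "treated", "pre", 1.0), ("s3", "treated", "post", 5.0),
    ("c1", "control", "pre", 1.0),  # no "post" sample, so not paired
]
diffs = paired_differences(samples, "pre", "post")
t_treated = one_sample_t(diffs["treated"])
```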

longitudinal pairwise-distances¶

Performs pairwise distance testing between sample pairs from each subject. Sample pairs may represent a typical intervention study (e.g., samples collected pre- and post-treatment), paired samples from two different timepoints (e.g., in a longitudinal study design), or identical samples receiving two different treatments. This action tests whether the pairwise distance between each subject pair differs between groups (e.g., groups of subjects receiving different treatments) and produces boxplots of paired distance distributions for each group.

Citations¶

Inputs¶

distance_matrix: DistanceMatrix: Matrix of distances between pairs of samples.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
group_column: Str: Metadata column on which to separate groups for comparison.[required]
state_column: Str: Metadata column containing state (e.g., Time) across which samples are paired.[required]
state_1: Str: Baseline state column value.[required]
state_2: Str: State column value to pair with baseline.[required]
individual_id_column: Str: Metadata column containing subject IDs to use for pairing samples. WARNING: if replicates exist for an individual ID at either state_1 or state_2, that subject will be dropped and reported in standard output by default. Set replicate_handling="random" to instead randomly select one member.[required]
parametric: Bool: Perform parametric (ANOVA and t-tests) or non-parametric (Kruskal-Wallis, Wilcoxon, and Mann-Whitney U tests) statistical tests.[default: False]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow', 'cividis'): Color palette to use for generating boxplots.[default: 'Set1']
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']

Outputs¶

visualization: Visualization: <no description>[required]
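
The three replicate_handling choices can be pictured with the sketch below, which treats any two samples that share a subject and a state as replicates. It is an illustration of the documented options only, not the plugin's code; the function name, tuple layout, and seed argument are hypothetical.

```python
import random
from collections import defaultdict


def handle_replicates(samples, replicate_handling="error", seed=None):
    """samples: list of (sample_id, subject_id, state) tuples.

    Returns the samples left after resolving replicates, i.e. several
    samples sharing one (subject_id, state) combination.
    """
    groups = defaultdict(list)
    for sample in samples:
        groups[(sample[1], sample[2])].append(sample)

    rng = random.Random(seed)
    kept = []
    for key, members in groups.items():
        if len(members) == 1:
            kept.extend(members)
        elif replicate_handling == "error":
            raise ValueError(f"replicate samples detected for {key}")
        elif replicate_handling == "drop":
            continue  # discard every replicated sample
        elif replicate_handling == "random":
            kept.append(rng.choice(members))  # keep one representative
        else:
            raise ValueError(f"unknown option: {replicate_handling!r}")
    return kept


# Hypothetical samples: "a" and "b" are replicates for subject S1 at state 0.
samples = [("a", "S1", 0), ("b", "S1", 0), ("c", "S1", 1)]
dropped = handle_replicates(samples, "drop")
picked = handle_replicates(samples, "random", seed=0)
```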

longitudinal linear-mixed-effects¶

Linear mixed effects (LME) models evaluate the contribution of exogenous covariates "group_columns" and "random_effects" to a single dependent variable, "metric". This action fits an LME model and draws line plots for each group column. A feature table artifact is an optional input; "metric" may be derived from either the feature table or the metadata.

Citations¶

Bokulich et al., 2018; Seabold & Perktold, 2010

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table containing metric.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
metric: Str: Dependent variable column name. Must be a column name located in the metadata or feature table files.[optional]
group_columns: Str: Comma-separated list (without spaces) of metadata columns to use as independent covariates used to determine mean structure of "metric".[optional]
random_effects: Str: Comma-separated list (without spaces) of metadata columns to use as independent covariates used to determine the variance and covariance structure (random effects) of "metric". To add a random slope, the same value passed to "state_column" should be passed here. A random intercept for each individual is set by default and does not need to be passed here.[optional]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow', 'cividis'): Color palette to use for generating boxplots.[default: 'Set1']
lowess: Bool: Estimate locally weighted scatterplot smoothing. Note that this will eliminate confidence interval plotting.[default: False]
ci: Float % Range(0, 100): Size of the confidence interval for the regression estimate.[default: 95]
formula: Str: R-style formula to use for model specification. A formula must be used if the "metric" parameter is None. Note that the metric and group columns specified in the formula will override metric and group columns that are passed separately as parameters to this method. Formulae will be in the format "a ~ b + c", where "a" is the metric (dependent variable) and "b" and "c" are independent covariates. Use "+" to add a variable; "+ a:b" to add an interaction between variables a and b; "*" to include a variable and all interactions; and "-" to subtract a particular term (e.g., an interaction term). See https://patsy.readthedocs.io/en/latest/formulas.html for full documentation of valid formula operators. Always enclose formulae in quotes to avoid unpleasant surprises.[optional]

Outputs¶

visualization: Visualization: <no description>[required]
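
The formula operators described for the "formula" parameter can be illustrated with a small helper that spells out what "*" stands for: every listed variable plus all of their interactions. This is a sketch of the formula notation only and does not reproduce how the plugin or patsy parses formulae; the helper names and the column names "shannon", "month", and "diet" are hypothetical.

```python
from itertools import combinations


def expand_star(terms):
    """Expand "b * c * ..." into main effects plus all interactions."""
    expanded = []
    for size in range(1, len(terms) + 1):
        for combo in combinations(terms, size):
            expanded.append(":".join(combo))
    return expanded


def build_formula(metric, terms):
    """Assemble an R-style formula string with explicit terms."""
    return f"{metric} ~ {' + '.join(expand_star(terms))}"


# Writes out the terms that "shannon ~ month * diet" is shorthand for.
formula = build_formula("shannon", ["month", "diet"])
```

Writing the terms out explicitly makes it easier to see which interaction to remove with "-" when a full factorial model is not wanted.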

longitudinal anova¶

Perform an ANOVA test on any factors present in a metadata file and/or metadata-transformable artifacts. This is followed by pairwise t-tests to examine pairwise differences between categorical sample groups.

Citations¶

Parameters¶

metadata: Metadata: Sample metadata containing formula terms.[required]
formula: Str: R-style formula specifying the model. All terms must be present in the sample metadata or metadata-transformable artifacts and can be continuous or categorical metadata columns. Formulae will be in the format "a ~ b + c", where "a" is the metric (dependent variable) and "b" and "c" are independent covariates. Use "+" to add a variable; "+ a:b" to add an interaction between variables a and b; "*" to include a variable and all interactions; and "-" to subtract a particular term (e.g., an interaction term). See https://patsy.readthedocs.io/en/latest/formulas.html for full documentation of valid formula operators. Always enclose formulae in quotes to avoid unpleasant surprises.[required]
sstype: Str % Choices('I', 'II', 'III'): Type of sum of squares calculation to perform (I, II, or III).[default: 'II']
repeated_measures: Bool: Perform ANOVA as a repeated measures ANOVA. Implemented via statsmodels, which has the following limitations: Currently, only fully balanced within-subject designs are supported. Calculation of between-subject effects and corrections for violation of sphericity are not yet implemented.[default: False]
individual_id_column: Str: The column containing individual IDs with repeated measures to account for. This should not be included in the formula.[optional]
rm_aggregate: Bool: If the data set contains more than one observation per individual ID and cell of the specified model, aggregate the data by the mean before running the ANOVA. Only applicable for repeated measures ANOVA.[default: False]

Outputs¶

visualization: Visualization: <no description>[required]
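
For readers who want to see what the test statistic measures, the sketch below computes the F statistic of a plain one-way ANOVA: the ratio of between-group to within-group mean squares. It covers only the simplest single-factor case, ignores the sstype and repeated-measures options, and is not the plugin's code (which relies on statsmodels); the function name and data are hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA over a list of value lists."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical metric values for three sample groups.
f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```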

longitudinal volatility¶

Generate an interactive control chart depicting the longitudinal volatility of sample metadata and/or feature frequencies across time (as set using the "state_column" parameter). Any numeric metadata column (or metadata-transformable artifact, e.g., alpha diversity results) can be plotted on the y-axis and is selectable using the "metric_column" selector. Metric values are averaged to compare across any categorical metadata column using the "group_column" selector. Longitudinal volatility for individual subjects sampled over time is co-plotted as "spaghetti" plots if the "individual_id_column" parameter is used. state_column will typically be a measure of time, but any numeric metadata column can be used.

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table containing metrics.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[optional]
default_group_column: Str: The default metadata column on which to separate groups for comparison (all categorical metadata columns will be available in the visualization).[optional]
default_metric: Str: Numeric metadata or artifact column to test by default (all numeric metadata columns will be available in the visualization).[optional]
yscale: Str % Choices('linear', 'pow', 'sqrt', 'log'): y-axis scaling strategy to apply.[default: 'linear']

Outputs¶

visualization: Visualization: <no description>[required]
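
The quantities behind such a chart can be sketched as follows: one averaged line per group across states, plus a center line and limits drawn a fixed number of standard deviations from the overall mean. This follows the conventional control chart definition and is an assumption about what is drawn, not a description of the plugin's internals; the function names, the n_sd parameter, and the data are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def group_means_by_state(samples):
    """samples: list of (state, group, value). Returns {group: [(state, mean)]}."""
    cells = defaultdict(list)
    for state, group, value in samples:
        cells[(group, state)].append(value)
    lines = defaultdict(list)
    for (group, state), values in sorted(cells.items()):
        lines[group].append((state, mean(values)))
    return dict(lines)


def control_limits(samples, n_sd=2):
    """Return (lower, center, upper) around the mean of all metric values."""
    values = [v for _, _, v in samples]
    center = mean(values)
    spread = n_sd * stdev(values)
    return center - spread, center, center + spread


# Hypothetical samples: (state, group, metric value).
samples = [
    (0, "A", 1.0), (0, "A", 3.0), (1, "A", 4.0),
    (0, "B", 2.0), (1, "B", 5.0),
]
lines = group_means_by_state(samples)
lower, center, upper = control_limits(samples)
```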

Examples¶

longitudinal_volatility¶

[Command Line]

wget -O 'metadata.tsv' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/longitudinal/volatility/1/metadata.tsv'

qiime longitudinal volatility \
  --m-metadata-file metadata.tsv \
  --p-state-column month \
  --o-visualization volatility-plot.qzv

[Python API]

from qiime2 import Metadata
from urllib import request
import qiime2.plugins.longitudinal.actions as longitudinal_actions

url = 'https://amplicon-docs.qiime2.org/en/latest/data/examples/longitudinal/volatility/1/metadata.tsv'
fn = 'metadata.tsv'
request.urlretrieve(url, fn)
metadata_md = Metadata.load(fn)

volatility_plot_viz, = longitudinal_actions.volatility(
    metadata=metadata_md,
    state_column='month',
)

[Galaxy]

Using the Upload Data tool:

On the first tab (Regular), press the Paste/Fetch data button at the bottom.
1. Set "Name" (first text-field) to: metadata.tsv
2. In the larger text-area, copy-and-paste: https://amplicon-docs.qiime2.org/en/latest/data/examples/longitudinal/volatility/1/metadata.tsv
3. ("Type", "Genome", and "Settings" can be ignored)
Press the Start button at the bottom.

Using the qiime2 longitudinal volatility tool:

For "metadata":
- Perform the following steps.
  1. Leave as Metadata from TSV
  2. Set "Metadata Source" to metadata.tsv
Set "state_column" to month
Press the Execute button.

Once completed, for the new entry in your history, use the Edit button to set the name as follows:

(Renaming is optional, but it will make any subsequent steps easier to complete.)

History Name	"Name" to set (be sure to press [Save])
`#: qiime2 longitudinal volatility [...] : visualization.qzv`	`volatility-plot.qzv`

[R API]

library(reticulate)

Metadata <- import("qiime2")$Metadata
longitudinal_actions <- import("qiime2.plugins.longitudinal.actions")
request <- import("urllib")$request

url <- 'https://amplicon-docs.qiime2.org/en/latest/data/examples/longitudinal/volatility/1/metadata.tsv'
fn <- 'metadata.tsv'
request$urlretrieve(url, fn)
metadata_md <- Metadata$load(fn)

action_results <- longitudinal_actions$volatility(
    metadata=metadata_md,
    state_column='month',
)
volatility_plot_viz <- action_results$visualization


longitudinal plot-feature-volatility¶

Plots an interactive control chart of feature abundances (y-axis) in each sample across time (or state; x-axis). Feature importance scores and descriptive statistics for each feature are plotted in interactive bar charts below the control chart, facilitating exploration of longitudinal feature data. This visualization is intended for use with the feature-volatility pipeline; use that pipeline to access this visualization.

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table containing features found in importances.[required]
importances: FeatureData[Importance]: Feature importance scores.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[optional]
default_group_column: Str: The default metadata column on which to separate groups for comparison (all categorical metadata columns will be available in the visualization).[optional]
yscale: Str % Choices('linear', 'pow', 'sqrt', 'log'): y-axis scaling strategy to apply.[default: 'linear']
importance_threshold: Float % Range(0, None, inclusive_start=False) | Str % Choices('q1', 'q2', 'q3'): Filter feature table to exclude any features with an importance score less than this threshold. Set to "q1", "q2", or "q3" to select the first, second, or third quartile of values. Set to "None" to disable this filter.[optional]
feature_count: Int % Range(1, None) | Str % Choices('all'): Filter feature table to include top N most important features. Set to "all" to include all features.[default: 100]
missing_samples: Str % Choices('error', 'ignore'): How to handle missing samples in metadata. "error" will fail if missing samples are detected. "ignore" will cause the feature table and metadata to be filtered, so that only samples found in both files are retained.[default: 'error']

Outputs¶

visualization: Visualization: <no description>[required]
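
How importance_threshold and feature_count might combine can be sketched as below: features scoring below the threshold are excluded, then the top N remaining features are kept. This is an illustration under stated assumptions and not the plugin's code: the order of the two filters, the "inclusive" quartile interpolation, and the function name are all assumptions.

```python
from statistics import quantiles


def filter_importances(importances, importance_threshold=None,
                       feature_count=100):
    """importances: dict of feature ID -> importance score.

    Returns the retained feature IDs, most important first.
    """
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    if importance_threshold is not None:
        if importance_threshold in ("q1", "q2", "q3"):
            # Assumption: quartiles use the "inclusive" interpolation method.
            cuts = quantiles(importances.values(), n=4, method="inclusive")
            importance_threshold = cuts[int(importance_threshold[1]) - 1]
        # Exclude features scoring below the threshold.
        ranked = [(f, s) for f, s in ranked if s >= importance_threshold]
    if feature_count != "all":
        ranked = ranked[:feature_count]  # keep the top N features
    return [f for f, _ in ranked]


# Hypothetical importance scores.
scores = {"f1": 0.4, "f2": 0.3, "f3": 0.2, "f4": 0.1}
by_value = filter_importances(scores, importance_threshold=0.15)
by_median = filter_importances(scores, importance_threshold="q2")
top_two = filter_importances(scores, feature_count=2)
```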

longitudinal feature-volatility¶

Identify features that are predictive of a numeric metadata column, state_column (e.g., time), and plot their relative frequencies across states using interactive feature volatility plots. A supervised learning regressor is used to identify important features and assess their ability to predict sample states. state_column will typically be a measure of time, but any numeric metadata column can be used.

Citations¶

Bokulich et al., 2018; Subramanian et al., 2014; Bokulich et al., 2018

Inputs¶

table: FeatureTable[Frequency]: Feature table containing all features that should be used for target prediction.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata containing collection time (state) values for each sample. Must contain exclusively numeric values.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[optional]
cv: Int % Range(1, None): Number of k-fold cross-validations to perform.[default: 5]
random_state: Int: Seed used by random number generator.[optional]
n_jobs: Threads: Number of jobs to run in parallel.[default: 1]
n_estimators: Int % Range(1, None): Number of trees to grow for estimation. More trees will improve predictive accuracy up to a threshold level, but will also increase time and memory requirements. This parameter only affects ensemble estimators, such as Random Forest, AdaBoost, ExtraTrees, and GradientBoosting.[default: 100]
estimator: Str % Choices('RandomForestRegressor', 'ExtraTreesRegressor', 'GradientBoostingRegressor', 'AdaBoostRegressor', 'ElasticNet', 'Ridge', 'Lasso', 'KNeighborsRegressor', 'LinearSVR', 'SVR'): Estimator method to use for sample prediction.[default: 'RandomForestRegressor']
parameter_tuning: Bool: Automatically tune hyperparameters using random grid search.[default: False]
missing_samples: Str % Choices('error', 'ignore'): How to handle missing samples in metadata. "error" will fail if missing samples are detected. "ignore" will cause the feature table and metadata to be filtered, so that only samples found in both files are retained.[default: 'error']
importance_threshold: Float % Range(0, None, inclusive_start=False) | Str % Choices('q1', 'q2', 'q3'): Filter feature table to exclude any features with an importance score less than this threshold. Set to "q1", "q2", or "q3" to select the first, second, or third quartile of values. Set to "None" to disable this filter.[optional]
feature_count: Int % Range(1, None) | Str % Choices('all'): Filter feature table to include top N most important features. Set to "all" to include all features.[default: 100]

Outputs¶

filtered_table: FeatureTable[RelativeFrequency]: Feature table containing only important features.[required]
feature_importance: FeatureData[Importance]: Importance of each input feature to model accuracy.[required]
volatility_plot: Visualization: Interactive volatility plot visualization.[required]
accuracy_results: Visualization: Accuracy results visualization.[required]
sample_estimator: SampleEstimator[Regressor]: Trained sample regressor.[required]

longitudinal maturity-index¶

Calculates a "microbial maturity" index from a regression model trained on feature data to predict a given continuous metadata column, e.g., to predict age as a function of microbiota composition. The model is trained on a subset of control group samples, then predicts the column value for all samples. This visualization computes maturity index z-scores to compare relative "maturity" between each group, as described in doi:10.1038/nature13421. This method can be used to predict between-group differences in relative trajectory across any type of continuous metadata gradient, e.g., intestinal microbiome development by age, microbial succession during wine fermentation, or microbial community differences along environmental gradients, as a function of two or more different "treatment" groups.

Citations¶

Inputs¶

table: FeatureTable[Frequency]: Feature table containing all features that should be used for target prediction.[required]

Parameters¶

metadata: Metadata: <no description>[required]
state_column: Str: Numeric metadata column containing sampling time (state) data to use as prediction target.[required]
group_by: Str: Categorical metadata column to use for plotting and significance testing between main treatment groups.[required]
control: Str: Value of group_by to use as control group. The regression model will be trained using only control group data, and the maturity scores of other groups consequently will be assessed relative to this group.[required]
individual_id_column: Str: Optional metadata column containing IDs for individual subjects. Adds individual subject (spaghetti) vectors to volatility charts if a column name is provided.[optional]
estimator: Str % Choices('RandomForestRegressor', 'ExtraTreesRegressor', 'GradientBoostingRegressor', 'AdaBoostRegressor[DecisionTree]', 'AdaBoostRegressor[ExtraTrees]', 'ElasticNet', 'Ridge', 'Lasso', 'KNeighborsRegressor', 'LinearSVR', 'SVR'): Regression model to use for prediction.[default: 'RandomForestRegressor']
n_estimators: Int % Range(1, None): Number of trees to grow for estimation. More trees will improve predictive accuracy up to a threshold level, but will also increase time and memory requirements. This parameter only affects ensemble estimators, such as Random Forest, AdaBoost, ExtraTrees, and GradientBoosting.[default: 100]
test_size: Float % Range(0.0, 1.0): Fraction of input samples to exclude from training set and use for classifier testing.[default: 0.5]
step: Float % Range(0.0, 1.0, inclusive_start=False): If optimize_feature_selection is True, step is the percentage of features to remove at each iteration.[default: 0.05]
cv: Int % Range(1, None): Number of k-fold cross-validations to perform.[default: 5]
random_state: Int: Seed used by random number generator.[optional]
n_jobs: Threads: Number of jobs to run in parallel.[default: 1]
parameter_tuning: Bool: Automatically tune hyperparameters using random grid search.[default: False]
optimize_feature_selection: Bool: Automatically optimize input feature selection using recursive feature elimination.[default: False]
stratify: Bool: Evenly stratify training and test data among metadata categories. If True, all values in column must match at least two samples.[default: False]
missing_samples: Str % Choices('error', 'ignore'): How to handle missing samples in metadata. "error" will fail if missing samples are detected. "ignore" will cause the feature table and metadata to be filtered, so that only samples found in both files are retained.[default: 'error']
feature_count: Int % Range(0, None): Filter feature table to include top N most important features. Set to zero to include all features.[default: 50]

Outputs¶

sample_estimator: SampleEstimator[Regressor]: Trained sample estimator.[required]
feature_importance: FeatureData[Importance]: Importance of each input feature to model accuracy.[required]
predictions: SampleData[RegressorPredictions]: Predicted target values for each input sample.[required]
model_summary: Visualization: Summarized parameter and (if enabled) feature selection information for the trained estimator.[required]
accuracy_results: Visualization: Accuracy results visualization.[required]
maz_scores: SampleData[RegressorPredictions]: Microbiota-for-age z-score predictions.[required]
clustermap: Visualization: Heatmap of important feature abundance at each time point in each group.[required]
volatility_plots: Visualization: Interactive volatility plots of MAZ and maturity scores, target (column) predictions, and the sample metadata.[required]
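
The z-score idea behind maz_scores can be sketched as follows, assuming the usual microbiota-for-age form: a sample's predicted value minus the median prediction among control samples at the same state, divided by the standard deviation of those control predictions. The plugin's exact computation (for example, how states are binned or rounded) may differ; the function name, tuple layout, and data are hypothetical.

```python
from collections import defaultdict
from statistics import median, stdev


def maz_scores(samples, control):
    """samples: list of (sample_id, group, state, predicted_state) tuples.

    Returns {sample_id: z-score relative to the control group at that state}.
    """
    control_preds = defaultdict(list)
    for _, group, state, predicted in samples:
        if group == control:
            control_preds[state].append(predicted)

    scores = {}
    for sample_id, _, state, predicted in samples:
        reference = control_preds.get(state, [])
        if len(reference) < 2:
            continue  # a standard deviation needs at least two control samples
        sd = stdev(reference)
        if sd == 0:
            continue  # z-score is undefined without spread in the controls
        scores[sample_id] = (predicted - median(reference)) / sd
    return scores


# Hypothetical predictions: (sample ID, group, state, predicted state).
samples = [
    ("c1", "control", 1, 1.0), ("c2", "control", 1, 2.0),
    ("c3", "control", 1, 3.0), ("t1", "treated", 1, 4.0),
    ("t2", "treated", 2, 2.5),  # no control samples at state 2
]
scores = maz_scores(samples, control="control")
```

A positive score means the sample is predicted to be further along the gradient than control samples at the same state; a negative score means it lags behind.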

This QIIME 2 plugin supports methods for analysis of time series data, involving either paired sample comparisons or longitudinal study designs.

version: 2025.10.0.dev0
website: https://github.com/qiime2/q2-longitudinal
user support: Please post to the QIIME 2 forum for help with this plugin: https://forum.qiime2.org
citations: Bokulich et al., 2018

Actions¶

Name	Type	Short Description
nmit	method	Nonparametric microbial interdependence test
first-differences	method	Compute first differences or difference from baseline between sequential states
first-distances	method	Compute first distances or distance from baseline between sequential states
pairwise-differences	visualizer	Paired difference testing and boxplots
pairwise-distances	visualizer	Paired pairwise distance testing and boxplots
linear-mixed-effects	visualizer	Linear mixed effects modeling
anova	visualizer	ANOVA test
volatility	visualizer	Generate interactive volatility plot
plot-feature-volatility	visualizer	Plot longitudinal feature volatility and importances
feature-volatility	pipeline	Feature volatility analysis
maturity-index	pipeline	Microbial maturity index prediction

Artifact Classes¶

Formats¶

longitudinal nmit¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table to use for microbial interdependence test.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
corr_method: Str % Choices('kendall', 'pearson', 'spearman'): The temporal correlation test to be applied.[default: 'kendall']
dist_method: Str % Choices('fro', 'nuc'): Temporal distance method, see numpy.linalg.norm for details.[default: 'fro']

Outputs¶

distance_matrix: DistanceMatrix: The resulting distance matrix.[required]

longitudinal first-differences¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table to optionally use for computing first differences.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
metric: Str: Numerical metadata or artifact column to test.[required]
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']
baseline: Float: A value listed in the state_column metadata column against which all other states should be compared. Toggles calculation of static differences instead of first differences (which are calculated if no value is given for baseline). If a "baseline" value is provided, sample differences at each state are compared against the baseline state, instead of the previous state. Must be a value listed in the state_column.[optional]

Outputs¶

first_differences: SampleData[FirstDifferences]: Series of first differences.[required]

longitudinal first-distances¶

Citations¶

Inputs¶

distance_matrix: DistanceMatrix: Matrix of distances between pairs of samples.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
baseline: Float: A value listed in the state_column metadata column against which all other states should be compared. Toggles calculation of static distances instead of first distances (which are calculated if no value is given for baseline). If a "baseline" value is provided, sample distances at each state are compared against the baseline state, instead of the previous state. Must be a value listed in the state_column.[optional]
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']

Outputs¶

first_distances: SampleData[FirstDifferences]: Series of first distances.[required]
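
First distances follow the same pairing logic as first differences, except that each value is looked up in a distance matrix instead of being computed by subtraction. The sketch below illustrates that idea under simplifying assumptions: the distance matrix is represented as a dictionary keyed by unordered sample-ID pairs, there are no replicate samples, and all names and data are hypothetical rather than taken from the plugin.

```python
def first_distances(distances, samples, baseline=None):
    """Look up per-subject distances between states.

    distances: {frozenset({id_1, id_2}): distance}
    samples: list of (sample_id, individual_id, state) tuples.
    Returns {sample_id: distance}, labeled by the later (or non-baseline) sample.
    """
    by_subject = {}
    for sample_id, individual, state in samples:
        by_subject.setdefault(individual, []).append((state, sample_id))

    result = {}
    for records in by_subject.values():
        records.sort()  # order each subject's samples by state
        if baseline is None:
            # Pair each sample with the sample from the previous state.
            pairs = list(zip(records, records[1:]))
        else:
            # Pair every non-baseline sample with the baseline sample.
            base = [r for r in records if r[0] == baseline]
            pairs = [(base[0], r) for r in records if r[0] != baseline] if base else []
        for (_, reference_id), (_, sample_id) in pairs:
            result[sample_id] = distances[frozenset((reference_id, sample_id))]
    return result


distances = {
    frozenset(("s1", "s2")): 0.4,
    frozenset(("s2", "s3")): 0.1,
    frozenset(("s1", "s3")): 0.5,
}
samples = [("s1", "subj-A", 0.0), ("s2", "subj-A", 1.0), ("s3", "subj-A", 2.0)]
print(first_distances(distances, samples))                # {'s2': 0.4, 's3': 0.1}
print(first_distances(distances, samples, baseline=0.0))  # {'s2': 0.4, 's3': 0.5}
```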

longitudinal pairwise-differences¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table to optionally use for paired comparisons.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
metric: Str: Numerical metadata or artifact column to test.[required]
state_column: Str: Metadata column containing state (e.g., Time) across which samples are paired.[required]
state_1: Str: Baseline state column value.[required]
state_2: Str: State column value to pair with baseline.[required]
individual_id_column: Str: Metadata column containing subject IDs to use for pairing samples. WARNING: if replicates exist for an individual ID at either state_1 or state_2, that subject will be dropped and reported in standard output by default. Set replicate_handling="random" to instead randomly select one member.[required]
group_column: Str: Metadata column on which to separate groups for comparison.[optional]
parametric: Bool: Perform parametric (ANOVA and t-tests) or non-parametric (Kruskal-Wallis, Wilcoxon, and Mann-Whitney U tests) statistical tests.[default: False]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow', 'cividis'): Color palette to use for generating boxplots.[default: 'Set1']
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']

Outputs¶

visualization: Visualization: <no description>[required]
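
The three replicate_handling choices can be pictured with a small sketch. It assumes a hypothetical data layout (one list of observed values per subject at a given state) and computes paired differences as state_2 minus state_1, which is an assumption about direction rather than a statement of the plugin's behavior.

```python
import random


def resolve_replicates(values_by_subject, replicate_handling="error", seed=None):
    """Reduce each subject's non-empty list of values at one state to a single value.

    values_by_subject: {individual_id: [values observed at this state]}
    """
    rng = random.Random(seed)
    resolved = {}
    for subject, values in values_by_subject.items():
        if len(values) == 1:
            resolved[subject] = values[0]
        elif replicate_handling == "error":
            raise ValueError(f"replicate samples detected for subject {subject!r}")
        elif replicate_handling == "drop":
            continue  # discard every replicated sample for this subject
        elif replicate_handling == "random":
            resolved[subject] = rng.choice(values)  # keep one representative
        else:
            raise ValueError(f"unknown replicate_handling: {replicate_handling!r}")
    return resolved


def paired_differences(state_1, state_2):
    """state_2 minus state_1 for subjects that have a value at both states."""
    return {s: state_2[s] - state_1[s] for s in state_1 if s in state_2}


before = {"A": [1.0], "B": [2.0, 2.5], "C": [3.0]}  # subject B has replicates
after = {"A": [1.5], "B": [2.0], "C": [2.0]}

diffs = paired_differences(
    resolve_replicates(before, "drop"),
    resolve_replicates(after, "drop"),
)
print(diffs)  # {'A': 0.5, 'C': -1.0}
```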

longitudinal pairwise-distances¶

Citations¶

Inputs¶

distance_matrix: DistanceMatrix: Matrix of distances between pairs of samples.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
group_column: Str: Metadata column on which to separate groups for comparison.[required]
state_column: Str: Metadata column containing state (e.g., Time) across which samples are paired.[required]
state_1: Str: Baseline state column value.[required]
state_2: Str: State column value to pair with baseline.[required]
individual_id_column: Str: Metadata column containing subject IDs to use for pairing samples. WARNING: if replicates exist for an individual ID at either state_1 or state_2, that subject will be dropped and reported in standard output by default. Set replicate_handling="random" to instead randomly select one member.[required]
parametric: Bool: Perform parametric (ANOVA and t-tests) or non-parametric (Kruskal-Wallis, Wilcoxon, and Mann-Whitney U tests) statistical tests.[default: False]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow', 'cividis'): Color palette to use for generating boxplots.[default: 'Set1']
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']

Outputs¶

visualization: Visualization: <no description>[required]

longitudinal linear-mixed-effects¶

Citations¶

Bokulich et al., 2018; Seabold & Perktold, 2010

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table containing metric.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
metric: Str: Dependent variable column name. Must be a column name located in the metadata or feature table files.[optional]
group_columns: Str: Comma-separated list (without spaces) of metadata columns to use as independent covariates used to determine mean structure of "metric".[optional]
random_effects: Str: Comma-separated list (without spaces) of metadata columns to use as independent covariates used to determine the variance and covariance structure (random effects) of "metric". To add a random slope, the same value passed to "state_column" should be passed here. A random intercept for each individual is set by default and does not need to be passed here.[optional]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow', 'cividis'): Color palette to use for generating boxplots.[default: 'Set1']
lowess: Bool: Estimate locally weighted scatterplot smoothing. Note that this will eliminate confidence interval plotting.[default: False]
ci: Float % Range(0, 100): Size of the confidence interval for the regression estimate.[default: 95]
formula: Str: R-style formula to use for model specification. A formula must be used if the "metric" parameter is None. Note that the metric and group columns specified in the formula will override metric and group columns that are passed separately as parameters to this method. Formulae will be in the format "a ~ b + c", where "a" is the metric (dependent variable) and "b" and "c" are independent covariates. Use "+" to add a variable; "+ a:b" to add an interaction between variables a and b; "*" to include a variable and all interactions; and "-" to subtract a particular term (e.g., an interaction term). See https://patsy.readthedocs.io/en/latest/formulas.html for full documentation of valid formula operators. Always enclose formulae in quotes to avoid unpleasant surprises.[optional]

Outputs¶

visualization: Visualization: <no description>[required]
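
To make the "*" formula operator concrete, the sketch below expands a product of variables into the main effects and interaction terms it stands for, following the standard R-style rule that "a * b" is shorthand for "a + b + a:b". The helper names and the column names (shannon, month, diet) are hypothetical and serve only as an illustration; patsy performs this expansion internally.

```python
from itertools import combinations


def expand_star(variables):
    """Expand a * b * ... into all main effects and interactions, lowest order first."""
    terms = []
    for size in range(1, len(variables) + 1):
        for combo in combinations(variables, size):
            terms.append(":".join(combo))
    return terms


def build_formula(metric, variables):
    """Write out the fully expanded formula for metric ~ v1 * v2 * ..."""
    return f"{metric} ~ {' + '.join(expand_star(variables))}"


print(build_formula("shannon", ["month", "diet"]))
# shannon ~ month + diet + month:diet
```

Removing a term with "-" corresponds to dropping one entry from the expanded list, for example "shannon ~ month * diet - month:diet" leaves only the two main effects.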

longitudinal anova¶

Citations¶

Parameters¶

metadata: Metadata: Sample metadata containing formula terms.[required]
formula: Str: R-style formula specifying the model. All terms must be present in the sample metadata or metadata-transformable artifacts and can be continuous or categorical metadata columns. Formulae will be in the format "a ~ b + c", where "a" is the metric (dependent variable) and "b" and "c" are independent covariates. Use "+" to add a variable; "+ a:b" to add an interaction between variables a and b; "*" to include a variable and all interactions; and "-" to subtract a particular term (e.g., an interaction term). See https://patsy.readthedocs.io/en/latest/formulas.html for full documentation of valid formula operators. Always enclose formulae in quotes to avoid unpleasant surprises.[required]
sstype: Str % Choices('I', 'II', 'III'): Type of sum of squares calculation to perform (I, II, or III).[default: 'II']
repeated_measures: Bool: Perform ANOVA as a repeated measures ANOVA. Implemented via statsmodels, which has the following limitations: Currently, only fully balanced within-subject designs are supported. Calculation of between-subject effects and corrections for violation of sphericity are not yet implemented.[default: False]
individual_id_column: Str: The column containing individual IDs with repeated measures to account for. This should not be included in the formula.[optional]
rm_aggregate: Bool: If the data set contains more than a single observation per individual ID and cell of the specified model, aggregate the data by the mean before running the ANOVA. Only applicable for repeated measures ANOVA.[default: False]

Outputs¶

visualization: Visualization: <no description>[required]
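
The balance requirement for repeated measures ANOVA and the effect of rm_aggregate can be sketched as follows. This is an illustrative example with hypothetical helper names and data: it checks that every individual has exactly one observation in every cell, and shows how averaging duplicate observations can restore balance. It does not reproduce the statsmodels implementation.

```python
from collections import defaultdict
from statistics import mean


def is_balanced(observations):
    """True if every individual has exactly one observation in every cell.

    observations: list of (individual_id, cell, value) tuples.
    """
    counts = defaultdict(int)
    for individual, cell, _ in observations:
        counts[(individual, cell)] += 1
    individuals = {individual for individual, _ in counts}
    cells = {cell for _, cell in counts}
    return all(counts.get((i, c), 0) == 1 for i in individuals for c in cells)


def aggregate_by_mean(observations):
    """Collapse repeated observations per (individual, cell) to their mean."""
    grouped = defaultdict(list)
    for individual, cell, value in observations:
        grouped[(individual, cell)].append(value)
    return [(i, c, mean(values)) for (i, c), values in grouped.items()]


observations = [
    ("A", "t0", 1.0), ("A", "t0", 3.0),  # two observations in the same cell
    ("A", "t1", 2.0),
    ("B", "t0", 4.0),
    ("B", "t1", 5.0),
]
print(is_balanced(observations))  # False
aggregated = aggregate_by_mean(observations)
print(is_balanced(aggregated))    # True
```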

longitudinal volatility¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table containing metrics.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[optional]
default_group_column: Str: The default metadata column on which to separate groups for comparison (all categorical metadata columns will be available in the visualization).[optional]
default_metric: Str: Numeric metadata or artifact column to test by default (all numeric metadata columns will be available in the visualization).[optional]
yscale: Str % Choices('linear', 'pow', 'sqrt', 'log'): y-axis scaling strategy to apply.[default: 'linear']

Outputs¶

visualization: Visualization: <no description>[required]

Examples¶

longitudinal_volatility¶

[Command Line]

wget -O 'metadata.tsv' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/longitudinal/volatility/1/metadata.tsv'

qiime longitudinal volatility \
  --m-metadata-file metadata.tsv \
  --p-state-column month \
  --o-visualization volatility-plot.qzv

[Python API]

from qiime2 import Metadata
from urllib import request
import qiime2.plugins.longitudinal.actions as longitudinal_actions

url = 'https://amplicon-docs.qiime2.org/en/latest/data/examples/longitudinal/volatility/1/metadata.tsv'
fn = 'metadata.tsv'
request.urlretrieve(url, fn)
metadata_md = Metadata.load(fn)

volatility_plot_viz, = longitudinal_actions.volatility(
    metadata=metadata_md,
    state_column='month',
)

[Galaxy]

Using the Upload Data tool:

On the first tab (Regular), press the Paste/Fetch data button at the bottom.
1. Set "Name" (first text-field) to: metadata.tsv
2. In the larger text-area, copy-and-paste: https://amplicon-docs.qiime2.org/en/latest/data/examples/longitudinal/volatility/1/metadata.tsv
3. ("Type", "Genome", and "Settings" can be ignored)
Press the Start button at the bottom.

Using the qiime2 longitudinal volatility tool:

For "metadata":
- Perform the following steps.
  1. Leave as Metadata from TSV
  2. Set "Metadata Source" to metadata.tsv
Set "state_column" to month
Press the Execute button.

Once completed, for the new entry in your history, use the Edit button to set the name as follows:

(Renaming is optional, but it will make any subsequent steps easier to complete.)

History Name	"Name" to set (be sure to press [Save])
`#: qiime2 longitudinal volatility [...] : visualization.qzv`	`volatility-plot.qzv`

[R API]

library(reticulate)

Metadata <- import("qiime2")$Metadata
longitudinal_actions <- import("qiime2.plugins.longitudinal.actions")
request <- import("urllib")$request

url <- 'https://amplicon-docs.qiime2.org/en/latest/data/examples/longitudinal/volatility/1/metadata.tsv'
fn <- 'metadata.tsv'
request$urlretrieve(url, fn)
metadata_md <- Metadata$load(fn)

action_results <- longitudinal_actions$volatility(
    metadata=metadata_md,
    state_column='month'
)
volatility_plot_viz <- action_results$visualization

longitudinal plot-feature-volatility¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table containing features found in importances.[required]
importances: FeatureData[Importance]: Feature importance scores.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[optional]
default_group_column: Str: The default metadata column on which to separate groups for comparison (all categorical metadata columns will be available in the visualization).[optional]
yscale: Str % Choices('linear', 'pow', 'sqrt', 'log'): y-axis scaling strategy to apply.[default: 'linear']
importance_threshold: Float % Range(0, None, inclusive_start=False) | Str % Choices('q1', 'q2', 'q3'): Filter feature table to exclude any features with an importance score less than this threshold. Set to "q1", "q2", or "q3" to select the first, second, or third quartile of values. Set to "None" to disable this filter.[optional]
feature_count: Int % Range(1, None) | Str % Choices('all'): Filter feature table to include top N most important features. Set to "all" to include all features.[default: 100]
missing_samples: Str % Choices('error', 'ignore'): How to handle missing samples in metadata. "error" will fail if missing samples are detected. "ignore" will cause the feature table and metadata to be filtered, so that only samples found in both files are retained.[default: 'error']

Outputs¶

visualization: Visualization: <no description>[required]
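
The interaction between importance_threshold and feature_count can be sketched as a two-step filter. The example below is a hedged illustration, not the plugin's code: the function name and data are hypothetical, the threshold is assumed to be applied before the top-N cut, and quartiles are computed with the standard library's "inclusive" method, which may not match the plugin's quantile definition exactly.

```python
from statistics import quantiles


def filter_features(importances, importance_threshold=None, feature_count=100):
    """Return retained feature IDs, most important first.

    importances: {feature_id: importance score}
    importance_threshold: None, a number, or one of "q1", "q2", "q3"
    feature_count: an integer N, or "all"
    """
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    if importance_threshold in ("q1", "q2", "q3"):
        cuts = quantiles(importances.values(), n=4, method="inclusive")
        importance_threshold = cuts[int(importance_threshold[1]) - 1]
    if importance_threshold is not None:
        # Exclude features scoring below the threshold.
        ranked = [(f, score) for f, score in ranked if score >= importance_threshold]
    if feature_count != "all":
        ranked = ranked[:feature_count]  # keep the top N features
    return [f for f, _ in ranked]


importances = {"f1": 0.5, "f2": 0.3, "f3": 0.1, "f4": 0.05, "f5": 0.05}
print(filter_features(importances, importance_threshold=0.1, feature_count=2))
# ['f1', 'f2']
print(filter_features(importances, importance_threshold="q2", feature_count="all"))
# ['f1', 'f2', 'f3']
```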

longitudinal feature-volatility¶

Citations¶

Bokulich et al., 2018; Subramanian et al., 2014; Bokulich et al., 2018

Inputs¶

table: FeatureTable[Frequency]: Feature table containing all features that should be used for target prediction.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata containing collection time (state) values for each sample. Must contain exclusively numeric values.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[optional]
cv: Int % Range(1, None): Number of k-fold cross-validations to perform.[default: 5]
random_state: Int: Seed used by random number generator.[optional]
n_jobs: Threads: Number of jobs to run in parallel.[default: 1]
n_estimators: Int % Range(1, None): Number of trees to grow for estimation. More trees will improve predictive accuracy up to a threshold level, but will also increase time and memory requirements. This parameter only affects ensemble estimators, such as Random Forest, AdaBoost, ExtraTrees, and GradientBoosting.[default: 100]
estimator: Str % Choices('RandomForestRegressor', 'ExtraTreesRegressor', 'GradientBoostingRegressor', 'AdaBoostRegressor', 'ElasticNet', 'Ridge', 'Lasso', 'KNeighborsRegressor', 'LinearSVR', 'SVR'): Estimator method to use for sample prediction.[default: 'RandomForestRegressor']
parameter_tuning: Bool: Automatically tune hyperparameters using random grid search.[default: False]
missing_samples: Str % Choices('error', 'ignore'): How to handle missing samples in metadata. "error" will fail if missing samples are detected. "ignore" will cause the feature table and metadata to be filtered, so that only samples found in both files are retained.[default: 'error']
importance_threshold: Float % Range(0, None, inclusive_start=False) | Str % Choices('q1', 'q2', 'q3'): Filter feature table to exclude any features with an importance score less than this threshold. Set to "q1", "q2", or "q3" to select the first, second, or third quartile of values. Set to "None" to disable this filter.[optional]
feature_count: Int % Range(1, None) | Str % Choices('all'): Filter feature table to include top N most important features. Set to "all" to include all features.[default: 100]

Outputs¶

filtered_table: FeatureTable[RelativeFrequency]: Feature table containing only important features.[required]
feature_importance: FeatureData[Importance]: Importance of each input feature to model accuracy.[required]
volatility_plot: Visualization: Interactive volatility plot visualization.[required]
accuracy_results: Visualization: Accuracy results visualization.[required]
sample_estimator: SampleEstimator[Regressor]: Trained sample regressor.[required]

longitudinal maturity-index¶

Citations¶

Inputs¶

table: FeatureTable[Frequency]: Feature table containing all features that should be used for target prediction.[required]

Parameters¶

metadata: Metadata: <no description>[required]
state_column: Str: Numeric metadata column containing sampling time (state) data to use as prediction target.[required]
group_by: Str: Categorical metadata column to use for plotting and significance testing between main treatment groups.[required]
control: Str: Value of group_by to use as control group. The regression model will be trained using only control group data, and the maturity scores of other groups consequently will be assessed relative to this group.[required]
individual_id_column: Str: Optional metadata column containing IDs for individual subjects. Adds individual subject (spaghetti) vectors to volatility charts if a column name is provided.[optional]
estimator: Str % Choices('RandomForestRegressor', 'ExtraTreesRegressor', 'GradientBoostingRegressor', 'AdaBoostRegressor[DecisionTree]', 'AdaBoostRegressor[ExtraTrees]', 'ElasticNet', 'Ridge', 'Lasso', 'KNeighborsRegressor', 'LinearSVR', 'SVR'): Regression model to use for prediction.[default: 'RandomForestRegressor']
n_estimators: Int % Range(1, None): Number of trees to grow for estimation. More trees will improve predictive accuracy up to a threshold level, but will also increase time and memory requirements. This parameter only affects ensemble estimators, such as Random Forest, AdaBoost, ExtraTrees, and GradientBoosting.[default: 100]
test_size: Float % Range(0.0, 1.0): Fraction of input samples to exclude from training set and use for classifier testing.[default: 0.5]
step: Float % Range(0.0, 1.0, inclusive_start=False): If optimize_feature_selection is True, step is the percentage of features to remove at each iteration.[default: 0.05]
cv: Int % Range(1, None): Number of k-fold cross-validations to perform.[default: 5]
random_state: Int: Seed used by random number generator.[optional]
n_jobs: Threads: Number of jobs to run in parallel.[default: 1]
parameter_tuning: Bool: Automatically tune hyperparameters using random grid search.[default: False]
optimize_feature_selection: Bool: Automatically optimize input feature selection using recursive feature elimination.[default: False]
stratify: Bool: Evenly stratify training and test data among metadata categories. If True, all values in column must match at least two samples.[default: False]
missing_samples: Str % Choices('error', 'ignore'): How to handle missing samples in metadata. "error" will fail if missing samples are detected. "ignore" will cause the feature table and metadata to be filtered, so that only samples found in both files are retained.[default: 'error']
feature_count: Int % Range(0, None): Filter feature table to include top N most important features. Set to zero to include all features.[default: 50]

Outputs¶

sample_estimator: SampleEstimator[Regressor]: Trained sample estimator.[required]
feature_importance: FeatureData[Importance]: Importance of each input feature to model accuracy.[required]
predictions: SampleData[RegressorPredictions]: Predicted target values for each input sample.[required]
model_summary: Visualization: Summarized parameter and (if enabled) feature selection information for the trained estimator.[required]
accuracy_results: Visualization: Accuracy results visualization.[required]
maz_scores: SampleData[RegressorPredictions]: Microbiota-for-age z-score predictions.[required]
clustermap: Visualization: Heatmap of important feature abundance at each time point in each group.[required]
volatility_plots: Visualization: Interactive volatility plots of MAZ and maturity scores, target (column) predictions, and the sample metadata.[required]
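
As a rough guide to what a microbiota-for-age z-score (MAZ) expresses, the sketch below assumes one common definition: a sample's predicted maturity minus the median predicted maturity of control samples at the same state, divided by the standard deviation of those control predictions. The function name, tuple layout, and data are hypothetical, and the plugin's exact calculation (for example, how states are binned) may differ.

```python
from statistics import median, stdev


def maz_scores(samples, control):
    """Compute MAZ scores relative to the control group at each state.

    samples: list of (sample_id, group, state, predicted_maturity) tuples.
    control: value of the grouping column that identifies control samples.
    """
    control_by_state = {}
    for _, group, state, predicted in samples:
        if group == control:
            control_by_state.setdefault(state, []).append(predicted)

    scores = {}
    for sample_id, _, state, predicted in samples:
        reference = control_by_state.get(state, [])
        if len(reference) < 2:
            continue  # at least two control samples are needed to estimate spread
        sd = stdev(reference)
        if sd == 0:
            continue  # z-score is undefined when controls do not vary
        scores[sample_id] = (predicted - median(reference)) / sd
    return scores


samples = [
    ("c1", "control", 1, 1.0),
    ("c2", "control", 1, 2.0),
    ("c3", "control", 1, 3.0),
    ("t1", "treated", 1, 4.0),
]
print(maz_scores(samples, control="control"))
# {'c1': -1.0, 'c2': 0.0, 'c3': 1.0, 't1': 2.0}
```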

This QIIME 2 plugin supports methods for analysis of time series data, involving either paired sample comparisons or longitudinal study designs.

version: 2025.10.0.dev0
website: https://github.com/qiime2/q2-longitudinal
user support: Please post to the QIIME 2 forum for help with this plugin: https://forum.qiime2.org
citations: Bokulich et al., 2018

Actions¶

Name	Type	Short Description
nmit	method	Nonparametric microbial interdependence test
first-differences	method	Compute first differences or difference from baseline between sequential states
first-distances	method	Compute first distances or distance from baseline between sequential states
pairwise-differences	visualizer	Paired difference testing and boxplots
pairwise-distances	visualizer	Paired pairwise distance testing and boxplots
linear-mixed-effects	visualizer	Linear mixed effects modeling
anova	visualizer	ANOVA test
volatility	visualizer	Generate interactive volatility plot
plot-feature-volatility	visualizer	Plot longitudinal feature volatility and importances
feature-volatility	pipeline	Feature volatility analysis
maturity-index	pipeline	Microbial maturity index prediction.

Artifact Classes¶

Formats¶

longitudinal nmit¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table to use for microbial interdependence test.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
corr_method: Str % Choices('kendall', 'pearson', 'spearman'): The temporal correlation test to be applied.[default: 'kendall']
dist_method: Str % Choices('fro', 'nuc'): Temporal distance method, see numpy.linalg.norm for details.[default: 'fro']

Outputs¶

distance_matrix: DistanceMatrix: The resulting distance matrix.[required]

longitudinal first-differences¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table to optionally use for computing first differences.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
metric: Str: Numerical metadata or artifact column to test.[required]
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']
baseline: Float: A value listed in the state_column metadata column against which all other states should be compared. Toggles calculation of static differences instead of first differences (which are calculated if no value is given for baseline). If a "baseline" value is provided, sample differences at each state are compared against the baseline state, instead of the previous state. Must be a value listed in the state_column.[optional]

Outputs¶

first_differences: SampleData[FirstDifferences]: Series of first differences.[required]

longitudinal first-distances¶

Citations¶

Inputs¶

distance_matrix: DistanceMatrix: Matrix of distances between pairs of samples.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
baseline: Float: A value listed in the state_column metadata column against which all other states should be compared. Toggles calculation of static distances instead of first distances (which are calculated if no value is given for baseline). If a "baseline" value is provided, sample distances at each state are compared against the baseline state, instead of the previous state. Must be a value listed in the state_column.[optional]
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']

Outputs¶

first_distances: SampleData[FirstDifferences]: Series of first distances.[required]

longitudinal pairwise-differences¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table to optionally use for paired comparisons.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
metric: Str: Numerical metadata or artifact column to test.[required]
state_column: Str: Metadata column containing state (e.g., Time) across which samples are paired.[required]
state_1: Str: Baseline state column value.[required]
state_2: Str: State column value to pair with baseline.[required]
individual_id_column: Str: Metadata column containing subject IDs to use for pairing samples. WARNING: if replicates exist for an individual ID at either state_1 or state_2, that subject will be dropped and reported in standard output by default. Set replicate_handling="random" to instead randomly select one member.[required]
group_column: Str: Metadata column on which to separate groups for comparison[optional]
parametric: Bool: Perform parametric (ANOVA and t-tests) or non-parametric (Kruskal-Wallis, Wilcoxon, and Mann-Whitney U tests) statistical tests.[default: False]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow', 'cividis'): Color palette to use for generating boxplots.[default: 'Set1']
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']

Outputs¶

visualization: Visualization: <no description>[required]

longitudinal pairwise-distances¶

Citations¶

Inputs¶

distance_matrix: DistanceMatrix: Matrix of distances between pairs of samples.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
group_column: Str: Metadata column on which to separate groups for comparison[required]
state_column: Str: Metadata column containing state (e.g., Time) across which samples are paired.[required]
state_1: Str: Baseline state column value.[required]
state_2: Str: State column value to pair with baseline.[required]
individual_id_column: Str: Metadata column containing subject IDs to use for pairing samples. WARNING: if replicates exist for an individual ID at either state_1 or state_2, that subject will be dropped and reported in standard output by default. Set replicate_handling="random" to instead randomly select one member.[required]
parametric: Bool: Perform parametric (ANOVA and t-tests) or non-parametric (Kruskal-Wallis, Wilcoxon, and Mann-Whitney U tests) statistical tests.[default: False]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow', 'cividis'): Color palette to use for generating boxplots.[default: 'Set1']
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']

Outputs¶

visualization: Visualization: <no description>[required]

longitudinal linear-mixed-effects¶

Citations¶

Bokulich et al., 2018; Seabold & Perktold, 2010

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table containing metric.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
metric: Str: Dependent variable column name. Must be a column name located in the metadata or feature table files.[optional]
group_columns: Str: Comma-separated list (without spaces) of metadata columns to use as independent covariates used to determine mean structure of "metric".[optional]
random_effects: Str: Comma-separated list (without spaces) of metadata columns to use as independent covariates used to determine the variance and covariance structure (random effects) of "metric". To add a random slope, the same value passed to "state_column" should be passed here. A random intercept for each individual is set by default and does not need to be passed here.[optional]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow', 'cividis'): Color palette to use for generating boxplots.[default: 'Set1']
lowess: Bool: Estimate locally weighted scatterplot smoothing. Note that this will eliminate confidence interval plotting.[default: False]
ci: Float % Range(0, 100): Size of the confidence interval for the regression estimate.[default: 95]
formula: Str: R-style formula to use for model specification. A formula must be used if the "metric" parameter is None. Note that the metric and group columns specified in the formula will override metric and group columns that are passed separately as parameters to this method. Formulae will be in the format "a ~ b + c", where "a" is the metric (dependent variable) and "b" and "c" are independent covariates. Use "+" to add a variable; "+ a:b" to add an interaction between variables a and b; "*" to include a variable and all interactions; and "-" to subtract a particular term (e.g., an interaction term). See https://patsy.readthedocs.io/en/latest/formulas.html for full documentation of valid formula operators. Always enclose formulae in quotes to avoid unpleasant surprises.[optional]

Outputs¶

visualization: Visualization: <no description>[required]

longitudinal anova¶

Citations¶

Parameters¶

metadata: Metadata: Sample metadata containing formula terms.[required]
formula: Str: R-style formula specifying the model. All terms must be present in the sample metadata or metadata-transformable artifacts and can be continuous or categorical metadata columns. Formulae will be in the format "a ~ b + c", where "a" is the metric (dependent variable) and "b" and "c" are independent covariates. Use "+" to add a variable; "+ a:b" to add an interaction between variables a and b; "*" to include a variable and all interactions; and "-" to subtract a particular term (e.g., an interaction term). See https://patsy.readthedocs.io/en/latest/formulas.html for full documentation of valid formula operators. Always enclose formulae in quotes to avoid unpleasant surprises.[required]
sstype: Str % Choices('I', 'II', 'III'): Type of sum of squares calculation to perform (I, II, or III).[default: 'II']
repeated_measures: Bool: Perform ANOVA as a repeated measures ANOVA. Implemented via statsmodels, which has the following limitations: Currently, only fully balanced within-subject designs are supported. Calculation of between-subject effects and corrections for violation of sphericity are not yet implemented.[default: False]
individual_id_column: Str: The column containing individual ID with repeated measures to account for.This should not be included in the formula.[optional]
rm_aggregate: Bool: If the data set contains more than a single observation per individual ID and cell of the specified model, aggregate the data by the mean before running the ANOVA. Only applicable for repeated measures ANOVA.[default: False]
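
The aggregation that rm_aggregate describes can be sketched as follows. This is an illustration only, assuming each observation is identified by an individual ID and a model cell; the function name and the data are hypothetical, and the plugin delegates the real work to statsmodels.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_mean(observations):
    """Average repeated observations that share the same (individual, cell) key.

    observations: iterable of (individual_id, cell, value) tuples.
    """
    grouped = defaultdict(list)
    for individual_id, cell, value in observations:
        grouped[(individual_id, cell)].append(value)
    return {key: mean(values) for key, values in grouped.items()}


# Hypothetical data: subject "s1" has two observations in the "month1" cell.
observations = [
    ("s1", "month1", 2.0),
    ("s1", "month1", 4.0),
    ("s1", "month2", 5.0),
    ("s2", "month1", 1.0),
    ("s2", "month2", 3.0),
]
aggregated = aggregate_by_mean(observations)
```

After aggregation every individual contributes exactly one value per cell, which is what a fully balanced within-subject design requires.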

Outputs¶

visualization: Visualization: <no description>[required]

longitudinal volatility¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table containing metrics.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[optional]
default_group_column: Str: The default metadata column on which to separate groups for comparison (all categorical metadata columns will be available in the visualization).[optional]
default_metric: Str: Numeric metadata or artifact column to test by default (all numeric metadata columns will be available in the visualization).[optional]
yscale: Str % Choices('linear', 'pow', 'sqrt', 'log'): y-axis scaling strategy to apply.[default: 'linear']

Outputs¶

visualization: Visualization: <no description>[required]

Examples¶

longitudinal_volatility¶

[Command Line]

# Download the example metadata.
wget -O 'metadata.tsv' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/longitudinal/volatility/1/metadata.tsv'

# Generate the volatility plot, using "month" as the state column.
qiime longitudinal volatility \
  --m-metadata-file metadata.tsv \
  --p-state-column month \
  --o-visualization volatility-plot.qzv

[Python API]

from qiime2 import Metadata
from urllib import request
import qiime2.plugins.longitudinal.actions as longitudinal_actions

# Download and load the example metadata.
url = 'https://amplicon-docs.qiime2.org/en/latest/data/examples/longitudinal/volatility/1/metadata.tsv'
fn = 'metadata.tsv'
request.urlretrieve(url, fn)
metadata_md = Metadata.load(fn)

# Generate the volatility plot, using "month" as the state column.
volatility_plot_viz, = longitudinal_actions.volatility(
    metadata=metadata_md,
    state_column='month',
)

[Galaxy]

Using the Upload Data tool:

On the first tab (Regular), press the Paste/Fetch data button at the bottom.
1. Set "Name" (first text-field) to: metadata.tsv
2. In the larger text-area, copy-and-paste: https://amplicon-docs.qiime2.org/en/latest/data/examples/longitudinal/volatility/1/metadata.tsv
3. ("Type", "Genome", and "Settings" can be ignored)
Press the Start button at the bottom.

Using the qiime2 longitudinal volatility tool:

For "metadata":
- Perform the following steps.
  1. Leave as Metadata from TSV
  2. Set "Metadata Source" to metadata.tsv
Set "state_column" to month
Press the Execute button.

Once completed, for the new entry in your history, use the Edit button to set the name as follows:

(Renaming is optional, but it will make any subsequent steps easier to complete.)

History Name	"Name" to set (be sure to press [Save])
`#: qiime2 longitudinal volatility [...] : visualization.qzv`	`volatility-plot.qzv`

[R API]

library(reticulate)

Metadata <- import("qiime2")$Metadata
longitudinal_actions <- import("qiime2.plugins.longitudinal.actions")
request <- import("urllib")$request

# Download and load the example metadata.
url <- 'https://amplicon-docs.qiime2.org/en/latest/data/examples/longitudinal/volatility/1/metadata.tsv'
fn <- 'metadata.tsv'
request$urlretrieve(url, fn)
metadata_md <- Metadata$load(fn)

# Generate the volatility plot, using "month" as the state column.
action_results <- longitudinal_actions$volatility(
    metadata=metadata_md,
    state_column='month',
)
volatility_plot_viz <- action_results$visualization

longitudinal plot-feature-volatility¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table containing features found in importances.[required]
importances: FeatureData[Importance]: Feature importance scores.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[optional]
default_group_column: Str: The default metadata column on which to separate groups for comparison (all categorical metadata columns will be available in the visualization).[optional]
yscale: Str % Choices('linear', 'pow', 'sqrt', 'log'): y-axis scaling strategy to apply.[default: 'linear']
importance_threshold: Float % Range(0, None, inclusive_start=False) | Str % Choices('q1', 'q2', 'q3'): Filter feature table to exclude any features with an importance score less than this threshold. Set to "q1", "q2", or "q3" to select the first, second, or third quartile of values. Set to "None" to disable this filter.[optional]
feature_count: Int % Range(1, None) | Str % Choices('all'): Filter feature table to include top N most important features. Set to "all" to include all features.[default: 100]
missing_samples: Str % Choices('error', 'ignore'): How to handle missing samples in metadata. "error" will fail if missing samples are detected. "ignore" will cause the feature table and metadata to be filtered, so that only samples found in both files are retained.[default: 'error']
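
The interaction between importance_threshold and feature_count can be sketched as below. This is a rough illustration, not the plugin's implementation: the function name and the importance values are hypothetical, and the quartile interpolation method (Python's "inclusive" method) is an assumption.

```python
import statistics


def filter_importances(importances, importance_threshold=None, feature_count="all"):
    """Drop features below a threshold, then keep the top N by importance."""
    threshold = importance_threshold
    if isinstance(threshold, str):
        # Assumption: quartiles computed with the "inclusive" interpolation method.
        q1, q2, q3 = statistics.quantiles(importances.values(), n=4, method="inclusive")
        threshold = {"q1": q1, "q2": q2, "q3": q3}[threshold]
    kept = {
        feature: score
        for feature, score in importances.items()
        if threshold is None or score >= threshold
    }
    ranked = sorted(kept.items(), key=lambda item: item[1], reverse=True)
    if feature_count != "all":
        ranked = ranked[:feature_count]
    return dict(ranked)


# Hypothetical importance scores.
importances = {"f1": 0.4, "f2": 0.3, "f3": 0.2, "f4": 0.1}
above_median = filter_importances(importances, importance_threshold="q2")
top_one = filter_importances(importances, feature_count=1)
```

Here "q2" keeps only f1 and f2, and feature_count=1 keeps only f1; leaving both filters at None and "all" keeps every feature.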

Outputs¶

visualization: Visualization: <no description>[required]

longitudinal feature-volatility¶

Citations¶

Bokulich et al., 2018; Subramanian et al., 2014; Bokulich et al., 2018

Inputs¶

table: FeatureTable[Frequency]: Feature table containing all features that should be used for target prediction.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing collection time (state) values for each sample. Must contain exclusively numeric values.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[optional]
cv: Int % Range(1, None): Number of k-fold cross-validations to perform.[default: 5]
random_state: Int: Seed used by random number generator.[optional]
n_jobs: Threads: Number of jobs to run in parallel.[default: 1]
n_estimators: Int % Range(1, None): Number of trees to grow for estimation. More trees will improve predictive accuracy up to a threshold level, but will also increase time and memory requirements. This parameter only affects ensemble estimators, such as Random Forest, AdaBoost, ExtraTrees, and GradientBoosting.[default: 100]
estimator: Str % Choices('RandomForestRegressor', 'ExtraTreesRegressor', 'GradientBoostingRegressor', 'AdaBoostRegressor', 'ElasticNet', 'Ridge', 'Lasso', 'KNeighborsRegressor', 'LinearSVR', 'SVR'): Estimator method to use for sample prediction.[default: 'RandomForestRegressor']
parameter_tuning: Bool: Automatically tune hyperparameters using random grid search.[default: False]
missing_samples: Str % Choices('error', 'ignore'): How to handle missing samples in metadata. "error" will fail if missing samples are detected. "ignore" will cause the feature table and metadata to be filtered, so that only samples found in both files are retained.[default: 'error']
importance_threshold: Float % Range(0, None, inclusive_start=False) | Str % Choices('q1', 'q2', 'q3'): Filter feature table to exclude any features with an importance score less than this threshold. Set to "q1", "q2", or "q3" to select the first, second, or third quartile of values. Set to "None" to disable this filter.[optional]
feature_count: Int % Range(1, None) | Str % Choices('all'): Filter feature table to include top N most important features. Set to "all" to include all features.[default: 100]
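
As a reminder of what the cv parameter controls, the following sketch splits sample indices into k folds, each used once as the test set. It is a generic illustration of k-fold cross-validation; the function name is hypothetical, and the estimator library used by the pipeline may shuffle or group samples differently.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous test folds, without shuffling."""
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for fold in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if fold < extra else 0)
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, test))
        start = stop
    return folds


folds = k_fold_indices(7, 3)
for train, test in folds:
    print("train:", train, "test:", test)
```

Every sample appears in exactly one test fold, so each sample is predicted by a model that never saw it during training.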

Outputs¶

filtered_table: FeatureTable[RelativeFrequency]: Feature table containing only important features.[required]
feature_importance: FeatureData[Importance]: Importance of each input feature to model accuracy.[required]
volatility_plot: Visualization: Interactive volatility plot visualization.[required]
accuracy_results: Visualization: Accuracy results visualization.[required]
sample_estimator: SampleEstimator[Regressor]: Trained sample regressor.[required]

longitudinal maturity-index¶

Citations¶

Inputs¶

table: FeatureTable[Frequency]: Feature table containing all features that should be used for target prediction.[required]

Parameters¶

metadata: Metadata: <no description>[required]
state_column: Str: Numeric metadata column containing sampling time (state) data to use as prediction target.[required]
group_by: Str: Categorical metadata column to use for plotting and significance testing between main treatment groups.[required]
control: Str: Value of group_by to use as control group. The regression model will be trained using only control group data, and the maturity scores of other groups consequently will be assessed relative to this group.[required]
individual_id_column: Str: Optional metadata column containing IDs for individual subjects. Adds individual subject (spaghetti) vectors to volatility charts if a column name is provided.[optional]
estimator: Str % Choices('RandomForestRegressor', 'ExtraTreesRegressor', 'GradientBoostingRegressor', 'AdaBoostRegressor[DecisionTree]', 'AdaBoostRegressor[ExtraTrees]', 'ElasticNet', 'Ridge', 'Lasso', 'KNeighborsRegressor', 'LinearSVR', 'SVR'): Regression model to use for prediction.[default: 'RandomForestRegressor']
n_estimators: Int % Range(1, None): Number of trees to grow for estimation. More trees will improve predictive accuracy up to a threshold level, but will also increase time and memory requirements. This parameter only affects ensemble estimators, such as Random Forest, AdaBoost, ExtraTrees, and GradientBoosting.[default: 100]
test_size: Float % Range(0.0, 1.0): Fraction of input samples to exclude from training set and use for estimator testing.[default: 0.5]
step: Float % Range(0.0, 1.0, inclusive_start=False): If optimize_feature_selection is True, step is the fraction of features to remove at each iteration.[default: 0.05]
cv: Int % Range(1, None): Number of k-fold cross-validations to perform.[default: 5]
random_state: Int: Seed used by random number generator.[optional]
n_jobs: Threads: Number of jobs to run in parallel.[default: 1]
parameter_tuning: Bool: Automatically tune hyperparameters using random grid search.[default: False]
optimize_feature_selection: Bool: Automatically optimize input feature selection using recursive feature elimination.[default: False]
stratify: Bool: Evenly stratify training and test data among metadata categories. If True, all values in column must match at least two samples.[default: False]
missing_samples: Str % Choices('error', 'ignore'): How to handle missing samples in metadata. "error" will fail if missing samples are detected. "ignore" will cause the feature table and metadata to be filtered, so that only samples found in both files are retained.[default: 'error']
feature_count: Int % Range(0, None): Filter feature table to include top N most important features. Set to zero to include all features.[default: 50]
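
To make the step parameter concrete, the sketch below lists how many features remain after each round of recursive feature elimination. It assumes scikit-learn-style behaviour, where a fractional step is converted once into a fixed count based on the initial number of features; the function name is hypothetical and the pipeline's actual schedule may differ.

```python
def rfe_schedule(n_features, step, min_features=1):
    """Return the number of features remaining after each elimination round."""
    # Assumption: the fractional step is converted to a fixed per-round count.
    per_round = max(1, int(step * n_features))
    counts = [n_features]
    remaining = n_features
    while remaining > min_features:
        remaining = max(min_features, remaining - per_round)
        counts.append(remaining)
    return counts


schedule = rfe_schedule(20, 0.25)
print(schedule)
```

With 20 features and step=0.25, five features are removed per round until a single feature remains. Smaller steps evaluate more candidate feature sets at the cost of longer run times.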

Outputs¶

sample_estimator: SampleEstimator[Regressor]: Trained sample estimator.[required]
feature_importance: FeatureData[Importance]: Importance of each input feature to model accuracy.[required]
predictions: SampleData[RegressorPredictions]: Predicted target values for each input sample.[required]
model_summary: Visualization: Summarized parameter and (if enabled) feature selection information for the trained estimator.[required]
accuracy_results: Visualization: Accuracy results visualization.[required]
maz_scores: SampleData[RegressorPredictions]: Microbiota-for-age z-score predictions.[required]
clustermap: Visualization: Heatmap of important feature abundance at each time point in each group.[required]
volatility_plots: Visualization: Interactive volatility plots of MAZ and maturity scores, target (column) predictions, and the sample metadata.[required]
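
A microbiota-for-age z-score (MAZ) can be sketched as follows, assuming the definition from Subramanian et al., 2014: a sample's predicted age minus the median predicted age of control samples of the same chronological age, divided by the standard deviation of those control predictions. The function name and values are hypothetical, and the pipeline's handling of age bins may differ.

```python
from statistics import median, stdev


def maz_score(predicted_age, control_predictions):
    """z-score of a predicted age relative to control predictions at the same age."""
    return (predicted_age - median(control_predictions)) / stdev(control_predictions)


# Hypothetical predicted ages (months) for control subjects sampled at month 6.
control_month6 = [5.0, 6.0, 7.0]
score = maz_score(4.0, control_month6)
print(score)
```

A negative score indicates a sample whose predicted age lags behind controls of the same age; a positive score indicates the opposite.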

This QIIME 2 plugin supports methods for analysis of time series data, involving either paired sample comparisons or longitudinal study designs.

version: 2025.10.0.dev0
website: https://github.com/qiime2/q2-longitudinal
user support: Please post to the QIIME 2 forum for help with this plugin: https://forum.qiime2.org
citations: Bokulich et al., 2018

Actions¶

Name	Type	Short Description
nmit	method	Nonparametric microbial interdependence test
first-differences	method	Compute first differences or difference from baseline between sequential states
first-distances	method	Compute first distances or distance from baseline between sequential states
pairwise-differences	visualizer	Paired difference testing and boxplots
pairwise-distances	visualizer	Paired pairwise distance testing and boxplots
linear-mixed-effects	visualizer	Linear mixed effects modeling
anova	visualizer	ANOVA test
volatility	visualizer	Generate interactive volatility plot
plot-feature-volatility	visualizer	Plot longitudinal feature volatility and importances
feature-volatility	pipeline	Feature volatility analysis
maturity-index	pipeline	Microbial maturity index prediction.

Artifact Classes¶

Formats¶

longitudinal nmit¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table to use for microbial interdependence test.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
corr_method: Str % Choices('kendall', 'pearson', 'spearman'): The temporal correlation test to be applied.[default: 'kendall']
dist_method: Str % Choices('fro', 'nuc'): Temporal distance method, see numpy.linalg.norm for details.[default: 'fro']
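
The roles of corr_method and dist_method can be sketched as follows: compute a feature-by-feature correlation matrix across each subject's time points, then take a matrix norm of the difference between two subjects' matrices. This sketch uses Pearson correlation and the Frobenius norm only; the function names and data are hypothetical, and treating undefined correlations as 0 is an assumption.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # assumption of this sketch: undefined correlations count as 0
    return cov / (spread_x * spread_y)


def correlation_matrix(subject, features):
    """Feature-by-feature correlation matrix across one subject's time points."""
    return [[pearson(subject[a], subject[b]) for b in features] for a in features]


def frobenius_distance(m1, m2):
    """Frobenius norm of the element-wise difference between two matrices."""
    return sqrt(sum((a - b) ** 2
                    for row1, row2 in zip(m1, m2)
                    for a, b in zip(row1, row2)))


# Hypothetical relative frequencies of two features at three time points.
features = ["f1", "f2"]
subject_a = {"f1": [0.1, 0.2, 0.3], "f2": [0.3, 0.2, 0.1]}
subject_b = {"f1": [0.1, 0.2, 0.3], "f2": [0.2, 0.4, 0.6]}
distance = frobenius_distance(correlation_matrix(subject_a, features),
                              correlation_matrix(subject_b, features))
print(distance)
```

In subject_a the two features move in opposite directions and in subject_b they move together, so the subjects are far apart even though f1 follows the same trajectory in both.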

Outputs¶

distance_matrix: DistanceMatrix: The resulting distance matrix.[required]

longitudinal first-differences¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table to optionally use for computing first differences.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
metric: Str: Numerical metadata or artifact column to test.[required]
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']
baseline: Float: A value listed in the state_column metadata column against which all other states should be compared. If a "baseline" value is provided, static differences are calculated: samples at each state are compared against the baseline state instead of the previous state. If no value is given, first differences are calculated.[optional]
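
The difference between first differences and differences from baseline can be sketched for a single subject as follows. The function name and data are hypothetical; the sketch assumes one sample per state (replicates already resolved) and labels each result by the SampleID of the later state.

```python
def first_differences(samples, baseline=None):
    """Compute differences for one subject.

    samples: list of (state, sample_id, value) tuples, one sample per state.
    baseline: if given, compare every other state against this state.
    """
    ordered = sorted(samples)
    values = {state: value for state, _, value in ordered}
    if baseline is not None and baseline not in values:
        raise ValueError("baseline must be a value listed in the state column")
    differences = {}
    for index, (state, sample_id, value) in enumerate(ordered):
        if baseline is not None:
            if state == baseline:
                continue
            reference = values[baseline]
        else:
            if index == 0:
                continue  # the first state has no previous state
            reference = ordered[index - 1][2]
        differences[sample_id] = value - reference
    return differences


# Hypothetical metric values for one subject at states 0, 1, and 2.
samples = [(0, "s0", 1.0), (1, "s1", 3.0), (2, "s2", 6.0)]
sequential = first_differences(samples)
from_baseline = first_differences(samples, baseline=0)
```

Here sequential is {"s1": 2.0, "s2": 3.0}, while from_baseline is {"s1": 2.0, "s2": 5.0} because state 2 is compared against state 0 rather than state 1.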

Outputs¶

first_differences: SampleData[FirstDifferences]: Series of first differences.[required]

longitudinal first-distances¶

Citations¶

Inputs¶

distance_matrix: DistanceMatrix: Matrix of distances between pairs of samples.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
baseline: Float: A value listed in the state_column metadata column against which all other states should be compared. If a "baseline" value is provided, static distances are calculated: samples at each state are compared against the baseline state instead of the previous state. If no value is given, first distances are calculated.[optional]
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']
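
First distances follow the same pattern as first differences, except that values are looked up in a distance matrix rather than subtracted. A minimal sketch for one subject, with a hypothetical function name and a distance matrix represented as a dictionary keyed by unordered pairs of sample IDs:

```python
def first_distances(samples, distances, baseline=None):
    """Look up distances between sequential (or baseline) samples of one subject.

    samples: list of (state, sample_id) tuples, one sample per state.
    distances: dict mapping frozenset({id_a, id_b}) to a distance.
    """
    ordered = sorted(samples)
    id_by_state = dict(ordered)
    result = {}
    for index, (state, sample_id) in enumerate(ordered):
        if baseline is not None:
            if state == baseline:
                continue
            reference_id = id_by_state[baseline]
        else:
            if index == 0:
                continue  # the first state has no previous state
            reference_id = ordered[index - 1][1]
        result[sample_id] = distances[frozenset((reference_id, sample_id))]
    return result


# Hypothetical pairwise distances among three samples from one subject.
distances = {
    frozenset(("s0", "s1")): 0.2,
    frozenset(("s1", "s2")): 0.5,
    frozenset(("s0", "s2")): 0.6,
}
samples = [(0, "s0"), (1, "s1"), (2, "s2")]
sequential = first_distances(samples, distances)
from_baseline = first_distances(samples, distances, baseline=0)
```

As with first differences, each result is labeled by the sample at the later state.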

Outputs¶

first_distances: SampleData[FirstDifferences]: Series of first distances.[required]

longitudinal pairwise-differences¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table to optionally use for paired comparisons.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
metric: Str: Numerical metadata or artifact column to test.[required]
state_column: Str: Metadata column containing state (e.g., Time) across which samples are paired.[required]
state_1: Str: Baseline state column value.[required]
state_2: Str: State column value to pair with baseline.[required]
individual_id_column: Str: Metadata column containing subject IDs to use for pairing samples. WARNING: if replicates exist for an individual ID at either state_1 or state_2, that subject will be dropped and reported in standard output by default. Set replicate_handling="random" to instead randomly select one member.[required]
group_column: Str: Metadata column on which to separate groups for comparison.[optional]
parametric: Bool: Perform parametric (ANOVA and t-tests) or non-parametric (Kruskal-Wallis, Wilcoxon, and Mann-Whitney U tests) statistical tests.[default: False]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow', 'cividis'): Color palette to use for generating boxplots.[default: 'Set1']
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']
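
The three replicate_handling choices, followed by pairing of state_1 and state_2 samples, can be sketched as below. The function names and data are hypothetical, and the direction of the difference (state_2 minus state_1) is an assumption of this sketch.

```python
import random
from collections import defaultdict


def resolve_replicates(samples, replicate_handling="error", rng=None):
    """Return {(individual_id, state): value} after handling replicates.

    samples: list of (individual_id, state, value) tuples.
    """
    grouped = defaultdict(list)
    for individual_id, state, value in samples:
        grouped[(individual_id, state)].append(value)
    rng = rng or random.Random()
    resolved = {}
    for key, values in grouped.items():
        if len(values) == 1:
            resolved[key] = values[0]
        elif replicate_handling == "error":
            raise ValueError(f"replicate samples detected for {key}")
        elif replicate_handling == "random":
            resolved[key] = rng.choice(values)
        elif replicate_handling != "drop":
            raise ValueError(f"unknown replicate_handling: {replicate_handling}")
        # "drop": all replicated samples for this key are discarded.
    return resolved


def paired_differences(resolved, state_1, state_2):
    """Difference (state_2 - state_1) for each individual sampled at both states."""
    individuals = sorted({individual_id for individual_id, _ in resolved})
    return {
        individual_id: resolved[(individual_id, state_2)] - resolved[(individual_id, state_1)]
        for individual_id in individuals
        if (individual_id, state_1) in resolved and (individual_id, state_2) in resolved
    }


# Hypothetical data: individual "b" has replicate samples at state "pre".
samples = [
    ("a", "pre", 1.0), ("a", "post", 4.0),
    ("b", "pre", 2.0), ("b", "pre", 2.5), ("b", "post", 3.0),
]
differences = paired_differences(resolve_replicates(samples, "drop"), "pre", "post")
```

With "drop", individual "b" loses its "pre" samples and therefore cannot be paired, so only individual "a" contributes a difference.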

Outputs¶

visualization: Visualization: <no description>[required]

longitudinal pairwise-distances¶

Citations¶

Inputs¶

distance_matrix: DistanceMatrix: Matrix of distances between pairs of samples.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
group_column: Str: Metadata column on which to separate groups for comparison.[required]
state_column: Str: Metadata column containing state (e.g., Time) across which samples are paired.[required]
state_1: Str: Baseline state column value.[required]
state_2: Str: State column value to pair with baseline.[required]
individual_id_column: Str: Metadata column containing subject IDs to use for pairing samples. WARNING: if replicates exist for an individual ID at either state_1 or state_2, that subject will be dropped and reported in standard output by default. Set replicate_handling="random" to instead randomly select one member.[required]
parametric: Bool: Perform parametric (ANOVA and t-tests) or non-parametric (Kruskal-Wallis, Wilcoxon, and Mann-Whitney U tests) statistical tests.[default: False]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow', 'cividis'): Color palette to use for generating boxplots.[default: 'Set1']
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']

Outputs¶

visualization: Visualization: <no description>[required]

longitudinal linear-mixed-effects¶

Citations¶

Bokulich et al., 2018; Seabold & Perktold, 2010

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table containing metric.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
metric: Str: Dependent variable column name. Must be a column name located in the metadata or feature table files.[optional]
group_columns: Str: Comma-separated list (without spaces) of metadata columns to use as independent covariates that determine the mean structure of "metric".[optional]
random_effects: Str: Comma-separated list (without spaces) of metadata columns to use as independent covariates that determine the variance and covariance structure (random effects) of "metric". To add a random slope, pass the same value given to "state_column". A random intercept for each individual is set by default and does not need to be passed here.[optional]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow', 'cividis'): Color palette to use for generating boxplots.[default: 'Set1']
lowess: Bool: Estimate locally weighted scatterplot smoothing. Note that this will eliminate confidence interval plotting.[default: False]
ci: Float % Range(0, 100): Size of the confidence interval for the regression estimate.[default: 95]
formula: Str: R-style formula to use for model specification. A formula must be used if the "metric" parameter is None. Note that the metric and group columns specified in the formula will override metric and group columns that are passed separately as parameters to this method. Formulae will be in the format "a ~ b + c", where "a" is the metric (dependent variable) and "b" and "c" are independent covariates. Use "+" to add a variable; "+ a:b" to add an interaction between variables a and b; "*" to include a variable and all interactions; and "-" to subtract a particular term (e.g., an interaction term). See https://patsy.readthedocs.io/en/latest/formulas.html for full documentation of valid formula operators. Always enclose formulae in quotes to avoid unpleasant surprises.[optional]

Outputs¶

visualization: Visualization: <no description>[required]

longitudinal maturity-index¶

Citations¶

Inputs¶

table: FeatureTable[Frequency]: Feature table containing all features that should be used for target prediction.[required]

Parameters¶

metadata: Metadata: <no description>[required]
state_column: Str: Numeric metadata column containing sampling time (state) data to use as prediction target.[required]
group_by: Str: Categorical metadata column to use for plotting and significance testing between main treatment groups.[required]
control: Str: Value of group_by to use as control group. The regression model will be trained using only control group data, and the maturity scores of other groups consequently will be assessed relative to this group.[required]
individual_id_column: Str: Optional metadata column containing IDs for individual subjects. Adds individual subject (spaghetti) vectors to volatility charts if a column name is provided.[optional]
estimator: Str % Choices('RandomForestRegressor', 'ExtraTreesRegressor', 'GradientBoostingRegressor', 'AdaBoostRegressor[DecisionTree]', 'AdaBoostRegressor[ExtraTrees]', 'ElasticNet', 'Ridge', 'Lasso', 'KNeighborsRegressor', 'LinearSVR', 'SVR'): Regression model to use for prediction.[default: 'RandomForestRegressor']
n_estimators: Int % Range(1, None): Number of trees to grow for estimation. More trees will improve predictive accuracy up to a threshold level, but will also increase time and memory requirements. This parameter only affects ensemble estimators, such as Random Forest, AdaBoost, ExtraTrees, and GradientBoosting.[default: 100]
test_size: Float % Range(0.0, 1.0): Fraction of input samples to exclude from the training set and use for testing the trained model.[default: 0.5]
step: Float % Range(0.0, 1.0, inclusive_start=False): If optimize_feature_selection is True, step is the fraction of features to remove at each iteration.[default: 0.05]
cv: Int % Range(1, None): Number of k-fold cross-validations to perform.[default: 5]
random_state: Int: Seed used by random number generator.[optional]
n_jobs: Threads: Number of jobs to run in parallel.[default: 1]
parameter_tuning: Bool: Automatically tune hyperparameters using random grid search.[default: False]
optimize_feature_selection: Bool: Automatically optimize input feature selection using recursive feature elimination.[default: False]
stratify: Bool: Evenly stratify training and test data among metadata categories. If True, all values in column must match at least two samples.[default: False]
missing_samples: Str % Choices('error', 'ignore'): How to handle missing samples in metadata. "error" will fail if missing samples are detected. "ignore" will cause the feature table and metadata to be filtered, so that only samples found in both files are retained.[default: 'error']
feature_count: Int % Range(0, None): Filter feature table to include top N most important features. Set to zero to include all features.[default: 50]

Outputs¶

sample_estimator: SampleEstimator[Regressor]: Trained sample estimator.[required]
feature_importance: FeatureData[Importance]: Importance of each input feature to model accuracy.[required]
predictions: SampleData[RegressorPredictions]: Predicted target values for each input sample.[required]
model_summary: Visualization: Summarized parameter and (if enabled) feature selection information for the trained estimator.[required]
accuracy_results: Visualization: Accuracy results visualization.[required]
maz_scores: SampleData[RegressorPredictions]: Microbiota-for-age z-score predictions.[required]
clustermap: Visualization: Heatmap of important feature abundance at each time point in each group.[required]
volatility_plots: Visualization: Interactive volatility plots of MAZ and maturity scores, target (column) predictions, and the sample metadata.[required]
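
The maz_scores output is named for the microbiota-for-age z-score of Subramanian et al., 2014. As a rough illustration only, the sketch below computes a z-score of that form: each sample's predicted age is compared with the median and standard deviation of control-group predictions at the same state. The function name, the record layout, and the exact per-state grouping are assumptions made for this example; the plugin's own implementation may bin or round states differently.

```python
from statistics import median, stdev

def maz_scores(samples, control):
    """Return {sample_id: z-score} for predicted ages.

    samples: list of dicts with keys 'id', 'state', 'group', 'predicted'
    (hypothetical layout). control: the group value treated as the control.
    """
    # Collect control-group predictions for each state (time point).
    by_state = {}
    for s in samples:
        if s["group"] == control:
            by_state.setdefault(s["state"], []).append(s["predicted"])

    scores = {}
    for s in samples:
        ref = by_state.get(s["state"], [])
        if len(ref) < 2:
            continue  # need at least two controls to estimate a spread
        sd = stdev(ref)
        if sd == 0:
            continue  # z-score is undefined when controls do not vary
        scores[s["id"]] = (s["predicted"] - median(ref)) / sd
    return scores

# Toy data: three control samples and one treated sample at state 6.
samples = [
    {"id": "c1", "state": 6, "group": "control", "predicted": 5.0},
    {"id": "c2", "state": 6, "group": "control", "predicted": 6.0},
    {"id": "c3", "state": 6, "group": "control", "predicted": 7.0},
    {"id": "t1", "state": 6, "group": "treated", "predicted": 4.0},
]
scores = maz_scores(samples, control="control")
```

In this toy data the treated sample is predicted two control standard deviations younger than the control median, so its score is negative.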

This QIIME 2 plugin supports methods for analysis of time series data, involving either paired sample comparisons or longitudinal study designs.

version: 2025.10.0.dev0
website: https://github.com/qiime2/q2-longitudinal
user support: Please post to the QIIME 2 forum for help with this plugin: https://forum.qiime2.org
citations: Bokulich et al., 2018

Actions¶

Name	Type	Short Description
nmit	method	Nonparametric microbial interdependence test
first-differences	method	Compute first differences or difference from baseline between sequential states
first-distances	method	Compute first distances or distance from baseline between sequential states
pairwise-differences	visualizer	Paired difference testing and boxplots
pairwise-distances	visualizer	Paired pairwise distance testing and boxplots
linear-mixed-effects	visualizer	Linear mixed effects modeling
anova	visualizer	ANOVA test
volatility	visualizer	Generate interactive volatility plot
plot-feature-volatility	visualizer	Plot longitudinal feature volatility and importances
feature-volatility	pipeline	Feature volatility analysis
maturity-index	pipeline	Microbial maturity index prediction

Artifact Classes¶

Formats¶

longitudinal nmit¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table to use for microbial interdependence test.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
corr_method: Str % Choices('kendall', 'pearson', 'spearman'): The temporal correlation test to be applied.[default: 'kendall']
dist_method: Str % Choices('fro', 'nuc'): Temporal distance method, see numpy.linalg.norm for details.[default: 'fro']

Outputs¶

distance_matrix: DistanceMatrix: The resulting distance matrix.[required]
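
A minimal sketch of the idea behind corr_method and dist_method, assuming the general approach of the nonparametric microbial interdependence test: for each subject, build a feature-by-feature correlation matrix across that subject's time points, then take a matrix norm of the difference between two subjects' matrices. Only Pearson correlation and the Frobenius norm ('fro') are shown, written with the standard library. The helper names and toy data are hypothetical, and treating a constant feature as uncorrelated is an assumption of this example.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: treat constant features as uncorrelated
    return sxy / sqrt(sxx * syy)

def correlation_matrix(series):
    """series: list of per-feature time series for one subject."""
    return [[pearson(a, b) for b in series] for a in series]

def frobenius_distance(m1, m2):
    """Frobenius norm of the element-wise difference of two matrices."""
    return sqrt(sum((a - b) ** 2
                    for r1, r2 in zip(m1, m2)
                    for a, b in zip(r1, r2)))

# Toy relative abundances: two features over four time points per subject.
subject_a = [[0.1, 0.2, 0.3, 0.4], [0.9, 0.8, 0.7, 0.6]]  # anticorrelated
subject_b = [[0.1, 0.2, 0.3, 0.4], [0.2, 0.4, 0.6, 0.8]]  # correlated

dist_ab = frobenius_distance(correlation_matrix(subject_a),
                             correlation_matrix(subject_b))
dist_aa = frobenius_distance(correlation_matrix(subject_a),
                             correlation_matrix(subject_a))
```

Subjects whose features co-vary in the same way over time end up close together, regardless of absolute abundance.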

longitudinal first-differences¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table to optionally use for computing first differences.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
metric: Str: Numerical metadata or artifact column to test.[required]
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']
baseline: Float: A value listed in the state_column metadata column against which all other states should be compared. Toggles calculation of static differences instead of first differences (which are calculated if no value is given for baseline). If a "baseline" value is provided, sample differences at each state are compared against the baseline state, instead of the previous state. Must be a value listed in the state_column.[optional]

Outputs¶

first_differences: SampleData[FirstDifferences]: Series of first differences.[required]
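
The difference between first differences and baseline (static) differences can be sketched as follows. This is an illustration under stated assumptions, not the plugin's code: the function name and the tuple layout are hypothetical, each result is labeled by the sample ID of the later state, and subjects lacking a baseline sample are simply skipped.

```python
def first_differences(records, baseline=None):
    """Return {sample_id: difference} for each subject's ordered states.

    records: list of (sample_id, subject_id, state, value) tuples
    (hypothetical layout). With baseline=None, each value is compared with
    the previous state; otherwise with the value at the baseline state.
    """
    by_subject = {}
    for sample_id, subject, state, value in records:
        by_subject.setdefault(subject, []).append((state, sample_id, value))

    diffs = {}
    for rows in by_subject.values():
        rows.sort()  # order each subject's samples by state
        if baseline is None:
            # Sequential comparison: each state minus the previous state.
            for (_, _, prev), (_, sid, cur) in zip(rows, rows[1:]):
                diffs[sid] = cur - prev
        else:
            ref = [v for state, _, v in rows if state == baseline]
            if not ref:
                continue  # assumption: skip subjects lacking the baseline
            # Static comparison: each non-baseline state minus the baseline.
            for state, sid, cur in rows:
                if state != baseline:
                    diffs[sid] = cur - ref[0]
    return diffs

records = [
    ("s1", "subjA", 0.0, 2.0),
    ("s2", "subjA", 1.0, 5.0),
    ("s3", "subjA", 2.0, 4.0),
]
sequential = first_differences(records)           # s2 vs s1, s3 vs s2
static = first_differences(records, baseline=0.0)  # s2 vs s1, s3 vs s1
```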

longitudinal first-distances¶

Citations¶

Inputs¶

distance_matrix: DistanceMatrix: Matrix of distances between pairs of samples.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
baseline: Float: A value listed in the state_column metadata column against which all other states should be compared. Toggles calculation of static distances instead of first distances (which are calculated if no value is given for baseline). If a "baseline" value is provided, sample distances at each state are compared against the baseline state, instead of the previous state. Must be a value listed in the state_column.[optional]
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']

Outputs¶

first_distances: SampleData[FirstDifferences]: Series of first distances.[required]
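
The three replicate_handling choices can be illustrated with a small standard-library sketch. Here a replicate is assumed to mean more than one sample for the same subject at the same state; the function name, the record layout, and the seed argument are hypothetical and are not part of the plugin's interface.

```python
import random

def handle_replicates(records, mode="error", seed=None):
    """Return the sample IDs kept after replicate handling.

    records: list of (sample_id, subject_id, state) tuples (hypothetical).
    """
    if mode not in ("error", "random", "drop"):
        raise ValueError(f"unknown mode: {mode!r}")

    # Group sample IDs by (subject, state); groups of size > 1 are replicates.
    groups = {}
    for sample_id, subject, state in records:
        groups.setdefault((subject, state), []).append(sample_id)

    rng = random.Random(seed)
    kept = []
    for key, ids in groups.items():
        if len(ids) == 1:
            kept.append(ids[0])
        elif mode == "error":
            raise ValueError(f"replicate samples detected for {key}: {ids}")
        elif mode == "random":
            kept.append(rng.choice(ids))  # one representative at random
        # mode == "drop": discard every sample in the replicated group
    return kept

records = [
    ("s1", "A", 0),
    ("s2", "A", 1),
    ("s3", "A", 1),  # replicate of s2: same subject, same state
    ("s4", "B", 0),
]
dropped = handle_replicates(records, mode="drop")
picked = handle_replicates(records, mode="random", seed=0)
```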

longitudinal pairwise-differences¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table to optionally use for paired comparisons.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
metric: Str: Numerical metadata or artifact column to test.[required]
state_column: Str: Metadata column containing state (e.g., Time) across which samples are paired.[required]
state_1: Str: Baseline state column value.[required]
state_2: Str: State column value to pair with baseline.[required]
individual_id_column: Str: Metadata column containing subject IDs to use for pairing samples. WARNING: if replicates exist for an individual ID at either state_1 or state_2, that subject will be dropped and reported in standard output by default. Set replicate_handling="random" to instead randomly select one member.[required]
group_column: Str: Metadata column on which to separate groups for comparison.[optional]
parametric: Bool: Perform parametric (ANOVA and t-tests) or non-parametric (Kruskal-Wallis, Wilcoxon, and Mann-Whitney U tests) statistical tests.[default: False]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow', 'cividis'): Color palette to use for generating boxplots.[default: 'Set1']
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']

Outputs¶

visualization: Visualization: <no description>[required]
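
As an illustration of how state_1, state_2, and individual_id_column fit together, the sketch below pairs each subject's sample at state_1 with its sample at state_2 and takes the difference. The function name and record layout are hypothetical, the direction of the difference (state_2 minus state_1) is an assumption, and subjects with a missing state or with replicates are simply left unpaired here.

```python
def paired_differences(records, state_1, state_2):
    """Return {subject_id: value_at_state_2 - value_at_state_1}.

    records: list of (subject_id, state, value) tuples (hypothetical layout).
    Only subjects with exactly one sample at each state are paired.
    """
    values = {}
    for subject, state, value in records:
        if state in (state_1, state_2):
            values.setdefault(subject, {}).setdefault(state, []).append(value)

    pairs = {}
    for subject, by_state in values.items():
        v1 = by_state.get(state_1, [])
        v2 = by_state.get(state_2, [])
        if len(v1) == 1 and len(v2) == 1:
            pairs[subject] = v2[0] - v1[0]
        # Subjects missing a state, or with replicates, are left unpaired.
    return pairs

records = [
    ("A", "pre", 1.0), ("A", "post", 4.0),   # complete pair
    ("B", "pre", 2.0),                       # no "post" sample
    ("C", "pre", 1.0), ("C", "pre", 2.0),    # replicate at "pre"
    ("C", "post", 3.0),
]
pairs = paired_differences(records, state_1="pre", state_2="post")
```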

longitudinal pairwise-distances¶

Citations¶

Inputs¶

distance_matrix: DistanceMatrix: Matrix of distances between pairs of samples.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
group_column: Str: Metadata column on which to separate groups for comparison.[required]
state_column: Str: Metadata column containing state (e.g., Time) across which samples are paired.[required]
state_1: Str: Baseline state column value.[required]
state_2: Str: State column value to pair with baseline.[required]
individual_id_column: Str: Metadata column containing subject IDs to use for pairing samples. WARNING: if replicates exist for an individual ID at either state_1 or state_2, that subject will be dropped and reported in standard output by default. Set replicate_handling="random" to instead randomly select one member.[required]
parametric: Bool: Perform parametric (ANOVA and t-tests) or non-parametric (Kruskal-Wallis, Wilcoxon, and Mann-Whitney U tests) statistical tests.[default: False]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow', 'cividis'): Color palette to use for generating boxplots.[default: 'Set1']
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']

Outputs¶

visualization: Visualization: <no description>[required]

longitudinal linear-mixed-effects¶

Citations¶

Bokulich et al., 2018; Seabold & Perktold, 2010

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table containing metric.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
metric: Str: Dependent variable column name. Must be a column name located in the metadata or feature table files.[optional]
group_columns: Str: Comma-separated list (without spaces) of metadata columns to use as independent covariates used to determine mean structure of "metric".[optional]
random_effects: Str: Comma-separated list (without spaces) of metadata columns to use as independent covariates used to determine the variance and covariance structure (random effects) of "metric". To add a random slope, the same value passed to "state_column" should be passed here. A random intercept for each individual is set by default and does not need to be passed here.[optional]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow', 'cividis'): Color palette to use for generating boxplots.[default: 'Set1']
lowess: Bool: Estimate locally weighted scatterplot smoothing. Note that this will eliminate confidence interval plotting.[default: False]
ci: Float % Range(0, 100): Size of the confidence interval for the regression estimate.[default: 95]
formula: Str: R-style formula to use for model specification. A formula must be used if the "metric" parameter is None. Note that the metric and group columns specified in the formula will override metric and group columns that are passed separately as parameters to this method. Formulae will be in the format "a ~ b + c", where "a" is the metric (dependent variable) and "b" and "c" are independent covariates. Use "+" to add a variable; "+ a:b" to add an interaction between variables a and b; "*" to include a variable and all interactions; and "-" to subtract a particular term (e.g., an interaction term). See https://patsy.readthedocs.io/en/latest/formulas.html for full documentation of valid formula operators. Always enclose formulae in quotes to avoid unpleasant surprises.[optional]

Outputs¶

visualization: Visualization: <no description>[required]
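
To make the formula syntax concrete, the following sketch assembles an R-style formula string from a metric, a state column, and a comma-separated list of group columns. The helper is hypothetical and exists only for illustration; how the plugin itself combines metric and group_columns when no formula is given is not shown in this reference, so the choice between "*" and "+" here is an assumption.

```python
def build_formula(metric, state_column, group_columns=None, interactions=True):
    """Assemble an R-style formula string (hypothetical helper).

    group_columns: comma-separated column names without spaces, or None.
    interactions: join terms with "*" (terms plus all interactions) when
    True, or with "+" (main effects only) when False.
    """
    terms = [state_column]
    if group_columns:
        # Split the comma-separated list and ignore empty entries.
        terms.extend(c for c in group_columns.split(",") if c)
    joiner = " * " if interactions else " + "
    return f"{metric} ~ {joiner.join(terms)}"

full = build_formula("shannon", "month", "diet,delivery")
additive = build_formula("shannon", "month", "diet,delivery",
                         interactions=False)
```

Here "shannon", "month", "diet", and "delivery" are placeholder column names. On the command line the resulting string would be passed in quotes, as the formula description advises.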

longitudinal anova¶

Citations¶

Parameters¶

metadata: Metadata: Sample metadata containing formula terms.[required]
formula: Str: R-style formula specifying the model. All terms must be present in the sample metadata or metadata-transformable artifacts and can be continuous or categorical metadata columns. Formulae will be in the format "a ~ b + c", where "a" is the metric (dependent variable) and "b" and "c" are independent covariates. Use "+" to add a variable; "+ a:b" to add an interaction between variables a and b; "*" to include a variable and all interactions; and "-" to subtract a particular term (e.g., an interaction term). See https://patsy.readthedocs.io/en/latest/formulas.html for full documentation of valid formula operators. Always enclose formulae in quotes to avoid unpleasant surprises.[required]
sstype: Str % Choices('I', 'II', 'III'): Type of sum of squares calculation to perform (I, II, or III).[default: 'II']
repeated_measures: Bool: Perform ANOVA as a repeated measures ANOVA. Implemented via statsmodels, which has the following limitations: Currently, only fully balanced within-subject designs are supported. Calculation of between-subject effects and corrections for violation of sphericity are not yet implemented.[default: False]
individual_id_column: Str: The column containing individual IDs with repeated measures to account for. This should not be included in the formula.[optional]
rm_aggregate: Bool: If the data set contains more than a single observation per individual ID and cell of the specified model, the observations will be aggregated by their mean before running the ANOVA. Only applicable for repeated measures ANOVA.[default: False]

Outputs¶

visualization: Visualization: <no description>[required]
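
The aggregation described for rm_aggregate amounts to collapsing repeated observations to one mean per individual and model cell. A minimal standard-library sketch, with a hypothetical function name and row layout:

```python
from statistics import mean

def aggregate_by_mean(rows):
    """Collapse repeated observations to one mean per (individual, cell).

    rows: list of (individual_id, cell, value) tuples (hypothetical layout),
    where "cell" identifies one combination of within-subject factor levels.
    """
    grouped = {}
    for individual, cell, value in rows:
        grouped.setdefault((individual, cell), []).append(value)
    return {key: mean(vals) for key, vals in grouped.items()}

rows = [
    ("A", "t0", 1.0),
    ("A", "t0", 3.0),  # second observation for the same individual and cell
    ("A", "t1", 4.0),
]
aggregated = aggregate_by_mean(rows)
```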

longitudinal volatility¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table containing metrics.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[optional]
default_group_column: Str: The default metadata column on which to separate groups for comparison (all categorical metadata columns will be available in the visualization).[optional]
default_metric: Str: Numeric metadata or artifact column to test by default (all numeric metadata columns will be available in the visualization).[optional]
yscale: Str % Choices('linear', 'pow', 'sqrt', 'log'): y-axis scaling strategy to apply.[default: 'linear']

Outputs¶

visualization: Visualization: <no description>[required]

Examples¶

longitudinal_volatility¶

Command Line:

wget -O 'metadata.tsv' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/longitudinal/volatility/1/metadata.tsv'

qiime longitudinal volatility \
  --m-metadata-file metadata.tsv \
  --p-state-column month \
  --o-visualization volatility-plot.qzv

Python API:

from qiime2 import Metadata
from urllib import request
import qiime2.plugins.longitudinal.actions as longitudinal_actions

url = 'https://amplicon-docs.qiime2.org/en/latest/data/examples/longitudinal/volatility/1/metadata.tsv'
fn = 'metadata.tsv'
request.urlretrieve(url, fn)
metadata_md = Metadata.load(fn)

volatility_plot_viz, = longitudinal_actions.volatility(
    metadata=metadata_md,
    state_column='month',
)

Galaxy:

Using the Upload Data tool:

On the first tab (Regular), press the Paste/Fetch data button at the bottom.
1. Set "Name" (first text-field) to: metadata.tsv
2. In the larger text-area, copy-and-paste: https://amplicon-docs.qiime2.org/en/latest/data/examples/longitudinal/volatility/1/metadata.tsv
3. ("Type", "Genome", and "Settings" can be ignored)
Press the Start button at the bottom.

Using the qiime2 longitudinal volatility tool:

For "metadata":
- Perform the following steps.
  1. Leave as Metadata from TSV
  2. Set "Metadata Source" to metadata.tsv
Set "state_column" to month
Press the Execute button.

Once completed, for the new entry in your history, use the Edit button to set the name as follows:

(Renaming is optional, but it will make any subsequent steps easier to complete.)

History Name	"Name" to set (be sure to press [Save])
`#: qiime2 longitudinal volatility [...] : visualization.qzv`	`volatility-plot.qzv`

R API:

library(reticulate)

Metadata <- import("qiime2")$Metadata
longitudinal_actions <- import("qiime2.plugins.longitudinal.actions")
request <- import("urllib")$request

url <- 'https://amplicon-docs.qiime2.org/en/latest/data/examples/longitudinal/volatility/1/metadata.tsv'
fn <- 'metadata.tsv'
request$urlretrieve(url, fn)
metadata_md <- Metadata$load(fn)

action_results <- longitudinal_actions$volatility(
    metadata=metadata_md,
    state_column='month'
)
volatility_plot_viz <- action_results$visualization


longitudinal plot-feature-volatility¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table containing features found in importances.[required]
importances: FeatureData[Importance]: Feature importance scores.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[optional]
default_group_column: Str: The default metadata column on which to separate groups for comparison (all categorical metadata columns will be available in the visualization).[optional]
yscale: Str % Choices('linear', 'pow', 'sqrt', 'log'): y-axis scaling strategy to apply.[default: 'linear']
importance_threshold: Float % Range(0, None, inclusive_start=False) | Str % Choices('q1', 'q2', 'q3'): Filter feature table to exclude any features with an importance score less than this threshold. Set to "q1", "q2", or "q3" to select the first, second, or third quartile of values. Set to "None" to disable this filter.[optional]
feature_count: Int % Range(1, None) | Str % Choices('all'): Filter feature table to include top N most important features. Set to "all" to include all features.[default: 100]
missing_samples: Str % Choices('error', 'ignore'): How to handle missing samples in metadata. "error" will fail if missing samples are detected. "ignore" will cause the feature table and metadata to be filtered, so that only samples found in both files are retained.[default: 'error']

Outputs¶

visualization: Visualization: <no description>[required]
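
The interaction between importance_threshold and feature_count can be sketched as two successive filters on the importance scores. This is an illustration under assumptions: the function name is hypothetical, the threshold is applied before the top-N cut, and the quartile interpolation method used here may not match the plugin's.

```python
from statistics import quantiles

def filter_features(importances, importance_threshold=None,
                    feature_count="all"):
    """Return kept feature IDs, most important first.

    importances: {feature_id: score}. importance_threshold: a number,
    one of "q1"/"q2"/"q3", or None to disable the filter.
    feature_count: an int N for the top N features, or "all".
    """
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)

    if importance_threshold is not None:
        if importance_threshold in ("q1", "q2", "q3"):
            # Quartile cut points of the scores (interpolation is assumed).
            cuts = quantiles(importances.values(), n=4, method="inclusive")
            cutoff = cuts[int(importance_threshold[1]) - 1]
        else:
            cutoff = float(importance_threshold)
        # Exclude features scoring below the threshold.
        ranked = [(f, s) for f, s in ranked if s >= cutoff]

    if feature_count != "all":
        ranked = ranked[:feature_count]
    return [f for f, _ in ranked]

importances = {"f1": 0.5, "f2": 0.25, "f3": 0.125, "f4": 0.0625, "f5": 0.0625}
above_median = filter_features(importances, importance_threshold="q2")
above_fixed = filter_features(importances, importance_threshold=0.2)
top_one = filter_features(importances, feature_count=1)
```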

longitudinal feature-volatility¶

Citations¶

Bokulich et al., 2018; Subramanian et al., 2014; Bokulich et al., 2018

Inputs¶

table: FeatureTable[Frequency]: Feature table containing all features that should be used for target prediction.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing collection time (state) values for each sample. Must contain exclusively numeric values.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[optional]
cv: Int % Range(1, None): Number of k-fold cross-validations to perform.[default: 5]
random_state: Int: Seed used by random number generator.[optional]
n_jobs: Threads: Number of jobs to run in parallel.[default: 1]
n_estimators: Int % Range(1, None): Number of trees to grow for estimation. More trees will improve predictive accuracy up to a threshold level, but will also increase time and memory requirements. This parameter only affects ensemble estimators, such as Random Forest, AdaBoost, ExtraTrees, and GradientBoosting.[default: 100]
estimator: Str % Choices('RandomForestRegressor', 'ExtraTreesRegressor', 'GradientBoostingRegressor', 'AdaBoostRegressor', 'ElasticNet', 'Ridge', 'Lasso', 'KNeighborsRegressor', 'LinearSVR', 'SVR'): Estimator method to use for sample prediction.[default: 'RandomForestRegressor']
parameter_tuning: Bool: Automatically tune hyperparameters using random grid search.[default: False]
missing_samples: Str % Choices('error', 'ignore'): How to handle missing samples in metadata. "error" will fail if missing samples are detected. "ignore" will cause the feature table and metadata to be filtered, so that only samples found in both files are retained.[default: 'error']
importance_threshold: Float % Range(0, None, inclusive_start=False) | Str % Choices('q1', 'q2', 'q3'): Filter feature table to exclude any features with an importance score less than this threshold. Set to "q1", "q2", or "q3" to select the first, second, or third quartile of values. Set to "None" to disable this filter.[optional]
feature_count: Int % Range(1, None) | Str % Choices('all'): Filter feature table to include top N most important features. Set to "all" to include all features.[default: 100]

Outputs¶

filtered_table: FeatureTable[RelativeFrequency]: Feature table containing only important features.[required]
feature_importance: FeatureData[Importance]: Importance of each input feature to model accuracy.[required]
volatility_plot: Visualization: Interactive volatility plot visualization.[required]
accuracy_results: Visualization: Accuracy results visualization.[required]
sample_estimator: SampleEstimator[Regressor]: Trained sample regressor.[required]

Artifact Classes¶

Formats¶

longitudinal nmit¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table to use for microbial interdependence test.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
corr_method: Str % Choices('kendall', 'pearson', 'spearman'): The temporal correlation test to be applied.[default: 'kendall']
dist_method: Str % Choices('fro', 'nuc'): Temporal distance method, see numpy.linalg.norm for details.[default: 'fro']

Outputs¶

distance_matrix: DistanceMatrix: The resulting distance matrix.[required]

longitudinal first-differences¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table to optionally use for computing first differences.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
metric: Str: Numerical metadata or artifact column to test.[required]
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']
baseline: Float: A value listed in the state_column metadata column against which all other states should be compared. Toggles calculation of static differences instead of first differences (which are calculated if no value is given for baseline). If a "baseline" value is provided, sample differences at each state are compared against the baseline state, instead of the previous state. Must be a value listed in the state_column.[optional]

Outputs¶

first_differences: SampleData[FirstDifferences]: Series of first differences.[required]

longitudinal first-distances¶

Citations¶

Inputs¶

distance_matrix: DistanceMatrix: Matrix of distances between pairs of samples.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
baseline: Float: A value listed in the state_column metadata column against which all other states should be compared. Toggles calculation of static distances instead of first distances (which are calculated if no value is given for baseline). If a "baseline" value is provided, sample distances at each state are compared against the baseline state, instead of the previous state. Must be a value listed in the state_column.[optional]
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']

Outputs¶

first_distances: SampleData[FirstDifferences]: Series of first distances.[required]

longitudinal pairwise-differences¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table to optionally use for paired comparisons.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
metric: Str: Numerical metadata or artifact column to test.[required]
state_column: Str: Metadata column containing state (e.g., Time) across which samples are paired.[required]
state_1: Str: Baseline state column value.[required]
state_2: Str: State column value to pair with baseline.[required]
individual_id_column: Str: Metadata column containing subject IDs to use for pairing samples. WARNING: if replicates exist for an individual ID at either state_1 or state_2, the outcome depends on replicate_handling: "error" (the default) causes the action to fail, "drop" discards the replicated samples, and "random" randomly selects one member.[required]
group_column: Str: Metadata column on which to separate groups for comparison.[optional]
parametric: Bool: Perform parametric (ANOVA and t-tests) or non-parametric (Kruskal-Wallis, Wilcoxon, and Mann-Whitney U tests) statistical tests.[default: False]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow', 'cividis'): Color palette to use for generating boxplots.[default: 'Set1']
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']

Outputs¶

visualization: Visualization: <no description>[required]
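
The pairing of samples across state_1 and state_2 by individual_id_column can be pictured with a small pure-Python sketch. It is not the plugin's implementation: paired_differences is a hypothetical helper, and both the direction of subtraction (state_2 minus state_1) and the skipping of unpaired subjects are assumptions.

```python
def paired_differences(samples, state_1, state_2):
    """Return {individual_id: value at state_2 - value at state_1}.

    `samples` is a list of (individual_id, state, value) tuples. Subjects
    lacking a sample at either state are skipped.
    """
    by_subject = {}
    for individual_id, state, value in samples:
        by_subject.setdefault(individual_id, {})[state] = value
    return {
        individual_id: states[state_2] - states[state_1]
        for individual_id, states in by_subject.items()
        if state_1 in states and state_2 in states
    }


# Hypothetical metric values; states are strings, as state_1/state_2 are Str.
samples = [
    ("subj1", "0", 1.0), ("subj1", "6", 2.5),
    ("subj2", "0", 2.0), ("subj2", "6", 1.5),
    ("subj3", "0", 3.0),  # no sample at state "6", so subj3 is unpaired
]
diffs = paired_differences(samples, state_1="0", state_2="6")
```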

longitudinal pairwise-distances¶

Citations¶

Inputs¶

distance_matrix: DistanceMatrix: Matrix of distances between pairs of samples.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
group_column: Str: Metadata column on which to separate groups for comparison.[required]
state_column: Str: Metadata column containing state (e.g., Time) across which samples are paired.[required]
state_1: Str: Baseline state column value.[required]
state_2: Str: State column value to pair with baseline.[required]
individual_id_column: Str: Metadata column containing subject IDs to use for pairing samples. WARNING: if replicates exist for an individual ID at either state_1 or state_2, the outcome depends on replicate_handling: "error" (the default) causes the action to fail, "drop" discards the replicated samples, and "random" randomly selects one member.[required]
parametric: Bool: Perform parametric (ANOVA and t-tests) or non-parametric (Kruskal-Wallis, Wilcoxon, and Mann-Whitney U tests) statistical tests.[default: False]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow', 'cividis'): Color palette to use for generating boxplots.[default: 'Set1']
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']

Outputs¶

visualization: Visualization: <no description>[required]

longitudinal linear-mixed-effects¶

Citations¶

Bokulich et al., 2018; Seabold & Perktold, 2010

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table containing metric.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
metric: Str: Dependent variable column name. Must be a column name located in the metadata or feature table files.[optional]
group_columns: Str: Comma-separated list (without spaces) of metadata columns to use as independent covariates used to determine mean structure of "metric".[optional]
random_effects: Str: Comma-separated list (without spaces) of metadata columns to use as independent covariates used to determine the variance and covariance structure (random effects) of "metric". To add a random slope, the same value passed to "state_column" should be passed here. A random intercept for each individual is set by default and does not need to be passed here.[optional]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow', 'cividis'): Color palette to use for generating boxplots.[default: 'Set1']
lowess: Bool: Estimate locally weighted scatterplot smoothing. Note that this will eliminate confidence interval plotting.[default: False]
ci: Float % Range(0, 100): Size of the confidence interval for the regression estimate.[default: 95]
formula: Str: R-style formula to use for model specification. A formula must be used if the "metric" parameter is None. Note that the metric and group columns specified in the formula will override metric and group columns that are passed separately as parameters to this method. Formulae will be in the format "a ~ b + c", where "a" is the metric (dependent variable) and "b" and "c" are independent covariates. Use "+" to add a variable; "+ a:b" to add an interaction between variables a and b; "*" to include a variable and all interactions; and "-" to subtract a particular term (e.g., an interaction term). See https://patsy.readthedocs.io/en/latest/formulas.html for full documentation of valid formula operators. Always enclose formulae in quotes to avoid unpleasant surprises.[optional]

Outputs¶

visualization: Visualization: <no description>[required]
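
To make the formula operators described above concrete, the sketch below expands the "*" operator into the main effects and interaction terms it stands for, using ":" for interactions and "+" to join terms. The helper expand_star and the column names shannon, month, and diet are hypothetical; the plugin itself delegates formula parsing to patsy.

```python
from itertools import combinations


def expand_star(variables):
    """Expand 'a * b * ...' into main effects plus all interactions."""
    terms = []
    for order in range(1, len(variables) + 1):
        # order 1 gives main effects, order 2 pairwise interactions, etc.
        terms.extend(":".join(combo) for combo in combinations(variables, order))
    return " + ".join(terms)


# "shannon ~ month * diet" written out term by term.
formula = f"shannon ~ {expand_star(['month', 'diet'])}"
three_way = expand_star(["a", "b", "c"])
```

Under these assumptions, formula is the long form of "shannon ~ month * diet", and a term such as "month:diet" could be removed again with the "-" operator.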

longitudinal anova¶

Citations¶

Parameters¶

metadata: Metadata: Sample metadata containing formula terms.[required]
formula: Str: R-style formula specifying the model. All terms must be present in the sample metadata or metadata-transformable artifacts and can be continuous or categorical metadata columns. Formulae will be in the format "a ~ b + c", where "a" is the metric (dependent variable) and "b" and "c" are independent covariates. Use "+" to add a variable; "+ a:b" to add an interaction between variables a and b; "*" to include a variable and all interactions; and "-" to subtract a particular term (e.g., an interaction term). See https://patsy.readthedocs.io/en/latest/formulas.html for full documentation of valid formula operators. Always enclose formulae in quotes to avoid unpleasant surprises.[required]
sstype: Str % Choices('I', 'II', 'III'): Type of sum of squares calculation to perform (I, II, or III).[default: 'II']
repeated_measures: Bool: Perform ANOVA as a repeated measures ANOVA. Implemented via statsmodels, which has the following limitations: Currently, only fully balanced within-subject designs are supported. Calculation of between-subject effects and corrections for violation of sphericity are not yet implemented.[default: False]
individual_id_column: Str: The column containing individual ID with repeated measures to account for. This should not be included in the formula.[optional]
rm_aggregate: Bool: If the data set contains more than a single observation per individual ID and cell of the specified model, the data will be aggregated by the mean before running the ANOVA. Only applicable for repeated measures ANOVA.[default: False]

Outputs¶

visualization: Visualization: <no description>[required]
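
The aggregation that rm_aggregate refers to can be sketched as collapsing repeated observations to one mean per individual and model cell. The snippet below is a stand-alone illustration with hypothetical names (aggregate_by_mean, rows), not the statsmodels or plugin code.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_mean(rows):
    """Collapse repeated observations to one mean per (individual, cell)."""
    grouped = defaultdict(list)
    for individual_id, cell, value in rows:
        grouped[(individual_id, cell)].append(value)
    return {key: mean(values) for key, values in grouped.items()}


# subj1 has two observations in the "month0" cell; all others have one.
rows = [
    ("subj1", "month0", 2.0), ("subj1", "month0", 4.0),
    ("subj1", "month6", 5.0),
    ("subj2", "month0", 1.0),
    ("subj2", "month6", 3.0),
]
aggregated = aggregate_by_mean(rows)
```

After aggregation each individual contributes exactly one value per cell, which is what a balanced within-subject design requires.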

longitudinal volatility¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table containing metrics.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[optional]
default_group_column: Str: The default metadata column on which to separate groups for comparison (all categorical metadata columns will be available in the visualization).[optional]
default_metric: Str: Numeric metadata or artifact column to test by default (all numeric metadata columns will be available in the visualization).[optional]
yscale: Str % Choices('linear', 'pow', 'sqrt', 'log'): y-axis scaling strategy to apply.[default: 'linear']

Outputs¶

visualization: Visualization: <no description>[required]

Examples¶

longitudinal_volatility¶

[Command Line]

[Python API]

[Galaxy]

[R API]

wget -O 'metadata.tsv' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/longitudinal/volatility/1/metadata.tsv'

qiime longitudinal volatility \
  --m-metadata-file metadata.tsv \
  --p-state-column month \
  --o-visualization volatility-plot.qzv

from qiime2 import Metadata
from urllib import request
import qiime2.plugins.longitudinal.actions as longitudinal_actions

url = 'https://amplicon-docs.qiime2.org/en/latest/data/examples/longitudinal/volatility/1/metadata.tsv'
fn = 'metadata.tsv'
request.urlretrieve(url, fn)
metadata_md = Metadata.load(fn)

volatility_plot_viz, = longitudinal_actions.volatility(
    metadata=metadata_md,
    state_column='month',
)

Using the Upload Data tool:

On the first tab (Regular), press the Paste/Fetch data button at the bottom.
1. Set "Name" (first text-field) to: metadata.tsv
2. In the larger text-area, copy-and-paste: https://amplicon-docs.qiime2.org/en/latest/data/examples/longitudinal/volatility/1/metadata.tsv
3. ("Type", "Genome", and "Settings" can be ignored)
Press the Start button at the bottom.

Using the qiime2 longitudinal volatility tool:

For "metadata":
- Perform the following steps.
  1. Leave as Metadata from TSV
  2. Set "Metadata Source" to metadata.tsv
Set "state_column" to month
Press the Execute button.

Once completed, for the new entry in your history, use the Edit button to set the name as follows:

(Renaming is optional, but it will make any subsequent steps easier to complete.)

History Name	"Name" to set (be sure to press [Save])
`#: qiime2 longitudinal volatility [...] : visualization.qzv`	`volatility-plot.qzv`

library(reticulate)

Metadata <- import("qiime2")$Metadata
longitudinal_actions <- import("qiime2.plugins.longitudinal.actions")
request <- import("urllib")$request

url <- 'https://amplicon-docs.qiime2.org/en/latest/data/examples/longitudinal/volatility/1/metadata.tsv'
fn <- 'metadata.tsv'
request$urlretrieve(url, fn)
metadata_md <- Metadata$load(fn)

action_results <- longitudinal_actions$volatility(
    metadata=metadata_md,
    state_column='month',
)
volatility_plot_viz <- action_results$visualization

longitudinal plot-feature-volatility¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table containing features found in importances.[required]
importances: FeatureData[Importance]: Feature importance scores.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[optional]
default_group_column: Str: The default metadata column on which to separate groups for comparison (all categorical metadata columns will be available in the visualization).[optional]
yscale: Str % Choices('linear', 'pow', 'sqrt', 'log'): y-axis scaling strategy to apply.[default: 'linear']
importance_threshold: Float % Range(0, None, inclusive_start=False) | Str % Choices('q1', 'q2', 'q3'): Filter feature table to exclude any features with an importance score less than this threshold. Set to "q1", "q2", or "q3" to select the first, second, or third quartile of values. Set to "None" to disable this filter.[optional]
feature_count: Int % Range(1, None) | Str % Choices('all'): Filter feature table to include top N most important features. Set to "all" to include all features.[default: 100]
missing_samples: Str % Choices('error', 'ignore'): How to handle missing samples in metadata. "error" will fail if missing samples are detected. "ignore" will cause the feature table and metadata to be filtered, so that only samples found in both files are retained.[default: 'error']

Outputs¶

visualization: Visualization: <no description>[required]
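
How importance_threshold and feature_count might combine can be sketched in plain Python. This is an assumption-laden illustration rather than the plugin's implementation: filter_importances is a hypothetical helper, the quartiles are computed with the standard library's inclusive method, and the threshold is applied before the top-N cut.

```python
from statistics import quantiles


def filter_importances(importances, importance_threshold=None, feature_count=100):
    """Return feature IDs that pass the threshold, most important first."""
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    if importance_threshold is not None:
        if importance_threshold in ("q1", "q2", "q3"):
            # Translate "q1"/"q2"/"q3" into the matching quartile value.
            cuts = quantiles(importances.values(), n=4, method="inclusive")
            importance_threshold = cuts[int(importance_threshold[1]) - 1]
        # Exclude features scoring less than the threshold.
        ranked = [(f, s) for f, s in ranked if s >= importance_threshold]
    if feature_count != "all":
        ranked = ranked[:feature_count]  # keep the top N features
    return [feature for feature, _ in ranked]


# Hypothetical importance scores for five features.
importances = {"f1": 0.4, "f2": 0.3, "f3": 0.2, "f4": 0.1, "f5": 0.0}
above_median = filter_importances(importances, "q2", "all")
top_two = filter_importances(importances, None, 2)
```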

longitudinal feature-volatility¶

Citations¶

Bokulich et al., 2018; Subramanian et al., 2014; Bokulich et al., 2018

Inputs¶

table: FeatureTable[Frequency]: Feature table containing all features that should be used for target prediction.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing collection time (state) values for each sample. Must contain exclusively numeric values.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[optional]
cv: Int % Range(1, None): Number of k-fold cross-validations to perform.[default: 5]
random_state: Int: Seed used by random number generator.[optional]
n_jobs: Threads: Number of jobs to run in parallel.[default: 1]
n_estimators: Int % Range(1, None): Number of trees to grow for estimation. More trees will improve predictive accuracy up to a threshold level, but will also increase time and memory requirements. This parameter only affects ensemble estimators, such as Random Forest, AdaBoost, ExtraTrees, and GradientBoosting.[default: 100]
estimator: Str % Choices('RandomForestRegressor', 'ExtraTreesRegressor', 'GradientBoostingRegressor', 'AdaBoostRegressor', 'ElasticNet', 'Ridge', 'Lasso', 'KNeighborsRegressor', 'LinearSVR', 'SVR'): Estimator method to use for sample prediction.[default: 'RandomForestRegressor']
parameter_tuning: Bool: Automatically tune hyperparameters using random grid search.[default: False]
missing_samples: Str % Choices('error', 'ignore'): How to handle missing samples in metadata. "error" will fail if missing samples are detected. "ignore" will cause the feature table and metadata to be filtered, so that only samples found in both files are retained.[default: 'error']
importance_threshold: Float % Range(0, None, inclusive_start=False) | Str % Choices('q1', 'q2', 'q3'): Filter feature table to exclude any features with an importance score less than this threshold. Set to "q1", "q2", or "q3" to select the first, second, or third quartile of values. Set to "None" to disable this filter.[optional]
feature_count: Int % Range(1, None) | Str % Choices('all'): Filter feature table to include top N most important features. Set to "all" to include all features.[default: 100]

Outputs¶

filtered_table: FeatureTable[RelativeFrequency]: Feature table containing only important features.[required]
feature_importance: FeatureData[Importance]: Importance of each input feature to model accuracy.[required]
volatility_plot: Visualization: Interactive volatility plot visualization.[required]
accuracy_results: Visualization: Accuracy results visualization.[required]
sample_estimator: SampleEstimator[Regressor]: Trained sample regressor.[required]

longitudinal maturity-index¶

Citations¶

Inputs¶

table: FeatureTable[Frequency]: Feature table containing all features that should be used for target prediction.[required]

Parameters¶

metadata: Metadata: <no description>[required]
state_column: Str: Numeric metadata column containing sampling time (state) data to use as prediction target.[required]
group_by: Str: Categorical metadata column to use for plotting and significance testing between main treatment groups.[required]
control: Str: Value of group_by to use as control group. The regression model will be trained using only control group data, and the maturity scores of other groups consequently will be assessed relative to this group.[required]
individual_id_column: Str: Optional metadata column containing IDs for individual subjects. Adds individual subject (spaghetti) vectors to volatility charts if a column name is provided.[optional]
estimator: Str % Choices('RandomForestRegressor', 'ExtraTreesRegressor', 'GradientBoostingRegressor', 'AdaBoostRegressor[DecisionTree]', 'AdaBoostRegressor[ExtraTrees]', 'ElasticNet', 'Ridge', 'Lasso', 'KNeighborsRegressor', 'LinearSVR', 'SVR'): Regression model to use for prediction.[default: 'RandomForestRegressor']
n_estimators: Int % Range(1, None): Number of trees to grow for estimation. More trees will improve predictive accuracy up to a threshold level, but will also increase time and memory requirements. This parameter only affects ensemble estimators, such as Random Forest, AdaBoost, ExtraTrees, and GradientBoosting.[default: 100]
test_size: Float % Range(0.0, 1.0): Fraction of input samples to exclude from training set and use for classifier testing.[default: 0.5]
step: Float % Range(0.0, 1.0, inclusive_start=False): If optimize_feature_selection is True, step is the percentage of features to remove at each iteration.[default: 0.05]
cv: Int % Range(1, None): Number of k-fold cross-validations to perform.[default: 5]
random_state: Int: Seed used by random number generator.[optional]
n_jobs: Threads: Number of jobs to run in parallel.[default: 1]
parameter_tuning: Bool: Automatically tune hyperparameters using random grid search.[default: False]
optimize_feature_selection: Bool: Automatically optimize input feature selection using recursive feature elimination.[default: False]
stratify: Bool: Evenly stratify training and test data among metadata categories. If True, all values in column must match at least two samples.[default: False]
missing_samples: Str % Choices('error', 'ignore'): How to handle missing samples in metadata. "error" will fail if missing samples are detected. "ignore" will cause the feature table and metadata to be filtered, so that only samples found in both files are retained.[default: 'error']
feature_count: Int % Range(0, None): Filter feature table to include top N most important features. Set to zero to include all features.[default: 50]

Outputs¶

sample_estimator: SampleEstimator[Regressor]: Trained sample estimator.[required]
feature_importance: FeatureData[Importance]: Importance of each input feature to model accuracy.[required]
predictions: SampleData[RegressorPredictions]: Predicted target values for each input sample.[required]
model_summary: Visualization: Summarized parameter and (if enabled) feature selection information for the trained estimator.[required]
accuracy_results: Visualization: Accuracy results visualization.[required]
maz_scores: SampleData[RegressorPredictions]: Microbiota-for-age z-score predictions.[required]
clustermap: Visualization: Heatmap of important feature abundance at each time point in each group.[required]
volatility_plots: Visualization: Interactive volatility plots of MAZ and maturity scores, target (column) predictions, and the sample metadata.[required]
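
The idea of assessing predictions relative to the control group can be illustrated with a z-score sketch. The formula used here (prediction minus the control median at the same state, divided by the control standard deviation at that state) is an assumption about how a microbiota-for-age z-score is commonly defined and is not confirmed by this page; maz_scores and all values are hypothetical.

```python
from statistics import median, stdev


def maz_scores(predicted, control_predicted_by_state):
    """Score each prediction against control predictions at the same state.

    `predicted` is a list of (sample_id, state, predicted_value) tuples.
    `control_predicted_by_state` maps state -> list of control predictions.
    """
    scores = {}
    for sample_id, state, predicted_value in predicted:
        control = control_predicted_by_state[state]
        # Assumed definition: (prediction - control median) / control SD.
        scores[sample_id] = (predicted_value - median(control)) / stdev(control)
    return scores


# Hypothetical control predictions at state 6 and two samples to score.
control = {6: [5.0, 6.0, 7.0]}
scores = maz_scores([("s1", 6, 4.0), ("s2", 6, 6.0)], control)
```

A negative score in this sketch marks a sample whose predicted value lags behind the control group at the same state.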

This QIIME 2 plugin supports methods for analysis of time series data, involving either paired sample comparisons or longitudinal study designs.

version: 2025.10.0.dev0
website: https://github.com/qiime2/q2-longitudinal
user support: Please post to the QIIME 2 forum for help with this plugin: https://forum.qiime2.org
citations: Bokulich et al., 2018

Actions¶

Name	Type	Short Description
nmit	method	Nonparametric microbial interdependence test
first-differences	method	Compute first differences or difference from baseline between sequential states
first-distances	method	Compute first distances or distance from baseline between sequential states
pairwise-differences	visualizer	Paired difference testing and boxplots
pairwise-distances	visualizer	Paired pairwise distance testing and boxplots
linear-mixed-effects	visualizer	Linear mixed effects modeling
anova	visualizer	ANOVA test
volatility	visualizer	Generate interactive volatility plot
plot-feature-volatility	visualizer	Plot longitudinal feature volatility and importances
feature-volatility	pipeline	Feature volatility analysis
maturity-index	pipeline	Microbial maturity index prediction

Artifact Classes¶

Formats¶

longitudinal nmit¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table to use for microbial interdependence test.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
corr_method: Str % Choices('kendall', 'pearson', 'spearman'): The temporal correlation test to be applied.[default: 'kendall']
dist_method: Str % Choices('fro', 'nuc'): Temporal distance method, see numpy.linalg.norm for details.[default: 'fro']

Outputs¶

distance_matrix: DistanceMatrix: The resulting distance matrix.[required]

longitudinal first-differences¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table to optionally use for computing first differences.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
metric: Str: Numerical metadata or artifact column to test.[required]
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']
baseline: Float: A value listed in the state_column metadata column against which all other states should be compared. Toggles calculation of static differences instead of first differences (which are calculated if no value is given for baseline). If a "baseline" value is provided, sample differences at each state are compared against the baseline state, instead of the previous state. Must be a value listed in the state_column.[optional]

Outputs¶

first_differences: SampleData[FirstDifferences]: Series of first differences.[required]

longitudinal first-distances¶

Citations¶

Inputs¶

distance_matrix: DistanceMatrix: Matrix of distances between pairs of samples.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
baseline: Float: A value listed in the state_column metadata column against which all other states should be compared. Toggles calculation of static distances instead of first distances (which are calculated if no value is given for baseline). If a "baseline" value is provided, sample distances at each state are compared against the baseline state, instead of the previous state. Must be a value listed in the state_column.[optional]
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']

Outputs¶

first_distances: SampleData[FirstDifferences]: Series of first distances.[required]

longitudinal pairwise-differences¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table to optionally use for paired comparisons.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
metric: Str: Numerical metadata or artifact column to test.[required]
state_column: Str: Metadata column containing state (e.g., Time) across which samples are paired.[required]
state_1: Str: Baseline state column value.[required]
state_2: Str: State column value to pair with baseline.[required]
individual_id_column: Str: Metadata column containing subject IDs to use for pairing samples. WARNING: if replicates exist for an individual ID at either state_1 or state_2, the outcome depends on replicate_handling: "error" (the default) causes the action to fail, "drop" discards the replicated samples, and "random" randomly selects one member.[required]
group_column: Str: Metadata column on which to separate groups for comparison.[optional]
parametric: Bool: Perform parametric (ANOVA and t-tests) or non-parametric (Kruskal-Wallis, Wilcoxon, and Mann-Whitney U tests) statistical tests.[default: False]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow', 'cividis'): Color palette to use for generating boxplots.[default: 'Set1']
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']

Outputs¶

visualization: Visualization: <no description>[required]

longitudinal pairwise-distances¶

Citations¶

Inputs¶

distance_matrix: DistanceMatrix: Matrix of distances between pairs of samples.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
group_column: Str: Metadata column on which to separate groups for comparison.[required]
state_column: Str: Metadata column containing state (e.g., Time) across which samples are paired.[required]
state_1: Str: Baseline state column value.[required]
state_2: Str: State column value to pair with baseline.[required]
individual_id_column: Str: Metadata column containing subject IDs to use for pairing samples. WARNING: if replicates exist for an individual ID at either state_1 or state_2, the outcome depends on replicate_handling: "error" (the default) causes the action to fail, "drop" discards the replicated samples, and "random" randomly selects one member.[required]
parametric: Bool: Perform parametric (ANOVA and t-tests) or non-parametric (Kruskal-Wallis, Wilcoxon, and Mann-Whitney U tests) statistical tests.[default: False]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow', 'cividis'): Color palette to use for generating boxplots.[default: 'Set1']
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']

Outputs¶

visualization: Visualization: <no description>[required]

longitudinal linear-mixed-effects¶

Citations¶

Bokulich et al., 2018; Seabold & Perktold, 2010

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table containing metric.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
metric: Str: Dependent variable column name. Must be a column name located in the metadata or feature table files.[optional]
group_columns: Str: Comma-separated list (without spaces) of metadata columns to use as independent covariates used to determine mean structure of "metric".[optional]
random_effects: Str: Comma-separated list (without spaces) of metadata columns to use as independent covariates used to determine the variance and covariance structure (random effects) of "metric". To add a random slope, the same value passed to "state_column" should be passed here. A random intercept for each individual is set by default and does not need to be passed here.[optional]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow', 'cividis'): Color palette to use for generating boxplots.[default: 'Set1']
lowess: Bool: Estimate locally weighted scatterplot smoothing. Note that this will eliminate confidence interval plotting.[default: False]
ci: Float % Range(0, 100): Size of the confidence interval for the regression estimate.[default: 95]
formula: Str: R-style formula to use for model specification. A formula must be used if the "metric" parameter is None. Note that the metric and group columns specified in the formula will override metric and group columns that are passed separately as parameters to this method. Formulae will be in the format "a ~ b + c", where "a" is the metric (dependent variable) and "b" and "c" are independent covariates. Use "+" to add a variable; "+ a:b" to add an interaction between variables a and b; "*" to include a variable and all interactions; and "-" to subtract a particular term (e.g., an interaction term). See https://patsy.readthedocs.io/en/latest/formulas.html for full documentation of valid formula operators. Always enclose formulae in quotes to avoid unpleasant surprises.[optional]

Outputs¶

visualization: Visualization: <no description>[required]

longitudinal anova¶

Citations¶

Parameters¶

metadata: Metadata: Sample metadata containing formula terms.[required]
formula: Str: R-style formula specifying the model. All terms must be present in the sample metadata or metadata-transformable artifacts and can be continuous or categorical metadata columns. Formulae will be in the format "a ~ b + c", where "a" is the metric (dependent variable) and "b" and "c" are independent covariates. Use "+" to add a variable; "+ a:b" to add an interaction between variables a and b; "*" to include a variable and all interactions; and "-" to subtract a particular term (e.g., an interaction term). See https://patsy.readthedocs.io/en/latest/formulas.html for full documentation of valid formula operators. Always enclose formulae in quotes to avoid unpleasant surprises.[required]
sstype: Str % Choices('I', 'II', 'III'): Type of sum of squares calculation to perform (I, II, or III).[default: 'II']
repeated_measures: Bool: Perform ANOVA as a repeated measures ANOVA. Implemented via statsmodels, which has the following limitations: Currently, only fully balanced within-subject designs are supported. Calculation of between-subject effects and corrections for violation of sphericity are not yet implemented.[default: False]
individual_id_column: Str: The column containing individual ID with repeated measures to account for. This should not be included in the formula.[optional]
rm_aggregate: Bool: If the data set contains more than a single observation per individual ID and cell of the specified model, the data will be aggregated by the mean before running the ANOVA. Only applicable for repeated measures ANOVA.[default: False]

Outputs¶

visualization: Visualization: <no description>[required]

longitudinal volatility¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table containing metrics.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[optional]
default_group_column: Str: The default metadata column on which to separate groups for comparison (all categorical metadata columns will be available in the visualization).[optional]
default_metric: Str: Numeric metadata or artifact column to test by default (all numeric metadata columns will be available in the visualization).[optional]
yscale: Str % Choices('linear', 'pow', 'sqrt', 'log'): y-axis scaling strategy to apply.[default: 'linear']

Outputs¶

visualization: Visualization: <no description>[required]

Examples¶

longitudinal_volatility¶

Command Line:

wget -O 'metadata.tsv' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/longitudinal/volatility/1/metadata.tsv'

qiime longitudinal volatility \
  --m-metadata-file metadata.tsv \
  --p-state-column month \
  --o-visualization volatility-plot.qzv

Python API:

from qiime2 import Metadata
from urllib import request
import qiime2.plugins.longitudinal.actions as longitudinal_actions

# Download the example metadata and load it as a QIIME 2 Metadata object.
url = 'https://amplicon-docs.qiime2.org/en/latest/data/examples/longitudinal/volatility/1/metadata.tsv'
fn = 'metadata.tsv'
request.urlretrieve(url, fn)
metadata_md = Metadata.load(fn)

# The action returns a results tuple; unpack its single visualization.
volatility_plot_viz, = longitudinal_actions.volatility(
    metadata=metadata_md,
    state_column='month',
)

Galaxy:

Using the Upload Data tool:

On the first tab (Regular), press the Paste/Fetch data button at the bottom.
1. Set "Name" (first text-field) to: metadata.tsv
2. In the larger text-area, copy-and-paste: https://amplicon-docs.qiime2.org/en/latest/data/examples/longitudinal/volatility/1/metadata.tsv
3. ("Type", "Genome", and "Settings" can be ignored)
Press the Start button at the bottom.

Using the qiime2 longitudinal volatility tool:

For "metadata":
  1. Leave as Metadata from TSV
  2. Set "Metadata Source" to metadata.tsv
Set "state_column" to month
Press the Execute button.

Once completed, for the new entry in your history, use the Edit button to set the name as follows:

(Renaming is optional, but it will make any subsequent steps easier to complete.)

History Name	"Name" to set (be sure to press [Save])
`#: qiime2 longitudinal volatility [...] : visualization.qzv`	`volatility-plot.qzv`

R API:

library(reticulate)

Metadata <- import("qiime2")$Metadata
longitudinal_actions <- import("qiime2.plugins.longitudinal.actions")
request <- import("urllib")$request

# Download the example metadata and load it as a QIIME 2 Metadata object.
url <- 'https://amplicon-docs.qiime2.org/en/latest/data/examples/longitudinal/volatility/1/metadata.tsv'
fn <- 'metadata.tsv'
request$urlretrieve(url, fn)
metadata_md <- Metadata$load(fn)

action_results <- longitudinal_actions$volatility(
    metadata=metadata_md,
    state_column='month'
)
volatility_plot_viz <- action_results$visualization


longitudinal plot-feature-volatility¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table containing features found in importances.[required]
importances: FeatureData[Importance]: Feature importance scores.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[optional]
default_group_column: Str: The default metadata column on which to separate groups for comparison (all categorical metadata columns will be available in the visualization).[optional]
yscale: Str % Choices('linear', 'pow', 'sqrt', 'log'): y-axis scaling strategy to apply.[default: 'linear']
importance_threshold: Float % Range(0, None, inclusive_start=False) | Str % Choices('q1', 'q2', 'q3'): Filter feature table to exclude any features with an importance score less than this threshold. Set to "q1", "q2", or "q3" to select the first, second, or third quartile of values. Set to "None" to disable this filter.[optional]
feature_count: Int % Range(1, None) | Str % Choices('all'): Filter feature table to include top N most important features. Set to "all" to include all features.[default: 100]
missing_samples: Str % Choices('error', 'ignore'): How to handle missing samples in metadata. "error" will fail if missing samples are detected. "ignore" will cause the feature table and metadata to be filtered, so that only samples found in both files are retained.[default: 'error']

Outputs¶

visualization: Visualization: <no description>[required]
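
The importance_threshold and feature_count filters described above can be sketched as follows. This is an illustration under stated assumptions, not the plugin's code: the helper `select_features` is hypothetical, quartiles are assumed to be computed by linear interpolation over the observed scores, and the threshold is assumed to be applied before the top-N cut.

```python
from statistics import quantiles


def select_features(importances, importance_threshold=None, feature_count="all"):
    """Return feature IDs that pass both filters, most important first.

    Hypothetical helper for illustration only.
    """
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    if importance_threshold is not None:
        if importance_threshold in ("q1", "q2", "q3"):
            # Assumption: quartiles interpolate linearly between observed scores.
            q1, q2, q3 = quantiles(importances.values(), n=4, method="inclusive")
            importance_threshold = {"q1": q1, "q2": q2, "q3": q3}[importance_threshold]
        # Exclude features scoring below the threshold.
        ranked = [(f, s) for f, s in ranked if s >= importance_threshold]
    if feature_count != "all":
        # Keep only the top N most important features.
        ranked = ranked[:feature_count]
    return [feature for feature, _ in ranked]


# Hypothetical importance scores.
importances = {"f1": 0.4, "f2": 0.3, "f3": 0.2, "f4": 0.1}
above_median = select_features(importances, importance_threshold="q2")
top_one = select_features(importances, feature_count=1)
print(above_median)  # ['f1', 'f2']
print(top_one)       # ['f1']
```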

longitudinal feature-volatility¶

Citations¶

Bokulich et al., 2018; Subramanian et al., 2014; Bokulich et al., 2018

Inputs¶

table: FeatureTable[Frequency]: Feature table containing all features that should be used for target prediction.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing collection time (state) values for each sample. Must contain exclusively numeric values.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[optional]
cv: Int % Range(1, None): Number of k-fold cross-validations to perform.[default: 5]
random_state: Int: Seed used by random number generator.[optional]
n_jobs: Threads: Number of jobs to run in parallel.[default: 1]
n_estimators: Int % Range(1, None): Number of trees to grow for estimation. More trees will improve predictive accuracy up to a threshold level, but will also increase time and memory requirements. This parameter only affects ensemble estimators, such as Random Forest, AdaBoost, ExtraTrees, and GradientBoosting.[default: 100]
estimator: Str % Choices('RandomForestRegressor', 'ExtraTreesRegressor', 'GradientBoostingRegressor', 'AdaBoostRegressor', 'ElasticNet', 'Ridge', 'Lasso', 'KNeighborsRegressor', 'LinearSVR', 'SVR'): Estimator method to use for sample prediction.[default: 'RandomForestRegressor']
parameter_tuning: Bool: Automatically tune hyperparameters using random grid search.[default: False]
missing_samples: Str % Choices('error', 'ignore'): How to handle missing samples in metadata. "error" will fail if missing samples are detected. "ignore" will cause the feature table and metadata to be filtered, so that only samples found in both files are retained.[default: 'error']
importance_threshold: Float % Range(0, None, inclusive_start=False) | Str % Choices('q1', 'q2', 'q3'): Filter feature table to exclude any features with an importance score less than this threshold. Set to "q1", "q2", or "q3" to select the first, second, or third quartile of values. Set to "None" to disable this filter.[optional]
feature_count: Int % Range(1, None) | Str % Choices('all'): Filter feature table to include top N most important features. Set to "all" to include all features.[default: 100]

Outputs¶

filtered_table: FeatureTable[RelativeFrequency]: Feature table containing only important features.[required]
feature_importance: FeatureData[Importance]: Importance of each input feature to model accuracy.[required]
volatility_plot: Visualization: Interactive volatility plot visualization.[required]
accuracy_results: Visualization: Accuracy results visualization.[required]
sample_estimator: SampleEstimator[Regressor]: Trained sample regressor.[required]

longitudinal maturity-index¶

Citations¶

Inputs¶

table: FeatureTable[Frequency]: Feature table containing all features that should be used for target prediction.[required]

Parameters¶

metadata: Metadata: <no description>[required]
state_column: Str: Numeric metadata column containing sampling time (state) data to use as prediction target.[required]
group_by: Str: Categorical metadata column to use for plotting and significance testing between main treatment groups.[required]
control: Str: Value of group_by to use as control group. The regression model will be trained using only control group data, and the maturity scores of other groups consequently will be assessed relative to this group.[required]
individual_id_column: Str: Optional metadata column containing IDs for individual subjects. Adds individual subject (spaghetti) vectors to volatility charts if a column name is provided.[optional]
estimator: Str % Choices('RandomForestRegressor', 'ExtraTreesRegressor', 'GradientBoostingRegressor', 'AdaBoostRegressor[DecisionTree]', 'AdaBoostRegressor[ExtraTrees]', 'ElasticNet', 'Ridge', 'Lasso', 'KNeighborsRegressor', 'LinearSVR', 'SVR'): Regression model to use for prediction.[default: 'RandomForestRegressor']
n_estimators: Int % Range(1, None): Number of trees to grow for estimation. More trees will improve predictive accuracy up to a threshold level, but will also increase time and memory requirements. This parameter only affects ensemble estimators, such as Random Forest, AdaBoost, ExtraTrees, and GradientBoosting.[default: 100]
test_size: Float % Range(0.0, 1.0): Fraction of input samples to exclude from training set and use for model testing.[default: 0.5]
step: Float % Range(0.0, 1.0, inclusive_start=False): If optimize_feature_selection is True, step is the percentage of features to remove at each iteration.[default: 0.05]
cv: Int % Range(1, None): Number of k-fold cross-validations to perform.[default: 5]
random_state: Int: Seed used by random number generator.[optional]
n_jobs: Threads: Number of jobs to run in parallel.[default: 1]
parameter_tuning: Bool: Automatically tune hyperparameters using random grid search.[default: False]
optimize_feature_selection: Bool: Automatically optimize input feature selection using recursive feature elimination.[default: False]
stratify: Bool: Evenly stratify training and test data among metadata categories. If True, all values in column must match at least two samples.[default: False]
missing_samples: Str % Choices('error', 'ignore'): How to handle missing samples in metadata. "error" will fail if missing samples are detected. "ignore" will cause the feature table and metadata to be filtered, so that only samples found in both files are retained.[default: 'error']
feature_count: Int % Range(0, None): Filter feature table to include top N most important features. Set to zero to include all features.[default: 50]

Outputs¶

sample_estimator: SampleEstimator[Regressor]: Trained sample estimator.[required]
feature_importance: FeatureData[Importance]: Importance of each input feature to model accuracy.[required]
predictions: SampleData[RegressorPredictions]: Predicted target values for each input sample.[required]
model_summary: Visualization: Summarized parameter and (if enabled) feature selection information for the trained estimator.[required]
accuracy_results: Visualization: Accuracy results visualization.[required]
maz_scores: SampleData[RegressorPredictions]: Microbiota-for-age z-score predictions.[required]
clustermap: Visualization: Heatmap of important feature abundance at each time point in each group.[required]
volatility_plots: Visualization: Interactive volatility plots of MAZ and maturity scores, target (column) predictions, and the sample metadata.[required]
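
The interplay of group_by, control, and test_size described above (train only on control-group samples, holding out a fraction for testing) can be pictured with this minimal sketch. It is not how the pipeline is implemented; the helper `split_control_samples`, the sample IDs, and the `diet` column are hypothetical, and stratification is ignored.

```python
import random


def split_control_samples(samples, group_by, control, test_size, random_state=None):
    """Split control-group sample IDs into training and test sets.

    Hypothetical helper for illustration only.
    """
    # Only samples whose group_by value equals the control value are used.
    control_ids = sorted(
        sample_id for sample_id, md in samples.items() if md[group_by] == control
    )
    rng = random.Random(random_state)
    rng.shuffle(control_ids)
    # test_size is the fraction of control samples held out for testing.
    n_test = round(len(control_ids) * test_size)
    return control_ids[n_test:], control_ids[:n_test]


# Hypothetical metadata: four control samples and two treated samples.
samples = {
    "c1": {"diet": "control"},
    "c2": {"diet": "control"},
    "c3": {"diet": "control"},
    "c4": {"diet": "control"},
    "t1": {"diet": "treated"},
    "t2": {"diet": "treated"},
}
train_ids, test_ids = split_control_samples(
    samples, group_by="diet", control="control", test_size=0.5, random_state=42
)
```

Samples from the other groups never enter training in this sketch; they would only be scored by the trained model, which is what lets their maturity be assessed relative to the control group.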

This QIIME 2 plugin supports methods for analysis of time series data, involving either paired sample comparisons or longitudinal study designs.

version: 2025.10.0.dev0
website: https://github.com/qiime2/q2-longitudinal
user support: Please post to the QIIME 2 forum for help with this plugin: https://forum.qiime2.org
citations: Bokulich et al., 2018

Actions¶

Name	Type	Short Description
nmit	method	Nonparametric microbial interdependence test
first-differences	method	Compute first differences or difference from baseline between sequential states
first-distances	method	Compute first distances or distance from baseline between sequential states
pairwise-differences	visualizer	Paired difference testing and boxplots
pairwise-distances	visualizer	Paired pairwise distance testing and boxplots
linear-mixed-effects	visualizer	Linear mixed effects modeling
anova	visualizer	ANOVA test
volatility	visualizer	Generate interactive volatility plot
plot-feature-volatility	visualizer	Plot longitudinal feature volatility and importances
feature-volatility	pipeline	Feature volatility analysis
maturity-index	pipeline	Microbial maturity index prediction.

Artifact Classes¶

Formats¶

longitudinal nmit¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table to use for microbial interdependence test.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
corr_method: Str % Choices('kendall', 'pearson', 'spearman'): The temporal correlation test to be applied.[default: 'kendall']
dist_method: Str % Choices('fro', 'nuc'): Temporal distance method, see numpy.linalg.norm for details.[default: 'fro']

Outputs¶

distance_matrix: DistanceMatrix: The resulting distance matrix.[required]

longitudinal first-differences¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table to optionally use for computing first differences.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
metric: Str: Numerical metadata or artifact column to test.[required]
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']
baseline: Float: A value listed in the state_column metadata column against which all other states should be compared. Toggles calculation of static differences instead of first differences (which are calculated if no value is given for baseline). If a "baseline" value is provided, sample differences at each state are compared against the baseline state, instead of the previous state. Must be a value listed in the state_column.[optional]

Outputs¶

first_differences: SampleData[FirstDifferences]: Series of first differences.[required]
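
The difference between first differences and baseline (static) differences, as described for the baseline parameter above, can be sketched for a single subject as follows. This is a minimal illustration, not the plugin's implementation; the helper `first_differences` and the example values are hypothetical, and results are keyed by state rather than by SampleID for brevity.

```python
def first_differences(values_by_state, baseline=None):
    """Compute differences in a metric across states for one subject.

    Hypothetical helper for illustration only. Each result is labeled by
    the later (second) state of the pair.
    """
    states = sorted(values_by_state)
    if baseline is None:
        # First differences: compare each state against the previous state.
        return {
            curr: values_by_state[curr] - values_by_state[prev]
            for prev, curr in zip(states, states[1:])
        }
    if baseline not in values_by_state:
        raise ValueError("baseline must be a value listed in the state column")
    # Static differences: compare every other state against the baseline state.
    return {
        state: values_by_state[state] - values_by_state[baseline]
        for state in states
        if state != baseline
    }


# Hypothetical metric values for one subject at months 0, 1, and 2.
values = {0: 2.0, 1: 3.5, 2: 3.0}
sequential = first_differences(values)
from_baseline = first_differences(values, baseline=0)
print(sequential)     # {1: 1.5, 2: -0.5}
print(from_baseline)  # {1: 1.5, 2: 1.0}
```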

longitudinal first-distances¶

Citations¶

Inputs¶

distance_matrix: DistanceMatrix: Matrix of distances between pairs of samples.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
baseline: Float: A value listed in the state_column metadata column against which all other states should be compared. Toggles calculation of static distances instead of first distances (which are calculated if no value is given for baseline). If a "baseline" value is provided, sample distances at each state are compared against the baseline state, instead of the previous state. Must be a value listed in the state_column.[optional]
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']

Outputs¶

first_distances: SampleData[FirstDifferences]: Series of first distances.[required]
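
The three replicate_handling choices described above can be sketched as follows, treating a replicate as more than one sample for the same individual at the same state. This is a hedged illustration, not the plugin's code; the helper `handle_replicates` and the sample tuples are hypothetical.

```python
import random
from collections import defaultdict


def handle_replicates(samples, replicate_handling="error", random_state=None):
    """Return the sample IDs retained after resolving replicates.

    Hypothetical helper for illustration only. `samples` is a list of
    (sample_id, individual_id, state) tuples.
    """
    groups = defaultdict(list)
    for sample_id, individual_id, state in samples:
        groups[(individual_id, state)].append(sample_id)

    rng = random.Random(random_state)
    kept = []
    for (individual_id, state), ids in groups.items():
        if len(ids) == 1:
            kept.append(ids[0])
        elif replicate_handling == "error":
            raise ValueError(
                f"Replicates detected for individual {individual_id} at state {state}"
            )
        elif replicate_handling == "drop":
            continue  # discard all replicated samples
        elif replicate_handling == "random":
            kept.append(rng.choice(ids))  # keep one representative at random
        else:
            raise ValueError(f"Unknown replicate_handling: {replicate_handling}")
    return kept


# Hypothetical samples: subject s1 has two samples at state 0.
samples = [
    ("a0", "s1", 0),
    ("a0r", "s1", 0),
    ("a1", "s1", 1),
    ("b0", "s2", 0),
]
dropped = handle_replicates(samples, "drop")
sampled = handle_replicates(samples, "random", random_state=0)
print(dropped)  # ['a1', 'b0']
```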

longitudinal pairwise-differences¶

Citations¶

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table to optionally use for paired comparisons.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
metric: Str: Numerical metadata or artifact column to test.[required]
state_column: Str: Metadata column containing state (e.g., Time) across which samples are paired.[required]
state_1: Str: Baseline state column value.[required]
state_2: Str: State column value to pair with baseline.[required]
individual_id_column: Str: Metadata column containing subject IDs to use for pairing samples. WARNING: if replicates exist for an individual ID at either state_1 or state_2, that subject will be dropped and reported in standard output by default. Set replicate_handling="random" to instead randomly select one member.[required]
group_column: Str: Metadata column on which to separate groups for comparison.[optional]
parametric: Bool: Perform parametric (ANOVA and t-tests) or non-parametric (Kruskal-Wallis, Wilcoxon, and Mann-Whitney U tests) statistical tests.[default: False]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow', 'cividis'): Color palette to use for generating boxplots.[default: 'Set1']
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']

Outputs¶

visualization: Visualization: <no description>[required]

longitudinal pairwise-distances¶

Citations¶

Inputs¶

distance_matrix: DistanceMatrix: Matrix of distances between pairs of samples.[required]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
group_column: Str: Metadata column on which to separate groups for comparison.[required]
state_column: Str: Metadata column containing state (e.g., Time) across which samples are paired.[required]
state_1: Str: Baseline state column value.[required]
state_2: Str: State column value to pair with baseline.[required]
individual_id_column: Str: Metadata column containing subject IDs to use for pairing samples. WARNING: if replicates exist for an individual ID at either state_1 or state_2, that subject will be dropped and reported in standard output by default. Set replicate_handling="random" to instead randomly select one member.[required]
parametric: Bool: Perform parametric (ANOVA and t-tests) or non-parametric (Kruskal-Wallis, Wilcoxon, and Mann-Whitney U tests) statistical tests.[default: False]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow', 'cividis'): Color palette to use for generating boxplots.[default: 'Set1']
replicate_handling: Str % Choices('error', 'random', 'drop'): Choose how replicate samples are handled. If replicates are detected, "error" causes method to fail; "drop" will discard all replicated samples; "random" chooses one representative at random from among replicates.[default: 'error']

Outputs¶

visualization: Visualization: <no description>[required]

longitudinal linear-mixed-effects¶

Citations¶

Bokulich et al., 2018; Seabold & Perktold, 2010

Inputs¶

table: FeatureTable[RelativeFrequency]: Feature table containing metric.[optional]

Parameters¶

metadata: Metadata: Sample metadata file containing individual_id_column.[required]
state_column: Str: Metadata column containing state (time) variable information.[required]
individual_id_column: Str: Metadata column containing IDs for individual subjects.[required]
metric: Str: Dependent variable column name. Must be a column name located in the metadata or feature table files.[optional]
group_columns: Str: Comma-separated list (without spaces) of metadata columns to use as independent covariates used to determine mean structure of "metric".[optional]
random_effects: Str: Comma-separated list (without spaces) of metadata columns to use as independent covariates used to determine the variance and covariance structure (random effects) of "metric". To add a random slope, the same value passed to "state_column" should be passed here. A random intercept for each individual is set by default and does not need to be passed here.[optional]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow', 'cividis'): Color palette to use for generating boxplots.[default: 'Set1']
lowess: Bool: Estimate locally weighted scatterplot smoothing. Note that this will eliminate confidence interval plotting.[default: False]
ci: Float % Range(0, 100): Size of the confidence interval for the regression estimate.[default: 95]
formula: Str: R-style formula to use for model specification. A formula must be used if the "metric" parameter is None. Note that the metric and group columns specified in the formula will override metric and group columns that are passed separately as parameters to this method. Formulae will be in the format "a ~ b + c", where "a" is the metric (dependent variable) and "b" and "c" are independent covariates. Use "+" to add a variable; "+ a:b" to add an interaction between variables a and b; "*" to include a variable and all interactions; and "-" to subtract a particular term (e.g., an interaction term). See https://patsy.readthedocs.io/en/latest/formulas.html for full documentation of valid formula operators. Always enclose formulae in quotes to avoid unpleasant surprises.[optional]

Outputs¶

visualization: Visualization: <no description>[required]
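
The formula operators described above can be made concrete with a small sketch showing what "*" expands to: each variable plus every interaction among them, written with "+" and ":". The helpers `expand_star` and `build_formula` and the column names are hypothetical; the sketch only manipulates strings and does not call patsy or the plugin.

```python
from itertools import combinations


def expand_star(variables):
    """Expand "a * b * ..." into main effects plus every interaction term.

    Hypothetical helper for illustration only.
    """
    terms = []
    for order in range(1, len(variables) + 1):
        for combo in combinations(variables, order):
            # Order 1 gives main effects; higher orders give "a:b" interactions.
            terms.append(":".join(combo))
    return terms


def build_formula(metric, variables):
    """Build an R-style formula string equivalent to "metric ~ a * b * ..."."""
    return f"{metric} ~ {' + '.join(expand_star(variables))}"


# Hypothetical column names.
formula = build_formula("shannon", ["month", "diet"])
print(formula)  # shannon ~ month + diet + month:diet
```

The printed formula specifies the same model terms as "shannon ~ month * diet"; subtracting the interaction ("shannon ~ month * diet - month:diet") leaves only the two main effects.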

longitudinal anova¶

Citations¶

Parameters¶

metadata: Metadata: Sample metadata containing formula terms.[required]
formula: Str: R-style formula specifying the model. All terms must be present in the sample metadata or metadata-transformable artifacts and can be continuous or categorical metadata columns. Formulae will be in the format "a ~ b + c", where "a" is the metric (dependent variable) and "b" and "c" are independent covariates. Use "+" to add a variable; "+ a:b" to add an interaction between variables a and b; "*" to include a variable and all interactions; and "-" to subtract a particular term (e.g., an interaction term). See https://patsy.readthedocs.io/en/latest/formulas.html for full documentation of valid formula operators. Always enclose formulae in quotes to avoid unpleasant surprises.[required]
sstype: Str % Choices('I', 'II', 'III'): Type of sum of squares calculation to perform (I, II, or III).[default: 'II']
repeated_measures: Bool: Perform ANOVA as a repeated measures ANOVA. Implemented via statsmodels, which has the following limitations: Currently, only fully balanced within-subject designs are supported. Calculation of between-subject effects and corrections for violation of sphericity are not yet implemented.[default: False]
individual_id_column: Str: The column containing individual IDs with repeated measures to account for. This should not be included in the formula.[optional]
rm_aggregate: Bool: If the data set contains more than one observation per individual ID and cell of the specified model, aggregate the data by the mean before running the ANOVA. Only applicable to repeated measures ANOVA.[default: False]

Outputs¶

visualization: Visualization: <no description>[required]

longitudinal feature-volatility¶

Citations¶