q2-quality-control

Formats¶

quality-control exclude-seqs¶

This method aligns feature sequences to a set of reference sequences to identify sequences that hit/miss the reference within a specified perc_identity, evalue, and perc_query_aligned. This method could be used to define a positive filter, e.g., extract only feature sequences that align to a certain clade of bacteria; or to define a negative filter, e.g., identify sequences that align to contaminant or human DNA sequences that should be excluded from subsequent analyses. Note that filtering is performed based on the perc_identity, perc_query_aligned, and evalue thresholds (the latter only if method==BLAST and an evalue is set). Set perc_identity==0 and/or perc_query_aligned==0 to disable these filtering thresholds as necessary.
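
The threshold logic described above can be sketched in a few lines of Python. This is an illustration only, not the plugin's implementation: the `Alignment` record and the `split_hits` function are hypothetical names, and the sketch assumes the best alignment for each query has already been computed by BLAST or VSEARCH.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set, Tuple


@dataclass
class Alignment:
    """Hypothetical summary of one query's best alignment to the reference."""
    query_id: str
    identity: float       # fraction of identical positions, 0.0 to 1.0
    query_aligned: float  # fraction of the query covered, 0.0 to 1.0
    evalue: float


def split_hits(
    query_ids: Iterable[str],
    alignments: Iterable[Alignment],
    perc_identity: float = 0.97,
    perc_query_aligned: float = 0.97,
    evalue: Optional[float] = None,
) -> Tuple[Set[str], Set[str]]:
    """Partition query IDs into (hits, misses) using the three thresholds."""
    all_ids = set(query_ids)
    hits = set()
    for aln in alignments:
        if aln.identity < perc_identity:
            continue  # a threshold of 0.0 never rejects, i.e. it is disabled
        if aln.query_aligned < perc_query_aligned:
            continue
        if evalue is not None and aln.evalue > evalue:
            continue  # the E-value check applies only when a threshold is set
        hits.add(aln.query_id)
    hits &= all_ids
    return hits, all_ids - hits
```

For a positive filter, one would keep the hits; for a negative filter (e.g., contaminant removal), one would keep the misses.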

Citations¶

Inputs¶

query_sequences: FeatureData[Sequence]: Sequences to test for exclusion[required]
reference_sequences: FeatureData[Sequence]: Reference sequences to align against feature sequences[required]

Parameters¶

method: Str % Choices('blast', 'blastn-short') | Str % Choices('vsearch'): Alignment method to use for matching feature sequences against reference sequences[default: 'blast']
perc_identity: Float % Range(0.0, 1.0, inclusive_end=True): Reject match if percent identity to reference is lower. Must be in range [0.0, 1.0][default: 0.97]
evalue: Float: BLAST expectation (E) value threshold for saving hits. Reject if E value is higher than threshold. This threshold is disabled by default.[optional]
perc_query_aligned: Float: Percent of query sequence that must align to reference in order to be accepted as a hit.[default: 0.97]
threads: Threads: Number of threads to use. Only applies to vsearch method.[default: 1]
left_justify: Bool % Choices(False) | Bool: Reject match if the pairwise alignment begins with gaps[default: False]

Outputs¶

sequence_hits: FeatureData[Sequence]: Subset of feature sequences that align to reference sequences[required]
sequence_misses: FeatureData[Sequence]: Subset of feature sequences that do not align to reference sequences[required]

quality-control filter-reads¶

Filter out (or keep) demultiplexed single- or paired-end sequences that align to a reference database, using bowtie2 and samtools. This method can be used to filter out human DNA sequences and other contaminants in any FASTQ sequence data (e.g., shotgun genome or amplicon sequence data), or alternatively (when exclude_seqs is False) to only keep sequences that do align to the reference.

Citations¶

Langmead & Salzberg, 2012; Li et al., 2009

Inputs¶

demultiplexed_sequences: SampleData[SequencesWithQuality¹ | PairedEndSequencesWithQuality²]: The sequences to be filtered.[required]
database: Bowtie2Index: Bowtie2 indexed database.[required]

Parameters¶

n_threads: Threads: Number of alignment threads to launch.[default: 1]
mode: Str % Choices('local', 'global'): Bowtie2 alignment settings. See bowtie2 manual for more details.[default: 'local']
sensitivity: Str % Choices('very-fast', 'fast', 'sensitive', 'very-sensitive'): Bowtie2 alignment sensitivity. See bowtie2 manual for details.[default: 'sensitive']
ref_gap_open_penalty: Int % Range(1, None): Reference gap open penalty.[default: 5]
ref_gap_ext_penalty: Int % Range(1, None): Reference gap extend penalty.[default: 3]
exclude_seqs: Bool: Exclude sequences that align to reference. Set this option to False to exclude sequences that do not align to the reference database.[default: True]

Outputs¶

filtered_sequences: SampleData[SequencesWithQuality¹ | PairedEndSequencesWithQuality²]: The resulting filtered sequences.[required]
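
The effect of exclude_seqs can be sketched as a simple selection over read IDs. This is a hedged illustration, not the plugin's code: `select_reads` is a hypothetical helper, and `aligned_ids` stands in for the set of reads that bowtie2 reported as aligned.

```python
from typing import Iterable, List


def select_reads(
    read_ids: Iterable[str],
    aligned_ids: Iterable[str],
    exclude_seqs: bool = True,
) -> List[str]:
    """Return the read IDs that survive filtering, in input order."""
    aligned = set(aligned_ids)
    if exclude_seqs:
        # Default: drop reads that align to the reference (e.g., host DNA).
        return [r for r in read_ids if r not in aligned]
    # exclude_seqs=False: keep only reads that do align to the reference.
    return [r for r in read_ids if r in aligned]
```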

quality-control bowtie2-build¶

Build bowtie2 index from reference sequences.

Citations¶

Langmead & Salzberg, 2012

Inputs¶

sequences: FeatureData[Sequence]: Reference sequences used to build bowtie2 index.[required]

Parameters¶

n_threads: Threads: Number of threads to launch.[default: 1]

Outputs¶

database: Bowtie2Index: Bowtie2 index.[required]

quality-control decontam-identify¶

This method identifies contaminant sequences from an OTU or ASV table and reports them to the user.

Inputs¶

table: FeatureTable[Frequency]: Feature table from which contaminant sequences will be identified[required]

Parameters¶

metadata: Metadata: Metadata file indicating which samples in the experiment are control samples. Sample names in the file are assumed to correspond to those in the table input.[required]
method: Str % Choices('combined', 'frequency', 'prevalence'): Select which method to use to identify contaminants. Prevalence: uses control ASVs/OTUs to identify contaminants. Frequency: uses sample concentration information to identify contaminants. Combined: uses both the Prevalence and Frequency methods to identify contaminants.[default: 'prevalence']
freq_concentration_column: Str: Input column name that has concentration information for the samples, used in Frequency or Combined methods[optional]
prev_control_column: Str: Input column name containing experimental or control sample metadata, used in Prevalence or Combined methods[optional]
prev_control_indicator: Str: The control sample identifier (e.g. "control" or "blank"), used in Prevalence or Combined methods[optional]

Outputs¶

decontam_scores: FeatureData[DecontamScore]: The resulting table of scores from the decontam algorithm, which scores each feature on how likely it is to be a contaminant sequence[required]

quality-control evaluate-composition¶

This visualizer compares the feature composition of pairs of observed and expected samples containing the same sample ID in two separate feature tables. Typically, feature composition will consist of taxonomy classifications or other semicolon-delimited feature annotations. Taxon accuracy rate, taxon detection rate, and linear regression scores between expected and observed observations are calculated at each semicolon-delimited rank, and plots of per-level accuracy and observation correlations are generated. A histogram of distance between false positive observations and the nearest expected feature is also generated, where distance equals the number of rank differences between the observed feature and the nearest common lineage in the expected feature. This visualizer is most suitable for testing per-run data quality on sequencing runs that contain mock communities or other samples with known composition. It is also suitable for sanity checks of bioinformatics pipeline performance.
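
The false positive distance described above can be read as "how many ranks of the observed lineage fall outside its longest shared prefix with any expected lineage". The following sketch implements that reading; it is an interpretation of the description, not the plugin's code, and the function names are hypothetical.

```python
from typing import Iterable, List


def _ranks(lineage: str) -> List[str]:
    """Split a semicolon-delimited lineage into stripped rank labels."""
    return [r.strip() for r in lineage.split(";")]


def _shared_prefix_len(a: List[str], b: List[str]) -> int:
    """Number of leading ranks that two lineages have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def false_positive_distance(observed: str, expected: Iterable[str]) -> int:
    """Ranks of `observed` beyond its nearest common lineage in `expected`."""
    obs = _ranks(observed)
    best = max((_shared_prefix_len(obs, _ranks(e)) for e in expected), default=0)
    return len(obs) - best
```

Under this reading, a false positive that differs from an expected feature only at the last rank has distance 1, and one that shares no ranks with any expected feature has a distance equal to its own depth.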

Citations¶

Bokulich et al., 2018

Inputs¶

expected_features: FeatureTable[RelativeFrequency]: Expected feature compositions[required]
observed_features: FeatureTable[RelativeFrequency]: Observed feature compositions[required]

Parameters¶

depth: Int: Maximum depth of semicolon-delimited taxonomic ranks to test (e.g., 1 = root, 7 = species for the Greengenes reference sequence database).[default: 7]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow'): Color palette to utilize for plotting.[default: 'Set1']
plot_tar: Bool: Plot taxon accuracy rate (TAR) on score plot. TAR is the number of true positive features divided by the total number of observed features (TAR = true positives / (true positives + false positives)).[default: True]
plot_tdr: Bool: Plot taxon detection rate (TDR) on score plot. TDR is the number of true positive features divided by the total number of expected features (TDR = true positives / (true positives + false negatives)).[default: True]
plot_r_value: Bool: Plot expected vs. observed linear regression r value on score plot.[default: False]
plot_r_squared: Bool: Plot expected vs. observed linear regression r-squared value on score plot.[default: True]
plot_bray_curtis: Bool: Plot expected vs. observed Bray-Curtis dissimilarity scores on score plot.[default: False]
plot_jaccard: Bool: Plot expected vs. observed Jaccard distance scores on score plot.[default: False]
plot_observed_features: Bool: Plot observed features count on score plot.[default: False]
plot_observed_features_ratio: Bool: Plot ratio of observed:expected features on score plot.[default: True]
metadata: MetadataColumn[Categorical]: Optional sample metadata that maps observed_features sample IDs to expected_features sample IDs.[optional]
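
The TAR and TDR formulas given for plot_tar and plot_tdr can be worked through per rank as follows. This is a minimal sketch that assumes presence/absence of lineages truncated at each level; the function names are hypothetical and this is not the plugin's implementation.

```python
from typing import Dict, Iterable, Tuple


def truncate(lineage: str, level: int) -> str:
    """Keep only the first `level` semicolon-delimited ranks."""
    return ";".join(r.strip() for r in lineage.split(";")[:level])


def tar_tdr_by_level(
    observed: Iterable[str],
    expected: Iterable[str],
    depth: int = 7,
) -> Dict[int, Tuple[float, float]]:
    """Return {level: (TAR, TDR)} for levels 1..depth."""
    observed, expected = list(observed), list(expected)
    results = {}
    for level in range(1, depth + 1):
        obs = {truncate(t, level) for t in observed}
        exp = {truncate(t, level) for t in expected}
        tp = len(obs & exp)   # observed and expected
        fp = len(obs - exp)   # observed but not expected
        fn = len(exp - obs)   # expected but not observed
        tar = tp / (tp + fp) if tp + fp else 0.0
        tdr = tp / (tp + fn) if tp + fn else 0.0
        results[level] = (tar, tdr)
    return results
```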

Outputs¶

visualization: Visualization: <no description>[required]

quality-control evaluate-seqs¶

This action aligns a set of query (e.g., observed) sequences against a set of reference (e.g., expected) sequences to evaluate the quality of alignment. The intended use is to align observed sequences against expected sequences (e.g., from a mock community) to determine the frequency of mismatches between observed sequences and the most similar expected sequences, e.g., as a measure of sequencing/method error. However, any sequences may be provided as input to generate a report on pairwise alignment quality against a set of reference sequences.
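
As an illustration of what "frequency of mismatches" means for one query and its most similar reference sequence, the sketch below tallies mismatches and gaps from a pair of already-aligned strings. It assumes the alignment has been computed elsewhere; `alignment_summary` is a hypothetical helper, not part of the plugin.

```python
from typing import Dict, Union


def alignment_summary(
    aligned_query: str,
    aligned_reference: str,
    gap: str = "-",
) -> Dict[str, Union[int, float]]:
    """Count matches, mismatches, and gap columns in a pairwise alignment."""
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    matches = mismatches = gaps = 0
    for q, r in zip(aligned_query, aligned_reference):
        if q == gap or r == gap:
            gaps += 1
        elif q == r:
            matches += 1
        else:
            mismatches += 1
    length = len(aligned_query)
    return {
        "length": length,
        "mismatches": mismatches,
        "gaps": gaps,
        "identity": matches / length if length else 0.0,
    }
```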

Citations¶

Inputs¶

query_sequences: FeatureData[Sequence]: Query (e.g., observed) sequences to align against the reference sequences[required]
reference_sequences: FeatureData[Sequence]: Reference sequences to align against feature sequences[required]

Parameters¶

show_alignments: Bool: Option to plot pairwise alignments of query sequences and their top hits.[default: False]

Outputs¶

visualization: Visualization: <no description>[required]

quality-control evaluate-taxonomy¶

This visualizer compares a pair of observed and expected taxonomic assignments to calculate precision, recall, and F-measure at each taxonomic level, up to maximum level specified by the depth parameter. These metrics are calculated at each semicolon-delimited rank. This action is useful for comparing the accuracy of taxonomic assignment, e.g., between different taxonomy classifiers or other bioinformatics methods. Expected taxonomies should be derived from simulated or mock community sequences that have known taxonomic affiliations.
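
Precision, recall, and F-measure at each rank can be sketched as below. The way true positives, false positives, and false negatives are counted here is an assumption made for illustration (a match is a true positive, a differing assignment is a false positive, and an observed taxonomy that does not reach the level is a false negative); the plugin's exact definitions and its optional frequency weighting are not reproduced.

```python
from typing import Dict, List, Tuple


def _ranks(taxonomy: str) -> List[str]:
    return [r.strip() for r in taxonomy.split(";")]


def prf_by_level(
    expected: Dict[str, str],
    observed: Dict[str, str],
    depth: int,
) -> Dict[int, Tuple[float, float, float]]:
    """Return {level: (precision, recall, F-measure)} for shared feature IDs."""
    shared = expected.keys() & observed.keys()
    results = {}
    for level in range(1, depth + 1):
        tp = fp = fn = 0
        for fid in shared:
            exp, obs = _ranks(expected[fid]), _ranks(observed[fid])
            if len(obs) < level:
                fn += 1  # assumed: under-classified at this level
            elif obs[:level] == exp[:level]:
                tp += 1
            else:
                fp += 1
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        results[level] = (p, r, f)
    return results
```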

Citations¶

Bokulich et al., 2018

Inputs¶

expected_taxa: FeatureData[Taxonomy]: Expected taxonomic assignments[required]
observed_taxa: FeatureData[Taxonomy]: Observed taxonomic assignments[required]
feature_table: FeatureTable[RelativeFrequency]: Optional feature table containing relative frequency of each feature, used to weight accuracy scores by frequency. Must contain all features found in expected and/or observed taxa. Features found in the table but not the expected/observed taxa will be dropped prior to analysis.[optional]

Parameters¶

depth: Int: Maximum depth of semicolon-delimited taxonomic ranks to test (e.g., 1 = root, 7 = species for the Greengenes reference sequence database).[required]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow'): Color palette to utilize for plotting.[default: 'Set1']
require_exp_ids: Bool: Require that all features found in observed taxa must be found in expected taxa or raise error.[default: True]
require_obs_ids: Bool: Require that all features found in expected taxa must be found in observed taxa or raise error.[default: True]
sample_id: Str: Optional sample ID to use for extracting frequency data from feature table, and for labeling accuracy results. If no sample_id is provided, feature frequencies are derived from the sum of all samples present in the feature table.[optional]

Outputs¶

visualization: Visualization: <no description>[required]

quality-control decontam-score-viz¶

Creates a histogram based on the output of decontam-identify.

Inputs¶

decontam_scores: Collection[FeatureData[DecontamScore]]: Output from decontam identify to be visualized[required]
table: Collection[FeatureTable[Frequency]]: Raw OTU/ASV table that was used as input to decontam-identify[required]
rep_seqs: FeatureData[Sequence]: Representative sequences table from which contaminant sequences will be removed[optional]

Parameters¶

threshold: Float % Range(0.0, 1.0, inclusive_end=True): Select threshold cutoff for decontam algorithm scores[default: 0.1]
weighted: Bool: Weight the decontam scores by their associated read number[default: True]
bin_size: Float % Range(0.0, 1.0, inclusive_end=True): Select bin size for the histogram[default: 0.02]

Outputs¶

visualization: Visualization: <no description>[required]
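
How threshold, weighted, and bin_size might interact can be sketched as follows. This is a hedged illustration, not the visualizer's code: `summarize_scores` is a hypothetical helper, and it assumes (following the convention of the decontam R package) that a score below the threshold flags a feature as a contaminant.

```python
import math
from collections import Counter
from typing import Dict, List, Optional, Tuple


def summarize_scores(
    scores: Dict[str, float],
    threshold: float = 0.1,
    bin_size: float = 0.02,
    read_counts: Optional[Dict[str, int]] = None,
) -> Tuple[List[str], Dict[int, int]]:
    """Return (flagged feature IDs, histogram keyed by bin index).

    Bin i covers scores in [i * bin_size, (i + 1) * bin_size).
    If read_counts is given, each feature contributes its read count
    to its bin instead of 1 (the "weighted" case).
    """
    # Assumption: lower scores indicate likelier contaminants.
    flagged = sorted(f for f, s in scores.items() if s < threshold)
    histogram: Counter = Counter()
    for feature, score in scores.items():
        weight = read_counts[feature] if read_counts is not None else 1
        histogram[math.floor(score / bin_size)] += weight
    return flagged, dict(histogram)
```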

quality-control decontam-identify-batches¶

This pipeline splits an OTU or ASV table into batches based on the given metadata, identifies contaminant sequences in each batch, and reports them to the user.
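
The batching step can be pictured with plain dictionaries. The sketch below is an assumption-laden illustration rather than the pipeline's code: the table is modeled as {sample ID: {feature ID: count}}, `split_values` maps each sample ID to its value in the split column, and `split_table` is a hypothetical name.

```python
from collections import defaultdict
from typing import Dict

Table = Dict[str, Dict[str, int]]  # sample ID -> feature ID -> count


def split_table(
    table: Table,
    split_values: Dict[str, str],
    filter_empty_features: bool = True,
) -> Dict[str, Table]:
    """Group samples by metadata value; optionally drop all-zero features."""
    batches: Dict[str, Table] = defaultdict(dict)
    for sample_id, counts in table.items():
        batches[split_values[sample_id]][sample_id] = dict(counts)
    if filter_empty_features:
        for batch in batches.values():
            # Features with a nonzero count in at least one sample of the batch.
            present = {
                f for counts in batch.values() for f, n in counts.items() if n > 0
            }
            for sample_id in list(batch):
                batch[sample_id] = {
                    f: n for f, n in batch[sample_id].items() if f in present
                }
    return dict(batches)
```

Each resulting batch would then be scored independently, which is why a feature absent from a batch can be dropped from that batch's table.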

Inputs¶

table: FeatureTable[Frequency]: Feature table from which contaminant sequences will be identified[required]
rep_seqs: FeatureData[Sequence]: Representative sequences table from which contaminant sequences will be removed[optional]

Parameters¶

metadata: Metadata: Metadata file indicating which samples in the experiment are control samples. Sample names in the file are assumed to correspond to those in the table input.[required]
split_column: Str: Input metadata columns by which to subset the ASV table. Note: Column names must be in quotes and delimited by a space.[required]
method: Str % Choices('combined', 'frequency', 'prevalence'): Select which method to use to identify contaminants. Prevalence: uses control ASVs/OTUs to identify contaminants. Frequency: uses sample concentration information to identify contaminants. Combined: uses both the Prevalence and Frequency methods to identify contaminants.[required]
filter_empty_features: Bool: If true, features which are not present in a split feature table are dropped.[default: True]
freq_concentration_column: Str: Input column name that has concentration information for the samples, used in Frequency or Combined methods[optional]
prev_control_column: Str: Input column name containing experimental or control sample metadata, used in Prevalence or Combined methods[optional]
prev_control_indicator: Str: The control sample identifier (e.g. "control" or "blank"), used in Prevalence or Combined methods[optional]
threshold: Float: Select threshold cutoff for decontam algorithm scores[default: 0.1]
weighted: Bool: Weight the decontam scores by their associated read number[default: True]
bin_size: Float: Select bin size for the histogram[default: 0.02]

Outputs¶

batch_subset_tables: Collection[FeatureTable[Frequency]]: Directory where feature tables split based on metadata and parameter split_column values should be written.[required]
decontam_scores: Collection[FeatureData[DecontamScore]]: The resulting table of scores from the decontam algorithm, which scores each feature on how likely it is to be a contaminant sequence[required]
score_histograms: Visualization: The visualizer histograms for all decontam score objects generated from the pipeline[required]

This QIIME 2 plugin supports methods for assessing and controlling the quality of feature and sequence data.

version: 2025.10.0.dev0
website: https://github.com/qiime2/q2-quality-control
user support: Please post to the QIIME 2 forum for help with this plugin: https://forum.qiime2.org

Actions¶

Name	Type	Short Description
exclude-seqs	method	Exclude sequences by alignment
filter-reads	method	Filter demultiplexed sequences by alignment to reference database.
bowtie2-build	method	Build bowtie2 index from reference sequences.
decontam-identify	method	Identify contaminants
evaluate-composition	visualizer	Evaluate expected vs. observed taxonomic composition of samples
evaluate-seqs	visualizer	Compare query (observed) vs. reference (expected) sequences.
evaluate-taxonomy	visualizer	Evaluate expected vs. observed taxonomic assignments
decontam-score-viz	visualizer	Generate a histogram representation of the scores
decontam-identify-batches	pipeline	Identify contaminants in Batch Mode

Artifact Classes¶

Formats¶

quality-control exclude-seqs¶

Citations¶

Inputs¶

query_sequences: FeatureData[Sequence]: Sequences to test for exclusion[required]
reference_sequences: FeatureData[Sequence]: Reference sequences to align against feature sequences[required]

Parameters¶

method: Str % Choices('blast', 'blastn-short') | Str % Choices('vsearch'): Alignment method to use for matching feature sequences against reference sequences[default: 'blast']
perc_identity: Float % Range(0.0, 1.0, inclusive_end=True): Reject match if percent identity to reference is lower. Must be in range [0.0, 1.0][default: 0.97]
evalue: Float: BLAST expectation (E) value threshold for saving hits. Reject if E value is higher than threshold. This threshold is disabled by default.[optional]
perc_query_aligned: Float: Percent of query sequence that must align to reference in order to be accepted as a hit.[default: 0.97]
threads: Threads: Number of threads to use. Only applies to vsearch method.[default: 1]
left_justify: Bool % Choices(False) | Bool: Reject match if the pairwise alignment begins with gaps[default: False]

Outputs¶

sequence_hits: FeatureData[Sequence]: Subset of feature sequences that align to reference sequences[required]
sequence_misses: FeatureData[Sequence]: Subset of feature sequences that do not align to reference sequences[required]

quality-control filter-reads¶

Citations¶

Langmead & Salzberg, 2012; Li et al., 2009

Inputs¶

demultiplexed_sequences: SampleData[SequencesWithQuality¹ | PairedEndSequencesWithQuality²]: The sequences to be trimmed.[required]
database: Bowtie2Index: Bowtie2 indexed database.[required]

Parameters¶

n_threads: Threads: Number of alignment threads to launch.[default: 1]
mode: Str % Choices('local', 'global'): Bowtie2 alignment settings. See bowtie2 manual for more details.[default: 'local']
sensitivity: Str % Choices('very-fast', 'fast', 'sensitive', 'very-sensitive'): Bowtie2 alignment sensitivity. See bowtie2 manual for details.[default: 'sensitive']
ref_gap_open_penalty: Int % Range(1, None): Reference gap open penalty.[default: 5]
ref_gap_ext_penalty: Int % Range(1, None): Reference gap extend penalty.[default: 3]
exclude_seqs: Bool: Exclude sequences that align to reference. Set this option to False to exclude sequences that do not align to the reference database.[default: True]

Outputs¶

filtered_sequences: SampleData[SequencesWithQuality¹ | PairedEndSequencesWithQuality²]: The resulting filtered sequences.[required]

quality-control bowtie2-build¶

Build bowtie2 index from reference sequences.

Citations¶

Langmead & Salzberg, 2012

Inputs¶

sequences: FeatureData[Sequence]: Reference sequences used to build bowtie2 index.[required]

Parameters¶

n_threads: Threads: Number of threads to launch.[default: 1]

Outputs¶

database: Bowtie2Index: Bowtie2 index.[required]

quality-control decontam-identify¶

This method identifies contaminant sequences from an OTU or ASV table and reports them to the user

Inputs¶

table: FeatureTable[Frequency]: Feature table which contaminate sequences will be identified from[required]

Parameters¶

metadata: Metadata: metadata file indicating which samples in the experiment are control samples, assumes sample names in file correspond to the table input parameter[required]
method: Str % Choices('combined', 'frequency', 'prevalence'): Select how to which method to id contaminants with; Prevalence: Utilizes control ASVs/OTUs to identify contaminants, Frequency: Utilizes sample concentration information to identify contaminants, Combined: Utilizes both Prevalence and Frequency methods when identifying contaminants[default: 'prevalence']
freq_concentration_column: Str: Input column name that has concentration information for the samples, used in Frequency or Combined methods[optional]
prev_control_column: Str: Input column name containing experimental or control sample metadata, used in Prevalence or Combined methods[optional]
prev_control_indicator: Str: indicate the control sample identifier (e.g. "control" or "blank"), used in Prevalence or Combined methods[optional]

Outputs¶

decontam_scores: FeatureData[DecontamScore]: The resulting table of scores from the decontam algorithm which scores each feature on how likely they are to be a contaminant sequence[required]

quality-control evaluate-composition¶

Citations¶

Bokulich et al., 2018

Inputs¶

expected_features: FeatureTable[RelativeFrequency]: Expected feature compositions[required]
observed_features: FeatureTable[RelativeFrequency]: Observed feature compositions[required]

Parameters¶

depth: Int: Maximum depth of semicolon-delimited taxonomic ranks to test (e.g., 1 = root, 7 = species for the greengenes reference sequence database).[default: 7]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow'): Color palette to utilize for plotting.[default: 'Set1']
plot_tar: Bool: Plot taxon accuracy rate (TAR) on score plot. TAR is the number of true positive features divided by the total number of observed features (TAR = true positives / (true positives + false positives)).[default: True]
plot_tdr: Bool: Plot taxon detection rate (TDR) on score plot. TDR is the number of true positive features divided by the total number of expected features (TDR = true positives / (true positives + false negatives)).[default: True]
plot_r_value: Bool: Plot expected vs. observed linear regression r value on score plot.[default: False]
plot_r_squared: Bool: Plot expected vs. observed linear regression r-squared value on score plot.[default: True]
plot_bray_curtis: Bool: Plot expected vs. observed Bray-Curtis dissimilarity scores on score plot.[default: False]
plot_jaccard: Bool: Plot expected vs. observed Jaccard distances scores on score plot.[default: False]
plot_observed_features: Bool: Plot observed features count on score plot.[default: False]
plot_observed_features_ratio: Bool: Plot ratio of observed:expected features on score plot.[default: True]
metadata: MetadataColumn[Categorical]: Optional sample metadata that maps observed_features sample IDs to expected_features sample IDs.[optional]

Outputs¶

visualization: Visualization: <no description>[required]

quality-control evaluate-seqs¶

Citations¶

Inputs¶

query_sequences: FeatureData[Sequence]: Sequences to test for exclusion[required]
reference_sequences: FeatureData[Sequence]: Reference sequences to align against feature sequences[required]

Parameters¶

show_alignments: Bool: Option to plot pairwise alignments of query sequences and their top hits.[default: False]

Outputs¶

visualization: Visualization: <no description>[required]

quality-control evaluate-taxonomy¶

Citations¶

Bokulich et al., 2018

Inputs¶

expected_taxa: FeatureData[Taxonomy]: Expected taxonomic assignments[required]
observed_taxa: FeatureData[Taxonomy]: Observed taxonomic assignments[required]
feature_table: FeatureTable[RelativeFrequency]: Optional feature table containing relative frequency of each feature, used to weight accuracy scores by frequency. Must contain all features found in expected and/or observed taxa. Features found in the table but not the expected/observed taxa will be dropped prior to analysis.[optional]

Parameters¶

depth: Int: Maximum depth of semicolon-delimited taxonomic ranks to test (e.g., 1 = root, 7 = species for the greengenes reference sequence database).[required]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow'): Color palette to utilize for plotting.[default: 'Set1']
require_exp_ids: Bool: Require that all features found in observed taxa must be found in expected taxa or raise error.[default: True]
require_obs_ids: Bool: Require that all features found in expected taxa must be found in observed taxa or raise error.[default: True]
sample_id: Str: Optional sample ID to use for extracting frequency data from feature table, and for labeling accuracy results. If no sample_id is provided, feature frequencies are derived from the sum of all samples present in the feature table.[optional]

Outputs¶

visualization: Visualization: <no description>[required]

quality-control decontam-score-viz¶

Creates histogram based on the output of decontam identify

Inputs¶

decontam_scores: Collection[FeatureData[DecontamScore]]: Output from decontam identify to be visualized[required]
table: Collection[FeatureTable[Frequency]]: Raw OTU/ASV table that was used as input to decontam-identify[required]
rep_seqs: FeatureData[Sequence]: Representative Sequences table which contaminate sequences will be removed from[optional]

Parameters¶

threshold: Float % Range(0.0, 1.0, inclusive_end=True): Select threshold cutoff for decontam algorithm scores[default: 0.1]
weighted: Bool: weight the decontam scores by their associated read number[default: True]
bin_size: Float % Range(0.0, 1.0, inclusive_end=True): Select bin size for the histogram[default: 0.02]

Outputs¶

visualization: Visualization: <no description>[required]

quality-control decontam-identify-batches¶

This method breaks an ASV table into batches based on the given metadata and identifies contaminant sequences from an OTU or ASV table and reports them to the user

Inputs¶

table: FeatureTable[Frequency]: Feature table which contaminate sequences will be identified from[required]
rep_seqs: FeatureData[Sequence]: Representative Sequences table which contaminate seqeunces will be removed from[optional]

Parameters¶

metadata: Metadata: metadata file indicating which samples in the experiment are control samples, assumes sample names in file correspond to the table input parameter[required]
split_column: Str: input metadata columns that you wish to subset the ASV table byNote: Column names must be in quotes and delimited by a space[required]
method: Str % Choices('combined', 'frequency', 'prevalence'): Select how to which method to id contaminants with; Prevalence: Utilizes control ASVs/OTUs to identify contaminants, Frequency: Utilizes sample concentration information to identify contaminants, Combined: Utilizes both Prevalence and Frequency methods when identifying contaminants[required]
filter_empty_features: Bool: If true, features which are not present in a split feature table are dropped.[default: True]
freq_concentration_column: Str: Input column name that has concentration information for the samples, used in Frequency or Combined methods[optional]
prev_control_column: Str: Input column name containing experimental or control sample metadata, used in Prevalence or Combined methods[optional]
prev_control_indicator: Str: indicate the control sample identifier (e.g. "control" or "blank"), used in Prevalence or Combined methods[optional]
threshold: Float: Select threshold cutoff for decontam algorithm scores[default: 0.1]
weighted: Bool: weight the decontam scores by their associated read number[default: True]
bin_size: Float: Select bin size for the histogram[default: 0.02]

Outputs¶

batch_subset_tables: Collection[FeatureTable[Frequency]]: Directory where feature tables split based on metadata and parameter split_column values should be written.[required]
decontam_scores: Collection[FeatureData[DecontamScore]]: The resulting table of scores from the decontam algorithm which scores each feature on how likely they are to be a contaminant sequence[required]
score_histograms: Visualization: The vizulaizer histograms for all decontam score objects generated from the pipeline[required]

This QIIME 2 plugin supports methods for assessing and controlling the quality of feature and sequence data.

version: 2025.10.0.dev0
website: https://github.com/qiime2/q2-quality-control
user support:: Please post to the QIIME 2 forum for help with this plugin: https://forum.qiime2.org

Actions¶

Name	Type	Short Description
exclude-seqs	method	Exclude sequences by alignment
filter-reads	method	Filter demultiplexed sequences by alignment to reference database.
bowtie2-build	method	Build bowtie2 index from reference sequences.
decontam-identify	method	Identify contaminants
evaluate-composition	visualizer	Evaluate expected vs. observed taxonomic composition of samples
evaluate-seqs	visualizer	Compare query (observed) vs. reference (expected) sequences.
evaluate-taxonomy	visualizer	Evaluate expected vs. observed taxonomic assignments
decontam-score-viz	visualizer	Generate a histogram representation of the scores
decontam-identify-batches	pipeline	Identify contaminants in Batch Mode

Artifact Classes¶

Formats¶

quality-control exclude-seqs¶

Citations¶

Inputs¶

query_sequences: FeatureData[Sequence]: Sequences to test for exclusion[required]
reference_sequences: FeatureData[Sequence]: Reference sequences to align against feature sequences[required]

Parameters¶

method: Str % Choices('blast', 'blastn-short') | Str % Choices('vsearch'): Alignment method to use for matching feature sequences against reference sequences[default: 'blast']
perc_identity: Float % Range(0.0, 1.0, inclusive_end=True): Reject match if percent identity to reference is lower. Must be in range [0.0, 1.0][default: 0.97]
evalue: Float: BLAST expectation (E) value threshold for saving hits. Reject if E value is higher than threshold. This threshold is disabled by default.[optional]
perc_query_aligned: Float: Percent of query sequence that must align to reference in order to be accepted as a hit.[default: 0.97]
threads: Threads: Number of threads to use. Only applies to vsearch method.[default: 1]
left_justify: Bool % Choices(False) | Bool: Reject match if the pairwise alignment begins with gaps[default: False]

Outputs¶

sequence_hits: FeatureData[Sequence]: Subset of feature sequences that align to reference sequences[required]
sequence_misses: FeatureData[Sequence]: Subset of feature sequences that do not align to reference sequences[required]

quality-control filter-reads¶

Citations¶

Langmead & Salzberg, 2012; Li et al., 2009

Inputs¶

demultiplexed_sequences: SampleData[SequencesWithQuality¹ | PairedEndSequencesWithQuality²]: The sequences to be trimmed.[required]
database: Bowtie2Index: Bowtie2 indexed database.[required]

Parameters¶

n_threads: Threads: Number of alignment threads to launch.[default: 1]
mode: Str % Choices('local', 'global'): Bowtie2 alignment settings. See bowtie2 manual for more details.[default: 'local']
sensitivity: Str % Choices('very-fast', 'fast', 'sensitive', 'very-sensitive'): Bowtie2 alignment sensitivity. See bowtie2 manual for details.[default: 'sensitive']
ref_gap_open_penalty: Int % Range(1, None): Reference gap open penalty.[default: 5]
ref_gap_ext_penalty: Int % Range(1, None): Reference gap extend penalty.[default: 3]
exclude_seqs: Bool: Exclude sequences that align to reference. Set this option to False to exclude sequences that do not align to the reference database.[default: True]

Outputs¶

filtered_sequences: SampleData[SequencesWithQuality¹ | PairedEndSequencesWithQuality²]: The resulting filtered sequences.[required]

quality-control bowtie2-build¶

Build bowtie2 index from reference sequences.

Citations¶

Langmead & Salzberg, 2012

Inputs¶

sequences: FeatureData[Sequence]: Reference sequences used to build bowtie2 index.[required]

Parameters¶

n_threads: Threads: Number of threads to launch.[default: 1]

Outputs¶

database: Bowtie2Index: Bowtie2 index.[required]

quality-control decontam-identify¶

This method identifies contaminant sequences from an OTU or ASV table and reports them to the user

Inputs¶

table: FeatureTable[Frequency]: Feature table which contaminate sequences will be identified from[required]

Parameters¶

metadata: Metadata: Metadata file indicating which samples in the experiment are control samples. Assumes that sample names in the file correspond to those in the table input.[required]
method: Str % Choices('combined', 'frequency', 'prevalence'): Method used to identify contaminants. Prevalence: uses control ASVs/OTUs to identify contaminants. Frequency: uses sample concentration information to identify contaminants. Combined: uses both the Prevalence and Frequency methods.[default: 'prevalence']
freq_concentration_column: Str: Input column name that has concentration information for the samples, used in Frequency or Combined methods[optional]
prev_control_column: Str: Input column name containing experimental or control sample metadata, used in Prevalence or Combined methods[optional]
prev_control_indicator: Str: Value that identifies control samples (e.g. "control" or "blank"), used in Prevalence or Combined methods[optional]

Outputs¶

decontam_scores: FeatureData[DecontamScore]: The resulting table of scores from the decontam algorithm, which scores each feature on how likely it is to be a contaminant sequence[required]
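
The parameter descriptions above tie each optional column to particular methods. A minimal sketch of that dependency, useful for checking a configuration before running the action, might look as follows. The function `missing_decontam_params` is hypothetical, and whether the plugin itself enforces these combinations is not stated here.

```python
def missing_decontam_params(method,
                            freq_concentration_column=None,
                            prev_control_column=None,
                            prev_control_indicator=None):
    """List the optional parameters the chosen method appears to need
    but that were not supplied (hypothetical pre-flight check)."""
    if method not in ("combined", "frequency", "prevalence"):
        raise ValueError(f"unknown method: {method!r}")
    needed = {}
    # Concentration data is used by the Frequency and Combined methods.
    if method in ("frequency", "combined"):
        needed["freq_concentration_column"] = freq_concentration_column
    # Control-sample information is used by Prevalence and Combined.
    if method in ("prevalence", "combined"):
        needed["prev_control_column"] = prev_control_column
        needed["prev_control_indicator"] = prev_control_indicator
    return [name for name, value in needed.items() if value is None]


missing_default = missing_decontam_params("prevalence")
missing_none = missing_decontam_params(
    "combined",
    freq_concentration_column="concentration",  # example column names
    prev_control_column="sample_type",
    prev_control_indicator="blank",
)
```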

quality-control evaluate-composition¶

Citations¶

Bokulich et al., 2018

Inputs¶

expected_features: FeatureTable[RelativeFrequency]: Expected feature compositions[required]
observed_features: FeatureTable[RelativeFrequency]: Observed feature compositions[required]

Parameters¶

depth: Int: Maximum depth of semicolon-delimited taxonomic ranks to test (e.g., 1 = root, 7 = species for the Greengenes reference sequence database).[default: 7]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow'): Color palette to utilize for plotting.[default: 'Set1']
plot_tar: Bool: Plot taxon accuracy rate (TAR) on score plot. TAR is the number of true positive features divided by the total number of observed features (TAR = true positives / (true positives + false positives)).[default: True]
plot_tdr: Bool: Plot taxon detection rate (TDR) on score plot. TDR is the number of true positive features divided by the total number of expected features (TDR = true positives / (true positives + false negatives)).[default: True]
plot_r_value: Bool: Plot expected vs. observed linear regression r value on score plot.[default: False]
plot_r_squared: Bool: Plot expected vs. observed linear regression r-squared value on score plot.[default: True]
plot_bray_curtis: Bool: Plot expected vs. observed Bray-Curtis dissimilarity scores on score plot.[default: False]
plot_jaccard: Bool: Plot expected vs. observed Jaccard distance scores on score plot.[default: False]
plot_observed_features: Bool: Plot observed features count on score plot.[default: False]
plot_observed_features_ratio: Bool: Plot ratio of observed:expected features on score plot.[default: True]
metadata: MetadataColumn[Categorical]: Optional sample metadata that maps observed_features sample IDs to expected_features sample IDs.[optional]

Outputs¶

visualization: Visualization: <no description>[required]
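
The TAR and TDR formulas given for plot_tar and plot_tdr can be worked through on a small example. The sketch below treats features as plain sets of taxon labels. The function name `taxon_rates` and the zero-division fallbacks are assumptions for illustration, not the plugin's code.

```python
def taxon_rates(expected, observed):
    """Compute (TAR, TDR) from expected and observed feature sets.

    TAR = TP / (TP + FP): share of observed features that were expected.
    TDR = TP / (TP + FN): share of expected features that were observed.
    """
    expected, observed = set(expected), set(observed)
    true_pos = len(expected & observed)
    false_pos = len(observed - expected)
    false_neg = len(expected - observed)
    # Fall back to 0.0 when a denominator would be zero (an assumption).
    tar = true_pos / (true_pos + false_pos) if observed else 0.0
    tdr = true_pos / (true_pos + false_neg) if expected else 0.0
    return tar, tdr


expected_taxa = {"A", "B", "C", "D"}
observed_taxa = {"A", "B", "E"}
tar, tdr = taxon_rates(expected_taxa, observed_taxa)
```

Here two of the three observed taxa were expected, so TAR is 2/3, and two of the four expected taxa were detected, so TDR is 1/2.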

quality-control evaluate-seqs¶

Citations¶

Inputs¶

query_sequences: FeatureData[Sequence]: Sequences to test for exclusion[required]
reference_sequences: FeatureData[Sequence]: Reference sequences to align against feature sequences[required]

Parameters¶

show_alignments: Bool: Option to plot pairwise alignments of query sequences and their top hits.[default: False]

Outputs¶

visualization: Visualization: <no description>[required]

quality-control evaluate-taxonomy¶

Citations¶

Bokulich et al., 2018

Inputs¶

expected_taxa: FeatureData[Taxonomy]: Expected taxonomic assignments[required]
observed_taxa: FeatureData[Taxonomy]: Observed taxonomic assignments[required]
feature_table: FeatureTable[RelativeFrequency]: Optional feature table containing relative frequency of each feature, used to weight accuracy scores by frequency. Must contain all features found in expected and/or observed taxa. Features found in the table but not the expected/observed taxa will be dropped prior to analysis.[optional]

Parameters¶

depth: Int: Maximum depth of semicolon-delimited taxonomic ranks to test (e.g., 1 = root, 7 = species for the Greengenes reference sequence database).[required]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow'): Color palette to utilize for plotting.[default: 'Set1']
require_exp_ids: Bool: Require that all features found in observed taxa must be found in expected taxa or raise error.[default: True]
require_obs_ids: Bool: Require that all features found in expected taxa must be found in observed taxa or raise error.[default: True]
sample_id: Str: Optional sample ID to use for extracting frequency data from feature table, and for labeling accuracy results. If no sample_id is provided, feature frequencies are derived from the sum of all samples present in the feature table.[optional]

Outputs¶

visualization: Visualization: <no description>[required]
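
The depth parameter refers to ranks in a semicolon-delimited taxonomy string. As a rough sketch of what truncating to a given depth means, assuming ranks are separated by ";" with optional surrounding whitespace (the helper `truncate_taxonomy` is hypothetical, and the plugin may normalize strings differently):

```python
def truncate_taxonomy(taxon, depth):
    """Keep only the first `depth` semicolon-delimited ranks."""
    ranks = [rank.strip() for rank in taxon.split(";")]
    return ";".join(ranks[:depth])


full = "k__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales"
phylum_level = truncate_taxonomy(full, 2)
unchanged = truncate_taxonomy("k__Bacteria; p__Firmicutes", 7)
```

A depth larger than the number of ranks present simply leaves the assignment at its full length.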

quality-control decontam-score-viz¶

Creates a histogram based on the output of decontam-identify.

Inputs¶

decontam_scores: Collection[FeatureData[DecontamScore]]: Output from decontam-identify to be visualized[required]
table: Collection[FeatureTable[Frequency]]: Raw OTU/ASV table that was used as input to decontam-identify[required]
rep_seqs: FeatureData[Sequence]: Representative sequences from which contaminant sequences will be removed[optional]

Parameters¶

threshold: Float % Range(0.0, 1.0, inclusive_end=True): Select threshold cutoff for decontam algorithm scores[default: 0.1]
weighted: Bool: Weight the decontam scores by their associated read number[default: True]
bin_size: Float % Range(0.0, 1.0, inclusive_end=True): Select bin size for the histogram[default: 0.02]

Outputs¶

visualization: Visualization: <no description>[required]
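
How threshold, weighted, and bin_size interact can be illustrated with a small pure-Python histogram. This is a sketch under stated assumptions, not the visualizer's code: the function names are hypothetical, scores are assumed to lie in [0, 1], and scores strictly below the threshold are treated as putative contaminants, following the convention of the decontam R package.

```python
def score_histogram(scores, bin_size=0.02, weights=None):
    """Bin scores in [0, 1] into fixed-width bins.

    With weights (e.g. read counts per feature), each feature adds its
    weight to its bin instead of 1, mirroring the `weighted` option.
    """
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    n_bins = max(1, round(1.0 / bin_size))
    counts = [0.0] * n_bins
    if weights is None:
        weights = [1.0] * len(scores)
    for score, weight in zip(scores, weights):
        # Clamp so that a score of exactly 1.0 lands in the last bin.
        index = min(int(score / bin_size), n_bins - 1)
        counts[index] += weight
    return counts


def below_threshold(scores, threshold=0.1):
    """Scores strictly below the cutoff (assumed contaminant calls)."""
    return [s for s in scores if s < threshold]


scores = [0.01, 0.03, 0.05, 0.51, 0.99]
reads = [100, 10, 5, 1000, 50]  # made-up read counts per feature

unweighted = score_histogram(scores)
weighted = score_histogram(scores, weights=reads)
flagged = below_threshold(scores)
```

Weighting by read number makes abundant features dominate the histogram, which can change how separated the low-score and high-score modes appear.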

quality-control decontam-identify-batches¶

This pipeline splits an OTU or ASV table into batches based on the given metadata, identifies contaminant sequences in each batch, and reports them to the user.

Inputs¶

table: FeatureTable[Frequency]: Feature table from which contaminant sequences will be identified[required]
rep_seqs: FeatureData[Sequence]: Representative sequences from which contaminant sequences will be removed[optional]

Parameters¶

metadata: Metadata: Metadata file indicating which samples in the experiment are control samples. Assumes that sample names in the file correspond to those in the table input.[required]
split_column: Str: Metadata column(s) by which to subset the ASV table. Note: column names must be in quotes and delimited by a space.[required]
method: Str % Choices('combined', 'frequency', 'prevalence'): Method used to identify contaminants. Prevalence: uses control ASVs/OTUs to identify contaminants. Frequency: uses sample concentration information to identify contaminants. Combined: uses both the Prevalence and Frequency methods.[required]
filter_empty_features: Bool: If true, features which are not present in a split feature table are dropped.[default: True]
freq_concentration_column: Str: Input column name that has concentration information for the samples, used in Frequency or Combined methods[optional]
prev_control_column: Str: Input column name containing experimental or control sample metadata, used in Prevalence or Combined methods[optional]
prev_control_indicator: Str: Value that identifies control samples (e.g. "control" or "blank"), used in Prevalence or Combined methods[optional]
threshold: Float: Select threshold cutoff for decontam algorithm scores[default: 0.1]
weighted: Bool: Weight the decontam scores by their associated read number[default: True]
bin_size: Float: Select bin size for the histogram[default: 0.02]

Outputs¶

batch_subset_tables: Collection[FeatureTable[Frequency]]: Directory where feature tables split based on metadata and parameter split_column values should be written.[required]
decontam_scores: Collection[FeatureData[DecontamScore]]: The resulting table of scores from the decontam algorithm, which scores each feature on how likely it is to be a contaminant sequence[required]
score_histograms: Visualization: The visualizer histograms for all decontam score objects generated by the pipeline[required]
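
The batching step and the filter_empty_features option can be sketched with nested dictionaries standing in for a feature table. Everything here is an assumption for illustration: the `split_table` helper, the dict-of-dicts table layout, and the `batch_of` mapping (which plays the role of the split_column values) are not part of the plugin.

```python
from collections import defaultdict


def split_table(table, batch_of, filter_empty_features=True):
    """Split {sample: {feature: count}} into per-batch sub-tables.

    batch_of maps each sample ID to its batch label. With
    filter_empty_features, features with no counts in a batch are
    dropped from that batch's sub-table.
    """
    batches = defaultdict(dict)
    for sample_id, counts in table.items():
        batches[batch_of[sample_id]][sample_id] = dict(counts)
    if filter_empty_features:
        for samples in batches.values():
            # Features observed at least once within this batch.
            present = {f for counts in samples.values()
                       for f, n in counts.items() if n > 0}
            for sample_id in samples:
                samples[sample_id] = {f: n
                                      for f, n in samples[sample_id].items()
                                      if f in present}
    return dict(batches)


table = {
    "s1": {"f1": 5, "f2": 0},
    "s2": {"f1": 3, "f2": 0},
    "s3": {"f1": 0, "f2": 7},
}
batch_of = {"s1": "run1", "s2": "run1", "s3": "run2"}

batches = split_table(table, batch_of)
batches_unfiltered = split_table(table, batch_of, filter_empty_features=False)
```

Each resulting sub-table would then be scored separately, which corresponds to the collections listed under Outputs.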

This QIIME 2 plugin supports methods for assessing and controlling the quality of feature and sequence data.

version: 2025.10.0.dev0
website: https://github.com/qiime2/q2-quality-control
user support: Please post to the QIIME 2 forum for help with this plugin: https://forum.qiime2.org

Actions¶

Name	Type	Short Description
exclude-seqs	method	Exclude sequences by alignment
filter-reads	method	Filter demultiplexed sequences by alignment to reference database.
bowtie2-build	method	Build bowtie2 index from reference sequences.
decontam-identify	method	Identify contaminants
evaluate-composition	visualizer	Evaluate expected vs. observed taxonomic composition of samples
evaluate-seqs	visualizer	Compare query (observed) vs. reference (expected) sequences.
evaluate-taxonomy	visualizer	Evaluate expected vs. observed taxonomic assignments
decontam-score-viz	visualizer	Generate a histogram representation of the scores
decontam-identify-batches	pipeline	Identify contaminants in Batch Mode

Artifact Classes¶

Formats¶

quality-control exclude-seqs¶

Citations¶

Inputs¶

query_sequences: FeatureData[Sequence]: Sequences to test for exclusion[required]
reference_sequences: FeatureData[Sequence]: Reference sequences to align against feature sequences[required]

Parameters¶

method: Str % Choices('blast', 'blastn-short') | Str % Choices('vsearch'): Alignment method to use for matching feature sequences against reference sequences[default: 'blast']
perc_identity: Float % Range(0.0, 1.0, inclusive_end=True): Reject match if percent identity to reference is lower. Must be in range [0.0, 1.0][default: 0.97]
evalue: Float: BLAST expectation (E) value threshold for saving hits. Reject if E value is higher than threshold. This threshold is disabled by default.[optional]
perc_query_aligned: Float: Percent of query sequence that must align to reference in order to be accepted as a hit.[default: 0.97]
threads: Threads: Number of threads to use. Only applies to vsearch method.[default: 1]
left_justify: Bool % Choices(False) | Bool: Reject match if the pairwise alignment begins with gaps[default: False]

Outputs¶

sequence_hits: FeatureData[Sequence]: Subset of feature sequences that align to reference sequences[required]
sequence_misses: FeatureData[Sequence]: Subset of feature sequences that do not align to reference sequences[required]

quality-control filter-reads¶

Citations¶

Langmead & Salzberg, 2012; Li et al., 2009

Inputs¶

demultiplexed_sequences: SampleData[SequencesWithQuality¹ | PairedEndSequencesWithQuality²]: The sequences to be trimmed.[required]
database: Bowtie2Index: Bowtie2 indexed database.[required]

Parameters¶

n_threads: Threads: Number of alignment threads to launch.[default: 1]
mode: Str % Choices('local', 'global'): Bowtie2 alignment settings. See bowtie2 manual for more details.[default: 'local']
sensitivity: Str % Choices('very-fast', 'fast', 'sensitive', 'very-sensitive'): Bowtie2 alignment sensitivity. See bowtie2 manual for details.[default: 'sensitive']
ref_gap_open_penalty: Int % Range(1, None): Reference gap open penalty.[default: 5]
ref_gap_ext_penalty: Int % Range(1, None): Reference gap extend penalty.[default: 3]
exclude_seqs: Bool: Exclude sequences that align to reference. Set this option to False to exclude sequences that do not align to the reference database.[default: True]

Outputs¶

filtered_sequences: SampleData[SequencesWithQuality¹ | PairedEndSequencesWithQuality²]: The resulting filtered sequences.[required]

quality-control bowtie2-build¶

Build bowtie2 index from reference sequences.

Citations¶

Langmead & Salzberg, 2012

Inputs¶

sequences: FeatureData[Sequence]: Reference sequences used to build bowtie2 index.[required]

Parameters¶

n_threads: Threads: Number of threads to launch.[default: 1]

Outputs¶

database: Bowtie2Index: Bowtie2 index.[required]

quality-control decontam-identify¶

This method identifies contaminant sequences from an OTU or ASV table and reports them to the user

Inputs¶

table: FeatureTable[Frequency]: Feature table which contaminate sequences will be identified from[required]

Parameters¶

metadata: Metadata: metadata file indicating which samples in the experiment are control samples, assumes sample names in file correspond to the table input parameter[required]
method: Str % Choices('combined', 'frequency', 'prevalence'): Select how to which method to id contaminants with; Prevalence: Utilizes control ASVs/OTUs to identify contaminants, Frequency: Utilizes sample concentration information to identify contaminants, Combined: Utilizes both Prevalence and Frequency methods when identifying contaminants[default: 'prevalence']
freq_concentration_column: Str: Input column name that has concentration information for the samples, used in Frequency or Combined methods[optional]
prev_control_column: Str: Input column name containing experimental or control sample metadata, used in Prevalence or Combined methods[optional]
prev_control_indicator: Str: indicate the control sample identifier (e.g. "control" or "blank"), used in Prevalence or Combined methods[optional]

Outputs¶

decontam_scores: FeatureData[DecontamScore]: The resulting table of scores from the decontam algorithm which scores each feature on how likely they are to be a contaminant sequence[required]

quality-control evaluate-composition¶

Citations¶

Bokulich et al., 2018

Inputs¶

expected_features: FeatureTable[RelativeFrequency]: Expected feature compositions[required]
observed_features: FeatureTable[RelativeFrequency]: Observed feature compositions[required]

Parameters¶

depth: Int: Maximum depth of semicolon-delimited taxonomic ranks to test (e.g., 1 = root, 7 = species for the greengenes reference sequence database).[default: 7]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow'): Color palette to utilize for plotting.[default: 'Set1']
plot_tar: Bool: Plot taxon accuracy rate (TAR) on score plot. TAR is the number of true positive features divided by the total number of observed features (TAR = true positives / (true positives + false positives)).[default: True]
plot_tdr: Bool: Plot taxon detection rate (TDR) on score plot. TDR is the number of true positive features divided by the total number of expected features (TDR = true positives / (true positives + false negatives)).[default: True]
plot_r_value: Bool: Plot expected vs. observed linear regression r value on score plot.[default: False]
plot_r_squared: Bool: Plot expected vs. observed linear regression r-squared value on score plot.[default: True]
plot_bray_curtis: Bool: Plot expected vs. observed Bray-Curtis dissimilarity scores on score plot.[default: False]
plot_jaccard: Bool: Plot expected vs. observed Jaccard distances scores on score plot.[default: False]
plot_observed_features: Bool: Plot observed features count on score plot.[default: False]
plot_observed_features_ratio: Bool: Plot ratio of observed:expected features on score plot.[default: True]
metadata: MetadataColumn[Categorical]: Optional sample metadata that maps observed_features sample IDs to expected_features sample IDs.[optional]

Outputs¶

visualization: Visualization: <no description>[required]

quality-control evaluate-seqs¶

Citations¶

Inputs¶

query_sequences: FeatureData[Sequence]: Sequences to test for exclusion[required]
reference_sequences: FeatureData[Sequence]: Reference sequences to align against feature sequences[required]

Parameters¶

show_alignments: Bool: Option to plot pairwise alignments of query sequences and their top hits.[default: False]

Outputs¶

visualization: Visualization: <no description>[required]

quality-control evaluate-taxonomy¶

Citations¶

Bokulich et al., 2018

Inputs¶

expected_taxa: FeatureData[Taxonomy]: Expected taxonomic assignments[required]
observed_taxa: FeatureData[Taxonomy]: Observed taxonomic assignments[required]
feature_table: FeatureTable[RelativeFrequency]: Optional feature table containing relative frequency of each feature, used to weight accuracy scores by frequency. Must contain all features found in expected and/or observed taxa. Features found in the table but not the expected/observed taxa will be dropped prior to analysis.[optional]

Parameters¶

depth: Int: Maximum depth of semicolon-delimited taxonomic ranks to test (e.g., 1 = root, 7 = species for the greengenes reference sequence database).[required]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow'): Color palette to utilize for plotting.[default: 'Set1']
require_exp_ids: Bool: Require that all features found in observed taxa must be found in expected taxa or raise error.[default: True]
require_obs_ids: Bool: Require that all features found in expected taxa must be found in observed taxa or raise error.[default: True]
sample_id: Str: Optional sample ID to use for extracting frequency data from feature table, and for labeling accuracy results. If no sample_id is provided, feature frequencies are derived from the sum of all samples present in the feature table.[optional]

Outputs¶

visualization: Visualization: <no description>[required]

quality-control decontam-score-viz¶

Creates histogram based on the output of decontam identify

Inputs¶

decontam_scores: Collection[FeatureData[DecontamScore]]: Output from decontam identify to be visualized[required]
table: Collection[FeatureTable[Frequency]]: Raw OTU/ASV table that was used as input to decontam-identify[required]
rep_seqs: FeatureData[Sequence]: Representative Sequences table which contaminate sequences will be removed from[optional]

Parameters¶

threshold: Float % Range(0.0, 1.0, inclusive_end=True): Select threshold cutoff for decontam algorithm scores[default: 0.1]
weighted: Bool: weight the decontam scores by their associated read number[default: True]
bin_size: Float % Range(0.0, 1.0, inclusive_end=True): Select bin size for the histogram[default: 0.02]

Outputs¶

visualization: Visualization: <no description>[required]

quality-control decontam-identify-batches¶

This method breaks an ASV table into batches based on the given metadata and identifies contaminant sequences from an OTU or ASV table and reports them to the user

Inputs¶

table: FeatureTable[Frequency]: Feature table which contaminate sequences will be identified from[required]
rep_seqs: FeatureData[Sequence]: Representative Sequences table which contaminate seqeunces will be removed from[optional]

Parameters¶

metadata: Metadata: metadata file indicating which samples in the experiment are control samples, assumes sample names in file correspond to the table input parameter[required]
split_column: Str: input metadata columns that you wish to subset the ASV table byNote: Column names must be in quotes and delimited by a space[required]
method: Str % Choices('combined', 'frequency', 'prevalence'): Select how to which method to id contaminants with; Prevalence: Utilizes control ASVs/OTUs to identify contaminants, Frequency: Utilizes sample concentration information to identify contaminants, Combined: Utilizes both Prevalence and Frequency methods when identifying contaminants[required]
filter_empty_features: Bool: If true, features which are not present in a split feature table are dropped.[default: True]
freq_concentration_column: Str: Input column name that has concentration information for the samples, used in Frequency or Combined methods[optional]
prev_control_column: Str: Input column name containing experimental or control sample metadata, used in Prevalence or Combined methods[optional]
prev_control_indicator: Str: Indicate the control sample identifier (e.g. "control" or "blank"), used in Prevalence or Combined methods[optional]
threshold: Float: Select threshold cutoff for decontam algorithm scores[default: 0.1]
weighted: Bool: Weight the decontam scores by their associated read number[default: True]
bin_size: Float: Select bin size for the histogram[default: 0.02]

Outputs¶

batch_subset_tables: Collection[FeatureTable[Frequency]]: Directory where feature tables split based on metadata and parameter split_column values should be written.[required]
decontam_scores: Collection[FeatureData[DecontamScore]]: The resulting table of scores from the decontam algorithm which scores each feature on how likely they are to be a contaminant sequence[required]
score_histograms: Visualization: The histogram visualizations for all decontam score objects generated by the pipeline[required]
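
The batching step described for split_column and filter_empty_features can be pictured with plain dictionaries. This is only a sketch under stated assumptions: the function `split_table`, the nested-dict table layout, and the `run` metadata column are all hypothetical and do not reflect how the plugin stores or splits feature tables internally.

```python
from collections import defaultdict


def split_table(table, metadata, split_column, filter_empty_features=True):
    """Split {sample: {feature: count}} into one sub-table per metadata value."""
    batches = defaultdict(dict)
    for sample_id, counts in table.items():
        batch = metadata[sample_id][split_column]
        batches[batch][sample_id] = dict(counts)
    if filter_empty_features:
        for samples in batches.values():
            # Keep only features observed at least once within this batch.
            present = {f for c in samples.values() for f, n in c.items() if n > 0}
            for sample_id, counts in samples.items():
                samples[sample_id] = {f: n for f, n in counts.items() if f in present}
    return dict(batches)


# Hypothetical table and metadata; "run" plays the role of split_column.
table = {
    "s1": {"asv1": 10, "asv2": 0},
    "s2": {"asv1": 3, "asv2": 0},
    "s3": {"asv1": 0, "asv2": 7},
}
metadata = {"s1": {"run": "A"}, "s2": {"run": "A"}, "s3": {"run": "B"}}

batches = split_table(table, metadata, "run")
```

In this toy input, asv2 never appears in batch A and asv1 never appears in batch B, so each is dropped from the corresponding sub-table when filter_empty_features is true.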

This QIIME 2 plugin supports methods for assessing and controlling the quality of feature and sequence data.

version: 2025.10.0.dev0
website: https://github.com/qiime2/q2-quality-control
user support: Please post to the QIIME 2 forum for help with this plugin: https://forum.qiime2.org

Actions¶

Name	Type	Short Description
exclude-seqs	method	Exclude sequences by alignment
filter-reads	method	Filter demultiplexed sequences by alignment to reference database.
bowtie2-build	method	Build bowtie2 index from reference sequences.
decontam-identify	method	Identify contaminants
evaluate-composition	visualizer	Evaluate expected vs. observed taxonomic composition of samples
evaluate-seqs	visualizer	Compare query (observed) vs. reference (expected) sequences.
evaluate-taxonomy	visualizer	Evaluate expected vs. observed taxonomic assignments
decontam-score-viz	visualizer	Generate a histogram representation of the scores
decontam-identify-batches	pipeline	Identify contaminants in Batch Mode

Artifact Classes¶

Formats¶

quality-control exclude-seqs¶

Citations¶

Inputs¶

query_sequences: FeatureData[Sequence]: Sequences to test for exclusion[required]
reference_sequences: FeatureData[Sequence]: Reference sequences to align against feature sequences[required]

Parameters¶

method: Str % Choices('blast', 'blastn-short') | Str % Choices('vsearch'): Alignment method to use for matching feature sequences against reference sequences[default: 'blast']
perc_identity: Float % Range(0.0, 1.0, inclusive_end=True): Reject match if percent identity to reference is lower. Must be in range [0.0, 1.0][default: 0.97]
evalue: Float: BLAST expectation (E) value threshold for saving hits. Reject if E value is higher than threshold. This threshold is disabled by default.[optional]
perc_query_aligned: Float: Percent of query sequence that must align to reference in order to be accepted as a hit.[default: 0.97]
threads: Threads: Number of threads to use. Only applies to vsearch method.[default: 1]
left_justify: Bool % Choices(False) | Bool: Reject match if the pairwise alignment begins with gaps[default: False]

Outputs¶

sequence_hits: FeatureData[Sequence]: Subset of feature sequences that align to reference sequences[required]
sequence_misses: FeatureData[Sequence]: Subset of feature sequences that do not align to reference sequences[required]
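
To make the hit/miss split concrete, the sketch below partitions query IDs using the perc_identity and perc_query_aligned cutoffs. It assumes the alignment statistics have already been computed by some aligner; `partition_hits` and the example values are hypothetical and are not the plugin's code.

```python
def partition_hits(alignments, query_ids, perc_identity=0.97, perc_query_aligned=0.97):
    """Split query IDs into hits and misses using identity and coverage cutoffs.

    `alignments` maps a query ID to (identity, fraction_of_query_aligned) for its
    best alignment; queries with no alignment are simply absent.
    """
    hits, misses = [], []
    for query_id in query_ids:
        stats = alignments.get(query_id)
        if (
            stats is not None
            and stats[0] >= perc_identity
            and stats[1] >= perc_query_aligned
        ):
            hits.append(query_id)
        else:
            misses.append(query_id)
    return hits, misses


# Hypothetical best-alignment statistics; "q4" has no alignment at all.
alignments = {"q1": (0.99, 1.0), "q2": (0.99, 0.80), "q3": (0.90, 1.0)}
query_ids = ["q1", "q2", "q3", "q4"]

hits, misses = partition_hits(alignments, query_ids)
# Setting both cutoffs to 0 disables them, so any aligned query counts as a hit.
loose_hits, loose_misses = partition_hits(alignments, query_ids, 0.0, 0.0)
```

Here sequence_hits would correspond to `hits` and sequence_misses to `misses`; which of the two is kept depends on whether the filter is used positively or negatively.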

quality-control filter-reads¶

Citations¶

Langmead & Salzberg, 2012; Li et al., 2009

Inputs¶

demultiplexed_sequences: SampleData[SequencesWithQuality¹ | PairedEndSequencesWithQuality²]: The sequences to be filtered.[required]
database: Bowtie2Index: Bowtie2 indexed database.[required]

Parameters¶

n_threads: Threads: Number of alignment threads to launch.[default: 1]
mode: Str % Choices('local', 'global'): Bowtie2 alignment settings. See bowtie2 manual for more details.[default: 'local']
sensitivity: Str % Choices('very-fast', 'fast', 'sensitive', 'very-sensitive'): Bowtie2 alignment sensitivity. See bowtie2 manual for details.[default: 'sensitive']
ref_gap_open_penalty: Int % Range(1, None): Reference gap open penalty.[default: 5]
ref_gap_ext_penalty: Int % Range(1, None): Reference gap extend penalty.[default: 3]
exclude_seqs: Bool: Exclude sequences that align to reference. Set this option to False to exclude sequences that do not align to the reference database.[default: True]

Outputs¶

filtered_sequences: SampleData[SequencesWithQuality¹ | PairedEndSequencesWithQuality²]: The resulting filtered sequences.[required]
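
The effect of the exclude_seqs flag can be summarized in a few lines. The sketch assumes the set of reads that aligned to the reference is already known; `select_reads` and the read IDs are hypothetical and stand in for the alignment step, which is not shown.

```python
def select_reads(read_ids, aligned_ids, exclude_seqs=True):
    """Return the reads that survive filtering for a given exclude_seqs setting."""
    aligned = set(aligned_ids)
    if exclude_seqs:
        # Drop reads that aligned to the reference (e.g. host or contaminant reads).
        return [r for r in read_ids if r not in aligned]
    # Drop reads that did not align, keeping only reference-matching reads.
    return [r for r in read_ids if r in aligned]


# Hypothetical read IDs; r2 and r4 are assumed to have aligned to the reference.
read_ids = ["r1", "r2", "r3", "r4"]
aligned_ids = ["r2", "r4"]

kept_default = select_reads(read_ids, aligned_ids)
kept_inverted = select_reads(read_ids, aligned_ids, exclude_seqs=False)
```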

quality-control bowtie2-build¶

Build bowtie2 index from reference sequences.

Citations¶

Langmead & Salzberg, 2012

Inputs¶

sequences: FeatureData[Sequence]: Reference sequences used to build bowtie2 index.[required]

Parameters¶

n_threads: Threads: Number of threads to launch.[default: 1]

Outputs¶

database: Bowtie2Index: Bowtie2 index.[required]

quality-control decontam-identify¶

This method identifies contaminant sequences from an OTU or ASV table and reports them to the user.

Inputs¶

table: FeatureTable[Frequency]: Feature table from which contaminant sequences will be identified[required]

Parameters¶

metadata: Metadata: Metadata file indicating which samples in the experiment are control samples. Sample names in the file are assumed to correspond to those in the table input.[required]
method: Str % Choices('combined', 'frequency', 'prevalence'): Select which method to use to identify contaminants. Prevalence: uses control ASVs/OTUs to identify contaminants. Frequency: uses sample concentration information to identify contaminants. Combined: uses both the Prevalence and Frequency methods.[default: 'prevalence']
freq_concentration_column: Str: Input column name that has concentration information for the samples, used in Frequency or Combined methods[optional]
prev_control_column: Str: Input column name containing experimental or control sample metadata, used in Prevalence or Combined methods[optional]
prev_control_indicator: Str: Indicate the control sample identifier (e.g. "control" or "blank"), used in Prevalence or Combined methods[optional]

Outputs¶

decontam_scores: FeatureData[DecontamScore]: The resulting table of scores from the decontam algorithm which scores each feature on how likely they are to be a contaminant sequence[required]
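
The roles of prev_control_column and prev_control_indicator can be sketched as follows: samples whose metadata value equals the indicator are treated as controls, and feature prevalence is then tallied separately in controls and in true samples. This only illustrates the inputs to a prevalence-style comparison; it does not reproduce the decontam statistics, and `split_controls`, `prevalence`, and the example data are hypothetical.

```python
from collections import Counter


def split_controls(metadata, prev_control_column, prev_control_indicator):
    """Separate sample IDs into controls and true samples."""
    controls, samples = [], []
    for sample_id, row in metadata.items():
        if row.get(prev_control_column) == prev_control_indicator:
            controls.append(sample_id)
        else:
            samples.append(sample_id)
    return controls, samples


def prevalence(table, sample_ids):
    """Count, per feature, the samples in which it has a non-zero frequency."""
    counts = Counter()
    for sample_id in sample_ids:
        for feature_id, n in table[sample_id].items():
            if n > 0:
                counts[feature_id] += 1
    return counts


# Hypothetical metadata and feature table; "blank" marks the control sample.
metadata = {
    "s1": {"sample_type": "stool"},
    "s2": {"sample_type": "stool"},
    "b1": {"sample_type": "blank"},
}
table = {
    "s1": {"asv1": 10, "asv2": 0},
    "s2": {"asv1": 4, "asv2": 1},
    "b1": {"asv1": 0, "asv2": 6},
}

controls, samples = split_controls(metadata, "sample_type", "blank")
control_prevalence = prevalence(table, controls)
sample_prevalence = prevalence(table, samples)
```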

quality-control evaluate-composition¶

Citations¶

Bokulich et al., 2018

Inputs¶

expected_features: FeatureTable[RelativeFrequency]: Expected feature compositions[required]
observed_features: FeatureTable[RelativeFrequency]: Observed feature compositions[required]

Parameters¶

depth: Int: Maximum depth of semicolon-delimited taxonomic ranks to test (e.g., 1 = root, 7 = species for the greengenes reference sequence database).[default: 7]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow'): Color palette to utilize for plotting.[default: 'Set1']
plot_tar: Bool: Plot taxon accuracy rate (TAR) on score plot. TAR is the number of true positive features divided by the total number of observed features (TAR = true positives / (true positives + false positives)).[default: True]
plot_tdr: Bool: Plot taxon detection rate (TDR) on score plot. TDR is the number of true positive features divided by the total number of expected features (TDR = true positives / (true positives + false negatives)).[default: True]
plot_r_value: Bool: Plot expected vs. observed linear regression r value on score plot.[default: False]
plot_r_squared: Bool: Plot expected vs. observed linear regression r-squared value on score plot.[default: True]
plot_bray_curtis: Bool: Plot expected vs. observed Bray-Curtis dissimilarity scores on score plot.[default: False]
plot_jaccard: Bool: Plot expected vs. observed Jaccard distance scores on score plot.[default: False]
plot_observed_features: Bool: Plot observed features count on score plot.[default: False]
plot_observed_features_ratio: Bool: Plot ratio of observed:expected features on score plot.[default: True]
metadata: MetadataColumn[Categorical]: Optional sample metadata that maps observed_features sample IDs to expected_features sample IDs.[optional]

Outputs¶

visualization: Visualization: <no description>[required]
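
The TAR and TDR formulas given for plot_tar and plot_tdr can be worked through on a small example. The sketch follows those two formulas directly; the function name `tar_tdr`, the zero fallback for empty inputs, and the taxon labels are assumptions made for illustration.

```python
def tar_tdr(expected, observed):
    """Compute taxon accuracy rate and taxon detection rate from feature sets."""
    expected, observed = set(expected), set(observed)
    true_pos = len(expected & observed)
    false_pos = len(observed - expected)
    false_neg = len(expected - observed)
    # TAR = TP / (TP + FP); TDR = TP / (TP + FN). Empty inputs fall back to 0.0.
    tar = true_pos / (true_pos + false_pos) if observed else 0.0
    tdr = true_pos / (true_pos + false_neg) if expected else 0.0
    return tar, tdr


# Hypothetical taxa: two of three observed taxa are expected, two expected are missed.
expected = {"taxon_a", "taxon_b", "taxon_c", "taxon_d"}
observed = {"taxon_a", "taxon_b", "taxon_e"}

tar, tdr = tar_tdr(expected, observed)
```

For this input there are 2 true positives, 1 false positive, and 2 false negatives, so TAR is 2/3 and TDR is 2/4.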

quality-control evaluate-seqs¶

Citations¶

Inputs¶

query_sequences: FeatureData[Sequence]: Sequences to test for exclusion[required]
reference_sequences: FeatureData[Sequence]: Reference sequences to align against feature sequences[required]

Parameters¶

show_alignments: Bool: Option to plot pairwise alignments of query sequences and their top hits.[default: False]

Outputs¶

visualization: Visualization: <no description>[required]

quality-control evaluate-taxonomy¶

Citations¶

Bokulich et al., 2018

Inputs¶

expected_taxa: FeatureData[Taxonomy]: Expected taxonomic assignments[required]
observed_taxa: FeatureData[Taxonomy]: Observed taxonomic assignments[required]
feature_table: FeatureTable[RelativeFrequency]: Optional feature table containing relative frequency of each feature, used to weight accuracy scores by frequency. Must contain all features found in expected and/or observed taxa. Features found in the table but not the expected/observed taxa will be dropped prior to analysis.[optional]

Parameters¶

depth: Int: Maximum depth of semicolon-delimited taxonomic ranks to test (e.g., 1 = root, 7 = species for the greengenes reference sequence database).[required]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow'): Color palette to utilize for plotting.[default: 'Set1']
require_exp_ids: Bool: Require that all features found in observed taxa must be found in expected taxa or raise error.[default: True]
require_obs_ids: Bool: Require that all features found in expected taxa must be found in observed taxa or raise error.[default: True]
sample_id: Str: Optional sample ID to use for extracting frequency data from feature table, and for labeling accuracy results. If no sample_id is provided, feature frequencies are derived from the sum of all samples present in the feature table.[optional]

Outputs¶

visualization: Visualization: <no description>[required]
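
The depth parameter refers to positions in a semicolon-delimited taxonomy string. The sketch below shows one way to truncate such a string to a given depth before comparing expected and observed assignments; `truncate_taxonomy`, `matches_at`, and the example lineages are hypothetical, and the plugin's own comparison logic may differ.

```python
def truncate_taxonomy(taxon, depth):
    """Keep only the first `depth` semicolon-delimited ranks of a taxonomy string."""
    ranks = [rank.strip() for rank in taxon.split(";")]
    return ";".join(ranks[:depth])


def matches_at(expected, observed, depth):
    """Check whether two assignments agree once both are truncated to `depth`."""
    return truncate_taxonomy(expected, depth) == truncate_taxonomy(observed, depth)


# Hypothetical Greengenes-style lineages that differ only at the species rank.
expected_taxon = (
    "k__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales; "
    "f__Lactobacillaceae; g__Lactobacillus; s__brevis"
)
observed_taxon = (
    "k__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales; "
    "f__Lactobacillaceae; g__Lactobacillus; s__casei"
)

genus_match = matches_at(expected_taxon, observed_taxon, 6)
species_match = matches_at(expected_taxon, observed_taxon, 7)
```

The two lineages agree at depth 6 (genus) but not at depth 7 (species), which is why accuracy is typically reported per depth.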

This QIIME 2 plugin supports methods for assessing and controlling the quality of feature and sequence data.

version: 2025.10.0.dev0
website: https://github.com/qiime2/q2-quality-control
user support:: Please post to the QIIME 2 forum for help with this plugin: https://forum.qiime2.org

Actions¶

Name	Type	Short Description
exclude-seqs	method	Exclude sequences by alignment
filter-reads	method	Filter demultiplexed sequences by alignment to reference database.
bowtie2-build	method	Build bowtie2 index from reference sequences.
decontam-identify	method	Identify contaminants
evaluate-composition	visualizer	Evaluate expected vs. observed taxonomic composition of samples
evaluate-seqs	visualizer	Compare query (observed) vs. reference (expected) sequences.
evaluate-taxonomy	visualizer	Evaluate expected vs. observed taxonomic assignments
decontam-score-viz	visualizer	Generate a histogram representation of the scores
decontam-identify-batches	pipeline	Identify contaminants in Batch Mode

Artifact Classes¶

Formats¶

quality-control exclude-seqs¶

Citations¶

Inputs¶

query_sequences: FeatureData[Sequence]: Sequences to test for exclusion[required]
reference_sequences: FeatureData[Sequence]: Reference sequences to align against feature sequences[required]

Parameters¶

method: Str % Choices('blast', 'blastn-short') | Str % Choices('vsearch'): Alignment method to use for matching feature sequences against reference sequences[default: 'blast']
perc_identity: Float % Range(0.0, 1.0, inclusive_end=True): Reject match if percent identity to reference is lower. Must be in range [0.0, 1.0][default: 0.97]
evalue: Float: BLAST expectation (E) value threshold for saving hits. Reject if E value is higher than threshold. This threshold is disabled by default.[optional]
perc_query_aligned: Float: Percent of query sequence that must align to reference in order to be accepted as a hit.[default: 0.97]
threads: Threads: Number of threads to use. Only applies to vsearch method.[default: 1]
left_justify: Bool % Choices(False) | Bool: Reject match if the pairwise alignment begins with gaps[default: False]

Outputs¶

sequence_hits: FeatureData[Sequence]: Subset of feature sequences that align to reference sequences[required]
sequence_misses: FeatureData[Sequence]: Subset of feature sequences that do not align to reference sequences[required]

quality-control filter-reads¶

Citations¶

Langmead & Salzberg, 2012; Li et al., 2009

Inputs¶

demultiplexed_sequences: SampleData[SequencesWithQuality¹ | PairedEndSequencesWithQuality²]: The sequences to be trimmed.[required]
database: Bowtie2Index: Bowtie2 indexed database.[required]

Parameters¶

n_threads: Threads: Number of alignment threads to launch.[default: 1]
mode: Str % Choices('local', 'global'): Bowtie2 alignment settings. See bowtie2 manual for more details.[default: 'local']
sensitivity: Str % Choices('very-fast', 'fast', 'sensitive', 'very-sensitive'): Bowtie2 alignment sensitivity. See bowtie2 manual for details.[default: 'sensitive']
ref_gap_open_penalty: Int % Range(1, None): Reference gap open penalty.[default: 5]
ref_gap_ext_penalty: Int % Range(1, None): Reference gap extend penalty.[default: 3]
exclude_seqs: Bool: Exclude sequences that align to reference. Set this option to False to exclude sequences that do not align to the reference database.[default: True]

Outputs¶

filtered_sequences: SampleData[SequencesWithQuality¹ | PairedEndSequencesWithQuality²]: The resulting filtered sequences.[required]

quality-control bowtie2-build¶

Build bowtie2 index from reference sequences.

Citations¶

Langmead & Salzberg, 2012

Inputs¶

sequences: FeatureData[Sequence]: Reference sequences used to build bowtie2 index.[required]

Parameters¶

n_threads: Threads: Number of threads to launch.[default: 1]

Outputs¶

database: Bowtie2Index: Bowtie2 index.[required]

quality-control decontam-identify¶

This method identifies contaminant sequences from an OTU or ASV table and reports them to the user

Inputs¶

table: FeatureTable[Frequency]: Feature table which contaminate sequences will be identified from[required]

Parameters¶

metadata: Metadata: metadata file indicating which samples in the experiment are control samples, assumes sample names in file correspond to the table input parameter[required]
method: Str % Choices('combined', 'frequency', 'prevalence'): Select how to which method to id contaminants with; Prevalence: Utilizes control ASVs/OTUs to identify contaminants, Frequency: Utilizes sample concentration information to identify contaminants, Combined: Utilizes both Prevalence and Frequency methods when identifying contaminants[default: 'prevalence']
freq_concentration_column: Str: Input column name that has concentration information for the samples, used in Frequency or Combined methods[optional]
prev_control_column: Str: Input column name containing experimental or control sample metadata, used in Prevalence or Combined methods[optional]
prev_control_indicator: Str: indicate the control sample identifier (e.g. "control" or "blank"), used in Prevalence or Combined methods[optional]

Outputs¶

decontam_scores: FeatureData[DecontamScore]: The resulting table of scores from the decontam algorithm which scores each feature on how likely they are to be a contaminant sequence[required]

quality-control evaluate-composition¶

Citations¶

Bokulich et al., 2018

Inputs¶

expected_features: FeatureTable[RelativeFrequency]: Expected feature compositions[required]
observed_features: FeatureTable[RelativeFrequency]: Observed feature compositions[required]

Parameters¶

depth: Int: Maximum depth of semicolon-delimited taxonomic ranks to test (e.g., 1 = root, 7 = species for the greengenes reference sequence database).[default: 7]
palette: Str % Choices('Set1', 'Set2', 'Set3', 'Pastel1', 'Pastel2', 'Paired', 'Accent', 'Dark2', 'tab10', 'tab20', 'tab20b', 'tab20c', 'viridis', 'plasma', 'inferno', 'magma', 'terrain', 'rainbow'): Color palette to utilize for plotting.[default: 'Set1']
plot_tar: Bool: Plot taxon accuracy rate (TAR) on score plot. TAR is the number of true positive features divided by the total number of observed features (TAR = true positives / (true positives + false positives)).[default: True]
plot_tdr: Bool: Plot taxon detection rate (TDR) on score plot. TDR is the number of true positive features divided by the total number of expected features (TDR = true positives / (true positives + false negatives)).[default: True]
plot_r_value: Bool: Plot expected vs. observed linear regression r value on score plot.[default: False]
plot_r_squared: Bool: Plot expected vs. observed linear regression r-squared value on score plot.[default: True]
plot_bray_curtis: Bool: Plot expected vs. observed Bray-Curtis dissimilarity scores on score plot.[default: False]
plot_jaccard: Bool: Plot expected vs. observed Jaccard distance scores on score plot.[default: False]
plot_observed_features: Bool: Plot observed features count on score plot.[default: False]
plot_observed_features_ratio: Bool: Plot ratio of observed:expected features on score plot.[default: True]
metadata: MetadataColumn[Categorical]: Optional sample metadata that maps observed_features sample IDs to expected_features sample IDs.[optional]
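
The depth, plot_tar, and plot_tdr descriptions above can be made concrete with a small Python sketch. It truncates semicolon-delimited taxonomy strings to a given depth, then computes TAR and TDR from the resulting sets using the formulas stated in the parameter descriptions. The helper names collapse_taxon and tar_tdr and the example taxa are hypothetical, and treating features as simple presence/absence sets for a single sample is an assumption; the plugin's own computation may differ in detail.

```python
def collapse_taxon(taxon, depth):
    """Keep only the first `depth` semicolon-delimited ranks of a taxonomy string."""
    ranks = [rank.strip() for rank in taxon.split(";")]
    return ";".join(ranks[:depth])


def tar_tdr(expected, observed, depth):
    """Compute (TAR, TDR) after collapsing both feature lists to `depth` ranks.

    TAR = TP / (TP + FP), TDR = TP / (TP + FN), with features treated as sets.
    """
    exp = {collapse_taxon(t, depth) for t in expected}
    obs = {collapse_taxon(t, depth) for t in observed}
    tp = len(exp & obs)  # expected features that were observed
    fp = len(obs - exp)  # observed features that were not expected
    fn = len(exp - obs)  # expected features that were not observed
    tar = tp / (tp + fp) if (tp + fp) else 0.0
    tdr = tp / (tp + fn) if (tp + fn) else 0.0
    return tar, tdr


# Hypothetical expected and observed taxa for one mock community sample.
expected = [
    "k__Bacteria;p__Firmicutes;c__Bacilli",
    "k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria",
]
observed = [
    "k__Bacteria;p__Firmicutes;c__Bacilli",
    "k__Bacteria;p__Firmicutes;c__Clostridia",
    "k__Bacteria;p__Bacteroidetes;c__Bacteroidia",
]

for depth in (1, 2, 3):
    tar, tdr = tar_tdr(expected, observed, depth)
    print(f"depth={depth} TAR={tar:.2f} TDR={tdr:.2f}")
```

Under these assumptions, shallower depths merge more taxa together, so both rates tend to rise as depth decreases; this is why scores are usually inspected across several depths rather than at one level only.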

Outputs¶

visualization: Visualization: <no description>[required]

quality-control evaluate-seqs¶

Citations¶