This QIIME 2 plugin uses cutadapt to work with adapters (e.g. barcodes, primers) in sequence data.

version: 2024.10.0
website: https://github.com/qiime2/q2-cutadapt
user support: Please post to the QIIME 2 forum for help with this plugin: https://forum.qiime2.org
citations: Martin, 2011

Actions

Name          Type    Short Description
trim-single   method  Find and remove adapters in demultiplexed single-end sequences.
trim-paired   method  Find and remove adapters in demultiplexed paired-end sequences.
demux-single  method  Demultiplex single-end sequence data with barcodes in-sequence.
demux-paired  method  Demultiplex paired-end sequence data with barcodes in-sequence.


cutadapt trim-single

Search demultiplexed single-end sequences for adapters and remove them. The parameter descriptions in this method are adapted from the official cutadapt docs - please see those docs at https://cutadapt.readthedocs.io for complete details.

Citations

Martin, 2011

Inputs

demultiplexed_sequences: SampleData[SequencesWithQuality]

The single-end sequences to be trimmed. [required]

Parameters

cores: Threads

Number of CPU cores to use. [default: 1]

adapter: List[Str]

Sequence of an adapter ligated to the 3' end. The adapter and any subsequent bases are trimmed. If a $ is appended, the adapter is only found if it is at the end of the read. If your sequence of interest is "framed" by a 5' and a 3' adapter, use this parameter to define a "linked" primer - see https://cutadapt.readthedocs.io for complete details. [optional]

front: List[Str]

Sequence of an adapter ligated to the 5' end. The adapter and any preceding bases are trimmed. Partial matches at the 5' end are allowed. If a ^ character is prepended, the adapter is only found if it is at the beginning of the read. [optional]

anywhere: List[Str]

Sequence of an adapter that may be ligated to the 5' or 3' end. Both types of matches as described under adapter and front are allowed. If the first base of the read is part of the match, the behavior is as with front, otherwise as with adapter. This option is mostly for rescuing failed library preparations - do not use it if you know which end your adapter was ligated to. [optional]

error_rate: Float % Range(0, 1, inclusive_end=True)

Maximum allowed error rate. [default: 0.1]
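
As a rough illustration of what an error rate means in practice: the cutadapt documentation describes the number of tolerated errors as the error rate multiplied by the length of the matched adapter portion, rounded down. The sketch below assumes that rule; allowed_errors is a hypothetical helper and not part of q2-cutadapt.

```python
import math


def allowed_errors(match_length: int, error_rate: float = 0.1) -> int:
    """Errors tolerated for a match covering `match_length` adapter bases.

    Assumes the rule floor(error_rate * match_length).
    """
    if not 0 <= error_rate <= 1:
        raise ValueError("error_rate must be within [0, 1]")
    return math.floor(error_rate * match_length)


for length in (8, 12, 20):
    print(length, allowed_errors(length))
```

Under this assumption, an 8-base match tolerates no errors at the default rate, while a full 20-base adapter tolerates two.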

indels: Bool

Allow insertions or deletions of bases when matching adapters. [default: True]

times: Int % Range(1, None)

Remove multiple occurrences of an adapter if it is repeated, up to the number of times given by this parameter. [default: 1]

overlap: Int % Range(1, None)

Require at least this many bases of overlap between read and adapter for an adapter to be found. [default: 3]

match_read_wildcards: Bool

Interpret IUPAC wildcards (e.g., N) in reads. [default: False]

match_adapter_wildcards: Bool

Interpret IUPAC wildcards (e.g., N) in adapters. [default: True]
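
To make the two wildcard options concrete, here is a minimal sketch of IUPAC-aware comparison between an adapter and an equally long stretch of a read. It only illustrates the idea (no indels, no partial matches, no error tolerance) and is not how cutadapt is implemented; matches is a hypothetical helper whose keyword defaults mirror the defaults of the two parameters above.

```python
# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches(adapter: str, read: str,
            adapter_wildcards: bool = True,
            read_wildcards: bool = False) -> bool:
    """Compare two equal-length sequences position by position."""
    if len(adapter) != len(read):
        return False
    for a, r in zip(adapter.upper(), read.upper()):
        # With wildcards off, a character only stands for itself.
        a_set = set(IUPAC.get(a, a)) if adapter_wildcards else {a}
        r_set = set(IUPAC.get(r, r)) if read_wildcards else {r}
        # A position matches when the two sets of possible bases overlap.
        if not a_set & r_set:
            return False
    return True


primer = "GTGCCAGCMGCCGCGGTAA"  # illustrative primer containing the code M (A or C)
print(matches(primer, "GTGCCAGCAGCCGCGGTAA"))
print(matches(primer, "GTGCCAGCTGCCGCGGTAA"))
```

The first comparison succeeds because M covers A; the second fails because M does not cover T.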

minimum_length: Int % Range(1, None)

Discard reads shorter than the specified value. Note that the cutadapt default of 0 has been overridden, because that value produces empty sequence records. [default: 1]

discard_untrimmed: Bool

Discard reads in which no adapter was found. [default: False]

max_expected_errors: Float % Range(0, None)

Discard reads whose expected number of erroneous nucleotides exceeds this value. [optional]
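
The expected number of errors in a read is conventionally computed from its Phred scores as the sum of the per-base error probabilities, 10^(-Q/10). A minimal sketch of such a filter follows, assuming that definition; expected_errors and keep_read are hypothetical helper names and the scores are made-up values.

```python
def expected_errors(phred_scores):
    """Sum of per-base error probabilities, p = 10 ** (-Q / 10)."""
    return sum(10 ** (-q / 10) for q in phred_scores)


def keep_read(phred_scores, max_expected_errors):
    """Keep the read unless its expected errors exceed the threshold."""
    return expected_errors(phred_scores) <= max_expected_errors


scores = [30, 30, 20, 10]  # error probabilities: 0.001, 0.001, 0.01, 0.1
print(round(expected_errors(scores), 3))  # 0.112
print(keep_read(scores, 1.0))             # True
```

A single low-quality base dominates the total here: the Q10 base contributes 0.1 of the 0.112 expected errors.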

max_n: Float % Range(0, None)

Discard reads with more than this number of N bases. If the value is a number between 0 and 1, it is interpreted as a fraction of the read length. [optional]
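
The dual meaning of this value can be sketched as follows. The handling of the boundaries is an assumption made for illustration (values strictly between 0 and 1 are treated as fractions, everything else as an absolute count), and exceeds_max_n is a hypothetical helper.

```python
def exceeds_max_n(sequence: str, max_n: float) -> bool:
    """True if the read contains more N bases than `max_n` permits.

    Assumption: 0 < max_n < 1 is a fraction of the read length;
    any other non-negative value is an absolute count.
    """
    if max_n < 0:
        raise ValueError("max_n must be non-negative")
    n_count = sequence.upper().count("N")
    if 0 < max_n < 1:
        return n_count > max_n * len(sequence)
    return n_count > max_n


read = "ACGTNNACGT"  # 10 bases, 2 of them N
print(exceeds_max_n(read, 1))     # True: 2 N bases is more than 1
print(exceeds_max_n(read, 0.25))  # False: 2 is not more than 25% of 10
```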

quality_cutoff_5end: Int % Range(0, None)

Trim nucleotides with a Phred quality score lower than this threshold from the 5' end. [default: 0]

quality_cutoff_3end: Int % Range(0, None)

Trim nucleotides with a Phred quality score lower than this threshold from the 3' end. [default: 0]

quality_base: Int % Range(0, None)

The ASCII offset used to encode Phred scores (33 or 64). [default: 33]
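
The quality cutoffs and quality_base work together: the quality string is first decoded into Phred scores using the ASCII offset, and low-quality ends are then trimmed. The sketch below follows the 3' trimming procedure described in the cutadapt documentation (subtract the cutoff from each score, accumulate from the 3' end, cut where the running total is most favorable). It is a simplified reimplementation for illustration only, and both function names are hypothetical.

```python
def decode_qualities(quality_string: str, quality_base: int = 33):
    """Convert a FASTQ quality string into a list of Phred scores."""
    return [ord(ch) - quality_base for ch in quality_string]


def trim_3end_index(phred_scores, cutoff: int) -> int:
    """Return how many leading bases to keep after 3' quality trimming."""
    stop = len(phred_scores)
    running = 0  # accumulated (cutoff - score), walking in from the 3' end
    best = 0
    for i in reversed(range(len(phred_scores))):
        running += cutoff - phred_scores[i]
        if running < 0:
            # Good bases now outweigh the bad ones seen so far; stop looking.
            break
        if running > best:
            best = running
            stop = i
    return stop


scores = [42, 40, 26, 27, 8, 7, 11, 4, 2, 3]
keep = trim_3end_index(scores, cutoff=10)
print(keep)  # 4
```

For these scores and a cutoff of 10, only the first four bases are kept. Note that the single score of 11 does not rescue the tail, because the bases around it are well below the cutoff.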

Outputs

trimmed_sequences: SampleData[SequencesWithQuality]

The resulting trimmed sequences. [required]


cutadapt trim-paired

Search demultiplexed paired-end sequences for adapters and remove them. The parameter descriptions in this method are adapted from the official cutadapt docs - please see those docs at https://cutadapt.readthedocs.io for complete details.

Citations

Martin, 2011

Inputs

demultiplexed_sequences: SampleData[PairedEndSequencesWithQuality]

The paired-end sequences to be trimmed. [required]

Parameters

cores: Threads

Number of CPU cores to use. [default: 1]

adapter_f: List[Str]

Sequence of an adapter ligated to the 3' end. The adapter and any subsequent bases are trimmed. If a $ is appended, the adapter is only found if it is at the end of the read. Searched for in the forward read. If your sequence of interest is "framed" by a 5' and a 3' adapter, use this parameter to define a "linked" primer - see https://cutadapt.readthedocs.io for complete details. [optional]

front_f: List[Str]

Sequence of an adapter ligated to the 5' end. The adapter and any preceding bases are trimmed. Partial matches at the 5' end are allowed. If a ^ character is prepended, the adapter is only found if it is at the beginning of the read. Searched for in the forward read. [optional]

anywhere_f: List[Str]

Sequence of an adapter that may be ligated to the 5' or 3' end. Both types of matches as described under adapter_f and front_f are allowed. If the first base of the read is part of the match, the behavior is as with front_f, otherwise as with adapter_f. This option is mostly for rescuing failed library preparations - do not use it if you know which end your adapter was ligated to. Searched for in the forward read. [optional]

adapter_r: List[Str]

Sequence of an adapter ligated to the 3' end. The adapter and any subsequent bases are trimmed. If a $ is appended, the adapter is only found if it is at the end of the read. Searched for in the reverse read. If your sequence of interest is "framed" by a 5' and a 3' adapter, use this parameter to define a "linked" primer - see https://cutadapt.readthedocs.io for complete details. [optional]

front_r: List[Str]

Sequence of an adapter ligated to the 5' end. The adapter and any preceding bases are trimmed. Partial matches at the 5' end are allowed. If a ^ character is prepended, the adapter is only found if it is at the beginning of the read. Searched for in the reverse read. [optional]

anywhere_r: List[Str]

Sequence of an adapter that may be ligated to the 5' or 3' end. Both types of matches as described under adapter_r and front_r are allowed. If the first base of the read is part of the match, the behavior is as with front_r, otherwise as with adapter_r. This option is mostly for rescuing failed library preparations - do not use it if you know which end your adapter was ligated to. Searched for in the reverse read. [optional]

error_rate: Float % Range(0, 1, inclusive_end=True)

Maximum allowed error rate. [default: 0.1]

indels: Bool

Allow insertions or deletions of bases when matching adapters. [default: True]

times: Int % Range(1, None)

Remove multiple occurrences of an adapter if it is repeated, up to the number of times given by this parameter. [default: 1]

overlap: Int % Range(1, None)

Require at least this many bases of overlap between read and adapter for an adapter to be found. [default: 3]

match_read_wildcards: Bool

Interpret IUPAC wildcards (e.g., N) in reads. [default: False]

match_adapter_wildcards: Bool

Interpret IUPAC wildcards (e.g., N) in adapters. [default: True]

minimum_length: Int % Range(1, None)

Discard reads shorter than the specified value. Note that the cutadapt default of 0 has been overridden, because that value produces empty sequence records. [default: 1]

discard_untrimmed: Bool

Discard reads in which no adapter was found. [default: False]

max_expected_errors: Float % Range(0, None)

Discard reads whose expected number of erroneous nucleotides exceeds this value. [optional]

max_n: Float % Range(0, None)

Discard reads with more than this number of N bases. If the value is a number between 0 and 1, it is interpreted as a fraction of the read length. [optional]

quality_cutoff_5end: Int % Range(0, None)

Trim nucleotides with a Phred quality score lower than this threshold from the 5' end. [default: 0]

quality_cutoff_3end: Int % Range(0, None)

Trim nucleotides with a Phred quality score lower than this threshold from the 3' end. [default: 0]

quality_base: Int % Range(0, None)

The ASCII offset used to encode Phred scores (33 or 64). [default: 33]

Outputs

trimmed_sequences: SampleData[PairedEndSequencesWithQuality]

The resulting trimmed sequences. [required]
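
The parameter names listed above map onto command-line flags by the usual QIIME 2 convention seen elsewhere on this page (underscores become hyphens; inputs, parameters and outputs take the prefixes --i-, --p- and --o-). The sketch below assembles a trim-paired command line under that convention. build_trim_paired_command is a hypothetical helper, the file names are placeholders, the primer sequences are illustrative, and repeating a flag once per list item is an assumption about how list values are passed.

```python
import shlex


def build_trim_paired_command(input_qza, output_qza, **params):
    """Assemble a `qiime cutadapt trim-paired` argument list."""
    cmd = ["qiime", "cutadapt", "trim-paired",
           "--i-demultiplexed-sequences", input_qza]
    for name, value in params.items():
        flag = name.replace("_", "-")
        if isinstance(value, bool):
            # Boolean parameters are switches: --p-x or --p-no-x.
            cmd.append(f"--p-{flag}" if value else f"--p-no-{flag}")
        elif isinstance(value, (list, tuple)):
            # Assumption: one flag occurrence per list item.
            for item in value:
                cmd += [f"--p-{flag}", str(item)]
        else:
            cmd += [f"--p-{flag}", str(value)]
    cmd += ["--o-trimmed-sequences", output_qza]
    return cmd


command = build_trim_paired_command(
    "demux.qza", "trimmed.qza",
    front_f=["GTGCCAGCMGCCGCGGTAA"],   # illustrative forward primer
    front_r=["GGACTACHVGGGTWTCTAAT"],  # illustrative reverse primer
    discard_untrimmed=True,
)
print(shlex.join(command))
```

Printing the joined command rather than running it keeps the sketch free of side effects; the resulting line can be inspected before being pasted into a shell.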


cutadapt demux-single

Demultiplex sequence data (i.e., map barcode reads to sample ids). Barcodes are expected to be located within the sequence data (versus the header, or a separate barcode file).

Citations

Martin, 2011

Inputs

seqs: MultiplexedSingleEndBarcodeInSequence

The single-end sequences to be demultiplexed. [required]

Parameters

barcodes: MetadataColumn[Categorical]

The sample metadata column listing the per-sample barcodes. [required]

cut: Int

Remove the specified number of bases from the sequences. Bases are removed before demultiplexing. If a positive value is provided, bases are removed from the beginning of the sequences. If a negative value is provided, bases are removed from the end of the sequences. [default: 0]

anchor_barcode: Bool

Anchor the barcode. The barcode is then expected to occur in full length at the beginning (5' end) of the sequence. Anchoring can speed up demultiplexing. [default: False]
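
The interplay of cut and anchor_barcode can be sketched with a toy barcode lookup. This is a deliberately simplified illustration: it uses exact string matching only (no error tolerance or partial matches), and apply_cut, assign_sample and the barcodes are all hypothetical.

```python
def apply_cut(sequence: str, cut: int) -> str:
    """Remove `cut` bases from the start (positive) or the end (negative)."""
    if cut > 0:
        return sequence[cut:]
    if cut < 0:
        return sequence[:cut]
    return sequence


def assign_sample(read, barcodes, cut=0, anchor_barcode=False):
    """Return the sample ID whose barcode is found in the read, else None."""
    read = apply_cut(read, cut)  # bases are removed before demultiplexing
    for sample_id, barcode in barcodes.items():
        if anchor_barcode:
            found = read.startswith(barcode)  # must sit at the 5' end
        else:
            found = barcode in read           # may occur anywhere
        if found:
            return sample_id
    return None


barcodes = {"sampleA": "AAGG", "sampleB": "CCTT"}  # made-up barcodes
print(assign_sample("TAAGGTTTT", barcodes, anchor_barcode=True))         # None
print(assign_sample("TAAGGTTTT", barcodes, cut=1, anchor_barcode=True))  # sampleA
```

The second call succeeds because cutting one leading base moves the barcode to the very start of the read, where an anchored barcode is expected.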

error_rate: Float % Range(0, 1, inclusive_end=True)

The level of error tolerance, specified as the maximum allowable error rate. The default value specified by cutadapt is 0.1 (=10%), which is greater than the default used by demux emp-*, which is 0.0 (=0%). [default: 0.1]

batch_size: Int % Range(0, None)

The number of samples cutadapt demultiplexes concurrently. Demultiplexing in smaller batches will yield the same result with marginal speed loss, and may resolve "too many files" errors related to sample quantity. Set to 0 to process all samples at once. [default: 0]
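
A minimal sketch of how a batch size partitions the samples, with 0 meaning a single batch. It only shows the grouping; how reads are passed between batches is not covered, and make_batches is a hypothetical helper.

```python
def make_batches(sample_ids, batch_size: int = 0):
    """Split sample IDs into consecutive batches; 0 means one batch."""
    if batch_size < 0:
        raise ValueError("batch_size must be non-negative")
    sample_ids = list(sample_ids)
    if batch_size == 0:
        return [sample_ids]
    return [sample_ids[i:i + batch_size]
            for i in range(0, len(sample_ids), batch_size)]


ids = ["s1", "s2", "s3", "s4", "s5"]
print(make_batches(ids, 2))  # [['s1', 's2'], ['s3', 's4'], ['s5']]
print(make_batches(ids))     # [['s1', 's2', 's3', 's4', 's5']]
```

Smaller batches mean fewer per-sample output files are open at any one time, which is why batching can help with "too many files" errors.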

minimum_length: Int % Range(1, None)

Discard reads shorter than the specified value. Note that the cutadapt default of 0 has been overridden, because that value produces empty sequence records. [default: 1]

cores: Threads

Number of CPU cores to use. [default: 1]

Outputs

per_sample_sequences: SampleData[SequencesWithQuality]

The resulting demultiplexed sequences. [required]

untrimmed_sequences: MultiplexedSingleEndBarcodeInSequence

The sequences that did not match any barcode. [required]

Examples

demux_single

wget -O 'seqs.qza' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/cutadapt/demux-single/1/seqs.qza'

wget -O 'md.tsv' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/cutadapt/demux-single/1/md.tsv'

qiime cutadapt demux-single \
  --i-seqs seqs.qza \
  --m-barcodes-file md.tsv \
  --m-barcodes-column BarcodeSequence \
  --o-per-sample-sequences per-sample-sequences.qza \
  --o-untrimmed-sequences untrimmed-sequences.qza

cutadapt demux-paired

Demultiplex sequence data (i.e., map barcode reads to sample ids). Barcodes are expected to be located within the sequence data (versus the header, or a separate barcode file).

Citations

Martin, 2011

Inputs

seqs: MultiplexedPairedEndBarcodeInSequence

The paired-end sequences to be demultiplexed. [required]

Parameters

forward_barcodes: MetadataColumn[Categorical]

The sample metadata column listing the per-sample barcodes for the forward reads. [required]

reverse_barcodes: MetadataColumn[Categorical]

The sample metadata column listing the per-sample barcodes for the reverse reads. [optional]

forward_cut: Int

Remove the specified number of bases from the forward sequences. Bases are removed before demultiplexing. If a positive value is provided, bases are removed from the beginning of the sequences. If a negative value is provided, bases are removed from the end of the sequences. If --p-mixed-orientation is set, then both --p-forward-cut and --p-reverse-cut must be set to the same value. [default: 0]

reverse_cut: Int

Remove the specified number of bases from the reverse sequences. Bases are removed before demultiplexing. If a positive value is provided, bases are removed from the beginning of the sequences. If a negative value is provided, bases are removed from the end of the sequences. If --p-mixed-orientation is set, then both --p-forward-cut and --p-reverse-cut must be set to the same value. [default: 0]

anchor_forward_barcode: Bool

Anchor the forward barcode. The barcode is then expected to occur in full length at the beginning (5' end) of the forward sequence. Anchoring can speed up demultiplexing. [default: False]

anchor_reverse_barcode: Bool

Anchor the reverse barcode. The barcode is then expected to occur in full length at the beginning (5' end) of the reverse sequence. Anchoring can speed up demultiplexing. [default: False]

error_rate: Float % Range(0, 1, inclusive_end=True)

The level of error tolerance, specified as the maximum allowable error rate. [default: 0.1]

batch_size: Int % Range(0, None)

The number of samples cutadapt demultiplexes concurrently. Demultiplexing in smaller batches will yield the same result with marginal speed loss, and may resolve "too many files" errors related to sample quantity. Set to 0 to process all samples at once. [default: 0]

minimum_length: Int % Range(1, None)

Discard reads shorter than the specified value. Note that the cutadapt default of 0 has been overridden, because that value produces empty sequence records. [default: 1]

mixed_orientation: Bool

Handle demultiplexing of mixed orientation reads (i.e. when forward and reverse reads coexist in the same file). [default: False]
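
The forward_cut and reverse_cut descriptions state that both values must be equal when mixed orientation is enabled. A pre-flight check for that rule might look like the sketch below; validate_cuts is a hypothetical helper, and raising ValueError is an assumption, since this page does not say how the plugin reports the conflict.

```python
def validate_cuts(forward_cut: int = 0, reverse_cut: int = 0,
                  mixed_orientation: bool = False) -> None:
    """Raise ValueError if the cut settings conflict with mixed orientation."""
    if mixed_orientation and forward_cut != reverse_cut:
        raise ValueError(
            "forward_cut and reverse_cut must be equal when "
            "mixed_orientation is set "
            f"(got {forward_cut} and {reverse_cut})"
        )


validate_cuts(forward_cut=2, reverse_cut=2, mixed_orientation=True)   # passes
validate_cuts(forward_cut=2, reverse_cut=0, mixed_orientation=False)  # passes
```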

cores: Threads

Number of CPU cores to use. [default: 1]

Outputs

per_sample_sequences: SampleData[PairedEndSequencesWithQuality]

The resulting demultiplexed sequences. [required]

untrimmed_sequences: MultiplexedPairedEndBarcodeInSequence

The sequences that did not match any barcode. [required]

Examples

paired

wget -O 'seqs.qza' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/cutadapt/demux-paired/1/seqs.qza'

wget -O 'md.tsv' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/cutadapt/demux-paired/1/md.tsv'

qiime cutadapt demux-paired \
  --i-seqs seqs.qza \
  --m-forward-barcodes-file md.tsv \
  --m-forward-barcodes-column barcode-sequence \
  --o-per-sample-sequences per-sample-sequences.qza \
  --o-untrimmed-sequences untrimmed-sequences.qza

This QIIME 2 plugin uses cutadapt to work with adapters (e.g. barcodes, primers) in sequence data.

version: 2024.10.0
website: https://github.com/qiime2/q2-cutadapt
user support:
Please post to the QIIME 2 forum for help with this plugin: https://forum.qiime2.org
citations:
Martin, 2011

Actions

NameTypeShort Description
trim-singlemethodFind and remove adapters in demultiplexed single-end sequences.
trim-pairedmethodFind and remove adapters in demultiplexed paired-end sequences.
demux-singlemethodDemultiplex single-end sequence data with barcodes in-sequence.
demux-pairedmethodDemultiplex paired-end sequence data with barcodes in-sequence.


cutadapt trim-single

Search demultiplexed single-end sequences for adapters and remove them. The parameter descriptions in this method are adapted from the official cutadapt docs - please see those docs at https://cutadapt.readthedocs.io for complete details.

Citations

Martin, 2011

Inputs

demultiplexed_sequences: SampleData[SequencesWithQuality]

The single-end sequences to be trimmed.[required]

Parameters

cores: Threads

Number of CPU cores to use.[default: 1]

adapter: List[Str]

Sequence of an adapter ligated to the 3' end. The adapter and any subsequent bases are trimmed. If a $ is appended, the adapter is only found if it is at the end of the read. If your sequence of interest is "framed" by a 5' and a 3' adapter, use this parameter to define a "linked" primer - see https://cutadapt.readthedocs.io for complete details.[optional]

front: List[Str]

Sequence of an adapter ligated to the 5' end. The adapter and any preceding bases are trimmed. Partial matches at the 5' end are allowed. If a ^ character is prepended, the adapter is only found if it is at the beginning of the read.[optional]

anywhere: List[Str]

Sequence of an adapter that may be ligated to the 5' or 3' end. Both types of matches as described under adapter and front are allowed. If the first base of the read is part of the match, the behavior is as with front, otherwise as with adapter. This option is mostly for rescuing failed library preparations - do not use if you know which end your adapter was ligated to.[optional]

error_rate: Float % Range(0, 1, inclusive_end=True)

Maximum allowed error rate.[default: 0.1]

indels: Bool

Allow insertions or deletions of bases when matching adapters.[default: True]

times: Int % Range(1, None)

Remove multiple occurrences of an adapter if it is repeated, up to times times.[default: 1]

overlap: Int % Range(1, None)

Require at least overlap bases of overlap between read and adapter for an adapter to be found.[default: 3]

match_read_wildcards: Bool

Interpret IUPAC wildcards (e.g., N) in reads.[default: False]

match_adapter_wildcards: Bool

Interpret IUPAC wildcards (e.g., N) in adapters.[default: True]

minimum_length: Int % Range(1, None)

Discard reads shorter than specified value. Note, the cutadapt default of 0 has been overridden, because that value produces empty sequence records.[default: 1]

discard_untrimmed: Bool

Discard reads in which no adapter was found.[default: False]

max_expected_errors: Float % Range(0, None)

Discard reads that exceed maximum expected erroneous nucleotides.[optional]

max_n: Float % Range(0, None)

Discard reads with more than COUNT N bases. If COUNT_or_FRACTION is a number between 0 and 1, it is interpreted as a fraction of the read length.[optional]

quality_cutoff_5end: Int % Range(0, None)

Trim nucleotides with Phred score quality lower than threshold from 5 prime end.[default: 0]

quality_cutoff_3end: Int % Range(0, None)

Trim nucleotides with Phred score quality lower than threshold from 3 prime end.[default: 0]

quality_base: Int % Range(0, None)

How the Phred score is encoded (33 or 64).[default: 33]

Outputs

trimmed_sequences: SampleData[SequencesWithQuality]

The resulting trimmed sequences.[required]


cutadapt trim-paired

Search demultiplexed paired-end sequences for adapters and remove them. The parameter descriptions in this method are adapted from the official cutadapt docs - please see those docs at https://cutadapt.readthedocs.io for complete details.

Citations

Martin, 2011

Inputs

demultiplexed_sequences: SampleData[PairedEndSequencesWithQuality]

The paired-end sequences to be trimmed.[required]

Parameters

cores: Threads

Number of CPU cores to use.[default: 1]

adapter_f: List[Str]

Sequence of an adapter ligated to the 3' end. The adapter and any subsequent bases are trimmed. If a $ is appended, the adapter is only found if it is at the end of the read. Search in forward read. If your sequence of interest is "framed" by a 5' and a 3' adapter, use this parameter to define a "linked" primer - see https://cutadapt.readthedocs.io for complete details.[optional]

front_f: List[Str]

Sequence of an adapter ligated to the 5' end. The adapter and any preceding bases are trimmed. Partial matches at the 5' end are allowed. If a ^ character is prepended, the adapter is only found if it is at the beginning of the read. Search in forward read.[optional]

anywhere_f: List[Str]

Sequence of an adapter that may be ligated to the 5' or 3' end. Both types of matches as described under adapter and front are allowed. If the first base of the read is part of the match, the behavior is as with front, otherwise as with adapter. This option is mostly for rescuing failed library preparations - do not use if you know which end your adapter was ligated to. Search in forward read.[optional]

adapter_r: List[Str]

Sequence of an adapter ligated to the 3' end. The adapter and any subsequent bases are trimmed. If a $ is appended, the adapter is only found if it is at the end of the read. Search in reverse read. If your sequence of interest is "framed" by a 5' and a 3' adapter, use this parameter to define a "linked" primer - see https://cutadapt.readthedocs.io for complete details.[optional]

front_r: List[Str]

Sequence of an adapter ligated to the 5' end. The adapter and any preceding bases are trimmed. Partial matches at the 5' end are allowed. If a ^ character is prepended, the adapter is only found if it is at the beginning of the read. Search in reverse read.[optional]

anywhere_r: List[Str]

Sequence of an adapter that may be ligated to the 5' or 3' end. Both types of matches as described under adapter and front are allowed. If the first base of the read is part of the match, the behavior is as with front, otherwise as with adapter. This option is mostly for rescuing failed library preparations - do not use if you know which end your adapter was ligated to. Search in reverse read.[optional]

error_rate: Float % Range(0, 1, inclusive_end=True)

Maximum allowed error rate.[default: 0.1]

indels: Bool

Allow insertions or deletions of bases when matching adapters.[default: True]

times: Int % Range(1, None)

Remove multiple occurrences of an adapter if it is repeated, up to times times.[default: 1]

overlap: Int % Range(1, None)

Require at least overlap bases of overlap between read and adapter for an adapter to be found.[default: 3]

match_read_wildcards: Bool

Interpret IUPAC wildcards (e.g., N) in reads.[default: False]

match_adapter_wildcards: Bool

Interpret IUPAC wildcards (e.g., N) in adapters.[default: True]

minimum_length: Int % Range(1, None)

Discard reads shorter than specified value. Note, the cutadapt default of 0 has been overridden, because that value produces empty sequence records.[default: 1]

discard_untrimmed: Bool

Discard reads in which no adapter was found.[default: False]

max_expected_errors: Float % Range(0, None)

Discard reads that exceed maximum expected erroneous nucleotides.[optional]

max_n: Float % Range(0, None)

Discard reads with more than COUNT N bases. If COUNT_or_FRACTION is a number between 0 and 1, it is interpreted as a fraction of the read length.[optional]

quality_cutoff_5end: Int % Range(0, None)

Trim nucleotides with Phred score quality lower than threshold from 5 prime end.[default: 0]

quality_cutoff_3end: Int % Range(0, None)

Trim nucleotides with Phred score quality lower than threshold from 3 prime end.[default: 0]

quality_base: Int % Range(0, None)

How the Phred score is encoded (33 or 64).[default: 33]

Outputs

trimmed_sequences: SampleData[PairedEndSequencesWithQuality]

The resulting trimmed sequences.[required]


cutadapt demux-single

Demultiplex sequence data (i.e., map barcode reads to sample ids). Barcodes are expected to be located within the sequence data (versus the header, or a separate barcode file).

Citations

Martin, 2011

Inputs

seqs: MultiplexedSingleEndBarcodeInSequence

The single-end sequences to be demultiplexed.[required]

Parameters

barcodes: MetadataColumn[Categorical]

The sample metadata column listing the per-sample barcodes.[required]

cut: Int

Remove the specified number of bases from the sequences. Bases are removed before demultiplexing. If a positive value is provided, bases are removed from the beginning of the sequences. If a negative value is provided, bases are removed from the end of the sequences.[default: 0]

anchor_barcode: Bool

Anchor the barcode. The barcode is then expected to occur in full length at the beginning (5' end) of the sequence. Can speed up demultiplexing if used.[default: False]

error_rate: Float % Range(0, 1, inclusive_end=True)

The level of error tolerance, specified as the maximum allowable error rate. The default value specified by cutadapt is 0.1 (=10%), which is greater than demux emp-*, which is 0.0 (=0%).[default: 0.1]

batch_size: Int % Range(0, None)

The number of samples cutadapt demultiplexes concurrently. Demultiplexing in smaller batches will yield the same result with marginal speed loss, and may solve "too many files" errors related to sample quantity. Set to "0" to process all samples at once.[default: 0]

minimum_length: Int % Range(1, None)

Discard reads shorter than specified value. Note, the cutadapt default of 0 has been overridden, because that value produces empty sequence records.[default: 1]

cores: Threads

Number of CPU cores to use.[default: 1]

Outputs

per_sample_sequences: SampleData[SequencesWithQuality]

The resulting demultiplexed sequences.[required]

untrimmed_sequences: MultiplexedSingleEndBarcodeInSequence

The sequences that were unmatched to barcodes.[required]

Examples

demux_single

[Command Line]
[Python API]
[Galaxy]
[R API]
[View Source]
wget -O 'seqs.qza' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/cutadapt/demux-single/1/seqs.qza'

wget -O 'md.tsv' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/cutadapt/demux-single/1/md.tsv'

qiime cutadapt demux-single \
  --i-seqs seqs.qza \
  --m-barcodes-file md.tsv \
  --m-barcodes-column BarcodeSequence \
  --o-per-sample-sequences per-sample-sequences.qza \
  --o-untrimmed-sequences untrimmed-sequences.qza

cutadapt demux-paired

Demultiplex sequence data (i.e., map barcode reads to sample ids). Barcodes are expected to be located within the sequence data (versus the header, or a separate barcode file).

Citations

Martin, 2011

Inputs

seqs: MultiplexedPairedEndBarcodeInSequence

The paired-end sequences to be demultiplexed.[required]

Parameters

forward_barcodes: MetadataColumn[Categorical]

The sample metadata column listing the per-sample barcodes for the forward reads.[required]

reverse_barcodes: MetadataColumn[Categorical]

The sample metadata column listing the per-sample barcodes for the reverse reads.[optional]

forward_cut: Int

Remove the specified number of bases from the forward sequences. Bases are removed before demultiplexing. If a positive value is provided, bases are removed from the beginning of the sequences. If a negative value is provided, bases are removed from the end of the sequences. If --p-mixed-orientation is set, then both --p-forward-cut and --p-reverse-cut must be set to the same value.[default: 0]

reverse_cut: Int

Remove the specified number of bases from the reverse sequences. Bases are removed before demultiplexing. If a positive value is provided, bases are removed from the beginning of the sequences. If a negative value is provided, bases are removed from the end of the sequences. If --p-mixed-orientation is set, then both --p-forward-cut and --p-reverse-cut must be set to the same value.[default: 0]

anchor_forward_barcode: Bool

Anchor the forward barcode. The barcode is then expected to occur in full length at the beginning (5' end) of the forward sequence. Can speed up demultiplexing if used.[default: False]

anchor_reverse_barcode: Bool

Anchor the reverse barcode. The barcode is then expected to occur in full length at the beginning (5' end) of the reverse sequence. Can speed up demultiplexing if used.[default: False]

error_rate: Float % Range(0, 1, inclusive_end=True)

The level of error tolerance, specified as the maximum allowable error rate.[default: 0.1]

batch_size: Int % Range(0, None)

The number of samples cutadapt demultiplexes concurrently. Demultiplexing in smaller batches will yield the same result with marginal speed loss, and may solve "too many files" errors related to sample quantity. Set to "0" to process all samples at once.[default: 0]

minimum_length: Int % Range(1, None)

Discard reads shorter than specified value. Note, the cutadapt default of 0 has been overridden, because that value produces empty sequence records.[default: 1]

mixed_orientation: Bool

Handle demultiplexing of mixed orientation reads (i.e. when forward and reverse reads coexist in the same file).[default: False]

cores: Threads

Number of CPU cores to use.[default: 1]

Outputs

per_sample_sequences: SampleData[PairedEndSequencesWithQuality]

The resulting demultiplexed sequences.[required]

untrimmed_sequences: MultiplexedPairedEndBarcodeInSequence

The sequences that were unmatched to barcodes.[required]

Examples

paired

[Command Line]
[Python API]
[Galaxy]
[R API]
[View Source]
wget -O 'seqs.qza' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/cutadapt/demux-paired/1/seqs.qza'

wget -O 'md.tsv' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/cutadapt/demux-paired/1/md.tsv'

qiime cutadapt demux-paired \
  --i-seqs seqs.qza \
  --m-forward-barcodes-file md.tsv \
  --m-forward-barcodes-column barcode-sequence \
  --o-per-sample-sequences per-sample-sequences.qza \
  --o-untrimmed-sequences untrimmed-sequences.qza

This QIIME 2 plugin uses cutadapt to work with adapters (e.g. barcodes, primers) in sequence data.

version: 2024.10.0
website: https://github.com/qiime2/q2-cutadapt
user support:
Please post to the QIIME 2 forum for help with this plugin: https://forum.qiime2.org
citations:
Martin, 2011

Actions

NameTypeShort Description
trim-singlemethodFind and remove adapters in demultiplexed single-end sequences.
trim-pairedmethodFind and remove adapters in demultiplexed paired-end sequences.
demux-singlemethodDemultiplex single-end sequence data with barcodes in-sequence.
demux-pairedmethodDemultiplex paired-end sequence data with barcodes in-sequence.


cutadapt trim-single

Search demultiplexed single-end sequences for adapters and remove them. The parameter descriptions in this method are adapted from the official cutadapt docs - please see those docs at https://cutadapt.readthedocs.io for complete details.

Citations

Martin, 2011

Inputs

demultiplexed_sequences: SampleData[SequencesWithQuality]

The single-end sequences to be trimmed.[required]

Parameters

cores: Threads

Number of CPU cores to use.[default: 1]

adapter: List[Str]

Sequence of an adapter ligated to the 3' end. The adapter and any subsequent bases are trimmed. If a $ is appended, the adapter is only found if it is at the end of the read. If your sequence of interest is "framed" by a 5' and a 3' adapter, use this parameter to define a "linked" primer - see https://cutadapt.readthedocs.io for complete details.[optional]

front: List[Str]

Sequence of an adapter ligated to the 5' end. The adapter and any preceding bases are trimmed. Partial matches at the 5' end are allowed. If a ^ character is prepended, the adapter is only found if it is at the beginning of the read.[optional]

anywhere: List[Str]

Sequence of an adapter that may be ligated to the 5' or 3' end. Both types of matches as described under adapter and front are allowed. If the first base of the read is part of the match, the behavior is as with front, otherwise as with adapter. This option is mostly for rescuing failed library preparations - do not use if you know which end your adapter was ligated to.[optional]

error_rate: Float % Range(0, 1, inclusive_end=True)

Maximum allowed error rate.[default: 0.1]

indels: Bool

Allow insertions or deletions of bases when matching adapters.[default: True]

times: Int % Range(1, None)

Remove multiple occurrences of an adapter if it is repeated, up to times times.[default: 1]

overlap: Int % Range(1, None)

Require at least overlap bases of overlap between read and adapter for an adapter to be found.[default: 3]

match_read_wildcards: Bool

Interpret IUPAC wildcards (e.g., N) in reads.[default: False]

match_adapter_wildcards: Bool

Interpret IUPAC wildcards (e.g., N) in adapters.[default: True]

minimum_length: Int % Range(1, None)

Discard reads shorter than specified value. Note, the cutadapt default of 0 has been overridden, because that value produces empty sequence records.[default: 1]

discard_untrimmed: Bool

Discard reads in which no adapter was found.[default: False]

max_expected_errors: Float % Range(0, None)

Discard reads that exceed maximum expected erroneous nucleotides.[optional]

max_n: Float % Range(0, None)

Discard reads with more than the specified number of N bases. If the value is a number between 0 and 1, it is interpreted as a fraction of the read length.[optional]
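The dual meaning of this value (absolute count or fraction of the read length) can be sketched as below. The function exceeds_max_n is hypothetical, and treating values strictly below 1 as fractions (so that 1 itself means "one N base") is an assumption about the boundary case.

```python
def exceeds_max_n(sequence: str, max_n: float) -> bool:
    """Return True if the read has too many N bases and would be discarded."""
    n_count = sequence.upper().count("N")
    if 0 <= max_n < 1:
        # Fraction mode: compare the share of N bases with the limit.
        if not sequence:
            return False
        return n_count / len(sequence) > max_n
    # Count mode: compare the absolute number of N bases with the limit.
    return n_count > max_n


print(exceeds_max_n("ACGTNN", 1))        # two N bases, limit of one
print(exceeds_max_n("ACNNNNACGT", 0.5))  # 40% N, limit of 50%
```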

quality_cutoff_5end: Int % Range(0, None)

Trim nucleotides with a Phred quality score lower than the threshold from the 5' end.[default: 0]

quality_cutoff_3end: Int % Range(0, None)

Trim nucleotides with a Phred quality score lower than the threshold from the 3' end.[default: 0]
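Quality trimming in cutadapt is documented as a running-sum procedure rather than a simple "cut at the first low base" rule: the cutoff is subtracted from each score, partial sums are taken from the end inward, and the read is cut where the sum is most favourable. The sketch below follows that description as I understand it; it is not the plugin's code, and the function name and the use of decoded integer scores are assumptions.

```python
def quality_trim_index(scores, cutoff_5end=0, cutoff_3end=0):
    """Return (start, stop) so that scores[start:stop] is the part to keep."""
    start, stop = 0, len(scores)

    # 5' end: accumulate (cutoff - score) from the left; keep the best cut point.
    running, best = 0, 0
    for i, q in enumerate(scores):
        running += cutoff_5end - q
        if running < 0:
            break
        if running > best:
            best, start = running, i + 1

    # 3' end: same idea, scanning from the right.
    running, best = 0, 0
    for i in range(len(scores) - 1, -1, -1):
        running += cutoff_3end - scores[i]
        if running < 0:
            break
        if running > best:
            best, stop = running, i

    if start >= stop:
        start, stop = 0, 0  # nothing survives trimming
    return start, stop


scores = [42, 40, 26, 27, 8, 7, 11, 4, 2, 3]
print(quality_trim_index(scores, cutoff_3end=10))
```

Note that the base with score 11 is removed even though it is above the cutoff of 10, because the lower scores around it dominate the running sum.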

quality_base: Int % Range(0, None)

How the Phred score is encoded (33 or 64).[default: 33]
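The two encodings differ only in the offset subtracted from each quality character's ASCII code. A short sketch of the decoding, with a hypothetical helper name decode_phred:

```python
def decode_phred(quality_string: str, quality_base: int = 33):
    """Convert a FASTQ quality string into integer Phred scores."""
    if quality_base not in (33, 64):
        raise ValueError("quality_base is expected to be 33 or 64")
    scores = [ord(char) - quality_base for char in quality_string]
    if any(score < 0 for score in scores):
        raise ValueError("negative score: the quality base is probably wrong")
    return scores


print(decode_phred("II5!"))      # Phred+33
print(decode_phred("hh", 64))    # Phred+64
```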

Outputs

trimmed_sequences: SampleData[SequencesWithQuality]

The resulting trimmed sequences.[required]


cutadapt trim-paired

Search demultiplexed paired-end sequences for adapters and remove them. The parameter descriptions in this method are adapted from the official cutadapt docs - please see those docs at https://cutadapt.readthedocs.io for complete details.

Citations

Martin, 2011

Inputs

demultiplexed_sequences: SampleData[PairedEndSequencesWithQuality]

The paired-end sequences to be trimmed.[required]

Parameters

cores: Threads

Number of CPU cores to use.[default: 1]

adapter_f: List[Str]

Sequence of an adapter ligated to the 3' end. The adapter and any subsequent bases are trimmed. If a $ is appended, the adapter is only found if it is at the end of the read. Search in forward read. If your sequence of interest is "framed" by a 5' and a 3' adapter, use this parameter to define a "linked" primer - see https://cutadapt.readthedocs.io for complete details.[optional]

front_f: List[Str]

Sequence of an adapter ligated to the 5' end. The adapter and any preceding bases are trimmed. Partial matches at the 5' end are allowed. If a ^ character is prepended, the adapter is only found if it is at the beginning of the read. Search in forward read.[optional]

anywhere_f: List[Str]

Sequence of an adapter that may be ligated to the 5' or 3' end. Both types of matches as described under adapter and front are allowed. If the first base of the read is part of the match, the behavior is as with front, otherwise as with adapter. This option is mostly for rescuing failed library preparations - do not use if you know which end your adapter was ligated to. Search in forward read.[optional]

adapter_r: List[Str]

Sequence of an adapter ligated to the 3' end. The adapter and any subsequent bases are trimmed. If a $ is appended, the adapter is only found if it is at the end of the read. Search in reverse read. If your sequence of interest is "framed" by a 5' and a 3' adapter, use this parameter to define a "linked" primer - see https://cutadapt.readthedocs.io for complete details.[optional]

front_r: List[Str]

Sequence of an adapter ligated to the 5' end. The adapter and any preceding bases are trimmed. Partial matches at the 5' end are allowed. If a ^ character is prepended, the adapter is only found if it is at the beginning of the read. Search in reverse read.[optional]

anywhere_r: List[Str]

Sequence of an adapter that may be ligated to the 5' or 3' end. Both types of matches as described under adapter and front are allowed. If the first base of the read is part of the match, the behavior is as with front, otherwise as with adapter. This option is mostly for rescuing failed library preparations - do not use if you know which end your adapter was ligated to. Search in reverse read.[optional]

error_rate: Float % Range(0, 1, inclusive_end=True)

Maximum allowed error rate.[default: 0.1]

indels: Bool

Allow insertions or deletions of bases when matching adapters.[default: True]

times: Int % Range(1, None)

Remove multiple occurrences of an adapter if it is repeated, up to the number of times given by this parameter.[default: 1]

overlap: Int % Range(1, None)

Require an overlap of at least this many bases between read and adapter for an adapter to be found.[default: 3]

match_read_wildcards: Bool

Interpret IUPAC wildcards (e.g., N) in reads.[default: False]

match_adapter_wildcards: Bool

Interpret IUPAC wildcards (e.g., N) in adapters.[default: True]

minimum_length: Int % Range(1, None)

Discard reads shorter than specified value. Note, the cutadapt default of 0 has been overridden, because that value produces empty sequence records.[default: 1]

discard_untrimmed: Bool

Discard reads in which no adapter was found.[default: False]

max_expected_errors: Float % Range(0, None)

Discard reads whose number of expected erroneous nucleotides exceeds this value.[optional]

max_n: Float % Range(0, None)

Discard reads with more than the specified number of N bases. If the value is a number between 0 and 1, it is interpreted as a fraction of the read length.[optional]

quality_cutoff_5end: Int % Range(0, None)

Trim nucleotides with a Phred quality score lower than the threshold from the 5' end.[default: 0]

quality_cutoff_3end: Int % Range(0, None)

Trim nucleotides with a Phred quality score lower than the threshold from the 3' end.[default: 0]

quality_base: Int % Range(0, None)

How the Phred score is encoded (33 or 64).[default: 33]

Outputs

trimmed_sequences: SampleData[PairedEndSequencesWithQuality]

The resulting trimmed sequences.[required]


cutadapt demux-single

Demultiplex sequence data (i.e., map barcode reads to sample ids). Barcodes are expected to be located within the sequence data (versus the header, or a separate barcode file).

Citations

Martin, 2011

Inputs

seqs: MultiplexedSingleEndBarcodeInSequence

The single-end sequences to be demultiplexed.[required]

Parameters

barcodes: MetadataColumn[Categorical]

The sample metadata column listing the per-sample barcodes.[required]

cut: Int

Remove the specified number of bases from the sequences. Bases are removed before demultiplexing. If a positive value is provided, bases are removed from the beginning of the sequences. If a negative value is provided, bases are removed from the end of the sequences.[default: 0]
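The sign convention for cut maps directly onto string slicing. A minimal sketch, with a hypothetical helper name apply_cut (a real read would have its quality string cut in the same way):

```python
def apply_cut(sequence: str, cut: int = 0) -> str:
    """Remove bases from the start (positive cut) or the end (negative cut)."""
    if cut > 0:
        return sequence[cut:]
    if cut < 0:
        return sequence[:cut]
    return sequence


print(apply_cut("AACCGGTT", 2))   # drops the first two bases
print(apply_cut("AACCGGTT", -3))  # drops the last three bases
```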

anchor_barcode: Bool

Anchor the barcode. The barcode is then expected to occur in full length at the beginning (5' end) of the sequence. Can speed up demultiplexing if used.[default: False]

error_rate: Float % Range(0, 1, inclusive_end=True)

The level of error tolerance, specified as the maximum allowable error rate. The default value specified by cutadapt is 0.1 (=10%), which is greater than the default of demux emp-*, which is 0.0 (=0%).[default: 0.1]
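To make the interaction between anchoring and error tolerance concrete, here is a deliberately simplified sketch of assigning a read to a sample by an anchored barcode. It counts substitutions only (no insertions or deletions), assumes the tolerated number of errors is the rate times the barcode length rounded down, and uses hypothetical names throughout; cutadapt's actual alignment is more general.

```python
import math


def assign_sample(read: str, barcodes: dict, error_rate: float = 0.1):
    """Return (sample_id, read without barcode), or (None, read) if unmatched.

    barcodes maps sample ids to barcode sequences expected at the 5' end.
    """
    best = None
    for sample_id, barcode in barcodes.items():
        prefix = read[:len(barcode)]
        if len(prefix) < len(barcode):
            continue  # read is shorter than the barcode
        mismatches = sum(a != b for a, b in zip(barcode, prefix))
        if mismatches <= math.floor(len(barcode) * error_rate):
            if best is None or mismatches < best[0]:
                best = (mismatches, sample_id, len(barcode))
    if best is None:
        return None, read
    return best[1], read[best[2]:]


barcodes = {"s1": "ACGTACGTAC", "s2": "TTGGCCAATT"}
print(assign_sample("ACGTACGTAAGGGG", barcodes))        # one mismatch, tolerated
print(assign_sample("ACGTACGTAAGGGG", barcodes, 0.0))   # no errors tolerated
```

With a rate of 0.0 the same read ends up unmatched, which corresponds to the stricter behavior the parameter description contrasts with.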

batch_size: Int % Range(0, None)

The number of samples cutadapt demultiplexes concurrently. Demultiplexing in smaller batches will yield the same result with marginal speed loss, and may solve "too many files" errors related to sample quantity. Set to "0" to process all samples at once.[default: 0]
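The batching idea can be sketched as splitting the list of sample ids into groups, with 0 meaning a single group. This is only an illustration of the parameter's meaning; the helper name batches is hypothetical and the plugin's internal handling may differ.

```python
def batches(sample_ids: list, batch_size: int = 0) -> list:
    """Split sample ids into groups of batch_size; 0 keeps them all together."""
    if batch_size == 0:
        return [list(sample_ids)]
    return [sample_ids[i:i + batch_size]
            for i in range(0, len(sample_ids), batch_size)]


print(batches(["a", "b", "c", "d", "e"], 2))
```

Fewer samples per batch means fewer output files open at once, which is why smaller batches can help with "too many files" errors.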

minimum_length: Int % Range(1, None)

Discard reads shorter than specified value. Note, the cutadapt default of 0 has been overridden, because that value produces empty sequence records.[default: 1]

cores: Threads

Number of CPU cores to use.[default: 1]

Outputs

per_sample_sequences: SampleData[SequencesWithQuality]

The resulting demultiplexed sequences.[required]

untrimmed_sequences: MultiplexedSingleEndBarcodeInSequence

The sequences that were unmatched to barcodes.[required]

Examples

demux_single

wget -O 'seqs.qza' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/cutadapt/demux-single/1/seqs.qza'

wget -O 'md.tsv' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/cutadapt/demux-single/1/md.tsv'

qiime cutadapt demux-single \
  --i-seqs seqs.qza \
  --m-barcodes-file md.tsv \
  --m-barcodes-column BarcodeSequence \
  --o-per-sample-sequences per-sample-sequences.qza \
  --o-untrimmed-sequences untrimmed-sequences.qza

cutadapt demux-paired

Demultiplex sequence data (i.e., map barcode reads to sample ids). Barcodes are expected to be located within the sequence data (versus the header, or a separate barcode file).

Citations

Martin, 2011

Inputs

seqs: MultiplexedPairedEndBarcodeInSequence

The paired-end sequences to be demultiplexed.[required]

Parameters

forward_barcodes: MetadataColumn[Categorical]

The sample metadata column listing the per-sample barcodes for the forward reads.[required]

reverse_barcodes: MetadataColumn[Categorical]

The sample metadata column listing the per-sample barcodes for the reverse reads.[optional]

forward_cut: Int

Remove the specified number of bases from the forward sequences. Bases are removed before demultiplexing. If a positive value is provided, bases are removed from the beginning of the sequences. If a negative value is provided, bases are removed from the end of the sequences. If --p-mixed-orientation is set, then both --p-forward-cut and --p-reverse-cut must be set to the same value.[default: 0]

reverse_cut: Int

Remove the specified number of bases from the reverse sequences. Bases are removed before demultiplexing. If a positive value is provided, bases are removed from the beginning of the sequences. If a negative value is provided, bases are removed from the end of the sequences. If --p-mixed-orientation is set, then both --p-forward-cut and --p-reverse-cut must be set to the same value.[default: 0]
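The constraint linking the two cut values to mixed orientation can be expressed as a small pre-flight check. This is a hypothetical helper for validating settings before running the action, not code from the plugin.

```python
def check_cuts(forward_cut: int = 0, reverse_cut: int = 0,
               mixed_orientation: bool = False):
    """Raise ValueError if the cut values conflict with mixed orientation."""
    if mixed_orientation and forward_cut != reverse_cut:
        raise ValueError(
            "with mixed orientation, forward_cut and reverse_cut must be equal"
        )
    return forward_cut, reverse_cut


print(check_cuts(3, 3, mixed_orientation=True))
```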

anchor_forward_barcode: Bool

Anchor the forward barcode. The barcode is then expected to occur in full length at the beginning (5' end) of the forward sequence. Can speed up demultiplexing if used.[default: False]

anchor_reverse_barcode: Bool

Anchor the reverse barcode. The barcode is then expected to occur in full length at the beginning (5' end) of the reverse sequence. Can speed up demultiplexing if used.[default: False]

error_rate: Float % Range(0, 1, inclusive_end=True)

The level of error tolerance, specified as the maximum allowable error rate.[default: 0.1]

batch_size: Int % Range(0, None)

The number of samples cutadapt demultiplexes concurrently. Demultiplexing in smaller batches will yield the same result with marginal speed loss, and may solve "too many files" errors related to sample quantity. Set to "0" to process all samples at once.[default: 0]

minimum_length: Int % Range(1, None)

Discard reads shorter than specified value. Note, the cutadapt default of 0 has been overridden, because that value produces empty sequence records.[default: 1]

mixed_orientation: Bool

Handle demultiplexing of mixed orientation reads (i.e. when forward and reverse reads coexist in the same file).[default: False]

cores: Threads

Number of CPU cores to use.[default: 1]

Outputs

per_sample_sequences: SampleData[PairedEndSequencesWithQuality]

The resulting demultiplexed sequences.[required]

untrimmed_sequences: MultiplexedPairedEndBarcodeInSequence

The sequences that were unmatched to barcodes.[required]

Examples

paired

wget -O 'seqs.qza' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/cutadapt/demux-paired/1/seqs.qza'

wget -O 'md.tsv' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/cutadapt/demux-paired/1/md.tsv'

qiime cutadapt demux-paired \
  --i-seqs seqs.qza \
  --m-forward-barcodes-file md.tsv \
  --m-forward-barcodes-column barcode-sequence \
  --o-per-sample-sequences per-sample-sequences.qza \
  --o-untrimmed-sequences untrimmed-sequences.qza

This QIIME 2 plugin uses cutadapt to work with adapters (e.g. barcodes, primers) in sequence data.

version: 2024.10.0
website: https://github.com/qiime2/q2-cutadapt
user support:
Please post to the QIIME 2 forum for help with this plugin: https://forum.qiime2.org
citations:
Martin, 2011

Actions

NameTypeShort Description
trim-singlemethodFind and remove adapters in demultiplexed single-end sequences.
trim-pairedmethodFind and remove adapters in demultiplexed paired-end sequences.
demux-singlemethodDemultiplex single-end sequence data with barcodes in-sequence.
demux-pairedmethodDemultiplex paired-end sequence data with barcodes in-sequence.


cutadapt trim-single

Search demultiplexed single-end sequences for adapters and remove them. The parameter descriptions in this method are adapted from the official cutadapt docs - please see those docs at https://cutadapt.readthedocs.io for complete details.

Citations

Martin, 2011

Inputs

demultiplexed_sequences: SampleData[SequencesWithQuality]

The single-end sequences to be trimmed.[required]

Parameters

cores: Threads

Number of CPU cores to use.[default: 1]

adapter: List[Str]

Sequence of an adapter ligated to the 3' end. The adapter and any subsequent bases are trimmed. If a $ is appended, the adapter is only found if it is at the end of the read. If your sequence of interest is "framed" by a 5' and a 3' adapter, use this parameter to define a "linked" primer - see https://cutadapt.readthedocs.io for complete details.[optional]

front: List[Str]

Sequence of an adapter ligated to the 5' end. The adapter and any preceding bases are trimmed. Partial matches at the 5' end are allowed. If a ^ character is prepended, the adapter is only found if it is at the beginning of the read.[optional]

anywhere: List[Str]

Sequence of an adapter that may be ligated to the 5' or 3' end. Both types of matches as described under adapter and front are allowed. If the first base of the read is part of the match, the behavior is as with front, otherwise as with adapter. This option is mostly for rescuing failed library preparations - do not use if you know which end your adapter was ligated to.[optional]

error_rate: Float % Range(0, 1, inclusive_end=True)

Maximum allowed error rate.[default: 0.1]

indels: Bool

Allow insertions or deletions of bases when matching adapters.[default: True]

times: Int % Range(1, None)

Remove multiple occurrences of an adapter if it is repeated, up to times times.[default: 1]

overlap: Int % Range(1, None)

Require at least overlap bases of overlap between read and adapter for an adapter to be found.[default: 3]

match_read_wildcards: Bool

Interpret IUPAC wildcards (e.g., N) in reads.[default: False]

match_adapter_wildcards: Bool

Interpret IUPAC wildcards (e.g., N) in adapters.[default: True]

minimum_length: Int % Range(1, None)

Discard reads shorter than specified value. Note, the cutadapt default of 0 has been overridden, because that value produces empty sequence records.[default: 1]

discard_untrimmed: Bool

Discard reads in which no adapter was found.[default: False]

max_expected_errors: Float % Range(0, None)

Discard reads that exceed maximum expected erroneous nucleotides.[optional]

max_n: Float % Range(0, None)

Discard reads with more than COUNT N bases. If COUNT_or_FRACTION is a number between 0 and 1, it is interpreted as a fraction of the read length.[optional]

quality_cutoff_5end: Int % Range(0, None)

Trim nucleotides with Phred score quality lower than threshold from 5 prime end.[default: 0]

quality_cutoff_3end: Int % Range(0, None)

Trim nucleotides with Phred score quality lower than threshold from 3 prime end.[default: 0]

quality_base: Int % Range(0, None)

How the Phred score is encoded (33 or 64).[default: 33]

Outputs

trimmed_sequences: SampleData[SequencesWithQuality]

The resulting trimmed sequences.[required]


cutadapt trim-paired

Search demultiplexed paired-end sequences for adapters and remove them. The parameter descriptions in this method are adapted from the official cutadapt docs - please see those docs at https://cutadapt.readthedocs.io for complete details.

Citations

Martin, 2011

Inputs

demultiplexed_sequences: SampleData[PairedEndSequencesWithQuality]

The paired-end sequences to be trimmed.[required]

Parameters

cores: Threads

Number of CPU cores to use.[default: 1]

adapter_f: List[Str]

Sequence of an adapter ligated to the 3' end. The adapter and any subsequent bases are trimmed. If a $ is appended, the adapter is only found if it is at the end of the read. Search in forward read. If your sequence of interest is "framed" by a 5' and a 3' adapter, use this parameter to define a "linked" primer - see https://cutadapt.readthedocs.io for complete details.[optional]

front_f: List[Str]

Sequence of an adapter ligated to the 5' end. The adapter and any preceding bases are trimmed. Partial matches at the 5' end are allowed. If a ^ character is prepended, the adapter is only found if it is at the beginning of the read. Search in forward read.[optional]

anywhere_f: List[Str]

Sequence of an adapter that may be ligated to the 5' or 3' end. Both types of matches as described under adapter and front are allowed. If the first base of the read is part of the match, the behavior is as with front, otherwise as with adapter. This option is mostly for rescuing failed library preparations - do not use if you know which end your adapter was ligated to. Search in forward read.[optional]

adapter_r: List[Str]

Sequence of an adapter ligated to the 3' end. The adapter and any subsequent bases are trimmed. If a $ is appended, the adapter is only found if it is at the end of the read. Search in reverse read. If your sequence of interest is "framed" by a 5' and a 3' adapter, use this parameter to define a "linked" primer - see https://cutadapt.readthedocs.io for complete details.[optional]

front_r: List[Str]

Sequence of an adapter ligated to the 5' end. The adapter and any preceding bases are trimmed. Partial matches at the 5' end are allowed. If a ^ character is prepended, the adapter is only found if it is at the beginning of the read. Search in reverse read.[optional]

anywhere_r: List[Str]

Sequence of an adapter that may be ligated to the 5' or 3' end. Both types of matches as described under adapter and front are allowed. If the first base of the read is part of the match, the behavior is as with front, otherwise as with adapter. This option is mostly for rescuing failed library preparations - do not use if you know which end your adapter was ligated to. Search in reverse read.[optional]

error_rate: Float % Range(0, 1, inclusive_end=True)

Maximum allowed error rate.[default: 0.1]
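
As a rough worked example of what an error rate means in practice, the cutadapt documentation describes the number of tolerated errors as the matched length multiplied by the error rate, rounded down. The sketch below assumes that rule; max_allowed_errors is a hypothetical helper, not part of the plugin.

```python
import math


def max_allowed_errors(match_length: int, error_rate: float = 0.1) -> int:
    """Errors tolerated for a match of the given length (assumed: rounded down)."""
    return math.floor(match_length * error_rate)


for length in (9, 10, 20, 25):
    print(length, max_allowed_errors(length))
```

Under this assumption, a match shorter than 10 bases tolerates no errors at the default rate of 0.1.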

indels: Bool

Allow insertions or deletions of bases when matching adapters.[default: True]

times: Int % Range(1, None)

Remove multiple occurrences of an adapter if it is repeated, up to the number of times given by this parameter.[default: 1]

overlap: Int % Range(1, None)

Require at least this many bases of overlap between read and adapter for an adapter to be found.[default: 3]

match_read_wildcards: Bool

Interpret IUPAC wildcards (e.g., N) in reads.[default: False]

match_adapter_wildcards: Bool

Interpret IUPAC wildcards (e.g., N) in adapters.[default: True]

minimum_length: Int % Range(1, None)

Discard reads shorter than specified value. Note, the cutadapt default of 0 has been overridden, because that value produces empty sequence records.[default: 1]

discard_untrimmed: Bool

Discard reads in which no adapter was found.[default: False]

max_expected_errors: Float % Range(0, None)

Discard reads that exceed maximum expected erroneous nucleotides.[optional]
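
The expected number of erroneous nucleotides is conventionally the sum of the per-base error probabilities, where a Phred score Q corresponds to an error probability of 10^(-Q/10). A minimal sketch under that assumption follows; expected_errors and exceeds_max_expected_errors are hypothetical names, and the scores are illustrative.

```python
def expected_errors(phred_scores):
    """Sum of per-base error probabilities, p = 10 ** (-Q / 10)."""
    return sum(10 ** (-q / 10) for q in phred_scores)


def exceeds_max_expected_errors(phred_scores, max_expected_errors):
    """True if the read would be discarded under the given threshold."""
    return expected_errors(phred_scores) > max_expected_errors


scores = [30, 30, 20, 10]  # error probabilities 0.001, 0.001, 0.01, 0.1
print(expected_errors(scores))
print(exceeds_max_expected_errors(scores, 0.1))
```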

max_n: Float % Range(0, None)

Discard reads with more than the given number of N bases. If the value is a number between 0 and 1, it is interpreted as a fraction of the read length.[optional]
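
The two interpretations of max_n can be sketched as follows. The description does not say how the boundary values are handled, so this sketch assumes that values strictly below 1 are treated as fractions and everything else as an absolute count. too_many_n is a hypothetical helper.

```python
def too_many_n(read: str, max_n: float) -> bool:
    """True if the read would be discarded for containing too many N bases."""
    if not read:
        return False
    n_count = read.upper().count("N")
    if max_n < 1:
        # Assumption: values below 1 are a fraction of the read length.
        return n_count / len(read) > max_n
    # Otherwise the value is an absolute count of N bases.
    return n_count > max_n


print(too_many_n("ACNNT", 2))         # two N bases, limit is two
print(too_many_n("ACGTNNCGTA", 0.1))  # 20% N bases, limit is 10%
```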

quality_cutoff_5end: Int % Range(0, None)

Trim nucleotides with Phred score quality lower than threshold from 5 prime end.[default: 0]

quality_cutoff_3end: Int % Range(0, None)

Trim nucleotides with Phred score quality lower than threshold from 3 prime end.[default: 0]
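
Quality trimming in cutadapt is not a simple "cut at the first low-quality base" rule. Its documentation describes a partial-sum approach: subtract the cutoff from every quality value, accumulate from the end of the read, and cut where the running sum is most extreme. The sketch below follows that description for both ends, working on integer Phred scores. quality_trim_bounds is a hypothetical name and this is not the plugin's code.

```python
def quality_trim_bounds(scores, cutoff_5end=0, cutoff_3end=0):
    """Return (start, stop) so that scores[start:stop] is the retained part."""
    start, stop = 0, len(scores)

    # 5' end: walk forward, accumulating (cutoff - quality).
    running, best = 0, 0
    for i, q in enumerate(scores):
        running += cutoff_5end - q
        if running < 0:
            break
        if running > best:
            best, start = running, i + 1

    # 3' end: same idea, walking backward from the last base.
    running, best = 0, 0
    for i in range(len(scores) - 1, -1, -1):
        running += cutoff_3end - scores[i]
        if running < 0:
            break
        if running > best:
            best, stop = running, i

    # If the two trims meet or cross, nothing is left.
    if start >= stop:
        return 0, 0
    return start, stop


scores = [42, 40, 26, 27, 8, 7, 11, 4, 2, 3]
print(quality_trim_bounds(scores, cutoff_3end=10))  # (0, 4): the first four bases are kept
```

Note that the base with quality 11 is still removed, because it is surrounded by low-quality bases. With both cutoffs at their default of 0, no trimming takes place.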

quality_base: Int % Range(0, None)

How the Phred score is encoded (33 or 64).[default: 33]
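
The quality base is the ASCII offset subtracted from each quality character to recover its Phred score. A short sketch, with decode_phred as a hypothetical helper:

```python
def decode_phred(quality_string: str, quality_base: int = 33) -> list[int]:
    """Convert a FASTQ quality string to integer Phred scores."""
    return [ord(ch) - quality_base for ch in quality_string]


print(decode_phred("I5+!"))     # base 33: [40, 20, 10, 0]
print(decode_phred("hT@", 64))  # base 64: [40, 20, 0]
```

Decoding a base-33 string with a base of 64 would produce negative scores, which is a common sign that the wrong encoding was assumed.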

Outputs

trimmed_sequences: SampleData[PairedEndSequencesWithQuality]

The resulting trimmed sequences.[required]


cutadapt demux-single

Demultiplex sequence data (i.e., map barcode reads to sample ids). Barcodes are expected to be located within the sequence data (versus the header, or a separate barcode file).

Citations

Martin, 2011

Inputs

seqs: MultiplexedSingleEndBarcodeInSequence

The single-end sequences to be demultiplexed.[required]

Parameters

barcodes: MetadataColumn[Categorical]

The sample metadata column listing the per-sample barcodes.[required]

cut: Int

Remove the specified number of bases from the sequences. Bases are removed before demultiplexing. If a positive value is provided, bases are removed from the beginning of the sequences. If a negative value is provided, bases are removed from the end of the sequences.[default: 0]
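
The sign convention for cut maps directly onto string slicing, as this small sketch shows. apply_cut is a hypothetical helper that mirrors the description above rather than the plugin's implementation.

```python
def apply_cut(seq: str, cut: int) -> str:
    """Remove bases from the start (positive cut) or the end (negative cut)."""
    if cut > 0:
        return seq[cut:]
    if cut < 0:
        return seq[:cut]
    return seq


print(apply_cut("ACGTACGT", 3))   # TACGT
print(apply_cut("ACGTACGT", -3))  # ACGTA
```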

anchor_barcode: Bool

Anchor the barcode. The barcode is then expected to occur in full length at the beginning (5' end) of the sequence. Can speed up demultiplexing if used.[default: False]
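
Anchored demultiplexing can be pictured with the following sketch. It uses exact prefix matching only (no error tolerance) and assumes that the barcode is removed from matched reads, which the name of the untrimmed_sequences output suggests. demux_anchored and the example barcodes are hypothetical.

```python
from collections import defaultdict


def demux_anchored(reads, barcode_to_sample):
    """Assign reads to samples by a barcode at the very start of each read."""
    per_sample = defaultdict(list)
    unmatched = []
    for read in reads:
        for barcode, sample_id in barcode_to_sample.items():
            if read.startswith(barcode):
                # Assumption: the barcode itself is trimmed from the read.
                per_sample[sample_id].append(read[len(barcode):])
                break
        else:
            # No barcode matched: the read stays untrimmed and unassigned.
            unmatched.append(read)
    return dict(per_sample), unmatched


reads = ["AAAACCGT", "GGGGTTAC", "CCCCAAAA"]
barcodes = {"AAAA": "sample-1", "GGGG": "sample-2"}
per_sample, unmatched = demux_anchored(reads, barcodes)
print(per_sample)
print(unmatched)
```

The third read contains AAAA, but not at its start, so an anchored search leaves it unmatched.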

error_rate: Float % Range(0, 1, inclusive_end=True)

The level of error tolerance, specified as the maximum allowable error rate. The default value specified by cutadapt is 0.1 (=10%), which is greater than the default for demux emp-*, which is 0.0 (=0%).[default: 0.1]

batch_size: Int % Range(0, None)

The number of samples cutadapt demultiplexes concurrently. Demultiplexing in smaller batches will yield the same result with marginal speed loss, and may solve "too many files" errors related to sample quantity. Set to "0" to process all samples at once.[default: 0]
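
The batching itself amounts to partitioning the sample list, as sketched below. How the plugin carries unmatched reads from one batch to the next is not described here, so the sketch shows only the partitioning. batches is a hypothetical helper.

```python
def batches(sample_ids, batch_size=0):
    """Split sample IDs into batches; a batch size of 0 means one batch with everything."""
    sample_ids = list(sample_ids)
    if batch_size == 0:
        return [sample_ids]
    return [sample_ids[i:i + batch_size]
            for i in range(0, len(sample_ids), batch_size)]


samples = ["a", "b", "c", "d", "e"]
print(batches(samples, 2))  # [['a', 'b'], ['c', 'd'], ['e']]
print(batches(samples))     # [['a', 'b', 'c', 'd', 'e']]
```

Smaller batches mean fewer output files open at the same time, which is why they can help with "too many files" errors.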

minimum_length: Int % Range(1, None)

Discard reads shorter than specified value. Note, the cutadapt default of 0 has been overridden, because that value produces empty sequence records.[default: 1]

cores: Threads

Number of CPU cores to use.[default: 1]

Outputs

per_sample_sequences: SampleData[SequencesWithQuality]

The resulting demultiplexed sequences.[required]

untrimmed_sequences: MultiplexedSingleEndBarcodeInSequence

The sequences that were unmatched to barcodes.[required]

Examples

demux_single

wget -O 'seqs.qza' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/cutadapt/demux-single/1/seqs.qza'

wget -O 'md.tsv' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/cutadapt/demux-single/1/md.tsv'

qiime cutadapt demux-single \
  --i-seqs seqs.qza \
  --m-barcodes-file md.tsv \
  --m-barcodes-column BarcodeSequence \
  --o-per-sample-sequences per-sample-sequences.qza \
  --o-untrimmed-sequences untrimmed-sequences.qza

cutadapt demux-paired

Demultiplex sequence data (i.e., map barcode reads to sample ids). Barcodes are expected to be located within the sequence data (versus the header, or a separate barcode file).

Citations

Martin, 2011

Inputs

seqs: MultiplexedPairedEndBarcodeInSequence

The paired-end sequences to be demultiplexed.[required]

Parameters

forward_barcodes: MetadataColumn[Categorical]

The sample metadata column listing the per-sample barcodes for the forward reads.[required]

reverse_barcodes: MetadataColumn[Categorical]

The sample metadata column listing the per-sample barcodes for the reverse reads.[optional]

forward_cut: Int

Remove the specified number of bases from the forward sequences. Bases are removed before demultiplexing. If a positive value is provided, bases are removed from the beginning of the sequences. If a negative value is provided, bases are removed from the end of the sequences. If --p-mixed-orientation is set, then both --p-forward-cut and --p-reverse-cut must be set to the same value.[default: 0]

reverse_cut: Int

Remove the specified number of bases from the reverse sequences. Bases are removed before demultiplexing. If a positive value is provided, bases are removed from the beginning of the sequences. If a negative value is provided, bases are removed from the end of the sequences. If --p-mixed-orientation is set, then both --p-forward-cut and --p-reverse-cut must be set to the same value.[default: 0]

anchor_forward_barcode: Bool

Anchor the forward barcode. The barcode is then expected to occur in full length at the beginning (5' end) of the forward sequence. Can speed up demultiplexing if used.[default: False]

anchor_reverse_barcode: Bool

Anchor the reverse barcode. The barcode is then expected to occur in full length at the beginning (5' end) of the reverse sequence. Can speed up demultiplexing if used.[default: False]

error_rate: Float % Range(0, 1, inclusive_end=True)

The level of error tolerance, specified as the maximum allowable error rate.[default: 0.1]

batch_size: Int % Range(0, None)

The number of samples cutadapt demultiplexes concurrently. Demultiplexing in smaller batches will yield the same result with marginal speed loss, and may solve "too many files" errors related to sample quantity. Set to "0" to process all samples at once.[default: 0]

minimum_length: Int % Range(1, None)

Discard reads shorter than specified value. Note, the cutadapt default of 0 has been overridden, because that value produces empty sequence records.[default: 1]

mixed_orientation: Bool

Handle demultiplexing of mixed orientation reads (i.e. when forward and reverse reads coexist in the same file).[default: False]
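
The constraint that links mixed_orientation to the two cut parameters can be expressed as a simple pre-flight check. check_cuts is a hypothetical helper for validating arguments before calling the action. It is not part of the plugin, and the plugin's own error message may differ.

```python
def check_cuts(forward_cut: int, reverse_cut: int, mixed_orientation: bool) -> None:
    """Raise if the cut values are inconsistent with mixed-orientation mode."""
    if mixed_orientation and forward_cut != reverse_cut:
        raise ValueError(
            "With mixed_orientation, forward_cut and reverse_cut must be equal "
            f"(got {forward_cut} and {reverse_cut})."
        )


check_cuts(5, 5, mixed_orientation=True)   # passes silently
check_cuts(5, 0, mixed_orientation=False)  # passes: constraint does not apply
```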

cores: Threads

Number of CPU cores to use.[default: 1]

Outputs

per_sample_sequences: SampleData[PairedEndSequencesWithQuality]

The resulting demultiplexed sequences.[required]

untrimmed_sequences: MultiplexedPairedEndBarcodeInSequence

The sequences that were unmatched to barcodes.[required]

Examples

paired

wget -O 'seqs.qza' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/cutadapt/demux-paired/1/seqs.qza'

wget -O 'md.tsv' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/cutadapt/demux-paired/1/md.tsv'

qiime cutadapt demux-paired \
  --i-seqs seqs.qza \
  --m-forward-barcodes-file md.tsv \
  --m-forward-barcodes-column barcode-sequence \
  --o-per-sample-sequences per-sample-sequences.qza \
  --o-untrimmed-sequences untrimmed-sequences.qza

This QIIME 2 plugin uses cutadapt to work with adapters (e.g. barcodes, primers) in sequence data.

version: 2024.10.0
website: https://github.com/qiime2/q2-cutadapt
user support:
Please post to the QIIME 2 forum for help with this plugin: https://forum.qiime2.org
citations:
Martin, 2011

Actions

NameTypeShort Description
trim-singlemethodFind and remove adapters in demultiplexed single-end sequences.
trim-pairedmethodFind and remove adapters in demultiplexed paired-end sequences.
demux-singlemethodDemultiplex single-end sequence data with barcodes in-sequence.
demux-pairedmethodDemultiplex paired-end sequence data with barcodes in-sequence.


cutadapt trim-single

Search demultiplexed single-end sequences for adapters and remove them. The parameter descriptions in this method are adapted from the official cutadapt docs - please see those docs at https://cutadapt.readthedocs.io for complete details.

Citations

Martin, 2011

Inputs

demultiplexed_sequences: SampleData[SequencesWithQuality]

The single-end sequences to be trimmed.[required]

Parameters

cores: Threads

Number of CPU cores to use.[default: 1]

adapter: List[Str]

Sequence of an adapter ligated to the 3' end. The adapter and any subsequent bases are trimmed. If a $ is appended, the adapter is only found if it is at the end of the read. If your sequence of interest is "framed" by a 5' and a 3' adapter, use this parameter to define a "linked" primer - see https://cutadapt.readthedocs.io for complete details.[optional]

front: List[Str]

Sequence of an adapter ligated to the 5' end. The adapter and any preceding bases are trimmed. Partial matches at the 5' end are allowed. If a ^ character is prepended, the adapter is only found if it is at the beginning of the read.[optional]

anywhere: List[Str]

Sequence of an adapter that may be ligated to the 5' or 3' end. Both types of matches as described under adapter and front are allowed. If the first base of the read is part of the match, the behavior is as with front, otherwise as with adapter. This option is mostly for rescuing failed library preparations - do not use if you know which end your adapter was ligated to.[optional]

error_rate: Float % Range(0, 1, inclusive_end=True)

Maximum allowed error rate.[default: 0.1]

indels: Bool

Allow insertions or deletions of bases when matching adapters.[default: True]

times: Int % Range(1, None)

Remove multiple occurrences of an adapter if it is repeated, up to times times.[default: 1]

overlap: Int % Range(1, None)

Require at least overlap bases of overlap between read and adapter for an adapter to be found.[default: 3]

match_read_wildcards: Bool

Interpret IUPAC wildcards (e.g., N) in reads.[default: False]

match_adapter_wildcards: Bool

Interpret IUPAC wildcards (e.g., N) in adapters.[default: True]

minimum_length: Int % Range(1, None)

Discard reads shorter than specified value. Note, the cutadapt default of 0 has been overridden, because that value produces empty sequence records.[default: 1]

discard_untrimmed: Bool

Discard reads in which no adapter was found.[default: False]

max_expected_errors: Float % Range(0, None)

Discard reads that exceed maximum expected erroneous nucleotides.[optional]

max_n: Float % Range(0, None)

Discard reads with more than COUNT N bases. If COUNT_or_FRACTION is a number between 0 and 1, it is interpreted as a fraction of the read length.[optional]

quality_cutoff_5end: Int % Range(0, None)

Trim nucleotides with Phred score quality lower than threshold from 5 prime end.[default: 0]

quality_cutoff_3end: Int % Range(0, None)

Trim nucleotides with Phred score quality lower than threshold from 3 prime end.[default: 0]

quality_base: Int % Range(0, None)

How the Phred score is encoded (33 or 64).[default: 33]

Outputs

trimmed_sequences: SampleData[SequencesWithQuality]

The resulting trimmed sequences.[required]


cutadapt trim-paired

Search demultiplexed paired-end sequences for adapters and remove them. The parameter descriptions in this method are adapted from the official cutadapt docs - please see those docs at https://cutadapt.readthedocs.io for complete details.

Citations

Martin, 2011

Inputs

demultiplexed_sequences: SampleData[PairedEndSequencesWithQuality]

The paired-end sequences to be trimmed.[required]

Parameters

cores: Threads

Number of CPU cores to use.[default: 1]

adapter_f: List[Str]

Sequence of an adapter ligated to the 3' end. The adapter and any subsequent bases are trimmed. If a $ is appended, the adapter is only found if it is at the end of the read. Search in forward read. If your sequence of interest is "framed" by a 5' and a 3' adapter, use this parameter to define a "linked" primer - see https://cutadapt.readthedocs.io for complete details.[optional]

front_f: List[Str]

Sequence of an adapter ligated to the 5' end. The adapter and any preceding bases are trimmed. Partial matches at the 5' end are allowed. If a ^ character is prepended, the adapter is only found if it is at the beginning of the read. Search in forward read.[optional]

anywhere_f: List[Str]

Sequence of an adapter that may be ligated to the 5' or 3' end. Both types of matches as described under adapter and front are allowed. If the first base of the read is part of the match, the behavior is as with front, otherwise as with adapter. This option is mostly for rescuing failed library preparations - do not use if you know which end your adapter was ligated to. Search in forward read.[optional]

adapter_r: List[Str]

Sequence of an adapter ligated to the 3' end. The adapter and any subsequent bases are trimmed. If a $ is appended, the adapter is only found if it is at the end of the read. Search in reverse read. If your sequence of interest is "framed" by a 5' and a 3' adapter, use this parameter to define a "linked" primer - see https://cutadapt.readthedocs.io for complete details.[optional]

front_r: List[Str]

Sequence of an adapter ligated to the 5' end. The adapter and any preceding bases are trimmed. Partial matches at the 5' end are allowed. If a ^ character is prepended, the adapter is only found if it is at the beginning of the read. Search in reverse read.[optional]

anywhere_r: List[Str]

Sequence of an adapter that may be ligated to the 5' or 3' end. Both types of matches as described under adapter and front are allowed. If the first base of the read is part of the match, the behavior is as with front, otherwise as with adapter. This option is mostly for rescuing failed library preparations - do not use if you know which end your adapter was ligated to. Search in reverse read.[optional]

error_rate: Float % Range(0, 1, inclusive_end=True)

Maximum allowed error rate.[default: 0.1]

indels: Bool

Allow insertions or deletions of bases when matching adapters.[default: True]

times: Int % Range(1, None)

Remove multiple occurrences of an adapter if it is repeated, up to times times.[default: 1]

overlap: Int % Range(1, None)

Require at least overlap bases of overlap between read and adapter for an adapter to be found.[default: 3]

match_read_wildcards: Bool

Interpret IUPAC wildcards (e.g., N) in reads.[default: False]

match_adapter_wildcards: Bool

Interpret IUPAC wildcards (e.g., N) in adapters.[default: True]

minimum_length: Int % Range(1, None)

Discard reads shorter than specified value. Note, the cutadapt default of 0 has been overridden, because that value produces empty sequence records.[default: 1]

discard_untrimmed: Bool

Discard reads in which no adapter was found.[default: False]

max_expected_errors: Float % Range(0, None)

Discard reads that exceed maximum expected erroneous nucleotides.[optional]

max_n: Float % Range(0, None)

Discard reads with more than COUNT N bases. If COUNT_or_FRACTION is a number between 0 and 1, it is interpreted as a fraction of the read length.[optional]

quality_cutoff_5end: Int % Range(0, None)

Trim nucleotides with Phred score quality lower than threshold from 5 prime end.[default: 0]

quality_cutoff_3end: Int % Range(0, None)

Trim nucleotides with Phred score quality lower than threshold from 3 prime end.[default: 0]

quality_base: Int % Range(0, None)

How the Phred score is encoded (33 or 64).[default: 33]

Outputs

trimmed_sequences: SampleData[PairedEndSequencesWithQuality]

The resulting trimmed sequences.[required]


cutadapt demux-single

Demultiplex sequence data (i.e., map barcode reads to sample ids). Barcodes are expected to be located within the sequence data (versus the header, or a separate barcode file).

Citations

Martin, 2011

Inputs

seqs: MultiplexedSingleEndBarcodeInSequence

The single-end sequences to be demultiplexed.[required]

Parameters

barcodes: MetadataColumn[Categorical]

The sample metadata column listing the per-sample barcodes.[required]

cut: Int

Remove the specified number of bases from the sequences. Bases are removed before demultiplexing. If a positive value is provided, bases are removed from the beginning of the sequences. If a negative value is provided, bases are removed from the end of the sequences.[default: 0]

anchor_barcode: Bool

Anchor the barcode. The barcode is then expected to occur in full length at the beginning (5' end) of the sequence. Can speed up demultiplexing if used.[default: False]

error_rate: Float % Range(0, 1, inclusive_end=True)

The level of error tolerance, specified as the maximum allowable error rate. The default value specified by cutadapt is 0.1 (=10%), which is greater than demux emp-*, which is 0.0 (=0%).[default: 0.1]

batch_size: Int % Range(0, None)

The number of samples cutadapt demultiplexes concurrently. Demultiplexing in smaller batches will yield the same result with marginal speed loss, and may solve "too many files" errors related to sample quantity. Set to "0" to process all samples at once.[default: 0]

minimum_length: Int % Range(1, None)

Discard reads shorter than specified value. Note, the cutadapt default of 0 has been overridden, because that value produces empty sequence records.[default: 1]

cores: Threads

Number of CPU cores to use.[default: 1]

Outputs

per_sample_sequences: SampleData[SequencesWithQuality]

The resulting demultiplexed sequences.[required]

untrimmed_sequences: MultiplexedSingleEndBarcodeInSequence

The sequences that were unmatched to barcodes.[required]

Examples

demux_single

[Command Line]
[Python API]
[Galaxy]
[R API]
[View Source]
wget -O 'seqs.qza' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/cutadapt/demux-single/1/seqs.qza'

wget -O 'md.tsv' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/cutadapt/demux-single/1/md.tsv'

qiime cutadapt demux-single \
  --i-seqs seqs.qza \
  --m-barcodes-file md.tsv \
  --m-barcodes-column BarcodeSequence \
  --o-per-sample-sequences per-sample-sequences.qza \
  --o-untrimmed-sequences untrimmed-sequences.qza

cutadapt demux-paired

Demultiplex sequence data (i.e., map barcode reads to sample ids). Barcodes are expected to be located within the sequence data (versus the header, or a separate barcode file).

Citations

Martin, 2011

Inputs

seqs: MultiplexedPairedEndBarcodeInSequence

The paired-end sequences to be demultiplexed.[required]

Parameters

forward_barcodes: MetadataColumn[Categorical]

The sample metadata column listing the per-sample barcodes for the forward reads.[required]

reverse_barcodes: MetadataColumn[Categorical]

The sample metadata column listing the per-sample barcodes for the reverse reads.[optional]

forward_cut: Int

Remove the specified number of bases from the forward sequences. Bases are removed before demultiplexing. If a positive value is provided, bases are removed from the beginning of the sequences. If a negative value is provided, bases are removed from the end of the sequences. If --p-mixed-orientation is set, then both --p-forward-cut and --p-reverse-cut must be set to the same value.[default: 0]

reverse_cut: Int

Remove the specified number of bases from the reverse sequences. Bases are removed before demultiplexing. If a positive value is provided, bases are removed from the beginning of the sequences. If a negative value is provided, bases are removed from the end of the sequences. If --p-mixed-orientation is set, then both --p-forward-cut and --p-reverse-cut must be set to the same value.[default: 0]

anchor_forward_barcode: Bool

Anchor the forward barcode. The barcode is then expected to occur in full length at the beginning (5' end) of the forward sequence. Can speed up demultiplexing if used.[default: False]

anchor_reverse_barcode: Bool

Anchor the reverse barcode. The barcode is then expected to occur in full length at the beginning (5' end) of the reverse sequence. Can speed up demultiplexing if used.[default: False]

error_rate: Float % Range(0, 1, inclusive_end=True)

The level of error tolerance, specified as the maximum allowable error rate.[default: 0.1]

batch_size: Int % Range(0, None)

The number of samples cutadapt demultiplexes concurrently. Demultiplexing in smaller batches will yield the same result with marginal speed loss, and may solve "too many files" errors related to sample quantity. Set to "0" to process all samples at once.[default: 0]

minimum_length: Int % Range(1, None)

Discard reads shorter than specified value. Note, the cutadapt default of 0 has been overridden, because that value produces empty sequence records.[default: 1]

mixed_orientation: Bool

Handle demultiplexing of mixed orientation reads (i.e. when forward and reverse reads coexist in the same file).[default: False]

cores: Threads

Number of CPU cores to use.[default: 1]

Outputs

per_sample_sequences: SampleData[PairedEndSequencesWithQuality]

The resulting demultiplexed sequences.[required]

untrimmed_sequences: MultiplexedPairedEndBarcodeInSequence

The sequences that were unmatched to barcodes.[required]

Examples

paired

[Command Line]
[Python API]
[Galaxy]
[R API]
[View Source]
wget -O 'seqs.qza' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/cutadapt/demux-paired/1/seqs.qza'

wget -O 'md.tsv' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/cutadapt/demux-paired/1/md.tsv'

qiime cutadapt demux-paired \
  --i-seqs seqs.qza \
  --m-forward-barcodes-file md.tsv \
  --m-forward-barcodes-column barcode-sequence \
  --o-per-sample-sequences per-sample-sequences.qza \
  --o-untrimmed-sequences untrimmed-sequences.qza

This QIIME 2 plugin uses cutadapt to work with adapters (e.g. barcodes, primers) in sequence data.

version: 2024.10.0
website: https://github.com/qiime2/q2-cutadapt
user support:
Please post to the QIIME 2 forum for help with this plugin: https://forum.qiime2.org
citations:
Martin, 2011

Actions

NameTypeShort Description
trim-singlemethodFind and remove adapters in demultiplexed single-end sequences.
trim-pairedmethodFind and remove adapters in demultiplexed paired-end sequences.
demux-singlemethodDemultiplex single-end sequence data with barcodes in-sequence.
demux-pairedmethodDemultiplex paired-end sequence data with barcodes in-sequence.


cutadapt trim-single

Search demultiplexed single-end sequences for adapters and remove them. The parameter descriptions in this method are adapted from the official cutadapt docs - please see those docs at https://cutadapt.readthedocs.io for complete details.

Citations

Martin, 2011

Inputs

demultiplexed_sequences: SampleData[SequencesWithQuality]

The single-end sequences to be trimmed.[required]

Parameters

cores: Threads

Number of CPU cores to use.[default: 1]

adapter: List[Str]

Sequence of an adapter ligated to the 3' end. The adapter and any subsequent bases are trimmed. If a $ is appended, the adapter is only found if it is at the end of the read. If your sequence of interest is "framed" by a 5' and a 3' adapter, use this parameter to define a "linked" primer - see https://cutadapt.readthedocs.io for complete details.[optional]

front: List[Str]

Sequence of an adapter ligated to the 5' end. The adapter and any preceding bases are trimmed. Partial matches at the 5' end are allowed. If a ^ character is prepended, the adapter is only found if it is at the beginning of the read.[optional]

anywhere: List[Str]

Sequence of an adapter that may be ligated to the 5' or 3' end. Both types of matches as described under adapter and front are allowed. If the first base of the read is part of the match, the behavior is as with front, otherwise as with adapter. This option is mostly for rescuing failed library preparations - do not use if you know which end your adapter was ligated to.[optional]

error_rate: Float % Range(0, 1, inclusive_end=True)

Maximum allowed error rate.[default: 0.1]

indels: Bool

Allow insertions or deletions of bases when matching adapters.[default: True]

times: Int % Range(1, None)

Remove multiple occurrences of an adapter if it is repeated, up to times times.[default: 1]

overlap: Int % Range(1, None)

Require at least overlap bases of overlap between read and adapter for an adapter to be found.[default: 3]

match_read_wildcards: Bool

Interpret IUPAC wildcards (e.g., N) in reads.[default: False]

match_adapter_wildcards: Bool

Interpret IUPAC wildcards (e.g., N) in adapters.[default: True]

minimum_length: Int % Range(1, None)

Discard reads shorter than specified value. Note, the cutadapt default of 0 has been overridden, because that value produces empty sequence records.[default: 1]

discard_untrimmed: Bool

Discard reads in which no adapter was found.[default: False]

max_expected_errors: Float % Range(0, None)

Discard reads that exceed maximum expected erroneous nucleotides.[optional]

max_n: Float % Range(0, None)

Discard reads with more than COUNT N bases. If COUNT_or_FRACTION is a number between 0 and 1, it is interpreted as a fraction of the read length.[optional]

quality_cutoff_5end: Int % Range(0, None)

Trim nucleotides with Phred score quality lower than threshold from 5 prime end.[default: 0]

quality_cutoff_3end: Int % Range(0, None)

Trim nucleotides with Phred score quality lower than threshold from 3 prime end.[default: 0]

quality_base: Int % Range(0, None)

How the Phred score is encoded (33 or 64).[default: 33]

Outputs

trimmed_sequences: SampleData[SequencesWithQuality]

The resulting trimmed sequences.[required]


cutadapt trim-paired

Search demultiplexed paired-end sequences for adapters and remove them. The parameter descriptions in this method are adapted from the official cutadapt docs - please see those docs at https://cutadapt.readthedocs.io for complete details.

Citations

Martin, 2011

Inputs

demultiplexed_sequences: SampleData[PairedEndSequencesWithQuality]

The paired-end sequences to be trimmed.[required]

Parameters

cores: Threads

Number of CPU cores to use.[default: 1]

adapter_f: List[Str]

Sequence of an adapter ligated to the 3' end. The adapter and any subsequent bases are trimmed. If a $ is appended, the adapter is only found if it is at the end of the read. Search in forward read. If your sequence of interest is "framed" by a 5' and a 3' adapter, use this parameter to define a "linked" primer - see https://cutadapt.readthedocs.io for complete details.[optional]

front_f: List[Str]

Sequence of an adapter ligated to the 5' end. The adapter and any preceding bases are trimmed. Partial matches at the 5' end are allowed. If a ^ character is prepended, the adapter is only found if it is at the beginning of the read. Search in forward read.[optional]

anywhere_f: List[Str]

Sequence of an adapter that may be ligated to the 5' or 3' end. Both types of matches as described under adapter and front are allowed. If the first base of the read is part of the match, the behavior is as with front, otherwise as with adapter. This option is mostly for rescuing failed library preparations - do not use if you know which end your adapter was ligated to. Search in forward read.[optional]

adapter_r: List[Str]

Sequence of an adapter ligated to the 3' end. The adapter and any subsequent bases are trimmed. If a $ is appended, the adapter is only found if it is at the end of the read. Search in reverse read. If your sequence of interest is "framed" by a 5' and a 3' adapter, use this parameter to define a "linked" primer - see https://cutadapt.readthedocs.io for complete details.[optional]

front_r: List[Str]

Sequence of an adapter ligated to the 5' end. The adapter and any preceding bases are trimmed. Partial matches at the 5' end are allowed. If a ^ character is prepended, the adapter is only found if it is at the beginning of the read. Search in reverse read.[optional]

anywhere_r: List[Str]

Sequence of an adapter that may be ligated to the 5' or 3' end. Both types of matches as described under adapter and front are allowed. If the first base of the read is part of the match, the behavior is as with front, otherwise as with adapter. This option is mostly for rescuing failed library preparations - do not use if you know which end your adapter was ligated to. Search in reverse read.[optional]

error_rate: Float % Range(0, 1, inclusive_end=True)

Maximum allowed error rate.[default: 0.1]

indels: Bool

Allow insertions or deletions of bases when matching adapters.[default: True]

times: Int % Range(1, None)

Remove multiple occurrences of an adapter if it is repeated, up to times times.[default: 1]

overlap: Int % Range(1, None)

Require at least overlap bases of overlap between read and adapter for an adapter to be found.[default: 3]

match_read_wildcards: Bool

Interpret IUPAC wildcards (e.g., N) in reads.[default: False]

match_adapter_wildcards: Bool

Interpret IUPAC wildcards (e.g., N) in adapters.[default: True]

minimum_length: Int % Range(1, None)

Discard reads shorter than specified value. Note, the cutadapt default of 0 has been overridden, because that value produces empty sequence records.[default: 1]

discard_untrimmed: Bool

Discard reads in which no adapter was found.[default: False]

max_expected_errors: Float % Range(0, None)

Discard reads that exceed maximum expected erroneous nucleotides.[optional]

max_n: Float % Range(0, None)

Discard reads with more than COUNT N bases. If COUNT_or_FRACTION is a number between 0 and 1, it is interpreted as a fraction of the read length.[optional]

quality_cutoff_5end: Int % Range(0, None)

Trim nucleotides with Phred score quality lower than threshold from 5 prime end.[default: 0]

quality_cutoff_3end: Int % Range(0, None)

Trim nucleotides with Phred score quality lower than threshold from 3 prime end.[default: 0]

quality_base: Int % Range(0, None)

How the Phred score is encoded (33 or 64).[default: 33]

Outputs

trimmed_sequences: SampleData[PairedEndSequencesWithQuality]

The resulting trimmed sequences.[required]


cutadapt demux-single

Demultiplex sequence data (i.e., map barcode reads to sample ids). Barcodes are expected to be located within the sequence data (versus the header, or a separate barcode file).

Citations

Martin, 2011

Inputs

seqs: MultiplexedSingleEndBarcodeInSequence

The single-end sequences to be demultiplexed.[required]

Parameters

barcodes: MetadataColumn[Categorical]

The sample metadata column listing the per-sample barcodes.[required]

cut: Int

Remove the specified number of bases from the sequences. Bases are removed before demultiplexing. If a positive value is provided, bases are removed from the beginning of the sequences. If a negative value is provided, bases are removed from the end of the sequences.[default: 0]

anchor_barcode: Bool

Anchor the barcode. The barcode is then expected to occur in full length at the beginning (5' end) of the sequence. Can speed up demultiplexing if used.[default: False]

error_rate: Float % Range(0, 1, inclusive_end=True)

The level of error tolerance, specified as the maximum allowable error rate. The default value specified by cutadapt is 0.1 (=10%), which is greater than demux emp-*, which is 0.0 (=0%).[default: 0.1]

batch_size: Int % Range(0, None)

The number of samples cutadapt demultiplexes concurrently. Demultiplexing in smaller batches will yield the same result with marginal speed loss, and may solve "too many files" errors related to sample quantity. Set to "0" to process all samples at once.[default: 0]

minimum_length: Int % Range(1, None)

Discard reads shorter than the specified value. Note that the cutadapt default of 0 has been overridden because that value produces empty sequence records.[default: 1]

cores: Threads

Number of CPU cores to use.[default: 1]
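
The interaction between error_rate and barcode length, and the sign convention of cut, can be sketched as follows. This is an illustration under stated assumptions, not the plugin's implementation: it assumes cutadapt's documented rule that the number of tolerated errors is the error rate multiplied by the length of the matched adapter, rounded down. The helper names allowed_errors and apply_cut are hypothetical.

```python
import math


def allowed_errors(barcode, error_rate=0.1):
    """Assumed rule: floor(error_rate * barcode length) errors are tolerated."""
    return math.floor(error_rate * len(barcode))


def apply_cut(sequence, cut=0):
    """Positive cut drops bases from the start, negative cut from the end."""
    if cut > 0:
        return sequence[cut:]
    if cut < 0:
        return sequence[:cut]
    return sequence


print(allowed_errors("ACGTACGT"))      # 8-base barcode at the default rate
print(allowed_errors("ACGTACGTAC"))    # 10-base barcode at the default rate
print(apply_cut("ACGTAC", 2))          # remove two bases from the 5' end
print(apply_cut("ACGTAC", -2))         # remove two bases from the 3' end
```

Under this assumption, barcodes shorter than 10 bases tolerate no errors at the default rate of 0.1, so raising error_rate only matters once the product reaches 1.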

Outputs

per_sample_sequences: SampleData[SequencesWithQuality]

The resulting demultiplexed sequences.[required]

untrimmed_sequences: MultiplexedSingleEndBarcodeInSequence

The sequences that were unmatched to barcodes.[required]
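
How reads end up in the two outputs above can be pictured with a deliberately simplified sketch. It assumes anchored barcodes, exact matching only (an error rate of 0), and that the barcode is removed from matched reads; quality strings, minimum_length, and batching are ignored. The function demux_anchored and the sample ids are hypothetical and are not part of the plugin.

```python
def demux_anchored(reads, barcodes):
    """Assign reads to samples by an exact barcode match at the 5' end.

    reads: list of sequence strings.
    barcodes: dict mapping sample id -> barcode sequence.
    Returns (per_sample, untrimmed).
    """
    per_sample = {sample_id: [] for sample_id in barcodes}
    untrimmed = []
    for read in reads:
        for sample_id, barcode in barcodes.items():
            if read.startswith(barcode):
                # Matched reads are stored with the barcode removed.
                per_sample[sample_id].append(read[len(barcode):])
                break
        else:
            # No barcode matched, so the read is kept unchanged.
            untrimmed.append(read)
    return per_sample, untrimmed


barcodes = {"sample_a": "AAAA", "sample_b": "CCCC"}
reads = ["AAAAGGTT", "CCCCTTGG", "GGGGAACC"]
per_sample, untrimmed = demux_anchored(reads, barcodes)
print(per_sample)
print(untrimmed)
```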

Examples

demux_single

wget -O 'seqs.qza' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/cutadapt/demux-single/1/seqs.qza'

wget -O 'md.tsv' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/cutadapt/demux-single/1/md.tsv'

qiime cutadapt demux-single \
  --i-seqs seqs.qza \
  --m-barcodes-file md.tsv \
  --m-barcodes-column BarcodeSequence \
  --o-per-sample-sequences per-sample-sequences.qza \
  --o-untrimmed-sequences untrimmed-sequences.qza

cutadapt demux-paired

Demultiplex sequence data (i.e., map barcode reads to sample ids). Barcodes are expected to be located within the sequence data (versus the header, or a separate barcode file).

Citations

Martin, 2011

Inputs

seqs: MultiplexedPairedEndBarcodeInSequence

The paired-end sequences to be demultiplexed.[required]

Parameters

forward_barcodes: MetadataColumn[Categorical]

The sample metadata column listing the per-sample barcodes for the forward reads.[required]

reverse_barcodes: MetadataColumn[Categorical]

The sample metadata column listing the per-sample barcodes for the reverse reads.[optional]

forward_cut: Int

Remove the specified number of bases from the forward sequences. Bases are removed before demultiplexing. If a positive value is provided, bases are removed from the beginning of the sequences. If a negative value is provided, bases are removed from the end of the sequences. If --p-mixed-orientation is set, then both --p-forward-cut and --p-reverse-cut must be set to the same value.[default: 0]

reverse_cut: Int

Remove the specified number of bases from the reverse sequences. Bases are removed before demultiplexing. If a positive value is provided, bases are removed from the beginning of the sequences. If a negative value is provided, bases are removed from the end of the sequences. If --p-mixed-orientation is set, then both --p-forward-cut and --p-reverse-cut must be set to the same value.[default: 0]

anchor_forward_barcode: Bool

Anchor the forward barcode. The barcode is then expected to occur in full length at the beginning (5' end) of the forward sequence. Anchoring can speed up demultiplexing.[default: False]

anchor_reverse_barcode: Bool

Anchor the reverse barcode. The barcode is then expected to occur in full length at the beginning (5' end) of the reverse sequence. Anchoring can speed up demultiplexing.[default: False]

error_rate: Float % Range(0, 1, inclusive_end=True)

The level of error tolerance, specified as the maximum allowable error rate.[default: 0.1]

batch_size: Int % Range(0, None)

The number of samples cutadapt demultiplexes concurrently. Demultiplexing in smaller batches will yield the same result with marginal speed loss, and may solve "too many files" errors related to sample quantity. Set to "0" to process all samples at once.[default: 0]

minimum_length: Int % Range(1, None)

Discard reads shorter than the specified value. Note that the cutadapt default of 0 has been overridden because that value produces empty sequence records.[default: 1]

mixed_orientation: Bool

Handle demultiplexing of mixed orientation reads (i.e. when forward and reverse reads coexist in the same file).[default: False]

cores: Threads

Number of CPU cores to use.[default: 1]
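
Two of the parameter rules above lend themselves to a short sketch: batch_size groups samples into consecutive batches (0 meaning a single batch), and mixed_orientation requires forward_cut and reverse_cut to be equal. The helpers sample_batches and check_cuts are hypothetical illustrations, not the plugin's code, and how reads are passed between batches is not shown.

```python
def sample_batches(sample_ids, batch_size=0):
    """Split sample ids into consecutive batches; 0 means one batch."""
    sample_ids = list(sample_ids)
    if batch_size == 0:
        return [sample_ids]
    return [sample_ids[i:i + batch_size]
            for i in range(0, len(sample_ids), batch_size)]


def check_cuts(forward_cut, reverse_cut, mixed_orientation):
    """Raise if mixed orientation is requested with unequal cut values."""
    if mixed_orientation and forward_cut != reverse_cut:
        raise ValueError(
            "forward_cut and reverse_cut must be equal when "
            "mixed_orientation is set"
        )


print(sample_batches(["s1", "s2", "s3", "s4", "s5"], batch_size=2))
print(sample_batches(["s1", "s2", "s3"]))
check_cuts(forward_cut=3, reverse_cut=3, mixed_orientation=True)  # passes
```

Smaller batches mean fewer output files open at once, which is why lowering batch_size may help with "too many files" errors.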

Outputs

per_sample_sequences: SampleData[PairedEndSequencesWithQuality]

The resulting demultiplexed sequences.[required]

untrimmed_sequences: MultiplexedPairedEndBarcodeInSequence

The sequences that were unmatched to barcodes.[required]

Examples

paired

wget -O 'seqs.qza' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/cutadapt/demux-paired/1/seqs.qza'

wget -O 'md.tsv' \
  'https://amplicon-docs.qiime2.org/en/latest/data/examples/cutadapt/demux-paired/1/md.tsv'

qiime cutadapt demux-paired \
  --i-seqs seqs.qza \
  --m-forward-barcodes-file md.tsv \
  --m-forward-barcodes-column barcode-sequence \
  --o-per-sample-sequences per-sample-sequences.qza \
  --o-untrimmed-sequences untrimmed-sequences.qza
