Taxonomic Classification
We classify ASVs by comparing them to the GreenGenes2 reference database using a pre-trained Naive Bayes classifier. We then visualize taxonomic composition and remove reads that originated from host mitochondria and chloroplasts.
Get the Taxonomic Classifier
Copy the pre-trained GreenGenes2 V4 classifier:
About the Classifier
This is a Naive Bayes classifier trained on the GreenGenes2 (2024.09) backbone sequences trimmed to the V4 hypervariable region. It is specific to the primer set and read length used in this workshop. For your own data you will need a classifier trained on the appropriate region.
Classify ASVs
qiime feature-classifier classify-sklearn \
  --i-reads seqs.qza \
  --i-classifier 2024.09.backbone.v4.nb.qza \
  --o-classification taxonomy_gg2.qza
Inspect the Classification
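The Outputs table lists taxonomy_gg2.qzv as a tabulated view of the taxonomy with confidence scores. A minimal sketch of how that visualization could be produced, assuming the standard `qiime metadata tabulate` action is used and the file names follow the Outputs table; the guard simply skips the call when QIIME 2 or the input file is missing:

```shell
# Assumed file names: taxonomy_gg2.qza comes from the classify step,
# taxonomy_gg2.qzv is the name given in the Outputs table.
TAXONOMY=taxonomy_gg2.qza
TAXONOMY_VIZ="${TAXONOMY%.qza}.qzv"

if command -v qiime >/dev/null 2>&1 && [ -f "$TAXONOMY" ]; then
  # Tabulate the per-ASV taxonomy assignments and confidence values.
  qiime metadata tabulate \
    --m-input-file "$TAXONOMY" \
    --o-visualization "$TAXONOMY_VIZ"
else
  echo "Skipping: qiime or $TAXONOMY not available" >&2
fi
```

The resulting .qzv file can be opened in QIIME 2 View to search and sort the assignments.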
Group Samples for Barplots
Group samples by the type_days metadata column (combined sample type + timepoint) using the mean-ceiling method:
qiime feature-table group \
  --i-table table.qza \
  --m-metadata-file metadata_q2_workshop.txt \
  --m-metadata-column type_days \
  --p-mode mean-ceiling \
  --p-axis sample \
  --o-grouped-table table_type_days.qza
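To make the mean-ceiling mode concrete: for each feature, the counts of all samples in a group are averaged and the mean is rounded up to the next integer. This keeps the grouped table integer-valued, and a feature seen at low abundance in a group ends up as 1 rather than 0. The sketch below reproduces that arithmetic for a single feature with made-up counts; it illustrates the calculation only and is not how QIIME 2 implements it.

```shell
# Toy counts for one ASV across three samples in the same type_days group
# (made-up numbers, not taken from the workshop data).
counts="10 3 0"

# Mean of the counts, rounded up to the next integer (ceiling).
# The int()-based ceiling is valid here because counts are non-negative.
mean_ceiling=$(echo "$counts" | awk '{
  sum = 0
  for (i = 1; i <= NF; i++) sum += $i
  mean = sum / NF
  ceil = (mean == int(mean)) ? mean : int(mean) + 1
  print ceil
}')

echo "$mean_ceiling"   # 13 / 3 = 4.33..., rounded up to 5
```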
Taxonomy Barplots
qiime taxa barplot \
  --i-table table_type_days.qza \
  --i-taxonomy taxonomy_gg2.qza \
  --o-visualization taxa_barplot_type_days.qzv
Remove Mitochondria, Chloroplasts, and Contaminants
Filter the grouped table:
Open taxa_barplot_type_days.qzv and look for taxa that are host-derived rather than microbial. Mitochondrial and chloroplast rRNA genes amplify with 16S primers, but these organelles are not bacteria. sp004296775 is a GreenGenes2 identifier for a contaminant present in this dataset. Pass the terms to --p-exclude as a comma-separated list with no spaces.
--p-exclude: comma-separated list of strings to match against taxonomy annotations (case-insensitive substring match). Features whose taxonomy contains any of these strings are removed from the table.
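The matching rule can be mimicked with plain shell tools to see which features a given exclude list would drop. In the sketch below the ASV IDs and taxonomy strings are hypothetical, and the exclude terms are assumed from the taxa named in this section. QIIME 2 applies the filter to artifacts, not text files, so this is only an approximation of the logic.

```shell
# Hypothetical feature IDs and taxonomy strings (illustrative only).
taxonomy=$(printf '%s\t%s\n' \
  asv1 'd__Bacteria; p__Firmicutes; g__Lactobacillus' \
  asv2 'd__Bacteria; o__Rickettsiales; f__Mitochondria' \
  asv3 'd__Bacteria; p__Cyanobacteria; o__Chloroplast' \
  asv4 'd__Bacteria; g__Example; s__Example sp004296775')

# Assumed exclude list: comma-separated, no spaces.
exclude="mitochondria,chloroplast,sp004296775"

# Turn the comma-separated list into an alternation pattern for grep -E.
pattern=$(echo "$exclude" | tr ',' '|')

# -i: case-insensitive, -v: keep only lines that match none of the terms.
kept=$(printf '%s\n' "$taxonomy" | grep -i -v -E "$pattern" | cut -f1)

echo "$kept"   # only asv1 survives the filter
```

Because the match is a substring test, a short or generic term can remove more than intended, so it is worth checking the list of dropped features before relying on the filtered table.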
Also filter the full (ungrouped) table. This version will be used in downstream R analyses:
--p-exclude: comma-separated list of taxonomy substrings to match, using the same logic as the grouped-table filter above.
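Both filtering steps in this section could be run along the following lines. This is a sketch that assumes the `qiime taxa filter-table` action, the exclude terms named in the text (the exact strings are an assumption), and the output file names from the Outputs table. The loop skips a table when QIIME 2 or the input file is not available.

```shell
# Assumed exclude terms, based on the taxa named in this section.
EXCLUDE="mitochondria,chloroplast,sp004296775"
OUTPUTS=""

# Filter the grouped table and the full (ungrouped) table the same way.
for table in table_type_days.qza table.qza; do
  out="${table%.qza}_nomitochloro.qza"
  OUTPUTS="$OUTPUTS $out"
  if command -v qiime >/dev/null 2>&1 && [ -f "$table" ]; then
    qiime taxa filter-table \
      --i-table "$table" \
      --i-taxonomy taxonomy_gg2.qza \
      --p-exclude "$EXCLUDE" \
      --o-filtered-table "$out"
  else
    echo "Skipping $table: qiime or input file not available" >&2
  fi
done
```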
Taxonomy Barplots Without Contaminants
qiime taxa barplot \
  --i-table table_type_days_nomitochloro.qza \
  --i-taxonomy taxonomy_gg2.qza \
  --o-visualization taxa_barplot_type_days_nomitochloro.qzv
Outputs
| File | Type | Description |
|---|---|---|
| taxonomy_gg2.qza | Artifact | Taxonomic assignments for all ASVs |
| taxonomy_gg2.qzv | Visualization | Tabulated taxonomy with confidence scores |
| table_type_days.qza | Artifact | Feature table grouped by sample type + day |
| taxa_barplot_type_days.qzv | Visualization | Taxonomy barplot (all reads) |
| table_type_days_nomitochloro.qza | Artifact | Grouped table, contaminants removed |
| table_nomitochloro.qza | Artifact | Full table, contaminants removed |
| taxa_barplot_type_days_nomitochloro.qzv | Visualization | Taxonomy barplot (contaminants removed) |
This completes Day 1. Continue to Day 2, Community & Advanced Analyses.