Day 1

Sequence Processing & Taxonomy

Day 1 is split into three blocks. The morning covers foundations: Alpine and Linux, the biology behind microbiome data, and the upstream wet-lab steps. Late morning shifts to tooling: VS Code, metadata, and activating QIIME2. The afternoon is the pipeline proper: importing sequences, denoising into ASVs with DADA2, and assigning taxonomy.

Tutorials

Foundations

  1. Alpine Overview & Linux Refresher

    Connect to Alpine, claim an interactive node, and brush up on the Linux commands you'll use all week.

  2. Microbial Ecology & QIIME2 Core Concepts

    What microbiome data is, what 16S amplicons measure, and how QIIME2's artifacts and visualizations are organized.

  3. Sample Preparation, Sequencing Technology & NECs

    Sample collection, DNA extraction, library prep, sequencing chemistry, and the controls that protect every batch.

Tooling

  1. Keeping Track of Code & Using VS Code

    Editing on Alpine via VS Code Remote-SSH and keeping a re-runnable record of your analysis.

  2. Metadata & the Decomposition Microbiome

    Frame the dataset's biological context (vertebrate decomposition, postmortem succession, accumulated degree days (ADD)), then walk through the metadata columns that encode it.

  3. Activating QIIME2 & Importing Metadata

    Load the QIIME2 module, set up your working directory, copy the metadata file, and validate it.

Pipeline

  1. Importing Sequences

    Use manifest files to import paired-end FASTQ files from two sequencing runs into QIIME2 artifacts.

  2. Denoising with DADA2

    Remove sequencing errors, merge paired reads, remove chimeras, and merge outputs from both runs into a single feature table.

  3. Taxonomic Classification

    Classify ASVs against GreenGenes2, generate taxonomy bar plots, and remove host mitochondrial / chloroplast reads.
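
The manifest-based import in the first pipeline step can be sketched in Python. This is a minimal illustration, not the tutorial's own script: the column headers follow QIIME 2's paired-end manifest layout (`sample-id`, `forward-absolute-filepath`, `reverse-absolute-filepath`) as commonly documented, and the sample IDs, file paths, and the `build_manifest` helper are hypothetical placeholders.

```python
import csv
import io


def build_manifest(samples):
    """Return tab-separated manifest text for paired-end FASTQ files.

    `samples` is an iterable of (sample_id, forward_path, reverse_path).
    The header names are assumed to match QIIME 2's paired-end manifest
    format; check the import tutorial for the exact format it expects.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(
        ["sample-id", "forward-absolute-filepath", "reverse-absolute-filepath"]
    )
    for sample_id, forward_path, reverse_path in samples:
        writer.writerow([sample_id, forward_path, reverse_path])
    return buf.getvalue()


# Hypothetical sample IDs and paths, for illustration only.
example_samples = [
    ("sampleA", "/path/to/run2/sampleA_R1.fastq.gz", "/path/to/run2/sampleA_R2.fastq.gz"),
    ("sampleB", "/path/to/run2/sampleB_R1.fastq.gz", "/path/to/run2/sampleB_R2.fastq.gz"),
]

manifest_text = build_manifest(example_samples)
print(manifest_text)
```

One manifest per sequencing run keeps the two runs separate until their denoised outputs are merged.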

Key Outputs

| Artifact | Description |
| --- | --- |
| demux_run2.qza / demux_run3.qza | Imported demultiplexed sequences per run |
| table_run2.qza / table_run3.qza | Per-run ASV feature tables |
| seqs_run2.qza / seqs_run3.qza | Per-run representative sequences |
| dada2_stats_run2.qza / dada2_stats_run3.qza | Per-run denoising statistics |
| table.qza | Merged feature table (both runs) |
| seqs.qza | Merged representative sequences (both runs) |
| taxonomy_gg2.qza | Taxonomic classification for all ASVs |
| table_nomitochloro.qza | Feature table with mitochondria / chloroplast reads removed |
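
At the end of the day, a quick way to confirm that every artifact listed above exists is a short check like the following. It is a sketch under assumptions: the file names come from the list above, but the `missing_artifacts` helper is hypothetical, and it assumes all artifacts sit directly in one working directory, which may not match your layout.

```python
from pathlib import Path

# Artifact names taken from the Key Outputs list above.
EXPECTED_ARTIFACTS = [
    "demux_run2.qza",
    "demux_run3.qza",
    "table_run2.qza",
    "table_run3.qza",
    "seqs_run2.qza",
    "seqs_run3.qza",
    "dada2_stats_run2.qza",
    "dada2_stats_run3.qza",
    "table.qza",
    "seqs.qza",
    "taxonomy_gg2.qza",
    "table_nomitochloro.qza",
]


def missing_artifacts(workdir):
    """Return the expected artifact names that are not files in `workdir`."""
    workdir = Path(workdir)
    return [name for name in EXPECTED_ARTIFACTS if not (workdir / name).is_file()]


# Assumption: the current directory is the Day 1 working directory.
for name in missing_artifacts("."):
    print("missing:", name)
```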