Rarefaction
Rarefaction (even sampling depth) is used to normalize for unequal sequencing depth across samples before calculating diversity metrics. The alpha rarefaction curves help you identify a depth where diversity estimates have stabilized, allowing you to retain as many samples as possible while excluding only those with very low read counts.
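To make the idea of an even sampling depth concrete, here is a minimal sketch in plain Python. It is an illustration only, not QIIME 2's implementation: the `rarefy` helper, the sample names, and the feature counts are all hypothetical. Each sample is subsampled without replacement to the same number of reads, and samples with fewer reads than the chosen depth are dropped.

```python
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Subsample a {feature: count} dict to exactly `depth` reads.

    Sampling is without replacement. Returns None when the sample has
    fewer than `depth` reads, mirroring a sample being excluded.
    """
    total = sum(counts.values())
    if total < depth:
        return None
    # Expand to one entry per read, then draw `depth` reads at random.
    reads = [feature for feature, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return dict(Counter(rng.sample(reads, depth)))


# Hypothetical samples with unequal sequencing depth.
samples = {
    "sample_a": {"ASV1": 900, "ASV2": 500, "ASV3": 100},  # 1,500 reads
    "sample_b": {"ASV1": 4000, "ASV2": 1000},             # 5,000 reads
    "sample_c": {"ASV1": 30, "ASV3": 20},                 # 50 reads
}
rarefied = {name: rarefy(c, depth=1000) for name, c in samples.items()}
```

After this runs, `sample_a` and `sample_b` each hold exactly 1,000 reads, while `sample_c` maps to `None` because it never reached the requested depth.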
Alpha Rarefaction Curves
The qiime diversity alpha-rarefaction command calculates multiple alpha diversity metrics (Faith's PD, observed features, Shannon entropy) at rarefaction depths ranging from 1 to --p-max-depth and plots them as curves.
To choose the maximum depth, open table_nomitochloro.qzv from the previous step. In the interactive table summary, find the value around the 95th percentile of per-sample read counts. For this dataset, that value is 5500. Setting the max depth too low truncates the curves before they plateau; setting it too high wastes computation.
--p-max-depth: Maximum rarefaction depth (integer). The command calculates diversity at multiple depths from 1 to this value and plots them as curves. It should be approximately the 95th percentile of per-sample read counts from your table summary.
Choosing a Max Depth
Set --p-max-depth to approximately the 95th percentile of your per-sample read counts, which you can identify in the table_nomitochloro.qzv summary from the previous step. Setting it too high wastes computation; too low and your curves may not plateau.
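If you prefer to compute the 95th percentile rather than read it off the interactive summary, the following sketch shows one way to do it. It assumes you have already copied the per-sample read counts into a Python list; the `percentile` helper and the numbers below are hypothetical and are not the workshop data.

```python
import math


def percentile(values, q):
    """Return the q-th percentile (0-100), interpolating linearly between ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    position = (len(ordered) - 1) * q / 100
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Hypothetical per-sample read counts.
read_counts = [120, 1800, 2500, 3100, 3600, 4200, 4800, 5100, 5400, 6000, 9000]

# --p-max-depth expects an integer, so truncate the result.
max_depth = int(percentile(read_counts, 95))
```

Using a high percentile rather than the maximum keeps one unusually deep sample from stretching the curves far past the depth that most samples reach.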
Interpreting the Curves
Open alpha_rarefaction_curves.qzv in QIIME2 View and look for the depth where the diversity metrics plateau. Beyond that depth, additional reads no longer add new diversity information.
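The shape of a rarefaction curve can be reproduced with a small simulation. The sketch below is a simplified stand-in for what the visualization shows, not QIIME 2's code: it subsamples one hypothetical sample to several depths and averages the number of distinct features seen. The `observed_features_curve` helper, the feature names, and the abundances are assumptions made for illustration.

```python
import random


def observed_features_curve(counts, depths, iterations=10, seed=0):
    """Mean number of distinct features seen when subsampling to each depth."""
    reads = [feature for feature, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    curve = {}
    for depth in depths:
        if depth > len(reads):
            break  # the sample has too few reads for this depth
        richness = [len(set(rng.sample(reads, depth))) for _ in range(iterations)]
        curve[depth] = sum(richness) / iterations
    return curve


# Hypothetical sample: five abundant features and forty rare ones (4,000 reads).
abundances = [2000, 1000, 500, 200, 100] + [5] * 40
sample = {f"ASV{i}": n for i, n in enumerate(abundances)}

curve = observed_features_curve(sample, depths=[10, 100, 500, 1000, 2000, 4000])
for depth, richness in curve.items():
    print(depth, richness)
```

The printed values rise quickly at low depths and flatten as the depth approaches the full read count, which is the plateau to look for in the real curves.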
Workshop Sampling Depth
For this dataset we use a sampling depth of 1,500 reads in the core metrics step. This retains the majority of samples while excluding those with insufficient sequencing coverage.
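The trade-off behind a sampling depth can be checked with a few lines of Python. This sketch assumes a dictionary of per-sample read counts; the sample IDs and counts are hypothetical, and only the depth of 1,500 comes from the workshop. Samples with fewer reads than the depth are dropped, and the rest are retained.

```python
def retention(read_counts, depth):
    """Return (kept, dropped) sample IDs for a given sampling depth."""
    kept = [s for s, n in read_counts.items() if n >= depth]
    dropped = [s for s, n in read_counts.items() if n < depth]
    return kept, dropped


# Hypothetical per-sample read counts.
read_counts = {"s1": 420, "s2": 1500, "s3": 2750, "s4": 5100, "s5": 980, "s6": 3300}

kept, dropped = retention(read_counts, depth=1500)
fraction_kept = len(kept) / len(read_counts)
```

Running the same check at a few candidate depths shows how many samples each choice would cost before you commit to one in the core metrics step.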
Outputs
| File | Type | Description |
|---|---|---|
| alpha_rarefaction_curves.qzv | Visualization | Alpha diversity vs. rarefaction depth |