anvio

read recruitment glossary

Read here means a short piece of sequenced DNA, and recruitment means matching those short pieces to a longer reference sequence that they resemble. It’s like you have a job (a genome), and people (reads) get recruited for it. So, if we have a genome and want to see whether it exists in our sample, as a whole or in parts, we use this technique.
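To make the idea concrete, here is a minimal Python sketch of read recruitment. It is a toy illustration under strong assumptions, not how real read mappers work: a read is recruited only if it matches the reference exactly, and only its first match is counted. The function name `recruit_reads` and the example sequences are made up for this illustration.

```python
def recruit_reads(reference, reads):
    """Recruit reads to a reference by exact matching (toy assumption).

    Returns the number of recruited reads and the per-base coverage,
    i.e. how many recruited reads overlap each reference position.
    """
    coverage = [0] * len(reference)
    recruited = 0
    for read in reads:
        start = reference.find(read)  # first exact match, or -1 if none
        if start == -1:
            continue  # this read does not match the reference: not recruited
        recruited += 1
        for position in range(start, start + len(read)):
            coverage[position] += 1
    return recruited, coverage


# Hypothetical toy data: a tiny "genome" and four short "reads".
reference = "ATGCGTACGTTAGC"
reads = ["ATGCG", "GTACG", "TTAGC", "CCCCC"]

recruited, coverage = recruit_reads(reference, reads)
print(recruited)  # number of reads that found a place on the reference
print(coverage)   # per-base coverage along the reference
```

If every position of the reference ends up covered, the genome is likely present in the sample as a whole; if only some stretches are covered, only parts of it are.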

Please watch this video to fully grasp the idea.

Before doing the read recruitment exercise, here is a glossary that you might find useful:

sequencing

Determining the precise order of nucleotides in a DNA or RNA molecule.

reference sequence
metagenomes
short reads
short reads generation
  1. Fragmentation: DNA is broken up using physical, enzymatic, or chemical methods. This produces fragments of varying sizes, mostly within a specific range (200–500 base pairs).
  2. Size selection: Using electrophoresis and purification, fragments within the desired size range are enriched, while the others are discarded.
  3. Adapter ligation: Short DNA sequences (adapters) are added to the ends of the fragments so that they can attach to the sequencing platform.
  4. Sequencing by synthesis: The Illumina instrument sequences the fragments. The read length is defined by the number of sequencing cycles.
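The four steps above can be mimicked in a short Python simulation. This is only a sketch under simplifying assumptions: fragmentation is modeled as random cuts, adapter ligation is not modeled at all, and sequencing is error-free. The function `simulate_short_reads`, its parameters, and the random genome are hypothetical and exist only for this illustration.

```python
import random


def simulate_short_reads(genome, n_fragments, min_size=200, max_size=500,
                         cycles=150, seed=0):
    """Toy simulation of short read generation from a genome string."""
    rng = random.Random(seed)
    reads = []
    for _ in range(n_fragments):
        # 1. Fragmentation: cut out a piece at a random place, of random length.
        start = rng.randrange(len(genome))
        length = rng.randint(50, 800)
        fragment = genome[start:start + length]

        # 2. Size selection: keep only fragments within the desired range.
        if not (min_size <= len(fragment) <= max_size):
            continue

        # 3. Adapter ligation: not modeled in this sketch.

        # 4. Sequencing by synthesis: one base is read per cycle, so the
        #    read length equals the number of cycles.
        reads.append(fragment[:cycles])
    return reads


# A hypothetical random genome of 5,000 bases.
genome_rng = random.Random(1)
genome = "".join(genome_rng.choice("ACGT") for _ in range(5000))

reads = simulate_short_reads(genome, n_fragments=1000)
print(len(reads), "reads of length", len(reads[0]))
```

Because every fragment that passes size selection is at least 200 bases long and only 150 cycles are run, every simulated read has the same length, which is also what you see in real Illumina data before trimming.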
contig
amplicon
metagenomic

Metagenomics is the study of genetic material recovered directly from environmental samples, bypassing the need to isolate and culture individual organisms.

single copy core genes
MAG
What is the purpose of read recruitment?
What can serve as a reference?
What can serve as short reads?

Short reads are the raw sequencing reads in your dataset. They are obtained by extracting and fragmenting DNA in vitro and sequencing the fragments with a machine; the resulting sequences are then analyzed in silico.

Are amplicons connected or separated?

Amplicons can be connected or separated, depending on the context of the analysis:

What is the difference: amplicons - reads?

The difference between amplicons and reads can be visualized in the following image:

[Image: Amplicon vs Read]
Can you get amplicons from a read?

We don’t usually produce amplicons from reads, because reads are the sequences of random DNA fragments generated during sequencing. Amplicons, on the other hand, are specific DNA regions amplified in a PCR-based process that targets a particular part of the genome (e.g., the 16S rRNA gene). Reads can, however, be used to reconstruct amplicons when they come from sequencing targeted amplicons.
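To illustrate why an amplicon is a specific region while a read is a random one, here is a minimal Python sketch of an "in silico PCR". It assumes exact primer matches on a single strand, which real PCR does not require; the function names, primers, and template sequence are all made up for this example.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def extract_amplicon(template, forward_primer, reverse_primer):
    """Return the region bounded by the two primers, or None if not found."""
    start = template.find(forward_primer)
    if start == -1:
        return None
    # The reverse primer binds the opposite strand, so on the template
    # strand we look for its reverse complement downstream of the
    # forward primer.
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Hypothetical template: flank + forward primer site + target + reverse site + flank.
template = "TTTT" + "ACGTAC" + "GGGGCCCC" + "ATTGCA" + "AAAA"

amplicon = extract_amplicon(template, "ACGTAC", "TGCAAT")
print(amplicon)
```

The same primers always yield the same region from the same template, whereas a shotgun read could start anywhere on it.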