Introduction to Ancient DNA (aDNA)
Ancient DNA (aDNA) refers to genetic material recovered from archaeological, paleontological, and historical specimens. This field merges molecular biology, archaeology, and bioinformatics to reveal insights about past populations, evolution, and human history. The analysis of aDNA has revolutionized our understanding of human migrations, extinct species, disease evolution, and adaptation processes. Despite significant challenges such as DNA degradation, contamination, and limited sample availability, technological advances in the past two decades have transformed aDNA from a niche specialty to a mainstream scientific discipline. This cheat sheet provides a comprehensive overview of ancient DNA technologies, methodologies, applications, and best practices for researchers and practitioners.
Core Concepts and Terminology
Key aDNA Characteristics
Characteristic | Description | Implications for Research |
---|---|---|
Fragmentation | DNA breaks into short fragments (typically <100 bp) | Requires specialized extraction and sequencing approaches |
Low Endogenous Content | Often <1% of extracted DNA belongs to target organism | Necessitates enrichment methods and deep sequencing |
Chemical Modifications | Cytosine deamination (C→T) especially at fragment ends | Creates sequencing errors; used for authentication |
Cross-linking | Chemical bonds between DNA and proteins | Reduces extraction efficiency; requires specialized protocols |
Contamination | Modern human/microbial DNA introduction | Complicates analysis; requires strict lab controls |
Limited Quantity | Very small amounts of preserved DNA | Drives need for ultra-sensitive methods |
Preservation Factors
- Temperature: Colder environments better preserve DNA (permafrost, caves)
- Humidity: Dry conditions (desert) or waterlogged anaerobic environments
- pH: Neutral to slightly alkaline conditions are optimal
- Microbial Activity: Less microbial action means better preservation
- Time: Generally, older samples contain less intact DNA
- Specimen Type: Dense materials (petrous bone, teeth) preserve DNA better
Sample Collection and Handling
Optimal Sample Types
- Petrous Portion of Temporal Bone: Highest DNA yield in humans/animals
- Teeth: Especially dentine and cementum layers
- Dense Cortical Bone: Better preservation than cancellous (spongy) bone
- Hair with Root Sheath: Contains nuclear DNA
- Plant Seeds/Tissues: In dry preservation contexts
- Coprolites: For dietary and microbiome studies
- Sediments: For environmental DNA studies
Contamination Prevention Protocol
Field Collection:
- Use sterile tools and gloves
- Change gloves between samples
- Minimize handling
- Store in sterile containers
- Avoid exposure to direct sunlight
- Keep cool if possible
Laboratory Processing:
- Physical cleaning (UV, bleach treatment of surfaces)
- Sample surface removal (2-3mm)
- Dedicated aDNA facilities with:
  - Positive air pressure
  - HEPA filtration
  - UV irradiation
  - Separate pre- and post-PCR areas
- PPE (coveralls, face masks, hairnets, shoe covers, double gloves)
- Unidirectional workflow
Extraction Technologies
Classic Extraction Methods
Phenol-Chloroform Extraction
- Involves organic separation of DNA from proteins
- Limited sensitivity for highly degraded DNA
Silica-Based Methods
- Binds DNA to silica in presence of chaotropic salts
- Modified versions more effective for aDNA
- Examples: QIAquick PCR Purification Kit (modified)
Specialized aDNA Extraction Protocols
Dabney Protocol (2013)
- Optimized for ultrashort DNA fragments
- Uses extended binding buffer and modified silica columns
- Significantly higher recovery of fragments <50bp
Rohland & Hofreiter Method
- Uses guanidinium thiocyanate buffer
- Developed for challenging, poorly preserved bone and tooth samples
Glocke & Meyer Protocol
- Pre-digestion step to remove contamination
- Optimized for highly contaminated samples
- Multiple digestion steps
DNA from Sediments
- Phosphate buffers for clay-rich samples
- DNA binding to clay particles requires specialized approaches
Pre-Treatment Innovations
- EDTA Demineralization: Dissolves bone mineral matrix
- Proteinase K Digestion: Breaks down proteins bound to DNA
- N-phenacylthiazolium Bromide (PTB): Breaks DNA-protein crosslinks
- Bleach Surface Decontamination: Removes external contamination
- UV Irradiation of Surface: Damages contaminant DNA
Library Preparation Methods
Single-Stranded vs. Double-Stranded DNA Libraries
Aspect | Single-Stranded Method | Double-Stranded Method |
---|---|---|
Recovery Efficiency | Higher (captures both strands) | Lower (requires intact dsDNA) |
Fragment Size Recovery | Better for ultra-short fragments | Less efficient for <40bp |
Technical Difficulty | More complex protocol | Simpler, more established |
Contamination Susceptibility | Lower (can distinguish damage patterns) | Higher |
Cost | Higher | Lower |
Typical Applications | Very old/challenging samples | Relatively well-preserved samples |
Key Library Preparation Protocols
Meyer & Kircher Protocol (2010)
- Double-stranded approach
- Illumina platform compatibility
- Widely used standard method
Gansauge & Meyer Protocol (2013)
- Single-stranded approach
- Significantly higher DNA recovery
- Critical breakthrough for very old samples
Swift Accel-NGS 2S
- Commercial kit with modifications for aDNA
- Reduces chimera formation
NEBNext Ultra II DNA
- Commercial kit optimized for low-input DNA
- Requires modifications for aDNA
Unique Molecular Identifiers (UMIs)
- Short, random nucleotide sequences added during library preparation
- Allow identification of PCR duplicates
- Critical for accurate quantification and error correction
- Implementation:
  - Add during adapter ligation
  - Track during bioinformatic processing
  - Collapse identical reads with same UMI
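The collapsing step can be sketched in a few lines of Python. This is a minimal illustration rather than how UMI-tools or any specific pipeline works: the read fields (`chrom`, `start`, `strand`, `umi`, `seq`) are hypothetical names, and the consensus is simply the most frequent sequence in each group, with no tolerance for sequencing errors inside the UMI itself.

```python
from collections import Counter, defaultdict


def collapse_by_umi(reads):
    """Collapse reads that share mapping position, strand, and UMI.

    `reads` is an iterable of dicts with the (hypothetical) keys
    'chrom', 'start', 'strand', 'umi', and 'seq'.
    Returns one record per group with a majority-vote sequence.
    """
    groups = defaultdict(list)
    for read in reads:
        key = (read["chrom"], read["start"], read["strand"], read["umi"])
        groups[key].append(read["seq"])

    collapsed = []
    for key, seqs in groups.items():
        # The most frequent sequence stands in for the original molecule.
        consensus, _count = Counter(seqs).most_common(1)[0]
        collapsed.append({"key": key, "seq": consensus, "n_reads": len(seqs)})
    return collapsed


example_reads = [
    {"chrom": "chrM", "start": 100, "strand": "+", "umi": "ACGT", "seq": "TTGACCA"},
    {"chrom": "chrM", "start": 100, "strand": "+", "umi": "ACGT", "seq": "TTGACCA"},
    {"chrom": "chrM", "start": 100, "strand": "+", "umi": "ACGT", "seq": "TTGATCA"},
    {"chrom": "chrM", "start": 100, "strand": "+", "umi": "GGCA", "seq": "TTGACCA"},
]
collapsed_example = collapse_by_umi(example_reads)
```

In this example the three reads tagged `ACGT` collapse into one molecule, while the read tagged `GGCA` is kept as a separate molecule even though it maps to the same position.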
Enrichment Technologies
Target Enrichment Methods
Hybridization Capture
- Principle: Uses DNA/RNA baits complementary to targets
- Approaches:
  - Array-based capture (probes immobilized on a microarray)
  - In-solution capture (e.g., myBaits, Agilent SureSelect; more common for aDNA)
- Design Types:
  - Whole genome capture (e.g., human DNA from microbial background)
  - Targeted capture (specific genes, chromosomes)
  - Exome capture (all coding regions)
PCR-Based Enrichment
- Limited utility for highly fragmented DNA
- Used for specific targets with well-preserved samples
- Multiplex PCR allows multiple targets
CRISPR-Cas Systems
- Emerging technology for targeted enrichment
- Can be more specific than hybridization capture
- Examples: CATCH, CRISPR-Cap
Common Enrichment Targets
Mitochondrial DNA:
- Higher copy number than nuclear DNA
- Used for maternal lineage analysis
- Complete mitogenome more informative than HVR only
Y-Chromosome:
- Paternal lineage information
- Lower success rate than mtDNA because of its much lower copy number per cell
Autosomal DNA:
- Whole-genome or selected SNPs
- Population genetics, phenotypic traits
Pathogen DNA:
- Disease-causing organisms in archaeological remains
- Custom capture designs for specific pathogens
- Examples: Yersinia pestis (plague), M. tuberculosis, M. leprae
Sequencing Technologies for aDNA
Platform Comparison for aDNA Applications
Platform | Advantages | Limitations | Best Applications |
---|---|---|---|
Illumina | Short-read ideal for fragmented aDNA; high accuracy; established pipelines | Limited read length | Most aDNA applications; standard choice |
Ion Torrent | Fast run times; scalable output | Higher error rates in homopolymers; shorter read length | Rapid screening; smaller projects |
PacBio HiFi | High accuracy long reads | Requires high molecular weight DNA; limited utility for aDNA | Special cases with exceptional preservation |
Oxford Nanopore | Ultra-long reads; portable | Higher error rates; challenges with short fragments | Environmental aDNA; field applications |
MGI/BGI | Cost-effective high throughput | Limited established aDNA protocols | Large-scale population studies |
Sequencing Considerations for aDNA
- Read Length: Short reads (75-150bp) optimal for fragmented aDNA
- Sequencing Depth: Higher coverage compensates for damage/contamination
  - Shotgun: 0.1-1X for screening, >1X for analysis
  - Targeted: 20-100X minimum coverage
- Paired-End Sequencing: Improves mapping quality
- Platforms with Lower GC Bias: Important for accurate representation
- Multiplexing Strategy: Index design to avoid cross-contamination
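For planning, it can help to turn a coverage target into an approximate number of reads. The sketch below assumes the simple relation coverage = usable reads × mean read length / genome size, where usable reads are the endogenous, non-duplicate fraction. The function names and the `duplication_rate` parameter are illustrative assumptions, and real libraries saturate as sequencing gets deeper, so the result is a rough lower bound.

```python
import math


def expected_coverage(n_reads, mean_read_len, endogenous_fraction,
                      genome_size, duplication_rate=0.0):
    """Approximate mean depth from total reads sequenced."""
    usable_reads = n_reads * endogenous_fraction * (1.0 - duplication_rate)
    return usable_reads * mean_read_len / genome_size


def reads_needed(target_coverage, mean_read_len, endogenous_fraction,
                 genome_size, duplication_rate=0.0):
    """Approximate total reads required to reach a target mean depth."""
    usable_fraction = endogenous_fraction * (1.0 - duplication_rate)
    if usable_fraction <= 0:
        raise ValueError("endogenous fraction and uniqueness must be positive")
    return math.ceil(target_coverage * genome_size
                     / (mean_read_len * usable_fraction))


# Illustrative numbers: 1% endogenous DNA, 50 bp mean fragment length,
# a 3.1 Gb genome, and a 1X shotgun target.
n_reads_for_1x = reads_needed(1, 50, 0.01, 3_100_000_000)
```

With these illustrative inputs the estimate is on the order of six billion reads, which shows why low-endogenous samples are usually enriched or screened before deep sequencing.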
Bioinformatic Analysis Workflows
Raw Data Processing
Quality Control
- FastQC/MultiQC for sequence quality assessment
- Adapter trimming (Cutadapt, AdapterRemoval)
- Quality filtering
- Length filtering (typically keep >30bp)
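The length and quality filters above can be illustrated with a small pure-Python sketch. It is not a replacement for Cutadapt or AdapterRemoval. The thresholds (`min_len=30`, `min_mean_qual=20`) and the Phred+33 encoding are assumptions chosen for illustration.

```python
def parse_fastq(lines):
    """Yield (header, sequence, quality) from an iterable of FASTQ lines.

    Assumes the plain four-lines-per-record layout; a truncated final
    record is silently dropped.
    """
    it = iter(lines)
    for header, seq, _plus, qual in zip(it, it, it, it):
        yield header.rstrip("\n"), seq.rstrip("\n"), qual.rstrip("\n")


def filter_reads(records, min_len=30, min_mean_qual=20, phred_offset=33):
    """Keep reads at least `min_len` long with adequate mean base quality."""
    for header, seq, qual in records:
        if len(seq) < min_len or not qual:
            continue
        mean_q = sum(ord(c) - phred_offset for c in qual) / len(qual)
        if mean_q >= min_mean_qual:
            yield header, seq, qual


example_fastq = [
    "@read1\n", "A" * 35 + "\n", "+\n", "I" * 35 + "\n",  # long, high quality
    "@read2\n", "C" * 20 + "\n", "+\n", "I" * 20 + "\n",  # too short
    "@read3\n", "G" * 35 + "\n", "+\n", "#" * 35 + "\n",  # low quality
]
kept_reads = list(filter_reads(parse_fastq(example_fastq)))
```

Only `@read1` survives here: `@read2` fails the length cutoff and `@read3` fails the mean-quality cutoff.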
aDNA-Specific Processing
- Damage pattern assessment (mapDamage, DamageProfiler)
- UMI processing if applicable (UMI-tools)
- Read merging for overlapping pairs (FLASH, PEAR)
Mapping to Reference
- BWA-aln with ancient DNA parameters (-l 1024)
- Bowtie2 with --very-sensitive-local
- Pipelines that wrap mapping and post-processing (e.g., PALEOMIX)
- Consider closely related species for extinct organisms
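As a rough illustration of the BWA-aln settings, the helper below only assembles the command lines and does not run them. The `-l 1024` value comes from the list above, where a seed length longer than any read effectively disables seeding. The `-n 0.01` and `-o 2` defaults, the thread count, and the file names are assumptions added for the example and should be checked against your own protocol.

```python
def bwa_aln_commands(reference, fastq, prefix, seed_len=1024,
                     max_diff=0.01, max_gap_opens=2, threads=4):
    """Build `bwa aln` and `bwa samse` argument lists for single-end reads.

    Returns two lists suitable for subprocess.run(); nothing is executed.
    """
    sai = f"{prefix}.sai"
    aln = [
        "bwa", "aln",
        "-l", str(seed_len),       # seed length; 1024 disables seeding
        "-n", str(max_diff),       # allowed edit distance / missing-alignment rate
        "-o", str(max_gap_opens),  # maximum number of gap opens
        "-t", str(threads),
        "-f", sai,                 # write the .sai index here
        reference, fastq,
    ]
    samse = ["bwa", "samse", "-f", f"{prefix}.sam", reference, sai, fastq]
    return aln, samse


# Hypothetical file names, for illustration only.
aln_cmd, samse_cmd = bwa_aln_commands("ref.fa", "reads.fq.gz", "sample1")
```

Each list can be passed to `subprocess.run(cmd, check=True)` once BWA is installed and the reference has been indexed.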
Post-Mapping Processing
- Remove duplicates (Picard, samtools)
- Remove low-quality mappings (MapQ filtering)
- Rescale quality scores at damaged positions
- Local realignment around indels
Authentication Methods
Damage Pattern Analysis
- C→T transitions at 5′ ends (seen as G→A at 3′ ends in double-stranded libraries)
- Damage generally increases with sample age and with warmer burial conditions
- Tools: mapDamage, DamageProfiler, PMDtools
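To show what these tools measure, here is a minimal sketch of a positional C→T misincorporation profile. It assumes gap-free alignments supplied as `(reference, read)` string pairs, already oriented 5′ to 3′ along the read. This input format is a simplification invented for the example and is not how mapDamage or DamageProfiler read BAM files.

```python
def c_to_t_by_position(alignments, max_pos=10):
    """Return the C->T mismatch frequency at each of the first `max_pos`
    read positions, counted from the 5' end.

    Positions with no reference C are reported as None.
    """
    ref_c = [0] * max_pos
    c_to_t = [0] * max_pos
    for ref, read in alignments:
        for i, (r, q) in enumerate(zip(ref.upper(), read.upper())):
            if i >= max_pos:
                break
            if r == "C":
                ref_c[i] += 1
                if q == "T":
                    c_to_t[i] += 1
    return [c_to_t[i] / ref_c[i] if ref_c[i] else None for i in range(max_pos)]


example_alignments = [
    ("CACGT", "TACGT"),  # C->T at the first position
    ("CCAGT", "TCAGT"),  # C->T at the first position
    ("CAGCT", "CAGCT"),  # no mismatch
    ("CTTCA", "CTTCA"),  # no mismatch
]
damage_profile = c_to_t_by_position(example_alignments, max_pos=5)
```

In this toy input, half of the reference cytosines at the first position read as T and the rate drops to zero further in, which is the qualitative shape expected for authentic aDNA.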
Fragment Length Distribution
- Authentic aDNA shows shorter fragment length
- Typically 30-70bp average for ancient samples
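A fragment length summary is simple to compute once read lengths are available, for example from merged read pairs. The sketch below uses only the standard library. The `short_cutoff` of 70 bp echoes the range quoted above but is otherwise an arbitrary illustrative choice.

```python
from statistics import mean, median


def summarize_lengths(lengths, short_cutoff=70):
    """Summarize a collection of fragment lengths in base pairs."""
    lengths = list(lengths)
    if not lengths:
        raise ValueError("no fragment lengths supplied")
    n_short = sum(1 for length in lengths if length <= short_cutoff)
    return {
        "n": len(lengths),
        "mean": mean(lengths),
        "median": median(lengths),
        "frac_short": n_short / len(lengths),
    }


# Illustrative lengths: mostly short fragments plus one long outlier,
# the kind of read that may warrant a closer look as possible contamination.
length_summary = summarize_lengths([35, 42, 48, 55, 61, 150])
```

The median is less sensitive than the mean to a handful of long modern fragments, so it is worth reporting both.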
Contamination Estimation
- Nuclear DNA: Heterozygosity on X chromosome in males (ANGSD)
- mtDNA: Mismatches at haplotype-defining sites (Schmutzi, contamMix)
- Reference-free methods: DICE, AuthentiCT
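The idea behind the mtDNA approach can be shown with a deliberately naive estimator: at sites where the endogenous haplotype differs from likely contaminants, count the reads that do not carry the consensus allele. This sketch ignores sequencing error and deamination, both of which Schmutzi and contamMix model explicitly, so treat it as intuition only. The input format is hypothetical.

```python
import math


def naive_mt_contamination(site_counts):
    """Estimate contamination as the fraction of non-consensus reads.

    `site_counts` is an iterable of (n_consensus, n_other) read counts at
    haplotype-defining sites. Returns (estimate, binomial_standard_error).
    """
    site_counts = list(site_counts)
    n_other = sum(other for _cons, other in site_counts)
    n_total = sum(cons + other for cons, other in site_counts)
    if n_total == 0:
        raise ValueError("no reads at diagnostic sites")
    estimate = n_other / n_total
    std_err = math.sqrt(estimate * (1.0 - estimate) / n_total)
    return estimate, std_err


# Three illustrative diagnostic sites: (consensus reads, other reads).
contam_estimate, contam_se = naive_mt_contamination([(98, 2), (47, 3), (60, 0)])
```

Here 5 of 210 reads disagree with the consensus, giving an estimate of roughly 2.4%. A real analysis would subtract the expected error and damage rate before interpreting that number.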
Sex Determination
- Ratio of X/Y/autosomal coverage
- Consistent sex assignment across methods
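One common way to express the coverage ratio is the share of sex-chromosome reads that map to Y. The sketch below computes that ratio with a normal-approximation 95% interval and assigns a label only when the whole interval clears a threshold. The default thresholds (`female_max`, `male_min`) are illustrative assumptions, not values taken from this cheat sheet, and should be calibrated for the capture design and reference used.

```python
import math


def ry_sex_assignment(n_x, n_y, female_max=0.016, male_min=0.075):
    """Assign genetic sex from counts of reads mapped to X and Y.

    Returns (ry, (ci_low, ci_high), call), where call is 'XX', 'XY',
    or 'undetermined'.
    """
    total = n_x + n_y
    if total == 0:
        raise ValueError("no reads on the sex chromosomes")
    ry = n_y / total
    std_err = math.sqrt(ry * (1.0 - ry) / total)
    ci_low, ci_high = ry - 1.96 * std_err, ry + 1.96 * std_err
    if ci_high < female_max:
        call = "XX"
    elif ci_low > male_min:
        call = "XY"
    else:
        call = "undetermined"
    return ry, (ci_low, ci_high), call


female_like = ry_sex_assignment(10_000, 30)
male_like = ry_sex_assignment(9_000, 1_000)
ambiguous = ry_sex_assignment(9_800, 200)
```

An intermediate ratio, as in the third example, can indicate low coverage, contamination from an individual of the other sex, or a sex-chromosome aneuploidy, so it is reported as undetermined rather than forced into a category.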
Population Genetics Analysis
Variant Calling Approaches
- Genotype likelihood methods (ANGSD)
- Pseudo-haploid calls (random allele sampling)
- Joint calling with modern references (GATK with modifications)
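Pseudo-haploid calling is easy to sketch: at each target site, draw one observed base at random and treat it as the genotype. The example below assumes a hypothetical mapping from site ID to a string of observed bases. It does not filter on base quality or strip damage-prone positions, which production workflows usually do.

```python
import random


def pseudo_haploid_calls(pileups, seed=0):
    """Sample one random allele per site from the observed bases.

    `pileups` maps a site identifier to a string of bases seen at that
    site. Sites with no usable A/C/G/T base are returned as None.
    """
    rng = random.Random(seed)  # fixed seed keeps calls reproducible
    calls = {}
    for site, bases in pileups.items():
        usable = [b for b in bases.upper() if b in "ACGT"]
        calls[site] = rng.choice(usable) if usable else None
    return calls


# Hypothetical site IDs and pileups.
example_pileups = {"site1": "AAG", "site2": "", "site3": "c"}
example_calls = pseudo_haploid_calls(example_pileups, seed=42)
```

Sampling a single read avoids a bias toward the reference allele that diploid calling can show at very low depth, at the cost of discarding heterozygosity information.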
Key Analysis Methods
- Principal Component Analysis: Project onto modern variation
- ADMIXTURE/Structure: Ancestry proportions
- f-statistics: Population relationships (f3, f4, qpAdm)
- Phylogenetic methods: Maximum likelihood, Bayesian
- Identity-by-descent: Relatedness estimation
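As a concrete instance of an f-statistic, f4(A, B; C, D) can be written as the average over SNPs of (a − b)(c − d), where a, b, c, d are allele frequencies in the four populations. The sketch below computes only that point estimate from plain lists of frequencies. It does not include the block-jackknife standard errors that tools such as ADMIXTOOLS report, and the input format is an assumption.

```python
def f4(freqs_a, freqs_b, freqs_c, freqs_d):
    """Point estimate of f4(A, B; C, D) from per-SNP allele frequencies."""
    n = len(freqs_a)
    if n == 0 or not (len(freqs_b) == len(freqs_c) == len(freqs_d) == n):
        raise ValueError("need four non-empty lists of equal length")
    terms = [
        (a - b) * (c - d)
        for a, b, c, d in zip(freqs_a, freqs_b, freqs_c, freqs_d)
    ]
    return sum(terms) / n


# Two illustrative SNPs where the A-B difference tracks the C-D difference.
f4_correlated = f4([0.2, 0.8], [0.4, 0.6], [0.1, 0.9], [0.3, 0.7])

# If A and B have identical frequencies, every term is zero.
f4_null = f4([0.1, 0.5], [0.1, 0.5], [0.3, 0.9], [0.7, 0.2])
```

A value near zero is consistent with A and B forming a clade relative to C and D. A consistent departure from zero suggests shared drift or gene flow across the pairs, subject to a proper significance test.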
Specialized aDNA Software
- EAGER and nf-core/eager: Processing pipelines for aDNA
- ATLAS: Genotype likelihood estimation for aDNA
- admixtools: Suite for f-statistics
- Schmutzi: Contamination estimation and consensus calling
Common Applications and Case Studies
Human Evolution and Migration
Out of Africa Expansions
- Early human dispersal patterns
- Interbreeding with archaic hominins
Agricultural Transitions
- Neolithic expansion in Europe
- Steppe migrations and Indo-European spread
Recent Population History
- Colonial era admixture
- Historical demographic changes
Ancient Pathogen Genomics
Epidemic Diseases
- Plague (Yersinia pestis) evolution
- Historical tuberculosis and leprosy strains
- Ancient viral DNA (HBV, smallpox)
Host-Pathogen Coevolution
- Immune gene adaptation
- Virulence changes over time
Extinct Species Analysis
Megafauna Extinction
- Woolly mammoth genomics
- Causes of population decline
- Genetic diversity before extinction
Archaic Humans
- Neanderthal and Denisovan genomes
- Genetic contributions to modern humans
Environmental and Climate Studies
Ancient Sedimentary DNA
- Paleoenvironmental reconstruction
- Ecosystem changes
- Species presence/absence
Ancient Ice Core DNA
- Long-term biodiversity records
- Climate change impacts
Common Challenges and Solutions
Challenge | Cause | Solutions |
---|---|---|
Low Endogenous DNA | Degradation, microbial contamination | Optimized extraction, target enrichment, petrous bone sampling |
Contamination | Modern human/microbial DNA introduction | Clean room protocols, surface decontamination, bioinformatic filtering |
DNA Damage | Age-related modifications | UDG treatment, damage filtering, statistical correction |
Limited Sample Material | Conservation requirements, small remains | Non-destructive methods, microsampling, single-stranded libraries |
Population Reference Bias | Modern reference genomes may not reflect ancient variation | De novo assembly where possible, multiple reference genomes, explicit reporting of reference bias |
Complex Bioinformatics | Integration of multiple data types | Standardized pipelines, reproducible workflows |
Best Practices and Guidelines
Sample Selection Strategy
Preliminary Screening
- Sample multiple individuals when possible
- Prioritize best preservation contexts
- Consider pilot studies with few samples before large studies
Non-Destructive Assessment
- ZooMS (Zooarchaeology by Mass Spectrometry) for species ID
- Bone density/collagen preservation as DNA proxy
- pXRF for elemental composition screening
Ethical Considerations
- Indigenous community consultation and approval
- Minimal destruction sampling approaches
- Long-term sample preservation plans
- Data sharing agreements
Analytical Best Practices
Authentication Standards
- Multiple authentication methods required
- Replicate extractions for critical samples
- Independent laboratory confirmation for extraordinary claims
- Publishing all authentication metrics
Reporting Standards
- Detailed methods descriptions
- Repository submission of raw data
- Contamination estimates included
- Coverage statistics and mapping parameters
Interpretation Guidelines
- Consider temporal and geographic sampling biases
- Integrate with archaeological/historical context
- Acknowledge limitations in conclusions
- Clearly separate data from interpretation
Resources for Further Learning
Key Publications
Methodological Papers
- Dabney et al. (2013). “Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments.”
- Gansauge & Meyer (2013). “Single-stranded DNA library preparation for the sequencing of ancient or damaged DNA.”
- Orlando et al. (2021). “Ancient DNA analysis.”
- Pinhasi et al. (2015). “Optimal Ancient DNA Yields from the Inner Ear Part of the Human Petrous Bone.”
Review Articles
- Skoglund & Mathieson (2018). “Ancient genomics of modern humans: The first decade.”
- Brunson & Reich (2019). “The Promise of Paleogenomics Beyond Our Own Species.”
- Fellows Yates et al. (2021). “Reproducible, portable, and efficient ancient genome reconstruction with nf-core/eager.”
Online Resources and Tools
Databases
Software Collections
- nf-core/eager – Reproducible aDNA pipeline
- EAGER – Original aDNA pipeline
- Paleomix – Pipeline for aDNA processing
Educational Resources
- Ancient DNA virtual labs and protocols
- Workshop materials from leading aDNA centers
- Specialized summer schools and courses
Professional Organizations
- Society for Archaeological Sciences
- International Society for Evolution, Medicine, and Public Health
- Society for Molecular Biology and Evolution
- SPAAM Community (Standards, Precautions and Advances in Ancient Metagenomics)
This cheat sheet provides a comprehensive overview of the current state of ancient DNA technologies, methodologies, and best practices. The field continues to evolve rapidly, with new methods and applications emerging regularly. Researchers should stay current with the latest literature and community standards as they develop their own ancient DNA research programs.