Archaeological Informatics Complete Cheat Sheet: From Data Collection to Digital Preservation

Introduction

Archaeological informatics is the application of computational and information science methods to archaeological research, data management, analysis, and preservation. It bridges traditional archaeological practices with digital technologies to enhance data collection, organization, analysis, interpretation, and dissemination. As archaeology generates increasingly complex and voluminous datasets, informatics approaches have become essential for maintaining data integrity, facilitating collaborative research, enabling sophisticated analyses, and ensuring long-term preservation of irreplaceable archaeological information.

Core Principles of Archaeological Informatics

Principle | Description
Data Integrity | Maintaining accuracy and consistency throughout the data lifecycle
Interoperability | Ensuring data can be exchanged and used across different systems
Reproducibility | Documenting methods to allow verification and replication of results
Sustainability | Creating systems and formats that remain accessible over time
Transparency | Clearly documenting data collection, processing, and analysis methods
Accessibility | Making data available to various stakeholders while respecting ethical constraints
Scalability | Designing systems that can accommodate growing data volumes and complexity

Digital Data Collection Systems

Field Recording Technologies

Technology | Applications | Advantages | Limitations
Mobile Apps | Context recording, find registration | Real-time data entry, error reduction | Battery life, screen visibility
Digital Forms | Standardized recording | Consistent data structure, validation | Requires internet or syncing
Tablet-Based GIS | Spatial recording, feature mapping | Direct spatial data capture | Learning curve, equipment cost
Digital Photography | Visual documentation | Immediate review, metadata capture | Storage requirements, backup needs
Barcode/RFID Systems | Artifact tracking | Rapid registration, reduced errors | Setup costs, equipment maintenance
Wearable Technology | Hands-free documentation | Continuous recording, hands-free | Battery life, environmental challenges

Popular Field Data Collection Platforms

  • FAIMS (Federated Archaeological Information Management System)

    • Open-source
    • Customizable forms
    • Offline capability
    • Synchronization features
  • ARK (Archaeological Recording Kit)

    • Web-based
    • Modular design
    • Multi-user support
    • Customizable workflows
  • ESRI Collector/Field Maps

    • Integration with ArcGIS
    • Strong spatial capabilities
    • Form customization
    • Online/offline modes
  • Open Data Kit (ODK)

    • Free and open-source
    • Form builder
    • Multiple data types
    • Cross-platform
  • iDig/Digital Dig House

    • iPad-based
    • Integrated database
    • Real-time visualization
    • Multi-user coordination

Database Design and Management

Database Types for Archaeological Data

Database Type | Best For | Examples | Considerations
Relational | Complex relationships, structured data | MySQL, PostgreSQL, Microsoft Access | Strong data integrity, requires schema design
Spatial/GIS | Location-based data, mapping | PostGIS, ArcGIS Geodatabase, QGIS | Spatial querying, coordinate system management
NoSQL | Heterogeneous data, flexibility | MongoDB, CouchDB | Accommodates varying data structures, potential consistency issues
Graph | Network analysis, relationships | Neo4j, OrientDB | Good for complex relationships, specialized query language
Hybrid | Comprehensive projects | Integrated systems | Complex setup, maintenance challenges

Key Database Design Considerations

  1. Entity-Relationship Modeling

    • Identify core entities (contexts, finds, samples, etc.)
    • Define relationships between entities
    • Determine cardinality (one-to-many, many-to-many)
    • Establish unique identifiers/primary keys
  2. Controlled Vocabularies

    • Standardize terminology for artifact types, materials, periods
    • Use established thesauri when possible
    • Document local terms with clear definitions
    • Include multilingual support where appropriate
  3. Database Normalization

    • First normal form: Eliminate repeating groups
    • Second normal form: Remove partial dependencies
    • Third normal form: Remove transitive dependencies
    • Balance normalization with query performance
  4. Metadata Standards

    • Dublin Core for basic description
    • CIDOC-CRM for cultural heritage
    • Archaeology Data Service (ADS) guidelines
    • ISO 19115 for geospatial components
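
The modeling, vocabulary, and normalization steps above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a recommended standard schema: the table names (contexts, finds, material_terms), the columns, and the sample rows are all assumptions made for the example.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with a small, normalized excavation schema."""
    con = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    con.execute("PRAGMA foreign_keys = ON")
    con.executescript("""
        -- Controlled vocabulary: one row per permitted material term
        CREATE TABLE material_terms (
            term TEXT PRIMARY KEY
        );
        -- Core entity: stratigraphic context
        CREATE TABLE contexts (
            context_id  INTEGER PRIMARY KEY,
            description TEXT NOT NULL
        );
        -- Core entity: find; one context holds many finds (one-to-many)
        CREATE TABLE finds (
            find_id    INTEGER PRIMARY KEY,
            context_id INTEGER NOT NULL REFERENCES contexts(context_id),
            material   TEXT NOT NULL REFERENCES material_terms(term),
            weight_g   REAL CHECK (weight_g > 0)
        );
    """)
    con.executemany(
        "INSERT INTO material_terms VALUES (?)",
        [("ceramic",), ("lithic",), ("bone",)],
    )
    con.execute("INSERT INTO contexts VALUES (1001, 'Pit fill')")
    con.execute("INSERT INTO finds VALUES (1, 1001, 'ceramic', 12.5)")
    con.commit()
    return con

def try_insert(con, row):
    """Return True if the find row is accepted, False if a constraint rejects it."""
    try:
        with con:  # commits on success, rolls back on error
            con.execute("INSERT INTO finds VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

con = build_demo_db()
print(try_insert(con, (2, 1001, "lithic", 3.0)))   # valid row
print(try_insert(con, (3, 9999, "lithic", 3.0)))   # unknown context
print(try_insert(con, (4, 1001, "plastic", 3.0)))  # term not in the vocabulary
```

Keeping the vocabulary in its own table means a new term is added as data rather than as a schema change, and the database itself rejects misspelled or unlisted terms.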

Database Management Best Practices

  • Backup Protocol: 3-2-1 rule (3 copies, 2 types of media, 1 off-site)
  • Version Control: Track schema changes and data modifications
  • Access Management: Define user roles and permissions
  • Data Validation: Implement constraints and validation rules
  • Documentation: Maintain data dictionaries and relationship diagrams
  • Maintenance Schedule: Regular integrity checks and optimization
  • Migration Planning: Strategy for future platform changes
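
The 3-2-1 backup rule in the list above is simple enough to check mechanically. The sketch below assumes a hypothetical inventory of backup copies; the BackupCopy fields and the media labels are invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BackupCopy:
    label: str     # hypothetical name for the copy
    media: str     # e.g. "external_hdd", "cloud", "lto_tape"
    offsite: bool  # True if stored away from the primary location

def satisfies_3_2_1(copies):
    """Check the 3-2-1 rule: at least 3 copies, 2 media types, 1 off-site copy."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite

inventory = [
    BackupCopy("project server", "server_disk", offsite=False),
    BackupCopy("field office drive", "external_hdd", offsite=False),
    BackupCopy("university cloud", "cloud", offsite=True),
]
print(satisfies_3_2_1(inventory))
```

A check like this only confirms that the copies are planned; whether each copy is actually readable still needs periodic restore tests.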

Spatial Data Management and Analysis

GIS Data Models for Archaeology

Data Model | Best For | Examples | Common File Formats
Vector | Discrete features, boundaries | Site perimeters, architectural features | Shapefile, GeoJSON, GeoPackage
Raster | Continuous data, surfaces | Elevation models, density analysis | GeoTIFF, ASCII Grid, IMG
TIN | Irregular surfaces, 3D modeling | Terrain reconstruction | COLLADA, OBJ
Point Cloud | High-precision 3D recording | Structure recording, landscape surveys | LAS, LAZ, E57, PLY
Web Services | Online data sharing | Background maps, collaborative platforms | WMS, WFS, WMTS
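
As a concrete instance of the vector model, the following sketch builds a small GeoJSON FeatureCollection of find spots using only the standard library. The find IDs, coordinates, and property names are made up for the example; only the GeoJSON structure itself (Feature, Point, FeatureCollection, longitude before latitude) follows the format.

```python
import json

def find_spot_feature(find_id, lon, lat, material):
    """Build a GeoJSON Feature for a single find spot (a vector point)."""
    return {
        "type": "Feature",
        # GeoJSON coordinate order is [longitude, latitude]
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"find_id": find_id, "material": material},
    }

def feature_collection(features):
    """Wrap features in a GeoJSON FeatureCollection."""
    return {"type": "FeatureCollection", "features": list(features)}

collection = feature_collection([
    find_spot_feature("F-001", 12.5, 41.9, "ceramic"),  # invented coordinates
    find_spot_feature("F-002", 12.6, 41.8, "lithic"),
])
geojson_text = json.dumps(collection, indent=2)
print(geojson_text)
```

The resulting text can be saved with a .geojson extension and opened in desktop GIS software such as QGIS.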

Essential GIS Operations for Archaeologists

  1. Basic Operations

    • Georeferencing historical maps
    • Digitizing features
    • Creating buffer zones
    • Overlay analysis
  2. Spatial Analysis

    • Viewshed analysis
    • Cost surface/least cost path
    • Kernel density estimation
    • Cluster analysis
  3. Terrain Analysis

    • Slope and aspect calculation
    • Hydrological modeling
    • Topographic position index
    • Solar radiation modeling
  4. 3D GIS Applications

    • Stratigraphic modeling
    • Volumetric analysis
    • Visibility analysis in 3D
    • Multi-temporal landscape reconstruction
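
To show what a terrain operation such as slope calculation does underneath, here is a minimal sketch that estimates slope at one interior cell of an elevation grid using central differences over its four direct neighbours. This is a deliberately simple estimator chosen for illustration; GIS packages generally use larger neighbourhoods and handle edges and missing values. The grids and cell size are invented.

```python
import math

def slope_degrees(dem, row, col, cell_size):
    """Slope at an interior cell, from central differences of its four neighbours."""
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2 * cell_size)
    dz_dy = (dem[row + 1][col] - dem[row - 1][col]) / (2 * cell_size)
    # Steepest gradient magnitude, converted from rise/run to an angle
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))

# Elevation rises 1 m per 1 m cell from west to east
ramp = [
    [0.0, 1.0, 2.0],
    [0.0, 1.0, 2.0],
    [0.0, 1.0, 2.0],
]
flat = [[5.0] * 3 for _ in range(3)]

print(slope_degrees(ramp, 1, 1, cell_size=1.0))
print(slope_degrees(flat, 1, 1, cell_size=1.0))
```

The cell size must be in the same units as the elevations, which is one practical reason to work in a projected coordinate system rather than in degrees.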

Common GIS Tools for Archaeology

  • QGIS: Open-source, cross-platform, extensive plugin ecosystem
  • ArcGIS: Commercial suite, comprehensive capabilities, strong support
  • GRASS GIS: Advanced analysis, raster processing, scientific applications
  • SAGA GIS: Specialized geoscientific analyses, terrain processing
  • R with spatial packages: Statistical analysis with spatial components
  • PostGIS: Spatial database extension for PostgreSQL
  • WebGIS platforms: CARTO, Leaflet, Mapbox for online visualization

Data Analysis Methods

Statistical Approaches in Archaeology

Method | Applications | Common Tools | Key Considerations
Descriptive Statistics | Artifact assemblage summarization | R, SPSS, Excel | Data distribution, outliers
Multivariate Analysis | Pattern recognition, typology | R, PAST, SPSS | Variable selection, data transformation
Spatial Statistics | Distribution analysis, clustering | R spatial, ArcGIS, CrimeStat | Spatial autocorrelation, edge effects
Bayesian Statistics | Chronological modeling, hypothesis testing | OxCal, BCal, BUGS, R | Prior selection, model sensitivity
Network Analysis | Interaction studies, trade networks | Gephi, igraph, Pajek | Network boundaries, centrality measures
Machine Learning | Classification, pattern recognition | Python (scikit-learn), R, TensorFlow | Training data quality, overfitting
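
The first row of the table, descriptive statistics with attention to distribution and outliers, can be sketched with Python's statistics module. The sherd thickness values are invented, and the 1.5 × IQR fence is one common rule of thumb for flagging outliers, not the only option.

```python
import statistics

def summarize(values):
    """Summarize a measurement series and flag values outside the 1.5 x IQR fences."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": median,
        "stdev": statistics.stdev(values),  # sample standard deviation
        "outliers": [v for v in values if v < low or v > high],
    }

# Hypothetical sherd wall thicknesses in millimetres
thickness_mm = [5.0, 6.0, 6.0, 7.0, 7.0, 8.0, 8.0, 9.0, 30.0]
summary = summarize(thickness_mm)
print(summary)
```

A flagged value is a prompt to go back to the record, not an instruction to delete it: it may be a typing error, or it may be a genuinely different vessel class.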

Specialized Analytical Tools

  • OxCal: Radiocarbon date calibration and Bayesian modeling
  • PAST: Paleontological Statistics software with archaeological applications
  • CIRAM: Correspondence Analysis for archaeology
  • Ceramicware: Pottery analysis and classification
  • Harris Matrix Composer: Stratigraphic relationship analysis
  • Lithics3D: Stone tool analysis and morphometrics
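
Stratigraphic relationship analysis can also be illustrated independently of any of the tools listed. A Harris matrix is, in graph terms, a directed acyclic graph of "is later than" relations, so a topological sort yields one valid deposition sequence and a cycle signals a recording error. The sketch below uses Python's standard graphlib module (Python 3.9 or later); the context numbers and relations are hypothetical.

```python
from graphlib import TopologicalSorter, CycleError

def deposition_order(overlies):
    """Return contexts from earliest to latest.

    `overlies` maps each context to the set of contexts directly beneath it
    (that is, the contexts it is later than). Raises ValueError if the
    recorded relations contain a loop.
    """
    try:
        return list(TopologicalSorter(overlies).static_order())
    except CycleError as err:
        # graphlib puts the detected cycle in the second element of args
        raise ValueError(f"Stratigraphic loop detected: {err.args[1]}") from err

# Hypothetical section: topsoil (1) seals pit fill (2) and layer (4);
# fill (2) lies in pit cut (3); the cut (3) truncates layer (4); layer (4) sits on natural (5).
relations = {
    1: {2, 4},
    2: {3},
    3: {4},
    4: {5},
}
print(deposition_order(relations))
```

When the relations do not form a single chain, several orders are equally valid and the function returns just one of them; a topological sort does not resolve contemporaneity.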

Quantitative Methods by Archaeological Domain

Domain | Common Methods | Key Metrics
Lithic Analysis | Morphometrics, use-wear quantification | Dimensions, edge angles, fracture patterns
Ceramic Studies | Thin section analysis, XRF data analysis | Elemental composition, temper proportions
Zooarchaeology | Species abundance indices, mortality profiles | NISP, MNI, age distributions
Archaeobotany | Presence analysis, composition statistics | Ubiquity, diversity indices
Landscape Archaeology | Predictive modeling, viewshed analysis | Site location factors, intervisibility
Mortuary Analysis | Spatial clustering, correspondence analysis | Grave good associations, demographic patterns
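
Three of the metrics in the table (NISP, MNI, and ubiquity) are simple counts and can be sketched in a few lines. The specimen records and samples below are invented, and the MNI function is a simplified version that takes the most frequent element-and-side combination per taxon, ignoring age, size, and pair matching, which fuller protocols take into account.

```python
from collections import Counter

def nisp(specimens):
    """Number of Identified Specimens per taxon."""
    return Counter(taxon for taxon, _element, _side in specimens)

def mni(specimens):
    """Simplified Minimum Number of Individuals per taxon:
    the highest count of any single (element, side) combination."""
    per_part = Counter(specimens)  # counts per (taxon, element, side)
    result = {}
    for (taxon, _element, _side), count in per_part.items():
        result[taxon] = max(result.get(taxon, 0), count)
    return result

def ubiquity(samples, taxon):
    """Proportion of samples in which the taxon is present at all."""
    return sum(1 for sample in samples if taxon in sample) / len(samples)

# Hypothetical faunal records: (taxon, element, side)
specimens = (
    [("sheep", "humerus", "L")] * 3
    + [("sheep", "humerus", "R")] * 2
    + [("sheep", "tibia", "L")]
    + [("cattle", "femur", "L"), ("cattle", "femur", "R")]
)
# Hypothetical flotation samples: the set of plant taxa present in each
samples = [{"wheat", "barley"}, {"barley"}, {"wheat"}, {"lentil"}]

print(nisp(specimens), mni(specimens), ubiquity(samples, "wheat"))
```

The sheep example shows why the two measures are reported together: six specimens need come from no more than three animals.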

Data Visualization Techniques

Visualization Methods by Data Type

Data Type | Visualization Methods | Tools | Best Practices
Spatial Data | Maps, heat maps, 3D terrain | QGIS, ArcGIS, Blender | Appropriate projection, clear legend, scale
Temporal Data | Timelines, Harris matrices, phase diagrams | TimelineJS, Harris Matrix Composer | Clear periodization, uncertainty indication
Quantitative Data | Histograms, scatterplots, box plots | R, Python, D3.js | Data transformation consideration, axis scaling
Categorical Data | Bar charts, pie charts, treemaps | Tableau, R, Excel | Color scheme consistency, clear labeling
Network Data | Node-link diagrams, adjacency matrices | Gephi, Cytoscape, igraph | Layout algorithm selection, edge weighting
Multivariate Data | PCA plots, bivariate plots, parallel coordinates | R, PAST, Python | Dimension reduction, variable selection

Interactive Visualization Platforms

  • Shiny (R): Interactive statistical visualizations
  • Plotly: Cross-platform interactive graphs
  • Tableau: Data dashboard creation
  • Power BI: Microsoft’s business intelligence platform
  • D3.js: Custom web-based visualizations
  • Leaflet: Interactive web mapping
  • Potree: Web-based point cloud visualization

Data Visualization Best Practices

  1. Choose Appropriate Visualization Types

    • Match visualization to data type and question
    • Consider audience expertise level
    • Use established conventions where possible
  2. Design for Clarity

    • Minimize chart junk
    • Use consistent color schemes
    • Provide clear legends and labels
    • Consider colorblind-friendly palettes
  3. Represent Uncertainty

    • Include error bars/confidence intervals
    • Use transparency or gradient effects
    • Provide alternative interpretations
    • Document data quality issues
  4. Enable Exploration

    • Provide multiple linked views
    • Include filtering capabilities
    • Allow drilling down to raw data
    • Support different scales of analysis

Digital Preservation and Data Sharing

Data Management Planning

  1. Project Planning Phase

    • Identify data types and volumes
    • Establish file naming conventions
    • Select appropriate formats
    • Determine storage requirements
    • Plan for sensitive data handling
  2. Active Project Phase

    • Implement backup procedures
    • Document metadata consistently
    • Conduct regular quality checks
    • Manage versions effectively
    • Perform interim archiving
  3. Project Closure Phase

    • Clean and validate final datasets
    • Complete documentation
    • Prepare data for repository submission
    • Assign persistent identifiers
    • Plan for long-term access
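
A file naming convention is easiest to keep when it can be checked automatically. The sketch below assumes a hypothetical convention of site code, year, record type, and four-digit sequence number (for example ABC_2023_PH_0001.tif); the pattern and the type codes PH, DR, and GIS are invented for illustration and would be replaced by whatever the project plan specifies.

```python
import re

# Hypothetical convention: SITE_YEAR_TYPE_SEQ.ext, e.g. ABC_2023_PH_0001.tif
FILENAME_PATTERN = re.compile(
    r"^(?P<site>[A-Z]{3})_"      # three-letter site code
    r"(?P<year>\d{4})_"          # four-digit year
    r"(?P<kind>PH|DR|GIS)_"      # record type: photo, drawing, GIS layer
    r"(?P<seq>\d{4})"            # zero-padded sequence number
    r"\.(?P<ext>[a-z0-9]+)$"     # lowercase file extension
)

def parse_filename(name):
    """Return the name's components as a dict, or None if it breaks the convention."""
    match = FILENAME_PATTERN.match(name)
    return match.groupdict() if match else None

print(parse_filename("ABC_2023_PH_0001.tif"))
print(parse_filename("trench photo final (2).jpg"))
```

Running a check like this during the active project phase catches nonconforming names while the person who created the file can still say what it is.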

Recommended File Formats for Preservation

Data Type | Preferred Formats | Formats to Avoid
Text | PDF/A, TXT, XML, TEI | DOC, DOCX, RTF
Tabular Data | CSV, TSV, ODS | XLS, XLSX, MDB
Images | TIFF, JPEG2000, PNG | PSD, BMP, proprietary RAW
Spatial Data | GeoTIFF, GML, GeoPackage | SHP (without its complete set of component files), proprietary geodatabases
3D Models | OBJ, PLY, X3D, COLLADA | 3DS, MAX, proprietary formats
CAD | DXF, SVG | DWG, proprietary formats
Audio | WAV, FLAC | MP3, AAC, proprietary formats
Video | MKV, Motion JPEG 2000 | AVI, proprietary codecs
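
The table above can be turned into a quick pre-deposit audit. The following sketch classifies files by extension only, using extensions taken from the formats listed; the mapping of format names to extensions and the three result labels are assumptions for the example. Extension checks cannot tell PDF/A from ordinary PDF or GeoTIFF from plain TIFF, so ambiguous cases are returned as "review".

```python
from pathlib import Path

# Extensions derived from the table; the mapping itself is an illustrative assumption
PREFERRED = {
    ".txt", ".xml", ".csv", ".tsv", ".ods", ".tif", ".tiff", ".jp2", ".png",
    ".gml", ".gpkg", ".obj", ".ply", ".x3d", ".dae", ".dxf", ".svg",
    ".wav", ".flac", ".mkv",
}
AVOID = {
    ".doc", ".docx", ".rtf", ".xls", ".xlsx", ".mdb", ".psd", ".bmp",
    ".3ds", ".max", ".dwg", ".mp3", ".aac", ".avi",
}

def classify(filename):
    """Return 'preferred', 'avoid', or 'review' based on the file extension alone."""
    ext = Path(filename).suffix.lower()
    if ext in PREFERRED:
        return "preferred"
    if ext in AVOID:
        return "avoid"
    return "review"  # unknown or ambiguous (e.g. .pdf may or may not be PDF/A)

for name in ["Finds.CSV", "plan_01.dwg", "report.pdf"]:
    print(name, classify(name))
```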

Digital Repositories for Archaeological Data

  • Open Context: Publication platform with editorial processes
  • tDAR (the Digital Archaeological Record): Comprehensive repository
  • ADS (Archaeology Data Service): UK-based data archiving
  • DANS-EASY: Dutch data repository with archaeological collections
  • Zenodo: General-purpose repository with DOI assignment
  • Harvard Dataverse: Research data repository network
  • Figshare: Research data sharing platform

FAIR Data Principles in Archaeology

  • Findable

    • Use persistent identifiers (DOIs, ARKs)
    • Create rich metadata
    • Register with searchable resources
    • Include clear citation information
  • Accessible

    • Make data retrievable via standardized protocols
    • Specify access conditions
    • Maintain metadata accessibility even if data is restricted
    • Provide contact information for restricted data
  • Interoperable

    • Use formal, accessible, shared vocabularies
    • Follow CIDOC-CRM or other domain standards
    • Include qualified references to related datasets
    • Use compatible file formats
  • Reusable

    • Apply clear data licenses (Creative Commons recommended)
    • Provide detailed provenance information
    • Meet domain-relevant community standards
    • Document methodology in detail
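
Several of the points above (persistent identifiers, rich metadata, clear licenses) come down to whether a dataset's descriptive record is complete. A minimal sketch of such a check follows. The field names are borrowed from Dublin Core elements, but treating these six as required is an assumption made for the example, and the sample record is invented.

```python
# Dublin Core element names; which ones are mandatory is a project decision
REQUIRED_FIELDS = ("title", "creator", "date", "description", "identifier", "rights")

def missing_fields(record):
    """List required fields that are absent or blank in a metadata record."""
    return [
        field for field in REQUIRED_FIELDS
        if not str(record.get(field, "")).strip()
    ]

# Hypothetical record with a blank identifier and no rights statement
record = {
    "title": "Trench 3 context and finds register",
    "creator": "Example Field Project",
    "date": "2023",
    "description": "Context sheets and finds register for one excavation trench.",
    "identifier": "  ",
}
print(missing_fields(record))
```

A record that fails this check is hard to find and hard to reuse, because a reader cannot cite it or tell what they are permitted to do with it.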

Common Challenges and Solutions

Challenge | Solution
Data Heterogeneity | Implement crosswalks between schemas; use flexible data models; focus on core metadata elements
Legacy Data Integration | Develop migration pathways; document transformation decisions; maintain original alongside standardized versions
Project Sustainability | Use open standards and formats; document thoroughly; deposit in institutional repositories; secure funding for maintenance
Technical Expertise Gaps | Provide training resources; develop user-friendly interfaces; create detailed documentation; build community support systems
Ethical Data Sharing | Develop protocols with stakeholder communities; implement tiered access systems; practice informed consent; respect traditional knowledge
Data Volume Management | Implement sampling strategies; use cloud storage solutions; develop data triage protocols; focus on high-value datasets
Software Obsolescence | Use open-source solutions where possible; document computational environments; preserve virtual machines; focus on data format longevity
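
A schema crosswalk, the first solution listed for data heterogeneity, is at its simplest a documented mapping from one set of field names to another. The sketch below assumes a hypothetical legacy record and target schema; all field names are invented. Unmapped fields are returned separately rather than dropped, in keeping with the advice to maintain the original alongside the standardized version.

```python
# Hypothetical crosswalk: legacy field name -> standardized field name
CROSSWALK = {
    "ctx_no": "context_id",
    "mat": "material",
    "wt": "weight_g",
}

def apply_crosswalk(record, crosswalk):
    """Rename mapped fields; return (mapped, unmapped) so nothing is silently lost."""
    mapped, unmapped = {}, {}
    for key, value in record.items():
        if key in crosswalk:
            mapped[crosswalk[key]] = value
        else:
            unmapped[key] = value
    return mapped, unmapped

legacy = {"ctx_no": 1001, "mat": "ceramic", "wt": 12.5, "notes": "rim sherd"}
mapped, unmapped = apply_crosswalk(legacy, CROSSWALK)
print(mapped)
print(unmapped)
```

Keeping the crosswalk as data rather than burying it in code makes the transformation decisions themselves something that can be documented and archived.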

Best Practices and Practical Tips

  • Start with Data Management Planning: Create a plan before fieldwork begins
  • Document Everything: Maintain detailed logs of data collection and processing decisions
  • Build in Redundancy: Implement multiple backup systems from the beginning
  • Think Long-term: Consider how data will be used 5, 10, or 50 years from now
  • Prioritize Standardization: Use established standards whenever possible
  • Implement Version Control: Track changes to data and code systematically
  • Test Data Collection Systems: Pilot test all systems before full deployment
  • Validate Data Regularly: Build in quality control checkpoints throughout the workflow
  • Consider Collaborative Potential: Design systems that facilitate data sharing
  • Allocate Sufficient Resources: Budget time and money for data management and preservation
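
Redundancy and regular validation can be combined in a fixity check: record a checksum for every file, then recompute the checksums later to detect silent changes or corruption. This is a minimal sketch using SHA-256 from the standard library; the file names and contents in the demonstration are invented, and production archives typically rely on repository tooling for this.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(folder):
    """Map each file's path (relative to folder) to its checksum."""
    folder = Path(folder)
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*")) if p.is_file()
    }

def changed_files(folder, manifest):
    """Names of files that were added, removed, or altered since the manifest was made."""
    current = build_manifest(folder)
    names = manifest.keys() | current.keys()
    return sorted(n for n in names if manifest.get(n) != current.get(n))

def demo():
    """Create two small files, alter one, and report what the fixity check finds."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "finds.csv").write_text("find_id,material\n1,ceramic\n")
        (root / "contexts.csv").write_text("context_id\n1001\n")
        manifest = build_manifest(root)
        # Simulate an unintended edit after the manifest was recorded
        (root / "finds.csv").write_text("find_id,material\n1,lithic\n")
        return changed_files(root, manifest)

print(demo())
```

Storing the manifest with each backup copy allows the copies to be compared against one another as well as against their own earlier state.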

Resources for Further Learning

Key Publications

  • Archaeology in the Digital Era edited by G. Earl et al.
  • Digital Archaeology: Bridging Method and Theory edited by T.L. Evans and P. Daly
  • The Oxford Handbook of Archaeological Theory (sections on digital archaeology)
  • Computational Approaches to Archaeological Spaces edited by A. Bevan and M. Lake

Organizations

  • Computer Applications and Quantitative Methods in Archaeology (CAA)
  • Society for American Archaeology Digital Data Interest Group
  • European Association of Archaeologists
  • Digital Humanities Centers Network

Online Resources

  • Archaeology Data Service Guides to Good Practice
  • Journal of Open Archaeology Data
  • Programming Historian tutorials
  • Open Context Data Publishing Guidelines
  • Digital Antiquity Data Management Resources

Training Opportunities

  • Digital Archaeological Practice workshops
  • Digital Humanities Summer Institutes
  • Software Carpentry workshops
  • Repository-sponsored data management workshops
  • CAA Conference workshops