Introduction to Apache Cassandra
Apache Cassandra is a free, open-source, distributed NoSQL database management system designed to handle large amounts of data across many commodity servers. It provides high availability with no single point of failure, near-linear horizontal scalability, and high throughput for write-heavy workloads. Originally developed at Facebook to power its Inbox Search feature, Cassandra combines fault tolerance with tunable consistency.
Core Cassandra Concepts
Data Model Architecture
| Concept | Description |
|---|---|
| Keyspace | Container for tables, similar to a schema in relational databases |
| Table | Collection of rows and columns, similar to tables in relational databases |
| Partition Key | First part of primary key that determines data distribution across nodes |
| Clustering Key | Optional second part of primary key that determines sort order within a partition |
| Column | Name-value pair with a defined data type |
| Row | Collection of columns identified by a primary key |
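To make the partition key and clustering key roles concrete, here is a minimal in-memory sketch. It is not how Cassandra stores data; the names `table`, `upsert`, and `read_partition` are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical stand-in for one table:
# partition key -> {clustering key -> columns}
table = defaultdict(dict)

def upsert(partition_key, clustering_key, columns):
    """Write columns for a row; the same primary key updates the same row."""
    row = table[partition_key].setdefault(clustering_key, {})
    row.update(columns)

def read_partition(partition_key):
    """Return the rows of one partition, ordered by clustering key."""
    rows = table.get(partition_key, {})
    return [(ck, rows[ck]) for ck in sorted(rows)]

upsert("user-1", 2, {"content": "second post"})
upsert("user-1", 1, {"content": "first post"})
upsert("user-1", 2, {"likes": 3})  # same primary key, so this updates the row
print(read_partition("user-1"))
```

The partition key selects the bucket of rows, and the clustering key fixes their order inside it, which is why queries restricted to one partition are cheap.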
Cassandra’s CAP Characteristics
- Consistency: Tunable consistency levels (ONE, QUORUM, ALL, etc.)
- Availability: No single point of failure, continuous availability
- Partition Tolerance: Designed to operate across distributed nodes
Cassandra vs. Traditional RDBMS
| Feature | Cassandra | Traditional RDBMS |
|---|---|---|
| Data Model | Column-family, schema-flexible | Row-based, rigid schema |
| Scaling | Horizontal (add nodes) | Vertical (bigger servers) |
| Transactions | Limited (lightweight transactions) | ACID compliant |
| Joins | Not supported natively | Fully supported |
| Architecture | Masterless, peer-to-peer | Master-slave |
| Best For | Write-heavy workloads, time-series data | Complex queries, transactional data |
| Consistency | Tunable (eventual to strong) | Strong consistency |
Cassandra Query Language (CQL)
CQL Data Types
| Category | Types |
|---|---|
| Numeric | int, bigint, smallint, tinyint, float, double, decimal, varint |
| Text | text, varchar, ascii |
| Time/Date | timestamp, date, time, duration |
| Identifiers | uuid, timeuuid |
| Collections | list, set, map, tuple |
| Others | boolean, blob, inet, counter |
Basic CQL Commands
Database Operations
-- Create a keyspace
CREATE KEYSPACE my_keyspace
WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
-- Use a keyspace
USE my_keyspace;
-- Drop a keyspace
DROP KEYSPACE my_keyspace;
Table Operations
-- Create a table
CREATE TABLE users (
user_id uuid PRIMARY KEY,
first_name text,
last_name text,
email text,
created_at timestamp
);
-- Alter table (add column)
ALTER TABLE users ADD age int;
-- Drop table
DROP TABLE users;
-- Truncate table (remove all data)
TRUNCATE users;
Data Manipulation
-- Insert data
INSERT INTO users (user_id, first_name, last_name, email, created_at)
VALUES (uuid(), 'John', 'Doe', 'john@example.com', toTimestamp(now()));
-- Update data
UPDATE users
SET email = 'newemail@example.com'
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;
-- Delete data
DELETE FROM users
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;
-- Select data
SELECT * FROM users WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;
Secondary Indexes
-- Create an index
CREATE INDEX ON users (email);
-- Drop an index
DROP INDEX users_email_idx;
Collections
-- Table with collections
CREATE TABLE user_preferences (
user_id uuid PRIMARY KEY,
favorite_colors set<text>,
address_history list<text>,
phone_numbers map<text, text>
);
-- Insert into collections
INSERT INTO user_preferences (user_id, favorite_colors, address_history, phone_numbers)
VALUES (
uuid(),
{'blue', 'green', 'red'},
['123 Main St', '456 Oak Ave'],
{'home': '555-1234', 'work': '555-5678'}
);
-- Update collections
UPDATE user_preferences
SET favorite_colors = favorite_colors + {'yellow'}
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;
-- Remove from collections
UPDATE user_preferences
SET phone_numbers = phone_numbers - {'work'}
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;
Cassandra Data Modeling
Key Principles
- Model around queries, not entities
- Denormalize for performance
- Partition based on access patterns
- Avoid hotspots by distributing workload
- Minimize partitions read per query
Primary Key Design Patterns
| Pattern | Structure | Use Case |
|---|---|---|
| Simple Primary Key | PRIMARY KEY (id) | Single record lookup |
| Compound Key | PRIMARY KEY ((partition_key), clustering_column) | Sorted data within partition |
| Composite Partition Key | PRIMARY KEY ((key1, key2), clustering_column) | Distributing data evenly |
| Time Series | PRIMARY KEY ((entity_id), timestamp) | Time-ordered events per entity |
| Bucketing | PRIMARY KEY ((entity_id, bucket), timestamp) | Managing wide partitions |
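As a sketch of the bucketing pattern, the application can derive the `bucket` component from the event time before writing. The monthly granularity and the helper names `month_bucket` and `partition_key` are assumptions chosen for illustration; the right bucket size depends on the write rate.

```python
from datetime import datetime, timezone

def month_bucket(ts: datetime) -> str:
    """Derive a monthly bucket label from a timezone-aware timestamp."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m")

def partition_key(entity_id: str, ts: datetime) -> tuple:
    """Build the (entity_id, bucket) composite partition key for a write."""
    return (entity_id, month_bucket(ts))

event_time = datetime(2024, 3, 15, 12, 0, tzinfo=timezone.utc)
print(partition_key("sensor-1", event_time))
```

Every event for the same entity in the same month lands in one partition, so a partition stops growing once its month ends.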
Common Data Modeling Techniques
One-to-Many Relationship
CREATE TABLE posts (
user_id uuid,
post_id timeuuid,
content text,
created_at timestamp,
PRIMARY KEY (user_id, post_id)
) WITH CLUSTERING ORDER BY (post_id DESC);
Many-to-Many Relationship
-- User to groups
CREATE TABLE user_groups (
user_id uuid,
group_id uuid,
joined_at timestamp,
PRIMARY KEY (user_id, group_id)
);
-- Group to users (duplication for query efficiency)
CREATE TABLE group_users (
group_id uuid,
user_id uuid,
joined_at timestamp,
PRIMARY KEY (group_id, user_id)
);
Time Series Data
CREATE TABLE temperature_by_sensor (
sensor_id uuid,
day date,
timestamp timestamp,
temperature float,
PRIMARY KEY ((sensor_id, day), timestamp)
) WITH CLUSTERING ORDER BY (timestamp DESC);
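With a `(sensor_id, day)` partition key, a query spanning several days has to touch one partition per day. A minimal sketch of how an application might enumerate those partitions follows; `day_partitions` is a hypothetical helper, not part of any driver.

```python
from datetime import date, timedelta

def day_partitions(sensor_id, start: date, end: date):
    """Return the (sensor_id, day) partition keys covering start..end inclusive."""
    days = (end - start).days
    return [(sensor_id, start + timedelta(days=i)) for i in range(days + 1)]

for key in day_partitions("sensor-1", date(2024, 1, 30), date(2024, 2, 1)):
    print(key)
```

Each returned key maps to one single-partition query, which the application can issue in parallel and merge.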
Consistency Levels
Read Consistency Levels
| Level | Description |
|---|---|
| ONE | Return data from nearest replica |
| QUORUM | Return data when majority of replicas respond |
| LOCAL_QUORUM | Quorum of replicas in same datacenter |
| EACH_QUORUM | Quorum of replicas in each datacenter |
| ALL | Return data when all replicas respond |
| LOCAL_ONE | Return data from nearest replica in local datacenter |
Write Consistency Levels
| Level | Description |
|---|---|
| ANY | Write succeeds once any node accepts it, including a coordinator that only stores a hint |
| ONE | Write confirmed by at least one replica |
| QUORUM | Write confirmed by majority of replicas |
| LOCAL_QUORUM | Quorum of replicas in same datacenter |
| EACH_QUORUM | Quorum of replicas in each datacenter |
| ALL | Write confirmed by all replicas |
| LOCAL_ONE | Write confirmed by at least one replica in local datacenter |
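The arithmetic behind QUORUM can be sketched in a few lines. A quorum is a majority of the replication factor, and a read is guaranteed to overlap the latest acknowledged write when the number of replicas read plus the number written exceeds the replication factor. The function names below are illustrative only.

```python
def quorum(replication_factor: int) -> int:
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1

def read_overlaps_write(read_replicas: int, write_replicas: int,
                        replication_factor: int) -> bool:
    """True when every read set must share a replica with every write set."""
    return read_replicas + write_replicas > replication_factor

rf = 3
print(quorum(rf))                                        # replicas needed for QUORUM
print(read_overlaps_write(quorum(rf), quorum(rf), rf))   # QUORUM reads + QUORUM writes
print(read_overlaps_write(1, 1, rf))                     # ONE reads + ONE writes
```

This is why QUORUM reads paired with QUORUM writes behave as strongly consistent, while ONE with ONE is only eventually consistent.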
Setting Consistency Levels
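-- Set the consistency level for the cqlsh session (applies to reads and writes)
CONSISTENCY QUORUM;
-- Per-query consistency
-- CQL 3 has no USING CONSISTENCY clause; set the level per statement through the client driver instead
SELECT * FROM users WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;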
Cassandra Architecture
Key Components
| Component | Description |
|---|---|
| Node | Single Cassandra instance |
| Cluster | Collection of nodes that store your data |
| Data Center | Group of related nodes (often a physical location) |
| Rack | Collection of servers (often on same switch) |
| Ring | The ring structure that represents data distribution |
| Gossip Protocol | How nodes exchange state information |
| Snitch | Determines network topology |
| Partitioner | Determines how data is distributed |
Replication Strategies
| Strategy | Description | Usage |
|---|---|---|
| SimpleStrategy | Places replicas on consecutive nodes around the ring | Testing, single datacenter |
| NetworkTopologyStrategy | Precise control over replica placement by datacenter | Production, multiple datacenters |
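The SimpleStrategy placement rule can be sketched as a walk around a sorted token ring. This is a simplified model assuming one token per node (no vnodes) and ignoring racks and datacenters; `replicas_for` and the token values are hypothetical.

```python
from bisect import bisect_left

def replicas_for(ring, key_token, replication_factor):
    """ring: list of (token, node) sorted by token.

    The first replica is the node owning the first token >= key_token
    (wrapping around); the rest are the next nodes clockwise.
    """
    tokens = [token for token, _ in ring]
    start = bisect_left(tokens, key_token) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]

ring = [(0, "A"), (100, "B"), (200, "C"), (300, "D")]
print(replicas_for(ring, 150, 3))  # starts at C, then continues clockwise
print(replicas_for(ring, 350, 2))  # wraps around to the start of the ring
```

NetworkTopologyStrategy follows the same walk but skips nodes so that replicas spread across racks and meet a per-datacenter count.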
Write Path
- Write to commit log (durability)
- Write to memtable (in-memory)
- Periodically flush memtable to SSTable (immutable on disk)
- Eventually compact SSTables
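The steps above can be sketched as a toy model. It is a deliberately simplified illustration, not Cassandra's implementation: the `Node` class, the `memtable_limit` threshold, and the flush trigger are all assumptions.

```python
class Node:
    """Toy write path: commit log -> memtable -> SSTables -> compaction."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []      # append-only record for durability
        self.memtable = {}        # in-memory, latest value per key
        self.sstables = []        # flushed tables, oldest first, never modified
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))   # 1. durability first
        self.memtable[key] = value             # 2. then the in-memory table
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                       # 3. flush when the memtable is full

    def flush(self):
        if self.memtable:
            self.sstables.append(dict(sorted(self.memtable.items())))
            self.memtable = {}
            self.commit_log.clear()            # flushed data no longer needs the log

    def compact(self):
        merged = {}
        for sstable in self.sstables:          # 4. newer SSTables overwrite older ones
            merged.update(sstable)
        self.sstables = [dict(sorted(merged.items()))] if merged else []

node = Node()
node.write("a", 1)
node.write("b", 2)   # reaches the limit and triggers a flush
node.write("a", 3)
node.flush()
node.compact()
print(node.sstables)
```

Writes never modify an existing SSTable, which is what keeps the write path sequential and fast; the cost is paid later during compaction.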
Read Path
- Check row cache (if enabled)
- Check partition key cache (if enabled)
- Check memtable
- Check SSTables (using Bloom filters and indexes)
- Perform read repair if needed
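A read may find versions of the same data in the memtable and in several SSTables, and the version with the newest write timestamp wins. The sketch below shows only that merge step; caches, Bloom filters, and read repair are left out, and the `read` function and data layout are assumptions for illustration.

```python
def read(key, memtable, sstables):
    """Return the value with the highest write timestamp, or None.

    memtable and each SSTable map key -> (write_timestamp, value).
    """
    candidates = []
    if key in memtable:
        candidates.append(memtable[key])
    for sstable in sstables:
        if key in sstable:  # stands in for the Bloom filter and index lookup
            candidates.append(sstable[key])
    if not candidates:
        return None
    return max(candidates, key=lambda cell: cell[0])[1]

memtable = {"k": (5, "new")}
sstables = [{"k": (1, "old")}, {"j": (2, "x")}]
print(read("k", memtable, sstables))
print(read("j", memtable, sstables))
```

The more SSTables hold a fragment of the requested data, the more work each read does, which is one reason compaction matters for read latency.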
Performance Optimization
Performance Tuning Parameters
| Parameter | Description | Recommendation |
|---|---|---|
| concurrent_reads | Number of concurrent reads | 16 × number of drives |
| concurrent_writes | Number of concurrent writes | 8 × number of CPU cores |
| memtable_flush_writers | Writers for flushing memtables | Number of disks |
| compaction_throughput_mb_per_sec | Throttle for compaction | Start at 16-32, adjust based on load |
| read_request_timeout_in_ms | Read timeout | Default 5000ms |
| write_request_timeout_in_ms | Write timeout | Default 2000ms |
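The rule-of-thumb values in the table reduce to simple arithmetic. The helper below is a hypothetical convenience for computing starting points from hardware counts; treat the results as a first guess to validate under load, not as tuned settings.

```python
def suggested_concurrency(data_drives: int, cpu_cores: int) -> dict:
    """Starting values derived from the rules of thumb in the table above."""
    return {
        "concurrent_reads": 16 * data_drives,
        "concurrent_writes": 8 * cpu_cores,
        "memtable_flush_writers": data_drives,
    }

for name, value in suggested_concurrency(data_drives=2, cpu_cores=8).items():
    print(f"{name}: {value}")
```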
Compaction Strategies
| Strategy | Description | Best For |
|---|---|---|
| SizeTieredCompactionStrategy (STCS) | Default, groups similarly sized SSTables | Write-heavy workloads |
| LeveledCompactionStrategy (LCS) | Organize SSTables in levels | Read-heavy workloads |
| TimeWindowCompactionStrategy (TWCS) | Optimized for time series data | Time series, TTL data |
| DateTieredCompactionStrategy (DTCS) | Deprecated, replaced by TWCS | Legacy systems |
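To illustrate what "groups similarly sized SSTables" means for STCS, here is a rough sketch of size-based bucketing. It is an approximation for intuition only: the real strategy has more options, and the function names here are hypothetical. The defaults mirror the `bucket_low`, `bucket_high`, and `min_threshold` options of STCS.

```python
def size_tiered_buckets(sizes, bucket_low=0.5, bucket_high=1.5):
    """Group SSTable sizes so each joins a bucket whose average it is close to."""
    buckets = []
    for size in sorted(sizes):
        for bucket in buckets:
            average = sum(bucket) / len(bucket)
            if bucket_low * average <= size <= bucket_high * average:
                bucket.append(size)
                break
        else:
            buckets.append([size])  # no similar bucket found, start a new one
    return buckets

def compaction_candidates(sizes, min_threshold=4):
    """Buckets holding enough similarly sized SSTables to be worth compacting."""
    return [b for b in size_tiered_buckets(sizes) if len(b) >= min_threshold]

sstable_sizes_mb = [10, 11, 12, 100, 110, 9]
print(size_tiered_buckets(sstable_sizes_mb))
print(compaction_candidates(sstable_sizes_mb))
```

Only buckets that reach the threshold are merged, so small fresh SSTables are combined often while large ones are rewritten rarely.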
Caching Options
| Cache Type | Description | Use Case |
|---|---|---|
| Row Cache | Caches entire rows | Frequently accessed, rarely changing rows |
| Key Cache | Caches the SSTable positions of partition keys | Default cache, improves read performance |
| Counter Cache | Caches counters | High-volume counters |
| Chunk Cache | Caches chunks of data | Improves read performance for wide rows |
Backup and Recovery
Backup Strategies
Snapshots
# Create snapshot of all keyspaces
nodetool snapshot
# Create snapshot of specific keyspace
nodetool snapshot my_keyspace
# Create snapshot with a name
nodetool snapshot -t backup_name my_keyspace
Incremental Backups
Enable in cassandra.yaml:
incremental_backups: true
Restore Process
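# Stop Cassandra
service cassandra stop
# Locate the table directory (on-disk name is my_table-<table_id>; assumes a single match)
TABLE_DIR=$(echo /var/lib/cassandra/data/my_keyspace/my_table-*)
# Clear live SSTable files only (keeps the snapshots/ and backups/ subdirectories)
find "$TABLE_DIR" -maxdepth 1 -type f -delete
# Restore from snapshot (snapshots are stored inside the table directory)
cp "$TABLE_DIR"/snapshots/snapshot_name/* "$TABLE_DIR"/
# Restart Cassandra
service cassandra start
# Run repair
nodetool repair my_keyspace my_table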
Monitoring and Maintenance
Essential nodetool Commands
# Check cluster status
nodetool status
# Check node status
nodetool info
# Get statistics
nodetool tablestats my_keyspace.my_table
# Run repair
nodetool repair
# Run cleanup
nodetool cleanup
# Flush memtables to disk
nodetool flush
# Compact SSTables
nodetool compact
Monitoring Metrics
| Metric Category | Key Metrics to Monitor |
|---|---|
| Latency | Read/write latency, request coordinator latency |
| Throughput | Read/write requests per second |
| Compaction | Pending compactions, compaction history |
| Storage | Disk usage, SSTable count |
| Cache | Cache hit rates, cache size |
| GC | Garbage collection pauses, frequency |
| Thread Pools | Pending/blocked tasks |
Recommended Monitoring Tools
- Prometheus with Cassandra exporter
- Grafana dashboards
- DataStax OpsCenter
- Instaclustr Console
- JMX tools (JConsole, jmxterm)
Common Challenges and Solutions
Challenge: Tombstones
Solution:
- Set appropriate TTL values
- Use USING TIMESTAMP for overwrites
- Schedule regular tombstone GC with nodetool garbagecollect
- Configure gc_grace_seconds based on repair frequency
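The relationship between `gc_grace_seconds` and repair frequency can be expressed as two small checks. This is a sketch of the timing rule only; the function names are hypothetical, and the default shown is Cassandra's 10-day default of 864000 seconds.

```python
DEFAULT_GC_GRACE_SECONDS = 864000  # 10 days

def tombstone_purgeable(deleted_at: int, now: int,
                        gc_grace_seconds: int = DEFAULT_GC_GRACE_SECONDS) -> bool:
    """A tombstone may be dropped by compaction only after gc_grace_seconds."""
    return now - deleted_at > gc_grace_seconds

def repair_interval_is_safe(repair_interval_seconds: int,
                            gc_grace_seconds: int = DEFAULT_GC_GRACE_SECONDS) -> bool:
    """Every replica should be repaired before tombstones become purgeable."""
    return repair_interval_seconds < gc_grace_seconds

one_day = 86400
print(tombstone_purgeable(deleted_at=0, now=11 * one_day))
print(repair_interval_is_safe(7 * one_day))
print(repair_interval_is_safe(14 * one_day))
```

If a replica misses a delete and is not repaired before the tombstone is purged, the deleted data can reappear, which is why the repair interval must stay below `gc_grace_seconds`.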
Challenge: Wide Partitions
Solution:
- Implement bucketing in primary key design
- Split logical entities across multiple tables
- Use time-based bucketing for time series
- Monitor partition sizes with nodetool tablehistograms
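A back-of-the-envelope estimate helps decide how many buckets a logical partition needs. The sketch below assumes a 100 MB target per partition and a known average row size; both the helper names and the inputs are illustrative, and real sizes should be confirmed with `nodetool tablehistograms`.

```python
import math

TARGET_PARTITION_BYTES = 100 * 1024 * 1024  # assumed 100 MB target

def estimated_partition_bytes(rows: int, avg_row_bytes: int) -> int:
    """Rough size of a partition from row count and average row size."""
    return rows * avg_row_bytes

def buckets_needed(rows: int, avg_row_bytes: int,
                   target_bytes: int = TARGET_PARTITION_BYTES) -> int:
    """How many buckets keep each partition at or under the target size."""
    size = estimated_partition_bytes(rows, avg_row_bytes)
    return max(1, math.ceil(size / target_bytes))

print(buckets_needed(rows=1_000_000, avg_row_bytes=200))
print(buckets_needed(rows=1_000, avg_row_bytes=200))
```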
Challenge: Hot Partitions
Solution:
- Review partition key design
- Add more granularity to partition key
- Consider application-level sharding
- Cache hot data in application
Challenge: Read Before Write
Solution:
- Use lightweight transactions (IF EXISTS, IF NOT EXISTS, or IF <condition>)
- Consider timestamp-based conflict resolution
- Design for idempotent operations
Best Practices
Schema Design
- Design tables based on query patterns
- Keep related data in same partition
- Limit partition size (aim for <100MB)
- Choose appropriate compaction strategy
- Use TTL for temporary data
Operational
- Run repairs regularly
- Monitor tombstone counts
- Plan capacity in advance
- Test with realistic data volumes
- Use vnodes for easier scaling
- Consider dedicated seed nodes
Application Integration
- Use prepared statements
- Implement retry policies
- Use token-aware load balancing
- Batch with caution (prefer unlogged batches confined to a single partition)
- Consider asynchronous operations for throughput
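As an illustration of the retry advice, here is a generic exponential-backoff wrapper. It is not the retry policy API of any Cassandra driver (drivers ship their own configurable policies); `with_retries` and its parameters are hypothetical, and `TimeoutError` stands in for whatever timeout exception the driver raises. Retrying is only safe for idempotent operations.

```python
import random
import time

def with_retries(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying on TimeoutError with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TimeoutError:
            if attempt == max_attempts:
                raise  # out of attempts, surface the error to the caller
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay))  # jitter avoids synchronized retries

calls = []

def flaky_write():
    """Simulated idempotent write that times out twice before succeeding."""
    calls.append(1)
    if len(calls) < 3:
        raise TimeoutError("simulated timeout")
    return "ok"

print(with_retries(flaky_write, sleep=lambda seconds: None))
```

The injectable `sleep` argument keeps the example fast to run and makes the backoff easy to test.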
Resources for Further Learning
Official Resources
Books
- “Cassandra: The Definitive Guide” by Jeff Carpenter and Eben Hewitt
- “Mastering Apache Cassandra” by Nishant Neeraj
- “Learning Apache Cassandra” by Sandeep Yarabarla
Online Courses
- DataStax Academy courses
- Udemy: “Apache Cassandra for Beginners”
- Pluralsight: “Getting Started with Apache Cassandra”
Community Resources
- Stack Overflow Cassandra tag
- Cassandra mailing lists
- #cassandra IRC channel
- CQLSH cheat sheets
Remember: Cassandra excels at handling massive scale with high availability, but requires thinking differently about data modeling than traditional relational databases. Design for your queries, not your entities!