Aerospike NoSQL: The Ultimate Developer’s Reference Guide

Introduction to Aerospike

Aerospike is a distributed, high-performance NoSQL database optimized for flash storage and in-memory computing. It features a hybrid memory architecture that intelligently manages data between DRAM and SSDs to deliver sub-millisecond latency at scale. Aerospike excels in high-throughput, low-latency use cases like real-time bidding, fraud detection, payment processing, and user profile stores. Its shared-nothing architecture ensures linear scalability, and an optional strong-consistency mode supports mission-critical applications that demand reliability and performance at petabyte scale.

Core Aerospike Concepts

Data Model

  • Namespace: Highest level container (similar to a database)
    • Contains records
    • Has its own storage configuration
    • Has its own replication factor, memory settings, and persistence rules
  • Set: Collection of records (similar to a table)
    • Optional organizational unit
    • Can have specialized policies
  • Record: A collection of bins identified by a unique key
    • Each record belongs to a namespace
    • Optionally belongs to a set
    • Has a configurable Time-To-Live (TTL)
  • Bin: Name-value pair within a record (similar to a column)
    • Names are limited to 15 characters
    • Different records in the same set can have different bins
  • Key: Unique identifier for a record
    • Composed of namespace, set (optional), and primary key value
    • Primary key can be a string, integer, or blob

Data Types

| Data Type | Description | Size Limit | Example |
|---|---|---|---|
| Integer | 64-bit signed integer | 8 bytes | 123 |
| String | UTF-8 encoded text | 1 MB | "Hello Aerospike" |
| Blob/Bytes | Arbitrary binary data | 1 MB | Raw bytes |
| Double | IEEE 754 double-precision float | 8 bytes | 3.14159 |
| List | Ordered collection of elements | 1 MB | [1, "a", true] |
| Map | Collection of key-value pairs | 1 MB | {"name": "John", "age": 30} |
| GeoJSON | Location data | 1 MB | {"type": "Point", "coordinates": [103.851959, 1.290270]} |
| HyperLogLog | Probabilistic data structure | Implementation specific | Used for cardinality estimation |
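These types surface as native values in the client APIs. A minimal Python sketch of a record's bins, one per type (illustrative only — in the real Python client a GeoJSON value is wrapped in `aerospike.GeoJSON` rather than passed as a plain dict, and HyperLogLog bins are built via dedicated operations):

```python
# Sketch: how Aerospike data types typically map to Python values.
bins = {
    'count': 123,                          # Integer -> int (64-bit signed)
    'greeting': 'Hello Aerospike',         # String  -> str (UTF-8)
    'payload': b'\x00\x01\x02',            # Blob    -> bytes
    'pi': 3.14159,                         # Double  -> float (IEEE 754)
    'scores': [1, 'a', True],              # List    -> list (heterogeneous OK)
    'profile': {'name': 'John', 'age': 30},  # Map   -> dict
    'location': {'type': 'Point',          # GeoJSON -> GeoJSON-shaped dict
                 'coordinates': [103.851959, 1.290270]},
}

for name, value in bins.items():
    print(f'{name}: {type(value).__name__}')
```

Note that lists and maps can nest arbitrarily, which is what makes the list and map operations shown later useful for server-side updates.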

Cluster Architecture

  • Shared-nothing architecture: Each node is independent and self-sufficient
  • Automatic data distribution: Consistent hashing using a partition map
  • Replication: Data replication for high availability (configurable factor)
  • Smart Client: Client is cluster-aware, connects directly to nodes owning the data
  • Cross-datacenter replication (XDR): Asynchronous multi-datacenter replication
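The partition map above can be illustrated in miniature. Aerospike hashes every record key with RIPEMD-160 and uses 12 bits of the digest to assign it to one of 4096 fixed partitions, which are then distributed across nodes. The sketch below mimics that scheme; SHA-256 stands in for RIPEMD-160 (an assumption made for portability, since RIPEMD-160 is not available in every Python build), and the exact digest input format is simplified:

```python
import hashlib

N_PARTITIONS = 4096  # Aerospike's fixed partition count

def partition_id(set_name: str, user_key: str) -> int:
    """Illustrative partition pick: hash the (set, key) pair and keep
    12 bits. Aerospike itself digests set name + key type + key with
    RIPEMD-160; SHA-256 stands in so the sketch runs anywhere."""
    digest = hashlib.sha256(f'{set_name}\x00{user_key}'.encode()).digest()
    # 12 low bits of the first two digest bytes select the partition.
    return (digest[0] | (digest[1] << 8)) & (N_PARTITIONS - 1)

# Keys spread roughly evenly across partitions, so adding a node only
# reassigns partition ownership instead of rehashing every record.
pids = {partition_id('users', f'user{i}') for i in range(10000)}
print(f'{len(pids)} of {N_PARTITIONS} partitions hit')
```

Because the smart client holds the same partition map, it can route each request directly to the node that owns the record's partition, with no proxy hop.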

Basic Aerospike Commands & Operations

Installation & Setup

Docker Quick Start

docker run -d --name aerospike -p 3000-3002:3000-3002 aerospike/aerospike-server

Basic Configuration (aerospike.conf)

service {
  user root
  group root
  paxos-single-replica-limit 1
  pidfile /var/run/aerospike/asd.pid
  service-threads 4
  transaction-queues 4
  transaction-threads-per-queue 4
  proto-fd-max 15000
}

logging {
  file /var/log/aerospike/aerospike.log {
    context any info
  }
}

network {
  service {
    address any
    port 3000
  }
  heartbeat {
    mode multicast
    address 239.1.99.222
    port 9918
    interval 150
    timeout 10
  }
  fabric {
    port 3001
  }
  info {
    port 3002
  }
}

namespace test {
  replication-factor 2
  memory-size 4G
  default-ttl 30d
  storage-engine memory
}

namespace data {
  replication-factor 2
  memory-size 4G
  default-ttl 30d
  storage-engine device {
    device /dev/sdb
    write-block-size 128K
    data-in-memory true
  }
}

Basic Client Operations (Using Multiple Language APIs)

Record Operations

Writing Records

Python

import aerospike

config = {'hosts': [('127.0.0.1', 3000)]}
client = aerospike.client(config).connect()

key = ('test', 'users', 'user1')
bins = {
    'name': 'John Doe',
    'email': 'john@example.com',
    'age': 30,
    'scores': [85, 90, 92]
}

client.put(key, bins)
client.close()

Java

AerospikeClient client = new AerospikeClient("localhost", 3000);

Key key = new Key("test", "users", "user1");
Bin nameBin = new Bin("name", "John Doe");
Bin emailBin = new Bin("email", "john@example.com");
Bin ageBin = new Bin("age", 30);
Bin scoresBin = new Bin("scores", Arrays.asList(85, 90, 92)); // requires java.util.Arrays

client.put(null, key, nameBin, emailBin, ageBin, scoresBin);
client.close();

Node.js

const Aerospike = require('aerospike')

const config = {
  hosts: '127.0.0.1:3000'
}

Aerospike.connect(config).then(client => {
  const key = new Aerospike.Key('test', 'users', 'user1')
  const bins = {
    name: 'John Doe',
    email: 'john@example.com',
    age: 30,
    scores: [85, 90, 92]
  }
  
  return client.put(key, bins)
    .then(() => {
      client.close()
    })
}).catch(error => {
  console.error('Error:', error)
})

Reading Records

Python

import aerospike

config = {'hosts': [('127.0.0.1', 3000)]}
client = aerospike.client(config).connect()

key = ('test', 'users', 'user1')
(key, metadata, bins) = client.get(key)

print(bins)
client.close()

Java

AerospikeClient client = new AerospikeClient("localhost", 3000);

Key key = new Key("test", "users", "user1");
Record record = client.get(null, key);

System.out.println(record.toString());
client.close();

Node.js

const Aerospike = require('aerospike')

const config = {
  hosts: '127.0.0.1:3000'
}

Aerospike.connect(config).then(client => {
  const key = new Aerospike.Key('test', 'users', 'user1')
  
  return client.get(key)
    .then(record => {
      console.log(record.bins)
      client.close()
    })
}).catch(error => {
  console.error('Error:', error)
})

Deleting Records

Python

import aerospike

config = {'hosts': [('127.0.0.1', 3000)]}
client = aerospike.client(config).connect()

key = ('test', 'users', 'user1')
client.remove(key)
client.close()

Java

AerospikeClient client = new AerospikeClient("localhost", 3000);

Key key = new Key("test", "users", "user1");
client.delete(null, key);
client.close();

Node.js

const Aerospike = require('aerospike')

const config = {
  hosts: '127.0.0.1:3000'
}

Aerospike.connect(config).then(client => {
  const key = new Aerospike.Key('test', 'users', 'user1')
  
  return client.remove(key)
    .then(() => {
      client.close()
    })
}).catch(error => {
  console.error('Error:', error)
})

Batch Operations

Python

import aerospike

config = {'hosts': [('127.0.0.1', 3000)]}
client = aerospike.client(config).connect()

keys = [
    ('test', 'users', 'user1'),
    ('test', 'users', 'user2'),
    ('test', 'users', 'user3')
]

records = client.get_many(keys)  # deprecated in newer clients; use batch_read()
for key, metadata, bins in records:
    print(key, bins)

client.close()

Java

AerospikeClient client = new AerospikeClient("localhost", 3000);

Key[] keys = new Key[] {
    new Key("test", "users", "user1"),
    new Key("test", "users", "user2"),
    new Key("test", "users", "user3")
};

Record[] records = client.get(null, keys);
for (Record record : records) {
    if (record != null) {
        System.out.println(record.toString());
    }
}

client.close();

Queries and Scans

Secondary Index Creation

Python

import aerospike

config = {'hosts': [('127.0.0.1', 3000)]}
client = aerospike.client(config).connect()

# Create secondary index on 'age' bin
client.index_integer_create('test', 'users', 'age', 'users_age_idx')
client.close()

Java

AerospikeClient client = new AerospikeClient("localhost", 3000);

// Create secondary index on 'age' bin
IndexTask task = client.createIndex(null, "test", "users", "users_age_idx", "age", IndexType.NUMERIC);
task.waitTillComplete();

client.close();

Query With Secondary Index

Python

import aerospike
from aerospike import predicates as p

config = {'hosts': [('127.0.0.1', 3000)]}
client = aerospike.client(config).connect()

query = client.query('test', 'users')
query.select('name', 'age')
query.where(p.between('age', 25, 35))

results = []
def process_record(record):
    results.append(record)

query.foreach(process_record)
for key, metadata, bins in results:
    print(bins)

client.close()

Java

AerospikeClient client = new AerospikeClient("localhost", 3000);

Statement stmt = new Statement();
stmt.setNamespace("test");
stmt.setSetName("users");
stmt.setBinNames("name", "age");
stmt.setFilter(Filter.range("age", 25, 35));

RecordSet rs = client.query(null, stmt);
try {
    while (rs.next()) {
        Record record = rs.getRecord();
        System.out.println(record.toString());
    }
} finally {
    rs.close();
}

client.close();

Scan All Records in a Set

Python

import aerospike

config = {'hosts': [('127.0.0.1', 3000)]}
client = aerospike.client(config).connect()

scan = client.scan('test', 'users')
scan.select('name', 'email')

results = []
def process_record(record):
    results.append(record)

scan.foreach(process_record)
for key, metadata, bins in results:
    print(bins)

client.close()

Java

AerospikeClient client = new AerospikeClient("localhost", 3000);

ScanPolicy policy = new ScanPolicy();
client.scanAll(policy, "test", "users", (key, record) -> {
    System.out.println(key.userKey + ": " + record.toString());
});

client.close();

Advanced Aerospike Features

Bin Operations (Atomic Updates)

Python

import aerospike

config = {'hosts': [('127.0.0.1', 3000)]}
client = aerospike.client(config).connect()

key = ('test', 'users', 'user1')

# Increment age by 1
client.increment(key, 'age', 1)

# Append to string
client.append(key, 'name', ' Jr.')

# Prepend to string
client.prepend(key, 'title', 'Dr. ')

# Get the updated record
(key, metadata, bins) = client.get(key)
print(bins)

client.close()

Java

AerospikeClient client = new AerospikeClient("localhost", 3000);

Key key = new Key("test", "users", "user1");

// Increment age by 1
client.add(null, key, new Bin("age", 1));

// Append to string
client.append(null, key, new Bin("name", " Jr."));

// Prepend to string
client.prepend(null, key, new Bin("title", "Dr. "));

// Get the updated record
Record record = client.get(null, key);
System.out.println(record.toString());

client.close();

List Operations

Python

import aerospike
from aerospike_helpers.operations import list_operations as lop

config = {'hosts': [('127.0.0.1', 3000)]}
client = aerospike.client(config).connect()

key = ('test', 'users', 'user1')

# Add elements to the list
ops = [
    lop.list_append('scores', 95),
    lop.list_insert('scores', 1, 88)
]
client.operate(key, ops)

# Get the 2nd element (index 1)
ops = [lop.list_get('scores', 1)]
_, _, bins = client.operate(key, ops)
print("Element at index 1:", bins['scores'])

# Get list size
ops = [lop.list_size('scores')]
_, _, bins = client.operate(key, ops)
print("List size:", bins['scores'])

client.close()

Java

AerospikeClient client = new AerospikeClient("localhost", 3000);

Key key = new Key("test", "users", "user1");

// Add elements to the list
Operation[] ops = {
    ListOperation.append("scores", Value.get(95)),
    ListOperation.insert("scores", 1, Value.get(88))
};
client.operate(null, key, ops);

// Get the 2nd element (index 1)
Record record = client.operate(null, key, 
    ListOperation.getByIndex("scores", 1, ListReturnType.VALUE)
);
System.out.println("Element at index 1: " + record.bins.get("scores"));

// Get list size
record = client.operate(null, key, 
    ListOperation.size("scores")
);
System.out.println("List size: " + record.bins.get("scores"));

client.close();

Map Operations

Python

import aerospike
from aerospike_helpers.operations import map_operations as mop

config = {'hosts': [('127.0.0.1', 3000)]}
client = aerospike.client(config).connect()

key = ('test', 'users', 'user1')

# Initialize a map or update existing (an optional map policy dict sets map order)
ops = [
    mop.map_put('profile', 'address', '123 Main St', {'map_order': aerospike.MAP_UNORDERED})
]
client.operate(key, ops)

# Add/update multiple items
ops = [
    mop.map_put_items('profile', {'city': 'New York', 'zip': '10001'})
]
client.operate(key, ops)

# Get a value by key
ops = [mop.map_get_by_key('profile', 'city', aerospike.MAP_RETURN_VALUE)]
_, _, bins = client.operate(key, ops)
print("City:", bins['profile'])

# Get map size
ops = [mop.map_size('profile')]
_, _, bins = client.operate(key, ops)
print("Map size:", bins['profile'])

client.close()

Java

AerospikeClient client = new AerospikeClient("localhost", 3000);

Key key = new Key("test", "users", "user1");

// Initialize a map or update existing
Operation[] ops = {
    MapOperation.put(MapPolicy.Default, "profile", Value.get("address"), Value.get("123 Main St"))
};
client.operate(null, key, ops);

// Add/update multiple items
Map<Value, Value> items = new HashMap<>();
items.put(Value.get("city"), Value.get("New York"));
items.put(Value.get("zip"), Value.get("10001"));

client.operate(null, key, 
    MapOperation.putItems(MapPolicy.Default, "profile", items)
);

// Get a value by key
Record record = client.operate(null, key,
    MapOperation.getByKey("profile", Value.get("city"), MapReturnType.VALUE)
);
System.out.println("City: " + record.bins.get("profile"));

// Get map size
record = client.operate(null, key,
    MapOperation.size("profile")
);
System.out.println("Map size: " + record.bins.get("profile"));

client.close();

UDFs (User-Defined Functions)

Register UDF (Lua)

Register UDF

-- user_profile.lua
function update_activity(rec, last_activity)
    rec['last_activity'] = last_activity
    -- guard against a missing bin on the first call
    rec['activity_count'] = (rec['activity_count'] or 0) + 1
    aerospike:update(rec)
    return rec
end

function get_activity_summary(rec)
    local result = map()
    result['user_id'] = rec['user_id']
    result['activity_count'] = rec['activity_count']
    result['last_activity'] = rec['last_activity']
    return result
end

Python

import aerospike

config = {'hosts': [('127.0.0.1', 3000)]}
client = aerospike.client(config).connect()

# Register UDF module
client.udf_put('user_profile.lua')

# Apply UDF on a record
key = ('test', 'users', 'user1')
current_time = "2023-04-30T12:34:56Z"
result = client.apply(key, 'user_profile', 'update_activity', [current_time])
print(result)

client.close()

Java

AerospikeClient client = new AerospikeClient("localhost", 3000);

// Register UDF module
RegisterTask task = client.register(null, "udf/user_profile.lua", "user_profile.lua", Language.LUA);
task.waitTillComplete();

// Apply UDF on a record
Key key = new Key("test", "users", "user1");
String currentTime = "2023-04-30T12:34:56Z";
Object result = client.execute(null, key, "user_profile", "update_activity", Value.get(currentTime));
System.out.println(result);

client.close();

Transactions

Aerospike's operate() call applies several operations to one record as a single atomic transaction; the examples below are single-record transactions, not multi-record.
Python

import aerospike
from aerospike_helpers.operations import operations

config = {'hosts': [('127.0.0.1', 3000)]}
client = aerospike.client(config).connect()

# Multiple operations applied atomically to one record
key = ('test', 'accounts', 'user1')
ops = [
    # Read balance
    operations.read('balance'),
    # Subtract amount
    operations.increment('balance', -100),
    # Add to transaction history
    operations.append('transactions', ',withdrawal:100')
]

try:
    _, _, bins = client.operate(key, ops)
    print("New balance:", bins["balance"])
except Exception as e:
    print("Transaction failed:", str(e))

client.close()

Java

AerospikeClient client = new AerospikeClient("localhost", 3000);

// Multi-operation transaction
Key key = new Key("test", "accounts", "user1");
Operation[] ops = {
    // Read balance
    Operation.get("balance"),
    // Subtract amount
    Operation.add(new Bin("balance", -100)),
    // Add to transaction history
    Operation.append(new Bin("transactions", ",withdrawal:100"))
};

try {
    Record record = client.operate(null, key, ops);
    System.out.println("New balance: " + record.getInt("balance"));
} catch (AerospikeException e) {
    System.out.println("Transaction failed: " + e.getMessage());
}

client.close();

Performance Tuning & Configuration

Client Configuration

Optimal Client Settings

// Java client configuration example
ClientPolicy policy = new ClientPolicy();
policy.timeout = 50;                  // 50ms connect timeout
policy.maxConnsPerNode = 300;         // Max connections per node
policy.connPoolsPerNode = 2;          // Connection pools per node
policy.tendInterval = 1000;           // Cluster tending interval in ms
policy.failIfNotConnected = true;     // Fail operations if not connected

Write Policies

| Policy Option | Default | Recommended | Description |
|---|---|---|---|
| writeCommitLevel | COMMIT_ALL | COMMIT_MASTER | When to report write completion |
| maxRetries | 2 | Application specific | Maximum number of retries |
| socketTimeout | 30s | 50-200ms | Socket idle timeout |
| totalTimeout | 0 (no time limit) | 50-200ms | Total transaction timeout |
| sleepBetweenRetries | 1ms | 1-5ms | Time to sleep between retries |
| key | DIGEST (digest only) | SEND only if the stored key is needed | Whether to store the original key with the record |
| durableDelete | false | Application specific | Whether a delete is persisted (tombstoned) |

Read Policies

| Policy Option | Default | Recommended | Description |
|---|---|---|---|
| readModeAP | ONE | ONE/ALL | Read mode for AP (availability) mode |
| readModeSC | SESSION | SESSION/LINEARIZE | Read mode for SC (strong consistency) mode |
| replica | SEQUENCE | MASTER/MASTER_PROLES | Replica preference for reads |
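In the Python client, these knobs are expressed as plain policy dicts passed per call. A hedged sketch under the assumption of a recent client version (key names follow the Python client's snake_case convention; all times in milliseconds):

```python
# Sketch: per-transaction policy dicts in the style of the Python client.
write_policy = {
    'total_timeout': 100,        # overall budget for the transaction
    'socket_timeout': 50,        # per-attempt socket timeout
    'max_retries': 0,            # retrying a timed-out write risks a double apply
    'sleep_between_retries': 1,
}

read_policy = {
    'total_timeout': 100,
    'socket_timeout': 50,
    'max_retries': 2,            # reads are idempotent, so retries are safe
    'sleep_between_retries': 1,
}

# Passed per call, e.g.: client.get(key, policy=read_policy)
```

The asymmetry in `max_retries` is deliberate: a write retried after a timeout may have already landed, so non-idempotent writes are usually not retried.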

Storage Engine Configuration

Memory Storage Engine

namespace memory {
  replication-factor 2
  memory-size 4G
  default-ttl 30d            # Default TTL if not specified
  storage-engine memory      # In-memory storage engine
}

Device Storage Engine

namespace ssd {
  replication-factor 2
  memory-size 4G             # Primary index + metadata size
  default-ttl 0              # Never expire by default
  
  storage-engine device {
    device /dev/sdb          # Device or file path
    device /dev/sdc
    write-block-size 128K    # Write block size
    data-in-memory true      # Store data in memory as well
  }
}

Common Challenges & Solutions

Connection Issues

| Challenge | Solution |
|---|---|
| Connection timeout | Check firewall settings, increase connection timeout |
| Too many connections | Increase maxConnsPerNode, check for connection leaks |
| Node not available | Enable cluster logging, check heartbeat config |

Performance Problems

| Challenge | Solution |
|---|---|
| High latency | Tune timeouts, check hardware bottlenecks, optimize query patterns |
| Client-side timeouts | Adjust timeout policies, identify slow operations |
| Hot keys/partitions | Distribute workload, redesign data model |
| Read amplification | Use batch operations, redesign data model, use projections |

Data Model Issues

| Challenge | Solution |
|---|---|
| Too large records | Split records, use references between records |
| Secondary index performance | Limit secondary indexes, be specific with bin selection |
| Complex queries | Use batch operations, consider UDFs for processing |
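Splitting an oversized record, as suggested above, usually means chunking one logical value across several records plus a small root record that tracks the chunk count. A minimal sketch (key naming scheme and chunk size are illustrative, and the dicts stand in for actual put/get calls):

```python
CHUNK_SIZE = 3  # illustrative; in practice sized to fit write-block-size

def split_record(namespace, set_name, user_key, items):
    """Split one large list across chunk records plus a root record
    that stores the chunk count."""
    chunks = [items[i:i + CHUNK_SIZE] for i in range(0, len(items), CHUNK_SIZE)]
    records = {(namespace, set_name, f'{user_key}:chunk:{i}'): {'items': chunk}
               for i, chunk in enumerate(chunks)}
    root = ((namespace, set_name, user_key), {'chunk_count': len(chunks)})
    return root, records

def join_record(root, records):
    """Reassemble the logical list by reading chunks in order
    (a batch read in the real client)."""
    (ns, s, pk), meta = root
    items = []
    for i in range(meta['chunk_count']):
        items.extend(records[(ns, s, f'{pk}:chunk:{i}')]['items'])
    return items

root, chunks = split_record('test', 'events', 'user1', list(range(10)))
assert join_record(root, chunks) == list(range(10))
```

The chunk keys are deterministic, so the read side can fetch all chunks in one batch request once it knows the count from the root record.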

Monitoring & Operations

Basic Monitoring Commands

Using asinfo

# Get namespace statistics
asinfo -v 'namespace/test'

# Get node statistics
asinfo -v 'statistics'

# Get XDR statistics
asinfo -v 'xdr-stats'

# Check cluster status
asinfo -v 'status'
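asinfo replies are flat semicolon-separated `key=value` pairs, which makes them easy to feed into monitoring scripts. A small parser sketch (the sample reply below is illustrative, not a literal server response):

```python
def parse_asinfo(reply: str) -> dict:
    """Parse an asinfo reply of the form 'k1=v1;k2=v2;...' into a dict,
    converting values to int where possible."""
    stats = {}
    for pair in reply.strip().strip(';').split(';'):
        if not pair:
            continue
        key, _, value = pair.partition('=')
        stats[key] = int(value) if value.lstrip('-').isdigit() else value
    return stats

# Illustrative sample of what 'namespace/test' might contain:
sample = 'objects=1043;memory_used_bytes=83440;stop_writes=false'
stats = parse_asinfo(sample)
print(stats['objects'], stats['stop_writes'])
```

In practice the reply string would come from running asinfo via a subprocess or from the client library's info call.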

Key Metrics to Monitor

| Metric Type | Specific Metrics | Warning Signs |
|---|---|---|
| Latency | read/write/query latency | Increasing over time, spikes |
| Memory | Memory usage percentage | Above 70-80% |
| Disk | Used disk space, write performance | Above 70%, increasing write block time |
| Node health | Cluster integrity, migrations | Node additions/removals, high migration rate |
| Client | Timeouts, retries, errors | Increasing error rates, timeouts |

Backup & Recovery

Using asbackup/asrestore

# Backup a namespace
asbackup --host localhost --namespace test --directory /backup --verbose

# Restore from backup
asrestore --host localhost --directory /backup --verbose

Best Practices & Tips

Data Modeling

  • Key Selection: Choose distribution-friendly keys to avoid hotspots
  • Bin Design: Prefer fewer, structured bins over many simple bins
  • TTL Strategy: Set appropriate TTLs to manage data lifecycle
  • Set Organization: Use sets for logical grouping, not for querying
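One common distribution-friendly pattern for the hot-key concern above is to shard a single hot counter across several sub-keys: writes hit a random shard, and reads sum all shards. A sketch with an in-memory dict standing in for the cluster (shard count and key naming are illustrative):

```python
import random

N_SHARDS = 8  # illustrative shard count

def shard_key(user_key: str) -> str:
    """Pick a random shard for a write so no single record absorbs
    every update to a hot counter."""
    return f'{user_key}:shard:{random.randrange(N_SHARDS)}'

def all_shard_keys(user_key: str) -> list:
    """Read side: enumerate every shard so the total can be summed
    (e.g. with a batch read in the real client)."""
    return [f'{user_key}:shard:{i}' for i in range(N_SHARDS)]

# Simulate 1000 increments against an in-memory stand-in for the cluster.
counters = {}
for _ in range(1000):
    k = shard_key('page_views')
    counters[k] = counters.get(k, 0) + 1

total = sum(counters.get(k, 0) for k in all_shard_keys('page_views'))
assert total == 1000
```

The trade-off is a fan-out read of N records instead of one, so this fits write-heavy counters better than read-heavy ones.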

Performance Optimization

  • Right-size memory: Ensure primary index fits in memory
  • Batch operations: Use batch reads/writes for multiple records
  • Expression filtering: Filter on the server with expressions to avoid shipping unneeded records
  • Connection pooling: Configure appropriate connection pools

Operational Excellence

  • Monitoring: Set up comprehensive monitoring and alerting
  • Capacity planning: Plan for growth in advance
  • Regular backups: Schedule consistent backups
  • Test upgrades: Test upgrades in staging environment first

Resources for Further Learning

This cheatsheet provides a practical reference for Aerospike NoSQL database. For specific version details and advanced configurations, always refer to the official documentation.
