Introduction
Data structures are specialized formats for organizing, storing, and manipulating data efficiently. They form the foundation of computer science and software development, enabling developers to write more efficient and scalable code. This cheatsheet provides a comprehensive overview of common data structures, their operations, time complexities, and practical applications to help you select the right data structure for your specific programming needs.
Core Concepts
Key Principles
- Data organization: How elements are arranged (sequentially, hierarchically, etc.)
- Access patterns: How data is retrieved (direct, sequential, etc.)
- Memory allocation: Static (fixed-size) vs. dynamic (flexible size)
- Time complexity: Efficiency of operations (worst, average, best case)
- Space complexity: Memory usage requirements
- Traversal: Methods for visiting all elements
Abstract Data Types vs. Data Structures
Abstract Data Type (ADT) | Data Structure |
---|---|
Logical description of data and operations | Concrete implementation of an ADT |
Defines what operations are possible | Defines how operations are implemented |
Examples: List, Map, Queue | Examples: Array, LinkedList, HashMap |
Linear Data Structures
Arrays
Characteristic | Description |
---|---|
Description | Contiguous memory locations storing elements of same type |
Types | Static arrays (fixed size), Dynamic arrays (resizable) |
Access Pattern | Direct access via index (constant time) |
Memory Layout | Contiguous memory block |
Operations and Time Complexity
Operation | Time Complexity | Notes |
---|---|---|
Access | O(1) | Direct index access |
Search | O(n) | For unsorted array |
Search (sorted) | O(log n) | Using binary search |
Insert/Delete (end) | O(1)* | *Amortized for dynamic arrays |
Insert/Delete (middle) | O(n) | Requires shifting elements |
Advantages and Disadvantages
Advantages:
- Fast random access
- Good memory locality
- Cache-friendly
Disadvantages:
- Fixed size (static arrays)
- Inefficient insertion/deletion
- Wasted space (if allocated but unused)
Linked Lists
Characteristic | Description |
---|---|
Description | Sequence of nodes where each node points to next node |
Types | Singly-linked, Doubly-linked, Circular |
Access Pattern | Sequential access |
Memory Layout | Non-contiguous, dynamically allocated |
Types of Linked Lists
Type | Description | Traversal |
---|---|---|
Singly-linked | Each node points to next node | Forward only |
Doubly-linked | Each node points to both next and previous nodes | Both directions |
Circular | Last node points to first node | Continuous cycle |
Operations and Time Complexity
Operation | Time Complexity | Notes |
---|---|---|
Access | O(n) | Requires traversal |
Search | O(n) | Linear scan required |
Insert/Delete (beginning) | O(1) | Change head pointer |
Insert/Delete (middle) | O(n) | Requires finding position |
Insert/Delete (end) | O(n) or O(1)* | *O(1) if tail pointer maintained |
Advantages and Disadvantages
Advantages:
- Dynamic size
- Efficient insertions/deletions
- No wasted space
Disadvantages:
- Slow access time
- Extra memory for pointers
- Not cache-friendly
Stacks
Characteristic | Description |
---|---|
Description | LIFO (Last In, First Out) data structure |
Implementations | Array-based, Linked list-based |
Access Pattern | Access only to top element |
Common Use Cases | Expression evaluation, Backtracking, Function calls |
Operations and Time Complexity
Operation | Time Complexity | Notes |
---|---|---|
Push | O(1) | Add to top |
Pop | O(1) | Remove from top |
Peek | O(1) | View top element |
Search | O(n) | Must pop elements to search |
Queues
Characteristic | Description |
---|---|
Description | FIFO (First In, First Out) data structure |
Implementations | Array-based, Linked list-based |
Types | Simple Queue, Circular Queue, Priority Queue, Deque |
Common Use Cases | Job scheduling, Request processing, BFS traversal |
Types of Queues
Type | Description | Special Properties |
---|---|---|
Simple Queue | Basic FIFO structure | Standard enqueue/dequeue |
Circular Queue | Uses wraparound indexing | Efficient use of space |
Priority Queue | Elements have priority values | Highest priority served first |
Deque | Double-ended queue | Insert/delete from both ends |
Operations and Time Complexity
Operation | Time Complexity | Notes |
---|---|---|
Enqueue | O(1) | Add to rear |
Dequeue | O(1) | Remove from front |
Peek | O(1) | View front element |
Search | O(n) | Must dequeue to search |
Hierarchical Data Structures
Trees
Characteristic | Description |
---|---|
Description | Hierarchical structure with nodes connected by edges |
Key Terms | Root, Node, Leaf, Parent, Child, Sibling, Subtree, Path, Height, Depth |
Types | Binary Tree, BST, AVL Tree, Red-Black Tree, B-Tree, Trie |
Tree Terminology
Term | Definition |
---|---|
Root | Top node with no parent |
Leaf | Node with no children |
Height | Longest path from root to leaf |
Depth | Length of path from root to node |
Balanced | Left and right subtrees differ in height by at most 1 |
Common Tree Types
Type | Description | Key Properties |
---|---|---|
Binary Tree | Each node has at most 2 children | Used in expression trees |
Binary Search Tree | Left child < parent < right child | Efficient search, insert, delete |
AVL Tree | Self-balancing BST | Height-balanced |
Red-Black Tree | Self-balancing BST | Color-based balancing |
B-Tree | Generalized tree with many children | Optimized for disk access |
Trie | Character-based tree for strings | Efficient prefix searching |
Binary Search Tree Operations
Operation | Average | Worst Case | Notes |
---|---|---|---|
Search | O(log n) | O(n) | O(n) for skewed tree |
Insert | O(log n) | O(n) | Maintains BST property |
Delete | O(log n) | O(n) | Requires tree restructuring |
Traversal | O(n) | O(n) | In-order, pre-order, post-order |
Self-balancing Trees Comparison
Tree Type | Balancing Method | Height | Insert/Delete | Practical Use |
---|---|---|---|---|
AVL | Height factor | Strictly balanced | More rotations | When frequent lookups |
Red-Black | Color properties | Approximately balanced | Fewer rotations | General purpose, stdlib |
B-Tree | Node occupancy | Low height, high width | Optimized for blocks | Databases, file systems |
Heaps
Characteristic | Description |
---|---|
Description | Complete binary tree with heap property |
Types | Min Heap, Max Heap, Binary Heap, Fibonacci Heap |
Heap Property | Parent key ≥ child key (max heap) or Parent key ≤ child key (min heap) |
Common Use Cases | Priority queues, heap sort, graph algorithms (Dijkstra’s) |
Operations and Time Complexity
Operation | Time Complexity | Notes |
---|---|---|
Insert | O(log n) | Add and bubble up |
Extract-Min/Max | O(log n) | Remove root and heapify |
Peek | O(1) | Access root element |
Heapify | O(n) | Build heap from array |
Decrease Key | O(log n) | Update and bubble up/down |
Hash-Based Structures
Hash Tables
Characteristic | Description |
---|---|
Description | Data structure mapping keys to values using hash function |
Components | Hash function, Array of buckets, Collision resolution strategy |
Load Factor | Ratio of filled slots to table size |
Common Use Cases | Fast lookups, caches, symbol tables, sets, maps |
Collision Resolution
Strategy | Description | Pros | Cons |
---|---|---|---|
Chaining | Each bucket contains linked list | Simple implementation | Extra memory for links |
Open Addressing | Probe for next empty slot | Better cache locality | Sensitive to load factor |
Linear Probing | Check next slot sequentially | Simple, cache-friendly | Clustering |
Quadratic Probing | Use quadratic function to find slots | Reduces clustering | Secondary clustering |
Double Hashing | Use second hash function for offset | Minimizes clustering | More computation |
Operations and Time Complexity
Operation | Average | Worst | Notes |
---|---|---|---|
Insert | O(1) | O(n) | Worst case with many collisions |
Delete | O(1) | O(n) | Requires finding the element first |
Search | O(1) | O(n) | Depends on collision resolution |
Rehash | O(n) | O(n) | Grows table when load factor exceeds threshold |
Graph Data Structures
Graphs
Characteristic | Description |
---|---|
Description | Collection of nodes (vertices) connected by edges |
Types | Directed/Undirected, Weighted/Unweighted, Dense/Sparse |
Representations | Adjacency Matrix, Adjacency List, Edge List |
Common Use Cases | Networks, relationships, paths, flows |
Graph Representations Comparison
Representation | Space | Edge Lookup | Neighbors | Notes |
---|---|---|---|---|
Adjacency Matrix | O(V²) | O(1) | O(V) | Better for dense graphs |
Adjacency List | O(V+E) | O(deg(V)) | O(deg(V)) | Better for sparse graphs |
Edge List | O(E) | O(E) | O(E) | Simple but inefficient for queries |
Common Graph Algorithms Time Complexity
Algorithm | Time Complexity | Use Case |
---|---|---|
BFS | O(V+E) | Shortest path (unweighted) |
DFS | O(V+E) | Cycle detection, topological sorting |
Dijkstra’s | O(E + V log V) | Shortest path (weighted, positive) |
Bellman-Ford | O(VE) | Shortest path (handles negative edges) |
Floyd-Warshall | O(V³) | All-pairs shortest paths |
Prim’s/Kruskal’s | O(E log V) | Minimum spanning tree |
Topological Sort | O(V+E) | Dependency ordering |
Advanced Data Structures
Disjoint Set (Union-Find)
Characteristic | Description |
---|---|
Description | Tracks elements partitioned into non-overlapping sets |
Operations | MakeSet, Find, Union |
Optimizations | Path compression, Union by rank/size |
Use Cases | Kruskal’s algorithm, cycle detection, network connectivity |
Operations and Time Complexity
Operation | Time Complexity | With Optimizations |
---|---|---|
MakeSet | O(1) | O(1) |
Find | O(n) | O(α(n)) – almost constant |
Union | O(n) | O(α(n)) – almost constant |
Segment Trees
Characteristic | Description |
---|---|
Description | Tree for storing intervals or segments with efficient queries |
Operations | Build, Query, Update |
Use Cases | Range queries (min, max, sum), computational geometry |
Space Complexity | O(n) |
Operations and Time Complexity
Operation | Time Complexity | Notes |
---|---|---|
Build | O(n) | Construct from array |
Query | O(log n) | Range queries |
Update | O(log n) | Single element update |
Trie (Prefix Tree)
Characteristic | Description |
---|---|
Description | Tree-like structure for storing strings with common prefixes |
Node Structure | Array/Map of child pointers, end-of-word flag |
Use Cases | Auto-complete, spell checking, prefix matching, IP routing |
Space Complexity | O(ALPHABET_SIZE * key_length * n) |
Operations and Time Complexity
Operation | Time Complexity | Notes |
---|---|---|
Insert | O(m) | m = key length |
Search | O(m) | Exact match search |
Prefix Search | O(m) | Find all words with prefix |
Delete | O(m) | Remove word |
Data Structure Selection Guide
Selection Factors
- Type of operations: Insertion, deletion, search, traversal
- Access patterns: Random, sequential, sorted
- Memory constraints: Available memory, cache considerations
- Implementation complexity: Development time, maintainability
- Performance requirements: Time complexity needs
Common Problem Types and Recommended Data Structures
Problem Type | Recommended Structures | Reasoning |
---|---|---|
Fast lookups by key | Hash Table, BST | O(1) average for hash table, O(log n) for BST |
Ordered data | BST, AVL Tree, B-Tree | Maintains order with efficient operations |
Frequent inserts/deletes | Linked List, Hash Table | Efficient modifications |
Priority management | Heap, Priority Queue | Efficient min/max extraction |
String operations | Trie, Suffix Tree | Optimized for prefix/suffix operations |
Graph problems | Adjacency List/Matrix | Represent relationships between entities |
Range queries | Segment Tree, Fenwick Tree | Efficient interval operations |
Disjoint data | Union-Find | Track non-overlapping sets |
Common Challenges and Solutions
Memory Management
Challenge | Solution |
---|---|
High memory usage | Use more space-efficient structures (adjacency list vs. matrix) |
Fragmentation | Pool allocators, custom memory management |
Cache misses | Improve locality, use array-based structures where possible |
Memory leaks | Proper cleanup, smart pointers, garbage collection |
Performance Bottlenecks
Challenge | Solution |
---|---|
Slow searches | Use indexing, hashing, or tree-based structures |
Expensive sorts | Choose appropriate sort algorithm, pre-sort if possible |
Inefficient traversals | Optimize access patterns, consider iteration order |
Unbalanced structures | Use self-balancing variants (AVL, Red-Black trees) |
Implementation Issues
Challenge | Solution |
---|---|
Complex implementation | Use standard library implementations when available |
Edge cases | Thorough testing, robust handling of special cases |
Concurrency | Use thread-safe data structures or proper synchronization |
Serialization | Implement custom serializers/deserializers |
Best Practices and Tips
General Best Practices
- Use standard library implementations when available
- Choose the simplest data structure that meets requirements
- Consider both time and space complexity
- Profile before optimizing
- Document the rationale for data structure choices
Language-Specific Tips
Java
- Use Collections framework (ArrayList, HashMap, etc.)
- Consider the concurrent versions for thread safety
- Use specialized implementations (EnumMap, ArrayDeque)
- Be aware of autoboxing overhead with primitive types
Python
- Use built-in types (list, dict, set) for simple cases
collections
module offers specialized structures (deque, defaultdict)- Consider NumPy for array-based operations
- Use generator expressions for memory efficiency
C++
- Leverage STL containers (vector, unordered_map, etc.)
- Use reserve() to pre-allocate memory
- Consider custom allocators for performance-critical code
- Be careful with iterator invalidation
JavaScript
- Use Map and Set for better performance than Object
- Arrays are dynamic by default
- Consider typed arrays for numerical data
- Use WeakMap/WeakSet for memory-sensitive caches
Resources for Further Learning
Books
- “Introduction to Algorithms” by Cormen, Leiserson, Rivest, and Stein
- “Algorithms” by Robert Sedgewick and Kevin Wayne
- “The Art of Computer Programming” by Donald Knuth
- “Cracking the Coding Interview” by Gayle Laakmann McDowell
Online Resources
- GeeksforGeeks (geeksforgeeks.org)
- Visualgo (visualgo.net) – Data structure visualizations
- LeetCode (leetcode.com) – Practice problems
- HackerRank (hackerrank.com) – Data structure challenges
- MIT OpenCourseWare – Algorithms and data structures courses
Standard Library Documentation
- Java Collections Framework (docs.oracle.com)
- C++ STL (cppreference.com)
- Python Data Structures (docs.python.org)
- JavaScript Data Structures (developer.mozilla.org)
This cheatsheet provides an overview of essential data structures, their properties, and use cases. Remember that selecting the right data structure often involves trade-offs between time complexity, space complexity, and implementation complexity. Always consider your specific requirements when choosing a data structure for your application.