Description

A cache is a high-speed data storage layer that stores a subset of data, typically transient in nature, so that future requests for that data are served faster than accessing the data’s primary storage location. It acts as a buffer between the source of data (like RAM, disk, or a database) and the consumer (like a processor or web client).

Caches are integral in computer architecture, operating systems, web applications, and databases, providing performance improvements by avoiding repetitive, costly access to slower data sources.

Why Caching Matters

The primary purpose of caching is to reduce:

  • Latency: Faster data retrieval.
  • Load: Less stress on backend systems or slower storage tiers.
  • Bandwidth: Repeated queries or downloads are avoided.

Caching increases throughput and improves user experience, especially in time-sensitive applications.

Basic Caching Principle

Caching relies on the assumption of temporal and spatial locality:

  • Temporal locality: If a piece of data was accessed recently, it’s likely to be accessed again soon.
  • Spatial locality: If one piece of data is accessed, nearby data is likely to be accessed.

How It Works

Caching follows a simple sequence:

  1. Check: On each request, the cache is checked first.
  2. Hit: If the data is present (cache hit), it’s served instantly.
  3. Miss: If not (cache miss), the data is retrieved from the source, stored in the cache, and returned.

Cache Hit Ratio Formula

Cache Hit Ratio = (Number of Cache Hits) / (Total Requests)

A higher ratio indicates better cache efficiency.
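The check/hit/miss sequence and the hit-ratio formula can be sketched with a plain dictionary as the cache; `slow_lookup` is a hypothetical stand-in for any expensive data source.

```python
cache = {}
hits = misses = 0

def slow_lookup(key):
    # Placeholder for an expensive query (database, disk, network)
    return key * 2

def get(key):
    global hits, misses
    if key in cache:          # 1. Check
        hits += 1             # 2. Hit: serve from the cache
        return cache[key]
    misses += 1               # 3. Miss: fetch from the source,
    value = slow_lookup(key)  #    store in the cache, and return
    cache[key] = value
    return value

for k in [1, 2, 1, 3, 1]:
    get(k)

hit_ratio = hits / (hits + misses)
print(hit_ratio)  # 2 hits out of 5 requests -> 0.4
```

Repeated keys (1 appears three times) produce hits after the first access, illustrating temporal locality in action.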

Types of Cache

1. Hardware Cache (CPU Cache)

  • L1 (Level 1): Closest to the CPU, smallest, fastest.
  • L2 (Level 2): Slightly larger and slower.
  • L3 (Level 3): Shared among cores, larger still, slower.

These caches store recently used instructions and data to reduce memory latency.

2. Memory Cache

Caches data in RAM for quick access, often used by databases or applications.

3. Disk Cache

Stores frequently accessed disk data in RAM to speed up file access operations.

4. Web Cache

  • Browser Cache: Stores static files (CSS, JS, images).
  • Proxy Cache: Intermediary between client and server.
  • CDN (Content Delivery Network): Caches content geographically close to the user.

5. Application Cache

Framework-level or code-level cache like:

  • Django/Flask cache
  • Rails cache
  • ASP.NET in-memory cache

6. Database Cache

Improves query performance by caching:

  • Query results
  • Computed values
  • Rows or pages (e.g., MySQL buffer pool)

Popular Caching Strategies

  • Write-through: Data is written to both the cache and the backing store simultaneously.
  • Write-back: Data is written to the cache first and updated in main storage later.
  • Read-through: The application reads from the cache; on a miss, it fetches from the main source and updates the cache.
  • Write-around: Data is written only to storage, avoiding cache pollution.
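A minimal sketch contrasting write-through and write-back, with a plain dictionary standing in for the backing store (all class and variable names are illustrative, not from any library):

```python
class WriteThroughCache:
    def __init__(self, store):
        self.store = store
        self.cache = {}

    def write(self, key, value):
        self.cache[key] = value   # update the cache...
        self.store[key] = value   # ...and the backing store together

class WriteBackCache:
    def __init__(self, store):
        self.store = store
        self.cache = {}
        self.dirty = set()        # keys not yet persisted

    def write(self, key, value):
        self.cache[key] = value   # cache only; storage updated later
        self.dirty.add(key)

    def flush(self):
        for key in self.dirty:    # deferred write to main storage
            self.store[key] = self.cache[key]
        self.dirty.clear()

store = {}
wb = WriteBackCache(store)
wb.write("x", 1)
assert "x" not in store  # not yet persisted
wb.flush()
assert store["x"] == 1   # now durable
```

The trade-off is visible in the code: write-back batches storage writes (faster, but data can be lost before `flush`), while write-through pays the storage cost on every write.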

Replacement Policies

When the cache is full, the system needs to decide which data to evict. Common policies include:

  • LRU: Least Recently Used
  • LFU: Least Frequently Used
  • FIFO: First In, First Out
  • Random: A random item is removed
  • MRU: Most Recently Used (less common)
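LRU, the most common policy, can be sketched with `collections.OrderedDict` from the standard library (capacity of 2 chosen purely for illustration):

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()  # insertion order = recency order

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)    # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used

lru = LRUCache(2)
lru.put("a", 1)
lru.put("b", 2)
lru.get("a")           # "a" is now most recently used
lru.put("c", 3)        # evicts "b", the least recently used
print(list(lru.data))  # ['a', 'c']
```

Reading "a" before inserting "c" is what saves it from eviction; without that access, "a" would have been evicted instead of "b".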

Cache Coherence

In multi-processor or multi-threaded systems, keeping multiple caches synchronized is essential. This is known as cache coherence.

Example scenario:

  • CPU 1 modifies variable X in its cache.
  • CPU 2 has a stale copy of X.
  • If coherence isn’t maintained, incorrect program behavior occurs.

Protocols like MESI (Modified, Exclusive, Shared, Invalid) help maintain coherence.

Distributed Caching

Used in large-scale systems where multiple servers need fast access to shared data.

Common Tools:

  • Redis: In-memory key-value store, highly popular.
  • Memcached: Lightweight distributed memory object caching system.
  • Hazelcast: Java-based distributed cache.
  • Ehcache: Java in-process cache.

Cache Invalidation

Keeping cache up-to-date is challenging. Strategies include:

  • Time-based: Cache expires after a set TTL (time to live).
  • Event-based: Cache is invalidated when the underlying data changes.
  • Manual: Application explicitly deletes cache entry.
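Time-based invalidation can be sketched by storing an expiry timestamp alongside each value; the very short TTL here exists only to keep the demo fast.

```python
import time

TTL = 0.1  # time to live, in seconds (short for demo purposes)

cache = {}

def put(key, value):
    cache[key] = (value, time.monotonic() + TTL)

def get(key):
    entry = cache.get(key)
    if entry is None:
        return None
    value, expires_at = entry
    if time.monotonic() >= expires_at:  # expired: invalidate on read
        del cache[key]
        return None
    return value

put("k", 42)
assert get("k") == 42    # fresh entry is served
time.sleep(0.2)
assert get("k") is None  # expired entry has been invalidated
```

This is lazy expiration (stale entries are purged on access); production caches such as Redis combine it with periodic background sweeps.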

Cache invalidation is famously hard:

“There are only two hard things in Computer Science: cache invalidation and naming things.” — Phil Karlton (often jokingly extended with “...and off-by-one errors”)

Example: Function Result Caching (Python)

import time
from functools import lru_cache

@lru_cache(maxsize=128)
def get_user_data(user_id):
    # Simulated slow lookup; `db` is a placeholder for a real database client
    time.sleep(2)
    return db.query(user_id)

This decorates the function with a least-recently-used cache of up to 128 entries; repeated calls with the same user_id return the cached result without re-running the slow body.
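The decorator also tracks hits and misses via `cache_info()`, which maps directly onto the hit-ratio formula above. A self-contained sketch, with `expensive` standing in for any slow computation:

```python
import time
from functools import lru_cache

@lru_cache(maxsize=128)
def expensive(n):
    time.sleep(0.01)  # simulated slow work
    return n * n

expensive(2)   # miss: computed and cached
expensive(2)   # hit: served from the cache
expensive(3)   # miss

info = expensive.cache_info()
print(info.hits, info.misses)  # 1 2
```

Calling `expensive.cache_clear()` resets both the stored results and the counters, which is useful when the underlying data changes.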

Performance Metrics

  • Hit Rate: Percentage of requests served from the cache
  • Miss Rate: 1 – Hit Rate
  • Eviction Count: Number of entries removed from the cache
  • Latency: Time to serve a request from the cache versus the source
  • Throughput: Number of requests served per second

Pros and Cons of Caching

✅ Pros

  • Reduces latency
  • Lowers backend load
  • Improves scalability
  • Enhances user experience

❌ Cons

  • Data staleness risk
  • Complexity in invalidation
  • Memory consumption
  • Risk of cache poisoning

Real-World Use Cases

  • Web apps: Browser cache, CDN caching
  • E-commerce: Product listings, user sessions
  • Databases: Frequently accessed query results
  • APIs: Rate limiting and throttling using a cache
  • Operating systems: File system caching, paging
  • Machine learning: Caching preprocessed datasets

Advanced Concepts

1. Write Amplification

In storage systems, a single logical write can trigger multiple physical writes (for example, when a write-back cache flushes dirty blocks), increasing disk I/O if not managed well.

2. Cache Stampede

Occurs when many clients simultaneously attempt to recompute a missing or expired cache item.

Mitigation Techniques:

  • Use mutex locks.
  • Introduce random TTLs.
  • Use stale-while-revalidate strategies.
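The mutex-lock mitigation can be sketched with `threading.Lock`: whichever thread acquires the lock first recomputes the value, and the others re-check the cache instead of duplicating the work (`recompute` is hypothetical and simply counts how often real work happens).

```python
import threading

cache = {}
lock = threading.Lock()
recompute_count = 0

def recompute(key):
    global recompute_count
    recompute_count += 1   # counts how many times real work ran
    return key.upper()

def get(key):
    value = cache.get(key)
    if value is not None:
        return value
    with lock:                  # only one thread recomputes at a time
        value = cache.get(key)  # re-check after acquiring the lock
        if value is None:
            value = recompute(key)
            cache[key] = value
    return value

threads = [threading.Thread(target=get, args=("item",)) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(recompute_count)  # 1: computed once, not ten times
```

The double-check inside the lock is essential: without it, every thread that saw the initial miss would still recompute once it acquired the lock.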

3. Lazy vs Eager Caching

  • Lazy: Data cached only when requested.
  • Eager: Data prefetched into cache ahead of time.

Security Considerations

  • Cache Poisoning: Malicious data injected into cache.
  • Sensitive Data Leakage: Caching user-specific or confidential data without control.
  • Side-Channel Attacks: Exploiting cache behavior to infer sensitive information (e.g., Meltdown/Spectre).

Conclusion

Caching is a core performance optimization technique that powers everything from microprocessors to large-scale web applications. While simple in principle — store now, retrieve fast — caching introduces sophisticated design trade-offs around consistency, memory usage, and invalidation strategies. Whether it’s a browser serving a static image or a distributed Redis cluster powering real-time analytics, caching remains essential in designing responsive and efficient systems.