?

The Power of Distributed Caching With Redis

Explore the nature of computing and data access to uncover the reason why distributed caching can be so effective.

February 2, 2021 6 minute read

How will my web application scale to millions of users? How can I scale my database to millions of rows and still support millions of users? These are both questions with a lot of possible answers. But I want to first talk about efficiency before we arrive at an answer.

We always need to start by focusing on the bottlenecks, the areas of the application that are hit the most and take the longest to execute. Typically this comes down to database access and third-party network requests. For the purposes of this article, we'll talk about databases and how we access our data.  

There's a science to retrieving data as quickly as possible. It comes down to where the data is stored and the algorithm used to retrieve the data.  

Where is your data stored?

Below is a chart of typical access times for common computer actions, along with normalized human-readable times to make it easy to see the magnitude of these differences:

Access times for common computer actions
Access times for common computer actions

The two I want to focus on are the RAM access and the SSD I/O. The memory access is between 100 and 200 times faster than the NVMe SSD access and the difference is even greater if you compare it to the still very common SATA SSDs. Okay, so we know that retrieving data from RAM should be preferred when possible. Of course, RAM is not persistent and all data stored in RAM is lost when the computer restarts. So it's clear that we can't use RAM for everything.

How is your data accessed?

There's a concept in computer science known as time complexity, which is usually denoted with Big O notation. Put simply, the purpose is to describe how many steps an algorithm needs to finish its work on "n" number of items. 

For example, finding a movie in an unsorted array of movie titles requires the computer to iterate through each movie title one at a time until it finds it. This would have a worst-case efficiency of O(n), meaning that every single movie is tested until the algorithm finds the one it's looking for at the end. However, if we take the same collection of movies and put it in a hash table we can greatly improve the efficiency of this lookup. A lookup on a hash table is an O(1) operation for both the best case and the worst case, meaning that the algorithm can always find the movie in a single step.

Graph for Big O time complexity examples
Big O time complexity examples

A little about databases

Database indexes can go a long way if used properly. They'll effectively improve the time complexity and efficiency of your queries. But you may still run into performance problems with your database as it's having to do a lot of work to return data from the disk.

So often the choice is made to go with a NoSQL database such as MongoDB or Cassandra as they'll scale across multiple servers much more easily than most SQL databases. That allows you to simply throw more servers at the problem to support the workload and achieve the performance that you need. And that may or may not be the right choice for you. But both SQL and NoSQL databases still tend to suffer from the same problem: the data is stored on and accessed from the disk.

Enter Redis

Redis is an in-memory data store that is most often used as a distributed cache. It offers a variety of efficient data structures designed to allow brutally fast access to your data. Typically what you'll want to do is store frequently accessed data in Redis, so that whenever the data is requested it can come from the cache instead of your database. You can then invalidate the relevant cache whenever a change is made to your data so that you can keep your cache up to date. 

Redis has the option to also persist to a disk so the cache isn't lost if the server restarts. And Redis can also be built in a cluster which spreads the cache across multiple servers. Technically Redis can even be used as your primary database as well, although more often we use it as a cache to reduce the load on more feature-rich databases that are meant for persisting data.

Knowing that caching can be heavily utilized can really change other considerations too. Above I mentioned the easy scalability of NoSQL databases. But if Redis is eliminating 95 percent of your database workload through caching it may actually be viable to stick with a single SQL database instance.

The team from StackOverflow has published some interesting articles about their experience with Redis, as they've been using it to great effect for years to accelerate stackoverflow.com as well as their other StackExchange series of websites.

Nick Craver from StackOverflow goes on to describe typical Redis usage in 2019:

  • Our Redis physical servers have 256GB of memory, but less than 96GB used.
  • 1,586,553,473 commands processed per day (3,726,580,897 commands and 86,982 per second peak across all instances – due to replicas).
  • Average of 2.01% CPU utilization (3.04% peak) for the entire server (< 1% even for the most active instance).
  • 124,415,398 active keys (422,818,481 including replicas).
  • Those numbers are across 308,065,226 HTTP hits (64,717,337 of which were question pages).

As you can see, even with over 1.5 billion Redis commands per day, their Redis servers are barely breaking a sweat at only 2 percent CPU utilization.

The StackOverflow example is great because of the sheer magnitude of its user base. And everyone I talk to both inside and outside of WWT has had very positive things to say about Redis. Its capabilities are impressive and can allow for staggering amounts of throughput with your application without a ton of server resources. 

Talk to us to determine if Redis is a good fit for your application. We have the expertise to implement a caching strategy to allow your application to scale effectively.

Need your application to scale?
Share this

Comments