Benchmarking cassandra with go

Golang is my new favourite programming language. In fact I would never imagine anything to top C++ in my personal ranking of programming languages. After recent years of suffering with slow C# and JavaScript there is a language that brings back joy of programming. It's minimalistic, clean, fast and built for concurrency.

Golang was initially built for solving server problems in Google but grew into a versatile language. One of the strongest feature of Golang is support for concurrency via goroutines. Goroutine is a special go thread with minimal overhead. Creating a new goroutine is very fast and cost about 2kb of memory (version 1.13).

In a typical server application written in Go you would spawned a single goroutine per connection/request. Each goroutine would process the request, access database and manipulate data until a response can be made back. Number of goroutines in a single instance server is limited only by memory. Having highly concurrent and scalable language needs a highly concurrent and scalable database. Cassandra seems to be a perfect fit.

Benchmark

In this benchmark I made series of tests over Cassandra database. There were used 3 types of test queries:

  • Write query
  • Read query
  • Mix of Read and Write query

Test was made on a 3 nodes cluster database with strong consistency and replication factor 3. The database was deployed on AWS. Testing was executed from another AWS instance on the same internal network to minimize traffic overhead. Each of the query was executed on a testing table with 5 fields:


    CREATE TABLE test (
        id UUID,
	data_0 bigint,
	data_1 int,
	data_2 int,
	time_data timestamp,
        PRIMARY KEY (id)
	);

Sequential test

For a reference and comparison I have made a sequential test which runs all queries in sequence. This means that there is only one goroutine doing the job. This test doesn't represent realistic load on the database.

queries 10k write query 10k read query 10k read/write query
Query speed [s] 16s 17s 32s


Concurrent test

Concurrent test runs all queries in parallel using goroutines. This test is much closer to the real server load assuming every incoming request is treated as one goroutine.

amount of concurrent queries 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000
Write speed [s] 0.06s 0.16s 0.22s 0.31s 0.37s 0.42s 0.58s 0.57s 0.73s 0.83s
Read speed [s] 0.06s 0.14s 0.2s 0.31s 0.4s 0.47s 0.59s 0.55s 0.59s 0.69s
Read/Write speed [s] 0.1s 0.2s 0.3s 0.5s 0.6s 0.8s 0.9s 0.9s 1s 1.3s


The results are amazing. 10.000 concurrent queries are executed bellow 1 second. This proves that Cassandra is extremely fast and can handle concurrent queries with ease.

But how far can it go? From a single testing instance I was able to scale the test up to 100k concurrent queries. After that the Unix OS refused to spawn additional connections (number of file descriptors). Adding more testing instances would be outside of scope of this test. Here's the result:

amount of concurrent queries 10k 20k 30k 40k 50k 60k 70k 80k 90k 100k
Write speed [s] 0.83s 1.5s 2.4s 3.2s 3.7s 5s 5.5s 6.2s 7.3s 7.5s
Read speed [s] 0.69s 1.5s 2.5s 3.3s 4.9s 5.3s 6.3s 6.2s 7.7s 9.5s
Read/Write speed [s] 1.3s 2.8s 4.4s 5.4s 6.6s 7.6s 9.2s 10.2s 14.9s 14.8s


The results are absolutely stunning. 100.000 concurrent queries are executed bellow 10 second. This proves again that Cassandra is suitable for extreme loads.

Understanding the results

How does 100k concurrent queries translates to number of real users is another question. In a real world scenario an average gamer spends around 10 minutes of playing per session. Finding an average number of queries in one session can differ for each game, but there are certain similarities that can help to estimate:

  • Start of the game will usually have some sort of authentication, getting user data, loading additional data ~5 queries.
  • Main game play loop will heavily depend on each game, for simplicity let's assume average ~10 queries.
  • Collecting reward ~5 queries.
  • Social interactions, for example chatting with friends ~10 queries.

Let's assume our average game session does in average 30 queries in 10 minutes. That's around 0.05 queries in 1 second. Now based on the benchmark our database can run 10.000 queries in 1 second. That means to reach 10.000 concurrent queries we would need around 200.000 concurrent users (CCU). And reaching 100.000 concurrent queries would require at least 2.000.000 CCU.

This is a very rough estimation, there are many games/apps having more than 100 queries per session, in those cases the calculation will result in much less CCU.

References