My Percona Live Data Performance Conference talk is called ActorDB: an alternative view of a distributed database. ActorDB is an open source database built on a distributed model: it is an SQL database that speaks the MySQL client/server protocol. Its features include:
- Strong consistency (not eventual consistency)
- No single point of failure
- ACID compliance
- Replication using Raft distributed consensus
- SQLite on top of LMDB as a storage engine
- Scalability to a large number of nodes
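The Raft and "no single point of failure" items in the list above both rest on majority quorums: a write counts as committed once more than half of the nodes have stored it, so the cluster keeps working as long as a majority is reachable. The following is a minimal sketch of that arithmetic only. The function names are hypothetical, it is not ActorDB's code, and it leaves out the rest of Raft (leader election, terms, log matching).

```python
def quorum_size(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """Number of nodes that can be down while a majority is still reachable."""
    return cluster_size - quorum_size(cluster_size)


def is_committed(acks: int, cluster_size: int) -> bool:
    """Simplified commit check: 'acks' counts every node that has stored the
    entry, including the leader. Real Raft adds further conditions."""
    return acks >= quorum_size(cluster_size)


if __name__ == "__main__":
    for n in (1, 3, 4, 5, 7):
        print(f"{n} nodes: quorum {quorum_size(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

Under this rule a four-node cluster tolerates no more failures than a three-node one, which is why replication groups are usually sized with an odd number of nodes.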
I am a co-founder of a private company in Slovenia named Biokoda. Our clients range from small companies to telecoms that offer our solutions to their customers and government institutions. The requirements our products have to meet always include high availability, ease of management, ease of scaling and self-hosting.
A few years ago we were tasked with building a file sync application. This requires the application to store a potentially very large file hierarchy for every user. When it came to choosing a database, our options were key-value stores, traditional SQL databases, and document stores (which were much less mature then than they are now).
Designing a database that would be an ideal fit for our use case and requirements became a fun engineering challenge. Then writing it became another. Sure, it would have been safer and easier to stick with an existing mature SQL database, but I’m an eternal engineering optimist. Sometimes you have to take a crazy chance if you believe in it!
The biggest concern, and rightfully so, is safety. There are many pitfalls that developers of distributed databases can fall into. The most basic questions are: What is the consensus algorithm? Is it implemented correctly and thoroughly tested? What is the storage engine? Is it custom built? If so, how well has its reliability been tested? The reward for getting it right is a way to store state without a single point of failure, and a way to grow your database horizontally with your needs. If your database is a part of your products, these things become important selling points for your products.
When it comes to performance, distributed databases tend to lose out on a per-node basis. But because they can scale out to more nodes, their total performance can end up higher, potentially by an order of magnitude. What we tried to do is base ActorDB on as much solid, proven ground as possible. We avoided developing our own storage and SQL engines, and instead based ActorDB on existing proven technology. We even avoided developing our own client protocol and libraries.
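The per-node versus scale-out trade-off can be made concrete with some back-of-the-envelope arithmetic. This is only a sketch: the numbers are made up for illustration and are not ActorDB benchmarks, the function names are hypothetical, and it assumes throughput scales linearly with node count, which real clusters only approximate.

```python
import math


def aggregate_throughput(single_node_ops: float,
                         per_node_efficiency: float,
                         nodes: int) -> float:
    """Total ops/s of a cluster, ASSUMING perfectly linear scaling.

    per_node_efficiency is the fraction of a standalone database's
    throughput that each distributed node delivers (hypothetical input).
    """
    return single_node_ops * per_node_efficiency * nodes


def nodes_to_reach(single_node_ops: float,
                   per_node_efficiency: float,
                   target_ops: float) -> int:
    """Smallest node count whose aggregate throughput meets target_ops."""
    per_node = single_node_ops * per_node_efficiency
    return math.ceil(target_ops / per_node)


if __name__ == "__main__":
    # Illustrative numbers only: a standalone server doing 10,000 ops/s,
    # and distributed nodes that each manage half of that.
    standalone = 10_000
    efficiency = 0.5
    print(nodes_to_reach(standalone, efficiency, standalone))       # break-even
    print(nodes_to_reach(standalone, efficiency, 10 * standalone))  # 10x
```

With these illustrative inputs, two nodes match the standalone server and twenty nodes deliver ten times its throughput. A standalone server cannot add capacity this way.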
More on ActorDB can be found on the project's website, or in its GitHub repository, where you can download the source under the Mozilla Public License version 2.0.
To see my talk, register for Percona Live Data Performance Conference 2016. Use the code “FeaturedTalk” and receive $100 off the current registration price.