A database is a collection of data that is organized so it can easily be accessed, managed, and updated. Databases are commonly grouped into one of two categories – relational databases and non-relational databases.
In this post, I’ll cover some database scaling techniques you can employ when scaling up your databases to meet the needs of your growing application.
Relational databases, such as MySQL, PostgreSQL, and SQLite3 store data int tables, consisting of records that are identified by a primary key. Tables can be related through their primary keys. These databases use SQL (Structured Query Language) to query the database for information and make them useful for applications that handle lots of complex queries, database transactions and routine data analysis. Relational databases enforce ACID, the set of properties that ensure database transactions are reliably processed.
Non-relational databases, such as MongoDB, represent data in collections of JSON documents. These databases store data without explicit and structured mechanisms to link data between documents. Non-relational databases offer certain advantages over relational databases due to their simplicity of design, better horizontal scaling, finer control over availability, and speed.
Below are some database scaling techniques to keep in mind when scaling your application.
Horizontal scaling is achieved by adding more machines into your pool of resources and chaining them together so they function as a single unit. It it typically faster and cheaper to scale up horizontally, but this also comes with the costs associated with cluster configuration, management, and maintenance costs.
Vertical scaling (also known as scaling up) is achieved by adding more power (CPU, RAM) to an increase the capacity of an existing machine. Vertical scaling offers more consistent performance, simplified debugging, and increased efficiency at scale versus horizontal scaling. However, vertical scaling requires downtime to install new hardware and is limited by the hardware capacity.
Use Read Replicas
Read replicas allow data to be read across any number of servers, known as “slaves”, which are read-only. One server is the “master” and handles all write requests, as well as read requests. The more slaves that are installed, the more read-based queries can be executed. This technique is useful for when you have limited write changes but need to greatly expand read queries.
Use Multiple Masters
The multi-master pattern is similar to read replicas, except all servers are master and can read and write data. This is useful for when your application needs to handle a large amount of both read and write requests. The catch: applications are required to generated Universally Unique Identifiers (UUIDs), to prevent data collision during the multi-master replication process.
Partition YOUR database(S)
Partitioning is necessary when you have large databases that one server cannot access or modify the data by itself without severely impacting scale and performance of your application. Horizontal partitioning, known as “sharding”, distributes data across different servers based on a certain scheme. Vertical partitioning separates data associated with a single table based on frequency of access.
When querying a database for information, the application creates a client connection to the database to send the query and receive the data. Because there is overhead associated with establishing a client connection to the database, connections can be reused to handle queries. Connection pooling prevents the need to establish a new database with each database request.
USE A Load Balancer
Each database vendor can handle a limited number of client connections from application servers, so as your application grows a database will need to handle more connections. A database load balancer distributes incoming database queries to an available server using its connection pool, and does this in such a way that no single server is overloaded with requests.