The popular EA game The Simpsons: Tapped Out was trending at number 2 when it launched. Soon after that early success, users started experiencing lag that spoiled the gaming experience. According to the game’s publishers, the overwhelmingly positive response to the game was the culprit. One of the major reasons for the failure was that the database could not scale up to serve the millions of concurrent requests it was receiving.
By some estimates, 463 exabytes of data will be created worldwide every day by 2025. For perspective, 1 exabyte equals 1 billion gigabytes. This growth in data will demand flexible database systems that can scale with the dynamic needs of the application. Database scalability is the need of the hour, and you will face many bottlenecks when you try to achieve it. In this blog, we discuss some of the most common database scalability bottlenecks and suggest solutions for each.
Do you require database scalability?
Your application’s database should be able to expand or contract its compute resources to match the dynamic needs of the application. It should scale up to handle a sudden surge in traffic and contract when demand drops so as to save resources. One of the best ways to ensure good database scalability is to choose the right database for your requirements. With physical servers, database expansion and contraction can prove to be a headache. Cloud database solutions can do the trick.
Database scalability is a resource-intensive and challenging task. Hence, before you start the project, make sure that your product really needs a scalable database. First determine whether your product will experience a surge in traffic in the foreseeable future. If not, you can make do with your existing database.
If your business is a startup, there is little point in investing resources in a scalable database right away. You can do so once your application hits critical mass and you can expect a reasonable surge in traffic.
Here are a few scenarios in which database scalability is absolutely necessary
Database scalability methods
There are two major approaches to take while scaling your database.
- Vertical scaling (scale up)
- Horizontal scaling (scale out)
Vertical Scaling
In the vertical scaling approach, you add more resources, whether virtual, physical, or both, to the server that hosts the database. In essence, vertical scaling adds the following resources to the system:
- Computing power (CPU)
- Memory
- Storage capacity
Pros of vertical scaling
- Need to manage only one system, hence easy to implement
- No need to worry about application compatibility
Cons of vertical scaling
- High hardware costs
- An upper limit on the CPU, memory, and storage that a single server can hold
- Single point of failure
The vertical scaling approach is the more traditional approach where we use a large server to support the data.
Horizontal Scaling
In the horizontal scaling approach the idea is to deal with the increased workload by adding more nodes or instances to the database. We simply add more servers to the cluster whenever the organization needs to scale.
Pros of horizontal scaling
- Easy to scale
- Fewer downtime periods
- Better resilience and fault tolerance
- Improved performance
Cons of horizontal scaling
- Need to ensure database compatibility
- Increased complexity as one needs to manage multiple servers
Earlier, you had to do all the heavy lifting yourself, such as deciding between horizontal and vertical scaling and implementing the architecture. Nowadays, cloud service providers like AWS do much of the job for you.
Services like AWS Fargate come with auto scaling capabilities, which means AWS handles much of the scaling work for you. These cloud service providers have taken a lot of complexity out of the system. You still need to know what is happening under the hood so that you can set things up and manage them effectively. At Simform, we provide cloud development services that help you easily deploy and manage a cloud system.
Database scalability challenges
Improper traffic distribution
Inefficient traffic distribution is one of the major bottlenecks that you can face while scaling your database.
When you have multiple servers, you need to ensure that the load is evenly balanced between them. If one server has to bear more load than the others, the system becomes inefficient. The overloaded server might fail even though another server still has spare capacity.
Solution
Deploying a load balancer is often the best way to mitigate this issue. The load balancer acts as an intermediary between the clients and the servers. It accepts traffic from applications and distributes it across the database servers, as shown in the image below.
You will need to choose a load balancing algorithm for your load balancer. You can choose from the following load balancing algorithms based on your traffic needs.
Round Robin
Round robin is the simplest type of load balancing algorithm. It forwards requests to each database server in turn, so all database nodes receive an equal number of requests.
Weighted load balancing
Every database server is assigned a weight based on the proportion of traffic that it can handle. This algorithm is useful when certain database servers can handle more requests than the others.
Least connection
The least connection algorithm always selects the database server with the fewest active connections. It works best when requests vary in how long they take to serve, because the connection count then reflects how busy each server actually is.
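The three algorithms above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular load balancer implements them: the class names, the server names such as `db-1`, and the naive expansion used for the weighted rotation are all assumptions made for the example.

```python
import itertools


class RoundRobin:
    """Hand out servers in a fixed rotation."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class WeightedRoundRobin:
    """Naive weighted rotation: a server with weight 3 appears 3 times per cycle."""

    def __init__(self, weights):
        # weights maps a server name to a positive integer weight (hypothetical values).
        expanded = [server for server, w in weights.items() for _ in range(w)]
        self._cycle = itertools.cycle(expanded)

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Pick the server that currently has the fewest active connections."""

    def __init__(self, servers):
        self._active = {server: 0 for server in servers}

    def acquire(self):
        # Ties go to the server listed first.
        server = min(self._active, key=self._active.get)
        self._active[server] += 1
        return server

    def release(self, server):
        # Call this when the request finishes so the count stays accurate.
        self._active[server] -= 1


if __name__ == "__main__":
    rr = RoundRobin(["db-1", "db-2", "db-3"])
    print([rr.pick() for _ in range(4)])

    wrr = WeightedRoundRobin({"db-1": 3, "db-2": 1})
    print([wrr.pick() for _ in range(4)])

    lc = LeastConnections(["db-1", "db-2"])
    first = lc.acquire()
    lc.acquire()
    lc.release(first)
    print(lc.acquire())
```

Round robin and the weighted rotation need no feedback from the servers, while least connections depends on every request being released when it completes. That bookkeeping is the price of tracking real load.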
At Simform, we can help you configure the correct load balancer for your application based on its traffic needs.
While configuring a load balancer, keep in mind that a single load balancer is itself a single point of failure: if it fails, the whole setup can break down. Hence, running two or three load balancers at a time is a good idea.
Our Simform engineers can integrate AWS Elastic Load Balancing, a service that automatically distributes incoming application traffic across multiple targets and virtual appliances. The service includes three types of load balancers:
- Application Load Balancer
- Network Load Balancer
- Gateway Load Balancer
NuData Security is a Mastercard company that helps businesses stop all forms of automated online fraud. NuData has integrated Elastic Load Balancing across multiple tiers of its infrastructure. The load balancers ensure high availability and scalability of the company’s API services. NuData believes that the load balancers provided by AWS are an integral part of its product and help it deliver a robust security solution.
One of our clients, an online dating and coaching app operating in North America, wanted a scalable app that did not compromise on security. The Simform team was up to the task: we built a highly scalable and secure solution using AWS Elastic Load Balancing. The solution we built could handle 300% more traffic than usual. The idea was to build a system that could handle sudden spikes in
We also used AWS Elastic Load Balancing to distribute incoming application traffic seamlessly across multiple EC2 instances and avoid server downtime.
Inefficient database management
Inefficient database management is one of the major bottlenecks that can affect the scalability of your database. In fact, an improperly designed database is one of the major cloud migration challenges. To avoid this bottleneck, start by choosing the right database for the business application. The following examples match database types to business applications:
- A bank or financial institution should choose a relational DBMS, as it needs to ensure ACID (atomicity, consistency, isolation, durability) guarantees for its structured data.
- A key-value NoSQL database would do the trick for an online multiplayer game that requires sessions.
- A social media analytics business should choose a graph database.
- An Internet of Things (IoT) business should go with a time-series database, which will help it store its sensor or network data.
While selecting the database, you need to ensure that the database has the ability to scale. At Simform, our database experts can help you select the correct database according to your specific requirements.
Solution
Whichever database architecture you choose, ensure that it supports sharding. Sharding is a popular horizontal scale-out technique that splits data across multiple database servers known as shards. Each shard holds a smaller dataset, which makes it faster and easier to manage. Sharding helps a database handle load by spreading reads and writes across the shards.
All shards normally use the same database engine, data structure, and hardware, so that each shard delivers a similar level of performance. The shards operate on a shared-nothing model, which means they have no knowledge of each other’s existence. If one shard hits a snag and goes down, no other shard is impacted. In essence, sharding removes the single point of failure, which gives your database better scalability and fault tolerance. Apart from this, sharding also provides wide-ranging benefits like
- Data storage distribution across machines
- Better balance of traffic across different shards
- Improved query performance
- Easier database scaling
There are two ways to shard data
- Vertical sharding
- Horizontal sharding
Horizontal sharding proves effective when your database queries return a subset of rows that are often grouped together, for instance queries that filter data by short date ranges.
Vertical sharding proves effective when your database queries return a subset of columns. For instance, if some queries request only names while others request only cities, vertical sharding lets each kind of query hit a different shard.
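The difference between the two approaches can be sketched as two small routing helpers. This is only an illustration under stated assumptions: the function names, the month-based shard naming, the shard names, and the sample row are hypothetical, and a real system would also need to handle rebalancing and cross-shard queries.

```python
from datetime import date


def horizontal_shard(order_date: date) -> str:
    """Horizontal sharding: whole rows are routed by a row attribute.

    Here rows are grouped by month, so a query over a short date range
    touches only one or two shards. The naming scheme is an assumption.
    """
    return f"orders_{order_date.year}_{order_date.month:02d}"


def vertical_split(row: dict, key: str, column_groups: dict) -> dict:
    """Vertical sharding: one row is split into column subsets.

    Every shard keeps the primary key so the pieces can be joined again.
    """
    return {
        shard: {key: row[key], **{col: row[col] for col in columns}}
        for shard, columns in column_groups.items()
    }


if __name__ == "__main__":
    print(horizontal_shard(date(2024, 3, 15)))

    row = {"id": 7, "name": "Ada", "city": "Pune"}
    groups = {"names_shard": ["name"], "cities_shard": ["city"]}
    print(vertical_split(row, "id", groups))
```

A query that only needs names reads `names_shard` and never touches the city data, which is the case the vertical approach is meant for.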
Solutions like Amazon RDS make working with a sharded database architecture much easier. RDS offers you the choice of the following database engines:
- MySQL
- MariaDB
- PostgreSQL
- Oracle
- SQL Server
- Amazon Aurora
TransTMS is an integrated transportation management company based in the USA. We used Amazon RDS, among other AWS services, to help the company achieve a 40% reduction in time and a 25% reduction in costs. Refer to the TransTMS case study for more details. As an AWS Premier Consulting Partner, we at Simform understand the best ways to integrate Amazon RDS into your system.
Performance issues with queries
As it scales up, your application will need to process vast numbers of database queries simultaneously, and it can crash if it cannot handle them. One of the most common issues is that the database is overwhelmed by too many trivial tasks. With millions of users, this can prove to be a major bottleneck for your database.
To resolve this issue, you can follow these rules:
- Avoid executing a database query inside a loop
- Use bulk operations while reading and writing large volumes of data
- Offload and schedule query-intensive operations
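The first two rules can be illustrated with Python’s built-in `sqlite3` module. This is a minimal sketch that assumes a hypothetical `users` table and an in-memory database; the same pattern applies to other engines, although placeholder syntax and limits on the number of bound parameters differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# Bulk write: a single executemany call inside one transaction,
# instead of issuing and committing one INSERT per loop iteration.
rows = [(i, f"user-{i}") for i in range(1, 1001)]
with conn:
    conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)", rows)


def fetch_names_in_loop(conn, ids):
    """Anti-pattern: one round trip to the database per id."""
    return [
        conn.execute("SELECT name FROM users WHERE id = ?", (i,)).fetchone()[0]
        for i in ids
    ]


def fetch_names_bulk(conn, ids):
    """Bulk read: one query for the whole batch using an IN clause."""
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    query = f"SELECT id, name FROM users WHERE id IN ({placeholders}) ORDER BY id"
    return [name for _, name in conn.execute(query, list(ids))]


if __name__ == "__main__":
    wanted = [1, 2, 3]
    print(fetch_names_in_loop(conn, wanted))
    print(fetch_names_bulk(conn, wanted))
```

Both functions return the same names for a sorted list of ids, but the bulk version issues one statement however long the list is. For very large batches, splitting the ids into chunks keeps the query under the engine’s parameter limit.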
One of the simplest ways to improve your database’s load handling capability is to cache database queries. Normally, only a handful of queries make up the bulk of the requests an application makes. You can cache the results of such queries so that future requests are read from the cache. This eliminates the need to fetch data from the database every time such a common request is made, so the user is served the required data quickly. In this way, caching improves the performance of your database.
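The caching idea described above is often called the cache-aside pattern. The sketch below keeps results in a plain in-process dictionary with a time-to-live; the `QueryCache` class, its method names, and the 60-second default are assumptions for illustration, and a production setup would more likely use a shared cache server.

```python
import time


class QueryCache:
    """Cache-aside: check the cache first, run the query only on a miss."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                      # hit: no database call
        value = loader()                         # miss or expired: run the query
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop a cached result, for example after the underlying rows change."""
        self._entries.pop(key, None)


if __name__ == "__main__":
    def load_top_players():
        # Stand-in for an expensive database query.
        print("querying database...")
        return ["alice", "bob"]

    cache = QueryCache(ttl_seconds=30)
    print(cache.get_or_load("top_players", load_top_players))
    print(cache.get_or_load("top_players", load_top_players))
```

The second call returns the cached list without running the loader. The time-to-live bounds how stale a result can get, and `invalidate` covers the case where you know the data has just changed.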
Amazon ElastiCache is a managed in-memory caching service that you can place in front of your database and scale as demand grows. It supports real-time use cases. The following are ideal use cases for Amazon ElastiCache:
- Gaming leaderboards
- Analytics
- Streaming
Tinder, the famous dating app, powers around 30 billion matches through Amazon ElastiCache. To solve performance issues with queries, AWS also provides multi-region databases.
Slow loading content
Slow-loading content is a major database scalability bottleneck. If users do not get the content they are looking for quickly, your application is doomed to fail. Serving content quickly to a couple of hundred users is never a problem. The problem arises when millions of concurrent users want to access content. The database gets overloaded and either simply crashes or loads excruciatingly slowly. Most users are turned off and leave the application, never to return.
Solution
Serve static assets through a CDN (content delivery network). A CDN is a geographically distributed network of servers that work together to deliver static content fast. In a CDN, the content is served from cache rather than from your own servers. Moreover, to speed up delivery, the content is served from the server that is geographically closest to the client requesting it.
AWS services like CloudFront and S3 make this process easy. These services will help you serve your static content a lot faster and give it a major performance bump.
At Simform, our engineers implemented AWS CloudFront in the FIH (International Hockey Federation) platform. We were able to serve content to 100,000 users simultaneously without any kind of lag. Check out how an optimized database can help you build a robust system that can cater to 1 million users without any issues.
How to scale to 1 million users
Build scalable solutions with Simform
The database is the backbone of any application, and you cannot effectively scale an application without a robust database scalability strategy. Scaling an application is much easier today thanks to cloud service providers like AWS, Azure, and Google Cloud. You can still choose to implement the scaling solutions manually yourself. However, in most scenarios you would want your team to focus on the application without worrying too much about scalability. We at Simform have expertise in building scalable solutions. Simform engineers have delivered countless scalable applications that now power businesses worth millions. Contact us to build a scalable solution for your business.