Let’s admit it: the whole “High Load” paradigm is a bit overhyped right now. People throw the term around without a proper understanding of the topic.

When used in the expression “the development of high load systems”, it normally refers to building applications that can handle heavier loads. But what is high load? How high does the load have to be to count as high? How do you recognize a high load system in the wild? Let’s find out.




By itself, a high load application is not just a piece of code that can withstand a large number of requests or visitors, but a diversified and well-planned IT infrastructure. It can be either a single server or a network of interconnected servers, depending on the unique needs of a particular business.

The difference between an application and an infrastructure is that the latter can be scaled out to serve a growing number of clients, requests, or any other load indicator.

Overall, a high load system is a system that can be scaled continuously and has enough resources to handle its current workload.




First, let’s accept one simple truth: high load is a relative concept. It cannot be measured by the number of requests that reach the server or by the website’s loading speed. There is no “average” number of requests, because there is no “average” site.

One web resource can process a thousand requests per second, while another will collapse at the hundredth connection. So, quantitative indicators are not the point here.




If high load is not defined by quantity, we should be able to define a high load system by certain qualities, right? Here they are:


  • It has a huge audience. 


If we talk about web applications, the audience should number in the thousands, and sometimes in the hundreds of thousands. Of course, a specific figure cannot be named. But an online store that processes 10 leads a day cannot be called a “high load” system, whereas Facebook, Amazon, Flickr, Myspace, or YouTube easily can.


  • It's a distributed system. 


If an application processes a huge amount of data that is also constantly growing, one server is not enough. The largest “high loads” (for example, Google or Facebook) run on thousands of servers.

At the same time, servers do fail, so the more of them you have, the smaller the share of the load each failure takes out and the faster the system recovers.


  • It is a system with positive dynamics. 


If an application offers value to users, its audience naturally grows. Therefore, “high load” is not just a system with many users, but a system whose audience is growing rapidly.


  • It is an interactive system. 


Whether a person enters a search query into Google, uploads a video to YouTube, or makes a purchase on eBay, they expect to get an instant result. If the system takes a long time to respond, the user will most likely look for an alternative. An instant response is therefore a distinctive and very important feature of any high load system.


  • It is a high-performance system. 


This point is directly related to the previous one. To respond instantly, the system requires a lot of resources: CPU, RAM, disk space, etc. These resources therefore need to be: a) available; b) provisioned with spare capacity.


This is where the paradox of a high load system pops up: the faster it grows, the more carefully you have to manage its resources. When an app grows its audience, the number of requests naturally grows as well, and with it the amount of resources that must be spent to maintain the system.


A high load system needs to be scaled constantly. Configuring it in such a way is quite difficult, but from a business point of view, it’s worth it.
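
To make the link between audience growth and resources concrete, here is a minimal Python sketch of the underlying arithmetic. The function name `servers_needed`, the per-server throughput of 500 requests per second, and the 30% headroom are illustrative assumptions, not figures from any real system.

```python
import math


def servers_needed(peak_rps: float, per_server_rps: float, headroom: float = 0.3) -> int:
    """Estimate how many servers are needed to serve peak_rps requests per second.

    headroom is the fraction of each server's capacity kept in reserve
    (an assumed policy, chosen here only for illustration).
    """
    if per_server_rps <= 0:
        raise ValueError("per_server_rps must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in the range [0, 1)")
    usable_rps = per_server_rps * (1 - headroom)
    return max(1, math.ceil(peak_rps / usable_rps))


# Hypothetical growth scenario: the request rate doubles at every step.
for rps in (1_000, 2_000, 4_000, 8_000):
    print(f"{rps} req/s -> {servers_needed(rps, per_server_rps=500)} servers")
```

The point of the sketch is only that the server count has to be recalculated as the load grows; real capacity planning would also account for memory, storage, and traffic spikes.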




High load systems are, by definition, subject to high loads. Such loads should be distributed evenly across the system’s components, so that if one component fails, the others keep working. Of course, you need to distribute the load wisely: system analysis and system modeling methods (including diagramming) will normally do the job.
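
One simple way to spread requests evenly while tolerating a failed component is round-robin selection that skips unhealthy backends. The sketch below is an illustration under assumptions: the class `RoundRobinBalancer` and the backend names are hypothetical, and health is set by hand instead of by real health checks.

```python
class RoundRobinBalancer:
    """Hands out backends in turn, skipping any that are marked as down."""

    def __init__(self, backends):
        self._backends = list(backends)
        if not self._backends:
            raise ValueError("at least one backend is required")
        self._healthy = set(self._backends)
        self._next = 0  # index of the backend to try next

    def mark_down(self, backend):
        self._healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self._backends:
            self._healthy.add(backend)

    def pick(self):
        """Return the next healthy backend, or raise if none is left."""
        count = len(self._backends)
        for _ in range(count):
            backend = self._backends[self._next]
            self._next = (self._next + 1) % count
            if backend in self._healthy:
                return backend
        raise RuntimeError("no healthy backends available")


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])  # hypothetical names
balancer.mark_down("app-2")  # simulate a failed server
print([balancer.pick() for _ in range(4)])
```

With one backend marked as down, the remaining two keep receiving requests in turn, which is the behavior the paragraph above describes.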


When creating models, you should also consider not only the components, but also the interdependence of data. The degree of interdependence varies, which opens up the possibility of so-called preliminary caching (prefetching): if a user has requested some data, they will very probably request other data closely related to it.
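
The idea of preliminary caching can be sketched as a cache that, on a miss, loads the requested item and also warms the items related to it. Everything here is an assumption made for illustration: the class `PrefetchingCache`, the `load` and `related` callables, and the key names. The cache is also unbounded, which a real system would not allow.

```python
from typing import Callable, Dict, Iterable


class PrefetchingCache:
    """Caches loaded values and prefetches related keys on every miss."""

    def __init__(self, load: Callable[[str], str], related: Callable[[str], Iterable[str]]):
        self._load = load        # fetches one value from the slow source
        self._related = related  # names the keys likely to be requested next
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        value = self._load(key)
        self._store[key] = value
        # Warm the cache with closely related data before it is asked for.
        for other in self._related(key):
            if other not in self._store:
                self._store[other] = self._load(other)
        return value


# Hypothetical data model: a product page is usually followed by its reviews.
links = {"product:1": ["reviews:1", "price:1"]}
cache = PrefetchingCache(load=lambda k: f"data for {k}",
                         related=lambda k: links.get(k, []))
cache.get("product:1")   # miss: loads the product and prefetches related keys
cache.get("reviews:1")   # hit: already prefetched
print(cache.hits, cache.misses)
```

Prefetching trades extra load on the data source for lower latency on follow-up requests, so it only pays off when the related data really is requested often.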




When talking about high load systems, we should recognize that there is no one-size-fits-all solution, nor any such thing as a “High Load technology”.

Another important observation is that 100% availability at high loads is unattainable, and all attempts to approach it by adding to the string of nines (99.9%, 99.99%, etc.) lead to a dramatic increase in cost.

Therefore, if you want to build a high load system, your main challenge is to find a balance between the desired level of availability and the budget.
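
To see what each additional nine means in practice, the short Python sketch below converts an availability target into the downtime it allows per year. The function name `allowed_downtime_minutes` is an assumption made for this example, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a non-leap year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (100 - availability_percent) / 100


for target in (99.0, 99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% availability -> {minutes:.1f} minutes of downtime per year")
```

Each extra nine cuts the downtime budget by a factor of ten: from roughly three and a half days at 99%, to under nine hours at 99.9%, to less than an hour at 99.99%. That shrinking budget is what drives the cost up.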