Choosing The Right Web Server For Your Project
Apache and Nginx are the two most widely used open source web servers on the internet. Together, they are responsible for serving well over half of the internet we see as end users. Both are capable offerings that can handle incredibly diverse workloads and have very useful features beyond actually serving web pages.
Apache and Nginx share many features, but they should not be thought of as interchangeable. Each excels in its own way, and it is critical to understand your individual use case when deciding which one to deploy.
General Overview of NGINX and Apache
Before we get into the weeds on the differences between Apache and Nginx, it’s worth taking a 10,000-foot view of the origins of these two projects and their defining characteristics.
The Apache HTTP Server was born in 1995 and has been developed under the direction of the Apache Software Foundation since 1999. Since the HTTP server is the Apache Software Foundation’s original project and is by far its most popular piece of software, it is often referred to simply as Apache.
The Apache web server became the most widely used server on the internet within a year of the project’s creation and held that position for many years. Because of this ubiquity, Apache has a huge, vibrant and mature developer community and is very well documented. A huge volume of tutorials on setting up your own hosting environment is based on using Apache and PHP.
Apache is often chosen by sysadmins and DevOps engineers for its wealth of support and industry ubiquity. It is extended through a module system and has great server and scripting language support.
In 2002, Igor Sysoev began work on Nginx as an answer to the C10K problem: the challenge of getting a web server to handle ten thousand concurrent connections, a level of concurrency that modern web applications routinely demand. The initial public release came in 2004 and met this goal by relying on an asynchronous, event-driven architecture.
Nginx has gained support and popularity since its release because of its lightweight resource utilization and its ability to scale easily on commodity hardware. One of Nginx’s great strengths is serving static content quickly. It is designed to pass requests for dynamic content off to other software for processing. One example is PHP-FPM (FastCGI Process Manager).
Nginx is selected by sysadmins and DevOps engineers for its resource efficiency and its ability to handle load at scale.
The Handling of Connections
A key difference between Apache and Nginx is the way in which they process connections and traffic.
To talk about these differences, we draw on an article from our friends over at DigitalOcean.
Apache provides a variety of multi-processing modules (known as MPMs) that determine how client requests are handled. This modular design allows server admins to swap out Apache’s connection handling architecture through configuration. Each MPM has its own advantages (which are outside the scope of this article), but they all have drawbacks when it comes to handling lots of connections at the same time. Put simply, there was less traffic on the internet when Apache was created, and the concurrency levels of the day drove its architecture.
Nginx arrived later, with more awareness of the connection handling problems that sites running Apache would face at scale. It was designed from the ground up to use an asynchronous, non-blocking, event-driven connection handling algorithm, rather than the pluggable MPMs of Apache.
Nginx spawns worker processes, each of which can handle thousands of connections. The worker processes accomplish this by implementing a fast looping mechanism that continuously checks for and processes events. Decoupling the actual work from individual connections allows each worker to concern itself with a connection only when a new event has been triggered.
Each connection handled by a worker is placed within that worker’s event loop, where it lives alongside the other connections. Within the loop, events are processed asynchronously, allowing work to be handled in a non-blocking manner. When a connection closes, it is removed from the loop.
This style of connection processing allows Nginx to scale incredibly far with limited resources. Since each worker is single-threaded and no new process is spawned to handle each new connection, memory and CPU usage tend to stay relatively consistent, even at times of heavy load.
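The event loop described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not Nginx’s actual implementation (Nginx is written in C). The function name run_event_loop and the max_connections stopping condition are made up for this example, and the sketch simply echoes back whatever it receives.

```python
import selectors
import socket


def run_event_loop(listener: socket.socket, max_connections: int) -> int:
    """Serve many connections from one thread using a single event loop.

    Hypothetical helper for illustration: echoes data back to each client
    and returns once `max_connections` connections have been closed.
    """
    sel = selectors.DefaultSelector()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ, data="accept")
    closed = 0

    while closed < max_connections:
        # Sleep until at least one registered socket has an event ready.
        for key, _mask in sel.select():
            if key.data == "accept":
                # New connection: place it in the loop with the others.
                conn, _addr = listener.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ, data="client")
            else:
                conn = key.fileobj
                chunk = conn.recv(4096)
                if chunk:
                    # Simplification: assumes small payloads that fit in the
                    # send buffer; a real server would also watch for
                    # EVENT_WRITE instead of writing immediately.
                    conn.sendall(chunk)
                else:
                    # Client closed the connection: remove it from the loop.
                    sel.unregister(conn)
                    conn.close()
                    closed += 1

    sel.unregister(listener)
    sel.close()
    return closed
```

Note that no thread or process is created per connection. The worker only touches a connection when the selector reports an event for it, which is why resource usage stays fairly flat as the number of idle connections grows.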
Static vs Dynamic Content
Probably the most common comparison between Apache and Nginx is the way in which each server handles requests for static and dynamic content.
Apache serves static content using its conventional file-based methods. It processes dynamic content by embedding an interpreter for the language in question (PHP via mod_php, for example) into its worker instances. This allows Apache to parse and serve up dynamic content within the web server itself without having to rely on external components like FastCGI process managers (PHP-FPM, for example). These dynamic processors are loaded through Apache’s module system.
This makes Apache’s dynamic processing easy to set up, since communication does not need to be coordinated with a separate piece of software, and modules can easily be swapped out if system level conditions change.
Nginx lacks the ability to process dynamic content out of the box. To handle PHP and other requests for dynamic content, Nginx must pass the request to an external processor like PHP-FPM, wait for it to be processed, and receive the result back. The result is then relayed to the end user.
This tends to complicate things slightly, especially when trying to anticipate the number of connections to allow between Nginx and the processor. If you have never had to plan resources at scale, this can be very tricky to get right.
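As a rough starting point for that kind of planning, one common rule of thumb is to divide the memory you can spare by the average size of one backend worker process. The sketch below assumes PHP-FPM, where the result would be a candidate value for the pm.max_children setting. The function name and all of the numbers are illustrative assumptions, so measure your own processes before relying on it.

```python
def max_fpm_workers(total_ram_mb: int, reserved_mb: int, avg_worker_mb: int) -> int:
    """Estimate how many backend worker processes fit in memory.

    Hypothetical helper: `reserved_mb` is memory set aside for the OS,
    Nginx and anything else on the box; `avg_worker_mb` is the measured
    average resident size of one worker process.
    """
    if avg_worker_mb <= 0:
        raise ValueError("avg_worker_mb must be positive")
    usable_mb = total_ram_mb - reserved_mb
    # Round down, and never return a negative count.
    return max(usable_mb // avg_worker_mb, 0)


# Example with made-up numbers: a 4 GB server, 1 GB reserved, 64 MB per worker.
print(max_fpm_workers(4096, 1024, 64))  # prints 48
```

This is only a ceiling based on memory. CPU, database connections and slow upstream calls can all force a lower number, which is why testing under realistic load still matters.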
Despite this added complexity, Nginx does have a hidden advantage here. Since the language interpreter is not baked into the worker process, its overhead is only incurred for content that needs to be processed dynamically. Static content can be served as is, and the language interpreter is only called upon when needed. This saves some serious overhead at scale. Apache can function in this manner to a certain degree, but doing so requires advanced configuration and can cause problems down the road for teams that need to troubleshoot the server.
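To make the split concrete, here is a minimal sketch of the decision a front-end server makes for each request: serve the file directly, or hand the request to an external processor. The names route and DYNAMIC_SUFFIXES are hypothetical, and deciding by file extension alone is a simplifying assumption. Real Nginx configurations express this with location rules rather than code.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Assumption for this sketch: only .php files need the external processor.
DYNAMIC_SUFFIXES = {".php"}


def route(request_target: str) -> str:
    """Return "backend" for dynamic requests and "static" for everything else."""
    # Drop the query string so "/index.php?page=2" is still seen as PHP.
    path = urlsplit(request_target).path
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in DYNAMIC_SUFFIXES:
        return "backend"  # would be passed to something like PHP-FPM
    return "static"  # served straight from disk, no interpreter involved


print(route("/css/site.css"))      # prints static
print(route("/index.php?page=2"))  # prints backend
```

Only the requests routed to the backend pay the cost of the language interpreter, which is the overhead saving described above.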
Non HTTP Features
Apache and Nginx can both be used as load balancers, but Nginx is generally better suited to the job. Getting Apache load balancing to work correctly can be difficult, and you still have to contend with Apache’s higher memory footprint. Nginx, on the other hand, was built with proxying and load balancing in mind and gives you a lot of options to configure. You can:
- Proxy different traffic types to different servers or ‘backends’
- Use round robin, least connections or similar load balancing methods
- Terminate SSL more efficiently, which can be a nightmare with Apache at scale
- Get very granular with static asset caching, which is one reason Cloudflare and similar CDN providers have built products on Nginx
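For readers new to the balancing methods mentioned in the list above, here is a small sketch of how round robin and least connections pick a backend. It shows the general algorithms only, not Nginx’s internals, and the names round_robin and LeastConnections are invented for this example.

```python
import itertools
from typing import Dict, Iterator, List


def round_robin(backends: List[str]) -> Iterator[str]:
    """Hand out backends in a fixed rotation, one per request."""
    return itertools.cycle(backends)


class LeastConnections:
    """Send each new request to the backend with the fewest active connections."""

    def __init__(self, backends: List[str]) -> None:
        self.active: Dict[str, int] = {name: 0 for name in backends}

    def acquire(self) -> str:
        # Ties go to the backend listed first.
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend: str) -> None:
        # Call this when the request on that backend has finished.
        self.active[backend] -= 1


rr = round_robin(["app1", "app2", "app3"])
print([next(rr) for _ in range(5)])  # prints ['app1', 'app2', 'app3', 'app1', 'app2']

lc = LeastConnections(["app1", "app2"])
first = lc.acquire()   # app1 (tie, listed first)
second = lc.acquire()  # app2 (app1 is busier)
lc.release(first)
print(lc.acquire())    # prints app1
```

Round robin is the simpler choice when requests cost roughly the same. Least connections tends to behave better when some requests take much longer than others, since busy backends naturally receive less new work.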
Using Apache and Nginx Together
If you plan to deploy your web application in containers, running Apache and Nginx together is a common pattern.
You can, for instance, bake your PHP application into a Docker container based on the Apache variant of PHP’s official Docker Hub image. From there, you can run Nginx in Docker as a reverse proxy that points at your application container, which is bundled with the server it needs to present your application to the world.
The advantage is that Nginx handles the connections while Apache works with the dynamic content that you have baked into the container, so each of Nginx and Apache does what it does best. While this can create some technical overhead, there are a lot of places where this is being used to great success. The same model can also be applied to non-container setups.
Both Apache and Nginx are capable offerings. Deciding which is best for your use case depends largely on your specific requirements and on thorough testing with the traffic patterns that you expect to see.
There is no ‘good for all’ web server, so use the solution that best aligns with your objectives.