NG Firewall Performance Guide: Difference between revisions
Revision as of 03:36, 31 March 2016
Untangle Performance Tuning
This guide describes what factors determine the performance of your Untangle server and configuration and how you can tune your Untangle for optimal performance.
Usually on modern hardware "tuning" really isn't necessary for the huge majority of sites. However, if you are running on a tiny server or running a large site with thousands of users doing more than 100Mbit 24/7, then this guide may help you tune your Untangle to get the best performance out of it.
Performance Factors
There are several main components that determine the performance of your Untangle setup.
- Server Hardware
- Configuration
- Traffic Profile
Of course, all three of these are closely interrelated. Let's analyze each one so that you can find a working configuration.
Hardware
If you already have chosen your hardware, you can skip this section. If you are choosing what hardware to run or evaluating hardware, this section can help make sure you have the optimal setup.
Server performance is extremely complex and there are many different kinds of resources, but the most important resources that can be limiting factors are memory, CPU, and disk I/O (input/output).
When people think of server performance they usually think of CPU speed. While CPU clock speed and processing power are important, they are the least important resource of these three for Untangle’s workload. More cores and faster cores help, but you can actually run a large site on a fairly underpowered CPU if you have plenty of memory and disk I/O.
Memory is extremely important up to a point. You need enough memory to store Untangle’s working set with some left over to serve as disk cache. If you have a major shortage of memory, you’ll see consistent swapping, performance will be sluggish, and large pauses will occur. Once you have enough memory, you may want to add more for better disk cache, but you won’t see massive gains from doubling memory if you already have enough.
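As a rough way to check for the memory shortage described above, the sketch below parses text in the standard Linux /proc/meminfo format and reports how much swap is in use. The sample values and the function names are illustrative assumptions, not part of Untangle. Heavy swap use is only a hint; sustained swap-in and swap-out activity is the real sign of trouble.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def swap_used_fraction(meminfo):
    """Return the fraction of swap in use (0.0 when no swap is configured)."""
    total = meminfo.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return (total - meminfo.get("SwapFree", 0)) / total


# Hypothetical sample; on a live server, read the text from /proc/meminfo.
SAMPLE = """MemTotal:        4038632 kB
MemAvailable:     512000 kB
SwapTotal:       2097148 kB
SwapFree:        1048574 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print(f"swap in use: {swap_used_fraction(info):.0%}")
```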
For large sites an important resource for Untangle is disk speed or disk I/O throughput. Unfortunately, when evaluating servers it is often overlooked and the hardest to quantify. Unlike a typical firewall which has flat log files, Reports runs a database and each application logs information to the database through the reporting system. For large sites this can be many millions of events every hour. Systems experiencing disk I/O saturation can experience long pauses and major sluggishness.
Generally, I would plan on having plenty of all three types of resources for your setup, with some overhead available just in case. It is absolutely essential to have enough memory and disk I/O. You can have a 16-core machine with 16 GB of RAM, but if your disk is slow, that will ultimately be your limiting factor.
Virtualization can be a source of additional performance woes. The same principles apply. If Untangle is given sufficient (virtual) resources it will run great. However, if other VMs running on the same virtualization platform manage to saturate the disk I/O, Untangle performance will suffer.
Configuration
Configuration obviously has a huge effect on the performance of your setup. Which apps are installed and their configuration has a huge impact on the amount of work the Untangle system has to do to process the network traffic.
Many new users expect Untangle performance to be comparable to other software firewall solutions available with similar hardware requirements. This is usually true if you install just the Firewall application and maybe some lighter apps. Untangle will have slightly higher latency than your typical layer-3 firewall at these tasks because Untangle (by default) processes all sessions at layer 7, which means it reconstructs the stream for processing before deconstructing it again on the other side.
Where Untangle starts to diverge from traditional router software is when you start installing the apps which can have huge impact on the resource requirements. For example, Web Filter Lite requires a large amount of memory all by itself because it stores the entire categorization database in memory. Web Filter requires much less, since it does its categorization through a cloud service with a local cache. Reports, on the other hand, requires almost no additional memory, but requires a large amount of disk I/O to process and store events. The following chart provides a high-level guide to which resources and how much of each resource each app requires.
Component/App | Memory | CPU | Disk I/O |
---|---|---|---|
Platform | medium | medium | medium |
Web Filter | low | low | low |
Web Filter Lite | high | low | low |
Virus Blocker | medium | medium | medium |
Virus Blocker Lite | medium | medium | medium |
Spam Blocker | high | medium | medium |
Spam Blocker Lite | medium | medium | medium |
Phish Blocker | medium | medium | medium |
Web Cache | medium | low | high |
Bandwidth Control | low | medium | low |
Application Control | low | medium | low |
Application Control Lite | low | low | low |
Firewall | low | low | low |
Ad Blocker | low | medium | low |
Reports | medium | medium | high |
Policy Manager | low | low | low |
Directory Connector | low | low | low |
WAN Failover | low | low | low |
WAN Balancer | low | low | low |
Captive Portal | low | low | low |
IPsec VPN | low | low | low |
OpenVPN | low | low | low |
Intrusion Prevention | high | medium | low |
Configuration Backup | low | low | low |
Branding Manager | low | low | low |
Live Support | low | low | low |
Note: these are just estimates. The configuration of the app itself can matter a great deal. Virus Blocker can require very little, but if configured to scan every .png downloaded over HTTP, it will be significantly more costly.
As mentioned earlier, none of the apps require an intense amount of CPU power; therefore, it is less important. Disk I/O and memory are very important. If you are short on Disk I/O, try disabling Reports, which will lessen the disk I/O requirements a significant amount. Likewise, if you are short on memory, try removing Web Filter Lite or Intrusion Prevention or Spam Blocker.
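The advice above amounts to a lookup against the chart: given the resource you are short on, find the installed apps rated "high" for it. A minimal sketch follows, using a subset of the chart's rows. The dictionary layout and the helper name `heavy_apps` are assumptions made for illustration, and the ratings remain the rough estimates they are in the chart.

```python
# Relative needs copied from the chart above: (memory, CPU, disk I/O).
# Only a subset of rows is included to keep the sketch short.
APP_NEEDS = {
    "Web Filter": ("low", "low", "low"),
    "Web Filter Lite": ("high", "low", "low"),
    "Spam Blocker": ("high", "medium", "medium"),
    "Web Cache": ("medium", "low", "high"),
    "Reports": ("medium", "medium", "high"),
    "Intrusion Prevention": ("high", "medium", "low"),
    "Firewall": ("low", "low", "low"),
}
RESOURCES = ("memory", "cpu", "disk_io")


def heavy_apps(installed, resource):
    """Return installed apps rated 'high' for the given resource, sorted by name."""
    idx = RESOURCES.index(resource)
    unknown = ("low", "low", "low")  # apps missing from the table are not flagged
    return sorted(
        app for app in installed if APP_NEEDS.get(app, unknown)[idx] == "high"
    )


if __name__ == "__main__":
    installed = ["Firewall", "Reports", "Web Cache", "Web Filter Lite"]
    print("short on disk I/O, consider:", heavy_apps(installed, "disk_io"))
    print("short on memory, consider:", heavy_apps(installed, "memory"))
```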
The other important aspect of configuration is bypass rules. By default, Untangle processes all ports of TCP and UDP at layer 7. For many sites, this is overkill, and significant gains can be had by just adjusting the bypass rules to bypass traffic that doesn’t require scanning.
Network
The type and amount of traffic on your network plays an important part in your Untangle performance. Unfortunately, it isn’t always a variable you can tune, as the traffic on your network is the traffic on your network.
However, at some sites it is appropriate to restrict certain behavior that is not considered an appropriate use of network resources. Often schools may block or shape bittorrent, use quotas to enforce reasonable bandwidth usage, or outright block content from inappropriate sites. Other tips below suggest ways to tune your configuration to optimize for your network traffic profile.
Summary
Hopefully this article helps illuminate some of Untangle’s inner workings and its performance characteristics. Users often ask “How big of a server do I need on a site with X thousand users?” or “Is this server big enough for this site?” Unfortunately these questions are impossible to answer as the difference from one site to the next site and one configuration to the next configuration can be drastic.
As general guidance, a server with good hardware, several cores, a few gigs of memory, and a good disk setup can handle huge sites if configured correctly. If you aren’t sure how to configure it correctly, call Untangle support. If you aren’t sure what server to get, remember disk I/O is what matters. If you just want one that will just work, check out our appliances as we have tested those extensively.
Tuning Tips
Here are some common tests and changes you can do to analyze and optimize your performance.
Disable logging of bypassed traffic
Do you care about logging/reporting of traffic that is bypassed (not scanned by the apps)?
This includes:
- Traffic that is explicitly bypassed with bypass rules (and would otherwise have been scanned)
- Traffic from the Untangle server itself (DNS lookups, cloud lookups, signature updates, etc)
- Traffic to the Untangle server itself (DNS lookups, Administration, etc)
Most users do not need this information. The best performance can be had by unchecking the following in Config > Network > Advanced > Options:
- Log bypassed sessions
- Log outbound local sessions
- Log inbound local sessions
- Log blocked sessions
With this configuration only scanned traffic is logged, which is going to be fine in most cases except where you need to be able to audit all network traffic that has occurred or all traffic needs to be logged for bandwidth accounting.
Bypass unimportant traffic
Look in Reports > Network > Top Ports by Session and Reports > Network > Top Ports by Bytes. Do you see any uncommon ports that comprise a significant amount of your traffic? If so, consider bypassing them.
For example, sometimes we’ll look at a site and see millions of sessions to port 514. It's doubtful that a site like this really needs to spend the server resources on scanning its internal syslog traffic (port 514). This traffic can safely be bypassed.
A more normal traffic profile will show the more common ports (80 for HTTP, 443 for HTTPS, 53 for DNS, etc.) as the top ports.
[Figures: a suspect traffic profile; a normal traffic profile]
If you see something non-standard as the top port, you may want to investigate what it is and consider bypassing it.
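If the per-port session counts are copied out of the report by hand, the same check can be sketched in a few lines. The input format, the 10% threshold, and the function name below are assumptions for illustration; Reports does not necessarily export data in this shape, and a flagged port still needs investigating before it is bypassed.

```python
# Ports the text names as typical top ports: HTTP, HTTPS, DNS.
COMMON_PORTS = {80, 443, 53}


def bypass_candidates(sessions_by_port, min_share=0.10, common=COMMON_PORTS):
    """Return (port, share) pairs for uncommon ports carrying a large share of sessions."""
    total = sum(sessions_by_port.values())
    if total == 0:
        return []
    candidates = [
        (port, count / total)
        for port, count in sessions_by_port.items()
        if port not in common and count / total >= min_share
    ]
    # Largest share first, so the most likely bypass candidate leads the list.
    return sorted(candidates, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical counts resembling the syslog-heavy site described above.
    sample = {443: 400_000, 80: 250_000, 53: 150_000, 514: 3_000_000, 123: 20_000}
    for port, share in bypass_candidates(sample):
        print(f"port {port}: {share:.0%} of sessions")
```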
Bypass DNS
If Untangle itself is the DNS server, then DNS is automatically bypassed. However, if DNS is going *through* Untangle it is scanned/categorized/scrubbed just like normal traffic.
In some cases this is desirable, for example if you want to use Captive Portal, Firewall, or policies to control internet access. However, in other cases users may not care about DNS, or it can be managed solely with filter rules (at layer 3) even when bypassed, which is much faster. In these cases you can bypass all UDP port 53 and save a lot of server processing power.
Bypass UDP
Similarly to bypassing DNS, depending on the use case many sites can actually bypass all UDP. If you are trying to control applications, shape bandwidth, or run captive portal, this won't work because a significant amount of internet traffic is UDP based. However, if the goal is simply to filter web traffic, then scanning UDP is not necessary and bypassing it can save a lot of server processing power.
Shape Bandwidth
Do you have bandwidth hogs or certain applications that are hogging network resources? A quick look at Reports > Bandwidth Control > Top Clients (by total bytes) will show if you have any clients on the network that are significantly different than other clients. Reports > Bandwidth Control > Top Application (by total bytes) will show if you have any applications on the network that are using more resources than they should.
In some cases, you can actually change the network profile. For example, schools often struggle with P2P and bittorrent saturating the bandwidth and causing performance bottlenecks at the WAN. Application Control and Bandwidth Control can provide essential tools for blocking or slowing unimportant traffic to limit both the bandwidth requirements and server resource requirements.
Quotas in Bandwidth Control can provide a useful low-maintenance tool to automatically slow clients when they are using more data than you think is reasonable.
Remove unnecessary apps
Performance tuning may require being pragmatic about which applications you install and run. Untangle makes it VERY easy to install and enable apps, but that doesn't mean it's always a good idea.
Web Cache requires lots of server resources and likely provides very little value. Often this results in a net-negative ROI. It is suggested not to run it except in very special circumstances.
Intrusion Prevention provides little measurable security benefit, but requires a lot of memory and CPU resources. If you are low on memory, then it's certainly better to leave this disabled. The more rules you have enabled, the more memory is required.
Web Filter Lite is not a good web filter, but can be useful for monitoring web traffic. It is unmaintained and it is a memory pig because the whole (unmaintained & deprecated) database is stored in memory. If you are low on memory you should not run it.
Look for misbehaving hosts
Misbehaving hosts can often suck network and server resources by flooding the network, sending spam, scanning the internet for vulnerable hosts, and other crazy activities. It's not always an infected host; in some cases applications that are explicitly blocked retry the connection with no delay, and this can lead to accidental floods of connections.
Check the reports to look for suspicious activity. Reports > Shield > Top Blocked Clients might reveal if there are any hosts that may be behaving suspiciously. It's normal to see some blocked clients; however, if you see millions of sessions being blocked, that host may be doing something suspect and it warrants investigation.
Finding and investigating these hosts and their activity can help you keep your network and configuration of Untangle in the optimal state.
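One way to make "millions of sessions being blocked" concrete is to compare each client against the typical client. The sketch below flags clients whose blocked-session count is far above the median. The input format, the factor of 100, and the floor of 10,000 are arbitrary assumptions to be adjusted per site, and the function is illustrative rather than anything Untangle provides.

```python
from statistics import median


def suspicious_clients(blocked_by_client, factor=100, floor=10_000):
    """Return clients whose blocked-session count far exceeds the typical client.

    A client is flagged when its count is above both `floor` and
    `factor` times the median count across all clients.
    """
    if not blocked_by_client:
        return []
    typical = median(blocked_by_client.values())
    threshold = max(floor, factor * typical)
    flagged = [c for c, n in blocked_by_client.items() if n > threshold]
    # Worst offender first.
    return sorted(flagged, key=lambda c: blocked_by_client[c], reverse=True)


if __name__ == "__main__":
    # Hypothetical counts as they might be read off the Top Blocked Clients report.
    sample = {
        "192.168.1.10": 120,
        "192.168.1.11": 85,
        "192.168.1.12": 2_400_000,
        "192.168.1.13": 310,
    }
    print(suspicious_clients(sample))
```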
Performance Numbers
Users often request performance metrics be published by vendors. Untangle doesn’t do this. Here's why:
Traditionally, network devices quantify network performance in throughput. Untangle doesn’t publish throughput numbers because it is obviously hardware-dependent, but, more importantly, because it’s just irrelevant. Even bare minimum hardware doesn’t have a tough time supporting 100Mbit, which is far more than most users running minimal hardware have at the gateway. It doesn’t require a lot of hardware to support gigabit or 10 gigabit or more levels of throughput.
What matters a great deal is the type of traffic. For example, 100Mbit of continuous tiny HTTP fetches and tiny HTTP downloads requires significantly more work to process than one big HTTP download taking 100Mbit which takes almost no resources. However, at the packet level, both are just 100Mbit/sec of packets.
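A quick back-of-the-envelope calculation shows the gap. The session sizes below (10 KB fetches versus a single 1 GB download) are assumed purely for illustration, and protocol overhead is ignored.

```python
def sessions_per_second(link_mbit, bytes_per_session):
    """Sessions needed per second to fill a link, ignoring protocol overhead."""
    bytes_per_second = link_mbit * 1_000_000 / 8
    return bytes_per_second / bytes_per_session


if __name__ == "__main__":
    tiny = sessions_per_second(100, 10_000)         # assumed 10 KB HTTP fetches
    big = sessions_per_second(100, 1_000_000_000)   # assumed single 1 GB download
    print(f"tiny fetches: {tiny:.0f} new sessions per second")
    print(f"big download: one new session every {1 / big:.0f} seconds")
```

Under these assumptions the same 100Mbit of traffic means over a thousand new sessions to set up, scan, and log every second in the first case, and one session every minute or so in the second.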
Another common metric is maximum number of sessions. Untangle also declines to publish these numbers because they are similarly useless. Vendors publish these numbers for their servers when they are “optimally” configured, which is a code word for configured for maximal performance and minimum utility. Publishing the performance of Untangle with traffic bypassed and no apps installed is not useful because no one runs it like that since it provides no utility in that configuration.
We did some internal testing of common appliances currently available. None of them even supported 10% of the advertised maximum number of sessions with a “reasonable” configuration.
After reading this, if you’re still worried about the typical performance metrics, then rest assured that it's fairly easy to configure your Untangle server to support 256k concurrent sessions and more than gigabit throughput on even the smallest of servers.