Netfilter Iptables And Conntrack (ip_conntrack): Max Connections, Buckets, Inspect And Tweak Commands, And Errors
I recently needed to consult an old Netfilter conntrack document, normally found at http://www.wallfire.org/misc/netfilter_conntrack_perf.txt, and was unpleasantly surprised to find it was no longer publicly available. I located a copy and pasted it here, with some formatting improvements, for easy reference and to ensure it does not get lost and remains available when needed.
Please note I do not get paid to post these articles.
Article Structure
- First, the Netfilter document is pasted below.
- Then come highlights of the commonly needed commands and the related log messages that typically send one looking for this document.
Netfilter Conntrack (ip_conntrack) Doc
The pasted document sits between two lines of hash symbols, like this ###########. I have modified the document by formatting the executable command lines and other file entries as code.
############## START ##############
Netfilter conntrack performance tweaking, v0.8
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Hervé Eychenne <rv _AT_ wallfire _DOT_ org>
This document explains some of the things you need to know for netfilter conntrack (and thus NAT) performance tuning.
Latest version of this document can be found at:
http://www.wallfire.org/misc/netfilter_conntrack_perf.txt
------------------------------------------------------------------------------
There are two parameters we can play with:
- the maximum number of allowed conntrack entries, which will be called CONNTRACK_MAX in this document
- the size of the hash table storing the lists of conntrack entries, which will be called HASHSIZE (see below for a description of the structure)
CONNTRACK_MAX is the maximum number of "sessions" (connection tracking entries) that can be handled simultaneously by netfilter in kernel memory.
A conntrack entry is stored in a node of a linked list, and there are several lists, each list being an element in a hash table. So each hash table entry (also called a bucket) contains a linked list of conntrack entries.
To access a conntrack entry corresponding to a packet, the kernel has to:
- compute a hash value according to some defined characteristics of the packet. This is a constant time operation. This hash value will then be used as an index in the hash table, where a list of conntrack entries is stored.
- iterate over the linked list of conntrack entries to find the right one. This is a more costly operation, depending on the size of the list (and on the position of the wanted conntrack entry in the list).
The hash table contains HASHSIZE linked lists. When the limit is reached (the total number of conntrack entries being stored has reached CONNTRACK_MAX), each list will contain ideally (in the optimal case) about CONNTRACK_MAX/HASHSIZE entries.
The hash table occupies a fixed amount of non-swappable kernel memory, whether you have any connections or not. But the maximum number of conntrack entries determines how many conntrack entries can be stored (globally into the linked lists), i.e. how much kernel memory they will be able to occupy at most.
This document will now give you hints about how to choose optimal values for HASHSIZE and CONNTRACK_MAX, in order to get the best out of the netfilter conntracking/NAT system.
Default values of CONNTRACK_MAX and HASHSIZE
============================================
By default, both CONNTRACK_MAX and HASHSIZE get average values for "reasonable" use, computed automatically according to the amount of available RAM.
Default value of CONNTRACK_MAX
------------------------------
On i386 architecture, CONNTRACK_MAX = RAMSIZE (in bytes) / 16384 = RAMSIZE (in MegaBytes) * 64.
So for example, a 32-bit PC with 512MB of RAM can handle 512*1024^2/16384 = 512*64 = 32768 simultaneous netfilter connections by default.
But the real formula is:
CONNTRACK_MAX = RAMSIZE (in bytes) / 16384 / (x / 32) where x is the number of bits in a pointer (for example, 32 or 64 bits)
Please note that:
- default CONNTRACK_MAX value will never be lower than 128
- for systems with more than 1GB of RAM, default CONNTRACK_MAX value is limited to 65536 (but can of course be set to more manually).
Default value of HASHSIZE
-------------------------
By default, CONNTRACK_MAX = HASHSIZE * 8. This means that there is an average of 8 conntrack entries per linked list (in the optimal case, and when CONNTRACK_MAX is reached), each linked list being a hash table entry (a bucket).
On i386 architecture, HASHSIZE = CONNTRACK_MAX / 8 = RAMSIZE (in bytes) / 131072 = RAMSIZE (in MegaBytes) * 8.
So for example, a 32-bit PC with 512MB of RAM can store 512*1024^2/128/1024 = 512*8 = 4096 buckets (linked lists).
But the real formula is:
HASHSIZE = CONNTRACK_MAX / 8 = RAMSIZE (in bytes) / 131072 / (x / 32) where x is the number of bits in a pointer (for example, 32 or 64 bits)
Please note that:
- default HASHSIZE value will never be lower than 16
- for systems with more than 1GB of RAM, default HASHSIZE value is limited to 8192 (but can of course be set to more manually).
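As a rough sketch of the two default formulas above, the following shell functions reproduce the arithmetic, including the lower bounds and the caps for machines with more than 1GB of RAM. The function names and the choice of passing RAM in megabytes are assumptions made for illustration only; the kernel computes these values itself at conntrack initialization.

```shell
# Default CONNTRACK_MAX: RAMSIZE (bytes) / 16384 / (x / 32)
# $1: RAM in MB, $2: pointer width x in bits (32 or 64, default 32)
# (hypothetical helper, not a kernel interface)
default_conntrack_max() {
    ram_mb=$1
    ptr_bits=${2:-32}
    max=$(( ram_mb * 1024 * 1024 / 16384 / (ptr_bits / 32) ))
    # Lower bound stated in the text
    if [ "$max" -lt 128 ]; then max=128; fi
    # Cap for systems with more than 1GB of RAM
    if [ "$ram_mb" -gt 1024 ] && [ "$max" -gt 65536 ]; then max=65536; fi
    echo "$max"
}

# Default HASHSIZE: RAMSIZE (bytes) / 131072 / (x / 32)
# Same arguments as above (hypothetical helper)
default_hashsize() {
    ram_mb=$1
    ptr_bits=${2:-32}
    size=$(( ram_mb * 1024 * 1024 / 131072 / (ptr_bits / 32) ))
    # Lower bound stated in the text
    if [ "$size" -lt 16 ]; then size=16; fi
    # Cap for systems with more than 1GB of RAM
    if [ "$ram_mb" -gt 1024 ] && [ "$size" -gt 8192 ]; then size=8192; fi
    echo "$size"
}
```

For instance, `default_conntrack_max 512 32` prints 32768 and `default_hashsize 512 32` prints 4096, matching the worked examples for a 32-bit PC with 512MB of RAM.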
Reading CONNTRACK_MAX and HASHSIZE
==================================
Current CONNTRACK_MAX value can be read at runtime, via the /proc filesystem.
Before Linux kernel version 2.4.23, use:
# cat /proc/sys/net/ipv4/ip_conntrack_max
Since Linux kernel version 2.4.23 (thus Linux 2.6 as well), use:
# cat /proc/sys/net/ipv4/netfilter/ip_conntrack_max
(old /proc/sys/net/ipv4/ip_conntrack_max is then deprecated!)
Current HASHSIZE is always available (for every kernel version) in syslog messages, as the number of buckets (which is HASHSIZE) is printed there at ip_conntrack initialization.
Since Linux kernel version 2.4.24 (thus Linux 2.6 as well), current HASHSIZE value can be read at runtime with:
# cat /proc/sys/net/ipv4/netfilter/ip_conntrack_buckets
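Since the location depends on the kernel version, a small helper can simply try the newer path first and fall back to the old one. This is only a sketch: the function name and the optional root-prefix argument (added so the function can be tried against a scratch directory) are assumptions, not part of any netfilter tool.

```shell
# Print the value of a conntrack /proc entry, trying the newer location first.
# $1: entry name (ip_conntrack_max or ip_conntrack_buckets)
# $2: optional root prefix (hypothetical, for testing); defaults to the real /proc
read_conntrack_entry() {
    name=$1
    root=${2:-}
    for dir in /proc/sys/net/ipv4/netfilter /proc/sys/net/ipv4; do
        if [ -r "$root$dir/$name" ]; then
            cat "$root$dir/$name"
            return 0
        fi
    done
    echo "no $name entry found (is ip_conntrack loaded?)" >&2
    return 1
}
```

Called as `read_conntrack_entry ip_conntrack_max`, it prints the current CONNTRACK_MAX, or returns a non-zero status if neither file exists.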
Modifying CONNTRACK_MAX and HASHSIZE
====================================
Default CONNTRACK_MAX and HASHSIZE values are reasonable for a typical host, but you may increase them on heavily loaded firewalling-only systems.
So CONNTRACK_MAX and HASHSIZE values can be changed manually if needed.
While accessing a bucket is a constant time operation (hence the interest of having a hash of lists), keep in mind that the kernel has to iterate over a linked list to find a conntrack entry. So the average size of a linked list (CONNTRACK_MAX/HASHSIZE in the optimal case when the limit is reached) must not be too big. This ratio is set to 8 by default (when values are computed automatically).
On systems with enough memory and where performance really matters, you can consider trying to get an average of one conntrack entry per hash bucket, which means HASHSIZE = CONNTRACK_MAX.
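To make the ratio concrete, here is a minimal sketch that derives a HASHSIZE from a chosen CONNTRACK_MAX and a target average list length. The function name is hypothetical, and rounding up is a choice made here so that the average never exceeds the target.

```shell
# HASHSIZE needed so that a full table averages at most $2 entries per bucket.
# $1: CONNTRACK_MAX, $2: target average list length (default 8, the kernel default ratio)
# (hypothetical helper)
hashsize_for_chain_length() {
    max=$1
    per_bucket=${2:-8}
    # Integer division rounded up
    echo $(( (max + per_bucket - 1) / per_bucket ))
}
```

With CONNTRACK_MAX = 32768, a target of 8 gives 4096 buckets (the default), while a target of 1 gives 32768 buckets, i.e. HASHSIZE = CONNTRACK_MAX.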
Setting CONNTRACK_MAX
---------------------
Conntrack entries are stored in linked lists, so the maximum number of conntrack entries (CONNTRACK_MAX) can be easily configured dynamically.
Before Linux kernel version 2.4.23, use:
# echo $CONNTRACK_MAX > /proc/sys/net/ipv4/ip_conntrack_max
Since Linux kernel version 2.4.23 (thus Linux 2.6 as well), use:
# echo $CONNTRACK_MAX > /proc/sys/net/ipv4/netfilter/ip_conntrack_max
where $CONNTRACK_MAX is an integer.
Setting HASHSIZE
----------------
For mathematical reasons, hash tables have static sizes. So HASHSIZE must be determined before the hash table is created and begins to be filled.
Before Linux kernel version 2.4.21, a prime number should be chosen for hash size, ensuring that the hash table will be efficiently populated. Odd non-prime numbers or even numbers are strongly discouraged, as the hash distribution will be sub-optimal.
Since Linux kernel version 2.4.21 (thus Linux 2.6 as well), conntrack uses the jenkins2b hash algorithm, which is happy with all sizes, though a power of 2 works best.
If netfilter conntrack is statically compiled in the kernel, the hash table size can be set at compile time, or (since kernel 2.6) as a boot option with ip_conntrack.hashsize=$HASHSIZE
If netfilter conntrack is compiled as a module, the hash table size can be set at module insertion, with the following command:
# modprobe ip_conntrack hashsize=$HASHSIZE
where $HASHSIZE is an integer.
Since 2.6.14, it is possible to set hashsize dynamically at runtime, after boot and module load.
Between 2.6.14 and 2.6.19 (included), use:
# echo $HASHSIZE > /sys/module/ip_conntrack/parameters/hashsize
Since 2.6.20, use:
# echo $HASHSIZE > /sys/module/nf_conntrack/parameters/hashsize
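Rather than comparing kernel version numbers, a script can check which of the two module directories exists. The following is a sketch under that assumption; the function name and the optional root-prefix argument (used only to exercise the function against a scratch directory) are hypothetical.

```shell
# Print the sysfs path of the runtime hashsize parameter, preferring the
# nf_conntrack location (2.6.20 and later) over ip_conntrack (2.6.14 to 2.6.19).
# $1: optional root prefix (hypothetical, for testing); defaults to the real /sys
hashsize_param_path() {
    root=${1:-}
    for mod in nf_conntrack ip_conntrack; do
        path=/sys/module/$mod/parameters/hashsize
        if [ -e "$root$path" ]; then
            echo "$path"
            return 0
        fi
    done
    # Neither module exposes the parameter (kernel too old or module not loaded)
    return 1
}
```

As root, the value could then be written with something like `echo $HASHSIZE > "$(hashsize_param_path)"`.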
Ideal case: firewalling-only machine
------------------------------------
In the ideal case, you have a machine _just_ doing packet filtering and NAT (i.e. almost no userspace running, at least none that would have a growing memory consumption like proxies, ...).
The size of kernel memory used by netfilter connection tracking is:
size_of_mem_used_by_conntrack (in bytes) = CONNTRACK_MAX * sizeof(struct ip_conntrack) + HASHSIZE * sizeof(struct list_head)
where:
- sizeof(struct ip_conntrack) can vary quite much, depending on architecture, kernel version and compile-time configuration. To know its size, see the kernel log message at ip_conntrack initialization time.
- sizeof(struct ip_conntrack) is around 300 bytes on i386 for 2.6.5, but heavy development around 2.6.10 made it vary between 352 and 192 bytes!
- sizeof(struct list_head) = 2 * size_of_a_pointer. On i386, size_of_a_pointer is 4 bytes.
So, on i386, kernel 2.6.5, size_of_mem_used_by_conntrack is around CONNTRACK_MAX * 300 + HASHSIZE * 8 (bytes).
If we take HASHSIZE = CONNTRACK_MAX (if we have most of the memory dedicated to firewalling, see "Modifying CONNTRACK_MAX and HASHSIZE" section above), size_of_mem_used_by_conntrack would be around CONNTRACK_MAX * 308 bytes on i386 systems, kernel 2.6.5.
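The memory formula above is easy to evaluate for a few candidate settings. The sketch below does so in shell; the function name is hypothetical, and the defaults of 300 bytes per entry and 4-byte pointers are the i386, kernel 2.6.5 figures from the text, so substitute the size reported in your own kernel log.

```shell
# Upper bound on kernel memory used by conntrack, in bytes:
#   CONNTRACK_MAX * sizeof(struct ip_conntrack) + HASHSIZE * sizeof(struct list_head)
# $1: CONNTRACK_MAX
# $2: HASHSIZE
# $3: sizeof(struct ip_conntrack) in bytes (default 300: i386, 2.6.5)
# $4: pointer size in bytes (default 4: i386); list_head holds two pointers
# (hypothetical helper)
conntrack_mem_bytes() {
    max=$1
    hashsize=$2
    entry_size=${3:-300}
    ptr_size=${4:-4}
    echo $(( max * entry_size + hashsize * 2 * ptr_size ))
}
```

For the 512MB defaults, `conntrack_mem_bytes 32768 4096` prints 9863168 (under 10MB). With HASHSIZE = CONNTRACK_MAX = 1048576 it prints 322961408, i.e. 308 bytes per entry.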
Now suppose your firewalling-only box has 512MB of RAM (a decent amount of memory considering today's memory prices). You have to spare a bit of memory for a few applications (syslog, etc.): 128MB should really be big enough for a firewall in console mode, for example.
The rest can be dedicated to conntrack entries.
Then you could set both CONNTRACK_MAX and HASHSIZE approximately to:
(512 - 128) * 1024^2 / 308 =~ 1307315 (instead of 32768 for CONNTRACK_MAX, and 4096 for HASHSIZE by default).
Since Linux 2.4.21 (thus Linux 2.6 as well), hash algorithm is happy with "power of 2" sizes (it used to be a prime number before).
So here we can set CONNTRACK_MAX and HASHSIZE to 1048576 (2^20), for example.
This way, you can store about 32 times more conntrack entries than the default, and get better performance for conntrack entry access.
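The sizing steps of this example can be sketched as two small shell functions: one divides the memory budget by the per-entry cost when HASHSIZE = CONNTRACK_MAX, the other rounds the result down to a power of 2. Both function names are hypothetical, and the per-entry size must be replaced by the one your kernel reports.

```shell
# Number of entries that fit in the memory left for conntrack,
# assuming HASHSIZE = CONNTRACK_MAX (one list_head per entry).
# $1: total RAM in MB, $2: RAM reserved for everything else in MB,
# $3: sizeof(struct ip_conntrack) in bytes, $4: pointer size in bytes
# (hypothetical helper)
max_entries_for_budget() {
    budget=$(( ($1 - $2) * 1024 * 1024 ))
    per_entry=$(( $3 + 2 * $4 ))
    echo $(( budget / per_entry ))
}

# Largest power of 2 that does not exceed $1 (for $1 >= 1).
# (hypothetical helper)
floor_pow2() {
    n=$1
    p=1
    while [ $(( p * 2 )) -le "$n" ]; do
        p=$(( p * 2 ))
    done
    echo "$p"
}
```

With the figures above, `max_entries_for_budget 512 128 300 4` prints 1307315, and `floor_pow2 1307315` prints 1048576.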
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Last changes on Jan 10, 2008
Revision history:
0.8 Make the "Ideal case: firewalling-only machine" paragraph a bit clearer.
0.7 Hashsize parameter can be set dynamically since Linux 2.6.14. Thanks to Christopher A. Craig for the suggestion.
0.6 Hashsize parameter can be set at boot time with Linux 2.6. Thanks to Tobias Diedrich for pointing this out.
0.5 Added further notice about the varying length of the conntrack structure.
0.4 Since Linux 2.4.21, hash algorithm is happy with all sizes, not only prime ones. However, power of 2 is best.
0.3 Various small precisions.
0.2 Information about Linux kernel versions and corresponding /proc entries (/proc/sys/net/ipv4/netfilter/ip_conntrack_{max,buckets}).
0.1 Initial writing, largely based on my discussions with Harald Welte (netfilter maintainer) on the netfilter-devel mailing-list. Many thanks to him!
############## END ##############
Related Error Messages
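Chances are that you came here because you saw a kernel log message similar to "ip_tables: (C) 2000-2002 Netfilter core team. ip_conntrack version 2.1 (3071 buckets, 24568 max) - 360 bytes per conntrack". This is not an error in itself but the message printed at ip_conntrack initialization: 3071 buckets is HASHSIZE, 24568 max is CONNTRACK_MAX, and 360 bytes is the size of one conntrack entry. The actual error usually seen under load is "ip_conntrack: table full, dropping packet.", which means CONNTRACK_MAX has been reached.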
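The three numbers in that initialization message can be pulled out with sed. This is a sketch that assumes the message has exactly the wording quoted above; other kernel versions may phrase it differently, and the function name is hypothetical.

```shell
# Extract "HASHSIZE CONNTRACK_MAX bytes_per_entry" from an ip_conntrack
# initialization message passed as $1. Prints nothing if the line does not match.
# (hypothetical helper; the pattern follows the message wording quoted in this article)
parse_conntrack_banner() {
    printf '%s\n' "$1" | sed -n \
        's/.*(\([0-9][0-9]*\) buckets, \([0-9][0-9]*\) max) - \([0-9][0-9]*\) bytes per conntrack.*/\1 \2 \3/p'
}
```

Given the message quoted above, it prints `3071 24568 360`. To try it against a live system, one could feed it a line from the kernel log, for example `parse_conntrack_banner "$(dmesg | grep 'bytes per conntrack')"`.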
Final Notes
Note that if you have a server with more than 1GB of memory, the default maximum number of connections will still be capped at 64K (65536). You can change that as indicated in the doc above.
The server computes the default CONNTRACK_MAX according to this formula, with a 64K cap that can be raised manually (see the doc above) or at server boot:
- For a 32-bit server
CONNTRACK_MAX = RAMSIZE (in bytes) / 16384
- For a 64-bit server
CONNTRACK_MAX = RAMSIZE (in bytes) / 16384 / 2 = RAMSIZE (in bytes) / 32768
This follows from the real formula in the doc above, where the result is divided by (x / 32) and x is the pointer size in bits. It means that a 64-bit OS server gets a default CONNTRACK_MAX half as large as a 32-bit OS server with the same amount of memory.
Example: on a 64-bit OS server with 2GB of RAM, the formula gives 2147483648 / 32768 = 65536, which is also the cap. To double that to 131072, write the new value as shown in the doc above (path for kernels 2.4.23 and later):
# echo 131072 > /proc/sys/net/ipv4/netfilter/ip_conntrack_max
IMPORTANT: A value written to ip_conntrack_max with the command line echo method does not survive a reboot. At server boot (startup) it is reset to the computed default, which is 65536 (64K) on servers with more than 1GB of RAM.
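One common way to make the value survive a reboot is a sysctl entry. The sketch below appends the matching key to a sysctl configuration file. It assumes the distribution applies /etc/sysctl.conf at boot and that ip_conntrack is already loaded at that point (otherwise the key does not exist yet and the setting is not applied). The function name and the optional file argument are hypothetical.

```shell
# Append a persistent CONNTRACK_MAX setting to a sysctl configuration file.
# The key mirrors /proc/sys/net/ipv4/netfilter/ip_conntrack_max.
# $1: value to set
# $2: configuration file (default /etc/sysctl.conf; writing there needs root)
# (hypothetical helper)
persist_conntrack_max() {
    value=$1
    conf=${2:-/etc/sysctl.conf}
    printf 'net.ipv4.netfilter.ip_conntrack_max = %s\n' "$value" >> "$conf"
}
```

After running `persist_conntrack_max 131072` as root, `sysctl -p` applies the file immediately without waiting for a reboot.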
Maybe Useful Website URLs
http://conntrack-tools.netfilter.org/