A while ago we created a KB article with tips on how to improve your performance with our Kea DHCP server. The tips were fairly obvious to our developers, and the article was pretty successful. We would like to do something similar for BIND: provide a dozen or so tips for maximizing your throughput with BIND. However, as usual, everything is more complicated with BIND.
Can those of you who care about performance, and who have worked to improve it, share the suggestions that have had the most impact? Please also comment if you think any of the ideas below are stupid or dangerous. I have combined advice for resolvers and for authoritative servers; I hope it is clear which is which.
The ideas we have fall into four general categories:
System design
1a) Use a load balancer to specialize your resolvers and maximize your
cache hit ratio. A load balancer is traditionally designed to spread
the traffic out evenly among a pool of servers, but it can also be used
to concentrate related queries on one server to make its cache as hot as possible. For example, if all queries for domains in .info are sent to
one server in a pool, there is a better chance that an answer will be in
the cache there.
1b) If you have a large authoritative system with many servers, consider dedicating some machines to propagate transfers. These machines, called transfer servers, would not answer client queries, but just send
notifies and process IXFR requests.
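As a rough sketch of what 1b might look like in named.conf (the ACL name and addresses here are hypothetical, and your secondaries still need to reach this server for SOA refresh queries when checking serials):

```
// Hypothetical sketch for a dedicated transfer server.
acl xfer-clients { 192.0.2.0/24; };       // example: addresses of your secondaries

options {
    allow-query { xfer-clients; };        // SOA refresh queries from secondaries only
    allow-transfer { xfer-clients; };     // serve IXFR/AXFR to secondaries only
    notify yes;                           // send NOTIFY when zones change
};
```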
1c) Deploy ghost secondaries. If you store copies of authoritative
zones on resolvers (resolvers as undelegated secondaries), you can avoid querying those authoritative zones. The most obvious uses of this would
be mirroring the root zone locally or mirroring your own authoritative
zones on your resolver.
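For the root-zone case, newer BIND versions (9.14 and later, if I recall correctly) have a built-in mirror zone type; a minimal sketch:

```
// Minimal sketch: serve a local, DNSSEC-validated copy of the root
// zone on a resolver (BIND 9.14+). The built-in defaults fetch the
// zone from the root servers, so no masters list is needed for ".".
zone "." {
    type mirror;
};
```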
We have other system design ideas that we suspect would help, but we are not sure, so I will wait to see if anyone suggests them.
OS settings and the system environment
2a) Run on bare metal if possible, not on virtual machines or in the cloud. (Any idea how much difference this makes? The only reference we can cite is pretty out of date: https://indico.dns-oarc.net/event/19/contributions/234/attachments/217/411/DNS_perf_OARC_Apr_14.pdf)
2b) Consider building with --with-tuning=large (https://kb.isc.org/docs/aa-01314). This is a compile-time option, so not something you can switch on and off during production.
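Assuming a 9.12-to-9.16-era source tree (the option spelling may differ in other versions; check ./configure --help for yours), the build step looks roughly like:

```
# Build-time sketch; option spelling assumed from the KB article.
./configure --with-tuning=large
make && make install
```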
2c) Consider which R/W lock implementation you want to use: https://kb.isc.org/docs/choosing-a-read-write-lock-implementation-to-use-with-named
For the highest tested query rates (> 100,000 queries per second), pthreads read-write locks with hyper-threading enabled seem to be the best-performing choice by far.
2d) Pay attention to your choice of NICs. We have found wide variations in their performance. (Can anyone suggest what specifically to look for?)
2e) Make sure your socket send buffers are big enough. (Not sure if this is obsolete advice; do we need to tell people how to tell if their buffers are causing delays?)
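On Linux, the socket-buffer ceilings live in sysctl; a sketch of /etc/sysctl.conf entries (the 8 MB values are purely illustrative, not a recommendation, and on BSD the rough equivalent is kern.ipc.maxsockbuf):

```
# /etc/sysctl.conf sketch (Linux); values are illustrative only.
# Apply with "sysctl -p" after editing.
net.core.rmem_max = 8388608    # largest receive buffer a socket may request
net.core.wmem_max = 8388608    # largest send buffer a socket may request
```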
2f) When the number of CPUs is very large (32 or more), the increase in
UDP listeners may not provide any performance improvement and might
actually reduce throughput slightly due to the overhead of the
additional structures and tasks. We suggest trying different values of
-U to find the optimal one for your production environment.
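For example (the listener count here is arbitrary; the point is to benchmark several values against your own workload):

```
# Sketch: start named with an explicit number of UDP listeners
# per interface; try 4, 8, 16... and measure throughput for each.
named -c /etc/named.conf -U 8
```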
named Features
3a) Minimize logging. Query logging is expensive (can cost you 20% or
more of your throughput) so don’t do it unless you are using the logs
for something. Logging with dnstap is lower impact, but still fairly expensive. Don’t run in debug mode unless necessary.
3b) Use the named.conf option minimal-responses yes; to reduce the amount of work that named needs to do to assemble the query response, as well as reducing the amount of outbound traffic.
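In named.conf this is simply:

```
// named.conf sketch:
options {
    minimal-responses yes;    // omit optional authority/additional records
};
```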
3c) Disable synth-from-dnssec. While this seemed like a good idea, it turns out that in practice it does not improve performance.
3d) Tune your zone transfers (https://kb.isc.org/docs/aa-00726).
When tuning the behavior of the primary, there are several factors that
you can control:
- The rate of notifications of changes to secondary servers (serial-query-rate and notify-delay)
- Limits on concurrent zone transfers (transfers-out, tcp-clients, tcp-listen-queue, reserved-sockets)
- Efficiency/management options (max-transfer-time-out, max-transfer-idle-out, transfer-format)
The most important options to focus on are transfers-out,
serial-query-rate, tcp-clients and tcp-listen-queue.
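A sketch pulling those four options together on the primary; the values are illustrative starting points, not recommendations:

```
// named.conf sketch; values are illustrative only.
options {
    transfers-out 20;         // concurrent outbound zone transfers
    serial-query-rate 50;     // pacing of SOA refresh queries/notifies
    tcp-clients 300;          // concurrent TCP client connections
    tcp-listen-queue 10;      // TCP accept backlog
};
```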
4e) If you use RPZ, consider setting qname-wait-recurse no. We have had issues with RPZ transfers impacting query performance in resolvers. In general, many smaller RPZ zones will transfer faster than a few very large RPZ zones.
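A sketch of the option in context (the zone name is a placeholder):

```
// named.conf sketch; "rpz.example.com" is a hypothetical RPZ zone.
response-policy {
    zone "rpz.example.com";
} qname-wait-recurse no;      // apply policy without waiting for recursion
```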
4f) Consider enabling prefetch on your resolver, unless you are running 9.10 (which is EOL): https://kb.isc.org/docs/aa-01122
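A sketch; the two numbers are the trigger and eligibility TTLs in seconds (these are, I believe, the defaults, so listing them explicitly is mainly documentation):

```
// named.conf sketch: refresh popular records shortly before expiry.
options {
    prefetch 2 9;    // re-fetch when remaining TTL <= 2s,
                     // for records whose original TTL is >= 9s
};
```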
Fix your transport network.
Transport network issues cause BIND to keep retrying, which is a
performance drain.
4a) Disable (in some cases, completely remove, in order to prevent ongoing interference) outbound firewalls/packet-filters (particularly those that maintain state on connections). These are a frequent cause of problems in the DNS and can cause your DNS server to do a lot of extra work.
4b) Set an appropriate MTU for your network. Ensure that your network infrastructure supports EDNS and large UDP responses up to 4096 bytes. Ensure that your network infrastructure allows transit for, and reassembly of, fragmented UDP packets (these will be large query responses if you are DNSSEC signing).
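If your network cannot guarantee fragment transit, the alternative approach (in the spirit of DNS Flag Day 2020) is to cap UDP payload sizes so that fragmentation never happens; a sketch:

```
// named.conf sketch: avoid fragmentation by capping UDP payload sizes.
options {
    edns-udp-size 1232;    // buffer size advertised to servers we query
    max-udp-size 1232;     // largest UDP response we will send
};
```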
4c) Ensure that your network infrastructure allows DNS over TCP.
4d) Check for, and eliminate, any incomplete IPv6 interface set-up. (What can go wrong here is that BIND thinks it can use IPv6 authoritative servers, but the sends actually fail silently, leaving named waiting unnecessarily for responses.)
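If the host genuinely has no working IPv6 path, one blunt workaround is to keep named off IPv6 entirely:

```
# Sketch: run named IPv4-only when the host's IPv6 is known-broken.
named -4 -c /etc/named.conf
```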
Any further suggestions, corrections or warnings are very welcome.
Thank you!
Vicky
---------
Victoria Risk
Product Manager
Internet Systems Consortium
vicky@isc.org
_______________________________________________
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from this list
ISC funds the development of this software with paid support subscriptions. Contact us at https://www.isc.org/contact/ for more information.
bind-users mailing list
bind-users@lists.isc.org https://lists.isc.org/mailman/listinfo/bind-users
3a#1) Do not configure BIND with --enable-querytrace. It is a very verbose debugging facility and severely degrades query performance.
4b#1) Well, isn't the major goal of DNS Flag Day 2020 to eliminate fragmented UDP responses, rather than to support them?
2e#1) Make sure your UDP socket *receive* buffers are big enough. If on BSD, monitor the "dropped due to full socket buffers" count in "netstat -s" output, and tune accordingly. Note that this may be a symptom of mis-tuning of other parts of BIND, causing excessive CPU usage, which may contribute to this problem.