• DNS security, amplification attacks and recursion

    From Michael De Roover@isc@nixmagic.com to bind-users on Tue Jul 7 15:00:13 2020
    From Newsgroup: comp.protocols.dns.bind

    Hello,

    Recently I discussed with a friend of mine the idea of NTP and DNS in
    the context of denial of service attacks. In NTP this amplification
    attack is done with the monlist command (that should honestly never have
    been publicly available due to its purpose being pretty much entirely debugging-related). The DNS version was rather unclear to me however.

    Said friend said to me that he tested my authoritative name servers and
    found them to be not vulnerable. I don't run the latest and greatest of
    BIND at all, I mean it's Debian distribution packages we're talking
    about there... But they were set up to be exclusively authoritative.
    They do not respond to recursive queries. It appears that the test of
    whether a server is "vulnerable" or not has to do with this. The command
    used to test this was apparently "dig +short test.openresolver.com TXT @your.name.server". That's simply a recursive query of what appears to
    be an arbitrary record to me.
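
    To make the "recursive query" part concrete: dig sets the RD (recursion
    desired) flag by default, and an open resolver is one that honours that
    flag for arbitrary clients. The following is a minimal Python sketch, not
    what dig does internally, that builds the same kind of query in RFC 1035
    wire format without EDNS. The function name `build_query` and the query
    ID are illustrative assumptions.

```python
import struct


def build_query(qname: str, qtype: int, recursion_desired: bool = True,
                query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query in RFC 1035 wire format (no EDNS)."""
    # RD is the 0x0100 bit of the 16-bit flags word in the header.
    flags = 0x0100 if recursion_desired else 0x0000
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # Each label is length-prefixed; a zero byte (root label) ends the name.
    name = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in qname.rstrip(".").split(".")
    ) + b"\x00"
    question = name + struct.pack("!HH", qtype, 1)  # QCLASS 1 = IN
    return header + question


TXT = 16  # RR type code for TXT
packet = build_query("test.openresolver.com", TXT)
rd_set = bool(struct.unpack("!H", packet[2:4])[0] & 0x0100)
print(len(packet), rd_set)  # 39 True
```

    The query is only 39 bytes on the wire. Whether the server answers it
    recursively is what the test checks; the record name itself is incidental.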

    This also meant that supposedly the recursive DNS servers from Google, Cloudflare and Quad9 were all considered vulnerable. I find this very
    hard to believe. Authoritative name servers may not need a huge DNS infrastructure for a small-ish zone (say under 1k records), but
    recursors on the scale of Google and Cloudflare in particular (not sure
    how popular Quad9 is so far).. those use massive infrastructure
    including anycast and everything! I'd consider it safe to assume that
    their servers are at least on the order of 100Gbps cumulatively, if not
    more. If these would be vulnerable to amplification attacks just because
    they allow recursion, wouldn't skids be jumping on this like there's no tomorrow? It doesn't make any sense to me.

    This seems to be not very well documented online (or more likely my
    search terms aren't right), so yeah... I wonder why the idea of
    recursion became associated with a vulnerable server in the first place.

    --
    Met vriendelijke groet / Best regards,
    Michael De Roover
    --- Synchronet 3.18a-Linux NewsLink 1.113
  • From Stephane Bortzmeyer@bortzmeyer@nic.fr to Michael De Roover on Tue Jul 7 15:22:16 2020

    On Tue, Jul 07, 2020 at 03:00:13PM +0200,
    Michael De Roover <isc@nixmagic.com> wrote
    a message of 46 lines which said:

    > The command used to test this was apparently "dig +short
    > test.openresolver.com TXT @your.name.server".

    ANY instead of TXT may be more efficient (especially with +dnssec), if
    the goal is to get the maximum amplification. Of course, if the server
    implements RFC 8482, ANY won't help.

    > Authoritative name servers may not need a huge DNS infrastructure
    > for a small-ish zone (say under 1k records), but recursors on the
    > scale of Google and Cloudflare in particular (not sure how popular
    > Quad9 is so far).. those use massive infrastructure including
    > anycast and everything! I'd consider it safe to assume that their
    > servers are at least on the order of 100Gbps cumulatively, if not
    > more.

    This is precisely what makes them dangerous. They are good reflectors
    (good from the point of view of the attacker). On the other hand, they
    typically implement various forms of rate-limiting, and they are
    monitored closely by knowledgeable professionals, so they may not be
    good reflectors after all.

    > If these would be vulnerable to amplification attacks just because
    > they allow recursion,

    They're not vulnerable: this attack works by reflection (just like the
    NTP attack you mentioned), so they are not the potential victims; they
    could be used as helpers.
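
    As a rough illustration of why a reflector is attractive to an attacker,
    the bandwidth amplification factor is simply the response size divided by
    the size of the spoofed query. The sketch below is a minimal Python
    example; the function name and every byte count in it are hypothetical
    figures chosen for illustration, not measurements of any real server.

```python
def amplification_factor(query_bytes: int, response_bytes: int) -> float:
    """Bytes sent to the victim per byte the attacker sends to the reflector."""
    if query_bytes <= 0:
        raise ValueError("query_bytes must be positive")
    return response_bytes / query_bytes


# Hypothetical (query size, response size) pairs in bytes.
examples = {
    "small TXT answer": (39, 120),
    "large ANY answer with DNSSEC": (50, 3000),
}

for label, (query, response) in examples.items():
    print(f"{label}: {amplification_factor(query, response):.1f}x")
```

    The reflector itself only relays traffic; the spoofed source address is
    the one that receives the amplified responses.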



  • From Tony Finch@dot@dotat.at to Michael De Roover on Tue Jul 7 15:06:25 2020

    Michael De Roover <isc@nixmagic.com> wrote:

    > Said friend said to me that he tested my authoritative name servers and
    > found them to be not vulnerable. [snip] They do not respond to recursive
    > queries. It appears that the test of whether a server is "vulnerable" or
    > not has to do with this. The command used to test this was apparently
    > "dig +short test.openresolver.com TXT @your.name.server".

    OK, that is all right and correct, but there is (of course) a bit more to
    this issue.

    As you already know, the most basic thing is to avoid being an open
    recursive server. Out of the box, BIND has a recursion ACL that only
    allows queries from directly connected networks, so you won't have this
    problem without making an explicit configuration mistake. Normally for an
    authoritative-only server, you should set `recursion no` to lock it down
    more tightly.

    An auth-only server can also be used for amplification attacks that use
    its authoritative zones - these attacks don't have to use recursion.
    There are a few ways to mitigate auth-only amplification attacks.

    Response rate limiting is very effective. Start off by putting the
    following in your options{} section, and look in the BIND ARM for other directives you can put in the rate-limit{} section.

    rate-limit {
        responses-per-second 10;
    };
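
    To give a feel for what a per-second response limit does, here is a
    deliberately simplified Python sketch. It is not BIND's RRL algorithm
    (which has further knobs such as a window and slip, described in the BIND
    ARM); it only counts identical responses per client network per whole
    second. The class name, the /24 grouping of IPv4 clients, and the
    fixed-window counting are assumptions made for this illustration.

```python
import ipaddress
from collections import defaultdict


class ResponseRateLimiter:
    """Toy fixed-window limiter: N identical responses per client network per second."""

    def __init__(self, responses_per_second: int = 10, ipv4_prefix_length: int = 24):
        self.limit = responses_per_second
        self.prefix_length = ipv4_prefix_length
        # Counters keyed by (client network, query name, whole second).
        # A real implementation would expire old entries; this sketch does not.
        self.counts = defaultdict(int)

    def allow(self, client_ip: str, qname: str, now: float) -> bool:
        # Group IPv4 clients by network, since spoofed queries typically
        # pretend to come from addresses in the victim's network.
        network = ipaddress.ip_network(f"{client_ip}/{self.prefix_length}", strict=False)
        key = (network, qname.lower(), int(now))
        self.counts[key] += 1
        return self.counts[key] <= self.limit


limiter = ResponseRateLimiter(responses_per_second=10)
# 25 queries for the same name within one second, all from 192.0.2.0/24.
allowed = sum(limiter.allow(f"192.0.2.{i}", "example.com", now=100.0) for i in range(25))
print(allowed)  # 10
```

    A well-behaved caching resolver rarely repeats the same query this often,
    so a limit of this kind mostly affects spoofed floods.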

    Especially if you have DNSSEC signed zones then there are a few extra
    things you can do to reduce the size of your response packets, which
    reduces the attacker's amplification factor, and makes you less likely to
    be abused.

    Set a maximum UDP packet size, to suppress fragmented packets. The DNS
    flag day 2020 campaign will make this a standard setting. For a long time
    I have used:

    max-udp-size 1420;

    https://dnsflagday.net/2020/

    A downside of small UDP responses is more truncated packets and more
    queries over TCP, but there are still more ways to reduce response size
    which also reduce truncation.

    Reduce the size of responses to ANY queries, which are a favourite tool of amplification attacks. There's basically no downside to this one, in my opinion, but I'm biased because I implemented it.

    minimal-any yes;

    You can also reduce the size of other answers. In theory this option might force resolvers to make more queries to get records that by default would appear in the additional section, but I think in practice resolvers make
    these queries anyway because of RFC 2181 trustworthiness logic, and
    because applications (such as SMTP servers) find it easier to query
    directly than use additional records. So on my auth servers I set:

    minimal-responses yes;

    If you are signing zones with DNSSEC, consider doing an algorithm
    rollover to ECDSA p256 (algorithm 13) because this has much smaller
    signatures than RSA. Algorithm rollovers are not particularly easy,
    because you need a good grasp of the DNSSEC key timing parameters and how
    and when to swap over your DS records. (There used to be even more
    gotchas, so it is getting easier, honest!)
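
    The size difference can be estimated from the RRSIG record layout: 18
    bytes of fixed fields, the signer's name, then the signature. An RSA
    signature is as long as the key modulus (256 bytes for a 2048-bit key),
    while an ECDSA P-256 signature is 64 bytes. The Python sketch below does
    that arithmetic; the zone name `example.com` and the 2048-bit RSA key size
    are assumptions for illustration.

```python
RRSIG_FIXED_FIELDS = 18  # type covered, algorithm, labels, TTL, expiry, inception, key tag


def wire_name_length(name: str) -> int:
    """Length of an uncompressed domain name in DNS wire format."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    # One length byte per label, plus the terminating root label.
    return sum(1 + len(label) for label in labels) + 1


def rrsig_rdata_length(signer_name: str, signature_bytes: int) -> int:
    """RDATA length of one RRSIG record for the given signer and signature size."""
    return RRSIG_FIXED_FIELDS + wire_name_length(signer_name) + signature_bytes


RSA_2048_SIGNATURE = 2048 // 8   # signature length equals the modulus length
ECDSA_P256_SIGNATURE = 64        # two 32-byte integers (r and s)

rsa = rrsig_rdata_length("example.com", RSA_2048_SIGNATURE)
ecdsa = rrsig_rdata_length("example.com", ECDSA_P256_SIGNATURE)
print(rsa, ecdsa)  # 287 95
```

    Every signed RRset in a response carries at least one RRSIG, so the
    saving applies several times over in a typical DNSSEC answer.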

    Finally, there's the built-in _bind CHAOS view. This has very strict
    response rate limiting by default, but if you want to be super careful
    you can set `version none` and `hostname none` to lock it down further.
    (I don't bother with this.)

    Here endeth the brain dump.

    Tony.
    --
    f.anthony.n.finch <dot@dotat.at> http://dotat.at/
    Mull of Galloway to Mull of Kintyre including the Firth of Clyde and North Channel: Variable, 2 to 4. Moderate at first near the Mull of Kintyre, otherwise smooth or slight. Showers. Mainly good.
  • From @lbutlr@kremels@kreme.com to bind-users on Tue Jul 7 11:28:18 2020

    On 07 Jul 2020, at 08:06, Tony Finch <dot@dotat.at> wrote:

    Excellent post, and a nice summary of some best practices.
    I have a couple of questions.

    > Response rate limiting is very effective. Start off by putting the
    > following in your options{} section, and look in the BIND ARM for other
    > directives you can put in the rate-limit{} section.
    >
    > rate-limit { responses-per-second 10; };

    Does that apply to local queries as well? For example, a mail server may
    easily make a whole lot of queries to 127.0.0.1, and rate limiting it
    would at the very least affect logging and could delay mail if the MTA
    cannot verify DNS.

    Do these settings also need to be applied to the secondary servers?
    --
    What's another word for Thesaurus?
  • From Tony Finch@dot@dotat.at to @lbutlr on Tue Jul 7 18:58:35 2020

    @lbutlr <kremels@kreme.com> wrote:

    > > rate-limit { responses-per-second 10; };
    >
    > Does that apply to local queries as well (for example, a mail server may
    > easily make a whole lot of queries to 127.0.0.1, and rate limiting it
    > would at the very least affect logging and could delay mail if the MTA
    > cannot verify DNS.

    I don't recommend using response rate limiting on recursive servers.

    The principle behind RRL is that auth servers are queried by resolvers
    with caches, and a correctly-functioning cache will not repeat the same
    query very frequently, so it is reasonable to apply a rate limit on the
    auth servers.

    Recursive servers, on the other hand, are often queried by stub resolvers without caches. The query rate is then entirely driven by the application workload, and you can't apply a rate limit on the recursive server without causing serious trouble for the application.

    It can be especially bad because traditional cacheless stub resolvers are
    not good at error recovery, and when RRL hits, their retry strategy is
    likely to increase the query rate observed by the server, making things
    worse.

    If you are running an oldskool multi-purpose server that is recursive for
    its own daemons but authoritative for others, then you can use the
    `rate-limit { exempt-clients }` option so that RRL doesn't apply to
    recursive clients. But I wouldn't recommend a setup like this. (My auth
    servers have their /etc/resolv.conf pointing at my recursive service.)

    > Do these setting also need to be applied to the secondary servers?

    The settings I described are for public authoritative servers, i.e. ones
    that appear in NS records. These servers can be primary or secondary (but
    are usually secondary).

    Secondary servers don't necessarily appear in NS records, and if they
    don't they are less likely to be exposed to this kind of attack.

    Tony.
    --
    f.anthony.n.finch <dot@dotat.at> http://dotat.at/
    Southeast Iceland: Westerly or southwesterly, 3 to 5, becoming variable 3 or less later in north. Moderate. Showers. Good.
  • From Michael De Roover@isc@nixmagic.com to Tony Finch on Tue Jul 7 20:06:29 2020

    On 7/7/20 4:06 PM, Tony Finch wrote:

    > An auth-only server can also be used for amplification attacks that use
    > its authoritative zones - these attacks don't have to use recursion.
    > There are a few ways to mitigate auth-only amplification attacks.
    >
    > Response rate limiting is very effective. Start off by putting the
    > following in your options{} section, and look in the BIND ARM for other
    > directives you can put in the rate-limit{} section.
    >
    > rate-limit {
    >     responses-per-second 10;
    > };

    That's a really useful option to have, I didn't know about this yet. It
    seems like that could take care of the brunt of amplification attacks
    already. Definitely going to add this in, thanks!

    > Set a maximum UDP packet size, to suppress fragmented packets. The DNS
    > flag day 2020 campaign will make this a standard setting. For a long time
    > I have used:
    >
    > max-udp-size 1420;
    >
    > https://dnsflagday.net/2020/
    >
    > A downside of small UDP responses is more truncated packets and more
    > queries over TCP, but there are still more ways to reduce response size
    > which also reduce truncation.

    Interesting, I wasn't aware of this campaign. I don't know if I'm
    knowledgeable enough on UDP to be able to make educated decisions on
    this myself but I look forward to its eventual release.

    > Reduce the size of responses to ANY queries, which are a favourite tool
    > of amplification attacks. There's basically no downside to this one, in
    > my opinion, but I'm biased because I implemented it.
    >
    > minimal-any yes;

    I've heard of these ANY queries being preferred for amplification
    attacks as well, since the responses are often so large... I don't think
    that there would be any downsides to this either, in fact I've never
    actually seen a legitimate application use it... Probably best to lock
    down indeed.

    > You can also reduce the size of other answers. In theory this option
    > might force resolvers to make more queries to get records that by default
    > would appear in the additional section, but I think in practice resolvers
    > make these queries anyway because of RFC 2181 trustworthiness logic, and
    > because applications (such as SMTP servers) find it easier to query
    > directly than use additional records. So on my auth servers I set:
    >
    > minimal-responses yes;

    Hmm, for the authoritative name servers this might be a good idea yeah..
    Those are authoritative only (i.e. `recursion no`). So for clients
    querying those, the NS records served in the additional section at least should already be known to the client anyway... I mean that's why
    they're there to begin with, so they must already know that information
    from the DNS servers higher up the chain. And another query if needed,
    saves traffic either way I suppose.

    Thanks a lot for the detailed reply, I really appreciate it :)

    --
    Met vriendelijke groet / Best regards,
    Michael De Roover
  • From Brett Delmage@Brett@BrettDelmage.ca to bind-users on Tue Jul 7 14:21:13 2020

    On Tue, 7 Jul 2020, Tony Finch wrote:

    > Reduce the size of responses to ANY queries, which are a favourite tool
    > of amplification attacks. There's basically no downside to this one, in
    > my opinion, but I'm biased because I implemented it.
    >
    > minimal-any yes;

    Why only reduce and not eliminate?

    Can ANY responses be disabled completely with an option?

    This article at Cloudflare
    https://blog.cloudflare.com/deprecating-dns-any-meta-query-type/
    states that they have deprecated it because it wasn't being used. They
    should know! This was posted over 5 years ago, in 2015.

    Brett
  • From Shumon Huque@shuque@gmail.com to Brett Delmage on Tue Jul 7 14:31:26 2020


    On Tue, Jul 7, 2020 at 2:21 PM Brett Delmage <Brett@brettdelmage.ca> wrote:

    > On Tue, 7 Jul 2020, Tony Finch wrote:
    >
    > > Reduce the size of responses to ANY queries, which are a favourite tool
    > > of amplification attacks. There's basically no downside to this one, in
    > > my opinion, but I'm biased because I implemented it.
    > >
    > > minimal-any yes;
    >
    > Why only reduce and not eliminate?
    >
    > Can ANY responses be disabled completely with an option?
    >
    > This article at cloudflare
    > https://blog.cloudflare.com/deprecating-dns-any-meta-query-type/
    > states that they have deprecated it because it wasn't being used. They
    > should know! This was posted over 5 years ago, in 2015.


    Cloudflare themselves now implement the "minimal any" behavior described
    in this spec:

    https://tools.ietf.org/html/rfc8482

    Responding to ANY with NOTIMP, REFUSED, or unknown RCODEs, or not
    responding at all results in undesirable follow-on behaviour from DNS
    resolvers (mostly aggressive retries).

    Shumon.

    ---
    $ dig @ns1.cloudflare.com. cloudflare.com. ANY

    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 54526
    ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

    ;; QUESTION SECTION:
    ;cloudflare.com. IN ANY

    ;; ANSWER SECTION:
    cloudflare.com. 3789 IN HINFO "RFC8482" ""
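
    A response like this can be recognised mechanically. RFC 8482 suggests a
    synthesized HINFO record whose CPU field is "RFC8482" and whose OS field
    is empty. The Python sketch below checks a single answer line in dig's
    presentation format; the function name is hypothetical and the parsing
    only handles a single-line record like the one shown in the dig output.

```python
import shlex


def is_rfc8482_hinfo(answer_line: str) -> bool:
    """Return True if a dig answer line is an RFC 8482 style HINFO record."""
    # shlex keeps the quoted CPU and OS strings as single fields,
    # including the empty OS string.
    fields = shlex.split(answer_line)
    if len(fields) != 6:
        return False
    _owner, _ttl, rr_class, rr_type, cpu, _os = fields
    return rr_class == "IN" and rr_type == "HINFO" and cpu == "RFC8482"


print(is_rfc8482_hinfo('cloudflare.com. 3789 IN HINFO "RFC8482" ""'))  # True
print(is_rfc8482_hinfo("example.com. 300 IN A 192.0.2.1"))             # False
```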

  • From Brett Delmage@Brett@brettdelmage.ca to bind-users on Tue Jul 7 14:42:12 2020


    On Tue, 7 Jul 2020, Shumon Huque wrote:

    > Cloudflare themselves now implement the "minimal any" behavior described
    > in this spec:
    >
    >     https://tools.ietf.org/html/rfc8482
    >
    > cloudflare.com.         3789    IN      HINFO   "RFC8482" ""

    Gee, that's a pretty minimal answer! Thanks.
  • From @lbutlr@kremels@kreme.com to bind-users on Tue Jul 7 14:05:53 2020

    On 07 Jul 2020, at 12:06, Michael De Roover <isc@nixmagic.com> wrote:
    > On 7/7/20 4:06 PM, Tony Finch wrote:
    >
    > > max-udp-size 1420;
    > > https://dnsflagday.net/2020/
    >
    > Interesting, I wasn't aware of this campaign. I don't know if I'm
    > knowledgeable enough on UDP to be able to make educated decisions on
    > this myself but I look forward to its eventual release.

    The URL has a good explanation of this setting and it looks like 1420 is
    a more than adequate packet size.

    From the page:

        An EDNS buffer size of 1232 bytes will avoid fragmentation on nearly
        all current networks. This is based on an MTU of 1280, which is
        required by the IPv6 specification, minus 48 bytes for the IPv6 and
        UDP headers.

    Since 1420 is still well under the MTU on most connections (usually 1500,
    sometimes 1492) and well above the required minimum, I suspect this is
    fine as well. I've gone ahead and added it to my named.conf with a comment
    linking to Tony's message.
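
    The arithmetic behind these numbers is the link MTU minus the IP and UDP
    headers. The minimal Python sketch below reproduces the 1232 figure and
    checks 1420 against the two MTUs mentioned; it assumes an IPv4 header
    without options and an IPv6 header without extension headers, and the
    function name is illustrative.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
IPV6_HEADER = 40  # bytes, fixed header only (no extension headers)
UDP_HEADER = 8    # bytes


def max_unfragmented_payload(mtu: int, ip_header: int) -> int:
    """Largest UDP payload that fits in one packet on a link with this MTU."""
    return mtu - ip_header - UDP_HEADER


# The flag day figure: minimum IPv6 MTU minus IPv6 and UDP headers.
print(max_unfragmented_payload(1280, IPV6_HEADER))  # 1232

# Does a 1420-byte DNS message fit on the common MTUs without fragmentation?
for mtu in (1500, 1492):
    v4 = max_unfragmented_payload(mtu, IPV4_HEADER)
    v6 = max_unfragmented_payload(mtu, IPV6_HEADER)
    print(mtu, v4, v6, 1420 <= min(v4, v6))
```

    On a path whose MTU is below 1468 bytes (for example through some
    tunnels), a 1420-byte response over IPv6 would still fragment, which is
    the case the more conservative 1232 value is meant to cover.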
    --
    "Are you pondering what I'm pondering?"
    "I think so, Mr. Brain, but if the sun'll come out tomorrow, what's
    it doing right now?"
  • From Tony Finch@dot@dotat.at to Brett Delmage on Tue Jul 7 21:31:04 2020

    Brett Delmage <Brett@BrettDelmage.ca> wrote:
    > On Tue, 7 Jul 2020, Tony Finch wrote:
    >
    > > minimal-any yes;
    >
    > Why only reduce and not eliminate?

    The reason is a bit subtle. If an ANY query comes via a recursive
    resolver, it is much better to give the resolver an answer so that it will
    put an entry in its cache. The cache entry will stop more ANY queries from being sent from the resolver to the upstream auth server, as long as its
    TTL lasts.

    If the auth server does not answer, or sends a REFUSED error, the resolver
    is likely to retry, which increases worthless traffic rather than
    suppressing it, and the resolver may decide the auth server is lame which
    will cause knock-on problems for legitimate queries.

    There are some scenarios where reflection attacks go through multiple
    servers. If you can get cache entries into those servers then the
    attack traffic gets suppressed closer to its source. There have been quite
    a lot of attacks that work like this:

    * an ISP has a huge number of customers with crappy home routers, that
    can act as open recursive resolvers

    * an arsehole decides to use these crappy home routers in a reflection /
    amplification DDoS attack

    * the crappy home routers forward the attack queries to their ISP's
    recursive servers; these recursive servers are legitimate and well
    configured but suffer from bad client devices

    * the recursive servers resolve the queries against some third party
    authoritative servers

    If the recursive servers cache the responses, then the auth servers should
    not be much affected by the attack: most of the traffic is answered from the ISP caches, and maybe the home router caches if they have them.

    But if the auth servers don't answer, or send REFUSED errors, then the recursive servers are going to keep retrying queries, and thereby relay a
    very large proportion of the attack traffic to the auth servers. Sadness
    will follow.
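
    The difference between the two cases can be put into rough numbers. The
    Python sketch below is a back-of-the-envelope model, not a measurement:
    with a cacheable answer each resolver asks the auth server about once per
    TTL, whereas without one every attack query is relayed, multiplied by the
    resolver's retries. All figures (attack duration, TTL, resolver count,
    query rate, attempts per query) are hypothetical.

```python
def upstream_queries_with_cache(attack_seconds: int, ttl_seconds: int,
                                resolvers: int) -> int:
    """Auth-server queries when each resolver caches the answer for one TTL."""
    refreshes = -(-attack_seconds // ttl_seconds)  # ceiling division
    return resolvers * refreshes


def upstream_queries_without_cache(attack_seconds: int, qps_per_resolver: int,
                                   resolvers: int, attempts_per_query: int) -> int:
    """Auth-server queries when nothing is cached and resolvers retry failures."""
    return attack_seconds * qps_per_resolver * resolvers * attempts_per_query


# Hypothetical attack: 10 minutes, relayed through 100 ISP resolvers.
cached = upstream_queries_with_cache(attack_seconds=600, ttl_seconds=300, resolvers=100)
uncached = upstream_queries_without_cache(attack_seconds=600, qps_per_resolver=50,
                                          resolvers=100, attempts_per_query=3)
print(cached, uncached)  # 200 9000000
```

    Under these assumptions a small cacheable answer cuts the load on the
    auth servers by several orders of magnitude.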

    Note that RRL does not help in this scenario, because from the auth
    server's point of view the ISP resolvers are legitimate clients, which RRL
    can observe from their retry behaviour. RRL is designed for attacks where
    the spoofed queries go direct to the auth server, which is not happening
    in this case.

    When this happened to us (when my servers were the third party auth
    servers) the DDoS attack was hitting a very large number of ISPs, so our
    auth servers were getting ANY queries via huge numbers of recursive
    servers. Extra unfortunately, the ANY response was too big to fit in UDP,
    so all the resolvers were trying to query over TCP. And our auth servers
    did not have enough TCP capacity to handle the load. Much sadness. (It
    didn't take us offline because our off-site auth servers were differently configured and able to keep answering.)

    So I implemented minimal-any to stop it from happening again.

    Tony.
    --
    f.anthony.n.finch <dot@dotat.at> http://dotat.at/
    Fisher, German Bight: Westerly veering northwesterly 4 to 6, decreasing 3
    later in south German Bight. Moderate, occasionally rough at first. Mainly fair. Mainly good.
  • From Brett Delmage@Brett@BrettDelmage.ca to Tony Finch on Tue Jul 7 18:17:33 2020

    On Tue, 7 Jul 2020, Tony Finch wrote:

    > Brett Delmage <Brett@BrettDelmage.ca> wrote:
    > > On Tue, 7 Jul 2020, Tony Finch wrote:
    > >
    > > > minimal-any yes;
    > >
    > > Why only reduce and not eliminate?
    >
    > The reason is a bit subtle. If an ANY query comes via a recursive
    > resolver, it is much better to give the resolver an answer so that it
    > will put an entry in its cache...

    This is a very interesting and clear explanation. Thanks for taking the
    time to share this, Tony. TIL :-)

    Brett
