Authors: Jason Goertzen, David Joseph, Peter Thomassen
BIND & liboqs: A PQC DNSSEC Field Study
The DNS protocol is the internet’s phonebook. Send off a domain name, and receive an IP address, to which all further communications can be sent. Validating the authenticity of that IP address requires cryptography, however, and this is where DNSSEC comes in. But the cryptography used to sign domain name/IP address pairs – known as records – will be vulnerable to quantum computers once they are available, and thus faces the same complex transition to post-quantum cryptography (PQC) as the rest of the internet.
PQC is almost ready
Even if estimates for Q-day (potentially within a decade or two) might seem optimistic, rolling out new DNSSEC signing schemes is a complex and lengthy process. The PQC algorithms themselves are on the cusp of being standardized by NIST, after an 8-year process, and folks around IETF are just now starting to research how to ready DNSSEC for the quantum threat. These new PQC algorithms have much larger signatures and keys which could cause delivery issues if the DNS protocol is not adapted for these needs. Clearly we need to investigate …
Let’s test it out in DNSSEC
For the past year, deSEC and SandboxAQ have been evaluating how PQC would behave if it were slotted in with no additional protocol changes. This involved deploying zones – portions of the DNS namespace managed by a specific organization – signed with Falcon, Dilithium, SPHINCS+, as well as the stateful hash-based signature scheme XMSS. In order to deploy these zones, we modified BIND 9 and PowerDNS to support all of these algorithms. In this post we’ll look at BIND 9, noting that the modifications to both implementations are rather similar.
We used liboqs with an OpenSSL 3 provider
In order to add support for the PQC algorithms, we took advantage of the open source project Open Quantum Safe. Open Quantum Safe provides a library called liboqs which contains the PQC algorithms selected by NIST for standardization. The project also offers an OpenSSL 3 provider (“oqs-provider”) which exposes those algorithms via the OpenSSL API. We ultimately used oqs-provider as the other DNSSEC algorithms were already using the OpenSSL API. Once one of the algorithms was added, supporting others was easy. We’ve published the source code on GitHub.
Stateful hash-based signatures
Stateful hash-based signatures are a special type of PQC, already standardized. In contrast to the other signing schemes we tested, signers have to carefully track a “state” (in essence, a counter), whose value must change with every invocation so as to not break security. In principle, they’ll work well for some applications, including DNSSEC, but the complexities of tracking state make them impractical for use cases like load balancers signing TLS transcripts. In order to test these types of signatures, we concentrated on the performance impact, and did not worry too much about managing this state correctly. (In other words, we allowed ourselves to insecurely reuse state, and the signatures implemented this way do not provide authenticity; however, they are perfectly suitable for investigating protocol-related performance aspects like deliverability.)
Liboqs is also in the process of adding support for stateful hash-based signatures. This study is the first project that is utilizing the new stateful signatures API. However, since this API is so new, oqs-provider does not support stateful signatures. This means that we had to use the low level API provided by liboqs instead. The API worked as expected, so it wasn’t too much effort to add XMSS and XMSSMT directly to the software.
Key generation did require special treatment, however, so we made a small change. Stateful hash-based signature schemes can generate a limited number of signatures, and zone operators would likely want to be able to pick the appropriate parameter set for their zone size. Both for XMSS and XMSSMT there are a plethora of parameter sets. The more signatures you wish to allow, the larger they become. In practice, it’s likely that operators of smaller zones would prefer to use parameter sets that support fewer signatures and benefit from smaller signature sizes. However, if a parameter set was standardized and the number of signatures it generated was too small, it would prevent operators of large zones from deploying XMSS or XMSSMT.
It would not be reasonable to have a DNSSEC algorithm number standardized for each parameter set (12 sets for XMSS and 16 for XMSSMT). Therefore, we took a similar approach to RSA modulus sizes: “Support all of the parameter sets of XMSS and XMSSMT, or you aren’t compliant”. Our implementation differentiates between parameter sets based on information stored within the DNSKEY. The nice thing is that liboqs does this for us by hiding an internal OID into both the secret and public keys, requiring only the introduction of the corresponding mappings to translate between the XMSS/XMSSMT parameter sets and DNSSEC algorithm numbers without losing the specific parameter details.
We modded BIND 9 to support parameter sets
In BIND 9 we create a custom function that takes the passed-in algorithm name. If it’s a type of XMSS or XMSSMT, it converts the algorithm name to the generic XMSS or XMSSMT name and then returns the internal OID that specifies the intended parameter set. This OID then gets passed to the key generation function. We had to add an extra XMSS specific header file in order to achieve this, but this can likely be abstracted away if the dst_func_t struct can be modified with a “parameter init” function.
Experimental setup
In order to evaluate how these quantum-resistant signature algorithms perform in real world DNS, we deployed several zones, each one signed by one classical or quantum-resistant algorithm. The BIND-operated test zones were set up with a KSK/ZSK presigned scheme (in contrast to the PowerDNS zones which used a CSK setup with dynamic signing).
We then deployed experiments across the RIPE ATLAS network. This is a network of roughly 10,000 small nodes distributed across the globe, which researchers can use to run various kinds of network-related measurements.
To run DNSSEC experiments, we used a BIND nameserver (version 9.19.7, plus our modifications) which was located in the United States, and clients from across the RIPE ATLAS network submitted DNS requests using their locally configured resolver(s), receiving PQC-enabled responses. We then looked into how many responses arrived and what their properties were, depending on algorithm, transport and the like.
Results
We find that in general, measurement results are quite noisy. We therefore applied pre-filtering, ignoring individual results from probe-resolver combinations that answered incorrectly for the zone signed with algorithm 8 (RSASHA256). (A correct response required the expected A record, or NXDOMAIN, depending on the case.) The pre-filter did not include a requirement for the AD bit, i.e., resolvers not performing validation were acceptable as long as the response appeared correct in terms of return code and record content.
The following two figures show results from BIND-operated zones for an existing name (Fig. 1) and a non-existing name (Fig. 2), with the aforementioned pre-filter applied. The two quadrants at the top are for probes querying their resolver via UDP, and the ones at the bottom via TCP. Only for queries on the right, the DO bit was set.
Figure 1: Results for an existing name.
Figure 2: Results for a non-existing name. NSEC3 is indicated by a “3” suffix; otherwise NSEC is used.
As can be seen, nearly all resolvers appear to be compatible with algorithms 8, 13, and 15, independently of transport and DO bit, for both NSEC and NSEC3 setups. This is no surprise for algorithm 8 (due to the pre-selection), and also expected for algorithm 13 (which some TLDs have deployed, presumably after verifying interoperability). For algorithm 15, we were less certain ahead of time, but it’s good to see that it does not appear to cause problems. (For numbers on actual validation support, see Fig. 4 below.)
Regarding PQC, it can be clearly seen that response delivery rates go down significantly as response sizes increase. This most prominently applies along the algorithm dimension, where the largest packet involved in a resolution happens to correlate with the algorithm numbers we chose – as a result, SERVFAIL responses become more frequent for SPHINCS+ and XMSS-type algorithms than for Dilithium2. Falcon, having the smallest signatures, appears to have the fewest issues.
It appears that many resolvers don’t obtain DNSSEC records by default (perhaps because validation is turned off, or because the zone’s DS record already indicates an unsupported algorithm). Only when setting the DO bit to request the resolver to obtain these records do larger packets occur, and deliverability problems might get worse. Whether that really is the case depends on whether the increase in packet size is such that a particular deliverability barrier (such as MTU) is exceeded. It appears that this mostly affects SPHINCS+ and XMSS via TCP, nullifying the benefit that those algorithms otherwise enjoy from using this transport when the DO bit is absent.
Similar considerations apply to NSEC vs. NSEC3, where transmission boundaries may be crossed depending on the case. In the scenarios we evaluated, this does not appear to have significant effects, except in the case of Dilithium2, where the SERVFAIL rate goes from 35.8% to 43.7% when using NSEC3.
Taking a slightly different perspective, Fig. 3 shows the frequency of correct (0) vs. incorrect (1) responses (for an existing name). While correct fractions for conventional algorithms are very close to 100% in all cases, PQC algorithms again show dependency on packet sizes as mediated by the choice of transport and DO bit, and in line with expectations. In particular, when the DO bit is not set, going from UDP to TCP increases success rates from around 70% to around 80% (and 90% in case of Falcon). A similar trend is observed when the DO bit is present, except that overall numbers are lower, going from 50% to around 70–75% when going from UDP to TCP. Exceptions are again Falcon (performing better, around 90%) and XMSS (performing a little worse than others).
Figure 3: Fraction of correct responses (for an existing name).
A surprising result was the AD bit distribution in responses (Fig. 4). First, it can be seen that between 33% and 46% of probe–resolver pairs claim DNSSEC validation for conventional algorithms. This is in line with what’s expected from DNSSEC validation deployment studies.
More interestingly, we are pretty certain that no probe was connected to a resolver that is able to validate signatures from our PQC implementations. Still, when the DO bit was set, responses from about 8.5% of probe–resolver pairs indicated for the Falcon algorithm that signatures had been validated. This appears to be clearly broken, and perhaps deserves a deeper look into what resolver software produces such responses. Curiously, the problem goes away when using TCP.
Figure 4: AD bit (for a non-existing name).
Testbed
If you’d like to play around with our test zones, you can do so using our test bench at https://pq-dnssec.dedyn.io/.
In addition to our signing implementation for BIND authoritative nameserver (available at bind9.pq-dnssec.dedyn.io:53), we have also implemented validation for the resolver function (running on port 5304).
You can query this PQC-enabled BIND resolver directly from the above page (using a DoH proxy), and also test interoperation by selecting PowerDNS Recursor to validate BIND signatures or vice versa. As PowerDNS uses a different setup (CSK instead of KSK/ZSK, and dynamic NSEC3), results may vary slightly across scenarios.
In case you want to use the command line, the page also has information on which exactly are the zones to query, accompanied by links to source code and additional measurement results.
Conclusion
All in all, our measurements indicate that the PQC signing schemes investigated are not well-suited for use in the existing DNS ecosystem, with the potential exception of Falcon (which might or might not withstand cryptographers’ scrutiny). Some differences exist between configurations (such as whether a combined key is used or ZSK/KSK are split), but appear to have only gradual impact without qualitatively changing the picture.
These findings confirm the suspicion that DNSSEC is not going to work well with PQC unless further changes are made. These changes might be in terms of improving signing schemes so that they do fit the ecosystem’s constraints; alternatively, one may try to change message payloads in a way that is compatible with existing constraints, but at the same time evades overly large messages (e.g., by splitting keys). It is not clear what’s the best approach, and thus more research is needed.
This project was funded by NLnet Foundation and supported by SSE.