>On OmniOS, all the DNS queries (apparently 58) received a response. On
>HardenedBSD, only the first 4 queries received a response, the next 18
>timed out. They were retried 4 additional times, as expected, again
>timing out without receiving a response.
The fd of the async pipe to the client isn't the same in both outputs:
it's 9 on OmniOS and 10 on HardenedBSD, which means the client uses one
more fd on HardenedBSD for some reason. (Does OmniOS support signalfd()?
That would explain it.)
On HardenedBSD, 4 queries received responses, which were properly
reported to the client. The other 18 were pending and retried with
longer timeouts, but only 6 of them reported a full timeout to the
client. The client exited while 12 queries were technically still in
flight.
On OmniOS, I can't even make sense of some of the strings, typically
in the async responses to the client. What is the endianness of this
machine? A network byte order 32-bit number equal to 3 seems to be
encoded as { 0, 0, 3, 0 }, which doesn't look right. (I did check my
uint32_bswap() primitive.) If the client isn't complaining very loudly
when it receives such strings, it means the strings are correct and the
truss tool displays them incorrectly, which doesn't help me diagnose
what's going on.
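
For comparison, here is a minimal sketch of the encoding one would
expect for a 32-bit number in network byte order. The helper names
(pack_be32, unpack_be32, format_bytes) are hypothetical and only for
illustration; they are not the skalibs primitives. With this encoding,
3 comes out as { 0, 0, 0, 3 } on any host, whereas { 0, 0, 3, 0 } would
decode to 768.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not the skalibs API: store u in network byte
   order (most significant byte first), whatever the host endianness. */
static void pack_be32 (unsigned char *s, uint32_t u)
{
  s[0] = (unsigned char)(u >> 24) ;
  s[1] = (unsigned char)(u >> 16) ;
  s[2] = (unsigned char)(u >> 8) ;
  s[3] = (unsigned char)u ;
}

/* Inverse operation: read 4 bytes as a big-endian 32-bit number. */
static uint32_t unpack_be32 (unsigned char const *s)
{
  return ((uint32_t)s[0] << 24) | ((uint32_t)s[1] << 16)
       | ((uint32_t)s[2] << 8) | (uint32_t)s[3] ;
}

/* Render 4 bytes as "{ a, b, c, d }", the notation used above. */
static int format_bytes (char *buf, size_t max, unsigned char const *s)
{
  return snprintf(buf, max, "{ %u, %u, %u, %u }",
                  (unsigned int)s[0], (unsigned int)s[1],
                  (unsigned int)s[2], (unsigned int)s[3]) ;
}
```

Calling pack_be32(s, 3) and then format_bytes() on s gives the byte
sequence to compare against what truss displays.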
In any case, the problems look unrelated to skadnsd and come from the
interaction between the s6-dns library and the caches. Either the
packets are correct and the caches are not sending the responses they
should, which is not an s6-dns problem; or the packets are malformed,
which is why the servers are ignoring them, and I need to fix that.
Amelia, could you do some tests (with the same caches) from s6-dns
command-line clients such as s6-dnsip4? That will bypass the skadns
layer, and will be easier to trace and understand. Thanks :)
--
Laurent
Received on Mon Oct 10 2022 - 13:12:55 CEST