Overview

    This document is likely to be out of date in a number of areas...

    This manual attempts to cover things that don't logically fit in the man
    pages. It covers the big-picture items and overall design It's a cross
    between a manual and a design document at the moment, although perhaps
    those functions can be split up at a later date.

    This manual is not intended to be an exhaustive reference. For a
    complete rundown of every configuration and commandline option and its
    precise technical meaning, see the man pages.

Portability Testing

    This was done during the run-up to 1.2.0, and hasn't been re-tested by
    me since. There's a good chance that some platforms have become broken
    since this, especially ones with non-gcc-compatible compilers.

    I've personally run through the process of, basically: "./configure &&
    make check && sudo make install && make installcheck" on the following
    platforms with the following compilers. On all of them, I've also
    manually started the daemon as root and seen it produce correct syslog
    output, appear to secure itself, answer a few manual queries, etc.

    Of course most platforms required some prep work to get the testsuite
    dependencies installed, but it wasn't too hard on any of them.

      x86_64:
        Linux - Fedora 12: gcc 4.4.3
        Linux - Fedora 13: gcc 4.4.4
        Linux - Debian Etch: gcc 4.1.2
        MacOS X Snow Leopard: gcc 4.2.1, llvm-gcc 4.2.1, clang 1.0.2

      x86:
        Linux - Fedora 13: gcc 4.4.4
        FreeBSD 8.0: gcc 4.2.1
        FreeBSD 7.2: gcc 4.2.1
        FreeBSD 6.4: gcc 3.4.6
        OpenBSD 4.4: gcc 3.3.5
        MacOS X Snow Leopard: gcc 4.0.1
        OpenSolaris 2009.06: gcc 3.4.3, gcc 4.3.2, Sun Studio C Compiler

      mips32r2 (Big Endian) (Atheros 9132 - Buffalo WZR-HP-G300NH Router):
        Linux - OpenWRT Backfire: gcc 4.3.3 + uClibc 0.9.30.1
          (cross compiled from x86_64 build host using OpenWRT's toolchain)
          (also, I wasn't able to run the testsuite here (vendor perl
          broken), but I did do some simple manual testing)

    I think this covers a fair swath of the field of modern-ish POSIXy
    systems. The Linux and Mac tests were on real hosts, and the rest were
    done in VMWare using standard images found via Google. Given what I've
    seen so far, I expect there will be more minor issues when people try
    gdnsd on platforms other than those listed. Please send bug reports, I
    doubt the issues will be hard to solve at this point. And if you try a
    significant variation from the above and it works without a hitch,
    please let me know so I can add it to the list of tested systems.

    Most current development/testing is done on latest available releases of
    MacOS X, Fedora Linux, and/or Ubuntu Linux.

Overall Design

  Configuration

    The configuration file's basic syntax is handled by "vscf", which parses
    a simple and clean configuration syntax with arbitrary structural depth
    in the form of arrays and hashes. At one time this was a separate
    library, but it has been bundled back into gdnsd's distribution at this
    point. Details of the configuration options are in the man page
    gdnsd.config(5).

  Threading

    The gdnsd daemon uses pthreads to maximize performance and efficiency,
    but they don't contend with each other on locks at runtime, and no more
    than one thread writes to any shared memory location. Thread-local
    writable memory is malloc()'d within the writing thread and the address
    is private to the thread.

    The "main" thread handles pretty much everything but the actual
    processing of DNS requests. It handles monitoring I/O in the case of
    plugins using the built-in monitoring, the http server that serves stats
    data over port 3506 (reconfigurable), and the signal handlers for daemon
    shutdown and toggling runtime logging of packet errors.

    Currently all of the lengthy startup operations that precede serving DNS
    requests, such as the parsing of zonefiles and the post-processing of
    zone data, occur serially in the main thread as well. In the future we
    may spawn some short-lived worker threads to speed this up, but
    ultimately they would be joined at a synchronization point before
    spawning the threads that serve DNS requests in the long term.

    There are N threads dedicated to handling UDP DNS requests, one per
    listening socket. These UDP threads do nothing but sit in a tight
    blocking loop on the listening socket until they are terminated. They
    run with all signals blocked, and they make no expensive system calls
    (other than necessary socket read/write traffic) or heap allocations at
    runtime. All necessary heap data/buffers are pre-allocated before these
    threads begin serving requests.

    The scaling of UDP DNS was looked into extensively on Linux. Adding more
    threads per socket never works as the socket itself is a serial, locked
    resource for the operating system and our DNS code is way faster than
    the socket operations themselves.

    To scale out over multiple cores, if you find that your network
    interface(s) can pump more data than gdnsd can handle in a single
    socket/thread, your best scaling option is to have gdnsd listen on
    several independent sockets (port numbers and/or IPs). Then either use a
    hardware loadbalancer or Linux's own ipvsadm software to balance the
    client requests across the several sockets. With ipvsadm you can do it
    all in software on the same machine gdnsd is running on, with the
    inbound port 53 traffic spread across N local ports gdnsd is listening
    on.

    Few people will face these scaling limits in practice, and the few that
    do shouldn't have any problem figuring out ipvsadm, etc.

    TCP DNS is threaded and scaled in the same manner. It could have been
    done differently, but in any sane DNS configuration UDP scaling issues
    should dwarf TCP ones anyways, and this keeps things simpler.

    Each of the TCP DNS threads listens on one socket, and uses a libev
    event loop to handle many connections in parallel. They do allocate heap
    dynamically at runtime, and you can limit the count of parallel requests
    per thread with "tcp_clients_per_socket" (default 128). libev priorities
    are used to prioritize finishing current requests over accepting new
    ones when both events are available in the same loop wakeup, so this
    should help keep the parallelism number lower in general.

  Statistics

    The DNS threads keep reasonably detailed statistical counters of all of
    their activity. The core dns request handling code that both the TCP and
    UDP threads use tracks counters for all response types. Mostly these
    counters are named for the corresponding DNS response codes (RCODEs):

    refused
        Request was refused by the server because the server is not
        authoritative for the queried name.

    nxdomain
        Request was for a non-existant domainname. In other words, a name
        the daemon is authoritative for, but which does not exist in the
        database.

    notimp
        Requested service not implemented by this daemon, such as zone
        transfer requests.

    badvers
        Request had an EDNS OPT RR with a version higher than zero, which
        this daemon does not support (at the time of this writing, such a
        version doesn't even exist).

    dropped
        Request was so horribly malformed that we didn't even bother to
        respond (too short to contain a valid header, unparseable question
        section, QR (Query Response) bit set in a supposed question, TC bit
        set, illegal domainname encoding, etc, etc).

    noerror
        Request did not have any of the above problems.

    v6  Request was from an IPv6 client. This one isn't RCODE based, and is
        orthogonal to all other counts above.

    edns
        Request contained an EDNS OPT-RR. Not RCODE-based, so again
        orthogonal to the RCODE-based totals above. Includes the ones that
        generated badvers RCODEs.

    edns_client_subnet
        Subset of the above which specified the edns_client_subnet option.

    The UDP thread(s) keep the following statistics at their own level of
    processing:

    udp_reqs
        Total count of UDP requests received and passed on to the core DNS
        request handling code (this is synthesized by summing all of the
        RCODE-based stat counters above for the UDP threads).

    udp_recvfail
        Count of UDP recvmsg() errors, where the OS indicated that something
        bad happened on receive. Obviously, we don't even get these
        requests, so they can't be processed and replied to.

    udp_sendfail
        Count of UDP "sendmsg()" errors, which almost definitely resulted in
        dropped responses from the client's point of view.

        Note that a common cause for these I've seen on Linux is requests
        with a source port of zero. For some reason Linux will pass these to
        us on the input side, but then the "sendmsg()" call to send a
        response back to port zero immediately fails due to the bad port
        number.

    udp_tc
        Non-EDNS (traditional 512-byte) UDP responses that were truncated
        with the TC bit set.

    udp_edns_big
        Subset of udp_edns where the response was greater than 512 bytes (in
        other words, EDNS actually did something for you size-wise)

    udp_edns_tc
        Subset of udp_edns where the response was truncated and the TC bit
        set, meaning that the client's specified edns buffer size was too
        small for the data requested.

    The TCP threads also count this stuff:

    tcp_reqs
        Total count of TCP requests (again, synthesized by summing the
        RCODE-based stats for only TCP threads).

    tcp_recvfail
        Count of abnormal failures in recv() on a DNS TCP socket, including
        ones where the sender indicated a payload larger than we're willing
        to accept.

    tcp_sendfail
        Count of abnormal failures in send() on a DNS TCP socket.

    These statistics are tracked in per-thread structures. The actual data
    slots are uintptr_t, which helps with rollover on 64-bit machines.

    The main thread reports the statistics in two different ways. The first
    is via syslog every log_stats seconds (default 3600), as well as always
    at exit time. The other is via an embedded HTTP server which listens by
    default on port 3506. The HTTP server can give the data in both html
    (for humans) and csv (for monitoring tools) formats. All of the stats
    reporting code is in statio.c.

  Truncation Handling and other related things

    gdnsd's truncation handling follows the simplest valid set of truncation
    rules. That is: it drops whole RR sets (without setting the TC bit) in
    the case of being unable to fit all the desirable additional records
    into the Additional section, and in the case that Answer or Authority
    records don't fit, it returns an empty (other than perhaps an EDNS OPT
    RR) packet with the TC bit set. The space for the EDNS OPT RR is
    reserved from the start when applicable, so it will never be elided to
    make room for other records. Nameserver address RRs for delegation glue
    are considered part of the required set above (i.e. if they don't fit,
    the whole packet will be truncated w/ TC, even though they go in the
    Additional section).

    Also, by default, unnecessary lists of NS records in the Authority
    section are left out completely (they're really only necessary for
    delegation responses). This behavior can be reversed (to always send
    appropriate NS records even when not strictly necessary) via the
    include_optional_ns option.

Security

    Any public-facing network daemon has to consider security issues. While
    the potential will always exist for gdnsd to contain stupid buffer
    overflow bugs and the like, I believe the code to be reasonable secure.

    I regularly audit the code as best I can, both manually and with tools
    like valgrind, to look for stupid memory bugs. Another point in its
    favor is the fact that, being a purely authoritative server, gdnsd has
    no reason to believe anything anyone else on the network has to say
    about anything. This eliminates entire classes of attacks related to
    poisoning and the like. gdnsd never sends DNS queries (even indirectly
    via gethostbyname()) to anyone else. It's a DNS server, not a DNS
    client.

    Perhaps more importantly, gdnsd doesn't trust itself to be root on your
    machine. Any time gdnsd is started as root, it will drop privileges to
    those of the user named "gdnsd" (configurable). It will optionally also
    chroot into a chroot directory specified via the "-d" argument (or
    defaulted at build time), but the default default is to use system
    pathnames and not do chroot.

    If any security-related operation fails, the daemon will fail to start
    itself and abort with a log message indicating the problem.

    While this doesn't erase other security concerns, it certainly helps in
    minimizing the potential impact of any future remote exploit that may be
    discovered in the gdnsd code.

    If your host supports randomized address layout for executables (-fPIE
    in gcc terms), gdnsd can be built as such for additional security. This
    is not the default, but you can simply add it to your CFLAGS when
    building.

Copyright and License

    Copyright (c) 2012 Brandon L Black <blblack@gmail.com>

    This file is part of gdnsd.

    gdnsd is free software: you can redistribute it and/or modify it under
    the terms of the GNU General Public License as published by the Free
    Software Foundation, either version 3 of the License, or (at your
    option) any later version.

    gdnsd is distributed in the hope that it will be useful, but WITHOUT ANY
    WARRANTY; without even the implied warranty of MERCHANTABILITY or
    FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
    more details.

    You should have received a copy of the GNU General Public License along
    with gdnsd. If not, see <http://www.gnu.org/licenses/>.

