[bind10-dev] Planning for next sprint - input required

Thu Nov 25 18:25:41 UTC 2010

As mentioned in the last sprint planning meetings, I would like us to start thinking about what we are going to do in the next sprint when this one finishes.  The basic user stories are fixed for the A-Team (performance improvements) and R-team (working recursor), but within those stories we do have a bit of flexibility as to the tasks we tackle.

The following list is a set of tasks that I've put together from these stories.  It is by no means complete and not all tasks are equally important.  What I hope it will do is to start a discussion as to what we need to do over the next few months (and in particular the next few weeks) and allow us to prioritise (and estimate) the tasks appropriately so that we can reach our year 2 goals.

In the short term though, the six-week release cycle is coming to an end next week.  At the planning meetings next week, we need to assess where we are and what we achieved, decide what our goals are for the following cycle and estimate/select the tasks appropriately.

So please, come back with additions/deletions to the task list, assessment of priority and any comments you care to make.  Can you also start giving thought as to the tasks we should focus on for the next sprint?

Thanks

Stephen

General/Common
==============
This section covers tasks that are common to both the A- and R-teams.

General Requirements
* Working through DNS RFCs and identifying MUST/SHOULD etc requirements
* Working through BIND-9 configuration options and identifying specific features
* Working through BIND-9 bug/enhancement list and identifying specific features

Notes: the authoritative and recursive servers are subject to a large number of RFCs, many of which have explicit RFC2119 (SHOULD, MUST etc.) requirements and others which indicate best current practice.  In addition, a number of features have been added to BIND over the years (either as enhancements, as a result of bug reports or at customer request) that should be incorporated into BIND10.  (A partial list of BIND-9 features can be found at http://bind10.isc.org/wiki/BIND9Features) The result is that at some point we will need to do a check for completeness and feed that into the system test.

Logging Framework
* API Requirements+Design, e.g.
  - Facilities (auth, recurse, xfrout etc.)
  - Subcomponents (e.g. cache, query logic, validation etc.)
  - Levels (info, warning, debug)
  - Logging destinations
    - syslog, file, socket, combination
    - per-destination filters?
* C++/Python Implementation 

Notes: this was raised at the last face to face.  BIND-10 needs a comprehensive logging framework.  The longer we put this off, the more code we will have to re-factor to include it.  It therefore makes sense to do this early and to start adding it to code as we write it.
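
To make the requirements list above a little more concrete, here is a minimal Python sketch of how facilities, subcomponents, levels and per-destination filters could fit together.  It uses the standard library "logging" module purely for illustration; the "bind10.<facility>.<subcomponent>" naming scheme and the two helper functions are assumptions, not an agreed design.

```python
import logging


def make_logger(facility, subcomponent):
    """Return a logger named bind10.<facility>.<subcomponent> (hypothetical scheme)."""
    return logging.getLogger("bind10.%s.%s" % (facility, subcomponent))


def add_destination(stream, min_level, facility=None):
    """Attach one destination with its own level and optional facility filter."""
    handler = logging.StreamHandler(stream)
    handler.setLevel(min_level)
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
    if facility is not None:
        # logging.Filter passes only records from this logger name and its children.
        handler.addFilter(logging.Filter("bind10.%s" % facility))
    top = logging.getLogger("bind10")
    top.setLevel(logging.DEBUG)  # let each destination decide what to drop
    top.addHandler(handler)
    return handler
```

A syslog or file destination would follow the same pattern using logging.handlers.SysLogHandler or logging.FileHandler in place of the stream handler.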

DNSSEC Key Interface
* HSM interface
  - Key generation
  - Key addition/removal/listing
* File interface
  - Key generation
  - Key addition/removal/listing

Notes: many top-level domains use HSMs to store their keys and to sign/check the signatures.  Existing BIND code uses key files.  Ideally we want a single abstract key store to which a variety of storage mechanisms can be connected.  PKCS#11 is the default for HSMs (and that would also allow use of SoftHSM); however some sites may want to continue with the key file idea, in which case a PKCS#11-style interface to those files would simplify the code that uses keys.

The issue goes deeper than a single HSM as the issue of HSM replacement - and the movement of keys between HSMs - must be considered. (For reference, OpenDNSSEC allows simultaneous access to multiple HSMs, the idea being that when an HSM is retired, if keys can't be transferred to another HSM a key roll takes place, the new key coming from the new HSM.)
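
As a starting point for discussion, the single abstract key store could look something like the Python sketch below.  The class and method names are hypothetical, the file backend stores one file per key, and key "generation" is only a placeholder that writes random bytes; a real backend would create a proper key pair or call into a PKCS#11 library.

```python
import abc
import os
import secrets


class KeyStore(abc.ABC):
    """Hypothetical abstract key store; backends might wrap an HSM or key files."""

    @abc.abstractmethod
    def add(self, label, material):
        """Store key material (bytes) under a label."""

    @abc.abstractmethod
    def remove(self, label):
        """Delete the key with the given label."""

    @abc.abstractmethod
    def labels(self):
        """Return a sorted list of stored key labels."""

    def generate(self, label, nbytes=32):
        """Placeholder only: random bytes stand in for a real key pair."""
        self.add(label, secrets.token_bytes(nbytes))


class FileKeyStore(KeyStore):
    """Key-file backend: one file per key in a directory."""

    SUFFIX = ".key"

    def __init__(self, directory):
        self.directory = directory

    def _path(self, label):
        return os.path.join(self.directory, label + self.SUFFIX)

    def add(self, label, material):
        with open(self._path(label), "wb") as handle:
            handle.write(material)

    def remove(self, label):
        os.remove(self._path(label))

    def labels(self):
        return sorted(name[:-len(self.SUFFIX)]
                      for name in os.listdir(self.directory)
                      if name.endswith(self.SUFFIX))
```

Code that uses keys would then depend only on KeyStore, so an HSM-backed class with the same four operations could be swapped in, or several stores consulted in turn to cover the HSM replacement case.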

Authoritative Server
====================

Tasks not complete from last sprint
See http://bind10.isc.org/wiki/ATeam20101117

Task Backlog
These tasks were not included in the current sprint but were identified as A-team tasks: http://bind10.isc.org/wiki/TaskBacklog

DNSSEC
* Zone signing
  - Signing a zone file
    - NSEC
    - NSEC3
  - Continuous signing
* Handling DO=1 queries
  - Returning signature information
  - Returning NSEC/NSEC3 information

Notes: These are the basic DNSSEC operations, and we should not underestimate the complexity of the task.

* Signing mechanisms
  - Accessing key for signing in CPU
    - Performing the signing operation
  - Signing in the HSM

Notes: The basic key-related tasks for the authoritative server; with access to the key we should be able to perform the signing on the host computer.  However, if the HSM offers it, we should be prepared to pass the information to the HSM.  As an aside, some HSMs can only realise their full potential using multiple threads.

* Key management
  - Single key for multiple zones and multiple keys, one per zone
  - Key rollover
  - DS interaction with the parent

Notes: Assuming that the server is going to serve multiple zones, users may want to have one key per zone or have one key serve multiple zones.  In addition, there will be a need for managing key rolls - pre-introducing new keys and retaining old keys in the zone for a sufficient time to allow RRSIGs in caches to expire.
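
The timing side of a key roll can be illustrated with a little arithmetic.  The sketch below assumes a simplified pre-publish model: the new key is published, we wait for propagation plus the DNSKEY TTL before signing with it, then wait for propagation plus the largest TTL in the zone before removing the old key.  The function name, parameters and the model itself are assumptions for illustration, not a proposed policy.

```python
from datetime import datetime, timedelta


def rollover_schedule(publish_time, dnskey_ttl, max_zone_ttl, propagation_delay):
    """Return (activate_new_key, remove_old_key) under a simplified pre-publish model."""
    # Caches may hold the old DNSKEY RRset for up to its TTL after publication.
    activate = publish_time + propagation_delay + dnskey_ttl
    # RRSIGs made with the old key may stay cached for up to the largest zone TTL.
    remove_old = activate + propagation_delay + max_zone_ttl
    return activate, remove_old
```

For example, publishing at midnight with a one-day DNSKEY TTL, a two-day maximum zone TTL and a one-hour propagation delay gives activation 25 hours later and removal of the old key 49 hours after that.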

Recursor
========
Tasks not complete from last sprint
See http://bind10.isc.org/wiki/RTeam20101116

Priming query
* Priming query logic

Notes: the first query a resolver makes is a priming query.  There is an (expired) internet-draft on the subject: http://tools.ietf.org/html/draft-ietf-dnsop-resolver-priming-02
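
For reference, a priming query is simply a query for the NS records of the root.  The Python sketch below builds one in wire format; the function name is hypothetical, and sending the query with the RD bit clear is an assumption made here rather than something the task list specifies.

```python
import secrets
import struct

QTYPE_NS = 2
QCLASS_IN = 1


def build_priming_query(query_id=None):
    """Build the wire form of a query for '. IN NS' with all flag bits clear."""
    if query_id is None:
        query_id = secrets.randbelow(1 << 16)
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    # Question: the root name is a single zero-length label.
    question = b"\x00" + struct.pack("!HH", QTYPE_NS, QCLASS_IN)
    return header + question
```

The result is 17 bytes long: a 12-byte header followed by a 5-byte question section.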

Cache
* Basic design
* Handling TTL expiration/cache cleaning
* Caching negative responses
* When to cache glue v when to cache authoritative data
* Cache persistence - dumping and loading
  - Inspection tools
* Pre-load cache?
  - Testing (can set up cache in particular configuration)
  - Authoritative server on same system (can load authoritative data into cache)

Notes: The cache design will be key to the performance of the resolver.  We need to spend some time on getting a good design.  I've included in this task cache persistence which seems to be a definite requirement and time for writing inspection tools (useful for debugging and later support).  I have also added the idea of pre-loading the cache - perhaps by writing something that converts a zone file into a cache dump file format.  I see two uses for it: the first is for testing - allowing us to set up particular cache configurations for particular tests.  The second is when an authoritative and recursive server run on the same system; one idea was that the recursive server answers all queries, with the authoritative data being pre-loaded into its cache (and marked with a "do not delete" flag).
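
To illustrate the behaviour described above (not the design, which is still to be done), here is a minimal Python sketch of a cache with TTL expiration, negative entries, a cleaning pass and a "do not delete" flag.  All names are hypothetical and the dictionary-based storage is chosen only for brevity.

```python
import time

# Sentinel stored in place of data to record a cached negative response.
NEGATIVE = object()


class Cache:
    """Toy cache keyed on (name, type); pinned entries never expire."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}

    def put(self, name, rrtype, rdata, ttl, pinned=False):
        """Store data; pinned=True models the 'do not delete' flag."""
        expires = None if pinned else self._clock() + ttl
        self._entries[(name.lower(), rrtype)] = (rdata, expires)

    def get(self, name, rrtype):
        """Return cached data, NEGATIVE, or None on a miss or expired entry."""
        key = (name.lower(), rrtype)
        entry = self._entries.get(key)
        if entry is None:
            return None
        rdata, expires = entry
        if expires is not None and self._clock() >= expires:
            del self._entries[key]
            return None
        return rdata

    def clean(self):
        """Remove all expired entries and return how many were removed."""
        now = self._clock()
        expired = [key for key, (_, expires) in self._entries.items()
                   if expires is not None and expires <= now]
        for key in expired:
            del self._entries[key]
        return len(expired)
```

Passing the clock in as a parameter is what makes the testing use case straightforward: a test can set up a particular cache state and then move time forward explicitly.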

Nameserver address store
* NSAS persistence - dumping and loading

Notes: Work seems to be well under way here.  However, the ability to save and restore the address store (across server restarts) has been requested.

Lookup logic in non-DNSSEC case
* Basic design
* Tracking Queries (to the server) and Fetches (data from other servers)
  - Loop detection and avoidance
* Fetches to other servers
  - Randomisation of query ID => random number generator
  - Port randomisation

Notes: the core of the resolver - actually receive a query and follow the chain of referrals until an answer is received.  There are a number of issues here, not least those associated with the Kaminsky attack a couple of years ago.
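
As an illustration of the randomisation and loop-detection items, the Python sketch below draws query IDs and source ports from the operating system's cryptographic random number generator and keeps a per-query record of fetches already made.  The function and class names are hypothetical, and the port range is an assumption.

```python
import secrets


def random_query_id():
    """Pick a 16-bit DNS query ID from the OS cryptographic RNG."""
    return secrets.randbelow(1 << 16)


def random_source_port(low=1024, high=65535):
    """Pick a UDP source port uniformly from [low, high]."""
    return low + secrets.randbelow(high - low + 1)


class FetchTracker:
    """Hypothetical per-query record of fetches, used to spot loops."""

    def __init__(self):
        self._seen = set()

    def start_fetch(self, qname, qtype, server):
        """Return False if this exact fetch was already made for the query."""
        key = (qname.lower().rstrip("."), qtype, server)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

A resolver following referrals would call start_fetch() before each outgoing fetch and abandon the branch when it returns False.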

Lookup logic in DNSSEC case
* Loading configured trust anchors
* Extension to lookup design
  - Path to look up chain of trust

Notes: extension of the lookup logic to the DNSSEC case - the determination that a zone is signed and that RRSIG should be expected, following the chain of trust.

DNSSEC Validation
* Checking that RRSIG matches RRset + DNSKEY
  - Algorithm SHA1, SHA2, GOST
* Extend to checking that at least one RRSIG matches RRset + one key in DNSKEY RRset
  - Check that RRset has a RRSIG for all algorithms represented in DNSKEY RRset
* Checking that DS record matches DNSKEY record
  - Checking that at least one DS in parent zone matches at least one DNSKEY in child zone
* Validation using HSM
  - Key access
  - Signature checking
* Authenticated denial of existence
  - NSEC/NSEC3 validation

Notes: the basic tasks associated with DNSSEC validation.
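
One of the simpler checks in the list, that a DS record matches a DNSKEY record, can be sketched in Python as below.  It follows the key tag calculation and DS digest construction from RFC 4034 (digest over the owner name in canonical wire form followed by the DNSKEY RDATA), and covers only SHA-1 and SHA-256 digests.  The function names and the tuple layout used for a DS record are assumptions for illustration.

```python
import hashlib
import struct

# DS digest types handled by this sketch: 1 = SHA-1, 2 = SHA-256.
DIGESTS = {1: hashlib.sha1, 2: hashlib.sha256}


def name_to_wire(name):
    """Canonical (lower-case) wire format of a domain name."""
    wire = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.lower().encode("ascii")
            wire += bytes([len(raw)]) + raw
    return wire + b"\x00"


def dnskey_rdata(flags, protocol, algorithm, public_key):
    """Assemble DNSKEY RDATA from its four fields."""
    return struct.pack("!HBB", flags, protocol, algorithm) + public_key


def key_tag(rdata):
    """Key tag as in RFC 4034 Appendix B (not valid for algorithm 1)."""
    total = 0
    for i, byte in enumerate(rdata):
        total += byte if i & 1 else byte << 8
    total += (total >> 16) & 0xFFFF
    return total & 0xFFFF


def ds_matches_dnskey(owner, ds, rdata):
    """ds is a (key_tag, algorithm, digest_type, digest) tuple."""
    tag, algorithm, digest_type, digest = ds
    if tag != key_tag(rdata) or algorithm != rdata[3]:
        return False
    hasher = DIGESTS.get(digest_type)
    if hasher is None:
        return False  # unsupported digest type
    return hasher(name_to_wire(owner) + rdata).digest() == digest


def any_ds_matches(owner, ds_set, dnskey_rdatas):
    """True if at least one DS matches at least one DNSKEY."""
    return any(ds_matches_dnskey(owner, ds, rdata)
               for ds in ds_set for rdata in dnskey_rdatas)
```

Checking the key tag and algorithm first avoids computing a digest for DS records that cannot match; signature verification itself is not covered here.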