Enable systemd hardening options for named
daniel.stirnimann at switch.ch
Tue Jan 16 12:52:59 UTC 2018
Just wondering: if one is already running SELinux in enforcing mode, does
systemd hardening provide any additional benefit?
On 16.01.18 12:21, Ludovic Gasc wrote:
> I have merged config files from Tony, Robert, and me.
> I have tried to keep it as generic as possible; the result is below.
> It seems to work here without regression, except a warning:
> managed-keys-zone: Unable to fetch DNSKEY set '.': operation canceled
> But only at the first boot, I don't see the message anymore when I
> restart the daemon.
> Any clue?
> Thanks for your feedback.
> SystemCallFilter=~@mount @debug acct modify_ldt add_key adjtimex
> clock_adjtime delete_module fanotify_init finit_module get_mempolicy
> init_module io_destroy io_getevents iopl ioperm io_setup io_submit
> io_cancel kcmp kexec_load keyctl lookup_dcookie migrate_pages move_pages
> open_by_handle_at perf_event_open process_vm_readv process_vm_writev
> ptrace remap_file_pages request_key set_mempolicy swapoff swapon uselib
> Ludovic Gasc (GMLudo)
> 2018-01-15 21:14 GMT+01:00 Robert Edmonds <edmonds at mycre.ws>:
> Tony Finch wrote:
> > Ludovic Gasc <gmludo at gmail.com> wrote:
> > >
> > > 1. The list of minimal capabilities needed for bind to run correctly:
> > > http://man7.org/linux/man-pages/man7/capabilities.7.html
> > named already drops capabilities - have a look at the code around here:
> > https://source.isc.org/cgi-bin/gitweb.cgi?p=bind9.git;a=blob;f=bin/named/unix/os.c;hb=v9_11_2#l234
> > Note that it's a bit clever - the privileges are dropped in two stages,
> > right at the start, and after the server has been configured.
> I checked just now to see what that code actually ends up doing, and on
> my system I ended up with:
> $ grep -h ^Cap /proc/$(pidof named)/**/status | sort | uniq -c
> 6 CapAmb: 0000000000000000
> 6 CapBnd: 0000003fffffffff
> 6 CapEff: 0000000001000400
> 6 CapInh: 0000000000000000
> 6 CapPrm: 0000000001000400
> That decodes to:
> - The effective and permitted capabilities sets were reduced to
> CAP_NET_BIND_SERVICE and CAP_SYS_RESOURCE.
> - The ambient and inheritable capabilities sets were cleared.
> - The capability bounding set was left completely open-ended.
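For anyone who wants to check the decoding of those hex masks themselves, here is a minimal sketch in Python. It assumes the bit numbers from <linux/capability.h>; the table only lists a handful of capabilities, decode_caps is just an illustrative helper name, and bits missing from the table are printed by number.

```python
# Bit numbers as defined in <linux/capability.h>; only a subset is listed.
CAP_BITS = {
    0: "CAP_CHOWN",
    6: "CAP_SETGID",
    7: "CAP_SETUID",
    8: "CAP_SETPCAP",
    10: "CAP_NET_BIND_SERVICE",
    18: "CAP_SYS_CHROOT",
    21: "CAP_SYS_ADMIN",
    24: "CAP_SYS_RESOURCE",
}


def decode_caps(hex_mask):
    """Return the names (or bit numbers) of all capabilities set in hex_mask."""
    mask = int(hex_mask, 16)
    names = []
    bit = 0
    # Walk the mask from bit 0 upwards until no set bits remain above 'bit'.
    while mask >> bit:
        if mask & (1 << bit):
            names.append(CAP_BITS.get(bit, "bit %d" % bit))
        bit += 1
    return names


# CapEff / CapPrm from the output above.
print(decode_caps("0000000001000400"))
# ['CAP_NET_BIND_SERVICE', 'CAP_SYS_RESOURCE']

# CapBnd from the output above: count how many bits are set.
print(len(decode_caps("0000003fffffffff")))
# 38
```

If libcap's capsh is installed, "capsh --decode=0000000001000400" should give the same answer without any code.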
> It's not clear why CAP_SYS_RESOURCE needs to be retained past startup:
> * XXX We might want to add CAP_SYS_RESOURCE, though it's not
> * clear it would work right given the way linuxthreads
> * XXXDCL But since we need to be able to set the maximum number
> * of files, the stack size, data size, and core dump size to
> * support named.conf options, this is now being added to test.
> See commits 5e4b7294d88ab58371d8c98e05ea80086dcb67cd,
> 108490a7f8529aff50a0ac7897580b59a73d9845. "[T]o test"?
> CAP_SYS_RESOURCE is documented as permitting:
> * Use reserved space on ext2 filesystems;
> * make ioctl(2) calls controlling ext3 journaling;
> * override disk quota limits;
> * increase resource limits (see setrlimit(2));
> * override RLIMIT_NPROC resource limit;
> * override maximum number of consoles on console allocation;
> * override maximum number of keymaps;
> * allow more than 64hz interrupts from the real-time clock;
> * raise msg_qbytes limit for a System V message queue above the
>   limit in /proc/sys/kernel/msgmnb (see msgop(2) and
> * allow the RLIMIT_NOFILE resource limit on the number of
>   "in-flight" file descriptors to be bypassed when passing file
>   descriptors to another process via a UNIX domain socket (see
> * override the /proc/sys/fs/pipe-size-max limit when setting the
>   capacity of a pipe using the F_SETPIPE_SZ fcntl(2) command.
> * use F_SETPIPE_SZ to increase the capacity of a pipe above the
>   limit specified by /proc/sys/fs/pipe-max-size;
> * override /proc/sys/fs/mqueue/queues_max limit when
>   POSIX message queues (see mq_overview(7));
> * employ the prctl(2) PR_SET_MM operation;
> * set /proc/[pid]/oom_score_adj to a value lower than the
>   last set by a process with CAP_SYS_RESOURCE.
> I would guess that retaining CAP_NET_BIND_SERVICE and CAP_SYS_RESOURCE
> during the process runtime permits open-ended reloading of the config at
> runtime (e.g., binding to a new IP address on port 53 without needing to
> restart the daemon). So even though BIND drops some capabilities, it's
> still running with elevated privileges compared to a traditional
> non-root user.
> systemd permits a nice pattern for network daemons that want to run as
> an unprivileged user, but bind to a privileged port (and without using
> socket activation), without starting the process as root. Basically, you
> put something like this in the unit file:
> CapabilityBoundingSet=CAP_NET_BIND_SERVICE CAP_SYS_CHROOT
> AmbientCapabilities=CAP_NET_BIND_SERVICE CAP_SYS_CHROOT CAP_SETPCAP
> Any needed filesystem directories and permissions need to be set up
> correctly beforehand. The service is started by the init system as the
> unprivileged User/Group specified in the unit file, so there's no need
> to change UID/GID. CAP_NET_BIND_SERVICE is then used to bind to a
> privileged port, CAP_SYS_CHROOT is used to perform the chroot, and
> CAP_SETPCAP is used to drop all remaining capabilities from the
> capability sets and the capability bounding set, so you end up with a
> completely unprivileged process at runtime. (Alternatively you could
> keep CAP_NET_BIND_SERVICE and drop CAP_SYS_CHROOT and CAP_SETPCAP, if
> you wanted to retain the capability to perform privileged binds at
> runtime. Or you could eliminate CAP_SYS_CHROOT and use other systemd
> functionality to make parts of the filesystem inaccessible, etc.) This
> pattern might be a bit hard to retrofit into BIND at this point, though,
> other than by adding more knobs.
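When experimenting with this pattern it can help to know which hex mask corresponds to a given list of capability names, so it can be compared with the CapBnd/CapAmb lines in /proc/<pid>/status. A rough sketch in Python, assuming the bit numbers from <linux/capability.h> (CAP_SETPCAP=8, CAP_NET_BIND_SERVICE=10, CAP_SYS_CHROOT=18); encode_caps is a made-up helper name. It only does the arithmetic; what the kernel actually reports for a running service is the authority.

```python
# Bit numbers as defined in <linux/capability.h>; only the three
# capabilities named in the unit-file lines above are listed.
CAP_NUMBERS = {
    "CAP_SETPCAP": 8,
    "CAP_NET_BIND_SERVICE": 10,
    "CAP_SYS_CHROOT": 18,
}


def encode_caps(names):
    """Turn a space-separated list of capability names into a 16-digit hex mask."""
    mask = 0
    for name in names.split():
        mask |= 1 << CAP_NUMBERS[name]
    return "%016x" % mask


# Mask for the CapabilityBoundingSet= value quoted above.
print(encode_caps("CAP_NET_BIND_SERVICE CAP_SYS_CHROOT"))
# 0000000000040400

# Mask for the AmbientCapabilities= value quoted above.
print(encode_caps("CAP_NET_BIND_SERVICE CAP_SYS_CHROOT CAP_SETPCAP"))
# 0000000000040500
```

Once the daemon has dropped everything as described, all of these masks should read as zero.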
> Robert Edmonds
> bind-users mailing list
> bind-users at lists.isc.org