<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<br>
<div class="moz-cite-prefix">On 10-Sep-21 08:36, Victoria Risk
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:0CC54D4B-7F9B-49B0-AC20-467874716C2B@isc.org">
<br class="">
<div><br class="">
<blockquote type="cite" class="">
<div class="">On Sep 10, 2021, at 7:24 AM, Timothe Litt &lt;<a
href="mailto:litt@acm.org" class="" moz-do-not-send="true">litt@acm.org</a>&gt;
wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div class="content-isolator__container">
<div class="">
<p class="">Clearly map format solved a big problem for
some users. Asking whether it's OK to drop it with no
statement of what those users would give up today is
not reasonable.</p>
</div>
</div>
</div>
</blockquote>
Actually, we are not sure there ARE any users. In fact, the one
example I could come up with was Anand, who has replied to the
list that he is in fact NOT using map zone. I should have asked
directly - is anyone on this list USING MAP ZONE format?</div>
<div><br class="">
</div>
</blockquote>
<p>Well, if the answer is "no one", that simplifies matters :-)<br>
</p>
<p>I do remember that startup time was a big issue before map came
out, and that the complaints subsided thereafter. No personal
knowledge as to whether that was cause and effect or a realignment
of the planets. In general, I don't look to Astrology for answers
:-)<br>
</p>
<blockquote type="cite"
cite="mid:0CC54D4B-7F9B-49B0-AC20-467874716C2B@isc.org">
<div>
<blockquote type="cite" class="">
<div class="">
<div class="content-isolator__container">
<div class=""> After all the "other improvements in
performance" that you cited, what is the performance
difference between map and the other formats? </div>
</div>
</div>
</blockquote>
<div><br class="">
</div>
I don’t know that, to be honest. We don’t have the resources to
benchmark everything. Maybe someone on this list could? We
would also like to be able to embark on a wholesale update to
the rbtdb next year and this is the sort of thing that might
complicate refactoring unnecessarily. <br class="">
</div>
</blockquote>
<p>IIRC, when I did some work on the stats channel and was
concerned with scalability, Evan said that you keep some large
datasets (1M+ zones) around for testing and produced some numbers
for that. So it ought to be possible to get some basic data.<br>
</p>
<p>I'm not suggesting a full benchmarking campaign, but one or two
datapoints are a lot better than none. E.g., if there's no
difference with 1M or 10M zones with, say, 10K records each, it's
pretty clear that map's time is past. If it's orders of magnitude
faster (and it's used), it's not.</p>
<p>I don't remember - did your user survey ask about how many/how
large zones people serve? I vaguely think so, but it's been a
while...<br>
</p>
<blockquote type="cite"
cite="mid:0CC54D4B-7F9B-49B0-AC20-467874716C2B@isc.org">
<div>
<blockquote type="cite" class="">
<div class="">
<div class="content-isolator__container">
<div class="">
<p class="">For a case which took 'several hours' before
map was introduced, what would the restart time be for
named if raw format was used now?</p>
</div>
</div>
</div>
</blockquote>
<div>
<div>
<blockquote type="cite" class="">
<blockquote type="cite" class="">
<div class="content-isolator__container">
<div class="">If I knew that I would have said. 'Raw’
was much faster than the text version. Map was
faster than raw. Raw is apparently not a problem to
maintain. I believe the improvement with raw was
~3x.</div>
</div>
</blockquote>
<blockquote type="cite" class="">
<div class="content-isolator__container">
<div class=""><br class="webkit-block-placeholder">
</div>
</div>
</blockquote>
</blockquote>
</div>
</div>
<br class="">
</div>
</blockquote>
I think the questions are: (a) Is startup time an issue (however
it's solved)? (b) If so, is map format the solution? (c) If it is,
and people are using it, what would the consequences be to them if
it went away? (d) If it is, and people aren't using it, is the
documentation too scary (as Anand said it is for him)?<br>
<blockquote type="cite"
cite="mid:0CC54D4B-7F9B-49B0-AC20-467874716C2B@isc.org">
<div>
<blockquote type="cite" class="">
<div class="">
<div class="content-isolator__container">
<div class="">
<div class="">It's pretty clear to me that if map format
saves a few seconds in the worst case, it's not worth
keeping. If it saves hours for large operators, then
the alternative isn't adequate. Maybe "map" isn't the
answer - how might 'raw' compare to a tuned database
back end? (Which has other advantages for some.)
What if operators specified a priority order for
loading zones? Or zones were loaded on demand during
startup, with low activity zones added as a background
task? Or???</div>
</div>
</div>
</div>
</blockquote>
<div><br class="">
</div>
Well, back when we added map zone format, startup time was a
major pain point for some users. Now, it seems as though large
operators are updating their zones all the time (also updating
RPZ feeds) and efficiency in transfers seems to be a bigger
issue. </div>
<div><br class="">
</div>
</blockquote>
<p>What I was getting at is how hard the definition of "startup
time" is. Time to serving all zones? Important zones? Is it OK
for responses to be slow during startup, or is startup only
complete when responses are at nominal speed?<br>
</p>
<p>I wonder if this comes from large operators using a
database (DLZ) back end. Database developers tend to have a
single-minded focus on performance, and direct updates are
probably faster than going through named and its generalized
authentication/validation. Plus, depending on how you set up your
server architecture, DB replication can replace DNS zone
transfers.<br>
</p>
<blockquote type="cite"
cite="mid:0CC54D4B-7F9B-49B0-AC20-467874716C2B@isc.org">
<div>We don’t have any direct data on what features are being
used, we can only judge based on complaints we receive via bug
tickets or posts on this list. <br class="">
</div>
</blockquote>
You did a survey a while back...<br>
<blockquote type="cite"
cite="mid:0CC54D4B-7F9B-49B0-AC20-467874716C2B@isc.org">
<div>
<blockquote type="cite" class="">
<div class="">
<div class="content-isolator__container">
<div class="">
<p class="">A fair question for users would be what
restart times are acceptable for their environment -
obviously a function of the number and size/content of
zones. And is a restart "all or nothing", or would
some priority/sequencing of zone availability meet
requirements?</p>
</div>
</div>
</div>
</blockquote>
<div>That is a good question. Can you answer it for yourself?</div>
</div>
</blockquote>
<p>Sure. I'm not a large operator, but I've always thought big and
implemented smaller. About 350 zones, 2 real views and 2
static-stub recursive views. From 50 to a couple of hundred records
per zone, not counting the DNSSEC signatures and overhead that named
generates. ~10 servers. Plus a 3rd party backup service.
Anything under a minute is a reasonable startup time for named,
though most of my servers are underpowered (e.g. RPi class
machines with USB disks that sleep a lot). Two minutes is
tolerable. Longer than that, I'd have issues.</p>
<p>If I were a larger operator and had to choose, I'd prioritize
external views so that key services (e.g. e-mail, webservers,
VPNs, ...) aren't seen to be slow/down. The internal network has
plenty of redundancy and tolerance for slow resolution. The
external views are smaller, with fewer servers. Another priority
would be zones for which a server is primary, since it's required
for updates.<br>
</p>
<p>If I were a DNS provider/registrar, I'd guess that of the
(hopefully) millions of zones that I sold, only a few actually
get a lot of traffic. So a scheme where historical query stats
drove reload order would be attractive. And since I'd sell SLAs,
prioritizing the higher-paying customers would be good business.</p>
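<p>To make that ordering idea concrete, here is a minimal sketch in
Python. It is only an illustration: the names Zone, sla_tier,
recent_queries and load_order are hypothetical, and nothing here
reflects an existing named option or API. It sorts zones by SLA tier
first, then by historical query volume, busiest first.</p>
<pre>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    name: str
    sla_tier: int        # hypothetical: 0 is the highest-paying tier
    recent_queries: int  # hypothetical: historical query count for the zone


def load_order(zones):
    """Return zones sorted by SLA tier, then busiest first, then by name."""
    return sorted(zones, key=lambda z: (z.sla_tier, -z.recent_queries, z.name))


zones = [
    Zone("quiet.example", 2, 12),
    Zone("busy.example", 2, 90000),
    Zone("premium.example", 0, 300),
]

# Print the order in which a loader might bring the zones up.
for zone in load_order(zones):
    print(zone.name)
```
</pre>
<p>With these made-up numbers, the premium zone loads first even
though it gets little traffic, and the two lower-tier zones follow in
order of query volume.</p>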
<p>Of course, none of that matters if reload times are small enough
to cover expected outage durations with an affordable number of
servers.</p>
<p>The key would be the downtime on the database primaries (masters)
- that would prevent my customers from activating/updating their
zones. And a reason for a database back-end rather than
named-managed files - since DB persistence, consistency, and
replication are solved problems in that world. <br>
</p>
<p>Since you're lucky to get through to a (competent technical) help
desk in tens of minutes, a total downtime (meaning from rebooting a
server through named serving at least key zones and updates) on the
order of 15 minutes is probably the outer limit. That's a
thumb-in-the-air number, not science.<br>
</p>
<p>Hope this helps.<br>
</p>
<blockquote type="cite"
cite="mid:0CC54D4B-7F9B-49B0-AC20-467874716C2B@isc.org">
<div>
<div><br class="">
</div>
<div>Thank you!</div>
<div><br class="">
</div>
<div>Vicky</div>
<div><br class="">
</div>
</div>
<br class="">
</blockquote>
</body>
</html>