Fwd: 64-bit time_t transition for 32-bit archs: a proposal
Julien ÉLIE
julien at trigofacile.com
Sat May 18 20:35:29 UTC 2024
Hi Russ, Olaf,
A follow-up on this discussion dating back to June 2023. I've (at
last!) reviewed all the uses of time in INN source code.
All printf-like writes of time_t that were still using %ld now use %lu.
Reading epoch times from files now uses strtoul() and atoll() instead of
strtol() and atol(). These values are always positive, so a 32-bit time_t
will go on working after 2038 (up to 2106). Changes are in the following commit:
https://github.com/InterNetNews/inn/commit/d60390074c1a8aefe794720e79b47d829f62791e
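To illustrate the pattern (a minimal sketch, not the code from that commit): the epoch time is written as an unsigned decimal with %lu and read back with strtoul(). The helper names format_epoch and parse_epoch are hypothetical and do not exist in INN.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: write an epoch time as an unsigned decimal string. */
static int
format_epoch(char *buf, size_t len, time_t t)
{
    return snprintf(buf, len, "%lu", (unsigned long) t);
}

/* Hypothetical helper: parse an unsigned decimal epoch time.
 * Returns 0 on success, -1 on empty, malformed or out-of-range input. */
static int
parse_epoch(const char *s, time_t *out)
{
    char *end;
    unsigned long v;

    errno = 0;
    v = strtoul(s, &end, 10);
    if (end == s || *end != '\0' || errno == ERANGE)
        return -1;
    *out = (time_t) v;
    return 0;
}
```

With a 32-bit time_t and a 32-bit unsigned long, a value written this way round-trips for the whole unsigned range, which is where the 2106 limit comes from.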
I think we have a bug for both 32-bit and 64-bit archs in
backends/ninpaths.c where we write a dump to a file whose name is
specified by the user with the possibility to use %d in the given file
name to write a time_t... We just blindly use it without any
verification in the "n" argument:
static void writedumpfile(const char *n) {
[...]
snprintf(buf, sizeof(buf), n, time(0));
d = fopen(buf, "w");
[...]
As %d expects a signed int, it won't give the right result in 2038, even
with 64-bit time_t. To fix that, I propose to properly parse the string,
replacing any occurrence of "%d" (if not preceded by a "%") with the
current time, and then use that newly created string as the file name.
This way, it will be transparent to users.
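A rough sketch of what such parsing could look like, assuming the time value is non-negative. The function name expand_dump_name is hypothetical, and this is not the fix as committed, only an illustration of the idea: the user-supplied string is never handed to snprintf() as a format.

```c
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper: copy the user-supplied template into out,
 * replacing each "%d" with the time as an unsigned decimal and each
 * "%%" with a literal '%'.  Any other character, including a lone '%',
 * is copied verbatim.  Assumes now is non-negative.
 * Returns 0 on success, -1 if out is too small.
 */
static int
expand_dump_name(char *out, size_t outlen, const char *tmpl, time_t now)
{
    size_t used = 0;
    const char *p = tmpl;

    while (*p != '\0') {
        if (p[0] == '%' && p[1] == 'd') {
            int n = snprintf(out + used, outlen - used, "%llu",
                             (unsigned long long) now);
            if (n < 0 || (size_t) n >= outlen - used)
                return -1;
            used += (size_t) n;
            p += 2;
        } else {
            if (p[0] == '%' && p[1] == '%')
                p++; /* collapse "%%" into a single '%' */
            if (used + 1 >= outlen)
                return -1;
            out[used++] = *p++;
        }
    }
    if (used >= outlen)
        return -1;
    out[used] = '\0';
    return 0;
}
```

The caller would then pass the expanded buffer to fopen() instead of building the name with snprintf(buf, sizeof(buf), n, time(0)).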
> Forwarding this discussion from debian-devel about 64-bit time_t migration
> to inn-workers since this is a better place to discuss the details.
>
>>> Oh, wait! No, I'm wrong, CNFS actually does something smart and encodes
>>> the header in ASCII when writing it to disk.
>>>
>>> Okay, phew, this isn't going to be nearly as bad as I had thought.
CNFS is indeed not impacted by the 64-bit time_t transition. It will
work as expected in 2038. However, we'll one day have to do something
as it uses htonl(article.arrived) which works on uint32_t parameters,
and not uint64_t. Maybe htonl() will in 2106 when the number of seconds
since the epoch exceeds 2^32. Anyway, there will be other things to
also look at besides htonl().
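To make the 2106 limit concrete, here is a small sketch (not CNFS code) of what happens to an arrival time stored through htonl(), next to one possible 64-bit big-endian encoding. The names roundtrip_32, put_be64 and get_be64 are hypothetical.

```c
#include <arpa/inet.h> /* htonl(), ntohl(); POSIX */
#include <stdint.h>

/* Round-trip a time through a 32-bit network-order field, the way a
 * field written with htonl() would store it.  Bits above 2^32 are lost. */
static uint64_t
roundtrip_32(uint64_t arrived)
{
    uint32_t wire = htonl((uint32_t) arrived);

    return (uint64_t) ntohl(wire);
}

/* Hypothetical 64-bit alternative: store the value big-endian in 8 bytes. */
static void
put_be64(unsigned char out[8], uint64_t v)
{
    int i;

    for (i = 7; i >= 0; i--) {
        out[i] = (unsigned char) (v & 0xff);
        v >>= 8;
    }
}

/* Read back a value stored by put_be64(). */
static uint64_t
get_be64(const unsigned char in[8])
{
    uint64_t v = 0;
    int i;

    for (i = 0; i < 8; i++)
        v = (v << 8) | in[i];
    return v;
}
```

A time of 2^31 (January 2038) survives roundtrip_32() unchanged, whereas 2^32 + 5 comes back as 5.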
>> Good news. It would be great if you could add relevant info to the wiki page:
>> https://wiki.debian.org/ReleaseGoals/64bit-time#Known_Issues
>
> That page now says
> < The CNFS storage format does not have problems with its disk format,
> < but the less-used timecaf storage format might (yet to be confirmed).
In the page:
https://wiki.debian.org/ReleaseGoals/64bit-time#Known_Issues
"INN has time_t embedded in the disk format of its overview and history
databases, which will require manual rebuilds when the size of time_t
changes."
I do not see time_t used as-is in history or dbz files. Seconds are
written in decimal numbers (unsigned long), and read with strtoul() so I
think it is not impacted by the 64-bit time_t transition.
I would suggest rewriting the sentence as:
=> "INN has time_t embedded in the disk format of its overview
databases, which will require manual rebuilds when the size of time_t
changes."
Amongst our 4 overview methods, I think ovsqlite is not impacted though.
It handles uint64_t internally and times are stored in database
columns which have an INTEGER type, which is 64-bit.
"The CNFS storage format does not have problems with its disk format,
but the less-used timecaf storage format might (yet to be confirmed)."
It is confirmed. I propose a rewording to:
=> "The CNFS storage format does not have problems with its disk format,
but the less-used timecaf storage format does."
> 2. The CF files use file names like timecaf-nn/bb/aacc.CF where nn is
> the storage class and aabbccxx the time of arrival, changing the file
> name in 256-second intervals. The code using %02x format strings to
> determine the file name, and everything else checking them for
> validity, will have to be changed appropriately.
As the time is handled as an unsigned value, the files will be correctly
named up to the year 2106, so that change is not required for the 64-bit
time_t transition. (We also have a similar naming scheme to fix in the
timehash storage method.)
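For reference, the naming scheme described in the quoted point can be sketched as follows. This is an illustration based only on that description (timecaf-nn/bb/aacc.CF with %02x fields), not the timecaf source; the function name timecaf_path is hypothetical, and the spool directory prefix is left out.

```c
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper: build the relative CF file name for an arrival
 * time seen as the four bytes aabbccxx.  The low byte xx is dropped, so
 * the name changes every 256 seconds.  The time is handled as unsigned,
 * which keeps names correct for 32-bit values past 2038.
 */
static int
timecaf_path(char *buf, size_t len, unsigned int sclass, time_t arrived)
{
    unsigned long t = (unsigned long) arrived;

    return snprintf(buf, len, "timecaf-%02x/%02x/%02x%02x.CF",
                    sclass & 0xff,
                    (unsigned int) ((t >> 16) & 0xff),  /* bb */
                    (unsigned int) ((t >> 24) & 0xff),  /* aa */
                    (unsigned int) ((t >> 8) & 0xff));  /* cc */
}
```

Only two hex digits are available for aa, so bits above 2^32 cannot be represented in the name without changing the scheme.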
All that being said, what could pragmatically be done for the 64-bit
time_t transition in Debian?
In a nutshell, I think we have several use cases:
- the ovsqlite overview method is used, and timecaf is not used => The
best scenario: there's nothing special to do.
- an overview method other than ovsqlite is used, and timecaf is not
used => A rebuild of the overview data is needed. In the absence of a
dedicated migration program (for 3 overview methods), this can be done
with "makehistory -O -x -F" when INN is *not* running. The tricky part
is that the existing overview data should be moved into another path
(and empty buffers recreated if buffindexed is used) before running the
command. Should this be done in a special inn2 package uploaded to the
archs affected by the 64-bit time_t transition?
- timecaf is used => The most complex scenario: we first need to convert
timecaf files to a format with 64-bit time_t, and then if an overview
method other than ovsqlite is used, rebuild the overview data like in
the previous point.
Ideally, we would know whether we really have this last use case to
handle... It may well be that no timecaf user will go through a 64-bit
time_t transition, in which case there's no need to spend time on it!
Also, though I have not tested it, we have a "respool" program in the
contrib directory which is said to "refile articles into the storage
manager under the current storage.conf rules, deleting articles from
their old place in the spool". It could also be used to refile
articles from timecaf to another storage method *before* the user
upgrades to 64-bit time_t (but how would they know they have to do that
before upgrading?).
Do not hesitate to tell me if I have missed something in these thoughts.
--
Julien ÉLIE
« The hardest thing is to go to sleep at night, when there are so many
urgent things needing to be done. A huge gap exists between what we
know is possible with today's machines and what we have so far been
able to finish. » (Donald Knuth)