Fwd: 64-bit time_t transition for 32-bit archs: a proposal
    Julien ÉLIE 
    julien at trigofacile.com
       
    Sat May 18 20:35:29 UTC 2024
    
    
  
Hi Russ, Olaf,
A follow-up on this discussion dating back to June 2023.  I've (at 
last!) reviewed all the uses of time in INN source code.
All printf-like writes of time_t that still used %ld now use %lu.  Epoch 
times are now read from files with strtoul() and atoll() instead of 
strtol() and atol().  These values are always positive, so a 32-bit time_t 
will go on working after 2038 (up to 2106).  The changes are in the 
following commit:
 
https://github.com/InterNetNews/inn/commit/d60390074c1a8aefe794720e79b47d829f62791e
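
To illustrate the pattern (this is only a minimal sketch, not the code from 
the commit; the helper names format_epoch() and parse_epoch() are made up 
for the example), writing and reading an epoch time as an unsigned decimal 
number could look like this:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: write an epoch time as an unsigned decimal string.
 * Returns what snprintf() returns (the number of characters needed). */
static int
format_epoch(char *buf, size_t len, time_t t)
{
    assert(buf != NULL);
    return snprintf(buf, len, "%lu", (unsigned long) t);
}

/* Hypothetical helper: read an epoch time back with strtoul().
 * Returns 0 on success, -1 if no digits were found or on overflow. */
static int
parse_epoch(const char *s, unsigned long *out)
{
    char *end;
    unsigned long v;

    assert(s != NULL && out != NULL);
    errno = 0;
    v = strtoul(s, &end, 10);
    if (end == s || errno == ERANGE)
        return -1;
    *out = v;
    return 0;
}
```

With a 32-bit unsigned long, values up to 4294967295 are parsed, which is 
where the 2106 limit comes from.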
I think we have a bug for both 32-bit and 64-bit archs in 
backends/ninpaths.c where we write a dump to a file whose name is 
specified by the user, with the possibility to use %d in the given file 
name to write a time_t...  The "n" argument is blindly passed as the 
format string, without any verification:
static void writedumpfile(const char *n) {
[...]
     snprintf(buf, sizeof(buf), n, time(0));
     d = fopen(buf, "w");
[...]
As %d expects a signed int, it won't give the right result in 2038, even 
with 64-bit time_t.  To fix that, I propose to properly parse the string, 
replacing any occurrence of "%d" (if not preceded by a "%") with the 
current time, and then use that newly created string as the file name. 
This way, it will be transparent to users.
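
A possible shape for that parsing, as a rough sketch only (the function 
name expand_dump_name() is hypothetical, and copying any other "%" sequence 
literally is my assumption, not something already decided):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper, not INN code: copy tmpl into out, replacing each
 * "%d" with the time in unsigned decimal and each "%%" with a single "%".
 * Any other character (including a lone "%") is copied as-is, so the user
 * string is never used as a printf format.
 * Returns 0 on success, -1 if out is too small. */
static int
expand_dump_name(char *out, size_t outlen, const char *tmpl, time_t now)
{
    size_t used = 0;
    const char *p;

    assert(out != NULL && tmpl != NULL);
    for (p = tmpl; *p != '\0'; p++) {
        if (strncmp(p, "%d", 2) == 0) {
            int n = snprintf(out + used, outlen - used, "%llu",
                             (unsigned long long) now);

            if (n < 0 || (size_t) n >= outlen - used)
                return -1;
            used += (size_t) n;
            p++; /* skip the 'd' */
        } else {
            if (strncmp(p, "%%", 2) == 0)
                p++; /* "%%" yields one literal '%' */
            if (used + 1 >= outlen)
                return -1;
            out[used++] = *p;
        }
    }
    if (used >= outlen)
        return -1;
    out[used] = '\0';
    return 0;
}
```

The resulting string would then be given to fopen() directly, instead of 
calling snprintf() with the user-supplied format.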
> Forwarding this discussion from debian-devel about 64-bit time_t migration
> to inn-workers since this is a better place to discuss the details.
> 
>>> Oh, wait!  No, I'm wrong, CNFS actually does something smart and encodes
>>> the header in ASCII when writing it to disk.
>>>
>>> Okay, phew, this isn't going to be nearly as bad as I had thought.
CNFS is indeed not impacted by the 64-bit time_t transition.  It will 
work as expected in 2038.  However, we'll one day have to do something 
as it uses htonl(article.arrived), which takes a uint32_t parameter 
and not a uint64_t.  Maybe htonl() will start truncating the value in 
2106, when the number of seconds since the epoch reaches 2^32.  Anyway, 
there will be other things to look at besides htonl().
>> Good news. It would be great if you could add relevant info to the wiki page:
>> https://wiki.debian.org/ReleaseGoals/64bit-time#Known_Issues
> 
> That page now says
> < The CNFS storage format does not have problems with its disk format,
> < but the less-used timecaf storage format might (yet to be confirmed).
In the page:
   https://wiki.debian.org/ReleaseGoals/64bit-time#Known_Issues
"INN has time_t embedded in the disk format of its overview and history 
databases, which will require manual rebuilds when the size of time_t 
changes."
I do not see time_t used as-is in history or dbz files.  Seconds are 
written in decimal numbers (unsigned long), and read with strtoul() so I 
think it is not impacted by the 64-bit time_t transition.
I would suggest rewriting the sentence as:
=> "INN has time_t embedded in the disk format of its overview 
databases, which will require manual rebuilds when the size of time_t 
changes."
Amongst our 4 overview methods, I think ovsqlite is not impacted though. 
It handles uint64_t internally, and times are stored in database 
columns of INTEGER type, which is 64-bit.
"The CNFS storage format does not have problems with its disk format, 
but the less-used timecaf storage format might (yet to be confirmed)."
It is confirmed.  I propose a rewording to:
=> "The CNFS storage format does not have problems with its disk format, 
but the less-used timecaf storage format does."
> 2. The CF files use file names like timecaf-nn/bb/aacc.CF where nn is
> the storage class and aabbccxx the time of arrival, changing the file
> name in 256-second intervals. The code using %02x format strings to
> determine the file name, and everything else checking them for
> validity, will have to be changed appropriately.
As the time is handled as an unsigned value, the files will be correctly 
named up to year 2106, so that point is not mandatory for the 64-bit 
time_t transition.  (We also have a similar naming scheme to fix in the 
timehash storage method.)
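
As an illustration of the naming scheme quoted above (a sketch written from 
that description, not the actual timecaf code; caf_name() and 
same_caf_file() are hypothetical names), with the arrival time handled as 
an unsigned 32-bit value:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "timecaf-nn/bb/aacc.CF" where nn is the
 * storage class and aabbccxx the arrival time in hexadecimal.  The low
 * byte (xx) is not part of the name, hence the 256-second intervals. */
static int
caf_name(char *buf, size_t len, unsigned int sclass, uint32_t arrived)
{
    assert(buf != NULL);
    return snprintf(buf, len, "timecaf-%02x/%02x/%02x%02x.CF",
                    sclass & 0xffu,
                    (unsigned int) ((arrived >> 16) & 0xff),  /* bb */
                    (unsigned int) ((arrived >> 24) & 0xff),  /* aa */
                    (unsigned int) ((arrived >> 8) & 0xff));  /* cc */
}

/* Returns 1 if two arrival times end up in the same CF file. */
static int
same_caf_file(unsigned int sclass, uint32_t a, uint32_t b)
{
    char na[64], nb[64];

    caf_name(na, sizeof(na), sclass, a);
    caf_name(nb, sizeof(nb), sclass, b);
    return strcmp(na, nb) == 0;
}
```

An arrival time of 0x80000000 (the first second a signed 32-bit time_t 
cannot represent) still gives a well-formed name here, 
"timecaf-01/00/8000.CF" for class 1.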
All that being said, what could be pragmatically done for the 64-bit 
time_t transition in Debian?
In a nutshell, I think we have several use cases:
- the ovsqlite overview method is used, and timecaf is not used => The 
best scenario: there's nothing special to do.
- an overview method other than ovsqlite is used, and timecaf is not 
used => A rebuild of the overview data is needed.  In the absence of a 
dedicated migration program (for 3 overview methods), this can be done 
with "makehistory -O -x -F" when INN is *not* running.  The tricky part 
is that the existing overview data should be moved into another path 
(and empty buffers recreated if buffindexed is used) before running the 
command.  Should this be done in a special inn2 package uploaded to the 
archs affected by the 64-bit time_t transition?
- timecaf is used => The most complex scenario: we first need to convert 
timecaf files to a format with 64-bit time_t, and then if an overview 
method other than ovsqlite is used, rebuild the overview data like in 
the previous point.
Ideally, we would know whether we really have this last use case to 
handle...  It may turn out that no timecaf users will go through a 64-bit 
time_t transition, in which case there's no need to spend time on it!
Also, though I have not tested it, we have a "respool" program in the 
contrib directory which is said to "refile articles into the storage 
manager under the current storage.conf rules, deleting articles from 
their old place in the spool".  It could also be used to refile 
articles from timecaf to another storage method *before* the user 
upgrades to 64-bit time_t (but how would they know they have to do that 
before upgrading?).
Do not hesitate to tell me if I have missed something in these thoughts.
-- 
Julien ÉLIE
« The hardest thing is to go to sleep at night, when there are so many
   urgent things needing to be done.  A huge gap exists between what we
   know is possible with today's machines and what we have so far been
   able to finish. »  (Donald Knuth)