[kea-dev] Kea's database schema versioning in practice

Mon Sep 5 16:33:44 UTC 2016

On 05.09.2016 13:04, Ola Thoresen wrote:
> >>
> >
> > Oddly enough, nobody has any opinions. Come on, stating that we have
> > been doing that right is also a good feedback.
> >
>
>
>
> I don't have any strong opinions, and I have not read the code, so I
> might be wrong, but I have a few questions which might lead to others
> having opinions.
>
> a) Are you limited to just two parts in the schema version numbering?
> - To me it makes sense to use a two part ("two digit") version for the
> releases, and use another number for intermediate versions.  So kea 1.0
> is released with schema x.y and all updates to the schema until the next
> release is numbered x.y.z
> This will not bloat the schema version too much, as all the minor
> updates don't count as soon as a new version of kea is released.
>

Ola,

Currently we use major and minor version number of the schema. So, only
x.y. It is not technically hard to add revision "z" to this if there is
sufficient demand for versioning between releases. One problem I see
with this approach, though, is that you have to tag each change to the
SQL scripts with a version number (namely, value of "z"). Upgrade
scripts are incremental, which means that if you make a change like
adding new column into an SQL table and later you decide to rename it,
you have to keep both statements in your upgrade scripts, one that adds
the column, second one that renames it. If you didn't version the schema
with "z" granularity, you could simply have one statement (between two
releases) that adds a column with the target name.

This is obviously only an example, but there will be similar issues as
we extend the databases to hold much more data than they hold today. And
this will certainly bloat the upgrade scripts. The level of "bloating"
is debatable. The major question is, again, whether people feel that
being able to upgrade between intermediate verions of schema is a must.

> b) Is there an effort going on to synchronize the schema versions
> between the different db-platforms?  The difference between IE pgsql and
> mysql should not be THAT big - and that PG is using schema version 3 and
> MySQL is already in version 5 looks a bit strange to me.
> The schema version should reflect the actual contents of the schema
> (what columns and indexes and so on that they contain) and if they are
> more or less "on pair" between the DBs the version should also be more
> or less the same.
> - Of course, it can be that PG is lagging behind MySQL (I have not read
> the code) but this sounds like an issue that should be prioritized, as
> working with code that needs to behave vastly different based on what
> DB-engine is running in the background sounds like a big problem in the
> long run.
>

Yeah, I see your point. I think this would be more informative if we had
the same or nearly the same version number for the two schemas that
contain the same data set and support the same features.

The reason for the discrepancy today is that, historically, PostgreSQL
backend implementation was lagging behind MySQL. We had many more
revisions of MySQL schema and finally we updated Postgres within a
single release.

Our general feeling is that we should make sure that MySQL and Postgres
have the same feature set and the schema version numbers should mostly
match. But, I am not sure we can guarantee that this will always be the
case. Maybe we should try that and see how it goes.

Marcin Siodelski
ISC