Facilitating installation and update of anti-spam filter
Jesse Rehmer
jesse.rehmer at blueworldhosting.com
Sun Jul 10 20:37:46 UTC 2022
> On Jul 10, 2022, at 1:34 PM, Julien ÉLIE <julien at trigofacile.com> wrote:
>
> Local patches should indeed be retained, I agree.
>
> Note that you would have the same problem if distributions provided inn2-cleanfeed packages... Any update would erase local changes (unless done in cleanfeed.local).
Until an integrated filter that is regularly maintained and battle-tested is introduced with INN, I’d rather not have anything mess with pyClean. I’ve found some of its filters too aggressive and have disabled several, which pyClean gives no easy way to do via config files. Cleanfeed has config options for disabling each of its checks, if I recall correctly. In my case filter_innd.py is heavily modified. Overall pyClean does its job, but I find the PHN/FSL filters reject a fair number of “valid” articles. I put quotes around valid because admittedly the articles I’m referring to are often part of flame wars and such, but they are not what I consider spam.
I can’t get pyClean to work with Python 3.8/3.9 and had to revert to using Python 2.7. It loads and reports it is hooked into INN but never does any filtering or logging. Looking at the GitHub repo, it has had a few updates in the last couple of years, but the majority of the code is 7 years old. Another hurdle for me was that Steve’s repo has zero installation instructions. When I first came across it I had no idea how to properly install and configure it. It wasn’t until I stumbled upon a fork whose maintainer took the time to provide a basic install guide that it was clear what to do.
If such filtering packages are going to be distributed with INN, hopefully they get more love and maintenance. I don’t know if this would improve performance, but I’d like the filter to run as an external process. During busy times innd with pyClean is using ~40-60% more CPU than another innd process handling the same feed with no filtering. I lack the skills to dive into whether this is due to the pyClean code, the Python hook, or a combination, but it uses quite a bit more CPU than Cleanfeed. My server isn’t that busy compared to others, but I could see it being unusable for those handling even a small fraction of binary groups. I only carry a few that are sporadic, but I can tell when binary articles are coming in by how much the CPU usage spikes on one of my spool machines. Having the filter run as a separate process would at least give me a better idea where to place blame and where to investigate for improvement.
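To illustrate the separate-process idea, here is a minimal sketch in Python. It is not how INN's Python hook or pyClean works today. The worker script, the "buy now" check, and the function name filter_in_child are all hypothetical stand-ins. The only point is that the CPU time of a child process can be accounted for separately from the parent (Unix only, via the standard resource module).

```python
import json
import resource
import subprocess
import sys

# Hypothetical filter worker: reads one JSON-encoded article per line on
# stdin and prints "reject" or "accept" for each. A stand-in, not pyClean.
WORKER_SRC = r'''
import json, sys
for line in sys.stdin:
    article = json.loads(line)
    print("reject" if "buy now" in article.lower() else "accept", flush=True)
'''


def filter_in_child(articles):
    """Run the toy filter in a child process.

    Returns (verdicts, cpu_seconds), where cpu_seconds is the user+system
    CPU time consumed by the child, measured separately from the parent.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    # json.dumps escapes embedded newlines, so each article stays on one line.
    payload = "".join(json.dumps(a) + "\n" for a in articles)
    result = subprocess.run(
        [sys.executable, "-c", WORKER_SRC],
        input=payload,
        capture_output=True,
        text=True,
        check=True,
    )
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return result.stdout.split(), cpu


if __name__ == "__main__":
    verdicts, cpu = filter_in_child(["BUY NOW cheap pills", "Re: INN release"])
    print(verdicts, f"child CPU: {cpu:.3f}s")
```

A real filter would presumably be a long-lived process fed over a pipe or socket rather than one batch per run, but the accounting would be the same: the filter's CPU shows up under its own PID instead of being folded into innd.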
Whatever is decided, let’s please not introduce another “Cleanfeed” into the world, where multiple iterations by various maintainers are strewn across the Internet and it is confusing for a beginner to work out which is the newest version, where to get it, who maintains it, etc.
> Nonetheless, we have configuration files of which news admins should take better care. How could they notably know there are moderating rules to change? (the fido7.* line in moderators should for instance point to @fido7.org and no longer @fido7.ru - I bet most news admins did not do the change)
> Same thing for control.ctl when a rule (and sometimes its associated key) changes...
>
> Which in fact raises the question of how to ease the administration of a news server and inform admins of changes they should have a look at?
Good question. This has been a point of struggle for me after putting down INN/Diablo/NNTP for some years and picking things up again. This area of administration is a mess across the entirety of Usenet. INN’s maintainers have seemingly been the only entity attempting to keep the usefulness of control messages alive. A common method for updating control.ctl and the filters, or for notifying admins of updates to them, would be a good start. I try to keep this stuff up-to-date, but it can be unclear what counts as “up-to-date” and where the proper place to look for changes is.
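As one possible shape for such a notification, here is a rough Python sketch that reports which effective (non-comment, non-blank) lines differ between a local config file and a reference copy. Nothing like this ships with INN as far as this thread indicates. The function name summarize_changes is made up, and the sample moderators lines are only illustrative of the fido7 change mentioned above, not copied from the real file.

```python
import difflib


def summarize_changes(local_text, upstream_text, name="control.ctl"):
    """Return unified-diff lines for the effective content of two copies.

    Comments and blank lines are ignored, so only rule changes are reported.
    """
    def effective(text):
        return [
            line.strip()
            for line in text.splitlines()
            if line.strip() and not line.lstrip().startswith("#")
        ]

    return list(
        difflib.unified_diff(
            effective(local_text),
            effective(upstream_text),
            fromfile=f"local/{name}",
            tofile=f"upstream/{name}",
            lineterm="",
            n=0,  # no context lines: show only what changed
        )
    )


if __name__ == "__main__":
    # Illustrative content only; the exact line format is an assumption.
    local = "# my local copy\nfido7.*:%s@fido7.ru\n"
    upstream = "# reference copy\nfido7.*:%s@fido7.org\n"
    for line in summarize_changes(local, upstream, "moderators"):
        print(line)
```

Run from cron against a reference copy fetched by whatever means the admin trusts, an empty result would mean nothing needs attention, and a non-empty one could be mailed to the news admin for review rather than applied automatically, which keeps local changes intact.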
Regards,
Jesse Rehmer