Are you curious about SpamAssassin’s sa-update tool and what it does? As with many other programs geared towards servers, SpamAssassin ships with additional tools that are run from cron jobs and used by administrators. Knowing what these tools do and how they work can help you better understand your server and fix issues down the line.
The sa-update tool pulls new configuration files and rules from channels. SpamAssassin uses these files to classify emails as spam, in addition to its Naive Bayes filtering. Among them are definitions of free email providers, regex checks on message subjects and bodies, and more.
These files are stored in the /var/lib/spamassassin/<MAJOR>.<MINOR><PATCH> directory. Do not edit these files directly; changes should be made only in your
An sa-update channel is a remote source from which sa-update gets the new configuration files. If your company’s server security policy doesn’t allow this, you should disable the SpamAssassin cron job and either run a private channel or manually review the channel data and apply it to your server.
The --channel option specifies which channel to download new rules from. The default channel for many installations is updates.spamassassin.org. If you run lots of SpamAssassin servers and want an easy way to update rules for all of them, you can run your own channel and distribute the changes among your servers.
The way sa-update and channels interact is a bit unusual and relies partially on DNS queries. A channel can either serve the configuration files itself or point to a list of mirrors that host the configuration files.
First, a TXT DNS request is made to the channel with the major, minor, and patch version numbers of SpamAssassin in reverse order as subdomains. For version 3.4.2, for example, the query name is 2.4.3.updates.spamassassin.org. The response is the latest version number of the configuration files.
$ dig +short txt 2.4.3.updates.spamassassin.org
"1884121"
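The name used in that query can be built mechanically from the installed version. Below is a minimal sketch of that step; the function name sa_version_query_name is hypothetical, and the version 3.4.2 is only inferred from the 2.4.3 prefix in the example above.

```shell
# Build the DNS name that is queried for a channel's latest rule version.
# Hypothetical helper, not part of SpamAssassin.
# Usage: sa_version_query_name <sa-version> <channel>
sa_version_query_name() {
    # Split the version on dots and emit the fields last to first,
    # so that "3.4.2" becomes "2.4.3".
    reversed=$(printf '%s\n' "$1" | awk -F. '{
        out = $NF
        for (i = NF - 1; i >= 1; i--) out = out "." $i
        print out
    }')
    # Append the channel name to form the full query name.
    printf '%s.%s\n' "$reversed" "$2"
}
```

With the function loaded in your shell, the lookup shown above could be issued as:
$ dig +short txt "$(sa_version_query_name 3.4.2 updates.spamassassin.org)"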
You can verify your installed version by looking for the line # UPDATE version <version> in the file named after the channel URL, e.g. updates_spamassassin_org.cf, inside /var/lib/spamassassin/<MAJOR>.<MINOR><PATCH>. sa-update does this comparison automatically, though, so you normally don’t need to worry about it.
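If you do want to check by hand, the version line can be pulled out of the channel file with sed. This is a small sketch; the function name sa_local_update_version is hypothetical, and it assumes the version is a plain number as in the dig response above.

```shell
# Print the rule version recorded in a channel's .cf file.
# Hypothetical helper, not part of SpamAssassin.
# Usage: sa_local_update_version <path-to-channel.cf>
sa_local_update_version() {
    # Match the "# UPDATE version <version>" line and print only the number.
    # Assumes the version is numeric; only the first match is reported.
    sed -n 's/^# UPDATE version \([0-9][0-9]*\).*$/\1/p' "$1" | head -n 1
}
```

Comparing this value with the TXT record returned by the channel tells you whether a newer rule set is available.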
Next, the list of mirrors has to be resolved with another TXT DNS query. The DNS response is a URL to a MIRRORED.BY file. The file lists one mirror per line in the format http(s)://<mirror> weight=<weight>. These mirrors are used to download the new configuration files.
$ dig +short txt mirrors.updates.spamassassin.org
"http://spamassassin.apache.org/updates/MIRRORED.BY"
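To make the file format concrete, here is a rough sketch that reads MIRRORED.BY content on standard input and prints each mirror with its weight. The function name sa_list_mirrors is hypothetical. Treating a missing weight as 1 and skipping lines that start with # are assumptions, and the sorting is only for readability, not a claim about how sa-update chooses a mirror.

```shell
# Read MIRRORED.BY content on stdin and print "<weight> <url>" per mirror,
# highest weight first. Hypothetical helper, not part of SpamAssassin.
sa_list_mirrors() {
    awk '
        /^[ \t]*#/ || NF == 0 { next }    # skip comment and blank lines (assumed)
        {
            weight = 1                    # assumed default when weight= is absent
            for (i = 2; i <= NF; i++)
                if ($i ~ /^weight=[0-9]+$/) weight = substr($i, 8)
            print weight, $1
        }
    ' | sort -k1,1nr
}
```

Assuming curl is installed, the list published by the default channel could be inspected with:
$ curl -s http://spamassassin.apache.org/updates/MIRRORED.BY | sa_list_mirrors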
The program now tries a mirror to download the new configuration files. If a mirror fails, sa-update moves on to the next one. The files that are downloaded are
Once the archive and checksum are verified, the archive is extracted into a directory representing the channel, e.g. updates_spamassassin_org, in the directory
These new files aren’t used until you restart SpamAssassin. The cron job, on the other hand, restarts the service automatically.
SpamAssassin Cron Job
The sa-update tool generally isn’t called manually by administrators. Instead, the tool lives in a daily cron job for SpamAssassin. This cron job is disabled by default, but enabling it is recommended so that your server gets the latest rules from SpamAssassin.
To enable the cron job, set the environment variable CRON to any value other than 0 inside the SpamAssassin environment file. This environment file is loaded by systemd and by the cron.daily cron job. On Debian systems, the file is /etc/default/spamassassin.
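The rule "any value other than 0 enables the job" can be checked with a few lines of shell. The sketch below is not the packaged cron script; the function name sa_cron_enabled is hypothetical, and it assumes the environment file contains plain shell assignments such as CRON=1 and that an unset CRON means disabled.

```shell
# Report (via exit status) whether an environment file enables the daily job.
# Hypothetical helper, not part of SpamAssassin.
# Usage: sa_cron_enabled <env-file>
sa_cron_enabled() {
    # Run in a subshell so the sourced variables do not leak into the caller.
    (
        CRON=0                              # assumed default: disabled
        case $1 in
            */*) env_file=$1 ;;
            *)   env_file=./$1 ;;           # keep "." from searching PATH
        esac
        [ -r "$env_file" ] && . "$env_file"
        [ "$CRON" != "0" ]
    )
}
```

Called with the path of the environment file described above, the function returns success when the job is enabled, so it can be used directly in an if statement.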
SA-Update Without Channels
You can avoid using channels by providing sa-update with a .tar.gz archive to be installed. This option works well if you have made lots of modifications to the rules and want to apply them to multiple servers, if your server security policy doesn’t allow for remote configuration updates, or if your SpamAssassin servers don’t have HTTP or DNS access.
Instead of calling sa-update with the --channel <channel> option, you would use --install <file>. The archive is in the same format as those downloaded from channels; sa-update simply reads it from a local file instead.
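As a rough illustration, a local install can be wrapped in a small function that checks the archive first and supports a dry run. The function name sa_install_archive and the DRY_RUN variable are hypothetical. Depending on your version, sa-update may also expect checksum or signature files next to the archive, so check its documentation before relying on this sketch.

```shell
# Install rules from a local archive instead of a channel.
# Hypothetical wrapper; set DRY_RUN=1 to print the command without running it.
# Usage: sa_install_archive <archive.tar.gz>
sa_install_archive() {
    archive=$1
    # Refuse to continue if the archive is missing or unreadable.
    if [ ! -r "$archive" ]; then
        echo "cannot read archive: $archive" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "sa-update --install $archive"
    else
        sa-update --install "$archive"
    fi
}
```

Running it once with DRY_RUN=1 on each server is a cheap way to confirm the archive path before the rules are actually replaced.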