TestAliases is unfortunately racy, and cause occasional failures.
This patch rewrites it to be more end-to-end, similar to TestBasic,
which should remove the races while keeping the main objectives of the
test.
The queue currently only considers failed recipients when deciding
whether to send a DSN or not. This is a bug, as recipients that time out
are not taken into account.
This patch fixes that issue by including both failed and pending
recipients in the DSN.
It also adds more comprehensive tests for this case, both in the queue
and in the dsn generation code.
The default INFO logs are more oriented towards debugging and can be
a bit too verbose when looking for high-level information.
This patch introduces a new "maillog" package, used to log messages of
particular relevance to mail transmission at a higher level.
Today, we pick the domain used to send the DSN from based on what we
presented to the client at EHLO time, which itself may be based on the
TLS negotiation (which is not necessarily trusted).
This is complex, not necessarily correct, and involves passing the
domain around through the queue and persisting it in the items.
So this patch simplifies that handling by always using the main domain
as specified by the configuration.
This patch is the result of running go vet, go fmt -s and the linter,
and fixing some of the things they noted/suggested.
There shouldn't be any significant logic changes, it's mostly
readability improvements.
This patch simplifies the sending loop code:
- Move the recipient sending function from a closure to a method.
- Simplify the status update logic: we now update and write
unconditionally (as we should have been doing).
- Create a function for counting recipients in a given status.
It also adds a test for the removal of completed items from the queue,
which was not covered before and came up during development.
This patch introduces expvar counters to chasquid and the queue
packages.
For now there's only a handful of counters, but they will be expanded in
future patches.
The test courier has a racy map access, and this started to manifest
after some recent changes. This patch fixes the race by implementing the
corresponding locks.
This patch reviews various debug and informational messages, making more
uniform use of tracing, and extends the monitoring http server with
useful information like an index and a queue dump.
If there's an alias to forward email to a non-local domain, using the original
From is problematic, as we may not be an authorized sender for it.
Some MTAs (like Exim) will do it anyway, others (like gmail) will construct a
special address based on the original address.
This patch implements the latter approach, which is safer and allows the
receiver to properly enforce SPF.
We construct a (hopefully) reasonable From based on the local user, and
embedding the original From (but transformed for IDNA, as the receiver may not
support SMTPUTF8).
This patch performs some minor cleanups for things detected by "go vet":
- Remove one line of unreachable code.
- Don't leak contexts until their deadline expires, cancel them.
When we permanently failed to deliver to one or more recipients, send delivery
status notifications back to the sender.
To do this, we need to extend a couple of internal structures, to keep track
of the original destinations (so we can include them in the message, for
reference), and the hostname we're identifying ourselves as (this is arguable
but we're going with it for now, may change later).
This patch makes the queue and couriers distinguish between permanent and
transient errors when delivering mail to individual recipients.
Pipe delivery errors are always permanent.
Procmail delivery errors are almost always permanent, except if the command
exited with code 75, which is an indication of transient.
SMTP delivery errors are almost always transient, except if the DNS resolution
for the domain failed.
This patch tidies up the Procmail courier:
- Move the configuration options to the courier instance, instead of using
global variables.
- Implement more useful string replacement options.
- Use exec.CommandContext for running the command with a timeout.
As a consequence of the first item, the queue now takes the couriers via its
constructor.
With the introduction of aliases, the queue may now be delivering mail to
pipes. This patch implements pipe delivery.
It uses a fixed 30s timeout for now, as these commands should really not take
much time, and we don't want to overly complicate the configuration for now.
This patch makes the queue read and write items to disk.
It uses protobuf for serialization. We serialize to text format to make
manual troubleshooting easier, as the performance difference is not very
relevant for us.
This patch does various minor style and simplification cleanups, fixing things
detected by tools such as go vet, gofmt -s, and golint.
There are no functional changes, this change is purely cosmetic, but will
enable us to run those tools more regularly now that their output is clean.
The routing courier is a nice idea in theory, but at least for now, we want
the queue to be aware of when a destination is local so we can implement
differentiated logic.
This may change in the future, though, but at the moment it's not clear that
the abstractions will be worth it.
So this patch removes it, and makes the queue do the routing. There is no
difference in how the two are handled yet, those will come in subsequent
patches.
This patch introduces incremental delays in the queue, so retries are not all
done with the same delay.
The table is static; we could perturb it but there's not that much benefit
anyway, at least for now.
This patch introduces the couriers, which the queue uses to deliver mail.
We have a local courier (using procmail), a remote courier (uses SMTP), and a
router courier that decides which of the two to use based on a list of local
domains.
There are still a few things pending, but they all have their basic
functionality working and tested.
This patch introduces a basic, in-memory queue that only holds emails for now.
This slows down the benchmarks because we don't yet have a way to wait for
delivery (even if fake), that will come in later patches.