expvar.Int.Value appeared in Go 1.8, but we want to keep compatibility
with Go 1.7 at least until the next release.
So this patch replaces the use of expvar.Int.Value in tests, to make
them compatible with Go 1.7 again.
Currently, if we can't find a mail server for a domain, we consider that
a transient failure (semi-accidentally, as we iterate over the (empty)
list of MXs and fall through the list).
It should be treated as a permanent error, in line with other DNS
issues (which is not ideal but seems to be generally accepted
behaviour).
To avoid accidents/DoS when we are fetching a very very large policy,
this patch limits the size of the reads to 10k, which should be more
than enough for any reasonable policy as per the current draft.
The current tests stop short of fetching over HTTP, but that code is
unfortunately not trivial.
This patch changes the testing strategy to use a testing HTTP server,
which we point our URLs to. That way we can cover much more code with the
same tests.
This patch extends the SMTP courier to (optionally) do STS policy
checking when delivering mail.
As STS support is currently experimental, we gate this behind a flag and
is disabled by default.
This patch adds an on-disk cache for STS policies.
Policies are cached by domain, and stored on files in a single
directory. The files will have as mtime the time when the policy
expires, this makes the store simpler, as it can avoid keeping
additional metadata.
There is no in-memory caching. This may be added in the future, but for
now disk is good enough for our purposes.
This patch extends WriteFile to allow arbitrary operations to be applied
to the file before it is atomically renamed.
This will be used in upcoming patches to change the mtime of the file
before it is atomically renamed.
The "mx" field is required, a policy without it is invalid, so add a
check for it.
See
https://mailarchive.ietf.org/arch/msg/uta/Omqo1Bw6rJbrTMl2Zo69IJr35Qo
for more background, in particular the following paragraph:
> The "mx" field is required, so if it is missing, the policy is invalid
> and should not be honored. (It doesn't make sense to honor the policy
> anyway, I would say, since a policy without allowed MXs is essentially a
> way of saying, "There should be TLS and the server identity should match
> the MX, whatever the MX is." I guess this prevents SSL stripping, but
> doesn't prevent DNS injection, so it's of relatively little value.)
This EXPERIMENTAL patch has a basic implementation of MTA-STS (Strict
Transport Security), based on the current draft at
https://tools.ietf.org/html/draft-ietf-uta-mta-sts-02.
It integrates the policy fetching and checking into the smtp-check tool
for convenience, but not yet in chasquid itself.
This is a proof of concept. Many features and tests are missing; in
particular, there is no caching at all yet.
Currently, we pick the first host in the MX list, and attempt delivery
there. If it fails, we just report the failure to the queue, which will
wait for some time and then try again.
This is not ideal: we should fall back to the other MXs in the list, as
the first host could be having issues for a long time, and not
attempting with the rest just delays delivery.
This patch implements the fallback, so we try all MXs before deciding to
report a failed delivery (unless, of course, an MX returned a permanent
failure).
Currently there is no symbol for Fatal-level log entries, so they appear
with "-2", which is distracting.
This patch makes the Fatal log entries have their own symbol, ☠.
The server is written assuming there's at least one valid SSL/TLS
certificate. For example, it unconditionally advertises STARTTLS, and
only supports AUTH over TLS.
This patch makes the server fail to listen if there are no certificates
configured, so the users don't accidentally run an unsupported
configuration.
When testing, we don't want the server to do SPF lookups, as those cause
real DNS queries which can be problematic and add a dependency on the
environment.
This patch adds an internal boolean to disable the SPF lookups, which is
only set from the tests.
When building the Debian package, the tests are run in such a way that
test/util/exitcode is not found, which causes them to fail.
This patch is a workaround which skips the test if test/util/exitcode is
not found; for now we should consider it temporary, until either the
Debian packaging is fixed, or we decide that its environment is
reasonably enough to make it permanent.
:
Picking the domain used in the DSN message "From" header is more
complicated than it needs to be, causing confusing code paths and having
different uses for the hostname, which should be purely aesthetic.
This patch makes the queue pick the DSN "From" domain from the message
itself, by looking for a local domain in either the sender or the
original recipients. We should find at least one, otherwise it'd be
relaying.
This allows the code to be simplified, and we can narrow the scope of
the hostname option even further.
The loop test can be quite slow, specially on computers without
cryptography-friendly instructions.
This patch introduces a new flag for testing, so that we can bring the
threshold down to 5. The test is just as useful but now runs in a few
seconds, as opposed to a few minutes.
The queue IDs are internal to chasquid, and we don't need them to be
cryptographically secure, so this patch changes the ID generation to use
the PRNG.
This also helps avoid entropy issues.
glog works fine and has great features, but it does not play along well
with systemd or standard log rotators (as it does the rotation itself).
So this patch replaces glog with a new logging module "log", which by
default logs to stderr, in a systemd-friendly manner.
Logging to files or syslog is still supported.
This patch makes the hooks always have a complete set of environment
varuables, set to 0/1 or whatever is appropriate, to make it easier to
write the checks for them.
It is can be convenient for hooks to indicate that an error is
permanent; for example if the anti-virus found something.
This patch makes it so that if the hook exits with code 20, then it's
considered permanent. Otherwise it is considered transient, to help
prevent accidental errors cause final delivery issues.
The "ptr" mechanism should not be used, but unfortunately many domains
still do.
This patch extends our SPF implementation to support checking it.
https://tools.ietf.org/html/rfc7208#section-5.5
Calculating the next delay based on the previous delay causes daemon
restarts to start from scratch, as we don't persist it.
This can cause a few server restarts to generate many unnecessary sends.
This patch changes the next delay calculation to use the creation time
instead, and also adds a <=1m random perturbation to avoid all queued
emails to be retried at the exact same time after a restart.
TestAliases is unfortunately racy, and cause occasional failures.
This patch rewrites it to be more end-to-end, similar to TestBasic,
which should remove the races while keeping the main objectives of the
test.
The queue currently only considers failed recipients when deciding
whether to send a DSN or not. This is a bug, as recipients that time out
are not taken into account.
This patch fixes that issue by including both failed and pending
recipients in the DSN.
It also adds more comprehensive tests for this case, both in the queue
and in the dsn generation code.
The default INFO logs are more oriented towards debugging and can be
a bit too verbose when looking for high-level information.
This patch introduces a new "maillog" package, used to log messages of
particular relevance to mail transmission at a higher level.
This patch makes safeio preserve file ownership. This is specially
useful when using command-line utilities as root, but the files they
change are owned by a different user.
This patch implements a post-DATA hook, which is run after receiving the
data but before sending a reply.
It can be used to implement content filtering when receiving email, for
example for passing the email through an anti-spam or an anti-virus.
Unknown commands can fill the logs, traces and expvars with a lot of
noise; this patch sanitizes them a bit down to 6 bytes, as a compromise
to maintain some information for troubleshooting.
The submission port is expected to be used only by authenticated
clients, so this patch makes chasquid enforce this, which also helps
to reduce spam.
https://www.rfc-editor.org/rfc/rfc6409.txt
Today, we pick the domain used to send the DSN from based on what we
presented to the client at EHLO time, which itself may be based on the
TLS negotiation (which is not necessarily trusted).
This is complex, not necessarily correct, and involves passing the
domain around through the queue and persisting it in the items.
So this patch simplifies that handling by always using the main domain
as specified by the configuration.
This patch moves chasquid's Server and Conn structures to their own
smtpsrv package, to make chasquid.go a bit more readable. It also helps
clarify the relation between Server and Conn.
There are no functional changes.
Note that git can still track the history across this commit (e.g. git
gui blame shows the right data).
This patch introduces a new "domaininfo" package, which implements a
database with information about domains. In particular, it tracks
incoming and outgoing security levels.
That information is used in incoming and outgoing SMTP to prevent
downgrades.
This patch is the result of running go vet, go fmt -s and the linter,
and fixing some of the things they noted/suggested.
There shouldn't be any significant logic changes, it's mostly
readability improvements.
Errors can contain newlines, in particular this is common in messages
returned by remote SMTP servers.
This patch quotes them before logging, they don't interfere with the
log. Note the tracer itself can handle this just fine.
This patch simplifies the sending loop code:
- Move the recipient sending function from a closure to a method.
- Simplify the status update logic: we now update and write
unconditionally (as we should have been doing).
- Create a function for counting recipients in a given status.
It also adds a test for the removal of completed items from the queue,
which was not covered before and came up during development.
We should ignore the domains' case, and treat them uniformly, specially when it
comes to local domains.
This patch extends the existing normalization (IDNA, keeping domains as
UTF8 internally) to include case conversion and NFC form for
consistency.
This patch implements local username normalization using PRECIS
(https://tools.ietf.org/html/rfc7564,
https://tools.ietf.org/html/rfc7613)
It makes chasquid accept local email and authentication regardless of
the case. It covers both userdb and aliases.
Note that non-local usernames remain untouched.
The default for max events is 10, which is a bit short for a normal SMTP
exchange. Expand it to 30 which should be large enough to keep most of
the traces.
We don't want to write debug messages to the log, but having them in
traces is always useful.
The traces are volatile and self-cleaning so the additional volume
should not cause any problems, and helps troubleshooting.
While at it, fix the depth constant when logging.
The logic that picks the domain we use for EHLO does not need to live
within the TLS retry cycle, and makes it harder to understand.
This patch moves it out of the way, to improve readability.
This patch introduces expvar counters to chasquid and the queue
packages.
For now there's only a handful of counters, but they will be expanded in
future patches.
The test courier has a racy map access, and this started to manifest
after some recent changes. This patch fixes the race by implementing the
corresponding locks.