Postfix Performance Tuning


Purpose of Postfix performance tuning

The hints and tips in this document help you improve the performance of Postfix systems that already work. If your Postfix system is unable to receive or deliver mail, then you need to solve those problems first, using the DEBUG_README document as guidance.

For tuning external content filter performance, first read the respective information in the FILTER_README and SMTPD_PROXY_README documents. Then make sure to avoid latency in the content filter code. As much as possible avoid performing queries against external data sources with a high or highly variable delay. Your content filter will run with a small concurrency to avoid CPU/memory starvation, and if any latency creeps in, content filter throughput will suffer. High volume environments should avoid RBL lookups, complex database queries and so on.

The following tools can be used to measure mail system performance under artificial loads. They are normally not installed with Postfix.

General mail receiving performance tips

When Postfix responds slowly to SMTP clients:

Doing more work with your SMTP server processes

With Postfix versions 2.0 and earlier, the smtpd(8) server pauses before reporting an error to an SMTP client. The idea is called tar pitting. However, these delays also slow down Postfix. When the smtpd(8) server replies slowly, sessions take more time, so that more smtpd(8) server processes are needed to handle the load. When your Postfix smtpd(8) server process limit is reached, new clients must wait until a server process becomes available. This means that all clients experience poor performance.

You can speed up the handling of smtpd(8) server error replies by turning off the delay:

/etc/postfix/main.cf:
    # Not needed with Postfix 2.1
    smtpd_error_sleep_time = 0

With the above setting, Postfix 2.0 and earlier can serve more SMTP clients with the same number of SMTP server processes. The next section describes how Postfix deals with clients that make a large number of errors.

Slowing down SMTP clients that make many errors

The Postfix smtpd(8) server maintains a per-session error count. The error count is reset when a message is transferred successfully, and is incremented when a client request is unrecognized or unimplemented, when a client request violates access restrictions, or when some other error happens.

As the per-session error count increases, the smtpd(8) server changes behavior and begins to insert delays into the responses. The idea is to slow down a run-away client in order to limit resource usage. The behavior is Postfix version dependent.

IMPORTANT: These delays slow down Postfix, too. When too much delay is configured, the number of simultaneous SMTP sessions will increase until it reaches the smtpd(8) server process limit, and new SMTP clients must wait until an smtpd(8) server process becomes available.

Postfix version 2.1 and later:

Postfix version 2.0 and earlier:

Measures against clients that make too many connections

Note: this feature is not included with Postfix version 2.1.

The Postfix smtpd(8) server can limit the number of simultaneous connections from the same SMTP client, as well as the number of connections that a client is allowed to make per unit time. These statistics are maintained by the anvil(8) server (translation: if anvil(8) breaks, then connection limits stop working).

IMPORTANT: These limits are designed to protect the smtpd(8) server against flagrant abuse. Do not use these limits to regulate legitimate traffic: mail will suffer grotesque delays if you do so.
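
As an illustration only, the two kinds of limit described above might be configured as follows. The parameter names are the stock Postfix ones but are not named in the text above, and the values are arbitrary examples rather than recommendations; check the postconf(5) manual for the defaults and availability in your Postfix version.

/etc/postfix/main.cf:
    # Simultaneous connections allowed from the same SMTP client.
    smtpd_client_connection_count_limit = 50
    # New connections allowed from the same SMTP client per time unit.
    smtpd_client_connection_rate_limit = 30
    # The time unit over which the anvil(8) server computes client rates.
    anvil_rate_time_unit = 60s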

General mail delivery performance tips

Tuning the number of simultaneous deliveries

Although Postfix can be configured to run 1000 SMTP client processes at the same time, it is rarely desirable that it makes 1000 simultaneous connections to the same remote system. For this reason, Postfix has safety mechanisms in place to avoid this so-called "thundering herd" problem.

The Postfix queue manager implements the analog of the TCP slow start flow control strategy: when delivering to a site, send a small number of messages first, then increase the concurrency as long as all goes well; reduce concurrency in the face of congestion.

Examples of transport specific concurrency limits are:

The above default values of the concurrency limits work well in a broad range of situations. Knee-jerk changes to these parameters in the face of congestion can actually make problems worse. Specifically, large destination concurrencies should never be the default. They should be used only for transports that deliver mail to a small number of high volume domains.

A common situation where high concurrency is called for is on gateways relaying a high volume of mail between the Internet and an intranet mail environment. Approximately half the mail (assuming equal volumes inbound and outbound) will be destined for the internal mail hubs. Since the internal mail hubs will be receiving all external mail exclusively from the gateway, it is reasonable to configure the gateway to make greater demands on the capacity of the internal SMTP servers.

The tuning of the inbound concurrency limits need not be trial and error. A high volume capable mailhub should be able to easily handle 50 or 100 (rather than the default 20) simultaneous connections, especially if the gateway forwards to multiple MX hosts. When all MX hosts are up and accepting connections in a timely fashion, throughput will be high. If any MX host is down and completely unresponsive, the average connection latency rises to at least 1/N * $smtp_connect_timeout, if there are N MX hosts. This limits throughput to at most the destination concurrency * N / $smtp_connect_timeout.

For example, with a destination concurrency of 100 and 2 MX hosts, each host will handle up to 50 simultaneous connections. If one MX host is down and the default SMTP connection timeout is 30s, the throughput limit is 100 * 2 / 30 ~= 6.7 messages per second. This suggests that high volume destinations with good connectivity and multiple MX hosts need a lower connection timeout; values as low as 5s or even 1s can be used to prevent congestion when one or more, but not all, MX hosts are down.
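
Applying the same formula to the lower timeouts mentioned above gives the following rough upper bounds. This is only a worked illustration of destination concurrency * N / connection timeout for the case described above; real throughput also depends on factors that the formula ignores.

    concurrency   MX hosts (N)   timeout   throughput limit
    100           2              30s       100 * 2 / 30 ~= 6.7 messages per second
    100           2              5s        100 * 2 / 5   = 40 messages per second
    100           2              1s        100 * 2 / 1   = 200 messages per second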

If necessary, set a higher transport_destination_concurrency_limit (in main.cf since this is a queue manager parameter) and a lower smtp_connect_timeout (with a "-o" override in master.cf since this parameter has no per-transport name) for the relay transport and any transports dedicated to specific high volume destinations.
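
A minimal sketch of such a setup for the relay transport, assuming a destination concurrency of 100 and a connection timeout of 5s (both values are taken from the discussion above and are not recommendations). The SMTP client parameter that sets the connection timeout is smtp_connect_timeout. The master.cf fields other than the "-o" override are assumptions; keep whatever your existing relay entry specifies.

/etc/postfix/main.cf:
    # Queue manager parameter: destination concurrency for the relay transport.
    relay_destination_concurrency_limit = 100

/etc/postfix/master.cf:
    # ====================================================================
    # service type  private unpriv  chroot  wakeup  maxproc command + args
    #               (yes)   (yes)   (yes)   (never) (100)
    # ====================================================================
    . . .
    relay     unix  -       -       -       -       -       smtp
        -o smtp_connect_timeout=5s
    . . .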

Tuning the number of recipients per delivery

The default_destination_recipient_limit parameter (default: 50) controls how many recipients a Postfix delivery agent will send with each copy of an email message. You can override this setting for specific Postfix delivery agents. For example, "uucp_destination_recipient_limit = 100" would limit the number of recipients per UUCP delivery to 100.

If an email message exceeds the recipient limit for some destination, the Postfix queue manager breaks up the list of recipients into smaller lists. Postfix will attempt to send multiple copies of the message in parallel.
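
For illustration, the settings quoted above would look like this in main.cf. The recipient counts in the note that follows are a hypothetical example of the splitting described above, not measured behavior.

/etc/postfix/main.cf:
    # At most 50 recipients per message delivery (the default).
    default_destination_recipient_limit = 50
    # Override for the uucp delivery agent only.
    uucp_destination_recipient_limit = 100

With these settings, a message with 120 recipients at one destination, delivered through a transport that uses the default limit, would go out as at least three copies, each with no more than 50 recipients.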

IMPORTANT: Be careful when increasing the recipient limit per message delivery; some SMTP servers abort the connection when they run out of memory or when a hard recipient limit is reached, so that the message will never be delivered.

The smtpd_recipient_limit parameter (default: 1000) controls how many recipients the Postfix smtpd(8) server will take per delivery. The default limit is more than any reasonable SMTP client would send. The limit exists to protect the local mail system against a run-away client.

Tuning the frequency of deferred mail delivery attempts

When a Postfix delivery agent (smtp(8), local(8), etc.) is unable to deliver a message it may blame the message itself, or it may blame the receiving party.

This process is governed by a bunch of little parameters.

queue_run_delay (default: 1000 seconds)
    How often the queue manager scans the queue for deferred mail.

minimal_backoff_time (default: 1000 seconds)
    The minimal amount of time a message won't be looked at, and the minimal amount of time to stay away from a "dead" destination.

maximal_backoff_time (default: 4000 seconds)
    The maximal amount of time a message won't be looked at after a delivery failure.

maximal_queue_lifetime (default: 5 days)
    How long a message stays in the queue before it is sent back as undeliverable. Specify 0 for mail that should be returned immediately after the first unsuccessful delivery attempt.

bounce_queue_lifetime (default: 5 days, available with Postfix version 2.1 and later)
    How long a MAILER-DAEMON message stays in the queue before it is considered undeliverable. Specify 0 for mail that should be tried only once.

qmgr_message_recipient_limit (default: 20000)
    The size of many in-memory queue manager data structures. Among others, this parameter limits the size of the short-term, in-memory list of "dead" destinations. Destinations that don't fit the list are not added.
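
Written out as a main.cf fragment, the defaults listed above look as follows. Setting a parameter to its default value changes nothing; the fragment only shows the syntax, with time values given a unit suffix (s for seconds, d for days).

/etc/postfix/main.cf:
    # Values below are the defaults quoted above, not tuning advice.
    queue_run_delay = 1000s
    minimal_backoff_time = 1000s
    maximal_backoff_time = 4000s
    maximal_queue_lifetime = 5d
    bounce_queue_lifetime = 5d
    qmgr_message_recipient_limit = 20000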

IMPORTANT: If you increase the frequency of deferred mail delivery attempts, or if you flush the deferred mail queue frequently, then you may find that Postfix mail delivery performance actually becomes worse. The symptoms are as follows:

When mail is being deferred frequently, fixing the problem is always better than increasing the frequency of delivery attempts. However, if you can control only the delivery attempt frequency, consider using a dedicated fallback_relay "graveyard" machine for bad destinations so that they do not ruin the performance of normal mail deliveries.
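
A sketch of such a setup, assuming a hypothetical graveyard host named graveyard.example.com. The host name is a placeholder; newer Postfix releases name this parameter smtp_fallback_relay, so check the postconf(5) manual for your version.

/etc/postfix/main.cf:
    # Hand mail to this host when the normal destination cannot be reached.
    # graveyard.example.com is a placeholder, not a real host.
    fallback_relay = graveyard.example.com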

Tuning the number of Postfix processes

The default_process_limit configuration parameter gives direct control over how many daemon processes Postfix will run. As of Postfix 2.0 the default limit is 100 smtp client processes, 100 smtp server processes, and so on. This may overwhelm systems with little memory, as well as networks with low bandwidth.

You can change the global process limit by specifying a non-default default_process_limit in the main.cf file. For example, to run up to 10 smtp client processes, 10 smtp server processes, and so on:

/etc/postfix/main.cf:
    default_process_limit = 10

You need to execute "postfix reload" to make the change effective. The limits are enforced by the Postfix master(8) daemon, which does not automatically read main.cf when it changes.

You can override the process limit for specific Postfix daemons by editing the master.cf file. For example, if you do not wish to receive 100 SMTP messages at the same time, but do not want to change the process limits for local mail deliveries, you could specify:

/etc/postfix/master.cf:
    # ====================================================================
    # service type  private unpriv  chroot  wakeup  maxproc command + args
    #               (yes)   (yes)   (yes)   (never) (100)
    # ====================================================================
    . . .
    smtp      inet  n       -       -       -       10      smtpd
    . . .

Tuning the number of open files or sockets

When Postfix opens too many files or sockets, processes will abort with fatal errors, and the system may log "file table full" errors.