Next Previous Contents

20. httpd-accelerator mode

20.1 What is the httpd-accelerator mode?

Occasionally people have trouble understanding accelerators and proxy caches, usually resulting from mixed up interpretations of "incoming" and ``outgoing" data. I think in terms of requests (i.e., an outgoing request is from the local site out to the big bad Internet). The data received in reply is incoming, of course. Others think in the opposite sense of ``a request for incoming data".

An accelerator caches incoming requests for outgoing data (i.e., that which you publish to the world). It takes load away from your HTTP server and internal network. You move the server away from port 80 (or whatever your published port is), and substitute the accelerator, which then pulls the HTTP data from the ``real" HTTP server (only the accelerator needs to know where the real server is). The outside world sees no difference (apart from an increase in speed, with luck).

Quite apart from taking the load of a site's normal web server, accelerators can also sit outside firewalls or other network bottlenecks and talk to HTTP servers inside, reducing traffic across the bottleneck and simplifying the configuration. Two or more accelerators communicating via ICP can increase the speed and resilience of a web service to any single failure.

The Squid redirector can make one accelerator act as a single front-end for multiple servers. If you need to move parts of your filesystem from one server to another, or if separately administered HTTP servers should logically appear under a single URL hierarchy, the accelerator makes the right thing happen.

If you wish only to cache the ``rest of the world" to improve local users browsing performance, then accelerator mode is irrelevant. Sites which own and publish a URL hierarchy use an accelerator to improve other sites' access to it. Sites wishing to improve their local users' access to other sites' URLs use proxy caches. Many sites, like us, do both and hence run both.

Measurement of the Squid cache and its Harvest counterpart suggest an order of magnitude performance improvement over CERN or other widely available caching software. This order of magnitude performance improvement on hits suggests that the cache can serve as an httpd accelerator, a cache configured to act as a site's primary httpd server (on port 80), forwarding references that miss to the site's real httpd (on port 81).

In such a configuration, the web administrator renames all non-cachable URLs to the httpd's port (81). The cache serves references to cachable objects, such as HTML pages and GIFs, and the true httpd (on port 81) serves references to non-cachable objects, such as queries and cgi-bin programs. If a site's usage characteristics tend toward cachable objects, this configuration can dramatically reduce the site's web workload.

Note that it is best not to run a single squid process as both an httpd-accelerator and a proxy cache, since these two modes will have different working sets. You will get better performance by running two separate caches on separate machines. However, for compatability with how administrators are accustomed to running other servers that provide both proxy and Web serving capability (eg, CERN), the Squid supports operation as both a proxy and an accelerator if you set the httpd_accel_with_proxy variable to on inside your squid.conf configuration file.

20.2 How do I set it up?

First, you have to tell Squid to listen on port 80 (usually), so set the 'http_port' option:

        http_port 80

Next, you need to move your normal HTTP server to another port and/or another machine. If you want to run your HTTP server on the same machine, then it can not also use port 80 (except see the next FAQ entry below). A common choice is port 81. Configure squid as follows:

        httpd_accel_host localhost
        httpd_accel_port 81
Alternatively, you could move the HTTP server to another machine and leave it on port 80:
        httpd_accel_host otherhost.foo.com
        httpd_accel_port 80

You should now be able to start Squid and it will serve requests as a HTTP server.

If you are using Squid has an accelerator for a virtual host system, then you need to specify

        httpd_accel_host virtual

Finally, if you want Squid to also accept proxy requests (like it used to before you turned it into an accelerator), then you need to enable this option:

        httpd_accel_with_proxy on

20.3 When using an httpd-accelerator, the port number for redirects is wrong

Yes, this is because you probably moved your real httpd to port 81. When your httpd issues a redirect message (e.g. 302 Moved Temporarily), it knows it is not running on the standard port (80), so it inserts :81 in the redirected URL. Then, when the client requests the redirected URL, it bypasses the accelerator.

How can you fix this?

One way is to leave your httpd running on port 80, but bind the httpd socket to a specific interface, namely the loopback interface. With Apache you can do it like this in httpd.conf:

        Port 80
        BindAddress 127.0.0.1
Then, in your squid.conf file, you must specify the loopback address as the accelerator:
        httpd_accel_host 127.0.0.1
        httpd_accel_port 80

Note, you probably also need to add an /etc/hosts entry of 127.0.0.1 for your server hostname. Otherwise, Squid may get stuck in a forwarding loop.


Next Previous Contents