Although native Linux networking allows you to set up your Linux server as an Internet firewall and gateway for a number of machines on a network, utilizing a caching proxy server can help reduce your bandwidth usage, as well as give you enhanced logging and filtering capabilities.
The Squid Web Proxy Cache is a popular, free implementation of such a server, and it runs on most Unix systems. The Squid homepage is at http://www.squid-cache.org/. Squid can cache HTTP and FTP objects as well as DNS lookups, improving a shared Internet connection by storing frequently accessed data on the local network.
Getting and Installing Squid
Downloads are available from the Squid homepage, either as binary files or source tarballs. The stable version as of this writing is version 2.3. There is also a user guide at http://squid-docs.sourceforge.net/latest/html/book1.htm.
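If you build from source, compilation is straightforward. A basic installation with the default options would go something like this: unpack the tarball with the command below, then run ./configure, make, and make install from the extracted source directory:
tar -xzf squid-2.3-200101270000-src.tar.gz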
If you want to explore the options available at compile time, type:
./configure --help
A number of switchable options are available to control where Squid installs itself, memory usage, and default language, among others.
If you've installed Squid using the defaults, the configuration file can be found at /usr/local/squid/etc/squid.conf.
The first option you will see in this file is http_port. By default, Squid uses port 3128. To listen on a different port or ports, define them as follows:
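As a hedged illustration, a squid.conf line that moves Squid to another port might look like the sketch below. Port 8080 is only an assumed example value, and the way several ports are listed has differed between Squid releases, so check the comments in your own squid.conf before listing more than one.

```conf
# Listen for client requests on port 8080 instead of the default 3128
# (8080 is an illustrative choice, not a recommendation)
http_port 8080
```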
Another important item is the amount of memory allocated to the cache. The directive must be defined in multiples of 4KB. The default is 8MB:
cache_mem 8 MB
Squid caches DNS lookups as well, which saves both time and bandwidth. The default setting is 1024 entries, and is controlled by the following line:
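In the stock squid.conf of Squid 2.x, the directive in question is ipcache_size. A minimal sketch, using the default value mentioned above:

```conf
# Number of entries kept in the IP (DNS lookup) cache
ipcache_size 1024
```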
By default, Squid stores the cached data in /usr/local/squid/cache. The cache_dir directive controls the filesystem type, the directory used, the allowed size in MB, and the number of first- and second-level subdirectories:
cache_dir ufs /usr/local/squid/cache/ 100 16 256
Logging is done in /var/log/squid/access.log and /var/log/squid/cache.log. Other directives control where these logs are placed, and the level of logging:
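As a sketch of what those directives can look like in a Squid 2.x squid.conf: the paths below simply repeat the locations named above, and the debug level shown is the usual low-verbosity setting. Later Squid releases rename some of these directives, so treat the names as assumptions to verify against your own version.

```conf
# Where client requests are logged
cache_access_log /var/log/squid/access.log
# Where Squid writes its own status and error messages
cache_log /var/log/squid/cache.log
# Logging level: all debug sections at level 1 (minimal output)
debug_options ALL,1
```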
If Squid dies, e-mail is sent to the user defined under cache_mgr. This address is also appended to error pages the users might see. The default is webmaster, but you can set it appropriately:
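For example, assuming a hypothetical administrator mailbox at example.com:

```conf
# Address notified if the cache dies; also shown on error pages
# (squidadmin@example.com is a placeholder, not a real address)
cache_mgr squidadmin@example.com
```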
You should either create a "squid" user and group ID for the Squid server process, or assign it to another account with few system rights, like "nobody":
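In squid.conf this is typically expressed with the cache_effective_user and cache_effective_group directives, which take effect when Squid is started as root. A sketch using the "nobody" account follows; the group name is an assumption and varies by system (some use nogroup instead).

```conf
# Account and group Squid switches to after being started as root
cache_effective_user nobody
cache_effective_group nobody
```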
You will also need to create the cache directory and change the ownership of both the cache and log directories to the account Squid runs as (nobody in this example):
chown nobody:nobody cache logs
Finally, we get to access control, which lets you restrict which machines can reach which sites, and when. You can get really draconian here and severely restrict access, or drill down and address problem employees who would rather surf than work. A very basic set of control lines is the following:
acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255
acl all src 0.0.0.0/0.0.0.0
acl allowed_hosts src 192.168.192.0/255.255.255.0
http_access deny manager all
http_access allow allowed_hosts
http_access deny all
icp_access allow allowed_hosts
icp_access deny all
The allowed_hosts line should correspond to your internal network configuration.
Many things can be done with combinations of access control lists and access rules. For example, these lines, used in place of the plain http_access allow allowed_hosts rule above, would keep all internal IPs off the Web except during lunchtime:
acl lunchtime time MTWHF 12:00-13:00
http_access allow allowed_hosts lunchtime
And the following would bar a problem user from the eBay domain, provided the deny line is placed before any http_access line that allows that user's network:
acl problem_user src 192.168.192.22/255.255.255.255
acl ebay dstdomain .ebay.com
http_access deny problem_user ebay
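Because Squid checks http_access lines from top to bottom and stops at the first one that matches, the order of these rules matters. The following sketch shows one way the examples above might be combined; it omits the manager and localhost lines for brevity, and it uses the dstdomain ACL type, which matches on the requested host name (a leading dot also covers subdomains). The addresses are the same illustrative ones used earlier.

```conf
# ACL definitions
acl all src 0.0.0.0/0.0.0.0
acl allowed_hosts src 192.168.192.0/255.255.255.0
acl problem_user src 192.168.192.22/255.255.255.255
acl ebay dstdomain .ebay.com
acl lunchtime time MTWHF 12:00-13:00

# Most specific rule first: block one host from one domain
http_access deny problem_user ebay
# Then allow the internal network, but only on weekdays at lunchtime
http_access allow allowed_hosts lunchtime
# Everything else is refused
http_access deny all
```

If the deny line for problem_user came after the allow line, that host's lunchtime requests to the blocked domain would match the allow rule first and never reach the deny.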