WAN optimization is a complex and expensive, yet sometimes necessary, investment. Even if you aren't running a branch office in Africa over ISDN, the need for WAN optimization and acceleration exists within nearly every business. The problem is that commercial products in this space carry very high price tags. Wouldn't it be great if the same functionality could be accomplished with commodity PC hardware and free open source tools?
Mostly, it can be done.
What Can't Be Done
Riverbed and other vendors implement Wide Area File Services (WAFS), which is a fancy way to say it caches CIFS and NFS data. If multiple people are working on the same file, or if the same file gets opened and closed more than once, that data does not really need to be shipped to the remote file server. It's even fancier than that; long-term caching of files also makes opening Word documents, which require a lot of bi-directional communication just to open, much faster. WAFS implements a (generally) safe mechanism for caching data when sending it over the WAN would be redundant. General caching proxies are not optimized for file sharing data, and will often have to send the whole thing, whereas the WAFS-style devices can be much more clever about it.
There are, unfortunately, no open source tools for building a system like this, but WAFS is only one (albeit powerful) method for optimizing the WAN. Most businesses can realize substantial performance and usability improvements by leveraging the QoS, caching proxy, and compression features available in various open source tools. Jumping straight into a commercial WAFS solution is not recommended: the pricing is staggering, you may not even need WAFS-like features, and the architectural limitations of WAFS are sure to put a damper on your plans.
QoS and Queuing With FreeBSD or Linux
A big part of making a congested WAN link usable is prioritization, especially if VoIP traffic traverses the link. The good news is that Linux and the BSD family of operating systems can employ effective QoS and traffic shaping (shaping is accomplished by queuing packets). Prioritization comes in when the kernel decides which of the queued packets to send first.
Things can get very complex, very quickly, when configuring QoS. There are different methods for classifying and queuing traffic, as well as for determining how to stall (or kill) non-priority traffic. Ultimately, it certainly is possible to deploy a firewall with traffic shaping such that VoIP always works, Web browsing to internal applications gets priority over Web browsing to the Internet, and whatever other critical traffic you may have is given the proper consideration. Understanding how to classify packets is required to configure even a SOHO-class router with these features, so in the end it's worth doing it in Linux or FreeBSD to get the full feature set.
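To make the classify-then-queue idea concrete, here is a minimal Python sketch of strict priority scheduling. It is a toy model, not how the Linux or FreeBSD kernels implement their queuing disciplines. The traffic classes, the RTP port range, the 10.0.0.0/8 "internal" network, and the packet dictionaries are all assumptions made up for illustration.

```python
import heapq
from itertools import count

# Hypothetical traffic classes; a lower number means higher priority.
VOIP, INTERNAL_WEB, INTERNET_WEB, BULK = 0, 1, 2, 3


def classify(packet):
    """Map a packet (a plain dict here) to a priority class."""
    # Assumed RTP port range for VoIP media.
    if packet["proto"] == "udp" and 10000 <= packet["dport"] <= 20000:
        return VOIP
    if packet["dport"] in (80, 443):
        # Assumed internal address space: 10.0.0.0/8.
        if packet["dst"].startswith("10."):
            return INTERNAL_WEB
        return INTERNET_WEB
    return BULK


class PriorityScheduler:
    """Strict priority queue: always send the most important packet first."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker keeps FIFO order within a class

    def enqueue(self, packet):
        heapq.heappush(self._heap, (classify(packet), next(self._seq), packet))

    def dequeue(self):
        """Return the next packet to transmit, or None if nothing is queued."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


if __name__ == "__main__":
    sched = PriorityScheduler()
    sched.enqueue({"name": "ftp", "proto": "tcp", "dport": 21, "dst": "203.0.113.5"})
    sched.enqueue({"name": "voip", "proto": "udp", "dport": 16384, "dst": "10.1.1.20"})
    print(sched.dequeue()["name"])
```

Strict priority is the simplest policy and can starve low-priority traffic entirely; real deployments usually combine classification with a bandwidth-sharing scheme so bulk transfers still make some progress.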
Web traffic, even to internal servers, can be drastically reduced by using a caching HTTP proxy. Images and other large items can be cached locally using the standard Squid or Varnish proxy servers. Instead of using the WAN link to fetch images on remote Web and application servers every time a user clicks, the content is served from the local cache. All HTTP traffic can be transparently redirected through a proxy without any client-side configuration. HTTPS traffic can be proxied as well, but the Web browsers will need to be configured to use the proxy.
Squid can also proxy FTP traffic and the handful of other protocols it supports, but it will not help a custom application that speaks its own protocol rather than HTTP or FTP; that kind of traffic needs a generic TCP proxy instead. For the vast majority of use cases, HTTP caching alone is enough to realize a huge decrease in the amount of WAN traffic.
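The effect of a caching proxy on WAN usage can be sketched in a few lines of Python. This is a simplified illustration, not Squid's or Varnish's actual behavior: it caches every response by URL with least-recently-used eviction and ignores HTTP freshness rules such as Cache-Control headers. The `origin_fetch` callback and the `CachingProxy` class are hypothetical names standing in for "fetch this URL across the WAN".

```python
from collections import OrderedDict


class CachingProxy:
    """Toy URL-keyed cache with LRU eviction and a WAN byte counter."""

    def __init__(self, origin_fetch, max_bytes=1_000_000):
        self.origin_fetch = origin_fetch  # hypothetical: fetches over the WAN
        self.max_bytes = max_bytes        # assumed cache size limit
        self.cache = OrderedDict()        # url -> response body
        self.cached_bytes = 0
        self.wan_bytes = 0                # bytes that actually crossed the WAN

    def get(self, url):
        if url in self.cache:
            # Cache hit: served locally, no WAN traffic.
            self.cache.move_to_end(url)
            return self.cache[url]

        # Cache miss: pay for the transfer once.
        body = self.origin_fetch(url)
        self.wan_bytes += len(body)

        if len(body) <= self.max_bytes:
            self.cache[url] = body
            self.cached_bytes += len(body)
            # Evict least recently used entries until we fit again.
            while self.cached_bytes > self.max_bytes:
                _, evicted = self.cache.popitem(last=False)
                self.cached_bytes -= len(evicted)
        return body


if __name__ == "__main__":
    def fake_origin(url):
        # Stand-in for a remote Web server; returns a 50 KB "image".
        return b"\x00" * 50_000

    proxy = CachingProxy(fake_origin)
    for _ in range(10):
        proxy.get("http://intranet.example/logo.png")
    print("bytes over WAN:", proxy.wan_bytes)
```

Ten requests for the same object cost one transfer instead of ten, which is the whole argument for putting a cache on the office side of the link.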
Before we talk about the two solutions that implement caching, compression, and other tricks, take note that another partial solution exists. In addition to providing security, OpenVPN can also employ compression. It runs in user-space, so latency will increase a tad, but when you need to physically ship less data (or risk saturating the link), OpenVPN is a good option.
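How much compression helps depends heavily on the data. The following Python sketch uses the standard library's zlib purely as a stand-in for whatever algorithm the tunnel is configured with (it is not OpenVPN's code path), and compares a repetitive text payload against random bytes, which behave much like already-compressed or encrypted data. The `wan_bytes` helper and the sample payloads are made up for illustration.

```python
import os
import zlib


def wan_bytes(payload: bytes, level: int = 6) -> int:
    """Bytes that would cross the link if the payload were zlib-compressed."""
    return len(zlib.compress(payload, level))


if __name__ == "__main__":
    # Repetitive protocol chatter compresses very well.
    text = b"GET /reports/quarterly.html HTTP/1.1\r\nHost: intranet.example\r\n\r\n" * 200
    # Random bytes stand in for data that is already compressed or encrypted.
    random_data = os.urandom(len(text))

    print("text:  ", len(text), "->", wan_bytes(text))
    print("random:", len(random_data), "->", wan_bytes(random_data))
```

Text-heavy traffic shrinks dramatically, while traffic that is already compressed (JPEG images, zip archives, video) gains essentially nothing and still pays the CPU cost.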
WANProxy is a generic (as in flexible) TCP proxy. It can be deployed transparently to compress all TCP traffic between two endpoints. WANProxy can also be used to route traffic through a Squid proxy, instead of the standard method of redirecting traffic with iptables rules in Linux. The benefit of sending traffic through WANProxy as well is that you also get compression. Finally, WANProxy caches some data in RAM, so that duplicate data doesn't have to be re-sent over the WAN.
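The general idea of not re-sending duplicate data between two cooperating proxies can be sketched as follows. This is a generic illustration of block-level deduplication and should not be read as WANProxy's actual algorithm or wire format. The fixed 4 KB block size, the use of SHA-256 digests as references, and the `Endpoint` and `send` names are all assumptions.

```python
import hashlib
import zlib

CHUNK = 4096  # assumed block size


class Endpoint:
    """One side of a hypothetical deduplicating tunnel; cache lives in RAM."""

    def __init__(self):
        self.chunks = {}  # digest -> raw chunk


def send(data: bytes, sender: Endpoint, receiver: Endpoint):
    """Simulate one transfer; return (bytes_on_wire, data_as_received)."""
    wire = 0
    received = bytearray()
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        digest = hashlib.sha256(chunk).digest()
        if digest in sender.chunks:
            # Both sides have seen this block: send only the short reference.
            wire += len(digest)
            received += receiver.chunks[digest]
        else:
            # New block: send it compressed, and remember it on both sides.
            packed = zlib.compress(chunk)
            wire += len(packed)
            sender.chunks[digest] = chunk
            receiver.chunks[digest] = zlib.decompress(packed)
            received += receiver.chunks[digest]
    return wire, bytes(received)


if __name__ == "__main__":
    branch, head_office = Endpoint(), Endpoint()
    document = b"Quarterly report, section text. " * 4000
    first, _ = send(document, head_office, branch)
    second, _ = send(document, head_office, branch)
    print("first transfer:", first, "bytes; second transfer:", second, "bytes")
```

A production system would also need cache eviction and a way to recover when the two sides' caches fall out of sync; both are omitted here to keep the sketch short.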
With the combination of WANProxy and Squid, a random remote file, say a Word document, is transferred over the WAN (compressed) in full the first time someone in the office opens it. The second time, however, it is served from the local cache, using no WAN bandwidth and giving the end user an immediate response.