Yes, Grasshopper, there are tools to ease the toil of administering large-scale filesystems. So many demands: ease of use, stability, speed, flexibility, support for mobile users, security, and, all too often last on the list, ease of administration. AFS meets all of these goals.
AFS stands for Andrew File System. It was originally developed at Carnegie Mellon University in the early 1980s to meet the needs of serving many different departments, students, and faculty. It was named to honor the university's founders, Andrew Carnegie and Andrew Mellon. The Transarc Corp. marketed it, then IBM bought Transarc in 1994. In 2000 IBM forked the code, releasing a free version, OpenAFS, and continues to sell and support its own commercial version. Both versions are actively maintained. IBM contributes to OpenAFS, but does not control it or provide technical support for it. (Officially, it is now simply AFS, not Andrew File System.)
OpenAFS is released under the IPL, the IBM Public License, which IBM later succeeded with the CPL, its Common Public License. The IPL is an Open Source Initiative-approved license, but is not GPL compatible. (Hobby idea for people with time on their hands: collect and index all available software licenses. Should keep you busy for a few months.)
AFS Speaks Volumes
Anyone who has ever struggled with editing /etc/fstab and mounting filesystems for NFS, or mapping herds of drive letters in Windows, should appreciate the elegance of AFS' design. It manages volumes, so users do not need to worry about the physical locations of files. It follows the client/server model: servers deliver files to client machines, so the admin doesn't need to learn any weirdo newfangled organizational concepts. And it does it in such a lovely way. A volume location database tracks the physical file and directory locations. User requests file, database locates it. No muss, no fuss.
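To make the idea concrete, here is a minimal sketch of a volume location lookup. It is a toy model, not the real AFS database: the class and method names, the server hostnames, and the volume name user.alice are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeSite:
    """Where a volume physically lives (hypothetical example type)."""
    server: str
    partition: str


class VolumeLocationDB:
    """Toy stand-in for a volume location database (not the real AFS API)."""

    def __init__(self):
        self._sites = {}  # volume name -> VolumeSite

    def register(self, volume, server, partition):
        self._sites[volume] = VolumeSite(server, partition)

    def locate(self, volume):
        # Clients ask by volume name only; they never hard-code a server.
        return self._sites[volume]

    def move(self, volume, server, partition):
        # Moving a volume only changes the entry that clients consult;
        # the name they use to reach it stays the same.
        if volume not in self._sites:
            raise KeyError(volume)
        self._sites[volume] = VolumeSite(server, partition)


vldb = VolumeLocationDB()
vldb.register("user.alice", "fs1.example.com", "/vicepa")
print(vldb.locate("user.alice"))

vldb.move("user.alice", "fs2.example.com", "/vicepb")
print(vldb.locate("user.alice"))
```

The point of the indirection is that nothing on the client side changes when the admin relocates a volume. Only the database entry does.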
Volumes reside on physical or logical partitions that are mounted on the AFS server. Each volume is independent of the others, and can be moved while in use, enabling load balancing and resource management. Heavily used files can be copied and distributed over several servers. Client machines can even be directed to specific servers of your choosing. This is more work than letting the volume location database answer client requests, but it is another way to balance load.
My favorite feature of volume managers is being able to resize volumes without endangering the data. No need to agonize over partition sizes and types when installing the operating system: if you don't get it just right, no problem, AFS lets you do whatever you want. AFS can be used in conjunction with the Linux Logical Volume Manager (LVM) for even finer-grained and more flexible volume management.
Higher Than Root
Instead of managing far-flung files as relative to the user's local filesystem, AFS mounts its filesystems in a single namespace, /afs, which is "higher" than the root filesystems on the client PCs. Client PCs need only mount a single directory, /afs. Is that genius or what?
"The network is the computer"- this is more true than ever. Now we have powerful machines in both the server room and on the desktop. This is a whole lot of largely under-utilized power. AFS takes advantage of this abundance by caching server data on the client machines, and moving most of the computational work to the client. The cache is persistent, surviving power failures and reboots. Quite dandy for mobile users, who may connect via dialup, download application binaries and data files, then work offline. It is conservative of network resources, and because the user does not need a constant connection to the server, adds fault-tolerance. No need to heed the wails of despair- "The network is down!" No more excuses to not finish work.
AFS uses Kerberos for authentication and data security, and ACLs (access control lists) to manage directory privileges. Any user can create an AFS group, and assign users and privileges. In my world that is far better than pestering the sysadmin for such mundane chores.
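A rough sketch of how a directory ACL with user-created groups might be checked follows. The seven rights letters (r, l, i, d, w, k, a) are the standard AFS directory rights; everything else here, including the class, the function names, and the alice:project group, is a hypothetical simplification that ignores negative rights and Kerberos entirely.

```python
# Standard AFS directory rights: read, lookup, insert, delete,
# write, lock (k), administer.
AFS_RIGHTS = set("rlidwka")


class DirectoryACL:
    """Toy per-directory ACL (illustrative only, not the AFS implementation)."""

    def __init__(self):
        self._entries = {}  # user or group name -> set of rights letters

    def grant(self, name, rights):
        unknown = set(rights) - AFS_RIGHTS
        if unknown:
            raise ValueError(f"unknown rights: {sorted(unknown)}")
        self._entries[name] = set(rights)

    def allows(self, user, groups, right):
        # A user holds a right if the user, or any group the user
        # belongs to, has been granted it on this directory.
        names = [user, *groups]
        return any(right in self._entries.get(n, set()) for n in names)


# Hypothetical group that alice created herself, no sysadmin required.
groups = {"alice:project": {"alice", "bob"}}


def groups_of(user):
    return [g for g, members in groups.items() if user in members]


acl = DirectoryACL()
acl.grant("alice", "rlidwka")    # full control for the owner
acl.grant("alice:project", "rl")  # group members may list and read

print(acl.allows("bob", groups_of("bob"), "r"))  # True
print(acl.allows("bob", groups_of("bob"), "w"))  # False
```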
Cells and Sites
AFS is organized into cells and sites. A cell is an independently administered group: perhaps a department, or a project group, whatever you need, together with the client machines and servers for that group. An AFS site is a grouping of cells. The cell that you belong to is your local cell. All the others are foreign cells. Sounds rather Spy v. Spy, but there it is.
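Here is a small sketch of telling local from foreign cells by path. It assumes the conventional layout in which the first directory under /afs names the cell; the cell names example.edu and other.example.org and the helper functions are invented for the example.

```python
from pathlib import PurePosixPath

LOCAL_CELL = "example.edu"  # hypothetical name of "your" cell


def cell_of(path):
    """Return the cell name from a path of the form /afs/<cell>/..."""
    parts = PurePosixPath(path).parts
    if len(parts) < 3 or parts[0] != "/" or parts[1] != "afs":
        raise ValueError(f"not a path inside a cell under /afs: {path}")
    return parts[2]


def is_local(path):
    # Anything outside the local cell is, by definition, foreign.
    return cell_of(path) == LOCAL_CELL


print(is_local("/afs/example.edu/user/alice"))    # True
print(is_local("/afs/other.example.org/pub/doc"))  # False
```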
Each cell organizes and maintains its own filespace. Cells can connect to filespaces in other cells, and share files with each other. Quotas, of course, can be set on volumes, including the ones holding users' home directories. Can't have them cluttering up your nice servers with all kinds of stuff. Adding new users, cells, and servers is relatively simple.
Editing and saving files is easy as pie. Users can save as they go, in the usual fashion, and changes will be recorded on their local disk. Changes are not stored back on the server until the file is closed. No extra steps are needed. If there are multiple copies of the file in the AFS system, AFS updates all of them.
One thing it does not do is magically enable seamless collaboration on the same file. The last save overwrites all previous saves, like any filesystem. ACLs can be tweaked to protect your own directories, in addition to setting the usual permissions on individual files.
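The store-on-close behavior, and the way the last save wins, can be sketched in a few lines. This is a toy model under simplifying assumptions (whole files held as strings, no cache invalidation, no locking), and the Server and CachedFile classes are invented for the example rather than taken from AFS.

```python
class Server:
    """Toy file server: just a mapping of file name to contents."""

    def __init__(self):
        self.files = {}


class CachedFile:
    """Toy client-side file handle with store-on-close semantics."""

    def __init__(self, server, name):
        self._server = server
        self._name = name
        # Opening fetches a private working copy into the local cache.
        self._local = server.files.get(name, "")

    def read(self):
        return self._local

    def write(self, text):
        # Saves land in the local copy only; the server is not touched.
        self._local = text

    def close(self):
        # Only now does the server copy change, replacing whatever
        # was stored there before.
        self._server.files[self._name] = self._local


server = Server()
server.files["notes.txt"] = "draft"

alice = CachedFile(server, "notes.txt")
bob = CachedFile(server, "notes.txt")

alice.write("alice's edit")
bob.write("bob's edit")
print(server.files["notes.txt"])  # draft (nobody has closed yet)

alice.close()
bob.close()
print(server.files["notes.txt"])  # bob's edit (last close wins)
```

Alice's changes are silently gone, which is exactly why two people editing the same file need to coordinate outside the filesystem.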
Installation and the initial setup are complex and time-consuming. After all, you're installing an entirely new filesystem. AFS runs on most UNIXes and on Windows. On every UNIX server and client machine, AFS must be either compiled into the kernel or loaded as a module. Windows installation employs the usual setup.exe, plus some system files that must be added manually. The instructions are very detailed; do what they say and you'll be fine. The one problem is that they are written for the commercial version, which comes on CD. The free version is available only as a download, so be sure to use the included READMEs and instructions. Here's one gotcha I learned the hard way: on Linux, the AFS cache must be on ext2. Everywhere else, use any filesystem you like: ext3, ReiserFS, etc.
Gentoo Linux, the hot new Linux kid on the block, has a wonderful page on building an AFS server. Gentoo has a slick installation and package management system and excellent instructions, so this is one good way to build your first AFS server.