XFS: It's worth the wait

Friday Jul 21st 2000 by Vincent Danen

The advanced XFS file system is flexible, powerful, and fast. You may want to consider it for use on your Linux network.

In 1994, Silicon Graphics Inc. (SGI), of Mountain View, Calif., released a new journaled file system on IRIX, the company's System V-based version of UNIX. This advanced file system, called XFS, replaced SGI's old EFS (Extent File System), whose design was similar to that of the Berkeley Fast File System. In coordination with many other kernel developers, SGI is currently working to integrate XFS tightly with the Linux operating system so that we can take advantage of its many benefits over the current ext2 file system. This article discusses XFS and its technical specifications.

Origin of XFS

SGI designed XFS with a few very important features in mind, and for very specific reasons. In 1990, SGI realized that it would need to create something to replace EFS; EFS could not handle the demands of new and forthcoming applications. The issues facing any file system at that time were demands for increased disk capacity and bandwidth, and parallelism with new applications such as film, video, and large databases. Because EFS couldn't hope to handle these needs efficiently, SGI created XFS for the purpose of handling new applications by providing support in a few key areas. These areas included fast crash recovery, large file systems, and large directories and files.

In 1999, SGI began to turn an eye to Linux as a viable and attractive operating platform to support. Due to the nature of Linux, and because SGI knew it had something to offer that would provide Linux with the same file-system capabilities as those found in IRIX, SGI released Open XFS to the Linux community.

Overview of XFS features

XFS provides some basic and powerful features that meet the requirements for any large file system, file, or directory. Let's take a look at some of these features:

XFS uses B+ trees extensively in place of the traditional linear file system structure. B+ trees use a highly efficient indexing method to index directory entries, manage file extents, locate free space, and keep track of the locations of file index information. As a result, reading file systems and retrieving information from them happens quickly--without using large amounts of system resources.

Currently, the XFS team is developing enhancements to the Linux page cache so that XFS can be tightly integrated with the Linux kernel. The goal is for XFS to rely solely on the page cache to store both file data and file system metadata. Because this work is being done at the kernel level, other file systems can also use it to improve overall system performance. These features will most likely be unavailable until Linux 2.5, except as a part of XFS itself.

XFS also dynamically allocates disk blocks to inodes. If an application uses a small number of files that are very large, very little disk space is used for the metadata that describes them--and the remainder of the disk is free for data. If an application uses many small files, more disk space is made available for directories and inodes. This process is handled dynamically, with no need for user intervention or configuration; you can create your initial file system without specifying block sizes according to what type of application will be using it. For example, you no longer need to create a file system with a smaller block size for efficient use by a mail server. XFS handles all of this internally with an advanced space management technique that utilizes contiguity, parallelism, and fast logging.

Many powerful support utilities come with XFS and enhance it remarkably. They include the following:

  • A very fast mkfs utility to make the file system
  • Advanced dump and restore utilities for backups
  • xfs_db for debugging
  • xfs_check for checking the file system
  • xfs_repair for file system repairs
  • xfs_fsr for defragmenting XFS file systems
  • xfs_bmap, which can be used to interpret the metadata layouts for the file system
  • grow_fs, which will enlarge XFS file systems online

XFS also provides file system journaling. This means that XFS uses database recovery techniques to recover a consistent file system state after a system crash. Using journaling, XFS is able to accomplish this recovery in under a second, regardless of the file system size. Traditional linear file systems without journaling, however, must run the fsck command over the entire file system to check it after a system crash; this process is rapid on smaller file systems, but can take a lot of time (in some cases measured in hours) on larger file systems. XFS is able to accomplish this fast recovery by logging all file transactions with information on free lists, inodes, directories, and so on. After a crash, the logs are analyzed, and XFS can quickly determine which transactions must be done in order to synchronize the file system to the state it was in prior to the crash.
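The replay step described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the XFS log format: the `Journal` class, its method names, the `recover` function, and the dictionary standing in for file system metadata are all hypothetical.

```python
# Toy write-ahead journal. All names and structures here are illustrative
# assumptions; this is not how XFS lays out its log on disk.

class Journal:
    def __init__(self):
        self.records = []      # the log, assumed to survive a crash
        self.next_txn = 1

    def begin(self):
        """Start a transaction and return its id."""
        txn = self.next_txn
        self.next_txn += 1
        return txn

    def log(self, txn, key, value):
        """Record an intended metadata change before it is applied."""
        self.records.append(("update", txn, key, value))

    def commit(self, txn):
        """Mark every change logged under txn as durable."""
        self.records.append(("commit", txn))


def recover(journal, metadata):
    """Replay committed transactions onto metadata, in log order."""
    committed = {rec[1] for rec in journal.records if rec[0] == "commit"}
    for rec in journal.records:
        if rec[0] == "update" and rec[1] in committed:
            _, _, key, value = rec
            metadata[key] = value
    return metadata


journal = Journal()

t1 = journal.begin()
journal.log(t1, "inode:12:size", 4096)
journal.log(t1, "free_blocks", 99)
journal.commit(t1)

t2 = journal.begin()
journal.log(t2, "inode:13:size", 8192)
# Simulated crash: t2 is never committed.

on_disk = {"free_blocks": 100}          # stale metadata found after the crash
recovered = recover(journal, on_disk)
print(recovered)
```

In this sketch the cost of recovery depends on the length of the log rather than on the size of the file system, which is the property that makes journaled recovery fast. The uncommitted transaction is simply ignored, so the metadata ends up in a consistent state.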

XFS Technical Specifications

The following list summarizes most of the features that XFS provides. Because the Linux implementation of XFS is still in the development stages, the features listed may or may not be applicable to the Open XFS for Linux specification. These features are available to XFS for IRIX, and they give a reasonable idea of what we can expect from a Linux implementation of XFS:

  • Scalable file sizes, up to 9 million TB
  • Scalable file systems, up to 18 million TB
  • High performance on all file systems, regardless of size
  • Millions of files per file system
  • Millions of files per directory
  • Rapid file system recovery
  • Rapid transaction rates
  • Rapid directory searches and space allocation using B+ trees
  • NFS version 3 compatibility
  • Journaled 64-bit file system
  • Fast performance. Throughput in excess of 7GBps has been demonstrated on a single file system using a 32-processor Origin2000 server. Single file reads and writes exceed 4GBps

XFS scalability

File system scalability is the ability of the file system to provide support for very large file systems, large files, large directories, and large numbers of files while still providing good I/O performance. The scalability of a file system depends somewhat on how it stores information on files.

To illustrate this point, let us compare XFS (a 64-bit file system) to a 32-bit file system. Because XFS uses 64 bits to store inode numbers and addresses for each disk block, a single file can theoretically be as large as 9 million terabytes. A 32-bit file system, however, cannot usefully exceed file sizes of 4GB. I honestly don't know anyone who needs a file to be 9 million TB (or even 4GB!), but by providing such a high level of scalability, XFS ensures that it will not become an obsolete or unusable file system for many years to come. For individuals in high-level science applications (for example, NASA), or those in the video or audio industries where files can reach enormous sizes, XFS makes their work easier, or even feasible at all.
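The figures quoted here can be checked with a little arithmetic. The sketch below assumes that the 9 million TB file limit corresponds to a signed 64-bit byte offset (2^63 bytes), that the 18 million TB file system limit corresponds to the full 64-bit range (2^64 bytes), and that a terabyte is counted as 10^12 bytes. Those mappings are assumptions made for illustration; the article itself does not state how the limits are derived.

```python
# Back-of-the-envelope check of the size limits, under the assumptions
# stated in the text above (decimal terabytes, byte-granular offsets).

TB = 10**12        # decimal terabyte, in bytes
GIB = 2**30        # binary gigabyte, in bytes

max_file_64 = 2**63    # assumed: signed 64-bit byte offset
max_fs_64 = 2**64      # assumed: full 64-bit byte address range
max_file_32 = 2**32    # 32-bit byte offset

file_tb = max_file_64 / TB
fs_tb = max_fs_64 / TB
file32_gib = max_file_32 // GIB

print(f"64-bit file limit:        about {file_tb / 1e6:.1f} million TB")
print(f"64-bit file system limit: about {fs_tb / 1e6:.1f} million TB")
print(f"32-bit file limit:        {file32_gib} GB")
```

Under these assumptions the results line up with the 9 million TB, 18 million TB, and 4GB figures used in this article.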

Large directories are also an issue with traditional linear file systems. Applications such as Sendmail or news servers often result in spool directories with thousands of files. Looking up a filename in such a directory can take a long time, because typically the directory must be read from the beginning until the desired file is found. Because XFS uses a B+ tree structure, directory searches are extremely fast. Each filename in the directory is converted to a four-byte hash value, which is then used to index the B+ tree. Using this method, all directory functions (searching, creating, and removing) are very efficient and fast.
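A minimal Python sketch of hashed directory lookup follows. It makes two loud assumptions: CRC-32 from the standard `zlib` module stands in for the four-byte filename hash (the actual XFS hash function is not described in this article), and a sorted list searched with `bisect` stands in for the B+ tree. The `HashedDirectory` class and its methods are hypothetical names.

```python
import bisect
import zlib


def name_hash(name):
    """Four-byte hash of a filename (CRC-32 used as a stand-in)."""
    return zlib.crc32(name.encode("utf-8")) & 0xFFFFFFFF


class HashedDirectory:
    """Directory entries kept sorted by (hash, name).

    A sorted list with binary search stands in for the B+ tree. Lookups
    are logarithmic in both; a real B+ tree also keeps inserts cheap.
    Duplicate names are not checked for in this sketch.
    """

    def __init__(self):
        self.entries = []      # sorted list of (hash, name, inode_number)

    def _find(self, name):
        """Return the index of name's entry, or -1 if it is absent."""
        key = (name_hash(name), name)
        i = bisect.bisect_left(self.entries, key)
        if i < len(self.entries) and self.entries[i][:2] == key:
            return i
        return -1

    def create(self, name, inode_number):
        bisect.insort(self.entries, (name_hash(name), name, inode_number))

    def lookup(self, name):
        i = self._find(name)
        return self.entries[i][2] if i >= 0 else None

    def remove(self, name):
        i = self._find(name)
        if i < 0:
            return False
        del self.entries[i]
        return True


# A spool directory with 10,000 entries.
spool = HashedDirectory()
for n in range(10000):
    spool.create(f"msg{n:05d}", 1000 + n)

found = spool.lookup("msg04242")
missing = spool.lookup("no-such-file")
print(found, missing)
```

With 10,000 entries, a binary search inspects roughly 14 entries per lookup, whereas a linear scan inspects 5,000 on average. Two names that hash to the same value are still told apart here, because the name itself is part of the sort key.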

Using the same idea, XFS supports large numbers of files efficiently because inodes are allocated dynamically and multiple file operations are performed in parallel. The only limitation for XFS with regard to the number of files in a file system is the space available to hold them. Because XFS dynamically allocates inodes, free space usage is extremely efficient, regardless of the file size. With traditional file systems--in which the number of inodes is specified during file system creation--you are limited by that initial number of inodes. You can increase or decrease the inode size and number during file system creation, but then you end up locking the system into a specific state of usability. If you use a large number of inodes up front, you consume a lot of disk space that may never be used. But if you use a smaller number of inodes, any small files stored on the file system will use the full inode block size and waste space that could have been saved by using a smaller inode size (which results in more inodes).
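The difference between the two approaches can be sketched as a toy Python model. Everything in it is hypothetical: the class names, the chunk size of 64 inodes, and the cost of 4 blocks per chunk are illustrative values, not XFS parameters.

```python
# Toy comparison of fixed versus on-demand inode allocation.
# All names and numbers are illustrative assumptions.

class FixedInodeTable:
    """Inode count chosen once, when the file system is created."""

    def __init__(self, inode_count):
        self.capacity = inode_count
        self.used = 0

    def allocate(self):
        if self.used >= self.capacity:
            raise OSError("out of inodes")
        self.used += 1
        return self.used


class DynamicInodeAllocator:
    """Inodes carved out of free disk blocks in chunks, as needed."""

    def __init__(self, free_blocks, inodes_per_chunk=64, blocks_per_chunk=4):
        self.free_blocks = free_blocks
        self.inodes_per_chunk = inodes_per_chunk
        self.blocks_per_chunk = blocks_per_chunk
        self.capacity = 0
        self.used = 0

    def allocate(self):
        if self.used == self.capacity:
            # No spare inodes: convert free blocks into a new inode chunk.
            if self.free_blocks < self.blocks_per_chunk:
                raise OSError("no space left")
            self.free_blocks -= self.blocks_per_chunk
            self.capacity += self.inodes_per_chunk
        self.used += 1
        return self.used


def create_files(table, count):
    """Try to create count files; return how many succeeded."""
    created = 0
    for _ in range(count):
        try:
            table.allocate()
        except OSError:
            break
        created += 1
    return created


fixed = FixedInodeTable(inode_count=100)
dynamic = DynamicInodeAllocator(free_blocks=1000)

fixed_created = create_files(fixed, 150)
dynamic_created = create_files(dynamic, 150)
print(fixed_created, dynamic_created, dynamic.free_blocks)
```

In this model the fixed table refuses the 101st file even though the disk may have plenty of free space, while the dynamic allocator keeps going and spends blocks on inodes only when files actually need them.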

Why choose XFS?

As we've seen, XFS is a flexible, powerful, and fast file system. Current development of file systems for Linux includes a number of forthcoming journaling file systems. Available right now is the ReiserFS journaling file system, and coming soon is ext3, which is a backward-compatible journaling file system based on ext2. IBM has also made an initial release of its Enterprise JFS, another journaled file system written initially for AIX.

So, in light of these forthcoming alternatives, why should you be concerned with XFS? If ReiserFS is currently available and these others are coming out, why should you choose XFS over any of them?

The main factor is maturity. ReiserFS and ext3 are immature file systems that are still in development. XFS is mature--it's been running on IRIX machines since 1994. SGI developed it six years ago to be a robust, long-standing, viable alternative to linear file systems. In short, SGI knows how to make a good file system.

Yes, we may have to wait another few months before XFS is a realistic alternative to ReiserFS, which is currently available; but I think the wait will be worth it. I've illustrated the many benefits of XFS over traditional file systems. Because it has commercial backing and--perhaps more important--because commercial dollars are invested in the project, XFS for Linux will quickly attain the same level of reliability it has had on IRIX for years. To get more information on XFS or to contribute to the project, visit the project Web site at http://oss.sgi.com/projects/xfs/.

Copyright 2017 © QuinStreet Inc. All Rights Reserved