Tools for managing systems and networks must be scalable in today's vast and complex IT environments. But scalable needn't mean grandiose.
Remember "small is beautiful"? "Less is more"? Sometimes it seems these simple concepts are lost on the makers of scalable products for managing enterprise systems and networks.
John D. Lewis, vice president of datacenter operations at First Maryland Bancorp in Baltimore, discovered this when he went shopping for a better way to manage his company's computer systems and networks. The management systems the bank was using were ineffective and inefficient, and it took 50 operators to run the command center, where between 50 and 100 terminals were stacked high. "We'd try to correlate data from dozens of management stations and system consoles--it was an impossible mission," Lewis says.
And an expensive one: the company was adding two full-time administrators a year, to the tune of $100,000 each.
So Lewis began looking for a comprehensive solution. "Along the way, I learned that some solutions are so grandiose in scale that they tend to fail because people underestimate the effort to implement them," he says.
For that reason, he chose MAX/Enterprise, a fault and event manager from Boole & Babbage, based in San Jose. MAX/Enterprise took 18 months to implement. It now provides a consolidated view of the IT infrastructure as it affects system availability, so administrators can use a single console to view alarms from any number of systems. This has allowed IT to reduce staff size while improving service and increasing the scope of its responsibility.
"It's a 'manager-of-managers' type of solution and would give us the best ROI," Lewis says. He adds that the product paid for itself in two-and-a-half years.
Management of the bank's systems and networks is simpler now, and fewer hands are required. "We have significantly reduced the complexity and manual labor intensity of performing fault management and are truly doing more with less," Lewis says. At First Maryland Bancorp small is beautiful; less is more.
A survey of Datamation readers shows that while 38% of them already have a systems management tool, 11% plan to install one by the end of 1999.
Source: SG Cowen/Datamation, User Survey: Networked Computing, Servers, and PCs, conducted summer 1998
Scalability of tools is key
As organizations have embraced client/server architecture, the proliferation of devices has made the scalability of tools for network and systems management increasingly critical.
Scalability today means different things to different people, but for many IT executives, scalable administration means effective administration of numerous remote offices, nodes, and heterogeneous systems. "For us, scalability means that a product can manage a variety of physical hardware devices, platforms, and applications, regardless of the manufacturer," Lewis says.
In network and desktop management, vendors measure a product's scalability by the number of desktops, servers, routers, hubs, switches, and similar devices that can be managed reliably across an enterprise. In performance or availability management, however, scalability has less to do with sheer numbers than with a product's effectiveness at improving productivity and managing complex systems to solve problems, analysts say.
When Kind en Gezin, a Brussels-based company, went looking for a systems- and network-management solution, product scalability meant being able to manage 70 remote, regionally located servers and 1,300 clients who were using desktop or notebook computers. The company's core business is keeping records on some 200,000 children between birth and 3 years old for medical and social services purposes. Kind en Gezin sought system- and network-management tools to administer an increasingly complex, dispersed, client/server environment that was becoming more and more difficult to keep operational. The company uses small Pentium 100MHz NT machines with 48MB of RAM, and 64Kbps lines, either dial-up or ISDN, for communication between the central site and remote offices. "We needed a solution that wouldn't bring down our servers," says CIO Luc Verhelst.
There's more to a tool's success than scalability
Scalability of management tools means a lot to NationsBank and Prudential Insurance Company of America, two global financial corporations that are poised for future expansion.
But IT executives at those companies say the ultimate success of a systems- and network-management solution depends on more than scalability: it depends on the product's overhead requirements, on the systems architecture, and on effective IT policies.
NationsBank's network is about to grow rapidly, due to its recent merger with Bank of America. NationsBank operates the nation's second-largest private network based on an asynchronous transfer mode (ATM) backbone. The company's LAN/WAN environment includes 5,000 common network elements (that is, devices that serve more than one user), 140 ATM switches, and 3,000 telephone circuits. The merger will double the size of the network.
For the past three years, NationsBank has been using MAX/Enterprise from San Jose-based Boole & Babbage as a tactical--as opposed to a strategic--solution for fault/event management across the company's network infrastructure. In addition, NationsBank optimizes network engineering to ensure peak performance, the company has standardized on network hardware and software, and proactive network management is routine for engineers.
The result is 99.97% network availability across the LAN/WAN environment, according to John Lane, an executive with NationsBank's network solutions group.
"The benefits of that kind of availability are enormous savings in cost avoidance and increased productivity for the company's 85,000 employees," Lane says. It also provides the bank's customers with the high-quality service they expect.
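To put an availability figure such as 99.97% in perspective, the short Python sketch below converts an availability percentage into hours of downtime per year. The function name and the 8,760-hour year are illustrative assumptions, not figures NationsBank reported.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; assumes a non-leap year


def downtime_hours_per_year(availability_pct: float) -> float:
    """Return the yearly downtime, in hours, implied by an availability percentage."""
    return (1 - availability_pct / 100) * HOURS_PER_YEAR


if __name__ == "__main__":
    hours = downtime_hours_per_year(99.97)
    print(f"{hours:.2f} hours of downtime per year")
```

Under these assumptions, 99.97% availability works out to roughly 2.6 hours of network downtime a year.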
In order to maintain those standards with a much larger network and across all of the company's systems, NationsBank plans to deploy a scalable, strategic solution for managing networks and systems enterprisewide. In selecting a product, the company will look at how many objects the management system can manage, the scalability of the tool's database, and how much overhead the management system requires. The company expects to choose a solution sometime in 1999.
Lane points out that management systems capable of providing a lot of information about an organization's networks and systems often "require additional bandwidth to do all the housekeeping. There's a trade-off."
Coping with the unexpected
For Aureus Azares, vice president of information systems with corporate technology services at Prudential, scalability means the ability to cope with unpredictable situations. (Azares is based in Roseland, N.J.; Prudential's corporate headquarters are in Newark.) With 18,000 desktops, 4,000 servers, and 100 terabytes of data in its databases, Prudential's IT infrastructure needs to be able to serve employees and customers anywhere, anytime. "Our systems/network management solution needs to be flexible and capable of expanding globally," Azares says.
Prudential's management solution centers around Tivoli Management Framework, version 3.6, which scales to manage up to 10,000 endpoints, at least in theory. Before upgrading to the newest version, the company used an earlier version under which "scalability was doable but it didn't make sense," Azares says. That's because each management server accommodated only 200 to 250 nodes, and scaling required too many servers. These days, with the new version, Prudential deploys fewer management servers and the product is more far-reaching.
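A rough calculation shows why 200 to 250 nodes per management server "didn't make sense" at Prudential's size. The Python sketch below assumes, purely for illustration, that all 18,000 desktops and 4,000 servers are managed nodes; the function name is hypothetical and nothing here reflects Tivoli's actual sizing rules.

```python
import math


def servers_needed(nodes: int, nodes_per_server: int) -> int:
    """Return the number of management servers needed, rounding up."""
    return math.ceil(nodes / nodes_per_server)


# Assumption: every desktop and server counts as one managed node.
managed_nodes = 18_000 + 4_000

best_case = servers_needed(managed_nodes, 250)   # densest packing cited
worst_case = servers_needed(managed_nodes, 200)  # sparsest packing cited

if __name__ == "__main__":
    print(f"{best_case} to {worst_case} management servers")
```

On those assumptions, the earlier version would have needed somewhere between 88 and 110 management servers.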
Azares adds that in an effort to optimize operations, the company has standardized the way it builds its networks, business applications, and databases.
As Azares puts it, successful systems/network management is 35% technology and 65% human processes and organizational policies.
Kind en Gezin's choice was Platinum's ProVision suite, partly because it could be implemented step-by-step to minimize the risk of disrupting operations.
According to Stephen Elliot, senior analyst at Cahners InStat Group in Newton, Mass., suite solutions cost between $1 million and $3 million, on average, to monitor between 15,000 and 20,000 devices. It's common for implementation costs to exceed the purchase price of the administration product by a factor of two or three.
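Elliot's figures can be combined into a rough total-cost range. The Python sketch below reads "two or three times" as implementation costing two to three times the purchase price, which is an interpretation rather than something the analyst spelled out; the function name is hypothetical.

```python
def total_cost_range(purchase_low: int, purchase_high: int,
                     mult_low: int, mult_high: int) -> tuple:
    """Return (low, high) totals, where total = purchase + multiplier * purchase."""
    return (purchase_low * (1 + mult_low), purchase_high * (1 + mult_high))


# Purchase price of $1 million to $3 million; implementation at 2x to 3x purchase.
low_total, high_total = total_cost_range(1_000_000, 3_000_000, 2, 3)

if __name__ == "__main__":
    print(f"${low_total:,} to ${high_total:,}")
```

Read that way, a suite deployment could run from about $3 million to $12 million once implementation is included.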
While First Maryland Bancorp's implementation of MAX/Enterprise took only a year and a half, Cahners InStat's Elliot says "many corporations who purchased and installed suite products from companies like CA and Tivoli two to three years ago are still working on implementations, adding additional functionality over time."
First Maryland's slippery slope
First Maryland Bancorp, operating in four mid-Atlantic states, has more than 300 full-service branches and 550 automated teller machines. The people who depend on the bank's IT services include more than 500,000 retail customers, 50,000 corporate customers, and 5,000 employees. It offers financial services in retail and commercial banking, mutual funds, and discount-brokerage services. And it operates a heterogeneous IT environment over a broad geographic area--First Maryland Bancorp's parent company, Allied Irish Bancshares, is located in Dublin. Managing the bank's diverse and complex computer systems and networks had become a slippery slope.
Before implementing MAX/Enterprise, the IT department faced multiple challenges, including the enormous task of sorting through thousands of confusing alarms and developing the underlying logic to process the alarms. Event/alarm management covered a wide range of systems, networks, and applications. The company's nodes number more than 15,000, and IT operations include:
three IBM MVS mainframes, two Tandem mainframes, one Stratus minicomputer, an AS/400, 20 UNIX servers, and 60 Novell servers;
SNA/Netview 390 and TCP/IP/Netview 6000 network protocols; and
applications for automated teller machines, branch banks, wire transfer, cash management, and human resources, to name a few.
The complexity of the environment dictated a scalable solution. MAX/Enterprise, the core of the bank's Enterprise Systems Management system, is attached to and is managing events from 21 different types of critical components. The solution provides a comprehensive, streamlined view of the bank's computing infrastructure and the critical events that may affect service delivery. Using automated capabilities built into the product, the system can proactively check the status of components, perform error recovery or diagnostic routines, and, when required, electronically dispatch those routines most likely to correct a problem.
For example, the bank uses a UNIX-based application that runs on a Tandem computer for billions of dollars of wire transfers per day to and from the Federal Reserve and for international fund transfers. If a critical process fails, an IT administrator logs onto the server and sends the restart command to MAX/Enterprise. If the restart succeeds, an alarm is posted to MAX/Enterprise. If the restart fails, the system developer for the application is automatically paged.
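The restart-then-escalate pattern in the wire-transfer example can be sketched in a few lines of Python. This is a minimal illustration of the decision logic only: every name here (EventLog, handle_process_failure, the stub restart functions) is hypothetical, and none of it represents the MAX/Enterprise interface.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class EventLog:
    """Hypothetical stand-in for the management console's alarm and paging records."""
    alarms: List[str] = field(default_factory=list)
    pages: List[str] = field(default_factory=list)


def handle_process_failure(process: str,
                           restart: Callable[[str], bool],
                           developer: str,
                           log: EventLog) -> bool:
    """Attempt one restart; post an alarm on success, page the developer on failure."""
    if restart(process):
        log.alarms.append(f"{process}: restarted after failure")
        return True
    log.pages.append(f"page {developer}: restart of {process} failed")
    return False


if __name__ == "__main__":
    log = EventLog()
    # Stub restart functions simulate a successful and a failed restart.
    handle_process_failure("wire-transfer", lambda name: True, "app-developer", log)
    handle_process_failure("wire-transfer", lambda name: False, "app-developer", log)
    print(log.alarms)
    print(log.pages)
```

Passing the restart action in as a function keeps the escalation rule separate from whatever mechanism actually restarts the process, which is how a rule of this kind could be reused across many component types.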
A new take on scalability
New issues are forcing vendors to revisit the notion of scalability. These include the need to manage intranets and extranets and the need to correlate information among network-, systems-, and application-management tools so that IT can manage more proactively.
As Internet-based links with customers and suppliers become mission-critical, many users of existing systems- and network-management tools will be disappointed to discover gaps in the tools' capabilities. "Companies can purchase solid network- and systems-management products today, but if IT wants them to be strategic to the core business and be proactive, then there's more work for vendors to do," says Elliot of Cahners InStat. He notes that if some aspect of the enterprise infrastructure isn't being monitored, then it's unmanaged.
The Internet has enabled the scaling of administration via Web-browser technology. At companies like Milpitas, Calif.-based LSI Logic, for example, anyone on the system with a Web browser can view information from Tivoli's Maestro, which the company uses for job scheduling. This kind of tool can help cut down on calls to the help desk when problems arise, says David Bristow, manager of enterprise management services at LSI.
The company is using Maestro for stress testing and capacity planning as it rolls out an SAP financial and human-resources package to more than 20 sites worldwide. The company has 4,000 employees.
Users agree that there's no easy way to measure product scalability. In fact, many say scalability is very environment-specific. But it's important for any organization looking for a systems- and network-management tool to spend time defining what scalability means to it now and in the future, as well as the specific needs of its business.
Lynn Haber, based in Norwell, Mass., writes about information technology and related issues. She can be reached at email@example.com.
Systems- and network-management lessons learned
Define what scalability means to your organization.
Talk to independent business groups and consultants to understand various products and to explore scalability issues.
Ask vendors to provide reference customers with requirements similar to yours.
Decide up front the scope of the project, the cost, and the implementation time.
Ask vendors about their professional service offerings to assist with implementation.
Run a pilot project to measure product performance in your environment.
Related articles
"The cost of networking," InformationWeek, Oct. 19, 1998
Managing the cost and complexity of the network is taking a large toll on IT resources. This article outlines strategies to tame unwieldy networks and get them to work for your company.
"Vendors focus on Web traffic management," InternetWeek, Oct. 19, 1998
Several vendors offer products to track Web traffic and transactions, helping organizations make sense of how Internet resources are being used.
"Systems management: the other side of the ledger," PC Week, July 31, 1997
The Royal Bank of Canada is an example of how a large organization tackles its network/systems management problems to meet its business requirements. Bank IT managers discover that network/systems management is about more than just tools.
Read all about it
Quality of Service: Delivering QoS on the Internet and in Corporate Networks by Paul Ferguson and Geoff Huston
John Wiley & Sons, January 1998, ISBN 0471243582
Aimed at readers interested in QoS, implementing policy in networks, how to think about QoS, or how to get the different levels of service to better serve the mission of a business.
Managing the Corporate Intranet by Mitra Miller, Andrew Roehr, and Benjamin Bernard
John Wiley & Sons, December 1997, ISBN 0471199788
Offers network administrators hands-on solutions, action plans, and checklists for maintaining and optimizing the corporate intranet.