The Next Generation Network Operations Center
NOCs are spending more time on applications and performance, instead of simply monitoring availability. IT organizations must craft a new approach in response to this trend.
February 13, 2008
A recent survey of 176 IT professionals, along with in-depth interviews with a number of IT professionals, found that over a quarter of network operations centers (NOCs) do not meet their organizations’ current needs. This market research pointed out that the inability of the NOC to identify issues before the user does hurts the overall credibility of the IT organization and that the role of the NOC is often not well understood – even within the IT organization.
The market research, which was sponsored by NetQoS, also showed that while the vast majority of NOCs are undergoing significant change, not all NOCs are starting at the same place in terms of the functionality that they currently provide. In addition, IT organizations do not have a common vision of the structure and functionality of the next generation NOC.
This article will summarize the market research findings. The article will also identify what IT organizations must do to migrate from the current stove-piped, reactive NOC to a proactive Integrated Operations Center (IOC) that effectively supports all components of IT. More details on the market research can be found at the NetQoS website.
RESPONDING TO A TROUBLE
One of the findings of the research is that the help desk typically routes issues that it cannot resolve to the NOC. This is not surprising since three quarters of the survey respondents also indicated that the network is generally assumed to be the source of application response time degradation.
There are also cultural reasons why the help desk typically routes issues that it cannot resolve to the NOC. For example, one of the IT professionals who was interviewed as part of the market research is a network analyst for a manufacturing company and will be referred to in this article as The Manufacturing Analyst. The Manufacturing Analyst stated that in his company, if there is an IT problem, the tendency of the user is to contact the NOC because, “We have always had the tools to identify the cause of the problems.”
Another interviewee is the manager of network management and security for a non-profit organization (The Management and Security Manager). He stated that as recently as a year ago his organization had a very defensive approach to operations, with a focus on showing that the network was not the source of a trouble. His current motto is “I don’t care what the problem is, we are all going to get involved in fixing it.” When asked if his motto was widely accepted within the organization he replied, “Some of the mentality is changing, but this is still not the norm.”
WHAT ABOUT ITIL (IT Infrastructure Library)?
There has been significant discussion over the last few years about using a framework such as ITIL to improve network management practices. To probe the use of ITIL, the survey respondents were asked if their organization has an IT service management process like ITIL in place or intends to adopt such a process within the next 12 months. The majority of respondents (62%) indicated that their organizations do have such a process in place. Of those respondents who do not, a similar percentage (63%) believe that their organization will put such a process in place within the next 12 months. The fact that 86% of respondents (the 62% who have such a process plus 63% of the remaining 38%) stated that their organization either has or will have within 12 months a service management process in place indicates the emphasis being placed within the NOC on improving its processes.
While the survey data highlighted the strong interest in ITIL, the interviewees were not as enthusiastic. For example, The Manufacturing Analyst stated that his organization has begun to use ITIL but they “do not live by the [ITIL] book.” He believes ITIL will make a difference, but probably not that big of a difference. In addition, as part of the market research a CIO of a medical supplies company was interviewed, and he will be referred to in this article as The CIO. The CIO said that his organization tried to use ITIL to improve some of its processes. However, while he does not disagree with the benefits promised by ITIL, he finds it to be too theoretical and he lacks the resources to get deeply involved with it.
HOW IS THE NOC PERCEIVED?
One of the most discouraging results of the market research is that only a small majority of Survey Respondents (58%) believe that the role of the NOC is understood by the entire IT organization. This finding will be elaborated on in the section of this article that discusses where NOC personnel spend their time.
The Survey Respondents were asked a series of questions regarding senior IT management’s attitude towards the NOC. The results are shown in Table 1.
TABLE 1: IT Management’s Perception of the NOC
Overall, the data in Table 1 is somewhat positive. However, there is a notable exception. Over a quarter of the total base of Survey Respondents indicated that the NOC does not meet the organization’s current needs.
One of the interviewees was the IT manager for a manufacturing company (The Manufacturing Manager). The Manufacturing Manager described his concerns about the ability of the network support organization to meet the organization’s current needs. He described the support organization as not being very network-savvy and said that once its staff are alerted to a possible trouble, they follow a simple script to try to resolve it. He said that his organization is not satisfied with the role and performance of the network support group and has considered outsourcing the function. They ultimately decided to keep the function in-house, but are committed to improving its ability to respond to problems.
WHAT DOES THE NOC DO?
When it comes to how the NOC functions, one of the most disappointing findings is that just under two thirds of the NOC respondents believe that the NOC tends to work on a reactive basis, identifying a problem only after it impacts end users. The CIO stated that the most frequent question he gets from users is, “Why don’t you know that my system is down? Why do I have to tell you?” He said that the fact that end users tend to notice a problem before IT does has the effect of eroding the users’ confidence in IT in general.
The conventional wisdom in our industry is that NOC efficiency is reduced because of the silos that exist within the NOC. In this context, silos means that the workgroups have few common goals, processes and tools. Just under half of the survey respondents indicated that their NOC has functional silos. In addition, a small majority (61%) of NOC personnel feel that they use many management tools that are not well integrated.
One of the interviewees was the network and systems manager for a multi-national conglomerate, and will be referred to as The Management Systems Manager. She stated that it is challenging to bring together the IT groups necessary to resolve a problem. She added that the group responsible for the performance of applications and servers has very little understanding of the network. The Manufacturing Analyst stated that having management tools that are not well integrated “is a fact of life.” He added that his organization has a variety of point products and does not currently have a unified framework for these tools. This is one of the issues his company is hoping to change with a NOC redesign project currently underway.
WHERE DOES THE NOC SPEND MOST OF ITS TIME?
The survey respondents were asked to indicate where NOC personnel spend most of their time. Table 2 shows the answers of just the NOC respondents.
TABLE 2: Where the NOC Spends the Most Time
One obvious conclusion that can be drawn from the data in Table 2 is that NOC personnel spend the greatest amount of time on applications. An additional conclusion is that NOC personnel support a broad range of IT functionality.
The Manufacturing Analyst said that his organization focuses on the availability of networks and does not get involved in problem resolution. He added, however, that there is a project underway to change how the NOC functions. The goal of the project is to create a NOC that is more proactive and which focuses both on performance and availability.
When analyzing where the NOC spends its time, however, equally interesting is the vast gap in perceptions between members of the IT organization based on whether they work inside or outside of the NOC. Table 3 indicates where NOC personnel say that they spend the greatest amount of time and contrasts that to where non-NOC personnel believe NOC personnel spend the greatest amount of time. As is clearly indicated, NOC personnel say they spend the most time on applications. However, non-NOC personnel not only do not perceive this, but roughly half of them believe that NOC personnel spend the greatest amount of their time on the WAN. This perception gap clearly supports the previously mentioned survey data that indicates the role of the NOC is not well understood outside of the NOC.
TABLE 3: Contrasting Views of the NOC
While the general trend is for the NOC to be more involved in supporting applications, not all organizations are heading in that direction. The Management Systems Manager pointed out that her organization does not focus on applications. In her company there is a separate performance and availability group that focuses on the performance of applications and servers.
WHAT DO NOC PERSONNEL MONITOR?
The Survey Respondents were asked four questions about what their NOC personnel monitor. The results from NOC respondents are shown in Figure 1.
FIGURE 1: What the NOC Monitors
The data in Figure 1 clearly shows that the NOC is almost as likely to monitor performance as it is to monitor availability. In addition, while there is still more focus in the NOC on networks, there is a significant emphasis on applications.
FACTORS DRIVING OR INHIBITING CHANGE
As shown in Table 1, over a quarter of the total base of survey respondents indicated that the NOC does not meet the organization’s current needs. This level of dissatisfaction with the NOC is supported by the data in Figure 2 that shows almost two thirds of the respondents believe their organizations will attempt to make significant changes in their NOC processes within the next 12 months.
FIGURE 2: Interest in Changing NOC Processes
FACTORS DRIVING CHANGE
The Survey Respondents were asked to indicate which factors will drive their NOC to change within the next 12 months. Their responses are shown in Figure 3.
FIGURE 3: Factors Driving Change in the NOC
Given that NOC personnel spend the greatest amount of time on applications, it is not surprising that the top driver of change is “greater emphasis on ensuring acceptable performance for key applications.” A related driver, “the need for better visibility into applications,” is almost as strong a factor driving change.
One of the interviewees is the global networking manager for an energy company (The Energy Manager). He stated that currently the NOC does not play a role in resolving application degradation issues. Within his company, application degradation issues are handled by network engineers. However, he pointed out that this approach has some limitations. In particular, application degradation issues can occur anytime and typically need to be addressed immediately. As a result, these issues are assigned to engineers around the world based on the time of day and who is working at that hour. Assigning application degradation issues to these engineers inhibits their ability to perform other job responsibilities. In addition, if the problem is not resolved before they leave work for the day, they either must brief another engineer in a different part of the world on the issue and what has been done, or put off working on the problem until the next day. The Energy Manager went on to say that they are currently in the process of providing training to selected NOC personnel to help them resolve application degradation issues. He commented that he has some concerns about how successful the current NOC personnel will be at resolving application degradation issues due to the specialized nature of the task.
FACTORS INHIBITING CHANGE
Particularly within large organizations, change is difficult. To better understand the resistance to change, we asked the Survey Respondents to indicate what factors will inhibit their organizations from improving the NOC. Their responses are shown in Figure 4.
FIGURE 4: Factors Inhibiting Change in the NOC
It was not surprising that the two biggest factors inhibiting change are the lack of personnel resources and the lack of funding. It is also not surprising that internal processes are listed as a major factor inhibiting change. The siloed NOC, the interest in ITIL and the need to make significant changes to NOC processes have been constant themes throughout this article.
The Management Systems Manager stated that her organization monitors network availability but does not monitor network performance. She added that her organization would like to monitor performance but “It is a resource issue. The only way we can monitor performance is if we get more people.” On a related issue, The Management Systems Manager said that due to relatively constant turnover in personnel, “Management vision changes every couple of years. Some managers have been open to monitoring performance while others have not believed in the importance of managing network performance.”
Even if her NOC does not begin to monitor performance, The Management Systems Manager doubts that the NOC will be able to meet the growing demand for its services a year from now. She said, “Hopefully the NOC will be allowed to scale. However, typically, growth in demand happens first and the growth of the NOC happens a lot later.”
CALL TO ACTION: THE NEXT-GENERATION INTEGRATED OPERATIONS CENTER
The market research that was presented in this article demonstrates that there is considerable dissatisfaction with the role currently played by the NOC, and as a result there is also widespread interest in making significant changes to the NOC. Given the interest in making significant changes to the NOC, this section will describe the key characteristics of a truly next generation NOC – one that integrates the operations of each component of IT.
An Integrated Operations Center (IOC) would not have to be housed in a single facility, nor would it necessarily have to be provided by a single organization within the IT function. However, independent of how it is organized, the IT professionals who work in an IOC must have a common language (e.g., everyone in the IOC has the same definition for the word “service”) and common goals. Below is a listing of the other key characteristics of an IOC as well as a summary of where the bulk of IT organizations currently stand relative to each characteristic.
Efficient Processes: There is clear recognition on the part of the survey base that the NOC needs to improve its processes. There is also clear acknowledgement that the vast majority of IT organizations will use ITIL as part of their process improvement efforts. However, The Manufacturing Analyst summarized the feeling of many IT professionals when he said, “ITIL will make a difference, but probably not that big of a difference.”
Focus on Performance: Today’s NOC is almost as likely to focus on performance as it is to focus on availability. This focus on performance will likely increase in the near term, in part because placing greater emphasis on ensuring acceptable performance for key applications is the strongest factor driving change in the NOC. However, as strong as the movement is to focus on performance, it is not universal. For example, as mentioned earlier, The Management Systems Manager pointed out that due to relatively constant turnover in personnel, “Management vision changes every couple of years. Some managers have been open to monitoring performance while others have not believed in the importance of managing network performance.”
Skilled Staff: In general, the skill set of NOC personnel has been increasing and the majority of NOC personnel are now performing functions that until recently were considered to be Level 2 or Level 3 functions. However, while the skill of NOC personnel has generally been increasing, there is still room for improvement. For example, both The Manufacturing Manager and The Energy Manager discussed the limited skill set of their NOC personnel as well as the attempts that their organizations are undertaking to increase these skill sets.
Automation & Intelligent Tools: Many NOCs have begun the shift away from having NOC personnel sitting at screens all day waiting for green lights to turn yellow or red. For example, The Management and Security Manager stated that his organization has implemented tools to automate most Level 1 issues. In addition, over a quarter of the NOC respondents indicated that their company has “eliminated or reduced the size of our NOC because we have automated monitoring, problem detection and notification.” This trend, combined with the trend to increase the skill set of NOC personnel, indicates that more intelligence is being placed in the NOC, and that intelligence is a combination of people and tools.
Integrated Set of Tools: As was pointed out by The Manufacturing Analyst, having management tools that are not well integrated “is a fact of life.” This situation, however, may be changing, as The Manufacturing Analyst also expressed a common theme of the market research when he added that tool integration is one of the biggest issues his organization hopes to address with the NOC redesign project they currently have underway.
Focus on Applications: NOCs currently have a significant focus on managing application performance. There is also very strong interest in having NOCs get better at managing application performance. As a result, it is highly likely that within the next two years the vast majority of operations centers will be managing application performance.
Focus on Security: NOC personnel do not currently spend a lot of their time on security. However, two thirds of the survey respondents indicated that a growing emphasis on security will impact their NOC over the next 12 months. In addition, almost half of the Survey Respondents indicated that combining network and security operations will impact their NOC over the next 12 months.
Being Proactive: In spite of the widespread interest in being proactive, the majority of the NOCs currently work on a reactive basis, identifying a problem only after it impacts end users. There is some evidence that this may be changing. For example, The Manufacturing Analyst expressed the feeling of many of the interviewees when he said that his organization has a project underway and that one of the goals of the project is to create a NOC that is more proactive.
The migration away from today’s stove-piped, reactive NOC to an effective IOC that exhibits the characteristics described above will not be easy. This migration will require the active involvement of both senior management and the rank and file members of the operations function. Part of senior management’s role is to articulate a clear vision of the future role of the operations center, and to be the champion of that role, both inside the IT organization and more broadly within the company. In addition, senior management must ensure the creation of a roadmap that leads to an effective IOC, and must also closely manage the journey.
While it is the role of senior management to create the vision and the roadmap, a major part of the role of the rank and file members of the operations function is to ground senior management in terms of what is possible in what timeframe. The rank and file must also work with senior management to establish a program composed of formal training, on-the-job training, and job rotations that leads to increasing and broadening the skills of the operations group. In addition, the rank and file must embrace change, as their jobs five years from now will have very little in common with what their jobs were five years ago.
Jim Metzler is the Vice President of Ashton, Metzler & Associates and focuses on the broad range of issues that impact an organization's ability to ensure acceptable application performance.