World Wide Packets 

OAM: Operations, Administration and Maintenance

OAM Protocols

With the addition of comprehensive OAM capabilities, Ethernet and MPLS now offer a complete feature set that allows carriers to maximize revenue from Ethernet-based services. The IEEE, IETF, ITU-T, and MEF now define two kinds of mechanisms: those that report the status of a given end-to-end service, representing a subscriber-centric view of the network, and those that provide link connectivity information, representing a provider-centric view of the network.

The following figure offers a high-level view of these mechanisms against the OAM process flow and the different failure categories.

Figure - OAM Protocols Matrix

IEEE 802.3ah Ethernet in the First Mile (EFM) OAM

EFM OAM provides link layer mechanisms that complement applications that may reside in higher layers (e.g., IEEE 802.1ag, MEF Service OAM). EFM OAM, also called link OAM, encompasses a simple protocol that operates across a single link.

IEEE 802.3ah EFM OAM
Features | Benefits
Auto-discovery | Eliminates the need for operator configuration
Uni-directional Fault Signaling | Enables the detection of a one-way link failure
Remote Loopback | Provides on-demand link diagnostics, including bit-error rate approximation
Link Monitoring | Offers proactive, traffic-based threshold link monitoring
Critical Events | Supports communication of network element conditions that may cause link failure, e.g., power and temperature
Layer 2 Variable Retrieval | Allows supplemental link statistics collection, augmenting SNMP
Organization-specific extensions | Enables standards development organizations and vendors to expand scope

Thresholds are configured to monitor signal degradation (e.g., frame errors). Messages are passed across the link to communicate statistics regarding link health. When a failing link is detected, the condition is reported to management stations via SNMP. In addition, the link may be taken out of service and placed in remote loopback mode for fault isolation.
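The threshold monitoring described above can be pictured with a short sketch. It is illustrative only: the one-second windows, the threshold value, and the function name are assumptions made for this example, not values taken from IEEE 802.3ah or from any product.

```python
from typing import Iterable, List


def errored_frame_events(errors_per_window: Iterable[int], threshold: int) -> List[int]:
    """Return the indexes of monitoring windows whose frame-error count
    meets or exceeds the configured threshold (hypothetical helper)."""
    return [i for i, errors in enumerate(errors_per_window) if errors >= threshold]


# Frame errors counted in five consecutive one-second windows (made-up data).
samples = [0, 1, 0, 7, 12]
print(errored_frame_events(samples, threshold=5))  # prints [3, 4]
```

In a real deployment, each crossing would trigger an event notification to the link partner and, typically, an SNMP notification to the management station.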

Prior to placing a link in service, EFM OAM remote loopback may be used to test the performance of the link. Once verified to be operational and error-free, the link is taken out of remote loopback and placed in service.

Standby links may be continuously tested prior to being activated by protocols such as IEEE 802.1w Rapid Spanning Tree Protocol or IEEE 802.1aq Shortest Path Bridging (in development).

IEEE 802.1ag Connectivity Fault Management (CFM)

Building upon IEEE 802.3ah EFM OAM, IEEE 802.1 is working on a project to provide CFM capabilities for detecting, isolating, and reporting connectivity faults for VLAN-based service transport networks. CFM monitors and troubleshoots faults at both the physical and the logical level. For instance, physical links between adjacent or distant devices can be monitored using CFM. In addition, fault monitoring between two end points can be configured based on a logical network layer (e.g., per-VLAN).

IEEE 802.1ag CFM
Features | Benefits
Continuity Check | Continuously verifies VLAN connectivity and may indicate network faults or misconfigurations
Loopback Request (MAC Ping) | Offers on-demand or proactive indication of VLAN control-plane responsiveness
Linktrace Request (MAC Traceroute) | Provides on-demand or proactive VLAN topology information

The CFM protocol, often called Ethernet OAM, sends heartbeat-style Continuity Check Messages (CCMs). Failure to receive these messages, in order, within a set amount of time indicates one or more possible network errors, including path or device failure or network configuration problems. Management stations monitor the status of CCM reception and take appropriate action.
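The heartbeat check can be sketched as a simple timeout test. This is a minimal illustration, not an implementation of the standard: the class and field names are hypothetical, and the timeout of 3.5 times the CCM interval is the value commonly associated with IEEE 802.1ag, treated here as an assumption.

```python
from dataclasses import dataclass

# Assumption for this sketch: continuity is considered lost when no CCM
# has arrived for 3.5 times the configured CCM transmission interval.
LOSS_MULTIPLIER = 3.5


@dataclass
class RemoteMep:
    """State kept for one remote end point (hypothetical structure)."""
    mep_id: int
    ccm_interval_s: float   # configured CCM transmission interval, in seconds
    last_ccm_at_s: float    # timestamp of the most recently received CCM

    def continuity_lost(self, now_s: float) -> bool:
        """True if the time since the last CCM exceeds the timeout."""
        return (now_s - self.last_ccm_at_s) > LOSS_MULTIPLIER * self.ccm_interval_s


peer = RemoteMep(mep_id=101, ccm_interval_s=1.0, last_ccm_at_s=10.0)
print(peer.continuity_lost(now_s=13.0))  # False: 3.0 s of silence is within the timeout
print(peer.continuity_lost(now_s=14.0))  # True: 4.0 s of silence exceeds 3.5 s
```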

Troubleshooting tools are provided in the form of MAC ping (formally known as IEEE 802.1ag Loopback Request) and MAC traceroute (formally known as IEEE 802.1ag Linktrace Request). These features may be initiated by network operators or run automatically in background processes as monitoring functions.

Since CFM is being developed after IEEE 802.1ad Provider Bridges was completed, a second important aspect of the project is to enable multiple nested Maintenance Domains (MDs) to coexist on the same physical network, each potentially managed by a different administrative organization (service provider or network operator). Figure 5 shows an example of three nested domains.
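One way to picture how nested domains stay separate is by maintenance domain level. The sketch below is a simplification under stated assumptions: it assumes levels numbered 0 to 7, with wider (outer) domains using higher levels, and the function name and return strings are hypothetical.

```python
def handle_cfm_frame(mep_level: int, frame_level: int) -> str:
    """Decide what an end point at mep_level does with a CFM frame
    carrying frame_level (simplified, illustrative rule set)."""
    if not (0 <= mep_level <= 7 and 0 <= frame_level <= 7):
        raise ValueError("maintenance domain levels range from 0 to 7")
    if frame_level > mep_level:
        return "forward"   # belongs to an enclosing domain: pass through untouched
    if frame_level == mep_level:
        return "process"   # belongs to this domain: handle locally
    return "drop"          # leaked from an inner domain: do not let it escape


# An operator end point at level 2 sees frames from three nested domains.
for level in (5, 2, 0):
    print(level, handle_cfm_frame(mep_level=2, frame_level=level))
```

Under these assumptions, a customer's OAM frames cross the provider and operator domains transparently, while each inner domain's frames stay inside its own boundary.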

ITU-T Y.1731

ITU-T Study Group 13 developed Y.1731 in cooperation with IEEE 802.1ag CFM, defining further VLAN-based service transport OAM functionality. Several of the additional features offer performance monitoring capabilities. ITU-T Y.1731 and CFM use an identical frame format and share the same OpCode space. As a result, these complementary protocols are simpler to deploy in a service provider’s network.

The following table provides a summary of the features contained in Y.1731.

ITU-T Y.1731
Features | Benefits
Alarm Indication Signal (ETH-AIS) | Provides fault notification for devices not participating in the VLAN-based Ethernet Continuity Check
Remote Defect Indication (ETH-RDI) | Offers fault indication of the other end of a VLAN-based Ethernet service
Locked Signal (ETH-LCK) | Enables maintenance actions while being able to differentiate and isolate actual fault conditions
Test Signal (ETH-Test) | Allows a one-way, on-demand, in-service or out-of-service VLAN test (e.g., throughput or frame loss)
Performance Monitoring (ETH-PM) | Monitors traffic performance on a point-to-point, end-to-end VLAN-based Ethernet service
Frame Loss Measurement (ETH-LM) | Collects end-to-end frame loss information to approximate severely errored seconds, which indicates VLAN-based service transport availability
Frame Delay Measurement (ETH-DM) | Provides an on-demand Frame Delay and Frame Delay Variation measurement between two points of the VLAN-based service

In VLAN-based service transport networks, certain network elements are configured as Maintenance End Points (MEPs). These MEPs sit at the boundaries of Ethernet domains. The following figure shows the span of the different OAM mechanisms offered by Y.1731.

Figure - ITU-T Y.1731 Architecture

MPLS

World Wide Packets offers a solution that transports Ethernet services either natively or using MPLS encapsulation. Deploying MPLS to the customer premises facilitates the interconnection of the access infrastructure with the existing MPLS core network, but it also increases the need for MPLS-specific OAM tools.

MPLS OAM
Features | Benefits
Label Switched Path (LSP) Ping | Offers on-demand connectivity information about MPLS tunnels
LSP Traceroute | Provides MPLS switching and Maximum Transmission Unit (MTU) configuration information
Virtual Circuit Connection Verification (VCCV) | Enables proactive connectivity monitoring of MPLS pseudowires
Bi-directional Forwarding Detection (BFD) | Allows scalable, proactive data-plane verification of MPLS LSPs
Fast ReRoute | Provides automated repair of MPLS failures

LSP Ping

LSP ping is an in-band, on-demand mechanism to verify the status of an MPLS tunnel. An LSP can fail because of misconfigurations such as MPLS being disabled, mismatched labels, or traffic being misrouted into the wrong tunnel. It can also fail due to broken Label Distribution Protocol (LDP) adjacencies, corruption of Forwarding Information Bases (FIBs), or other software/hardware failures.

LSP ping sends an echo request to a target Label Switch Router (LSR) using MPLS addressing. The destination IP address of the echo request packet is chosen from the 127.0.0.0/8 range, which prevents the packet from being IP-routed to its destination if it leaves the LSP. If reached, the destination LSR sends an echo reply back to the originator of the MPLS echo request.
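The address constraint above is easy to check with Python's standard ipaddress module. The helper name is hypothetical, and the sketch covers only the IPv4 range named in the text.

```python
import ipaddress

# The IPv4 range named in the text; packets addressed here are never IP-routed.
LSP_PING_DEST_RANGE = ipaddress.ip_network("127.0.0.0/8")


def is_valid_lsp_ping_destination(address: str) -> bool:
    """True if the address falls inside 127.0.0.0/8 (hypothetical helper)."""
    return ipaddress.ip_address(address) in LSP_PING_DEST_RANGE


print(is_valid_lsp_ping_destination("127.0.0.1"))  # True
print(is_valid_lsp_ping_destination("10.1.2.3"))   # False
```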

LSP Traceroute

LSP traceroute is used to determine the hop-by-hop path and destination of an LSP. Like LSP ping, it is an in-band, on-demand MPLS OAM utility. It can also be used to detect MTU misconfiguration between LSRs.

LSP Traceroute also uses an MPLS echo request/reply mechanism. However, with LSP traceroute, all LSRs along the path up to and including the destination LSR reply to the echo request. This technique allows the operator to identify and distinguish LSRs along a path.
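The hop-by-hop behavior can be simulated in a few lines. The sketch assumes the usual traceroute technique of sending echo requests with an increasing time-to-live (TTL), so that each LSR in turn answers; the text above does not spell out this mechanism, and the function names and the list-based path model are hypothetical.

```python
from typing import List, Optional


def send_echo_request(path: List[str], ttl: int) -> Optional[str]:
    """Simulate one echo request: the LSR where the TTL expires replies,
    or the egress LSR replies if the TTL reaches the end of the path."""
    if ttl < 1 or not path:
        return None
    return path[min(ttl, len(path)) - 1]


def lsp_traceroute(path: List[str], max_ttl: int = 30) -> List[str]:
    """Collect the replying LSRs by probing with TTL = 1, 2, 3, ..."""
    hops: List[str] = []
    for ttl in range(1, max_ttl + 1):
        replier = send_echo_request(path, ttl)
        if replier is None:
            break
        hops.append(replier)
        if replier == path[-1]:   # the destination LSR answered: trace complete
            break
    return hops


print(lsp_traceroute(["P1", "P2", "PE2"]))  # ['P1', 'P2', 'PE2']
```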

Virtual Circuit Connection Verification (VCCV)

Using LSP ping, a service provider can monitor the status of an MPLS tunnel. To diagnose a problem within the tunnel, the service provider needs a mechanism to verify the connectivity of the pseudowires (VCs). VCCV allows the proactive monitoring of pseudowires within MPLS tunnels, by establishing a control channel associated with each pseudowire.

Bi-directional Forwarding Detection (BFD)

VCCV requires involvement of the MPLS control-plane. This means that as the number of VCs increases, so does the load on the control-plane. BFD allows systematic and more scalable detection of MPLS LSP data-plane failures with less involvement from the control-plane. As a result, BFD allows faster detection on a larger number of LSPs.

BFD relies on a hello packet exchanged by neighbors at negotiated regular intervals. When a hello packet is not received as expected, the neighbor is declared down.
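How long "as expected" lasts depends on the negotiated timers. The sketch below follows the asynchronous-mode rule as commonly described for BFD: the detection time is the remote detect multiplier times the larger of the local required receive interval and the remote desired transmit interval. The parameter names are illustrative, and the rule should be checked against the BFD specification before being relied on.

```python
def bfd_detection_time_ms(remote_detect_mult: int,
                          local_required_min_rx_ms: int,
                          remote_desired_min_tx_ms: int) -> int:
    """Return how long the local system waits without receiving a hello
    packet before declaring the neighbor down (illustrative calculation)."""
    # The neighbor transmits no faster than the local side is willing to receive.
    agreed_interval_ms = max(local_required_min_rx_ms, remote_desired_min_tx_ms)
    return remote_detect_mult * agreed_interval_ms


# Neighbor wants to send every 100 ms, local side accepts 50 ms, multiplier 3.
print(bfd_detection_time_ms(3, 50, 100))  # 300
```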

Fast ReRoute (FRR)

Fast ReRoute allows automated repair of LSP tunnels to reduce packet loss on LSPs. If there is a link or node failure, an LSP that employs Fast ReRoute is able to redirect MPLS traffic to previously computed and established alternate paths around the failed link or node. The alternate paths are selected during the establishment of a primary LSP under hop-by-hop control. With Fast ReRoute enabled, Resource Reservation Protocol-Traffic Engineering (RSVP-TE) establishes local alternate LSPs for each potential point of failure along the primary path.
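The local-repair idea can be sketched as a lookup into detours that were set up in advance. The example models link protection only and uses a hypothetical data layout (node names, a dictionary of bypass paths keyed by protected link); it is not how RSVP-TE represents this state.

```python
from typing import Dict, List, Set, Tuple

Link = Tuple[str, str]


def select_path(primary: List[str],
                bypasses: Dict[Link, List[str]],
                failed: Set[Link]) -> List[str]:
    """Walk a non-empty primary path; wherever a link has failed, splice in
    the bypass that was established in advance for that link."""
    path = [primary[0]]
    for a, b in zip(primary, primary[1:]):
        link = (a, b)
        if link in failed:
            detour = bypasses.get(link)
            if detour is None:
                raise RuntimeError(f"no pre-established bypass for link {link}")
            path.extend(detour[1:])   # the detour starts at a and rejoins at b
        else:
            path.append(b)
    return path


primary = ["A", "B", "C", "D"]
bypasses = {("B", "C"): ["B", "X", "C"]}   # set up before any failure occurs
print(select_path(primary, bypasses, failed={("B", "C")}))
# ['A', 'B', 'X', 'C', 'D']
```

Because the bypass already exists when the failure happens, the switch is a local decision at the node upstream of the failure and does not wait for the head end to re-signal the LSP.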

MEF Service OAM

The Metro Ethernet Forum is actively pursuing a complementary set of OAM-related functions operating at the Service Level Agreement layer. The Phase 1 specification will contain performance monitoring capabilities for point-to-point services reflecting the frame loss ratio, frame delay (latency), and frame delay variation (jitter) characteristics of the service.
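The three metrics can be illustrated with simple arithmetic. The definitions below are simplified assumptions for illustration (for example, delay variation is taken as the absolute difference between consecutive delay samples); the MEF specifications define the exact measurement rules, and the function names are hypothetical.

```python
from typing import List


def frame_loss_ratio(frames_sent: int, frames_received: int) -> float:
    """Fraction of sent frames that did not arrive."""
    if frames_sent <= 0:
        raise ValueError("frames_sent must be positive")
    return (frames_sent - frames_received) / frames_sent


def frame_delay_variation(delays_ms: List[float]) -> List[float]:
    """Absolute difference between each pair of consecutive delay samples."""
    return [abs(later - earlier) for earlier, later in zip(delays_ms, delays_ms[1:])]


print(frame_loss_ratio(1000, 990))                # 0.01
print(frame_delay_variation([10.0, 12.0, 11.0]))  # [2.0, 1.0]
```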

MEF Service OAM
Features | Benefits
Point-to-point Ethernet Virtual Circuit (EVC) Performance Monitoring (PM) | Provides Service Level Agreement assurance for different services
Point-to-multipoint EVC PM | Provides Service Level Agreement assurance for different services
Multipoint-to-multipoint EVC PM | Provides Service Level Agreement assurance for different services
EVC Fault Management | Enables identification and isolation of faults at the Service Level Agreement layer

In addition, per-service fault management will be supported for point-to-point, point-to-multipoint, and multipoint-to-multipoint services. Fault detection encompasses loss of continuity between maintenance end points and detection of potential loops in the service. This fault detection and verification capability is supported proactively or on demand through operator action. MEF Service OAM, often called Service OAM, also provides fault isolation and fault notification.

IP

One of the main benefits of Ethernet services is low deployment cost, since individual data-plane elements do not require IP provisioning. However, the control-plane mostly uses IP-based protocols, such as Telnet, SNMP, or IGMP. There is therefore a need to detect control-plane failures at the IP level. Two mechanisms have been in use since the early days of IP networking: IP ping and IP traceroute.

IP Layer
Features | Benefits
IP Ping | Provides on-demand connectivity verification of the IP control-plane
IP Traceroute | Offers routing and delay information for an IP destination

IP Ping

IP ping is a basic mechanism that verifies IP connectivity through the network. It verifies that a given IP address exists, is reachable and can accept ping requests. It also calculates the latency between the control-planes of two IP network elements.

IP Traceroute

IP traceroute is another OAM tool that records and displays the route that IP messages follow between two IP elements. It also calculates the latency between the control-planes of each IP element along the route.
