MPLS has become the primary WAN technology used by Enterprises to connect their global offices. As a networking technology, MPLS is primarily a backbone transport technology that’s deployed by service providers and is rarely, if ever, natively implemented at the customer premises. Instead, Enterprises will deploy some type of local access connectivity (e.g. P2P, Ethernet, Internet VPN via Broadband or wireless) to connect to their respective service provider’s MPLS service offering.

While MPLS technology is fairly standard in terms of the RFCs and overall base functionality, there are many differences in the features and capabilities that are implemented and offered from one vendor to the next.  We will examine some of these differences so that Enterprises will have the right knowledge base when evaluating solutions for meeting both their current and future business requirements.

MPLS Access Technologies

While traditional serial access technologies are considered the global standard for local access connectivity, many MPLS service providers have added access technologies to their portfolios in order to either a) provide cost competitiveness, whether deployed natively or as a solution over existing copper (e.g. T1 or DS3) facilities, or b) add more value to their solution offering.

Ethernet

The first major value point offered by Ethernet, as mentioned, is cost. Ethernet equipment and Ethernet interfaces are less expensive for both the service provider and the client. In addition to being a lower cost solution in terms of hardware, Ethernet also gives clients the ability to set incremental ranges for the bandwidth to which they are contractually committed. For example, if an Enterprise has an office in what appears to be a growth market, it may want a solution that provides the opportunity to grow the bandwidth as the number of people, as well as the application resources used by those people, increases. If the current requirement only justifies 2Mb worth of connectivity, but the business needs flexibility, then deploying a 10Mb Ethernet Access solution with a 2Mb MPLS port commitment would be a nice option, allowing the Enterprise to upgrade its bandwidth (usually within 48 hours) without having to go through a re-deployment (usually 6 to 8 weeks) in order to get a larger circuit.

However, as with any technology, Ethernet as an access solution does have some drawbacks that every Enterprise must be aware of. Ethernet, while being a very mature LAN technology, isn’t yet as widely deployed to business complexes throughout the United States and Europe as it is in many Asian and Indian markets. This means that the Enterprise will often see additional costs due to new build outs, as well as lengthier (4 to 6 month) installation time frames. However, in most cases, both of these issues are outweighed by the cost effectiveness and contractual bandwidth flexibility that’s offered by Ethernet. In addition to the extra cost and time associated with getting Ethernet to the actual building, there are also the costs of getting Ethernet extensions (whether copper or fiber) from the building’s main point of entry (MPOE) to the office’s server or Telco room. Quite often, whether the access solution is Native Ethernet or Ethernet over Copper, the Enterprise will need to pay either for fiber extensions and/or conduit (ranging from $1500 to $5000) or for multiple copper pair runs (ranging from $1500 to $5250) to accommodate Native 10/100BaseTX Ethernet extensions or Ethernet over Copper (e.g. 10Mb over 7 T1s) extensions.

In relation to cost, the Enterprise must compare the total cost of deployment with the needs of the business. For example, if the organization is going to be operating at a particular location for several years, and if that office needs bandwidth flexibility to accommodate both expansion and contraction, then committing to longer contractual terms can often reduce or eliminate the upfront cost associated with an Ethernet deployment. Otherwise, an organization may need to look at alternative access methods and/or cover these additional costs.
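The trade-off between contract term and upfront cost can be made concrete with a little arithmetic. The following is a minimal sketch, not a pricing model: the function name, the monthly recurring charge, and the assumption that a provider waives the build-out fee on a longer term are all hypothetical. Only the $5000 extension figure is taken from the range quoted earlier.

```python
def effective_monthly_cost(monthly_recurring, upfront, term_months, upfront_waived=False):
    """Spread any one-time cost evenly across the contract term."""
    if term_months <= 0:
        raise ValueError("term_months must be positive")
    one_time = 0.0 if upfront_waived else upfront
    return monthly_recurring + one_time / term_months


# Hypothetical quote: $1000/month recurring plus a $5000 fiber extension.
for term, waived in [(12, False), (36, False), (36, True)]:
    cost = effective_monthly_cost(1000.0, 5000.0, term, waived)
    print(f"{term} months, upfront waived={waived}: ${cost:.2f}/month")
```

Comparing the effective monthly figure across terms, rather than the recurring charge alone, shows whether a longer commitment actually offsets the build-out cost for a given site.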

The second major value point offered by Ethernet is that it can flexibly scale. If an Enterprise has a 10MbE access circuit for a given office and is only committing to 5Mb for MPLS network access, then that leaves the organization with 5Mb of bandwidth to spare on that access connection. Many vendors offer additional services and features that allow an Enterprise to utilize that bandwidth in several advantageous and cost effective ways. For example, if the vendor supports VLAN tagging in addition to QoS enforcement at the edge of their network, the Enterprise could easily add a VLAN for Internet access and terminate it to a firewall. If VLAN tagging is not supported on the access link, the Enterprise could inquire about setting up special QoS classes that can be dedicated either to external, SIP based voice services (e.g. IP based Toll-Free or Long Distance) or to sending Internet traffic to a cloud based firewall. These additional services, when integrated over existing Ethernet access links, are often much more cost effective than deploying these services on their own.
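As a rough illustration of the headroom calculation, the sketch below fits extra services into whatever the access circuit has left after the MPLS port commitment. The function name, the service names, and their sizes are hypothetical. Only the 10MbE circuit and 5Mb port come from the example in the text.

```python
def plan_spare_bandwidth(access_mbps, mpls_port_mbps, requested_services):
    """Fit extra services into the bandwidth left over on the access circuit.

    requested_services is a list of (name, mbps) pairs in priority order.
    Returns (accepted, spare_mbps): accepted service names and what is left.
    """
    if mpls_port_mbps > access_mbps:
        raise ValueError("MPLS port commitment exceeds the access circuit")
    spare = access_mbps - mpls_port_mbps
    accepted = []
    for name, mbps in requested_services:
        if mbps <= spare:  # only take services that fit in what remains
            accepted.append(name)
            spare -= mbps
    return accepted, spare


# 10MbE access circuit with a 5Mb MPLS port; the services are made up.
requests = [("internet-vlan", 3), ("sip-voice", 1), ("guest-wifi", 2)]
accepted, spare = plan_spare_bandwidth(10, 5, requests)
print(accepted, spare)
```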

Internet VPN via Broadband & Wireless

Internet VPN can be a very cost effective means for accessing an MPLS network. Many Enterprises will deploy this type of access solution when connecting Small Office/Home Office (SOHO) sites (e.g. executive suites like Regus, home offices, as well as teleworker mobile devices and laptops) to an MPLS network. Typically, this solution is driven by both cost and availability. Usually it’s not cost effective to deploy a dedicated connection to a SOHO location due to the number of people being supported or its location, and often a dedicated connection is not available to organizations that utilize executive office suites due to policies outside of their control. However, in addition to meeting the SOHO connectivity needs of the Enterprise, Internet VPN access can also provide an organization with a very cost effective MPLS WAN backup solution that doesn’t completely rely on the public Internet. For example, if the organization has a large corporate office or data center that’s housing many of the organization’s applications, it might make more sense to have the VPN tunnels terminate back into the service provider’s cloud instead of between sites directly. This would ensure the following: 1) if a host site’s primary connection failed, all remote sites would stay on their primary MPLS connection and the host site would use its VPN connection to the MPLS network, and 2) the VPN connection to the MPLS network will often provide critical Enterprise applications with much better transport performance during an outage by providing lower latency than the public Internet, as well as by enabling QoS enforcement once traffic is on the MPLS network.

As an access technology, the type of Internet connection being deployed is the key to this solution. If the location in question doesn’t have symmetrical bandwidth (e.g. an ADSL connection with 256Kbps upload/1.5Mbps download, or wireless 3G/4G data cards), then it’s likely that this solution will only be able to provide basic, private connectivity for specific data applications (e.g. email, file sharing, etc.). However, if the location has a dedicated, symmetrical connection (e.g. SDSL 1Mb up/down, Shared/Dedicated Internet, etc.) or a guaranteed asymmetrical high bandwidth connection (e.g. Cable Internet (2Mb up/12Mb down) with specific packet delivery SLAs), then Enterprises may be able to deliver multiple services (Voice, Video, and Data) over these Internet VPN access connections.
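The rule of thumb above can be sketched as a small decision function. This is only an illustration under stated assumptions: the `InternetLink` fields, the function name, and the 1000Kbps minimum upload threshold used to stand in for "high bandwidth" are hypothetical, and real suitability depends on the provider’s actual SLAs.

```python
from dataclasses import dataclass


@dataclass
class InternetLink:
    up_kbps: int
    down_kbps: int
    dedicated: bool = False     # dedicated (not best-effort) bandwidth
    delivery_sla: bool = False  # provider offers packet delivery SLAs


def supported_services(link, min_up_kbps=1000):
    """Return the services a VPN access link can plausibly carry."""
    symmetrical = link.up_kbps == link.down_kbps
    enough_upload = link.up_kbps >= min_up_kbps  # assumed threshold
    if enough_upload and symmetrical and link.dedicated:
        return ["voice", "video", "data"]
    if enough_upload and not symmetrical and link.delivery_sla:
        return ["voice", "video", "data"]
    return ["data"]  # basic private connectivity only


adsl = InternetLink(up_kbps=256, down_kbps=1500)
sdsl = InternetLink(up_kbps=1000, down_kbps=1000, dedicated=True)
cable = InternetLink(up_kbps=2000, down_kbps=12000, delivery_sla=True)
for name, link in [("ADSL", adsl), ("SDSL", sdsl), ("Cable", cable)]:
    print(name, supported_services(link))
```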

As far as cost is concerned, an Enterprise must absolutely research and compare one provider’s offering with another and negotiate using that research. If the service providers oversubscribe their VPN access points with both existing and future customers, and don’t offer any bandwidth guarantees, then their prices should be in line with such offerings. However, if the service provider offers dedicated bandwidth, monitoring, and the ability to set up policies on traffic that’s entering and exiting their network, there is likely much more value, and the price will be in line with those additional capabilities.

QoS Capabilities

Once an Enterprise has determined the amount of raw bandwidth that’s required by the organization, as well as the amount of flexibility it will need for either expanding or contracting that bandwidth, the next major capability an organization must research is the QoS offering available from each provider. Because an MPLS service can be looked at as an intelligent router in the cloud that’s managed by the service provider, it is very similar to Software as a Service, where all the capabilities and features of an application live off premises. Thus, while provider SLAs are still a valuable component in determining which provider has the right QoS offering, there are additional details that an Enterprise must know.

MPLS QoS Enforcement

The first of these is how and where a service provider enforces QoS. Some providers use a non-hierarchical QoS enforcement policy. This means that some or all traffic classes (usually voice and video) get allocated a specific amount of bandwidth (as a percentage of the whole or as a specific value) for their particular forwarding class. If one bit of data hits that forwarding class, then the entire amount of bandwidth dedicated to that class is carved out of the pipe to support that one bit. For example, suppose an Enterprise has a 10MbE access circuit with a 5Mb MPLS port and subscribes to a 2Mb forwarding class that provides a specific packet delivery guarantee for any traffic that traverses that class. If that Enterprise sends one voice call, the QoS policy doesn’t just allocate bandwidth for that one voice call; it utilizes the entire 2Mb of that forwarding class. Obviously, this isn’t an efficient use of bandwidth for something like a voice call; however, for “chatty” applications that dynamically scale their bandwidth usage while in session (e.g. Telepresence, Desktop Video Conferencing, etc.), reserving the forwarding class’s entire amount of allocated bandwidth may be required.

The other enforcement policy is exactly the opposite: a hierarchical policy. In this type of policy, forwarding class bandwidth reservations are allocated on a FIFO basis. For example, if traffic assigned to a higher class arrives at the same time as traffic in a lower forwarding class, the higher class traffic not only gets access to the bandwidth first, but it is only allocated as much bandwidth from its forwarding class (up to that class’s maximum bandwidth reservation) as it needs to complete its transaction within the defined SLA parameters of that particular forwarding class. This means that instead of a single 83Kbps VoIP call taking up the full 2Mb of the forwarding class that’s been allocated on the circuit, it will only take what it needs, allowing lower classes to utilize the remaining bandwidth. The result is an efficient solution that makes the best use of the available bandwidth.
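The difference between the two policies can be shown with the figures from the example: a 5Mb MPLS port, a 2Mb forwarding class, and one 83Kbps call. The sketch below is a simplified model rather than any provider’s implementation. The function names are hypothetical, and it assumes 1Mb equals 1000Kbps.

```python
def non_hierarchical_remaining(port_kbps, class_reservation_kbps, class_demand_kbps):
    """Bandwidth left for other classes when the whole reservation is
    carved out as soon as any traffic hits the class."""
    carved = class_reservation_kbps if class_demand_kbps > 0 else 0
    return port_kbps - carved


def hierarchical_remaining(port_kbps, class_reservation_kbps, class_demand_kbps):
    """Bandwidth left for other classes when the class only takes what it
    needs, capped at its reservation."""
    used = min(class_demand_kbps, class_reservation_kbps)
    return port_kbps - used


port, voice_class, one_call = 5000, 2000, 83  # Kbps
print(non_hierarchical_remaining(port, voice_class, one_call))  # 3000
print(hierarchical_remaining(port, voice_class, one_call))      # 4917
```

In this model the single call leaves 3000Kbps for lower classes under the non-hierarchical policy, against 4917Kbps under the hierarchical one.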

Number of MPLS QoS Classes

While managing how much bandwidth a given class will take from the overall circuit is important for application performance, it is also important to have the right number of classes. Often, Enterprises will account for application traffic (e.g. Citrix, email, FTP, voice, video, etc.), but tend to forget about signaling traffic (e.g. BGP and OSPF updates, SIP signaling, ICA signaling, etc.); this type of traffic is just as critical as end-user facing application traffic. Therefore, an organization should ideally assign the highest forwarding class to its connection state traffic (i.e. routing traffic and signaling that’s responsible for keeping the link active). Failing to isolate this traffic into the highest forwarding class can lead to protocol and session flaps during times of congestion, and thus affect the performance of the entire circuit, regardless of the application. Next, an organization needs to account for both the real-time traffic streams and the signaling sessions associated with them. Since signaling is usually only an incremental piece of any voice or video call, ideally that traffic should be classified to use the same forwarding class that the actual voice and video RTP traffic will use. Thus, when calculating the amount of forwarding class bandwidth that’s required, be sure to include the signaling traffic. Once the most critical traffic types are classified, an organization must determine how many classes will be needed to meet the SLAs associated with the Enterprise’s application portfolio. Some Enterprises may simply need one data class (usually because all of their applications are wrapped within Citrix or terminal services); others may need more forwarding classes for many different applications (e.g. Citrix or terminal services, Internet access via centralized sites, non-critical unified communication services such as messaging, etc.).
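One way to picture this guidance is as a class plan, with routing traffic in the highest class and call signaling sharing a class with the RTP streams. The sketch below is illustrative only: the class names, traffic labels, the best-effort default, and the 2Kbps per-call signaling allowance are hypothetical. The 83Kbps per-call figure is the one used earlier in the text.

```python
# Highest priority first. Class names and traffic labels are made up.
CLASS_PLAN = [
    ("network-control", {"bgp", "ospf"}),
    ("real-time", {"rtp-voice", "rtp-video", "sip-signaling"}),
    ("business-data", {"citrix", "terminal-services"}),
    ("best-effort", {"email", "ftp", "internet"}),
]


def forwarding_class(traffic_type):
    """Look up the class for a traffic type; unknown traffic is assumed
    to fall into the lowest class."""
    for name, members in CLASS_PLAN:
        if traffic_type in members:
            return name
    return "best-effort"


def realtime_class_kbps(calls, rtp_kbps_per_call=83, signaling_kbps_per_call=2):
    """Size the real-time class for RTP plus its signaling overhead."""
    return calls * (rtp_kbps_per_call + signaling_kbps_per_call)


print(forwarding_class("bgp"))
print(forwarding_class("sip-signaling") == forwarding_class("rtp-voice"))
print(realtime_class_kbps(10))
```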

End-to-End QoS Enforcement & Monitoring

Once an Enterprise has determined the “how” (i.e. the type of QoS needed and the number of classes needed), the next major QoS related requirement that must be reviewed with the service provider is where QoS is actually enforced and if the service provider gives the organization visibility at these enforcement points.

This item is one of the most critical QoS components as it relates to successfully meeting the application delivery SLAs of the organization. Nearly all service providers will enforce QoS at the edge of their networks as traffic enters their backbone; this ensures that the appropriate policies are being applied. However, if an Enterprise is oversubscribing its resource locations (e.g. corporate offices, data centers, regional offices, etc.) by not having bandwidth there that is equal to the sum of the bandwidth at the organization’s field offices, then it is equally important that a service provider enforce QoS at the egress point of their network. For example, if an Enterprise has 2 resource centers and 8 remote offices, and those 8 remotes have 10Mb each and the host site has only 50Mb, then the organization will experience oversubscription should all of the remote sites start communicating at the same time. This oversubscription would not be an issue if the service provider enforces QoS as traffic exits their network (i.e. allowing higher priority traffic to exit first, while buffering or dropping lower priority traffic as needed). Most Enterprise IT Architects, software engineers, and network engineers realize that TCP based applications can sustain an acceptable amount of retransmission while still meeting user expectations; this allows the organization to save money by oversubscribing available bandwidth. However, if the service provider doesn’t provide QoS enforcement for traffic exiting the network, the Enterprise will need to buy enough bandwidth to meet the SLA requirements dictated by the organization.
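The oversubscription in this example, and the effect of egress enforcement, can be approximated as follows. This is a simplified strict-priority model, not a description of how any particular provider schedules traffic. The function names and the offered per-class loads are hypothetical, while the 8 x 10Mb remotes and 50Mb host site come from the example.

```python
def oversubscription_ratio(remote_mbps, host_mbps):
    """Total remote bandwidth divided by the host site's bandwidth."""
    return sum(remote_mbps) / host_mbps


def egress_allocate(host_mbps, offered):
    """Serve classes in priority order (highest first) until the host
    link is full; whatever is not forwarded is buffered or dropped."""
    remaining = host_mbps
    forwarded = {}
    for name, mbps in offered:
        sent = min(mbps, remaining)
        forwarded[name] = sent
        remaining -= sent
    return forwarded


print(oversubscription_ratio([10] * 8, 50))  # 1.6

# Hypothetical load when all eight remotes transmit at full rate.
offered = [("voice", 20), ("business-data", 25), ("best-effort", 35)]
print(egress_allocate(50, offered))
```

With these made-up loads, voice and business data are forwarded in full and only best-effort traffic is held back, which is the behavior that makes oversubscribing the host site tolerable.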

While it is not always a key requirement for helping an Enterprise IT Architect meet organizational SLA objectives, having visibility into what is being classified, forwarded, buffered, and dropped by the service provider can dramatically improve an organization’s application troubleshooting capabilities when issues arise. For example, if an application is classified to use a given forwarding class and then exceeds the bandwidth allocated to that forwarding class, traffic may be re-classified, buffered, or dropped, all of which can impact user experience. Typically, a user will open a ticket for application related issues with the help desk; the help desk then reviews the application’s attributes as well as the user’s device, the server, and the network, escalating the issue as necessary. If an organization doesn’t have visibility into a critical piece of the network because it resides outside of the premises, not only is precious time wasted on troubleshooting to get to the root cause of a possibly chronic issue, but the identified root cause itself may turn out to be inaccurate, causing unnecessary expenses.

MPLS Network Support

After researching both the MPLS access technologies and QoS capabilities, an Enterprise must have a clear understanding of how the solution is supported by the service provider. In an environment where the transport features and capabilities are 100% owned by the service provider, an organization must have proactive notification and trouble ticketing for elements outside of its control. The interface on the service provider’s network should be visible to their NOC, consistent packet drops on that interface should be visible to their NOC, routing session changes should be visible to their NOC, network backbone issues should be visible to their NOC, and so on. Moreover, since these issues should be visible to the service provider’s NOC, proactive alerts and ticketing should be included in the service offering.

Ironically, many service providers do not monitor the connections between the customer premises and their own network, and if they do, quite often they put the responsibility of notification for resolution onto their clients. If this is the case, it is absolutely imperative to understand where the demarcation point exists between the organization’s network support team and the service provider’s NOC support group. Moreover, if proactive alerts and ticketing are not part of the service offering, then it is imperative that organizations negotiate any discounts needed to compensate for the lack of monitoring being provided by the service provider.

On the other hand, if the service provider offers proactive alerts and ticketing, the Enterprise must compare both the type of alerts being ticketed by each service provider and the NOC processes that each service provider uses to work individual tickets to resolution. If two providers have similar service offerings and features, but one of them offers an MTTR (mean time to repair) of 4 hours and the other offers 2 hours, then the Enterprise should use the provider with an MTTR of 2 hours, correct? No. The Enterprise should request a report from each service provider that shows its network-wide, customer-wide average MTTR for the different categories of outages. An SLA is only a written document with credits associated with it. Most of the credits don’t come close to the dollar figure associated with productivity loss and/or revenue loss.
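If a provider supplies such a report as raw ticket data, the per-category averages are easy to compute and compare against the advertised figure. The sketch below assumes a hypothetical report format of (outage category, hours to repair) pairs. The category names and the sample values are invented for illustration and are not real provider data.

```python
from collections import defaultdict
from statistics import mean


def mttr_by_category(tickets):
    """Average hours to repair for each outage category.

    tickets is an iterable of (category, hours_to_repair) pairs.
    """
    buckets = defaultdict(list)
    for category, hours in tickets:
        buckets[category].append(hours)
    return {category: mean(hours) for category, hours in buckets.items()}


# Invented sample data for a provider advertising a 2 hour MTTR.
sample_report = [
    ("access-circuit", 6.0),
    ("access-circuit", 2.0),
    ("backbone", 1.0),
    ("backbone", 3.0),
    ("routing", 0.5),
]
for category, hours in mttr_by_category(sample_report).items():
    print(f"{category}: {hours:.1f} hours")
```

A breakdown like this shows whether the headline MTTR reflects the outage categories that matter most to the Enterprise, such as access circuit failures.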