Luc-Yves Pagal Vinette
The key ingredients of an NFVi/ViM solution for Edge Disaggregated networks.
Last updated: Feb. 24, 2022
The telecom industry has changed radically since the inception of the idea of separating hardware from software, described as SDN/NFV. ETSI and other bodies such as MEF, ONF and LF, and more recent ones such as TIP, have been instrumental in shifting our market perspectives and in defining new and innovative paradigms for how service infrastructure could be reviewed. They have also shown how service providers could alter their service infrastructure to integrate this new definition of services using a layered structure that we know as MANO, which was initially established and later revisited by ETSI.
ETSI MANO was immediately received as a set of service infrastructure guidelines with clearly defined layers. The ETSI MANO approach has influenced many other SDOs (Standards Development Organizations); however, market actors have used it as a guideline and rarely implemented it as such. Its concepts had a profound impact on the market, on Cloud service concepts, and on the best practices that service providers of any kind would follow when transforming their existing service infrastructure, while anticipating the best ways of delivering next-gen services in which services/functions are software-driven instead of hardware-dependent.
SDN/NFV, with all its promises, has been extremely slow to deliver its expected benefits, but as a side effect Cloud services have been thriving all over the globe. However, can we simply transfer the successful recipes of Cloud services to Edge contexts?
What are these typical key ingredients?
· Power: abundant in the Cloud, limited at the Edge
· Space: relatively open in the Cloud, limited or even confined at the Edge
· Compute and service density requirements: moderately important in the Cloud, unquestionably critical at the Edge
· Real-time OS kernel fabric: extremely important to address the Edge service requirements of low latency/jitter and Service Chain principles, and to ensure immediate dataplane changes
Cloud and Edge may share similar traits, but they differ drastically in how they support and deliver services. But how so? Let’s explore how these key ingredients impact service delivery, efficiency and monetization potential.
What do Edge and Disaggregation really mean?
We often use the terms edge and disaggregation together without a complete explanation of what they describe and entail. The Edge should be seen as the continuation of the Cloud and should normally follow the same technology directions. We should see the Edge as a way to distribute apps/functions/services that would generally be centralized in Datacentre locations, bringing them closer to end users.
What could be typical Edge locations?
As described above, Datacentres distribute applications over Central Office locations, usually identified as Core PoPs. More time-sensitive or business-critical applications can therefore be further distributed to lower Edge locations such as Aggregation and/or Access PoPs. As depicted above, the more I look at it, the more I consider the customer domains and associated devices (aka uCPE) an integral part of the Edge.
What does disaggregation mean in such a case?
The term disaggregation is often used without a proper meaning or context, especially at the Edge of the service infrastructure. It simply means that the dataplane, or software service operation, is separated from the orchestration piece, while still falling under the same orchestration/automation/management umbrella, which is also shared by the NFVi/ViM. This concept of separation is not new: it has been used throughout telecom service history and was defined in the recent past as dataplane/control plane separation. The most notable examples are MPLS-based technologies, where the separation of the dataplane and the control plane has greatly helped in scaling telecom infrastructures all over the globe.
Today, the concept hasn’t changed but has simply evolved to a point where hardware is now separated from software, and software can now be disaggregated under a 3rd party service intelligence called the OSL (Orchestration Service Layer). As said earlier, disaggregation implies that the NFVi/ViM layer is ideally open enough to be “orchestrated” by any orchestration platform. Therefore, if the orchestration platform is based on open or proprietary concepts (TOSCA, REST-based interfaces, NETCONF, RESTCONF, BGP-M, XMPP, etc.), it should theoretically be able to onboard and control any VNFs/CNFs or PNFs, and orchestrate and support Multi-Cloud capabilities, also called Multi-VIM in an ONAP context.
What are the NFVi Cloud/Edge requirements?
Without repeating what we’ve all read and heard in our industry, Cloud and Edge locations do share similar requirements, notably in terms of service construct and operations. However, they don’t share the same constraints concerning available space and related footprint, real-time sensitiveness and scalability.
Space and footprint: Datacentre locations and Edge locations such as Core PoPs (Central Office) and Aggregation or Access PoPs (other edge locations) don’t share the same physical constraints. Datacentres offer plenty of space to grow an environment almost as far as CSPs see fit financially. Edge locations, on the other hand, have limited space and room to grow, and access to power can occasionally be complex. They therefore require NFVi/ViM solutions that can be deployed in the smallest possible footprint while supporting Cloud delivery software such as OpenStack or Docker-type environments. Without a minimal footprint, Edge locations wouldn’t be able to support enough disaggregated functions (VNFs or CNFs) to ensure a minimum level of service monetization.
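To make the footprint argument more tangible, here is a minimal back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration only: the `Host` and `Footprint` types, the `max_functions` helper and all the core/RAM figures are hypothetical and do not describe any real whitebox or NFVi/ViM product. It simply shows that whatever the abstraction layer reserves for itself is no longer available to revenue-generating functions.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """Hypothetical whitebox resources (illustrative figures only)."""
    cores: int
    ram_gb: int


@dataclass
class Footprint:
    """Hypothetical resources reserved by the NFVi/ViM layer itself."""
    cores: int
    ram_gb: int


def max_functions(host: Host, nfvi: Footprint, fn_cores: int, fn_ram_gb: int) -> int:
    """Count identical VNFs/CNFs that fit once the NFVi/ViM overhead is removed."""
    free_cores = max(host.cores - nfvi.cores, 0)
    free_ram = max(host.ram_gb - nfvi.ram_gb, 0)
    # The scarcer of the two resources limits the number of functions.
    return min(free_cores // fn_cores, free_ram // fn_ram_gb)


edge_box = Host(cores=8, ram_gb=32)
heavy = Footprint(cores=4, ram_gb=16)  # assumed heavyweight stack
light = Footprint(cores=1, ram_gb=4)   # assumed low-footprint stack

print(max_functions(edge_box, heavy, fn_cores=2, fn_ram_gb=4))  # 2
print(max_functions(edge_box, light, fn_cores=2, fn_ram_gb=4))  # 3
```

With these assumed numbers, the lighter stack leaves room for one more function on the same box, which is the kind of difference that matters for monetization at a constrained Edge site.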
Real-time sensitiveness: Edge and even Cloud have already been challenged by time-sensitive applications/services such as ToIP, UC&C, etc. However, there are infrastructure-based operations inherent to certain edge or cloud use cases that impose very strict timing requirements on latency and jitter. Mobile Network Transformation, NFV-enabled Satellite Networks and even Service Function Chaining (SFC) are typical contexts where a real-time OS kernel could do wonders.
In the context of Mobile Networks, a given service infrastructure is required to distribute the best possible clock information over the dataplane using functions such as Synchronous Ethernet, but also to synchronize the entire distribution network with a correct alignment on the TOD (Time of Day) using other functions such as 1588v2 (Precision Time Protocol) or, for given profiles, TSN (Time-Sensitive Networking). The evolution towards 5G starts with the RU/DU split and its Backhaul network requirements, and goes further with the CU/DU split, leading to Midhaul and Fronthaul requirements, where network requirements push the sensitivity to delay and jitter to its extremes. Satellite networks are not spared either: QoS allocation on a per-satellite-channel (beam) basis is also very sensitive to delay. Nor should we neglect how important Service Function Chaining is: it establishes the logical/virtual chain between the various functions (VNFs/CNFs) that make up what we call a managed service, which is naturally composed of multiple elements.
All of these examples share a consistent set of services and operations with the NFVi/ViM service construct. The NFVi/ViM building blocks allow the virtualization/containerization of applications or services, provide monitoring tools, ensure Zero Touch operations (Provisioning & Configuration), and guarantee the support of the southbound protocols required for SFC (Service Function Chaining) and of 3rd party orchestration capabilities, to widen options and market openness. To address the time sensitivity of any application or mode of operation, it is therefore crucial that the NFVi/ViM solution be based on a real-time OS kernel alongside its other typical functions.
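As a rough illustration of why chaining and timing go together, the sketch below models a managed service as an ordered chain of functions and checks it against an end-to-end latency budget. The function names, the per-hop latency figures and the helpers `chain_latency_ms` and `within_budget` are all hypothetical; a real SFC implementation would rely on the southbound protocols of the NFVi/ViM rather than on a Python list.

```python
from typing import List, Tuple

# A chain is an ordered list of (function name, assumed per-hop latency in ms).
Chain = List[Tuple[str, float]]


def chain_latency_ms(chain: Chain) -> float:
    """Sum the assumed per-hop latencies along the chain."""
    return sum(latency for _, latency in chain)


def within_budget(chain: Chain, budget_ms: float) -> bool:
    """Return True when the whole chain fits the end-to-end latency budget."""
    return chain_latency_ms(chain) <= budget_ms


# Hypothetical managed service: firewall, then SD-WAN, then router.
managed_service: Chain = [
    ("vFirewall", 0.5),
    ("vSD-WAN", 0.25),
    ("vRouter", 1.0),
]

print(chain_latency_ms(managed_service))    # 1.75
print(within_budget(managed_service, 2.0))  # True
print(within_budget(managed_service, 1.5))  # False
```

Every function added to the chain consumes part of the budget, so any jitter introduced by the underlying kernel is paid once per hop.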
Scalability: We often use the term scalability in our presentations and when pitching our respective products and services. However, scalability in the NFVi/ViM context holds a different meaning and can only reasonably be achieved when multiple elements are aligned. First, the hardware of the selected whitebox has to be aligned with the NFVi/ViM to ensure an adequate fit between the NFVi/ViM and its host. Second, the whitebox has to be dense and scalable at the granular compute level (CPU, memory and storage options) so that the setup can scale as the number of customers to support changes over time. Third, it requires a low-footprint NFVi/ViM combined with a fully open set of northbound API options (RESTCONF, NETCONF-based or others).
Can Cloud and Edge support different NFVi/ViM solutions?
This is a tough question to address in a limited chapter. I’ll attempt an answer, and I would encourage readers to bring their own views on this complex subject.
My answer would be an absolute and resounding YES!! But this answer comes with a very important question: how open is your Orchestration Service Layer?
We often address different service and application requirements differently on a per-NFVi/ViM basis, depending on how much multi-tenancy is needed and on how widely spread, how fast and how responsive applications or services need to be. This sheds a different light on how orchestration service solutions should be considered. Subsequently, the question of Multi-Cloud or Multi-ViM becomes extremely relevant, as do other critical capabilities such as:
· Easy and wide open for onboarding VNFs/CNFs
· Capable of administrating/controlling PNFs
· Capable of supporting multiple clouds, and therefore multiple NFVi/ViM service platforms (OpenStack, Kubernetes, Juke, Docker, etc.), while also integrating different hardware asset management software such as MaaS, Ansible, Puppet, etc.
Do we need a ViM per set of requirements?
My view on this is actually rather simple: there is no ViM solution on the market capable of addressing all requirements. Given specific conditions, some OpenStack vendor implementations can be extremely relevant for Cloud and less relevant for Edge, and vice versa. The same reasoning applies to the question of OpenStack vs Kubernetes/Docker. There is no distinct and clear winner, as they can both co-exist within the same service infrastructure and under the same Orchestration Service Layer umbrella. In the end, I believe that there are solutions for all circumstances and that they apply very differently to each set of requirements, so the question of relevance should be put to the Orchestration Service Layer: can it handle multiple NFVi/ViM solutions in a given infrastructure?
The best examples of this are how Cloudify, Blue Planet and others operate on the market as MSDOs (Multi-Service Domain Orchestrators), or ONAP with its Multi-VIM component. I won’t dive into this here (it is one of my favorite subjects). However, it remains important to mention that most orchestration platforms on the commercial or open markets (Cisco, Juniper, ADVA, Cloudify, Blue Planet, etc.) have an intrinsic design capability for supporting multiple NFVi and ViM platforms. Similarly, within the open source community, ONAP has paved the way towards a market where multiple definitions of clouds and functions will be required: components such as Multi-VIM, SDN-C, APP-C and VFC have shown support for multiple VIMs and Network Functions (PNFs, VNFs and CNFs).
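The multi-ViM idea discussed above can be sketched as a simple adapter pattern: the Orchestration Service Layer talks to one common interface, and each NFVi/ViM sits behind its own adapter. This is only a toy model under stated assumptions. The `VimAdapter`, `OpenStackLikeVim`, `KubernetesLikeVim` and `OrchestrationServiceLayer` classes are hypothetical names invented for this example; they do not reproduce the APIs of OpenStack, Kubernetes, ONAP, Cloudify or any other product.

```python
from abc import ABC, abstractmethod
from typing import Dict


class VimAdapter(ABC):
    """Hypothetical interface an OSL could expect from any NFVi/ViM."""

    name: str

    @abstractmethod
    def deploy(self, function: str) -> str:
        """Deploy a network function and return a human-readable result."""


class OpenStackLikeVim(VimAdapter):
    name = "openstack-like"

    def deploy(self, function: str) -> str:
        return f"{function} started as a VM on {self.name}"


class KubernetesLikeVim(VimAdapter):
    name = "kubernetes-like"

    def deploy(self, function: str) -> str:
        return f"{function} started as a container on {self.name}"


class OrchestrationServiceLayer:
    """Toy OSL that routes each function to the ViM registered for its site."""

    def __init__(self) -> None:
        self._vims: Dict[str, VimAdapter] = {}

    def register(self, site: str, vim: VimAdapter) -> None:
        self._vims[site] = vim

    def instantiate(self, function: str, site: str) -> str:
        if site not in self._vims:
            raise KeyError(f"no ViM registered for site {site!r}")
        return self._vims[site].deploy(function)


osl = OrchestrationServiceLayer()
osl.register("cloud", OpenStackLikeVim())
osl.register("edge", KubernetesLikeVim())

print(osl.instantiate("vFirewall", "cloud"))
print(osl.instantiate("vDU", "edge"))
```

The point of the design is that adding a third ViM only means writing one more adapter; the orchestration logic itself does not change, which is what "open enough to be orchestrated" amounts to in practice.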
Distributing Applications/Services, what’s the NFVi recipe?
As per Fig. 1 above, distributing applications consists of moving applications/services or functions closer to the end users, from the Cloud to the Edgier locations of the network. This approach is certainly not new: past networking technologies (MPLS-like, Carrier Ethernet, EVPN, etc.) already separated the dataplane from the control plane while bringing service segregation/grooming capabilities closer to end users with VPN capabilities.
The difficult equation is finding a way to move a greater number of resource-hungry applications onto cheaper and denser hardware platforms, while sustaining a satisfactory layer of abstraction (NFVi & ViM) with limited footprint and resource overhead, and while leaving sufficient resources for mission-critical applications. Most Edge hardware platforms are meant to address physical restrictions of space and power, but they necessarily have to be whiteboxes supporting mid-power, scalable CPUs and/or GPUs. Market examples such as ENEA and Wind River have demonstrated the vision and the capabilities required to address some of the Edge network requirements.
Contexts and technologies have now evolved considerably to take into account new requirements and options where hardware and software are separated. The analogy with legacy MPLS-based technologies still applies to today’s networks, although current NFVi/ViM solutions are ultimately disaggregated and themselves look up towards the upper layer of the service infrastructure to leverage all the expected SDN, orchestration and automation capabilities.
NFVi/VIM, what can we expect for Orchestration/Automation at the Edge?
Recent market trends and the chase for new network transformations leading to 5G have changed our industry profoundly. As the market tries to find ways to push services and apps closer to end users, even our perspective on orchestration and automation is changing. Service Providers and Mobile Network actors (virtual or not) are also facing the impossible task of arbitrating between technology choices and CAPEX & OPEX investments, and of selecting the best possible Multi-Service Domain Orchestrator to maximize the range of services that can be monetized. But I believe that, more importantly, Service Providers are also tackling head-on the question of how efficiently services will be instantiated over MEC devices that could differ from each other, while also transiting between service provider domains. The question is then: how can service intelligence, service delivery and their efficiency be improved?
As Edge networking/computing evolves, we’ll witness a growing number of key applications/services or functions being distributed across Service Provider networks to ensure an improved quality of experience on a per-application basis. However, the more apps/services or functions get distributed/disaggregated, the more service intelligence or orchestration capabilities will be distributed alongside the users they serve. The respective examples of PCCW Global and Cloudify demonstrate how the Edge can be tackled differently to simplify, but also accelerate, the services rendered between operators.
On the PCCW Global side, this is done by leveraging both SDI (Software-Defined Interconnection) and disaggregated capabilities to allow intra-Carrier services without compromising access to Cloud services. Cloudify and the recent new version of its orchestrator manager, Spire, are a prime example of how an orchestration vendor sees the impacts and upcoming challenges of a centralized MSDO platform. To play a larger and more efficient role than just orchestration (with orchestration, PNF and SDN capabilities, a generic VNFM, and control over hardware asset management such as Ansible, etc.), it simply requires a more distributed approach to the Orchestration Service Layer (OSL), where the service intelligence is distributed alongside the applications it supports.
Disaggregating Applications/Services, what’s important?
The separation of hardware and software was the first step towards making services/apps/functions fundamentally independent of the hardware layer they need to operate. In a disaggregated mode, the principle is now to allow any function (PNF, VNF or CNF) to be piloted, or as we call it orchestrated, by a 3rd party application, which in most cases would come from a different vendor or from the open source ecosystem.
To give a couple of examples, orchestration solutions such as Cloudify, Blue Planet and RIFT.io, and even an open source platform such as ONAP, today have the inherent capability to operate as an MSDO (Multi-Service Domain Orchestrator). This MSDO concept, combined with disaggregation possibilities, generally opens up a lot of options: Multi-Cloud or Multi-ViM support, PNF support and a very exhaustive list of vendor VNFs and CNFs from all verticals (Networking, Security, Mobile, Satellite operations, SD-WAN, NoS, etc.).
Purely from a ViM standpoint, it once again has to meet compulsory requirements such as:
· Openness to guarantee a seamless integration with the Orchestration Service Layer
· Lightweight footprint to minimize wasted hardware resources and maximize apps/functions/services monetization
· Ability to support both virtualization (VNFs) & containerization (CNFs)
· Service Assurance offering
What about Service Assurance?
Service Assurance often sounds trivial, but the evolution of SDN/NFV-related service infrastructures dictates a change in how services are delivered and monitored to ensure the best Quality of Experience (QoE). As the layers of the Service Provider service infrastructure change, there is a strong requirement for NFVi & ViM platforms to evolve and keep pace with these changes.
Indeed, service operations such as management (SNMP-based) and mass-scale deployment capabilities (Zero-Touch Provisioning and Configuration) are natural components of any service infrastructure. But it should be noted that, thanks to the constant evolution towards glitch-free orchestration operations, the Orchestration Service Layer (OSL) is now closer to being able to deliver closed-loop orchestration, which requires support from all its dependencies: hardware, deployed VNFs/CNFs and, obviously, the NFVi/ViM. Hardware asset management software plays a significant role in this. First, it eases the pain of hardware discovery upon installation. Second, it plays a central role in gathering granular information from the deployed hardware platforms and grooming monitoring/management information up to the Orchestration Service Layer (OSL) and eventually to the OSS/BSS functions.
To ensure closed-loop orchestration, a consensus is growing around the Redfish API, building on the work that several orchestration vendors and the open source community are currently doing with ONAP components such as DCAE, A&AI and Service Orchestration, thereby paving the way towards intent-based orchestration. However, without the involvement of the NFVi/ViM, it won’t go very far.
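The closed-loop principle mentioned above boils down to three steps: observe, decide, act. The sketch below shows one pass of such a loop under simplifying assumptions. The `closed_loop_step` function, the latency threshold and the sample metrics are hypothetical; in a real deployment the metrics would come from the hardware and the NFVi/ViM (for instance through Redfish or the monitoring tools of the ViM), and the action would be an orchestration call rather than a list append.

```python
from typing import Callable, Dict, List


def closed_loop_step(
    metrics: Dict[str, float],
    max_latency_ms: float,
    scale_out: Callable[[str], None],
) -> List[str]:
    """One observe/decide/act pass: scale out every function above the target."""
    acted_on = []
    for function, latency_ms in sorted(metrics.items()):
        if latency_ms > max_latency_ms:  # decide
            scale_out(function)          # act
            acted_on.append(function)
    return acted_on


# Stand-in for an orchestration call: we just record what would be scaled.
actions: List[str] = []

# Hypothetical observed latencies (ms) per deployed function.
sample = {"vRouter": 3.2, "vFirewall": 12.5, "vDU": 0.4}

changed = closed_loop_step(sample, max_latency_ms=10.0, scale_out=actions.append)
print(changed)  # ['vFirewall']
```

The loop is only as good as the data feeding it, which is why the hardware, the deployed functions and the NFVi/ViM all have to expose what they measure.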
NFVi and Mobile Network Transformation
How does an NFVi/ViM differ in a Mobile environment?
In a virtualized mobile environment, the construct and the requirements of an NFVi & ViM change fundamentally, due for the most part to the nature of a mobile service and to how it is supported and rendered. Indeed, a Mobile Service Infrastructure requires four important elements:
1. Real-Time Kernel OS:
o It addresses the separation of the Mobile Network functions (RU/DU/CU), the Mobile Backhaul, and the clock distribution & synchronization requirements
2. Disaggregated X-haul:
o Often referred to as the Backhaul/Midhaul/Fronthaul requirements, which are addressed in MEF Mobile Backhaul but also in the TIP DCSG requirements
3. Clock distribution and clock synchronization:
o Refers to the inherent ability to distribute the required clock frequency/synchronization information to all physical/virtual ports via the network service infrastructure
4. Service Assurance:
o It plays a fundamental role in enabling dataplane/control plane monitoring capabilities such as CFM OAM (802.1ag) and Service OAM (Y.1731), or monitoring capabilities from SyncE (SSM), 1588v2 or TSN (Timing and Synchronization), notably for vRAN and Cell-Site Gateway software.
Edge and Cloud, how important are they in a Mobile Service Infrastructure?
One of the key issues that Service Providers, or Mobile operators for that matter, face is that Datacenter and Cloud considerations and solutions were there before the Edge became a key subject. Therefore, the key features of a given NFVi/ViM in a Cloud environment might not address the criteria required for Edge-based services. This leaves Mobile operators or Service Providers with two possible choices: either aligning both Cloud and Edge infrastructure on the same NFVi/ViM solution if possible, or working out a Multi-Cloud/ViM strategy whereby the Cloud NFVi and the Edge NFVi might be different vendor solutions. Both options will directly impact the selection process for the Orchestration Service Layer (OSL) solution.
Indeed, in a Mobile Network Transformation context, Cloud and Edge NFVi/ViM environments have to collaborate tightly to ensure the separation/disaggregation of virtualized functions, notably RU/vDU but also DU/CU, which leads to Cloud-RAN solutions and then 5G.
Mobile-derived functions are fundamentally dependent on networking underlay/overlay capabilities. Depending on the level of separation of the Mobile Network functions, the service infrastructure will be highly dependent on Backhaul/Midhaul/Fronthaul features and would require a disaggregated approach, whereby a NoS (Network Operation Software) function would be used to interconnect the Cloud and Edge NFVi/ViM layers.
This new notion of a NoS being disaggregated and controlled, aka orchestrated, by the OSL (Orchestration Service Layer) is certainly not far-fetched. It has been pushed by SDOs such as ONF and the Linux Foundation, but also by TIP with the DCSG (Disaggregated Cell Site Gateway), whereby a Mobile Network service infrastructure would regroup function segments under an orchestration services umbrella. These function segments would be categorized as follows:
· Apps & Service Functions (PNFs/VNFs/CNFs)
· Networking service Functions
· VNFM (VNF/CNF Manager)
All of the above functions/services would then be logically segregated but supported by a common Orchestration Service Layer (OSL) umbrella, which provides the needed orchestration, automation, SDN and Service Assurance tools. More importantly, it is also interesting that major SDOs (Standards Development Organizations) are now reaching a consensus around MEF 55 (LSO) to converge on their underlying API sets, so that all the service infrastructure layers are aligned.
Written by Luc-Yves Pagal Vinette