The expectations building around future 5G networks are very high, as the envisioned Key Performance Indicators (KPIs) represent a giant leap compared to legacy 4G/LTE networks. Very high data rates, extensive coverage and sub-ms delays are just a few of the performance metrics that 5G networks are expected to deliver once deployed.
This game changer relies on new technical enablers such as Software Defined Networking (SDN) and Network Function Virtualization (NFV), which will move the network architecture from a purely hardware-box-based paradigm (e.g., an eNodeB or a Packet Gateway) to a completely cloudified approach, in which network functions that were formerly hardware-based (e.g., baseband processing, mobility management) are implemented as software functions running on a, possibly hierarchical, general-purpose telco cloud.
Building on these enablers, several novel key concepts have been proposed for next generation 5G networks (e.g., virtualized Radio Access Network (RAN) or Network Slicing), requiring a re-design of current network functionalities.
Among them, Network Slicing is probably the most important one. Indeed, there is wide consensus that accommodating the very diverse requirements demanded by 5G services on the same infrastructure will not be possible, in a cost-efficient way, with the current, relatively monolithic architecture. In contrast, with network slicing the infrastructure can be divided into different logically independent slices, each of which can be tailored to meet specific service requirements.
A network slice consists of a set of Virtual Network Functions (VNFs) that run on a virtual network infrastructure and provide a specific telecommunication service. The services provided are usually classified into macro-categories, depending on the most important KPI they target. Enhanced Mobile Broadband (eMBB), massive Machine Type Communication (mMTC) and Ultra Reliable Low Latency Communication (URLLC) are the types of services currently envisioned by, e.g., the ITU. Each of these services is instantiated in a specific network slice, with especially tailored management and orchestration algorithms performing the lifecycle management within the slice.
In this way, heterogeneous services can be provided on the same infrastructure, as different telecommunication services (each mapped to a specific slice) can be configured independently according to their specific requirements. Additionally, the cloudification of the network allows for the cost-efficient customization of network slices, as the slices run on a shared infrastructure.
Much of the complexity in re-designing mobile networks for slicing relates to decision-making towards an efficient, dynamic management of resources in real time. There, Artificial Intelligence (AI) appears as a natural approach to design the various algorithms employed by different network functions. As a matter of fact, AI provides a powerful tool to address highly complex problems that involve large amounts of data. This is indeed the case with network slicing, where the presence of a large number of slices, each independently operated by a different tenant, drastically increases the complexity of the system with respect to legacy non-sliced networks controlled by a single entity, i.e., the network operator. Also, the sheer amount of data flowing through the network and potentially relevant to resource allocation decision-making, and the difficulties in forecasting the overall behavior of a system involving many different players, make traditional tools for network management insufficient.
Devising a network slicing framework requires novel algorithms to manage the infrastructure resources, sharing them between the different slices while guaranteeing that the requirements of each slice are met. This applies throughout the following network functions:
• Admission Control is in charge of deciding whether upcoming network slice requests can be admitted into the system or not, and is enacted so as to ensure that the requirements of the admitted slices are satisfied.
• Network (re-)orchestration is central to both slice instantiation and run-time operation, since it allocates the available network resources to the admitted slices in the most efficient way possible, and then dynamically updates such an allocation at run-time in order to fulfill the time-varying demands of each slice while avoiding capacity outages.
• Radio resource sharing is paramount at run-time, as it manages the sharing of radio access resources among the network slices, ensuring that the potentially stringent requirements of all the slices (e.g., in terms of latency and throughput) are met over the air interface.
In this thesis we envisage a new framework to support network slicing. This framework gathers components to deal with each of the above functions. These components involve different phases of the “Network Slice Lifecycle Management”, consisting of four main steps that have to be addressed: (i) preparation; (ii) instantiation, configuration and activation; (iii) run-time; and (iv) de-commissioning. Each of these phases involves different timescales: (i) admission control runs at frequencies matching those of arrivals of new network slice requests, which may be in the order of hours; (ii) the orchestration of resources in softwarized networks occurs at frequencies that depend on the time required to re-size virtual machine resources, typically in the order of minutes; and (iii) scheduling of radio resources applies at a finer granularity, down to millisecond intervals in extreme cases. It is also worth highlighting that the different functions (and their associated algorithms) may benefit from mutual interactions. The information on resource utilization gathered at the network orchestration level can be leveraged for admission control, where it allows understanding whether admitting a new slice may lead to problems in provisioning enough capacity for all admitted services. Similarly, data collected by the resource management function can help an orchestrator produce more accurate forecasts of future slice demands for anticipatory resource allocation.
All of the functions above need to make decisions that meet the requirements of the individual slices while maximizing the overall system performance. To this end, they need to learn the dynamics of per-slice data traffic, and automatically react to their impact on the network architecture, towards their respective management goals. Self-adapting network function configurations were introduced over a decade ago; however, the solutions designed so far typically apply control to limited sets of parameters that change slowly in time (e.g., eNB transmission power). Also, current approaches produce outputs that then need human intervention to be translated into modifications of the network configuration (e.g., updating the transport network so as to optimize handovers in a given region).
These characteristics are not compatible with the novel requirements introduced by network slicing. The parameters that may need reconfiguration are much more numerous, as each virtual network function may expose several of them in a programmatic way. The timescale at which decisions must be made is drastically reduced, as one must ideally be capable of acting at radio-level timings or even at wire speed. Decisions often need to take into account metrics that go beyond pure network performance, such as energy efficiency or infrastructure monetization, which may hide complex cross-relationships.
This context provides fertile ground for AI to become instrumental in mobile network operation. All classes of AI may be useful to this end, including (i) supervised solutions that require ground-truth data for training, (ii) unsupervised techniques that work in the absence of ground truth, and (iii) reinforcement learning approaches, applicable when different forms of interaction with the system to be controlled are possible. The most appropriate AI tools must be selected case by case, depending on the algorithmic requirements and operation timescales involved.
For instance, reinforcement learning is particularly well suited when the time dynamics of the problem can accommodate a learning curve and the objective is to define a sequence of actions that maximizes a certain reward: this is the case of the admission control algorithm, as demonstrated by the practical implementations presented in this thesis. Conversely, when the target is to provide decisions that are independent of those previously taken and whose quality can be assessed during system training, supervised learning solutions are a strong option: this is precisely the setting in which network resource orchestration takes place, as illustrated by the applied solution devised for addressing capacity orchestration at different network levels.
Before proceeding further, we remark that the solutions presented next are examples of successful integration of AI across the envisaged framework. They do not exhaust the application space of AI for network operations; rather, they realize important components in the comprehensive design of self-organizing sliced mobile networks.
As a consequence of the profound changes in the architecture of the next mobile network generation, the integration of the key novel concepts envisioned for 5G networks will affect the current mobile network functions. In this thesis we focus on Network Slicing and, in particular, on the re-design of two fundamental network functions, namely Admission Control and Resource Orchestration. Each of them conveys several challenges that need to be taken into account when designing viable implementations of slice management functions.
Network infrastructure resources are limited and network slices demand quality guarantees, which calls for admission control on new slice requests. According to 3GPP standardization on network slicing, the Communication Service Client (CSC), i.e., the tenant, will request specific services from the Communication Service Provider (CSP), i.e., the network provider, among those available in the offered portfolio. Then, it will pay for the service according to metrics like, e.g., the number of served users, the service coverage area, or the duration of the slice instance. Such admission control decisions have profound business implications: the choice of how many network slices to run simultaneously, and how to share the network infrastructure among slices, has an impact on the revenues of the network provider.
The complexity and heterogeneity of the slice admission decision process rule out manual configuration, which is the de-facto legacy approach in 4G networks. To identify the best operating point, slice admission control must learn the arrival dynamics of slices and make decisions that maximize the revenue, based on the current system occupation and its expected long-term evolution. This problem is highly dimensional (growing linearly with the number of network slices), with a potentially huge number of states (increasing exponentially with the number of slice classes) and many variables (one for each state). Furthermore, in many cases the behavior of the tenants that request slices is not known a priori and may vary with time. For these reasons, traditional solutions building on optimization techniques are either unaffordable (for complexity reasons) or simply not applicable (when the slice behavior is not known). Instead, AI provides a means to cope with such complex problems.
Once admitted, slices must be allocated sufficient resources. Due to the prevailing softwarization of mobile networks, such resources are increasingly of a computational nature. This holds both in the RAN, where they map to, e.g., CPU time for containers running Baseband Units (BBUs) in Cloud Radio Access Network (C-RAN) datacenters, and in the Core Network (CN), where, e.g., virtual machines run softwarized Evolved Packet Core (EPC) entities in datacenters. In these cases, ensuring strong Key Performance Indicator (KPI) guarantees often requires that computational resources are exclusively allocated to specific slices and cannot be shared across others. The dynamic allocation of network resources to the different admitted slices then becomes a chief management task in network slicing.
In this context, the network operator needs to decide in advance the amount of resources that should be dedicated to the different slices, so as to ensure that the available capacity is used in the most efficient way possible and thus minimize operating expenses (OPEX). Finding the correct operational point requires (i) predicting the future demand in each slice, and (ii) deciding what amount of resources is needed to serve such demand.
These two problems are complex per se: forecasting future demands at the service level requires designing dedicated, accurate predictors, while allocating resources in a way that minimizes the operator's OPEX requires estimating the expected error of the prediction. Moreover, addressing (i) and (ii) above as separate problems risks leading to largely suboptimal solutions, since legacy predictors do not provide reliable information about the expected error they will incur. While the complexity of the complete solution may be daunting with traditional techniques, AI can be leveraged to address both aspects at once.
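As a minimal illustration of the two-step approach discussed above, the sketch below provisions capacity as a point forecast plus a safety margin taken from an empirical quantile of past prediction errors; all names and numbers are hypothetical, and the example assumes the error statistics are stationary and well characterized.

```python
import numpy as np

def provision_capacity(predicted_demand, past_errors, sla_quantile=0.95):
    """Two-step provisioning: point forecast plus a safety margin.

    predicted_demand : forecast traffic for the next orchestration interval
    past_errors      : array of past (actual - predicted) demand values
    sla_quantile     : fraction of intervals that should not be under-provisioned
    """
    # Margin chosen as an empirical quantile of past under-estimation errors.
    margin = np.quantile(past_errors, sla_quantile)
    return predicted_demand + max(margin, 0.0)

# Hypothetical usage: a forecast of 80 traffic units and a history of 1000 errors.
errors = np.random.normal(loc=0.0, scale=10.0, size=1000)
capacity = provision_capacity(80.0, errors)
```

The quality of the margin hinges entirely on a reliable characterization of the prediction error, which is precisely the information that, as noted above, legacy predictors do not provide; this motivates addressing forecasting and provisioning jointly.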
This thesis investigates the application of deep learning solutions for next generation sliced mobile networks. In detail, we have designed:
1. Admissibility region analytical formulation. An analytical model for the admissibility region of a network slicing-capable 5G network has been devised. It provides the Infrastructure Provider (InP) with the maximum number of network slices that can be admitted in the system to maximize its revenue while guaranteeing that the Service Level Agreements (SLAs) are met for all tenants. This is fundamental information for the InP: indeed, if admitting a new network slice in the system would lead to violating the SLA of already admitted slices, then such a request should be rejected.
2. Design of optimal and adaptive decision-making algorithms. The decision-making process on slice requests has been modeled as a Semi-Markov Decision Process (SMDP), including the definition of the state space of the system, along with the decisions that can be taken at each state and the resulting revenue. This is used to derive the optimal admission control policy that maximizes the revenue of the infrastructure provider, which serves as a benchmark for the performance evaluation, and an adaptive algorithm that provides close-to-optimal performance (a generic form of the corresponding Bellman optimality equation is sketched after this list).
3. Machine Learning based admission control algorithm. Admission control represents a very complex task. This problem is highly dimensional, with a potentially huge number of states and many variables. Optimal methods require that all variables are known, and adaptive algorithms do not scale to huge state spaces. Machine Learning provides a means to cope with such complex problems; consequently, we have designed a practical Neural Network (NN) solution based on deep reinforcement learning that, by interacting with the system, learns an acceptance policy providing close-to-optimal performance. This represents a very flexible and scalable solution that substantially outperforms naive approaches as well as smart heuristics, requiring only a few hundred iterations to converge to near-optimal performance (a minimal deep Q-learning sketch is given after this list).
4. Capacity Forecast. Legacy techniques for the prediction of mobile network traffic aim at perfectly matching the temporal behavior of traffic, independently of whether the anticipated demand is above or below the target. As a result, they are not aware of the network costs and they incur substantial SLA violations. Hence, we introduce the notion of capacity forecast, i.e., the minimum provisioned capacity needed to cut down SLA violations. This closes the gap between simple traffic prediction and practical orchestration (an illustrative cost-aware loss is sketched after this list).
5. A Deep Learning Framework for Resource Orchestration. DeepCog, a new mobile traffic data analytics tool explicitly tailored to solving the capacity forecast problem, has been designed. It hinges on a deep learning architecture that leverages a customized loss function targeting capacity forecast rather than plain mobile traffic prediction. It also provides long-term forecasts over configurable prediction horizons, operating on a per-service basis. A thorough empirical evaluation with real-world metropolitan-scale data shows the substantial advantages granted by DeepCog over state-of-the-art predictors and other automated orchestration strategies, providing a first analysis of the practical costs of heterogeneous network slice management across a variety of case studies.
6. Two-timescale anticipatory capacity allocation for network slicing with hard guarantees. An original model for the anticipatory allocation of capacity to network slices, which is mindful of all operating costs, is proposed. The new model takes into account not only the orchestration costs associated with over- and under-provisioning, but also those linked with resource instantiation and reconfiguration. AZTEC, a complete framework for capacity allocation to network slices, has been designed. It relies on a combination of deep learning architectures and a traditional optimizer. The evaluation, performed on extensive real-world data, shows that AZTEC significantly outperforms state-of-the-art solutions, while providing operators with fine-grained control over the underlying system (a conceptual two-timescale sketch follows this list).
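For reference on contribution 2, a generic form of the Bellman optimality equation for a discounted SMDP is reported below; the notation (state s, actions accept/reject, per-decision reward r, discount rate β, joint density q of the next state and sojourn time) is standard textbook notation and is not taken verbatim from the thesis:

\[
V^{*}(s) = \max_{a \in \{\text{accept},\,\text{reject}\}} \Big[\, r(s,a) + \sum_{s'} \int_{0}^{\infty} e^{-\beta \tau}\, q(s',\tau \mid s,a)\, V^{*}(s')\, \mathrm{d}\tau \,\Big]
\]

Solving this fixed point (e.g., via value or policy iteration) yields the optimal admission policy used as a benchmark.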
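Regarding contribution 3, the following is a minimal deep Q-learning sketch of the accept/reject decision, assuming a hypothetical state encoding (e.g., the number of admitted slices per class plus the class of the incoming request); it is not the architecture or training procedure used in the thesis, and standard refinements such as experience replay and target networks are omitted.

```python
import torch
import torch.nn as nn

class AdmissionQNetwork(nn.Module):
    """Small Q-network mapping the system state to Q-values for {reject, accept}."""
    def __init__(self, state_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),  # Q(s, reject), Q(s, accept)
        )

    def forward(self, state):
        return self.net(state)

def td_update(q_net, optimizer, state, action, reward, next_state, gamma=0.99):
    """One-step temporal-difference update on a single (s, a, r, s') transition."""
    q_values = q_net(state)
    with torch.no_grad():
        target = reward + gamma * q_net(next_state).max()
    loss = (q_values[action] - target) ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Hypothetical usage:
# q_net = AdmissionQNetwork(state_dim=5)
# optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
```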
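The idea behind a capacity-oriented loss (contributions 4 and 5) can be illustrated with the simplified asymmetric function below, which penalizes under-provisioning (an SLA violation) more heavily than over-provisioning (idle capacity); the functional form and cost values are hypothetical and do not reproduce the actual DeepCog loss.

```python
import torch

def capacity_loss(provisioned, demand, underprov_cost=10.0, overprov_cost=1.0):
    """Asymmetric cost of provisioning `provisioned` capacity against `demand`."""
    diff = provisioned - demand
    under = torch.clamp(-diff, min=0.0)   # unmet demand (SLA violation)
    over = torch.clamp(diff, min=0.0)     # idle, wasted capacity
    return (underprov_cost * under + overprov_cost * over).mean()
```

Training a predictor to minimize such a cost, rather than a symmetric error metric, naturally biases its output towards the minimum capacity that avoids SLA violations.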
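Finally, for contribution 6, a conceptual two-timescale sketch is given below under assumptions of ours (the split between dedicated capacity and a shared buffer, and the quantile-based sizing rule, are illustrative and not the actual AZTEC design): a slow loop sizes per-slice dedicated capacity from long-term forecasts and a shared buffer from past forecast errors, while a fast loop splits the buffer among slices whose instantaneous demand exceeds their dedicated share.

```python
import numpy as np

def slow_timescale_allocation(forecasts, error_hist, buffer_quantile=0.9):
    """Runs at each re-orchestration period (e.g., every few minutes).

    forecasts  : per-slice forecast demand for the next period, shape (n_slices,)
    error_hist : past per-slice forecast errors (actual - forecast), shape (T, n_slices)
    """
    dedicated = np.maximum(forecasts, 0.0)
    # Shared buffer sized to cover a quantile of the total past under-estimation.
    total_under = np.clip(error_hist, 0.0, None).sum(axis=1)
    shared_buffer = np.quantile(total_under, buffer_quantile)
    return dedicated, shared_buffer

def fast_timescale_split(dedicated, shared_buffer, actual_demand):
    """Runs at a finer granularity: distributes the shared buffer among slices
    whose instantaneous demand exceeds their dedicated capacity."""
    deficits = np.clip(actual_demand - dedicated, 0.0, None)
    total_deficit = deficits.sum()
    if total_deficit == 0.0:
        return dedicated
    extra = shared_buffer * deficits / total_deficit
    return dedicated + np.minimum(extra, deficits)
```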
At the time of writing DeepCog and AZTEC are, to the best of our knowledge, the only works where a deep learning architecture is explicitly tailored to the problem of anticipatory resource orchestration in mobile networks.