As the volume of data, digital transformation, and the pace of technological change accelerate, organizations and professionals find it increasingly challenging to keep up and capitalize on the opportunities. In particular, traditional software-based approaches cannot cope with the heterogeneity of service demands in the 5th Generation mobile network (5G) paradigm. It is in this framework that Software Defined Networking (SDN) and Network Function Virtualization (NFV) technologies appear, as part of the new set of equipment and techniques needed. Among them, network slicing seems to be the most promising tool for allocating the needed resources when customization of services, Key Performance Indicators (KPIs) and Quality of Service (QoS) guarantees are essential.
Given this new reality, operators and tenants will work in a multi-domain network context, where adaptability and programmability are paramount, as well as data isolation and management automation. Our data-driven study builds on an extensive, high-granularity dataset and discusses two main topics that will help decision makers base their investments on real-world information and address issues before they become problems.
On the one hand, we identify the following classes of components in the time series of the demands of multiple popular mobile services: (i) non-composable, i.e., recurrent patterns that are found in all time series and hence are inherently impossible to compose (their sum is just a scaling of the pattern observed in each time series); (ii) composable, i.e., recurrent patterns that are specific to each time series (the portion we can actually sum with some hope of obtaining near-constant load); and (iii) noise, i.e., non-recurrent patterns, small in magnitude, that are excluded from our analysis. Then, we correlate and apply clustering algorithms to the relevant features in three dimensions (i.e., time, space and frequency) to classify service behavior and provide recommendations on how to allocate network resources for distinct clusters.
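To make this decomposition concrete, the following minimal Python sketch illustrates one possible way to split a set of service demand time series into the three classes above. The input array, the period length and the averaging-based estimation are illustrative assumptions, not the exact procedure used in the thesis.

```python
import numpy as np

def decompose(demand: np.ndarray, period: int = 24):
    """Split a hypothetical (n_services, n_hours) traffic matrix into a common
    recurrent pattern, service-specific recurrent patterns, and noise."""
    n_services, n_hours = demand.shape
    n_hours = (n_hours // period) * period              # keep whole periods only
    series = demand[:, :n_hours]
    series = series / series.sum(axis=1, keepdims=True) # compare shapes, not volumes

    # (i) non-composable: recurrent pattern shared by every service
    common = series.mean(axis=0)

    # (ii) composable: each service's own recurrent deviation from the common
    # pattern, estimated by averaging over windows of `period` hours
    deviation = series - common
    folded = deviation.reshape(n_services, -1, period).mean(axis=1)
    composable = np.tile(folded, n_hours // period)

    # (iii) noise: the non-recurrent remainder, small in magnitude
    noise = deviation - composable
    return common, composable, noise
```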
A first finding from this research is that no two services exhibit similar time patterns in their nationwide aggregate traffic: although expected for different service categories, this is less obvious for akin services, e.g., diverse applications that all provide video streaming. A second key insight is that mobile services have very comparable geographical distributions of both total and per-user traffic demands. That is, different services have different temporal patterns (i.e., they are consumed at different times), but their geographical patterns (i.e., the locations where they are consumed) are very similar. An exception was found in two services, given their ubiquity and synchronized nature.
Our third takeaway message is that spatial distributions of per-subscriber service usage are in fact driven by land use, i.e., the urbanization level plays a major role in influencing how much mobile service users consume. Nonetheless, it has a much lower impact on when they do so, as the average subscribers in urban, semi-urban and rural regions all follow similar service access patterns; a notable exception is represented by users on high-speed trains, who show unique time dynamics. Along these lines, we could identify common periodic behaviors in the real-world traffic generated by a large set of applications by leveraging spectral methods. This approach requires a two-step representation of the components extracted with a Fourier Transform, followed by a Density-Based Spatial Clustering of Applications with Noise (DBSCAN); a minimal sketch of this pipeline is given after the list below. In particular, we can summarize the findings as follows: I. Almost all (33 out of 37) services have a largely dominant component with a 24-hour periodicity. It is easy to map such a component to the circadian rhythm of human activities, which alternates low traffic overnight and high demand during the day.
II. Most (32 out of 37) services also show a significant component at a 12-hour periodicity. Many (22 out of 37) also share components that highlight regular patterns at periods of one week and 4.8 hours. An investigation of the causes for these sub-daily patterns is out of scope here and an object for future research; yet, we speculate that commuting affects the demand for many services and may be behind these dynamics.
III. Common regular behaviors are also present at periods longer than one day for many (18-21 out of 37) services. One week and 28 hours are the most relevant periods, and we consider that these are linked with the different dynamics occurring during weekends.
IV. Several services tend to defy classification, and (i) have no or much less relevant components in the 24-hour cluster, (ii) have an unusually high weight associated with specific clusters, and/or (iii) have a high incidence of components that are not included in any cluster (i.e., are outliers). Services in this category include games (King, Pokemon Go, generic Gaming platforms), audio streaming services (iTunes, generic Audio streaming), Netflix, peer-to-peer, and adult web traffic. These are fairly specific categories of mobile applications, each with plausible explanations for its diversity. For instance, Netflix is a fairly unique service providing long-lived video streams to a niche of mobile users. Audio streaming applications are the only ones that do not require the user's visual attention. Finally, adult web traffic is characterized by unique patterns due to its socially sensitive nature.
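The sketch below illustrates the two-step spectral pipeline referenced before the list: a Fourier Transform extracts the dominant periodic components of each service, and DBSCAN groups components with similar periods across services. The input format, the number of retained components and the DBSCAN parameters are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def periodic_components(demands: dict[str, np.ndarray], top_k: int = 5):
    """demands maps a service name to a hypothetical hourly traffic series."""
    points, labels = [], []
    for name, x in demands.items():
        x = (x - x.mean()) / x.std()                 # remove DC offset and scale
        spectrum = np.abs(np.fft.rfft(x)) ** 2       # power of each frequency
        freqs = np.fft.rfftfreq(len(x), d=1.0)       # cycles per hour
        order = np.argsort(spectrum)[::-1][:top_k]   # strongest components
        for i in order:
            if freqs[i] > 0:
                period_h = 1.0 / freqs[i]            # e.g., 24 h, 12 h, 4.8 h
                weight = spectrum[i] / spectrum.sum()
                points.append([np.log(period_h), weight])
                labels.append(name)
    # Group components in (log-period, weight) space; components that fall in
    # no cluster are flagged as outliers (label -1), as in finding IV above.
    clusters = DBSCAN(eps=0.15, min_samples=5).fit_predict(np.array(points))
    return list(zip(labels, points, clusters))
```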
After revealing that service requirements are heterogeneous, we present our data-driven clustering results, which are highly relevant for network orchestration and planning purposes, e.g., when the operator decides how many antennas are required to process all the data or in which region a data center is needed. The proposed techniques, complementary to the definition of network slices and hierarchies in the related work, shed light on how to take informed decisions. We not only fill the gap in currently available clustering methodologies with reasonable performance, but also show the trade-off between computational complexity and efficiency. We emphasize that, for the static scenarios, our hybrid time-frequency approach implies a reduction of the data with principal component analysis (PCA) in order to achieve high-speed performance. For the frequency approach, instead, the most demanding task resides in tuning the parameters to obtain a reasonable number of discarded components without biasing the result.
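As an illustration of the PCA-based reduction mentioned above, the following sketch compresses a hypothetical matrix of time-frequency features before clustering; the variance threshold is an assumed value, not the one tuned in the thesis.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def reduce_features(features: np.ndarray, variance: float = 0.95) -> np.ndarray:
    """Project a (n_samples, n_features) feature matrix onto the principal
    components that explain the requested fraction of the variance."""
    scaled = StandardScaler().fit_transform(features)   # zero mean, unit variance
    pca = PCA(n_components=variance)                    # keep e.g. 95% of the variance
    reduced = pca.fit_transform(scaled)
    print(f"{features.shape[1]} features -> {reduced.shape[1]} components")
    return reduced
```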
Moreover, we should mention that such data reductions will be key when implementing slices in dynamic 5G scenarios, where a quick reconfiguration of the network is expected. Above all, wavelets have shown the best applicability for understanding service requirements geographically. The decomposition of service traffic data into several patterns exhibited distinct behaviors even within a single service. This fact allows the operator to define clusters based on similar patterns and fewer time periods than the spectral approach, complementing the one-sided spectral proposal that defined many additional clusters.
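A minimal example of such a wavelet decomposition is sketched below using PyWavelets; the wavelet family and decomposition depth are assumptions made for illustration, and the energy-per-scale signature is just one possible compact summary of the extracted patterns.

```python
import numpy as np
import pywt

def wavelet_signature(traffic: np.ndarray, wavelet: str = "db4", level: int = 4):
    """Decompose a service's hourly traffic into patterns at several timescales
    and summarize each scale by its energy."""
    coeffs = pywt.wavedec(traffic, wavelet, level=level)
    # coeffs[0] is the coarse approximation (long-term trend); coeffs[1:] are
    # detail coefficients from the longest to the shortest timescale.
    return np.array([np.sum(c ** 2) for c in coeffs])
```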
Hence, the trade-off between the complexity of the defined virtual network structure and the number of slices depends upon a new variable: the number of regions where a service has a similar usage pattern. For instance, we can observe that YouTube has up to 8 different patterns, and therefore the network operator could tune the network slice not only considering the number of clusters of a city, but also taking into account the number of regions to cover.
In fact, the wavelet signature of a service can be clustered over space given its own patterns, as we clearly identify that the YouTube and Instagram services have analogous areas defined by their hybrid time-frequency characterization. This means that an operator could decide to cluster both services together (i.e., group them in the same customized network slice) over those regions for orchestration purposes, as long as their network requirements are compatible. Under this scheme, services that are clustered together tend to use more resources during the same period of time, as they are synchronized, whereas the non-clustered services potentially do not suffer this effect. While we leave the orchestration algorithm for future work, in the next section we quantify the efficiency of grouping different services in the cloud.
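The following sketch shows how per-region wavelet signatures of two services could be clustered jointly to identify the areas where they behave alike and could share a slice; the inputs and the number of clusters are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

def shared_regions(sig_a: np.ndarray, sig_b: np.ndarray, n_clusters: int = 8):
    """sig_a, sig_b: hypothetical (n_regions, n_scales) wavelet signatures of
    two services, e.g., computed with wavelet_signature() per region."""
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    labels = model.fit_predict(np.vstack([sig_a, sig_b]))
    labels_a, labels_b = labels[:len(sig_a)], labels[len(sig_a):]
    # Regions where both services fall into the same pattern cluster are
    # candidates for grouping them in the same customized network slice.
    return np.where(labels_a == labels_b)[0]
```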
To do so, we carry out an in-depth data-driven analysis quantifying the resource management efficiency and cost-effectiveness of the system, showing the trade-off between (i) assigning dedicated resources for service customization purposes, and (ii) resource sharing practices among services. In particular, our results provide insights on the achievable efficiency of network slicing architectures, their dimensioning, and their interplay with resource management algorithms at different locations and reconfiguration timescales in a multi-service, multi-tenant network at scale. Specifically, we retain the following main conclusions: Multi-service requires more resources. Building a network that is capable of providing different services (possibly associated with several tenants) will necessarily reduce efficiency in resource utilization. We quantify this loss at almost one order of magnitude when considering distributed resources (such as spectrum), yet the efficiency loss remains as high as 20% even in large datacenters in the core network. These figures translate into high costs for the infrastructure provider, who must compensate for them by aggressively monetizing the new business models enabled by a multi-service scenario (e.g., Network Slice as a Service, Infrastructure as a Service).
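One simple way to quantify this effect, offered here as an illustration rather than the exact metric adopted in the thesis, is to compare the capacity needed when every service is dimensioned on its own peak against the capacity needed when all services share a pool dimensioned on the aggregate peak.

```python
import numpy as np

def multiplexing_efficiency(demands: np.ndarray) -> float:
    """demands: hypothetical (n_services, n_timeslots) traffic matrix.
    Returns the ratio of shared-pool capacity to dedicated-slice capacity:
    1.0 means isolation costs nothing, lower values mean a higher cost."""
    dedicated = demands.max(axis=1).sum()   # sum of per-service peaks
    shared = demands.sum(axis=0).max()      # peak of the aggregate demand
    return shared / dedicated
```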
Traffic direction is a factor. Uplink and downlink traffic exhibit similar efficiency trends across network levels, but uplink suffers a much higher efficiency degradation to meet equivalent QoS requirements. Although uploads account for a small fraction of the overall load, the lower efficiency of uplink may entail additional challenges for operators. Indeed, uplink QoS requirements are key to specific services with stringent network access needs, such as mobile gaming, and it is likely that multiple instances of such services belonging to different tenants (e.g., video-gaming platforms owned by different gaming providers) will have to be served in parallel in a resource-isolated fashion.
Loose service level agreements may not help. Although the slice specifications granted to tenants may be relaxed, the overall efficiency grows only when guarantees on the served demand are lowered substantially, to a point where they may no longer be suitable for certain services (needing, e.g., “5 nines” reliability, or strict bandwidth requirements over very short time slots).
Overbooking is a key strategy. While downgrading the requirements in terms of served fraction of traffic only helps when brought to extreme levels, flexibly serving small portions of the individual slice demands via a non-customized common slice provides high benefits. Therefore, overbooking solutions that only marginally underserve slices may yield substantial economic gains for the operators, as they allow trading off substantial resource deployment costs with negligible penalty fees due to slight Service Level Agreement (SLA) violations. This corroborates the importance of recent approaches for practical end-to-end resource overbooking in sliced 5G networks.
Guaranteeing traffic volumes at the antenna is costly. If operators define SLAs in terms of assured traffic volumes, they shall note that meeting the QoS requirements will need substantial additional resources at the radio access, even if guarantees are loose and overbooking is in place. SLAs defined in terms of guaranteed time slots allow much more flexibility in balancing efficiency and QoS for each network slice.
Dynamic resource assignment must be rapid. The design of dynamic resource allocation algorithms is crucial to increase the efficiency of future sliced networks. However, substantial gains will only be attained if the virtualization technologies enable a fast enough re-orchestration of network resources. While current Management and Orchestration frameworks provide such capabilities, intelligent algorithms able to forecast mobile service demands and anticipate resource reconfiguration are also required, which may be challenging at short timescales. Underestimation of resources may lead to SLA violations, whereas over-provisioning may harm the economic feasibility of the system. Artificial Intelligence and Machine Learning are promising techniques to accomplish this, and are also being brought into the network management landscape by standards bodies such as the European Telecommunications Standards Institute (ETSI).
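The sketch below illustrates, on a synthetic demand trace, why slow reconfiguration is costly: within each reconfiguration period the operator must hold the peak demand observed in that period, so longer periods imply more idle capacity. The trace and window lengths are purely illustrative.

```python
import numpy as np

def provisioned_capacity(demand: np.ndarray, window: int) -> np.ndarray:
    """Capacity held constant over reconfiguration periods of `window` slots."""
    n = (len(demand) // window) * window
    peaks = demand[:n].reshape(-1, window).max(axis=1)   # peak per period
    return np.repeat(peaks, window)                      # constant in between

# Example: one week of synthetic hourly demand, hourly vs daily reconfiguration.
rng = np.random.default_rng(0)
demand = 100 + 40 * np.sin(np.arange(24 * 7) * 2 * np.pi / 24) + rng.normal(0, 5, 24 * 7)
fast = provisioned_capacity(demand, window=1).mean()
slow = provisioned_capacity(demand, window=24).mean()
print(f"average capacity: hourly {fast:.1f} vs daily {slow:.1f} reconfiguration")
```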
Aggregating services is beneficial. Aggregating similar services into the same slice increases the system efficiency significantly, yet it comes at the price of losing the ability to customize treatment for each service. In particular, we find that if the services with the highest traffic load get their own slice and the remaining ones are aggregated into a common slice, gains are limited unless the common slice includes services with significant load. This implies that operators may face a business trade-off between providing dedicated support to highly remunerative, popular services, and incurring high management costs to implement the associated slices.
Deployment is slightly more efficient than operation. We analyzed the sharing efficiency from both a continuous resource usage and an infrastructure deployment perspective. While the two show similar trends in the network core, the efficiency at the radio access is higher for installed hardware in the presence of high-frequency resource reallocation.
Urban topography has limited impact. The fact that all of our results are very consistent across two urban areas of quite different nature allows us to provide general insights that hold beyond one particular scenario. More precisely, as usage demands are ultimately driven by human factors, we expect that our considerations may apply to other metropolitan regions in (and possibly beyond) Europe.
Efficiency under uncertain load demands. Our analysis concerns resource management efficiency under known loads, as slices are allocated the exact resources needed to meet the corresponding service demands. This lets us investigate the impact of the limited reconfigurability of resources, which forces the operator to provision a constant amount of resources during each reconfiguration period. In a real system, however, the network slice demands are not known a priori, and resources have to be allocated based on a forecast of the expected demand during the next re-orchestration interval.
This introduces a second source of inefficiency, i.e., the inaccuracy of traffic predictions, which imposes some overprovisioning in the allocated capacity to combat the uncertainty associated with the future load information.
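As a simple illustration of this point, the sketch below allocates resources from a forecast inflated by an assumed safety margin and reports both the unserved demand (potential SLA violations) and the overprovisioned capacity; the forecast and margin are hypothetical.

```python
import numpy as np

def allocate_with_margin(forecast: np.ndarray, actual: np.ndarray, margin: float = 0.1):
    """Provision the forecast plus a safety margin and measure the outcome."""
    allocated = forecast * (1.0 + margin)               # hedge against forecast errors
    unserved = np.clip(actual - allocated, 0, None)     # demand exceeding the allocation
    overprovision = np.clip(allocated - actual, 0, None)
    return unserved.sum() / actual.sum(), overprovision.sum() / actual.sum()
```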
In summary, this work provides an original perspective on the temporal analysis of mobile applications. By leveraging distinct methods, we could identify common behaviors in the real-world traffic generated by a large set of services, which were not detected by previous studies. Our results pave the road for further investigations, aimed at explaining the root causes of these temporal similarities, at assessing their generality at different spatial and geographical scales, and at exploiting them for applications in network planning and resource management. For instance, new and improved data plans, widespread broadband availability, and services that anticipate and meet demand will foster new lifestyles where connectivity and mobility are paramount.
However, our work is meaningful for domains beyond networking, as it will help establish a symbiosis between the Telecommunications and Transportation industries to improve mobile coverage and transport connections in underserved regions. Hence, society will have additional tools to fight social segregation and avoid self-contained residential areas. In addition, as mentioned before, new mobility and lifestyle trends will affect other fields of study such as Sociology, Education or Medicine. For example, the digital transformation will modify how we work and study, or allow us to receive telesurgery treatments and telemedicine diagnoses.
On a side note, as the dataset is from 2016, it would be interesting to update the information to the present day and study the main changes with respect to the findings of this thesis, as well as to perform historical evaluations in different countries that consume other mobile services (e.g., the Line messaging or TikTok video streaming services).
Also, we could integrate current algorithms into systems and work on algorithms that leverage our analysis, while keeping track of how the parameters defined in our models evolve with respect to them. Of course, this is out of the scope of this thesis, but it would be of significant value, as it would allow studying both the composability of resources and flexible SLA adherence in future 5G networks. For example, we could quantify the network resource efficiency when clustering together services that are synchronized in distinct regions, as they tend to consume more resources over the same period, whereas non-clustered services would not exhibit this effect.
Last but not least, we claim that the novel methodology to generate and work with synthetic data addresses the data privacy issues raised by the General Data Protection Regulation (GDPR). This technique, combined with the wavelet approach and forecasting algorithms, would allow operators to design their network infrastructure in advance, reducing the over-provisioning risk and enabling new network designs tailored to distinct data traffic consumption patterns.