An effective deep learning architecture leveraging BIRCH clustering for resource usage prediction of heterogeneous machines in cloud data center

Sheetal Garg, Rohit Ahuja, Raman Singh, Ivan Perl

Research output: Contribution to journal › Article › peer-review

Abstract

Given the rising demand for cloud computing, effective resource utilization is essential to reduce the energy footprint and deliver economical services. With emerging machine learning and artificial intelligence techniques for modelling and prediction, it is important to identify a principled method that accurately provisions forthcoming requests in a cloud data center. Recent studies have used machine learning and other advanced analytics to predict resource usage; however, they do not consider long-range dependencies in the time series, which must be captured for better prediction, and they show limitations in handling noise, missing values, and outliers in datasets. In this paper, we explored the problem by studying three techniques, which enabled us to assess whether short-term forecasting of physical machines' resource usage improves when the above factors are considered. We evaluated predictions from the Transformer and Informer deep learning models, which address these aspects, and compared them with the Long Short-Term Memory (LSTM) model. We used a real-world Google cluster trace usage dataset and employed the Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH) algorithm to select heterogeneous machines. The evaluation of the three models shows that the Transformer architecture, which considers long-range dependencies in the time series and the shortcomings of the datasets, improves forecasting with a 14.2% reduction in RMSE over LSTM. However, LSTM outperforms the Transformer on some machines, which highlights the importance of input sequence order. The Informer model, which accounts for both long-range dependencies and sequence ordering and is a hybrid of LSTM and Transformer, outperformed both, with a 21.7% reduction in RMSE over LSTM and a 20.8% reduction over the Transformer. The results also show that the Informer model consistently performs better than the other models across all subsets of the dataset. Our study demonstrates that considering long-range dependencies and sequence ordering in resource usage time series improves prediction.
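The abstract names BIRCH clustering as the step used to select heterogeneous machines from the Google cluster trace before forecasting. The following is a minimal sketch of that selection step using scikit-learn's Birch implementation; the feature columns, parameter values, and per-cluster selection rule are illustrative assumptions, not the authors' actual pipeline.

```python
# Sketch (assumption): selecting heterogeneous machines from a
# Google-cluster-trace-style usage table with BIRCH. Column names and
# BIRCH parameters are illustrative, not taken from the paper.
import pandas as pd
from sklearn.cluster import Birch
from sklearn.preprocessing import StandardScaler

# Hypothetical per-machine summary features derived from the usage trace.
usage = pd.DataFrame({
    "machine_id": [1, 2, 3, 4, 5, 6],
    "cpu_mean":   [0.12, 0.55, 0.60, 0.08, 0.90, 0.47],
    "cpu_peak":   [0.30, 0.85, 0.92, 0.20, 0.99, 0.70],
    "mem_mean":   [0.20, 0.40, 0.65, 0.10, 0.80, 0.35],
})

# Standardize features so no single resource dimension dominates distances.
features = StandardScaler().fit_transform(
    usage[["cpu_mean", "cpu_peak", "mem_mean"]]
)

# BIRCH builds a CF-tree over the feature vectors and groups its leaf
# entries into n_clusters; threshold and branching_factor control the
# granularity of the tree.
birch = Birch(threshold=0.5, branching_factor=50, n_clusters=3)
usage["cluster"] = birch.fit_predict(features)

# Take one representative machine per cluster so the forecasting models
# (LSTM, Transformer, Informer) are compared on heterogeneous workloads
# rather than near-duplicate machines.
representatives = usage.groupby("cluster").first()["machine_id"].tolist()
print(representatives)
```

In this sketch, one machine per cluster is kept; any other per-cluster sampling rule would fit the same idea of covering distinct workload profiles.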
Original language: English
Number of pages: 21
Journal: Cluster Computing
DOIs
Publication status: Published - 6 Feb 2024

Keywords

  • Cloud computing
  • Clustering
  • Informer
  • LSTM
  • Time-series prediction
  • Transformer
