Performance evaluation of selected ML algorithms in GC and AWS cloud environments

Authors

  • Grzegorz Blinowski, Warsaw University of Technology, Faculty of Electronics and Information Technology, Institute of Computer Science
  • Marcin Bogucki, Institute of Computer Science, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warszawa, Poland

Abstract

In this paper, we analyze the performance of common machine learning (ML) algorithms executed in Google Cloud (GC) and Amazon Web Services (AWS) environments. The primary metrics are training time and prediction time as a function of the number of virtual machine cores. For comparison, the benchmarks also include a "bare metal" (i.e., non-cloud) environment, with results adjusted using the "Multi-thread Score" to account for architectural differences among the tested platforms.
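One common way to read timings taken at several core counts is through speedup and parallel efficiency relative to the smallest configuration. The sketch below is only an illustration of that idea, not the procedure used in the paper: the function name `scaling_table` and the sample timings are hypothetical, and no "Multi-thread Score" adjustment is applied.

```python
def scaling_table(times_by_cores):
    """Return {cores: (speedup, efficiency)} relative to the smallest core count.

    times_by_cores maps a core count to a wall-clock time in seconds.
    """
    if not times_by_cores:
        raise ValueError("no measurements given")
    base_cores = min(times_by_cores)
    base_time = times_by_cores[base_cores]
    table = {}
    for cores in sorted(times_by_cores):
        speedup = base_time / times_by_cores[cores]
        # Efficiency: achieved speedup divided by the ideal (linear) speedup.
        efficiency = speedup / (cores / base_cores)
        table[cores] = (speedup, efficiency)
    return table


# Illustrative numbers only, not measurements from the paper.
example = {1: 120.0, 2: 66.0, 4: 40.0, 8: 30.0}
for cores, (speedup, eff) in scaling_table(example).items():
    print(f"{cores:>2} cores: speedup {speedup:.2f}, efficiency {eff:.2f}")
```

An efficiency close to 1 would indicate near-linear scaling with the number of cores, while values well below 1 would indicate that adding cores yields diminishing returns.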

Our focus is on CPU-intensive algorithms. The test suite covers the following algorithm families: Support Vector Machines, Decision Trees, K-Nearest Neighbors, Linear Models, and Ensemble Methods. The evaluated classifiers, taken from the scikit-learn and ThunderSVM libraries, include Extra Trees, Support Vector Machines, K-Nearest Neighbors, Random Forest, Gradient Boosting Classifier, and Stochastic Gradient Descent. GPU-accelerated deep learning models, such as large language models, are excluded because it is difficult to establish a common baseline for them across platforms.
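Training and prediction time for classifiers of this kind can be measured separately with a small harness. The following is a minimal sketch under stated assumptions, not the benchmark code used in the paper: `time_fit_predict` and `MajorityClassifier` are hypothetical names, and the stand-in classifier exists only so that the example runs without external libraries.

```python
import time


def time_fit_predict(model, X_train, y_train, X_test, repeats=3):
    """Return the best-of-N (fit_seconds, predict_seconds) for a fit/predict model."""
    fit_times, predict_times = [], []
    for _ in range(repeats):
        start = time.perf_counter()
        model.fit(X_train, y_train)
        fit_times.append(time.perf_counter() - start)

        start = time.perf_counter()
        model.predict(X_test)
        predict_times.append(time.perf_counter() - start)
    # Taking the minimum reduces noise from other processes on the machine.
    return min(fit_times), min(predict_times)


class MajorityClassifier:
    """Hypothetical stand-in exposing the same fit/predict interface."""

    def fit(self, X, y):
        # Remember the most frequent training label.
        self.label_ = max(set(y), key=list(y).count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


fit_s, predict_s = time_fit_predict(
    MajorityClassifier(), [[0], [1], [2]], [0, 1, 1], [[3]]
)
print(f"fit: {fit_s:.6f} s, predict: {predict_s:.6f} s")
```

Because scikit-learn estimators follow the same `fit`/`predict` interface, the same function could be applied to, for example, `RandomForestClassifier(n_jobs=k)` for each core count `k` of interest; `n_jobs` is that estimator's parallelism parameter.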

The dataset used is the widely known "Higgs dataset," which describes kinematic properties measured by particle detectors in the search for the Higgs boson.

Benchmark results are best described as varied: no clear trend emerges, as training and prediction times scale differently depending on both the cloud platform and the algorithm type. This paper provides practical insights and guidance for deploying and optimizing CPU-based ML workloads in cloud environments.

Author Biography

Grzegorz Blinowski, Warsaw University of Technology, Faculty of Electronics and Information Technology, Institute of Computer Science

Ph.D.

Assistant Professor

Published

2025-10-13

Section

Applied Informatics