OPTIMIZING MACHINE LEARNING WORKLOADS USING CUDA AND TENSORFLOW ON GPUS
Abstract
Modern machine learning (ML) technologies require significant computational resources, particularly for training deep neural networks and processing large datasets. In this context, workload optimization is increasingly central to improving throughput and processing efficiency. Graphics processing units (GPUs), which offer considerable acceleration over conventional central processing units (CPUs), are among the most effective technologies for speeding up ML procedures.
One of the most popular platforms for developing and deploying machine learning models is TensorFlow, which supports GPU computing via CUDA (Compute Unified Device Architecture), a technology developed by NVIDIA. CUDA makes it possible to parallelize computations efficiently, optimize the processing of large volumes of data, and reduce the cost of training models. Using CUDA in TensorFlow yields significant performance gains through parallel data processing and efficient use of the GPU architecture.
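
To make this concrete, the following is a minimal sketch, assuming TensorFlow 2.x built with CUDA support and at least one NVIDIA GPU visible to the runtime; the matrix size is an arbitrary illustrative value:

    import tensorflow as tf

    # Check whether TensorFlow detects a CUDA-capable GPU.
    gpus = tf.config.list_physical_devices('GPU')
    print(f"CUDA GPUs visible to TensorFlow: {len(gpus)}")

    # Pin a large matrix multiplication to the first GPU; the underlying
    # CUDA kernel executes it in parallel across thousands of GPU threads.
    with tf.device('/GPU:0'):
        a = tf.random.normal([4096, 4096])
        b = tf.random.normal([4096, 4096])
        c = tf.matmul(a, b)

    print(c.shape)  # (4096, 4096), computed on the GPU

If no GPU is available, TensorFlow silently falls back to the CPU, which is why checking list_physical_devices is a common first step when benchmarking.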
The relevance of this research stems from the growing demands on computing power in ML tasks, as well as the need to develop methods and approaches for optimizing GPU utilization. Under modern conditions, not only data processing speed but also power consumption, memory efficiency, and load balancing between computing devices become important factors.
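
As a hedged illustration of two such optimizations in TensorFlow 2.x: on-demand memory growth (so a process does not reserve all GPU memory up front) and mixed float16 precision (which reduces memory traffic and can lower power draw on Tensor Core GPUs). Exact behavior depends on the TensorFlow version and hardware:

    import tensorflow as tf

    # Allocate GPU memory on demand instead of claiming it all at startup,
    # improving memory efficiency when several processes share one GPU.
    for gpu in tf.config.list_physical_devices('GPU'):
        tf.config.experimental.set_memory_growth(gpu, True)

    # Mixed precision keeps variables in float32 but computes in float16,
    # roughly halving activation memory and exploiting NVIDIA Tensor Cores.
    tf.keras.mixed_precision.set_global_policy('mixed_float16')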
The purpose of this work is to analyze and develop strategies for optimizing machine learning workloads using CUDA and TensorFlow on GPUs. The research aims to identify effective methods for allocating computing resources, reducing model training time, and improving the scalability of ML systems.
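
One representative strategy of this kind is multi-GPU data parallelism via TensorFlow's tf.distribute.MirroredStrategy. The sketch below is illustrative only; the model architecture and hyperparameters are placeholders, not the paper's experimental setup:

    import tensorflow as tf

    # MirroredStrategy replicates the model on every visible GPU and
    # splits each batch across them, reducing wall-clock training time.
    strategy = tf.distribute.MirroredStrategy()
    print(f"Replicas in sync: {strategy.num_replicas_in_sync}")

    with strategy.scope():
        model = tf.keras.Sequential([
            tf.keras.Input(shape=(784,)),
            tf.keras.layers.Dense(128, activation='relu'),
            tf.keras.layers.Dense(10),
        ])
        model.compile(
            optimizer='adam',
            loss=tf.keras.losses.SparseCategoricalCrossentropy(
                from_logits=True))

    # model.fit(dataset) would now distribute each global batch
    # across all GPUs and synchronize gradients between replicas.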
License
Copyright (c) 2025 European Research Materials

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.