Project Demeter is a collaboration with two of our strategic partners, Merge IT and Zadara. Zadara is well known in the MSP space for delivering object-based storage that can provide immutable backups when paired with Veeam and other backup technologies. Zadara acquired a company called Stratoscale and rebranded its technology as zCompute, which serves as Zadara's IaaS offering. Zadara has its own KVM hypervisor and can also run VMware vSphere and Microsoft Hyper-V.

Utilising an SDS (Software-Defined Storage) layer and ultra-fast 100Gb/s Mellanox InfiniBand switching, Zadara is a true HCI (Hyper-Converged Infrastructure) appliance that can provide Block, File and Object storage in a single, unified platform. The purpose of this dedicated showcase is to demonstrate that we can take Kaggle datasets, codebases and scripts and use them to benchmark certain Machine Learning models on different hardware stacks.

We baselined the initial tests on a small machine in AWS with a legacy NVIDIA GPU, then deployed the same datasets, codebases and scripts on two Zadara virtual machines running on the zCompute platform, each using a newer NVIDIA GPU.

To provide a diverse range of testing, we selected workloads and datasets that varied in the type of data being modelled (Structured, Semi-Structured and Unstructured). The tests covered medical image scanning, genetic biomarker detection and stock market sentiment prediction.

We captured all of the performance and resource utilisation statistics, using CloudWatch in AWS and Prometheus on Zadara. The aim was to show that intense workload processing, such as medical image scanning, can be run in a short amount of time on modest compute and storage resources when targeting a small dataset. We compared a small machine in AWS using an NVIDIA T4 GPU against two larger machines on Zadara hardware using NVIDIA A2 and A40 GPUs. We deployed the same simple software stack (Ubuntu Linux with Docker containers) on every machine tested.

During our initial tests, we found that the scripts the data scientist had provided for Kaggle were not optimised, so the machine was not using its available resources efficiently. If we continue to listen to the narrative that we need super-powered GPUs for AI, ML and HPC workloads, we will miss the point of this project. The video we produced to showcase the results of Project Demeter clearly shows that reasonably priced GPUs performed more than adequately. In fact, average GPU memory use was very low throughout the entire test, and most of the load was on the disk.
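As a rough illustration of how that kind of conclusion can be drawn from monitoring data, here is a minimal Python sketch that averages utilisation samples per resource and reports which resource was busiest. The sample values and the names `samples`, `summarise` and `busiest` are hypothetical and chosen only for illustration; they are not measurements or code from Project Demeter.

```python
from statistics import mean

# Hypothetical utilisation samples (percent), one value per scrape interval.
# These numbers are illustrative only and are NOT measurements from the project.
samples = {
    "gpu_memory": [6.0, 8.0, 7.0, 9.0],
    "gpu_compute": [20.0, 35.0, 30.0, 25.0],
    "cpu": [40.0, 45.0, 50.0, 45.0],
    "disk_io": [85.0, 90.0, 95.0, 90.0],
}


def summarise(samples):
    """Return the average utilisation (percent) for each resource with data."""
    return {name: mean(values) for name, values in samples.items() if values}


def busiest(averages):
    """Return the name of the resource with the highest average utilisation."""
    return max(averages, key=averages.get)


averages = summarise(samples)
bottleneck = busiest(averages)

# Print resources from most to least utilised, then the likely bottleneck.
for name, value in sorted(averages.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:12s} {value:5.1f}%")
print(f"Likely bottleneck: {bottleneck}")
```

In practice the samples would be exported from whichever monitoring system is in use; the averaging step stays the same regardless of where the numbers come from.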

We deliberately chose the size and specs of our virtual machines to show that intense workload processing can be done with modest resources at a modest cost.
