Table of Contents
- Getting Started with PyTorch on Saturn Cloud
- Setting up LSTM Model Training
- Model Training and GPU Comparison
- Model Inference
- Final Thoughts
Disclaimer: I worked with Saturn Cloud to make this example.
A hurdle data scientists often face is waiting for a training process to finish. As a quick fix, some limit the amount of data they use, drop features, or switch to a less complex model. These workarounds help, but they are not the best solution: ideally you train the model you intended, without shortcuts. The real solution is to use a powerful GPU instead of a CPU. There are many ways to do this, but one especially easy option is the platform Saturn Cloud, which lets data scientists set up a project, including the disk space, hardware, size, and image of a Jupyter Server, in just a few minutes. You can quickly configure your Jupyter Notebook to train your model on GPU instead of CPU. In this tutorial, we will dive into model training with the popular Python library PyTorch and compare how long training takes on CPU versus GPU.
Getting Started with PyTorch on Saturn Cloud
To follow this tutorial, follow the link to Saturn Cloud and click the "Try Saturn Cloud For Free" button on the homepage. You can then sign up or log in with Google or GitHub. The free trial includes the following:
- 10 hrs of Jupyter per month (including GPU)
- 3 hrs of Dask per month (including GPU)
- Deploy dashboards
We will be covering Jupyter, including the use of GPU. Once you are in the Saturn Cloud app, create and name a PyTorch project, then scroll down to the "Workspace", where you will see the Jupyter Server. One GPU is selected automatically in your settings, which you can edit via the blue box. Hit the green play button to start the server so that its status is running, then click Jupyter Notebook to launch your notebook instantly. You will see a few tutorials already created for you; we will follow "01-start-with-pytorch". For the purpose of this comparison, I make a few small changes to the code so we can see the CPU user time versus the GPU user time.
Setting up LSTM Model Training
Now that we have quickly launched a GPU-backed notebook from the Saturn Cloud app, we can start to do some data science modeling! We will train a neural network on the project's 1 GPU after importing a few of the necessary libraries, as seen below:
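As a rough sketch, the essentials look like this; the exact import list in the Saturn Cloud tutorial notebook may differ slightly:

```python
# Core PyTorch pieces used throughout this tutorial:
# the tensor library itself, the neural-network layers,
# and the Dataset/DataLoader utilities for feeding batches.
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
```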
Next, you will download the data, which is composed of pet names, and define classes and functions to manage it. The model architecture is also defined in this snippet:
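A minimal sketch of what that snippet contains, assuming the pet-name data is a plain list of strings; the class names, vocabulary, and layer sizes here are illustrative, not the tutorial's exact code:

```python
import string

import torch
import torch.nn as nn
from torch.utils.data import Dataset

# Character vocabulary: letters, space, and a terminal "stop" character.
CHARS = list(string.ascii_letters) + [" ", "*"]  # "*" marks end of name
CHAR_TO_IDX = {c: i for i, c in enumerate(CHARS)}

class PetNamesDataset(Dataset):
    """Turns each name into (input, target) index tensors shifted by one."""
    def __init__(self, names):
        self.names = [n + "*" for n in names]

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        encoded = torch.tensor([CHAR_TO_IDX[c] for c in self.names[idx]])
        return encoded[:-1], encoded[1:]  # predict the next character

class NameLSTM(nn.Module):
    """Character-level LSTM: embedding -> LSTM -> per-step logits."""
    def __init__(self, vocab_size=len(CHARS), embed_dim=8, hidden_dim=50):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x):
        out, _ = self.lstm(self.embed(x))
        return self.fc(out)  # (batch, seq_len, vocab_size)
```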
As you will see later, these functions and classes will be referenced in the training function, where we can define if we want to use the default GPU of the project, or set it to CPU for the purpose of comparison.
Model Training and GPU Comparison
The default setting in the code is GPU. To set the GPU explicitly, assign the device variable as device = torch.device(0). For the first training run on CPU, however, we will set device = torch.device('cpu'). Instead of relying on just the provided train() function, we will also keep a renamed copy, gpu_train(); everything within the function stays the same except the device. For easier viewing, here is the original train function, which uses the project's default 1 GPU:
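A hedged sketch of what such a training loop looks like; the device assignment is the only line you change for the CPU comparison, and the hyperparameters and helper names here are illustrative rather than the tutorial's exact code:

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader

def pad_batch(batch):
    """Pad variable-length names; targets padded with -100 are ignored by the loss."""
    xs, ys = zip(*batch)
    xs = pad_sequence(xs, batch_first=True, padding_value=0)
    ys = pad_sequence(ys, batch_first=True, padding_value=-100)
    return xs, ys

def train(model, dataset, num_epochs=8, batch_size=128, device=None):
    # torch.device(0) is the project's default GPU; pass
    # torch.device("cpu") instead for the CPU comparison run.
    if device is None:
        device = torch.device(0 if torch.cuda.is_available() else "cpu")
    model = model.to(device)
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True,
                        collate_fn=pad_batch)
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    loss_fn = nn.CrossEntropyLoss(ignore_index=-100)
    for epoch in range(num_epochs):
        for xb, yb in loader:
            xb, yb = xb.to(device), yb.to(device)
            logits = model(xb)                          # (batch, seq, vocab)
            loss = loss_fn(logits.transpose(1, 2), yb)  # CE expects (N, C, L)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        print(f"epoch {epoch + 1}: loss {loss.item():.4f}")
    return model
```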
The next part of the process is to actually execute the training. You can run just the original train(), or run both it and the renamed gpu_train() if you want to compare the speed of CPU versus GPU. Spoiler alert: GPU is considerably faster, so I recommend sticking with the default train() function if you want to skip the comparison. You will then set the number of epochs and the batch size accordingly. Using the %%time magic, we can see that training with PyTorch on GPU is nearly 30 times faster, 26.88× to be specific. As a data scientist, you can imagine how this speedup eases the pain of waiting for your model to train, and keeps the job from slowing down your computer.
Here is what setting your batch_size looks like, along with a comparison of the two training runs, showing each epoch's output and its respective loss:
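%%time is a Jupyter cell magic, so in the notebook the comparison is simply each training call in its own %%time cell. Outside a notebook you can get a comparable wall-clock measurement with a small helper like this sketch:

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn, print its wall-clock time (roughly what %%time reports),
    and return both the result and the elapsed seconds."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{fn.__name__}: {elapsed:.2f} s wall time")
    return result, elapsed
```

For example, `_, cpu_seconds = timed(train, model, dataset)` would time one training run, and dividing the CPU time by the GPU time gives the speedup factor.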
Summary of training speed results: to better compare the CPU and GPU trainings, we can show the total times side by side. All of the times decreased with the use of GPU, as highlighted by the green background below.
Next, you will generate names using a function that takes the model and runs it over and over on a string, generating a new character each step until a stop character is met. It preps the data to run through the model again, runs the model to get the probabilities of each possible next character, and determines what the actual letter is; if the next character is not a stop character, it adds the latest generated character to the name and continues. Finally, it returns the pet name by joining the list of characters into a single string.
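The loop just described can be sketched as follows. The function name and sampling details here are illustrative assumptions, not the tutorial's exact code; `chars` is the character vocabulary used in training, including the stop character:

```python
import random

import torch

@torch.no_grad()
def generate_name(model, chars, stop_char="*", max_len=20):
    """Repeatedly run the model on the string so far, sampling the next
    character from the predicted probabilities until the stop character."""
    char_to_idx = {c: i for i, c in enumerate(chars)}
    # Seed with a random non-stop character from the vocabulary.
    name = [random.choice([c for c in chars if c != stop_char])]
    model.eval()
    for _ in range(max_len):
        x = torch.tensor([[char_to_idx[c] for c in name]])
        logits = model(x)                           # (1, len(name), vocab)
        probs = torch.softmax(logits[0, -1], dim=0)  # next-char distribution
        next_char = chars[torch.multinomial(probs, 1).item()]
        if next_char == stop_char:
            break
        name.append(next_char)
    return "".join(name)
```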
In the example below, you will see how you can display your model output, which will generate 50 names and filter out existing ones.
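The display step boils down to generating a batch of candidates and keeping only the fresh ones. A minimal sketch of the filter, assuming a generation helper like the one described above (the helper and variable names here are hypothetical):

```python
def keep_new_names(generated, existing):
    """Return only generated names that do not appear in the training data."""
    existing = set(existing)
    return [name for name in generated if name not in existing]

# e.g. fresh = keep_new_names(
#     [generate_name(model, chars) for _ in range(50)], training_names)
```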
The final result is a list of names, which will look like this output in your Jupyter Notebook cell:
['Moicu', 'Caspa', 'Penke', 'Lare', 'Otlnys', 'Zexto', 'Toba', 'Siralto', 'Luny', 'Lit',
'Bonhe', 'Mashs', 'Riys Wargen', 'Roli', 'Sape', 'Anhyyhe', 'Lorla', 'Boupir', 'Zicka',
'Muktse', 'Musko', 'Mosdin', 'Yapfe', 'Snevi', 'Zedy', 'Cedi', 'Wivagok Rayten', 'Luzia',
'Teclyn', 'Pibty', 'Cheynet', 'Lazyh', 'Ragopes', 'Bitt', 'Bemmen', 'Duuxy', 'Graggie',
'Rari', 'Kisi', 'Lvanxoeber', 'Bonu', 'Masnen', 'Isphofke', 'Myai', 'Shur', 'Lani', 'Ructli',
'Folsy', 'Icthobewlels', 'Kuet Roter']
Training a model can be time-consuming, and it can slow down your computer so much that you cannot work on another project or task in the meantime. This problem does not have to occur, because you can use a GPU instead of a CPU. We have seen that by training an LSTM network with PyTorch, we can dramatically cut training time with a simple GPU setup. The comparisons and benefits do not stop there, as you can apply this GPU to other models as well.
If you would like to learn more, here is a link to extra resources for getting started with PyTorch.
I hope you found this article both interesting and useful! Please feel free to reach out if you have any questions. Thank you for reading!
I wrote this article and was paid by Saturn Cloud to write it. I enjoyed working with them to share with my audience easy GPU scaling that is part of a free trial. I do not receive money from them after this article has been published.
- Saturn Cloud, Saturn Cloud, (2021)
- M. Przybyla, Jupyter Notebook Setup Screenshot, (2021)
- M. Przybyla, CPU vs GPU Training Screenshot, (2021)
- M. Przybyla, Comparison Output Image, (2021)
- Saturn Cloud, Getting Started with PyTorch, (2021)
- Saturn Cloud, code referenced is from Saturn Cloud developers, (2021)