A Data Scientist’s perspective on SQL-like Python functions

Photo by Bruce Hong on Unsplash [1].

Introduction

Whether you are transitioning from a data engineer or data analyst role, or want to become a more efficient data scientist, querying your dataframe is a useful way to return the specific rows you want. Note that pandas has a dedicated query function, appropriately named query. However, I will instead discuss other ways to mimic querying, filtering, and merging your data. We will present common scenarios or questions that you would ask of your data, and rather than SQL…
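As a rough illustration of the SQL-like operations this teaser refers to, here is a minimal sketch that filters and merges two small dataframes. The orders and customers tables, their column names, and their values are hypothetical and not taken from the article; the sketch only relies on standard pandas boolean indexing, query, and merge.

```python
import pandas as pd

# Hypothetical example data; table and column names are illustrative only.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 11, 10, 12],
    "amount": [25.0, 40.0, 15.0, 60.0],
})
customers = pd.DataFrame({
    "customer_id": [10, 11, 13],
    "name": ["Ana", "Ben", "Cy"],
})

# SQL: SELECT * FROM orders WHERE amount > 20
large_orders = orders[orders["amount"] > 20]

# The same filter written with the built-in query method
large_orders_q = orders.query("amount > 20")

# SQL: SELECT * FROM orders WHERE amount > 20 AND customer_id = 10
large_for_10 = orders[(orders["amount"] > 20) & (orders["customer_id"] == 10)]

# SQL: SELECT * FROM orders o LEFT JOIN customers c
#      ON o.customer_id = c.customer_id
joined = orders.merge(customers, on="customer_id", how="left")

print(large_orders)
print(joined)
```

Boolean indexing and query return the same rows here; which one reads better is largely a matter of taste. The left merge keeps every order, leaving the name empty for customers that have no match.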


Opinion

These careers can be surprisingly different…

Photo by Leon on Unsplash [1].

Introduction

Before we start, I want to mention that some data science roles require the skills of both positions. So, yes, some data scientists are required to be machine learning engineers as well. But there are still many roles with more well-defined requirements that overlap less. With that said, if we look at positions where these roles are separate, as I have personally experienced, then we can examine their differences. Overall, it is best…


How to make more money while also becoming a better Data Scientist

Photo by Danielle MacInnes on Unsplash [1].

Introduction

Besides holding a steady data science job during the workday, there are several smaller ways to increase your income significantly on the side. It can be said that having a few side hustles would be overwhelming and take away from your main job; however, I think it is quite the opposite. When you work on more data science projects (up to a certain point), you strengthen your knowledge base in data science, which can ultimately help you perform better at your current job…


Opinion

Here’s the difference

Photo by bruce mars on Unsplash [1].

Introduction

When you search for data science versus deep learning, the results are surprising. Most of the articles that show up compare data science to machine learning, which is of course useful, but not as relevant as comparing it directly to deep learning. That is the purpose of this article: to compare these two popular fields of study directly. While there are comparisons out there, I wanted to give my own comparison based on professional experience, hence the opinion label of this article. …


Opinion

No, Data Scientists will not lose their jobs, and here’s why.

Photo by Aideal Hwa on Unsplash [1].

Introduction

As data science has grown in popularity and become more well-defined, the idea has emerged that data science itself can be automated. While, yes, plenty of the processes that data scientists perform can and probably will be automated, there are key steps in the process that will almost always need expert intervention. Some aspects of data science, like model comparison, visualization creation, and data cleaning, can be automated. However, some of these steps are not really…


Opinion

Here’s why… the one-stop-shop for Machine Learning

Photo by Pablo Arroyo on Unsplash [1].

Introduction

Whereas data scientists in the past had to write quite a bit of code to test, compare, and evaluate machine learning algorithms, Python libraries have recently emerged that reduce that work significantly. One of those libraries is PyCaret [2], by Moez Ali, an open-source library that requires little code and lets you go from preparing data to deploying your final model in minutes. There are several benefits…


Opinion

… that are not Data Science specific

Photo by Csaba Balazs on Unsplash [1].

Introduction

This article is intended for beginner data scientists and for data scientists who may not already use these tools. As data scientists learning or working professionally, we can sometimes forget more general tools that are incredibly important to everyday work. These tools are useful mainly because they serve as avenues for explaining work to stakeholders or non-data-science audiences. I also find them very user-friendly, serving as tools to provide and describe quick analyses, summaries, and documentation. …


…removing stop words for Data Science applications

Photo by JESHOOTS.COM on Unsplash [1].

Introduction

It should be no surprise that data is, most of the time, messy, unorganized, and difficult to deal with. As you work your way from educational practice into data science, you will see that most data is obtained from multiple sources and multiple queries, which can lead to some unclean data. In some or most situations, you will have to come up with the dataset that will ultimately be used to train your model. There are a few articles out there that focus on numeric data, but I want the focus of this article…
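To make the idea of removing stop words from text concrete, here is a minimal sketch using only the Python standard library. The stop-word set and the remove_stop_words helper are hypothetical and are not the article's method; real projects usually rely on a much longer list supplied by an NLP library.

```python
import re

# A tiny, hypothetical stop-word list for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}


def remove_stop_words(text: str) -> str:
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return " ".join(token for token in tokens if token not in STOP_WORDS)


cleaned = remove_stop_words("The quality of the data is the key to a good model")
print(cleaned)
```

With this particular stop-word set, the sentence is reduced to its content-bearing words; a different list would keep or drop different tokens.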


Opinion

and use CatBoost instead.

Photo by Pacto Visual on Unsplash [1].

Introduction

I want to first say that XGBoost is an incredibly powerful machine learning algorithm that has won countless data science competitions and is most likely at the forefront of most professional use cases in data science. For that reason, this algorithm has been in the spotlight for the past few years, which is a good thing. That also means its disadvantages have been in the spotlight, creating the motivation to make a similar algorithm that excels where XGBoost does not. Let’s discuss XGBoost…


Opinion

A closer look into the professional aspects of being in Data Science

Photo by The Climate Reality Project on Unsplash [1].

Introduction

Educational programs, whether an online course, an article, or an undergraduate or graduate program, often neglect the professional aspect of data science. Of course, highly complex machine learning algorithms and the deployment of models are incredibly important to learn, but there are other aspects of data science that are especially important for a professional data scientist, or for a data scientist who is more customer-facing. A customer here does not necessarily mean the customer of a product, but rather the customer of your company, as in the stakeholder. …

Matt Przybyla

Sr. Data Scientist. Top Writer in Technology and Education. Author - Towards Data Science. MS in Data Science - SMU.
