My career choice is Data Science. However, 3D modeling has always had a special place in my heart, probably because video games play a big role in my free time (pun intended).
I’ve been slowly learning Blender and 3D modeling since last summer, so today I decided to make a short compilation of the best tutorials I’ve completed: those that were jam-packed with helpful beginner tips and that, because of their timing or sheer quality, stuck with me the most.
Don’t get me wrong, I’m still a Blender noob, but at this point I’ve gone through several tutorials…
In the previous articles we’ve created four different Jupyter Notebooks that achieve different data transformations and visualizations of the 2020 Stack Overflow Developer Survey data. Today we are moving all of that to the cloud, more specifically to an Azure DataBricks workspace.
As prerequisites, you need an Azure account as well as a valid subscription, and of course an Azure DataBricks (ADB) workspace (check this documentation page on how to do it). After you’ve created the workspace, I will show you how to set up a cluster to run your computations on, and upload the notebooks and the data.
Today we reach the fourth part of this series, the last part about writing code. The fifth and last part will be about moving our notebooks to the cloud, to an Azure DataBricks workspace.
But before that, we need to analyse the programming languages used by the respondents of the 2020 Stack Overflow Developer survey. This column comes in a bad format, as the choices of each developer are put in the same column, separated by a semicolon (;). …
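As a rough illustration of the problem described above, here is a minimal sketch of one way to unpack a semicolon-separated column with pandas. The column name "LanguageWorkedWith" and the sample rows are assumptions made up for this example, and this is not necessarily the approach the article takes.

```python
import pandas as pd

# Hypothetical sample rows; the column name is an assumption for illustration.
df = pd.DataFrame(
    {"LanguageWorkedWith": ["Python;SQL", "JavaScript;Python", None, "SQL"]}
)

language_counts = (
    df["LanguageWorkedWith"]
    .dropna()          # respondents who skipped the question
    .str.split(";")    # "Python;SQL" -> ["Python", "SQL"]
    .explode()         # one row per (respondent, language) pair
    .value_counts()    # frequency of each language
)

print(language_counts)
```

Splitting and then exploding keeps every choice as its own row, so the usual counting and plotting tools work without further parsing.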
So far in this series we’ve worked with numerical data. Today we’ll analyse the education of respondents to the 2020 Stack Overflow Developer Survey by finding out which are the most frequent education levels.
Thankfully this is a straightforward demo. We just need to map the original options to new values (e.g. “Master’s degree (M.A., M.S., M.Eng., MBA, etc.)” to simply “Master’s degree”), count their frequencies and plot the bar chart.
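The mapping and counting steps could look roughly like the sketch below. The sample answers and the shortened labels are hypothetical, and the bar chart itself is left out to keep the example small.

```python
import pandas as pd

# Hypothetical answers, written the way a survey might phrase them.
MASTERS_LONG = "Master’s degree (M.A., M.S., M.Eng., MBA, etc.)"
BACHELORS_LONG = "Bachelor’s degree (B.A., B.S., B.Eng., etc.)"

education = pd.Series([MASTERS_LONG, BACHELORS_LONG, MASTERS_LONG])

# Map each original option to a shorter label (the labels are assumptions).
short_labels = {
    MASTERS_LONG: "Master’s degree",
    BACHELORS_LONG: "Bachelor’s degree",
}

# Replace the long options, then count how often each short label appears.
education_counts = education.map(short_labels).value_counts()

print(education_counts)
```

The resulting counts can then be handed to a plotting library to draw the bar chart.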
As usual, here are some handy links to navigate the contents of this series:
In the first part of this series we went through some exploratory data analysis of ages to filter out the bad data and, at the end, plotted a bar chart of the age frequencies. Today we are working on the annual compensations from the 2020 Stack Overflow Developer Survey results.
This will involve binning the values so that we can plot them in a histogram at the end. For that, we need to create bin labels (to improve the visualization) and the bin intervals. We’ll make plenty of use of Python’s wonderful list comprehension feature!
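To make the idea concrete, here is a minimal sketch of building bin intervals and labels with list comprehensions and applying them with pandas. The compensation values, the bin width and the upper limit are all assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical annual compensation values.
compensation = pd.Series([12_000, 48_500, 51_000, 99_999, 150_000])

bin_width = 50_000   # assumed bin size
max_value = 200_000  # assumed upper limit
n_bins = max_value // bin_width

# Bin intervals: 0, 50k, 100k, 150k, 200k
bin_edges = [i * bin_width for i in range(n_bins + 1)]

# One readable label per pair of consecutive edges, e.g. "0k-50k"
bin_labels = [
    f"{low // 1000}k-{high // 1000}k"
    for low, high in zip(bin_edges[:-1], bin_edges[1:])
]

# right=False makes each bin include its lower edge and exclude its upper edge.
binned = pd.cut(compensation, bins=bin_edges, labels=bin_labels, right=False)

print(binned.value_counts(sort=False))
```

Building the labels from the edges keeps the two lists in sync if the bin width changes later.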
While Plotly can bin…
After posting a handful of separate articles on data analysis with Python, I’ve decided to share some of the work I did on previous personal projects in the form of a proper series.
This “Python Data Analysis” series will consist of five articles tackling different data problems using the 2020 Stack Overflow Developer survey results dataset. I will show you how to use pandas to overcome issues with numeric and categorical data to create nice visualizations with Plotly (Express) at the end.
Although I only show a Python script here, each article has its own Jupyter notebook with the same…
pandas is a wonderful library to work with data in Python. If you’re accustomed to tabular data, then you will feel right at home with pandas and, better yet, you get to write Python code. I started working with this library a couple of years ago, but I only started using it seriously last year. In this period, I’ve come across many useful functions, so today I will briefly show off five that have stood out to me for their applications.
Sometimes there is a need for a custom sorting order. If you try to use the sort_values function in a column with…
As with many other things, Python is pretty good for programmatic image editing. This is in large part thanks to the Pillow library. Creating an image from scratch, pasting a couple of layers on top, or applying some filters is easily accomplished with a few lines of code.
Today, I will show you a short script that creates custom-shape masks. We’ll load a Santa hat image that has no background and add a background to it, along with a shadow, in fewer than 20 lines of code!
You can find the original Santa hat image here.
If you’re already familiar with Pillow…
Among the many powerful connectors available in Power BI, the Web connector is a great option if you wish to integrate web scraping in your reports. While it does have some limitations, such as only scraping HTML tables, it is nonetheless a strong addition to your ETL capabilities in Power BI.
In this tutorial, I will walk you through the scraping of country flag images from here. We will start by connecting to the website to extract the tables of countries, and then write some M code in Power Query to create the URLs for each country flag image (yes…
When working with pandas DataFrames, sometimes there is a need to sort data in a column by a specific order. For example, you may want to sort a DataFrame by its column of months so that they are properly sorted for a time series visualization. The problem is, a normal sort will get your months sorted alphabetically, not in the natural January to December order.
It’s in cases like these that the Categorical function can help. Just like you can transform a column to a numeric type, you can also transform it to the category type to be treated as…
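A minimal sketch of sorting months in calendar order with an ordered categorical is shown below. The DataFrame, its column names and its values are hypothetical.

```python
import pandas as pd

# Hypothetical data, deliberately out of calendar order.
df = pd.DataFrame(
    {"month": ["March", "January", "February"], "sales": [30, 10, 20]}
)

# The order we want, instead of the default alphabetical one.
month_order = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Convert the column to an ordered category that follows month_order.
df["month"] = pd.Categorical(df["month"], categories=month_order, ordered=True)

# sort_values now follows the category order rather than the alphabet.
df_sorted = df.sort_values("month")

print(df_sorted)
```

A plain sort of the original strings would put February first; with the ordered category, January comes first.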
I write technical articles about data analysis and other things that catch my attention