Memory Efficiency in Pandas

If you’re working with big data in pandas, you can run into memory problems very quickly. When working locally, your machine might slow down, or you might even get this lovely message asking you to please kill some applications. When working in the cloud, you can of course always ramp up memory, but trust me, having to restart a couple of thousand jobs killed by out-of-memory errors is not fun, and it is also pricey! »

Piping in Pandas: Group By and Mutate

I am a big fan of the tidyverse in R, but most of the time I actually use Python. If the rest of your team uses Python and your production code is in Python, it simply doesn’t make much sense to use R. Anyway, I started to like working with pandas much better once I figured out how to pipe with pandas and how to “translate” from tidyverse to pandas. Then this code in R »