
Speed up Pandas 150x without Code Changes

Agent Issue
3 min read · Nov 9, 2023


As your data grows, pandas feels the strain. The amount of data you can handle depends on the memory available on the host machine running the code, and it’s often not just about the size of the dataset: you usually need to hold several copies in memory at once for operations such as chained indexing, assignment, apply functions or type conversions.

Well… NVIDIA has just flipped the switch on RAPIDS cuDF, a Python GPU DataFrame library, to accelerate pandas.


For those acquainted with pandas’ CPU constraints, cuDF’s GPU-accelerated performance is not a distant dream anymore, starting with the RAPIDS v23.10 release.

Without changing a single line of code, you can get up to a 150x speed boost!
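In practice, "no code changes" means the pandas code itself stays the same while the accelerator is switched on from outside: `python -m cudf.pandas script.py` for a script, or `%load_ext cudf.pandas` at the top of a notebook. Here is a minimal sketch of the in-script form, assuming cuDF is installed on a machine with a supported NVIDIA GPU. The fallback branch and the sample data are illustrative additions for this sketch, not something from the release notes.

```python
# Try to enable the cuDF accelerator; this has to happen before pandas is imported.
try:
    import cudf.pandas
    cudf.pandas.install()
    accelerated = True
except ImportError:
    # Assumption for this sketch: without cuDF we simply run plain pandas on the CPU.
    accelerated = False

import pandas as pd

# Ordinary pandas code, identical with or without the accelerator.
df = pd.DataFrame({"store": ["a", "b", "a", "b"], "sales": [10, 20, 30, 40]})
totals = df.groupby("store")["sales"].sum()

print("accelerated:", accelerated)
print(totals.to_dict())
```

The `groupby` lines are untouched either way; only the first few lines decide where the work runs.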

Performance comparison between traditional pandas v1.5 on an Intel Xeon Platinum 8480CL CPU and pandas v1.5 with RAPIDS cuDF on NVIDIA Grace Hopper

When you load cudf.pandas, pandas types like Series and DataFrame are replaced by proxy objects that dispatch operations to cuDF when possible and fall back to regular pandas on the CPU when they can’t.
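The dispatch idea is easier to see in a toy version. The sketch below is not cuDF's implementation; `FallbackProxy`, `FastBackend` and `SlowBackend` are hypothetical names made up for illustration. It only shows the pattern: try the fast backend first, and hand the call to the slow one when the fast one can't do it.

```python
class FallbackProxy:
    """Toy proxy: try the fast backend first, fall back to the slow one."""

    def __init__(self, fast, slow):
        self._fast = fast
        self._slow = slow
        self.calls = []  # records which backend served each call

    def __getattr__(self, name):
        # Only reached for names not found on the proxy itself.
        def dispatch(*args, **kwargs):
            fast_method = getattr(self._fast, name, None)
            if fast_method is not None:
                try:
                    result = fast_method(*args, **kwargs)
                    self.calls.append((name, "fast"))
                    return result
                except NotImplementedError:
                    pass  # fast path can't do this; use the slow path
            result = getattr(self._slow, name)(*args, **kwargs)
            self.calls.append((name, "slow"))
            return result

        return dispatch


class FastBackend:
    """Stand-in for a GPU library that supports only some operations."""

    def __init__(self, data):
        self.data = data

    def total(self):
        return sum(self.data)

    def median(self):
        raise NotImplementedError("not supported on the fast path")


class SlowBackend:
    """Stand-in for the CPU library that supports everything."""

    def __init__(self, data):
        self.data = data

    def total(self):
        return sum(self.data)

    def median(self):
        ordered = sorted(self.data)
        mid = len(ordered) // 2
        if len(ordered) % 2:
            return ordered[mid]
        return (ordered[mid - 1] + ordered[mid]) / 2


values = [3, 1, 4, 1, 5]
series = FallbackProxy(FastBackend(values), SlowBackend(values))
print(series.total())   # served by the fast backend
print(series.median())  # fast backend refuses, slow backend answers
print(series.calls)
```

The caller never chooses a backend, which is why existing pandas code can stay as it is.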

cuDF will have many practical implications; pandas is pretty much the Swiss Army knife for data wrangling in Python, and there are 9.5 million pandas users. It’s used in all kinds of applications: reporting, business intelligence, optimization, machine learning, and AI.
