How to Apply Lambda & Apply Function in a Pandas Dataframe

Three strategies for creating a new Pandas DataFrame column from a calculation, and a comparison of performance

David Allen
3 min read · Apr 19, 2020

As always, documentation for myself made public for you…

The core lesson here is to use cell magic to measure execution time. It’s a terrific way to evaluate your code’s performance.

Table of Contents:

Strategy 1: Write a function, and apply that function.

Strategy 2: Write a Lambda, and apply that Lambda

Strategy 1.2: Write a better function, and apply that function.

Strategy 1: Write a function, and apply that function.

I’ve been working with a lot of canine health data lately, and one thing that has been fascinating to watch develop is the change “before COVID-19” vs. “after COVID-19”.

One of the ways to cut the data, therefore, is to tag each row (I’m dealing with a huge trove of daily data) with “before” or “after” what I see as the inflection date: 3/6/2020 (March 6, 2020).

So, like the Python/Pandas newb that I am, I wrote this function and then applied it to my multi-million row DF. Here’s how that goes:

First, write the function:

import pandas as pd

def covid_before_age(start_date):
    # Tag a single date as "before" or "after" the inflection date
    if start_date < pd.to_datetime('3/6/2020'):
        covid_status = "before"
    else:
        covid_status = "after"
    return covid_status

Second, apply the function:

df['covid_status'] = df.start_date.apply(covid_before_age)
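
To see what that apply call does, here’s a minimal sketch on a tiny, made-up DataFrame. The four dates below are hypothetical stand-ins, not the real canine health data. Note that a start_date of exactly 3/6/2020 gets tagged “after”, because the comparison is a strict less-than.

```python
import pandas as pd

# Hypothetical toy data standing in for the real multi-million-row DataFrame
df = pd.DataFrame({
    "start_date": pd.to_datetime(
        ["2020-02-14", "2020-03-05", "2020-03-06", "2020-04-01"]
    )
})

def covid_before_age(start_date):
    # Same logic as the function above: strict "<" against the cutoff
    if start_date < pd.to_datetime('3/6/2020'):
        covid_status = "before"
    else:
        covid_status = "after"
    return covid_status

# Series.apply calls the function once per row, in Python
df['covid_status'] = df.start_date.apply(covid_before_age)
print(df)
```

Because apply runs the function once for every row, the cost grows with the number of rows, which is why timing it on the full data set matters.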

The shape of my data frame:

df.shape

Cell magic says it takes this long to calculate:

28 minutes, 53 seconds. That’s a long time…
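
If you want to reproduce this kind of measurement, here’s a rough sketch. In a Jupyter notebook, a cell magic such as %%time on the first line of the cell reports the wall time for that cell. The plain-Python version below uses time.perf_counter() instead, so it also runs outside a notebook. The 10,000-row DataFrame is synthetic and purely an assumption for illustration, so its timing says nothing about the 28 minutes measured on the real data.

```python
import time
import pandas as pd

def covid_before_age(start_date):
    if start_date < pd.to_datetime('3/6/2020'):
        covid_status = "before"
    else:
        covid_status = "after"
    return covid_status

# Synthetic data (an assumption): one row per minute starting March 1, 2020
df = pd.DataFrame({
    "start_date": pd.date_range("2020-03-01", periods=10_000, freq="min")
})

# In Jupyter, "%%time" as the first line of the cell would report this for you
start = time.perf_counter()
df['covid_status'] = df.start_date.apply(covid_before_age)
elapsed = time.perf_counter() - start

print(f"{elapsed:.3f} seconds for {len(df):,} rows")
```

Timing a small sample first is a cheap way to get a feel for how long the full run might take before committing to it.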

Strategy 2: Write a Lambda, and apply that Lambda
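
A minimal sketch of what this strategy could look like, assuming the same start_date column and cutoff as in Strategy 1: the same comparison written inline as a lambda instead of a named function. The toy dates are hypothetical, and this illustrates the approach rather than the exact lambda run against the real data.

```python
import pandas as pd

# Hypothetical toy data; the column name matches the one used in Strategy 1
df = pd.DataFrame({
    "start_date": pd.to_datetime(
        ["2020-02-14", "2020-03-05", "2020-03-06", "2020-04-01"]
    )
})

# Same before/after test as covid_before_age, expressed as a one-line lambda
df['covid_status'] = df.start_date.apply(
    lambda d: "before" if d < pd.to_datetime('3/6/2020') else "after"
)
print(df)
```

The lambda still runs once per row, so it is worth timing it the same way as the function version rather than assuming it is faster.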

