Divide pandas dataframe into bins

Dividing a pandas dataframe into bins using qcut and cut

Let's use a sample dataframe: the iris dataset that ships with seaborn.

import pandas as pd
import seaborn as sns

df = sns.load_dataset('iris')
df.head()

[Figure: iris_data]

Let's say we want to divide the dataframe into 5 bins based on petal length. We can do that with qcut or cut.

Using qcut

qcut bins by quantiles: it picks the bin edges so that each bin holds roughly the same number of rows. The bins can therefore have very different widths.

df['qcut_bin'] = pd.qcut(df['petal_length'], 5)
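
To see the "same number of rows per bin" behaviour in isolation, here is a minimal sketch on a small made-up series. The data and the names `values`, `qcut_bins`, and `qcut_counts` are assumptions for illustration and are not part of the iris example.

```python
import pandas as pd

# Hypothetical skewed data: nine small values and one large outlier.
values = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])

# Five quantile-based bins: edges come from the 20%, 40%, 60%, 80% quantiles.
qcut_bins = pd.qcut(values, 5)

# Count the rows in each bin, listed in bin order.
qcut_counts = qcut_bins.value_counts().sort_index()
print(qcut_counts)
```

Each of the five bins ends up with 2 rows, even though the last bin has to stretch all the way to 100 to get them.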

Using cut

If we use cut to divide petal_length into 5 bins, it splits the range of petal length values (from the minimum to the maximum) into 5 intervals of equal width. The number of rows in each bin can then differ widely.

df['cut_bin'] = pd.cut(df['petal_length'], 5, include_lowest=True)
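
The equal-width behaviour is easiest to see on skewed data. The following is a minimal sketch on a small made-up series. The data and the names `values`, `cut_bins`, and `cut_counts` are assumptions for illustration and are not part of the iris example.

```python
import pandas as pd

# Hypothetical skewed data: nine small values and one large outlier.
values = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])

# Five equal-width bins spanning the range from min (1) to max (100).
cut_bins = pd.cut(values, 5)

# Count the rows in each bin, listed in bin order (empty bins show as 0).
cut_counts = cut_bins.value_counts().sort_index()
print(cut_counts)
```

Here the first bin holds 9 rows, the three middle bins are empty, and the last bin holds only the outlier. The intervals are equally wide, but the rows are far from evenly spread.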

The following plots show the bins produced by each method and the number of rows in each bin.

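import matplotlib.pyplot as plt

plt.figure(figsize=(15, 7))

# Left panel: row counts per quantile-based (qcut) bin
plt.subplot(1, 2, 1)
sns.countplot(x=df['qcut_bin'])
plt.xticks(rotation=90)

# Right panel: row counts per equal-width (cut) bin
plt.subplot(1, 2, 2)
sns.countplot(x=df['cut_bin'])
plt.xticks(rotation=90)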

[Figure: count_plot]

This post is licensed under CC BY 4.0 by the author.