Dividing a pandas dataframe into bins using qcut and cut
Let's use a sample dataframe:
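import pandas as pd
import seaborn as sns
df = sns.load_dataset('iris')  # load the iris sample dataset
df.head()  # preview the first five rows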

Let's say we want to divide the dataframe into 5 bins based on the petal length. We can do that using either qcut or cut.
Using qcut
qcut divides the data into quantile-based bins, so that each bin contains roughly the same number of rows. The bin edges are chosen from the data, so the bins usually have different widths.
df['qcut_bin'] = pd.qcut(df['petal_length'],5)
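To see the equal-count behaviour in isolation, here is a minimal sketch on a small made-up series. The values below are hypothetical toy data, not taken from the iris dataset, and the names values, qcut_bins and qcut_counts are only for illustration.

```python
import pandas as pd

# Hypothetical toy data (not the iris dataset): ten values with a skewed spread.
values = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 50, 100])

# Ask for 5 quantile-based bins; the edges are taken from the data's quantiles.
qcut_bins = pd.qcut(values, 5)

# Count how many values landed in each bin, keeping the bins in order.
qcut_counts = qcut_bins.value_counts(sort=False)
print(qcut_counts)
```

Even though the two largest values sit far away from the rest, every bin should end up with the same number of values, because qcut balances counts rather than widths.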
Using cut
If we use cut to divide petal_length into 5 bins, it splits the range of petal length values into 5 intervals of equal width. The number of rows in each bin can therefore differ a lot.
df['cut_bin'] = pd.cut(df['petal_length'],5, include_lowest=True)
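The equal-width behaviour can be sketched on a small made-up series as well. As before, the values are hypothetical toy data rather than the iris dataset, and the names values, cut_bins and cut_counts are only for illustration.

```python
import pandas as pd

# Hypothetical toy data (not the iris dataset): ten values with a skewed spread.
values = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 50, 100])

# Ask for 5 bins that split the range from min to max into equal-width intervals.
cut_bins = pd.cut(values, 5, include_lowest=True)

# Count how many values landed in each bin, keeping the bins in order.
cut_counts = cut_bins.value_counts(sort=False)
print(cut_counts)
```

Here the eight small values should all fall into the first interval, while some of the other intervals stay empty. That is the trade-off with cut: the widths are uniform, but the counts are not.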
The following plot shows the bins produced by each method and the number of rows in each bin:
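import matplotlib.pyplot as plt

plt.figure(figsize=(15,7))
plt.subplot(1,2,1)
sns.countplot(x=df['qcut_bin'])  # rows per quantile-based bin
plt.xticks(rotation=90)
plt.subplot(1,2,2)
sns.countplot(x=df['cut_bin'])  # rows per equal-width bin
plt.xticks(rotation=90)
plt.show()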
