3.3.3. Which Plot Type Should I Use?

This is a brief listing of common graphs and the functions that produce them.

The functions below are but a little tasting of common plots, and I’m not specifying parameters beyond the utterly necessary. pd and sns functions get their flexibility from the wide assortment of parameters you can alter. Changing the parameters a bit can produce large (and interesting!) alterations. For example, col and hue typically multiply the amount of info in a graph.

You can either read the function’s documentation (and I frequently do!) via SHIFT+TAB or look through the graph example galleries here and here until you see graphs with features you want, and then you can look at how they are made.

Tip

I would absolutely bookmark these links:

3.3.3.1. Common plot functions

Examining one variable

Note

Below, if I call something like df['variable'].<someplottype>, that means we are using pandas' built-in plotting methods. Otherwise, we call sns to use seaborn.

If the variable is called \(x\) in the dataset,

Graph	Code example
frequency count	`df['x'].value_counts().plot.bar() # built in pandas fnc` `df['x'].value_counts()[:10].plot.bar() # only the top 10 values` `sns.countplot(data=df, x='x')`
histogram	`sns.displot(data=df, x='x')` `sns.displot(data=df, x='x',bins=15) # lots of opts, one is num of bins`
KDE (Kernel density est.)	`sns.displot(data=df, x='x',kind='kde')` `sns.displot(data=df, x='x',kde=True) # includes both kde and hist by default`
boxplot	`sns.boxplot(x="x", data=df)`

The countplot/bar graph counts frequency of values (# of times that value exists) within a variable, and is best when there are fewer possible values or when the variable is categorical instead of numerical (e.g. the color of a car).

The others examine the distribution of values for numerical variables (not categorical) and also work on continuous variables or those with many values.

Examining one variable by group

If you want to examine \(y\) for each group in \(group\)

Graph	Code example
boxplot	`sns.boxplot(x="group",y="y", data=df)`
KDE by group	`sns.FacetGrid(df, hue="group").map(sns.kdeplot, "y")` `kdeplot` draws only the KDE curve (the KDE portion of the older `distplot`). FacetGrid is something we should defer talking about….
violinplot	`sns.catplot(x="group",y="y", data=df, kind='violin')` `catplot` can quickly plot many different types of categorical plots!
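Since `catplot` switches plot types through its `kind` argument, a quick loop shows how little changes between them. This is only a sketch on made-up data (the `group` and `y` columns are placeholders matching the table above), and the three kinds I picked are just a sample of what `catplot` accepts.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: three groups whose y values are centered at different means.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "group": np.repeat(["a", "b", "c"], 50),
    "y": np.concatenate([rng.normal(m, 1.0, 50) for m in (0.0, 1.0, 3.0)]),
})

# Same call each time; only `kind` changes.
grids = {}
for kind in ["box", "violin", "strip"]:
    grids[kind] = sns.catplot(data=df, x="group", y="y", kind=kind)
```

Each call returns its own figure, so in a notebook you would see three plots of the same data, one per kind.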

Tip

Most functions accept some subset of hue, row, col, style, size. Each of these adds new facets to your graphs. Facets are ways of either repeating graphs for different subgroups or overlaying figures for different subgroups on each other.
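Here is a small sketch of several of those arguments working at once. I'm using `sns.relplot` because it accepts `hue`, `style`, and `col` together. The data and the column names (`group`, `region`) are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: the x-y slope flips sign depending on `group`.
rng = np.random.default_rng(2)
n = 120
df = pd.DataFrame({
    "x": rng.normal(size=n),
    "group": rng.choice(["a", "b"], size=n),
    "region": rng.choice(["north", "south", "west"], size=n),
})
sign = np.where(df["group"] == "a", 1.0, -1.0)
df["y"] = sign * df["x"] + rng.normal(scale=0.3, size=n)

# hue and style overlay the two groups in each panel (color + marker);
# col repeats the panel once per region.
g = sns.relplot(data=df, x="x", y="y", hue="group", style="group", col="region")
```

One line of code, and the figure now shows the x-y relationship for six subgroups (2 groups in each of 3 regions).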

Examining two variables

Graph	Code example
line	`sns.lineplot(x="x", y="y", data=df)`
scatterplot	`sns.scatterplot(x="x", y="y", data=df)`
scatter + density	`sns.jointplot(x="x", y="y", data=df)`
with fit line	`sns.jointplot(x="x", y="y", data=df, kind="reg") # regress to get fit`
hexbin	`sns.jointplot(x="x", y="y", data=df, kind="hex") # possibly better than scatter with larger data`
topograph	`sns.jointplot(x="x", y="y", data=df, kind="kde") # topo map with kde on sides`
pairwise scatter	`sns.pairplot(df[['x','y','z']])` `sns.pairplot(df[['x','y','z']], kind='reg') # add fit lines`

Examining two variables by group

Graph	Code example
line	`sns.lineplot(x="x", y="y", data=df,hue='group')`
scatterplot	`sns.scatterplot(x="x", y="y", data=df,hue='group')`
pairplot	`sns.pairplot(df,hue='group')`

You will come across times where you think the relationship between \(x\) and \(y\) might depend on a third variable, \(z\), or maybe even a fourth variable, \(w\). For example, age and income are related, but the relationship is different for college-educated women than it is for high-school-only men.

If you want to examine the relationship of \(x\) and \(y\) for each group in \(group\), you can do so using any two-way plot type (scatter and its cousins).

Hue vs Col

Some functions achieve the group analysis with a hue argument (give different groups different colors) and some do it with col (give different groups different subfigures).
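A side-by-side sketch of the two approaches, using `sns.lmplot` because it accepts both arguments. The data is made up so that the slope of \(y\) on \(x\) differs by group (the group names `g1` and `g2` are placeholders).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: the relationship between x and y depends on `group`.
rng = np.random.default_rng(3)
n = 100
df = pd.DataFrame({
    "x": rng.uniform(0, 10, n),
    "group": rng.choice(["g1", "g2"], n),
})
slope = np.where(df["group"] == "g1", 2.0, 0.5)
df["y"] = slope * df["x"] + rng.normal(scale=1.0, size=n)

# hue: one panel, each group gets its own color and fit line.
g_hue = sns.lmplot(data=df, x="x", y="y", hue="group")

# col: one panel per group, placed side by side.
g_col = sns.lmplot(data=df, x="x", y="y", col="group")
```

I tend to find `hue` easier for comparing slopes directly and `col` easier when the groups overlap so much that a single panel gets cluttered, but that is a judgment call.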

3.3.3.2. Faceting

Facets allow you to present more info on a graph by designing a plot for a subset of the data, and quickly repeating it for other parts.

You can think of facets as either

- creating subfigures
  - the pairplot below creates subfigures for each combination of variables in the dataset
  - the Anscombe example makes subfigures for subsets of the data
- or overlaying figures on top of each other in a single figure
  - the categorical boxplot below does this for each subgroup
  - the “omitted group effects”

Let’s look at some examples quickly:

import seaborn as sns
import matplotlib.pyplot as plt

iris = sns.load_dataset("iris")
sns.pairplot(iris)
plt.suptitle('Faceting by repeating scatter plots for each pair of variables',fontsize=18)
plt.subplots_adjust(top=0.95) # Reduce plot to make room for the title
plt.show()

# note: .set(title) doesn't work here - it tries to title the individual subfigures (axes)
#       to title the whole thing, I had to use suptitle. 

sns.pairplot(iris, hue="species")
plt.suptitle('Faceting by overlaying figures by group',fontsize=18)
plt.subplots_adjust(top=0.95) # Reduce plot to make room for the title 
plt.show()

Boxplot by group: Just use the x and y arguments together.

sns.boxplot(x="species", y="petal_width", data=iris)
plt.show()

An example of faceting via the col argument. Using row instead does what you’d think. Protip: You can use row and col together to make a grid of groups.

sns.lmplot(data=iris,x='petal_width',y="petal_length",col="species")
plt.show()

sns.lmplot(data=iris,x='petal_width',y="petal_length",hue="species")
plt.show()
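The row-plus-col grid mentioned above needs two grouping variables, and iris only has one, so this sketch uses made-up data instead. The `site` column and all the values are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data with two grouping variables.
rng = np.random.default_rng(5)
n = 240
df = pd.DataFrame({
    "x": rng.uniform(0, 5, n),
    "species": rng.choice(["s1", "s2", "s3"], n),
    "site": rng.choice(["lab", "field"], n),
})
df["y"] = 1.5 * df["x"] + rng.normal(scale=0.5, size=n)

# One row of panels per site, one column per species: a 2 x 3 grid.
g = sns.lmplot(data=df, x="x", y="y", row="site", col="species")
```

Each panel gets its own scatter and fit line, so you can scan across a row to compare species or down a column to compare sites.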

3.3.3.3. Practice: Thinking and planning

Questions: Which type of graph (bar, line, or histogram) would you use?

The volume of apples picked at an orchard based on the type of apple (Granny Smith, Fuji, etcetera).
The number of points for each game in a basketball season for a team.
The count of apartment buildings in Chicago by the number of individual units.

Answers

3.3.3.2.1. I want to `Facet` my figure, but…