Seaborn Statistical Cheat Sheet

Master statistical data visualization with comprehensive Seaborn commands and plotting techniques

Statistical Plots

Beautiful Defaults

Data Exploration

1. Distribution Plots

Import Seaborn

Standard way to import seaborn for statistical plotting

import seaborn as sns

Tags: basic, setup, import

Example:

import seaborn as sns
import matplotlib.pyplot as plt
sns.set_style("whitegrid")

Quick Tip:

Import seaborn as sns by convention, and set a style up front so every plot shares the same defaults

✅ Best Practice:

DO: Import matplotlib.pyplot alongside seaborn, set default style early in your script, use consistent import aliases (sns, plt)

❌ Common Mistake:

DON'T: Import seaborn without matplotlib.pyplot, forget to set style preferences, use non-standard import names that confuse collaborators

Histogram Plot

Create histogram with enhanced statistical features

sns.histplot(data=df, x="column", options)

Tags: basic, distribution, histogram

Example:

sns.histplot(data=tips, x="total_bill", kde=True, bins=20)

Quick Tip:

Use kde=True to overlay kernel density estimate

✅ Best Practice:

DO: Choose a bin count that suits the data (roughly 20 to 50 bins works for many datasets), add kde=True for a smooth overlay, specify stat="density" to normalize the bars so their total area is 1

❌ Common Mistake:

DON'T: Use default bins for all data sizes, forget to add kde for better visualization, mix count and density scales without clarification

Box Plot

Create box plot to show distribution by categories

sns.boxplot(data=df, x="cat", y="num", options)

Tags: basic, distribution, categorical

Example:

sns.boxplot(data=tips, x="day", y="total_bill", hue="smoker")

Quick Tip:

Use hue parameter to add another categorical dimension

✅ Best Practice:

DO: Use hue for group comparisons, order categories logically, add notch=True for confidence intervals around median

❌ Common Mistake:

DON'T: Use with too many categories (becomes cluttered), ignore outliers without investigation, forget to label axes clearly
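
A short sketch of ordering categories and labeling axes on a grouped box plot. The data is synthetic (random values standing in for the tips data), so the column contents are assumptions made only for illustration:

```python
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(1)
days = ["Thur", "Fri", "Sat", "Sun"]
# Hypothetical stand-in for the tips data: rows arrive in random day order.
df = pd.DataFrame({
    "day": rng.choice(days, size=200),
    "smoker": rng.choice(["Yes", "No"], size=200),
    "total_bill": rng.gamma(shape=4.0, scale=5.0, size=200),
})

# order= and hue_order= fix the sequence instead of relying on row order.
ax = sns.boxplot(
    data=df, x="day", y="total_bill", hue="smoker",
    order=days, hue_order=["Yes", "No"],
)
ax.set_xlabel("Day of week")
ax.set_ylabel("Total bill")
```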

Violin Plot

Combine box plot and kernel density estimation

sns.violinplot(data=df, x="cat", y="num", options)

Tags: intermediate, distribution, density

Example:

sns.violinplot(data=tips, x="day", y="total_bill", inner="box")

Quick Tip:

Shows full distribution shape, use inner="box" for quartiles

✅ Best Practice:

DO: Use inner="box" or inner="quart" for statistical summaries, split=True with a two-level hue for comparing two groups, density_norm="width" for equal violin widths (scale="width" before seaborn 0.13)

❌ Common Mistake:

DON'T: Use with small sample sizes (<30 per group), ignore the density information, use default inner="box" without understanding what it shows
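
A sketch of a split violin comparing two groups, with quartile lines drawn inside. The data is synthetic, and split=True here relies on the hue column having exactly two levels:

```python
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(2)
# Hypothetical stand-in for the tips data: two days, two hue levels.
df = pd.DataFrame({
    "day": rng.choice(["Sat", "Sun"], size=300),
    "sex": rng.choice(["Female", "Male"], size=300),
    "total_bill": rng.gamma(shape=4.0, scale=5.0, size=300),
})

# Each violin shows one hue level per half; inner="quart" marks the quartiles.
ax = sns.violinplot(
    data=df, x="day", y="total_bill", hue="sex",
    split=True, inner="quart",
)
```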

Distribution Plot

Flexible interface for distribution plots

sns.displot(data=df, x="column", kind="hist/kde/ecdf")

Tags: intermediate, distribution, flexible

Example:

sns.displot(data=tips, x="total_bill", col="time", kind="kde")

Quick Tip:

Use col/row for faceting, kind for different plot types

✅ Best Practice:

DO: Use col_wrap to control subplot layout, set height and aspect for better proportions, overlay a density curve (kde=True) or a rug (rug=True) on a histogram for a fuller view

❌ Common Mistake:

DON'T: Create too many facets (>12 becomes unreadable), use wrong kind for your data type, forget to adjust figure size for multiple subplots

Strip Plot

Show individual data points for categorical data

sns.stripplot(data=df, x="cat", y="num", options)

Tags: basic, categorical, points

Example:

sns.stripplot(data=tips, x="day", y="total_bill", jitter=True, alpha=0.7)

Quick Tip:

Use jitter=True to separate overlapping points

✅ Best Practice:

DO: Keep jitter enabled (jitter=True is the default) or pass a float such as jitter=0.2 to control the spread, set alpha<1 for transparency with many points, combine with boxplot for comprehensive view

❌ Common Mistake:

DON'T: Use without jitter when points overlap, use with very large datasets (becomes cluttered), forget to adjust point size for readability
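
A sketch of layering a strip plot over a box plot on the same Axes, so the summary and the raw points appear together. The data is synthetic, and the colors and sizes are arbitrary choices:

```python
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(4)
days = ["Thur", "Fri", "Sat", "Sun"]
# Hypothetical stand-in for the tips data: 40 bills per day.
df = pd.DataFrame({
    "day": np.repeat(days, 40),
    "total_bill": rng.gamma(shape=4.0, scale=5.0, size=160),
})

# Draw the boxes first and hide their outlier markers, since the strip
# plot already shows every point.
ax = sns.boxplot(data=df, x="day", y="total_bill", color="lightgray", showfliers=False)
sns.stripplot(
    data=df, x="day", y="total_bill",
    jitter=0.2, alpha=0.6, size=3, color="black", ax=ax,
)

# Count the plotted points across all strip collections.
n_points = sum(len(c.get_offsets()) for c in ax.collections)
```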

Swarm Plot

Non-overlapping strip plot showing distribution shape

sns.swarmplot(data=df, x="cat", y="num", options)

Tags: intermediate, categorical, distribution

Example:

sns.swarmplot(data=tips, x="day", y="total_bill", hue="sex")

Quick Tip:

Better than strip plot for smaller datasets, shows density

✅ Best Practice:

DO: Use for small to medium datasets (<2000 points), combine with violin plots for statistical context, use size parameter to adjust point size

❌ Common Mistake:

DON'T: Use with large datasets (becomes slow and cluttered), ignore the density information it provides, use without considering computational cost

2. Relationship Plots

Scatter Plot

Enhanced scatter plot with statistical features

sns.scatterplot(data=df, x="col1", y="col2", options)

Tags: basic, correlation, scatter

Example:

sns.scatterplot(data=tips, x="total_bill", y="tip", hue="time", size="size")

Quick Tip:

Use hue for color coding, size for bubble charts

✅ Best Practice:

DO: Use alpha for transparency with overlapping points, set appropriate size range with sizes parameter, add regression line with sns.regplot for trends

❌ Common Mistake:

DON'T: Use without alpha when points overlap heavily, ignore outliers that may indicate data quality issues, use too many hue categories (>10 becomes confusing)
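
A sketch of a bubble-style scatter plot with a bounded marker size range and transparency. The data is synthetic, and the (20, 200) size range is an arbitrary choice for illustration:

```python
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(5)
n = 300
total_bill = rng.gamma(shape=4.0, scale=5.0, size=n)
# Hypothetical stand-in for the tips data.
df = pd.DataFrame({
    "total_bill": total_bill,
    "tip": 0.15 * total_bill + rng.normal(0, 1, size=n),
    "time": rng.choice(["Lunch", "Dinner"], size=n),
    "size": rng.integers(1, 7, size=n),
})

# sizes=(min, max) bounds the marker areas; alpha reveals overlapping points.
ax = sns.scatterplot(
    data=df, x="total_bill", y="tip",
    hue="time", size="size", sizes=(20, 200), alpha=0.6,
)
marker_areas = ax.collections[0].get_sizes()
```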

Line Plot

Line plot with confidence intervals and grouping

sns.lineplot(data=df, x="x", y="y", options)

Tags: basic, time-series, trends

Example:

sns.lineplot(data=flights, x="year", y="passengers", hue="month")

Quick Tip:

When several observations share an x value, lineplot plots their mean and draws a 95% confidence interval band around it

✅ Best Practice:

DO: Use markers for discrete time points, set errorbar=None to remove confidence bands when not needed (ci=None before seaborn 0.12), use style parameter for line patterns

❌ Common Mistake:

DON'T: Use with non-sequential x-axis data, ignore confidence intervals without understanding what they represent, use too many hue levels (>8 becomes cluttered)

Regression Plot

Scatter plot with regression line and confidence interval

sns.regplot(data=df, x="x", y="y", options)

Tags: intermediate, regression, correlation

Example:

sns.regplot(data=tips, x="total_bill", y="tip", scatter_kws={"alpha":0.7})

Quick Tip:

Use order parameter for polynomial regression

✅ Best Practice:

DO: Check residuals (for example with sns.residplot) to validate linear assumptions, use robust=True for outlier resistance (requires statsmodels), set appropriate order for polynomial fits (usually 2-3)

❌ Common Mistake:

DON'T: Assume linear relationship without checking residuals, use high polynomial orders (>3) without justification, ignore confidence intervals when interpreting results
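
A sketch of a second-order polynomial fit next to its residual plot. The data is synthetic with a deliberately quadratic trend, so the coefficients and noise level are assumptions made for illustration:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(7)
x = rng.uniform(0, 10, size=200)
# Hypothetical data with a quadratic trend plus noise.
df = pd.DataFrame({
    "x": x,
    "y": 1.0 + 0.5 * x + 0.3 * x**2 + rng.normal(0, 2, size=200),
})

fig, (ax_fit, ax_res) = plt.subplots(1, 2, figsize=(9, 4))

# Left: quadratic fit with its confidence band.
sns.regplot(data=df, x="x", y="y", order=2, scatter_kws={"alpha": 0.5}, ax=ax_fit)
ax_fit.set_title("Quadratic fit (order=2)")

# Right: residuals of the same quadratic model, plotted against x.
sns.residplot(data=df, x="x", y="y", order=2, ax=ax_res)
ax_res.set_title("Residuals")

residuals = ax_res.collections[0].get_offsets()[:, 1]
```

Residuals scattered evenly around zero with no visible curve suggest the chosen order is adequate; a remaining pattern suggests it is not.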

Linear Model Plot

Regression plot with faceting capabilities

sns.lmplot(data=df, x="x", y="y", options)

Tags: intermediate, regression, faceting

Example:

sns.lmplot(data=tips, x="total_bill", y="tip", col="time", hue="smoker")

Quick Tip:

Combines regplot with FacetGrid for multiple subplots

✅ Best Practice:

DO: Use col_wrap to control subplot layout, set aspect and height for proper proportions, use truncate=False to show full regression line

❌ Common Mistake:

DON'T: Create too many facets without adjusting figure size, ignore different slopes across groups, use without checking regression assumptions for each group

Joint Plot

Bivariate plot with marginal distributions

sns.jointplot(data=df, x="x", y="y", kind="scatter/reg/hex")

Tags: intermediate, bivariate, distribution

Example:

sns.jointplot(data=tips, x="total_bill", y="tip", kind="reg")

Quick Tip:

Shows both relationship and individual distributions

✅ Best Practice:

DO: Use kind="hex" for large datasets to avoid overplotting, add marginal_kws for customizing marginal plots, use kind="kde" for smooth density visualization

❌ Common Mistake:

DON'T: Use scatter kind with very large datasets (use hex instead), ignore the marginal distributions when interpreting, forget to adjust figure size for readability

Pair Plot

Matrix of pairwise relationships in dataset

sns.pairplot(data=df, options)

Tags: intermediate, exploration, multivariate

Example:

sns.pairplot(data=iris, hue="species", diag_kind="kde")

Quick Tip:

Great for exploring relationships in entire dataset

✅ Best Practice:

DO: Use hue for categorical grouping, set diag_kind="kde" for smooth distributions, use corner=True to drop the redundant upper triangle when plotting many variables

❌ Common Mistake:

DON'T: Use with too many variables (>6 becomes cluttered), forget to specify hue for meaningful grouping, use with non-numeric data without preprocessing
