Data Visualization in Python with Seaborn
Updated on December 05, 2025 8 minutes read
Data is produced at every click, purchase, and sensor reading. On its own, it is just numbers in a table that most people cannot interpret quickly or confidently.
Data visualization transforms those numbers into shapes, colors, and patterns that your brain can easily understand at a glance. Good visuals surface trends, highlight outliers, and support the story you want to tell.
In this guide, you will learn how to use Seaborn, a popular Python library, to create clear statistical graphics. We will walk through its main plot categories and the core options you need for exploratory data analysis (EDA).
What is Seaborn?
Seaborn is a Python data visualization library built on top of Matplotlib. It provides a high-level interface for statistical plots, so you spend less time on boilerplate code and more time understanding your data.
It integrates naturally with Pandas DataFrames and works smoothly with the rest of the scientific Python ecosystem. If you already use NumPy, Pandas, or scikit-learn, Seaborn fits in immediately.
To install Seaborn, use pip in your environment:
pip install seaborn
Getting started: basic Seaborn setup
The most common convention is to import Seaborn as sns. You will also usually import Pandas, NumPy, and Matplotlib to load data, compute summaries, and display plots.
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
sns.set_theme(style="whitegrid") # give all plots a clean default style
Once this is in place, you can call functions such as sns.catplot, sns.histplot, or sns.relplot to build different kinds of charts. Most examples in this article assume you already have a DataFrame called dataset.
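If you do not have a DataFrame at hand, the following minimal sketch builds a synthetic one so the snippets in this article have something to run against. The column names simply mirror the placeholders used in the examples (variable, variable_1, and so on). The values are random and purely illustrative, not taken from any real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: random values under the placeholder
# column names that the article's snippets refer to.
rng = np.random.default_rng(seed=42)
n_rows = 200

dataset = pd.DataFrame(
    {
        "variable": rng.normal(loc=50, scale=10, size=n_rows),  # numeric
        "variable_1": rng.choice(["A", "B", "C"], size=n_rows),  # categorical
        "variable_2": rng.normal(loc=0, scale=1, size=n_rows),  # numeric
        "variable_3": rng.choice(["group_x", "group_y"], size=n_rows),  # two levels
        "variable_4": rng.integers(1, 6, size=n_rows),  # small integers, 1 to 5
    }
)

print(dataset.head())
```

Here variable_1 is categorical so that it suits the categorical examples. For the relational examples, you may prefer to point x at a numeric column instead.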
Three main plot families in Seaborn
Seaborn plotting functions fall into three broad categories that cover most everyday analytics work.
- Categorical plots compare values between groups or categories.
- Distribution plots show how a variable is distributed.
- Relational plots explore relationships between continuous variables.
Next, we will look at each family with the most useful functions for practical data science.
Categorical plots in Seaborn
Use categorical plots when you want to compare values between groups. You might be looking at revenue by region, satisfaction by plan type, or exam scores by class.
Seaborn catplot is a figure-level function that can draw several kinds of categorical plots by changing the kind argument. Under the hood, it calls more specific functions such as countplot, barplot, boxplot, and others.
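To make the difference between figure-level and axes-level functions concrete, here is a small sketch that draws the same count plot both ways. The toy DataFrame and its plan column are made up for illustration.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical toy data, only for illustration.
toy = pd.DataFrame({"plan": ["free", "pro", "free", "team", "pro", "free"]})

# Figure-level: catplot creates its own figure and returns a FacetGrid.
grid = sns.catplot(data=toy, kind="count", x="plan")

# Axes-level: countplot draws onto an Axes that you create and control.
fig, ax = plt.subplots(figsize=(4, 3))
sns.countplot(data=toy, x="plan", ax=ax)
ax.set_title("Observations per plan")

plt.show()
```

The figure-level call is convenient when you plan to add faceting later. The axes-level call fits better when you are arranging several plots in your own Matplotlib layout.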
Count plot
A count plot shows how many observations fall into each category. It is ideal for quickly checking class balance or the distribution of labels.
sns.catplot(
data=dataset,
kind="count",
x="variable",
)
plt.show()
Bar plot
A bar plot represents an estimate of a continuous value, often the mean, for each category. It answers questions like “What is the average income per region?” or “What is the mean score per class?”
sns.catplot(
data=dataset,
kind="bar",
x="variable_1", # categorical
y="variable_2", # continuous
estimator=np.mean,
errorbar="sd", # show the standard deviation; replaces the deprecated ci="sd"
)
plt.show()
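Behind the scenes, Seaborn calculates the estimator for each category and draws error bars around it. By default these are 95% confidence intervals; the example above shows the standard deviation instead. You can pass your own function to estimator if you want a different summary, such as np.median.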
Strip plot
A strip plot places each observation as a dot along an axis. For every category of variable_1, you see all the values of variable_2.
sns.catplot(
data=dataset,
kind="strip",
x="variable_1",
y="variable_2",
jitter=0.15,
)
plt.show()
Strip plots are simple but reveal patterns such as clustering, gaps, or possible outliers within each group. They are helpful when you want to see individual data points instead of only summaries.
Swarm plot
A swarm plot is similar to a strip plot but avoids overlapping points by arranging them in a swarm shape. Each point still represents a single observation.
sns.catplot(
data=dataset,
kind="swarm",
x="variable_1",
y="variable_2",
)
plt.show()
When there are many observations, Seaborn may be unable to place every point without overlap and will warn you that some points cannot be placed. Swarm plots are best for small to medium datasets where you want to see the shape of the distribution.
Box plot
A box plot summarises the distribution of a continuous variable for each category using quartiles. It is a compact way to compare spreads and identify potential outliers.
sns.catplot(
data=dataset,
kind="box",
x="variable_1",
y="variable_2",
)
plt.show()
The box shows the interquartile range (IQR), with the line inside marking the median. Points drawn outside the whiskers are often treated as outliers that may deserve extra investigation.
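The quantities a box plot encodes can be computed by hand. The sketch below assumes the common default of whiskers reaching at most 1.5 times the IQR beyond the quartiles; the sample values are made up.

```python
import pandas as pd

# Hypothetical sample with one unusually large value.
values = pd.Series([12, 14, 15, 15, 16, 17, 18, 19, 21, 45])

q1 = values.quantile(0.25)  # lower edge of the box
median = values.quantile(0.50)  # line inside the box
q3 = values.quantile(0.75)  # upper edge of the box
iqr = q3 - q1  # height of the box

# Points beyond these fences are drawn individually as potential outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = values[(values < lower_fence) | (values > upper_fence)]

print(f"Q1={q1}, median={median}, Q3={q3}, IQR={iqr}")
print("Potential outliers:", outliers.tolist())
```

A point flagged this way is only a candidate for investigation. Whether it is an error or a genuine extreme value depends on the data.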
Violin plot
A violin plot combines a box plot with a smoothed density curve. It gives you a fuller picture of the distribution shape for each category.
sns.catplot(
data=dataset,
kind="violin",
x="variable_1",
y="variable_2",
inner="quartile",
)
plt.show()
The width of the violin at each value reflects the estimated density of observations there. This makes it easier to see multi-modal distributions than with a plain box plot.
Distribution plots in Seaborn
Distribution plots help you understand the shape of your data. They show where most values lie, how spread out they are, and whether there are long tails or multiple peaks.
In modern Seaborn, the recommended functions for distributions are histplot, kdeplot, and ecdfplot. Figure-level helpers such as displot combine these building blocks for more complex layouts.
Histogram with histplot
A histogram groups data into bins and counts how many observations fall into each bin. It is often the first chart you create for a numeric variable.
sns.histplot(
data=dataset,
x="variable",
bins=20,
)
plt.show()
You can adjust the number of bins or normalise counts to show densities instead of raw counts. Histograms are a fast way to spot skewed distributions or obvious outliers.
Kernel Density Estimate (KDE) with kdeplot
A KDE plot estimates the underlying probability density function of your data using Kernel Density Estimation. It produces a smooth curve instead of discrete bars.
sns.kdeplot(
data=dataset,
x="variable",
fill=True,
)
plt.show()
KDE plots are very good at revealing multi-modal distributions and comparing several groups on the same axes. For two continuous variables, you can create a bivariate KDE.
sns.kdeplot(
data=dataset,
x="variable_1",
y="variable_2",
fill=True,
)
plt.show()
ECDF with ecdfplot
An Empirical Cumulative Distribution Function (ECDF) plot shows, for each value, the proportion of observations less than or equal to it.
sns.ecdfplot(
data=dataset,
x="variable",
)
plt.show()
ECDFs are powerful for comparing distributions because every data point contributes directly. Unlike histograms and KDEs, they do not depend on bin sizes or bandwidth choice, and they work well even with small sample sizes.
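The definition of the ECDF is simple enough to compute by hand, which helps when reading the plot. The sketch below uses a tiny made-up sample; ecdf_at is a hypothetical helper name, not a Seaborn function.

```python
import numpy as np

# Hypothetical sample of five observations.
values = np.array([3.0, 1.0, 4.0, 1.0, 5.0])

# Sort the data; the i-th sorted value has i / n observations at or below it.
x_sorted = np.sort(values)
proportion = np.arange(1, len(x_sorted) + 1) / len(x_sorted)


def ecdf_at(threshold):
    """Return the share of observations less than or equal to threshold."""
    return np.mean(values <= threshold)


for x, p in zip(x_sorted, proportion):
    print(f"value={x}, cumulative proportion={p}")

print(ecdf_at(3.5))
```

Every observation adds a step of height 1 / n, so no binning or smoothing parameter is involved.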
Relational plots in Seaborn
Relational plots reveal how two or more continuous variables move together. They play a crucial role in exploratory data analysis and feature engineering within data science projects.
Seaborn provides relplot, which can draw scatter plots and line plots by setting the kind argument. Because it is a figure-level function, you can easily add faceting and additional encodings.
Scatter plot
A scatter plot displays individual data points for two continuous variables. It helps you spot correlation, clusters, and potential outliers.
sns.relplot(
data=dataset,
kind="scatter",
x="variable_1",
y="variable_2",
)
plt.show()
Scatter plots are often the starting point when you suspect a relationship between variables, such as price versus size or hours studied versus exam score.
Line plot
A line plot connects data points in order. This makes it ideal for time series or sequences where the x-axis has a natural order.
sns.relplot(
data=dataset,
kind="line",
x="variable_1",
y="variable_2",
)
plt.show()
Line plots are commonly used to follow trends over time, such as monthly revenue, daily active users, or sensor readings. They make it easy to see seasonality and long-term trends.
Adding more information with hue, size, style, row, and col
Real datasets often have more than two variables you want to see at once. Seaborn lets you add extra dimensions using color, size, marker style, and faceting into multiple small plots.
Hue
hue maps a third variable to color. All points from the same category share the same color, which makes comparisons between groups very intuitive.
sns.relplot(
data=dataset,
kind="scatter",
x="variable_1",
y="variable_2",
hue="variable_3",
)
plt.show()
Hue is especially useful when you want to compare several categories at once, such as user segments, product lines, or experiment groups on the same axes.
Size
size maps a variable to the size of each marker. This works well when the extra variable is numeric but can also be used for a small number of categories.
sns.relplot(
data=dataset,
kind="scatter",
x="variable_1",
y="variable_2",
size="variable_3",
sizes=(50, 200),
)
plt.show()
Larger markers indicate higher values, or simply different categories in some cases. Be careful not to use too many distinct sizes; otherwise, the plot becomes hard to read and the legend becomes crowded.
Style
style maps a categorical variable to marker shape, for example, circles versus triangles. This is especially useful for black and white printing or for color-blind-friendly charts.
sns.relplot(
data=dataset,
kind="scatter",
x="variable_1",
y="variable_2",
style="variable_3",
markers=["X", "*"],
)
plt.show()
You can combine style with hue to show a fourth variable, but it is important to keep the total number of encodings manageable so that the plot stays readable.
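Combining the two encodings might look like the following sketch, where color carries one categorical variable and marker shape carries another. The DataFrame and its column names are made up for illustration.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical study data with two categorical columns.
toy = pd.DataFrame(
    {
        "hours_studied": [2, 4, 6, 8, 3, 5, 7, 9],
        "exam_score": [55, 62, 74, 85, 50, 66, 70, 90],
        "course": ["math"] * 4 + ["physics"] * 4,
        "attended_review": ["yes", "no", "yes", "no", "no", "yes", "no", "yes"],
    }
)

grid = sns.relplot(
    data=toy,
    kind="scatter",
    x="hours_studied",
    y="exam_score",
    hue="course",  # third variable as color
    style="attended_review",  # fourth variable as marker shape
    s=80,  # slightly larger markers so the shapes are easy to tell apart
)

plt.show()
```

With two levels per encoding the legend stays short. Once either variable has many levels, faceting is usually easier to read.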
Faceting with col and row
Instead of encoding everything in a single figure, you can split the data into multiple small plots, known as facets. col creates columns of plots and row creates rows.
sns.relplot(
data=dataset,
kind="scatter",
x="variable_1",
y="variable_2",
col="variable_3",
)
plt.show()
sns.relplot(
data=dataset,
kind="scatter",
x="variable_1",
y="variable_2",
row="variable_3",
)
plt.show()
You can combine hue, size, and style with faceting to show several variables at once. In practice, try to keep the total number of encoded variables small so the figure remains informative and not overwhelming.
sns.relplot(
data=dataset,
kind="scatter",
x="variable_1",
y="variable_2",
hue="variable_3",
size="variable_4",
)
plt.show()
Bringing it all together
In this guide, you have seen how Seaborn organises its plots into categorical, distribution, and relational families. You also learned how options such as hue, size, style, row, and col let you express more dimensions in a single visual.
The next step is to practise with a real dataset, such as a Kaggle competition or an internal company dataset. Try to answer concrete questions and choose the plot type that best supports the story you want to tell.
If you want guided, hands-on practice with Seaborn, Pandas, and machine learning, explore the Code Labs Academy Data Science and AI Bootcamp.