Visualize Distributions with Seaborn

Sumangali Tamilselvan
Published in Analytics Vidhya
5 min read · Jul 21, 2020

What is Seaborn?

Seaborn is a Python data visualization library built on top of matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.

Seaborn is used to visualize random distributions.

It’s one of the best visualization packages in any tool or language. It gives us the capability to create improved data visuals.

Installation of Seaborn:

If Python and pip are already installed on your system, install Seaborn with the following command:

pip install seaborn

If you use Jupyter, install Seaborn from a notebook cell with the following command:

!pip install seaborn

Import Matplotlib:

Import the pyplot module of the matplotlib package:

import matplotlib.pyplot as plt

Import seaborn:

Import the seaborn module in your code using the following command:

import seaborn as sns

displot:

displot stands for distribution plot.

It takes an array as input and plots the distribution of the values in that array, either as a histogram (the default) or, with kind="kde", as a smooth density curve.

To plot:

We use sns.displot() to plot the graphs for corresponding distributions.

To view:

We use plt.show() to view the graph.

Random Distribution:

Random Distribution is a set of random numbers that follows a certain probability density function.

Some of the Random Distributions are as follows:

· Normal Distribution

· Binomial Distribution

· Poisson Distribution

· Uniform Distribution

· Logistic Distribution

· Multinomial Distribution

· Exponential Distribution

· Chi-square Distribution

Now let us visualize these random distributions one by one.

Normal Distribution:

Normal Distribution is one of the most important distributions. It is also called the Gaussian Distribution. It fits the probability distribution of many events, e.g. height, blood pressure, heartbeat, etc.

We can use random.normal() from NumPy's random module to get a normal distribution.

It has 3 parameters:

loc: (mean) where the peak of the bell is.

scale: (standard deviation) how flat the distribution is.

size: the shape of the returned array.

Visualization of Normal Distribution:

The curve of a Normal Distribution is also known as the Bell Curve because of its bell shape.

Binomial Distribution:

Binomial Distribution is a Discrete Distribution. It describes the outcome of binary scenarios, e.g. tossing a coin (heads or tails).

We can use random.binomial() to get a binomial distribution.

It has 3 parameters:

n: number of trials

p: probability of success in each trial

size: shape of the returned array

Visualization of Binomial Distribution:

Note: Binomial distributions are discrete distributions.

Poisson Distribution:

Poisson Distribution is also a Discrete Distribution. It estimates how many times an event can happen in a specified time, e.g. if someone eats twice a day, what is the probability that they will eat three times?

We can use random.poisson() to get a Poisson distribution.

It has 2 parameters:

lam: the rate, i.e. the expected number of occurrences in the interval

size: the shape of the returned array

Visualization of Poisson Distribution:

Uniform Distribution:

Uniform Distribution is used when every outcome has an equal chance of occurring.

E.g. generation of random numbers.

We can use random.uniform() to get a uniform distribution.

It has 3 parameters:

low: lower bound; default 0.0

high: upper bound; default 1.0

size: shape of the returned array

Visualization of Uniform Distribution:

Logistic Distribution:

Logistic Distribution is used to describe growth. It is used extensively in machine learning, for example in logistic regression.

We can use random.logistic() to get a logistic distribution.

It has 3 parameters:

loc: mean, where the peak is; default 0

scale: controls the spread (flatness) of the distribution and must be positive; default 1

size: shape of the returned array

Visualization of Logistic Distribution:

Multinomial Distribution:

Multinomial Distribution is a generalization of the binomial distribution. It describes the outcomes of scenarios with more than two possible results, unlike the binomial case, where the outcome must be one of only two, e.g. the outcome of a dice roll.

We can use random.multinomial() to get this distribution.

It has 3 parameters:

n: number of experiments (trials), e.g. how many times the die is rolled

pvals: list of the probabilities of each outcome

size: the shape of the returned array

Multinomial samples will NOT produce a single value! They will produce one value for each entry in pvals.

Visualization of Multinomial Distribution:
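Because one multinomial sample is a set of counts rather than a single number, the sketch below swaps displot for sns.barplot() and draws one bar per outcome. It assumes a fair six-sided die rolled 600 times; the names counts and faces are made up for this example.

```python
import matplotlib.pyplot as plt
import seaborn as sns
from numpy import random

# Roll a fair six-sided die 600 times (illustrative values).
# The result holds one count per face, and the counts add up to n.
counts = random.multinomial(n=600, pvals=[1 / 6] * 6)
faces = [1, 2, 3, 4, 5, 6]

# One bar per face, with the bar height showing how often that face came up.
sns.barplot(x=faces, y=counts)
plt.show()
```

With a fair die, each bar should land near 100.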

Exponential Distribution:

Exponential Distribution is used for describing the time until the next event, e.g. the time until the next failure or success.

We can use random.exponential() to get an exponential distribution.

It has 2 parameters:

scale: inverse of the rate; default 1.0

size: shape of the returned array

Visualization of Exponential Distribution:

Chi-Square Distribution:

Chi-Square Distribution is used as a basis for hypothesis testing.

We can use random.chisquare() to get a chi-square distribution.

It has 2 parameters:

df: degrees of freedom

size: shape of the returned array

Visualization of Chi-Square Distribution:

With this we have come to the end of this article.

Happy coding….😊😊😊
