Python Pandas:

Pandas is a software library written for the Python programming language for data manipulation and analysis.

Pandas was developed by Wes McKinney in 2008.

What is Pandas?

Pandas is an open source, BSD-licensed Python library.

The name is derived from the term “panel data”, an econometrics term for multidimensional structured data sets.

It is a fast, powerful, flexible, and easy-to-use open source data analysis and manipulation tool, built on top of the Python programming language.


The Pandas library uses most of the functionality of NumPy. I suggest that beginners go through my previous articles (NumPy) before proceeding.
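To illustrate how closely the two libraries work together, here is a minimal sketch. The array values and the column names "a" and "b" are made up for illustration; it assumes both numpy and pandas are installed.

```python
import numpy as np
import pandas as pd

# Illustrative data only: a 3x2 NumPy array of made-up values.
values = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# Wrap the array in a DataFrame; the column names are arbitrary.
df = pd.DataFrame(values, columns=["a", "b"])

# Column means computed through Pandas.
col_means = df.mean()

# The data can be handed back as a plain NumPy array.
back_to_numpy = df.to_numpy()

print(df)
print(col_means)
```

A DataFrame can be built directly from a NumPy array, and to_numpy() converts it back.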

Why Pandas?

1. Fast and efficient data…

What is Seaborn?

Seaborn is an incredible Python data visualization library built on top of matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.

Seaborn is used to visualize random distributions.

It’s one of the best visualization packages in any tool or language. It gives us the capability to create improved data visuals.

Installation of Seaborn:

If you already have Python and pip installed on your system, install Seaborn using the command below:

pip install seaborn

If you use Jupyter, install Seaborn using the following command:

!pip install seaborn

Import Matplotlib:

Import the pyplot module of the matplotlib package:

import matplotlib.pyplot …

What is Data distribution?

A data distribution is a function or a listing which shows all the possible values (or intervals) of the data and how often each value occurs.

Data distribution is important when working with statistics and data science.

The random module offers methods that return randomly generated data distributions.
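To make the idea of "a listing of values and how often each occurs" concrete, here is a small sketch using NumPy's unique() function. The data values are made up for illustration.

```python
import numpy as np

# Illustrative data: a small sample with repeated values.
data = np.array([2, 3, 3, 5, 5, 5, 7])

# unique() with return_counts=True gives each distinct value
# together with the number of times it occurs.
values, counts = np.unique(data, return_counts=True)

for v, c in zip(values, counts):
    print(v, "occurs", c, "time(s)")
```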

Random Distribution:

A random distribution is a set of random numbers that follows a certain probability density function.

What is Probability Density Function?

A function that describes a continuous probability, i.e. the probability of all values in an array.

We can generate random numbers based on defined probabilities using the choice() method.

Probability is set by a number between 0 and…
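As a minimal sketch of choice() with probabilities: the candidate values, the probabilities, and the seed below are assumptions chosen for illustration. This version uses NumPy's default_rng() generator; the older np.random.choice() accepts the same p argument.

```python
import numpy as np

# A seeded generator so the run is repeatable (the seed is arbitrary).
rng = np.random.default_rng(seed=0)

# Each candidate value gets a probability; the probabilities must sum to 1.
# Here 9 has probability 0.0, so it should never be drawn.
samples = rng.choice([3, 5, 7, 9], p=[0.1, 0.3, 0.6, 0.0], size=100)

print(samples)
```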

What is a Random Number?

Is a number that is different every time called a random number???

NOOOO…… then??🤔

Random numbers are numbers that cannot be predicted logically.

Computers work on programs, and programs are a set of instructions, which means there must be some algorithm to generate a random number. If there is a program that generates a random number, the number can be predicted, so it is not truly random. Such numbers are called pseudo-random numbers.

Python’s random module is pseudo random.
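One way to see the pseudo-random behaviour is to seed the generator: with the same seed, the same "random" sequence comes out again. A small sketch (the seed 42 and the range 1 to 100 are arbitrary choices):

```python
import random

# Seed the generator and draw five numbers.
random.seed(42)
first_run = [random.randint(1, 100) for _ in range(5)]

# Re-seed with the same value and draw five numbers again.
random.seed(42)
second_run = [random.randint(1, 100) for _ in range(5)]

# Both runs produce the same sequence, so the numbers are predictable.
print(first_run == second_run)  # True
```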

Numpy Random

NumPy has a random module to work with random numbers.

These random numbers can be generated as integers or floats.

Generate Random Numbers:

For integers, we use randint()
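A short sketch of generating random numbers with NumPy. The upper bound 100 and the size 5 are arbitrary values for illustration; rand() is shown here as one way to get a float.

```python
from numpy import random

# A random integer from 0 up to (but not including) 100.
x = random.randint(100)

# A random float in the interval [0, 1).
y = random.rand()

# A 1-D array of five random integers from 0 to 99.
arr = random.randint(100, size=5)

print(x)
print(y)
print(arr)
```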

NumPy has quite a few handy statistical functions for finding minimum, maximum, mean, median and standard deviation, etc. from the given elements in the array.

The functions are described in this article.

Statistical Functions

To find Averages and variances:

mean: This will return the arithmetic mean along the specified axis.

median: This will return the median along the specified axis.

average: This will return the weighted average along the specified axis.

std: This will return the standard deviation along the specified axis.

var: This will return the variance along the specified axis.

Example 1: Without axis (1-D array)
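A minimal sketch of these functions on a 1-D array, where no axis is needed. The array values and the weights are made up for illustration.

```python
import numpy as np

# Illustrative 1-D array.
arr = np.array([1, 2, 3, 4, 5])

mean_value = np.mean(arr)      # arithmetic mean
median_value = np.median(arr)  # middle value
# Weighted average: the last element counts six times as much.
weighted = np.average(arr, weights=[1, 1, 1, 1, 6])
std_value = np.std(arr)        # population standard deviation
var_value = np.var(arr)        # population variance

print("mean:", mean_value)
print("median:", median_value)
print("weighted average:", weighted)
print("std:", std_value)
print("var:", var_value)
```

Note that average() without weights gives the same result as mean().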

Numpy Sort

Sorting means putting elements in an ordered sequence.

An ordered sequence is any sequence that has an order corresponding to its elements, such as numeric or alphabetical, ascending or descending.

NumPy has a function called sort() that will sort a specified array.
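A quick sketch of sort() on numeric and string arrays. The values are made up; reversing the sorted result with [::-1] is one simple way to get descending order.

```python
import numpy as np

numbers = np.array([3, 2, 0, 1])
words = np.array(["banana", "cherry", "apple"])

# sort() returns a sorted copy; the original array is left unchanged.
sorted_numbers = np.sort(numbers)
sorted_words = np.sort(words)

# Descending order by reversing the ascending result.
descending = np.sort(numbers)[::-1]

print(sorted_numbers)
print(sorted_words)
print(descending)
```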

Numpy Join:

Joining means putting the contents of two or more arrays into a single array.

We do this using concatenate() along with an axis; if the axis is not passed, it is taken as 0.
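A small sketch of concatenate() with and without the axis argument. The arrays are made up for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# 1-D arrays: joined end to end (axis defaults to 0).
joined = np.concatenate((a, b))

m1 = np.array([[1, 2], [3, 4]])
m2 = np.array([[5, 6], [7, 8]])

# 2-D arrays: axis=0 stacks rows, axis=1 joins along columns.
rows_joined = np.concatenate((m1, m2), axis=0)
cols_joined = np.concatenate((m1, m2), axis=1)

print(joined)
print(rows_joined)
print(cols_joined)
```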

Iterating means going through elements one by one.

Since we work with multi-dimensional arrays in Python (NumPy), we can iterate over them using a for loop.

Iterating 1-D:

For a 1-D array, iteration goes through the elements one by one.

import numpy as np

# Illustrative values (assumed)
arr = np.array([1, 2, 3])
print("Original array:", arr)
print("After iteration:")
for x in arr:
    print(x)

When NumPy functions execute, some of them return a copy of the input array, while others return a view.

When the contents are physically stored in another location, it is called a copy. A different view of the same memory content, on the other hand, is called a view.

The copy owns the data: changes made to the copy do not affect the original array, and changes made to the original array do not affect the copy.

The view does not own the data, and any changes made to the view will affect the original array…
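The difference can be checked with a short sketch using copy() and view(). The array values are made up; the base attribute is None for an array that owns its data.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

c = arr.copy()  # new array with its own data
v = arr.view()  # new array object looking at the same data

# Change the original: the view sees it, the copy does not.
arr[0] = 42

# Change the view: the original is affected too.
v[1] = 99

print("original:", arr)
print("copy:", c)
print("view:", v)

# The copy owns its data (base is None); the view refers back to arr.
print(c.base is None)
print(v.base is arr)
```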

In our previous article we saw how to create an array using NumPy. Once an array is created, we must be able to access its elements. In this article we’ll see how to access an array by indexing, how to slice an array, and some other functions that are involved in the creation of arrays.

Numpy Indexing:

We can access an array through array indexing, by referring to its index number.

Indexing in NumPy starts at 0.
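A short sketch of indexing and slicing. The arrays are made up for illustration.

```python
import numpy as np

arr = np.array([10, 20, 30, 40])

first = arr[0]    # first element (index 0)
last = arr[-1]    # negative indices count from the end
middle = arr[1:3] # slice: indices 1 and 2, the end index is excluded

grid = np.array([[1, 2, 3], [4, 5, 6]])
cell = grid[1, 2] # second row, third column

print(first, last, middle, cell)
```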

Sumangali Tamilselvan

Aspiring Data Scientist
