
Python Pandas:

Pandas is a software library written for the Python programming language for data manipulation and analysis.

Pandas was developed by Wes McKinney in 2008.

What is Pandas?

Pandas is an open source, BSD-licensed Python library.

The name is derived from the term “panel data”, an econometrics term for multidimensional structured data sets.

It is a fast, powerful, flexible, and easy-to-use open source data analysis and manipulation tool, built on top of the Python programming language.

Note:

The Pandas library uses much of NumPy's functionality. I suggest that beginners go through my previous articles on NumPy before proceeding.

Why Pandas?

1. Fast and efficient data frame objects. …



What is Seaborn?

Seaborn is an incredible Python data visualization library built on top of matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.

Seaborn is used to visualize random distributions.

It’s one of the best visualization packages in any tool or language. It gives us the capability to create improved data visuals.

Installation of Seaborn:

If you already have Python and pip installed on your system, install Seaborn using the command below:

pip install seaborn

If you use Jupyter, install Seaborn using the following command:

!pip install seaborn

Import Matplotlib:

Import the pyplot module of matplotlib:

import matplotlib.pyplot …



What is Data distribution?

A data distribution is a function or a listing which shows all the possible values (or intervals) of the data and how often each value occurs.

Data distribution is important when working with statistics and data science.

The random module offers methods that return randomly generated data distributions.

Random Distribution:

A random distribution is a set of random numbers that follows a certain probability density function.

What is Probability Density Function?

A function that describes a continuous probability, i.e. the probability of all values in an array.

We can generate this using the choice() method.

Probability is set by a number between 0 and…
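
As a rough illustration of the choice() idea above, here is a minimal sketch. It assumes the method in question is NumPy's random choice(); the values, probabilities, and seed are made up for the example.

```python
import numpy as np

# Assumption: the choice() mentioned above is NumPy's random choice().
rng = np.random.default_rng(seed=0)  # seeded so the run is repeatable

values = [3, 5, 7, 9]
probabilities = [0.1, 0.3, 0.6, 0.0]  # must sum to 1; 9 can never be drawn

# Draw 100 values, each picked according to its probability
sample = rng.choice(values, p=probabilities, size=100)
print(sample)
```

Because the probability of 9 is 0, it should never appear in the sample, while 7 should appear most often.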



What is a Random Number?

A different number every time is called a random number???

NOOOO…… then??🤔

Random numbers are numbers that cannot be predicted logically.

Computers work on programs, and programs are sets of instructions, which means there must be some algorithm to generate a random number. If a program generates the number, the number can be predicted, so it is not truly random. Such numbers are called pseudo-random.

Python’s random module is pseudo-random.
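
One way to see this predictability is to seed the generator. The sketch below uses Python's built-in random module; the seed value 42 and the range 1 to 100 are arbitrary choices for the example.

```python
import random

# Same seed, same algorithm, so the same "random" numbers come out.
random.seed(42)
first_run = [random.randint(1, 100) for _ in range(5)]

random.seed(42)
second_run = [random.randint(1, 100) for _ in range(5)]

print(first_run)
print(second_run)
print(first_run == second_run)  # True
```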

Numpy Random

NumPy has a random module to work with random numbers.

These random numbers can be generated as integers or floats.

Generate Random Numbers:

For integers, we use randint().

For floats, we use rand(), which returns a random float between 0 and 1. …
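
A minimal sketch of both calls, assuming the usual `import numpy as np` alias; the upper bound of 100 and the array size of 5 are arbitrary.

```python
import numpy as np

x = np.random.randint(100)            # random integer from 0 up to, but not including, 100
f = np.random.rand()                  # random float from 0 up to, but not including, 1
ints = np.random.randint(100, size=5) # 1-D array of 5 random integers

print(x)
print(f)
print(ints)
```

The output changes on every run, since no seed is set here.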



NumPy has quite a few handy statistical functions for finding the minimum, maximum, mean, median, standard deviation, and more from the elements of an array.

The functions are described in this article.

Statistical Functions

To find Averages and variances:

mean: This will return the arithmetic mean along the specified axis.

median: This will return the median along the specified axis.

average: This will return the weighted average along the specified axis.

std: This will return the standard deviation along the specified axis.

var: This will return the variance along the specified axis.

Example1: Without axis (1-D array)
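
A minimal sketch of these functions on a 1-D array. The sample array and the weights passed to average() are made up for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

mean_value = np.mean(arr)        # arithmetic mean: 3.0
median_value = np.median(arr)    # middle value: 3.0
weighted = np.average(arr, weights=[5, 4, 3, 2, 1])  # 35 / 15
std_value = np.std(arr)          # square root of the variance
var_value = np.var(arr)          # 2.0

print(mean_value, median_value, weighted, std_value, var_value)
```

Without weights, average() gives the same result as mean().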


Example2: With axis

MEAN:
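
A small sketch of mean() with the axis argument, using a made-up 2-D array. Here axis=0 works down the columns and axis=1 works across the rows.

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])

mean_by_column = np.mean(arr, axis=0)  # one mean per column
mean_by_row = np.mean(arr, axis=1)     # one mean per row

print(mean_by_column)  # [2.5 3.5 4.5]
print(mean_by_row)     # [2. 5.]
```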


MEDIAN:
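
The same idea for median(), again with a made-up 2-D array:

```python
import numpy as np

arr = np.array([[1, 7, 3],
                [4, 5, 12]])

median_by_column = np.median(arr, axis=0)  # medians of (1, 4), (7, 5), (3, 12)
median_by_row = np.median(arr, axis=1)     # medians of each row

print(median_by_column)  # values 2.5, 6.0, 7.5
print(median_by_row)     # values 3.0, 5.0
```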


To find minimum and maximum:

amin: This will return the minimum of an array.

amax: This will return the maximum of an array.
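
A minimal sketch of amin() and amax(), with and without an axis. The sample array is made up for the example.

```python
import numpy as np

arr = np.array([[3, 7, 5],
                [8, 4, 3],
                [2, 4, 9]])

smallest = np.amin(arr)               # minimum of the whole array
largest = np.amax(arr)                # maximum of the whole array
min_by_column = np.amin(arr, axis=0)  # minimum of each column
max_by_row = np.amax(arr, axis=1)     # maximum of each row

print(smallest, largest)  # 2 9
print(min_by_column)      # [2 4 3]
print(max_by_row)         # [7 8 9]
```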


To find percentile:

percentile: This will return the percentile of an array.
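
A short sketch of percentile() on a made-up array. The second argument is the percentile to compute, from 0 to 100, and the 50th percentile matches the median.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

p25 = np.percentile(arr, 25)  # 20.0
p50 = np.percentile(arr, 50)  # 30.0, same as the median
p75 = np.percentile(arr, 75)  # 40.0

print(p25, p50, p75)
```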


With this we have come to the end of this article.

Happy coding….😊😊😊



Numpy Sort

Sorting means putting elements in an ordered sequence.

An ordered sequence is any sequence whose elements follow an order, such as numeric or alphabetical, ascending or descending.

NumPy has a function called sort() that will sort a specified array.
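
A minimal sketch of sort() on a made-up numeric array:

```python
import numpy as np

arr = np.array([3, 2, 0, 1])
sorted_arr = np.sort(arr)  # returns a new, sorted array

print(sorted_arr)  # [0 1 2 3]
print(arr)         # [3 2 0 1]
```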


This function returns a sorted copy of the array, leaving the original unchanged.

We can also sort arrays of strings or any other data type.
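
For instance, here is a small sketch with strings and booleans; the sample values are made up. Strings are sorted alphabetically and False sorts before True.

```python
import numpy as np

fruits = np.array(["banana", "cherry", "apple"])
flags = np.array([True, False, True])

sorted_fruits = np.sort(fruits)  # alphabetical order
sorted_flags = np.sort(flags)    # False comes before True

print(sorted_fruits)
print(sorted_flags)
```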




Numpy Join:

Joining means putting content of two or more arrays in a single array.

We do this using concatenate(), optionally passing an axis; if the axis is not passed, it is taken as 0.
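
A minimal sketch of concatenate() on two made-up 1-D arrays, relying on the default axis of 0:

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# The arrays to join are passed together as a tuple
joined = np.concatenate((arr1, arr2))

print(joined)  # [1 2 3 4 5 6]
```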


With axis:
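
A small sketch joining two made-up 2-D arrays along axis=1, which extends each row instead of adding new rows:

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

joined_rows = np.concatenate((arr1, arr2), axis=1)

print(joined_rows)
# [[1 2 5 6]
#  [3 4 7 8]]
```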


Joining array using stack:

Stacking is used to join arrays of the same dimension along a new axis.

There are three types of stacking:

· Horizontal stacking

· Vertical stacking

· Height stacking

Horizontal stacking:

Horizontal stacking is done along the rows: the arrays are joined side by side.
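
A minimal sketch, assuming NumPy's hstack() is the function used for this; the sample arrays are made up.

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

h_stacked = np.hstack((arr1, arr2))  # arrays placed side by side

print(h_stacked)  # [1 2 3 4 5 6]
```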


Vertical stacking:

Vertical stacking is done along the columns: the arrays are placed one on top of another.
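
A minimal sketch, assuming NumPy's vstack() is the function used for this; the sample arrays are made up.

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

v_stacked = np.vstack((arr1, arr2))  # each input becomes one row

print(v_stacked)
# [[1 2 3]
#  [4 5 6]]
```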


Height stacking:

Height stacking is used to stack along the height, which is the same as the depth (the third axis).
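
A minimal sketch, assuming NumPy's dstack() is the function used for this; the sample arrays are made up. Elements at the same position are paired up along a third axis.

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

d_stacked = np.dstack((arr1, arr2))

print(d_stacked)
# [[[1 4]
#   [2 5]
#   [3 6]]]
print(d_stacked.shape)  # (1, 3, 2)
```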


Numpy split:

Splitting is the opposite of joining.

Joining combines multiple arrays into one, whereas splitting breaks one array into multiple arrays. …
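
As a rough sketch of the idea, the example below uses NumPy's array_split(). That function choice is an assumption, since the text above does not name one, and the sample array is made up.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# Assumption: array_split() is used here; it returns a list of arrays
parts = np.array_split(arr, 3)

for part in parts:
    print(part)
# [1 2]
# [3 4]
# [5 6]
```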



Iterating means going through elements one by one.

Because NumPy arrays can be multi-dimensional, we can go through them with Python's basic for loop.

Iterating 1-D:

In a 1-D array, iteration goes through the elements one by one.

import numpy as np
arr = np.array([1, 2, 3])
print("Original array:", arr)
print("After iteration:")
# each pass of the loop yields one element
for x in arr:
    print(x)

Iterating 2-D:

In a 2-D array, iteration goes through the rows one at a time.

import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
print("Original array:\n", arr)
print("After iteration:")
# each pass of the loop yields one row (a 1-D array)
for x in arr:
    print(x)

# to iterate on each scalar:
print("After iterating on each scalar:")
for x in arr:
    for y in x:
        print(y)

Iterating 3-D:

In a 3-D array, iteration goes through the 2-D arrays one at a time.

import numpy as np
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print("Original array:\n", arr)
print("After iteration:")
# each pass of the loop yields one 2-D array
for x in arr:
    print(x)

# to iterate on each…



Some NumPy functions return a copy of the input array, while others return a view.

When the contents are physically stored in another location, the result is called a copy. When the result is instead a different view of the same memory content, we call it a view.

Copy:

The copy owns the data: any changes made to the copy will not affect the original array, and any changes made to the original array will not affect the copy.

View:

The view does not own the data and any changes made to the view will affect the original array, and any changes made to the original will affect the view. …
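
The difference can be sketched with the array methods copy() and view(). The sample array and the replacement values below are made up; the base attribute is used as one way to check whether an array owns its data.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
copied = arr.copy()  # owns its own data
viewed = arr.view()  # shares data with arr

arr[0] = 42     # change the original
viewed[1] = 99  # change the view

print(arr)     # [42 99  3  4  5]
print(copied)  # [1 2 3 4 5]
print(viewed)  # [42 99  3  4  5]

# base is None for an array that owns its data
print(copied.base is None)  # True
print(viewed.base is arr)   # True
```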



In the previous article we saw how to create an array using NumPy. Once an array is created, we need to be able to access its elements. In this article we will see how to access an array by indexing, how to slice an array, and some other functions involved in creating arrays.

Numpy Indexing:

We can access an array through array indexing, by referring to its index number.

Indexes in NumPy start at 0.
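
A minimal sketch with a made-up array:

```python
import numpy as np

arr = np.array([1, 2, 3, 4])

first = arr[0]   # index 0 is the first element
second = arr[1]  # index 1 is the second element

print(first)   # 1
print(second)  # 2
```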


We can also access elements of an array and add them:
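
For example, using a made-up array, the third and fourth elements can be added like this:

```python
import numpy as np

arr = np.array([1, 2, 3, 4])

# indexes 2 and 3 hold the values 3 and 4
total = arr[2] + arr[3]

print(total)  # 7
```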


Accessing 2-D array:

To access an element of a 2-D array, we can use comma-separated integers representing the dimension and the index of the element. …
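
A small sketch with a made-up 2-D array, where the first integer selects the row and the second selects the position within that row:

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 5],
                [6, 7, 8, 9, 10]])

second_in_first_row = arr[0, 1]  # row 0, index 1
last_in_second_row = arr[1, 4]   # row 1, index 4

print(second_in_first_row)  # 2
print(last_in_second_row)   # 10
```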

About

Sumangali Tamilselvan

Aspiring Data Scientist
