Pandas is a software library written for the Python programming language for data manipulation and analysis.
Pandas was developed by Wes McKinney in 2008.
Pandas is an open source, BSD-licensed Python library.
The name is derived from the term “panel data”, an econometrics term for multidimensional structured data sets.
It is a fast, powerful, flexible and easy-to-use open source data analysis and manipulation tool, built on top of the Python programming language.
Pandas uses much of NumPy's functionality. I suggest that beginners go through my previous articles (NumPy) before proceeding.
1. Fast and efficient data…
Seaborn is an incredible Python data visualization library built on top of Matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.
Seaborn is commonly used to visualize the distribution of random data.
It’s one of the best visualization packages in any tool or language. It gives us the capability to create improved data visuals.
If you have Python and pip already installed on a system, install seaborn using the command below:
pip install seaborn
If you use Jupyter, install seaborn using the following command:
!pip install seaborn
Import the pyplot module from Matplotlib:
import matplotlib.pyplot …
A data distribution is a function or a listing which shows all the possible values (or intervals) of the data and how often each value occurs.
Data distribution is important when working with statistics and data science.
NumPy's random module offers methods that return randomly generated data distributions.
A random distribution is a set of random numbers that follows a certain probability density function.
What is Probability Density Function?
A function that describes a continuous probability distribution, i.e. the probability of all values in an array.
We can generate such a distribution using the choice() method.
Probability is set by a number between 0 and…
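As a minimal sketch of the choice() approach, assuming NumPy's `numpy.random.choice` is the method meant here, the snippet below draws 100 values with fixed probabilities. The values and probabilities are made up for illustration; the probabilities must sum to 1.

```python
import numpy as np

# Hypothetical values and probabilities, chosen only for illustration.
values = [3, 5, 7, 9]
probabilities = [0.1, 0.3, 0.6, 0.0]  # must sum to 1; 9 can never be drawn

# Draw 100 samples; each value shows up roughly in proportion to its probability.
samples = np.random.choice(values, p=probabilities, size=100)
print(samples)
```

Because 9 has probability 0, it never appears in the result, while 7 should appear most often.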
Does a random number simply mean a different number every time?
Random numbers are numbers that cannot be predicted logically.
Computers work on programs, and programs are sets of instructions, which means there must be some algorithm to generate a random number. If a program generates a random number, that number can be predicted, so it is not truly random. Such numbers are called pseudo-random.
Python’s random module is pseudo-random.
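One way to see the pseudo-random behaviour is to fix the seed of Python's built-in random module: the same seed reproduces the same sequence. This is a small sketch; the seed value 42 and the range 0 to 100 are arbitrary choices.

```python
import random

# Seed the generator, then draw five integers.
random.seed(42)
first = [random.randint(0, 100) for _ in range(5)]

# Re-seeding with the same value replays the same sequence.
random.seed(42)
second = [random.randint(0, 100) for _ in range(5)]

print(first == second)  # True
```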
NumPy has a random module for working with random numbers.
These random numbers can be generated as integers or floats.
Generate Random Numbers:
For integers, we use randint()
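A short sketch of both cases, assuming NumPy's random module. `randint()` is named in the text; `rand()` is added here as one common way to get a float, and the upper bound 100 is arbitrary.

```python
from numpy import random

# Random integer from 0 up to (but not including) 100.
x = random.randint(100)

# Random float in the half-open interval [0.0, 1.0).
y = random.rand()

# The size argument returns an array of random integers instead of a single value.
arr = random.randint(100, size=5)

print(x, y, arr)
```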
NumPy has quite a few handy statistical functions for finding the minimum, maximum, mean, median, standard deviation, etc. of the elements in an array.
The functions are described in this article.
mean: This will return the arithmetic mean along the specified axis.
median: This will return the median along the specified axis.
average: This will return the weighted average along the specified axis.
std: This will return the standard deviation along the specified axis.
var: This will return the variance along the specified axis.
Example 1: Without axis (1-D array)
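A minimal sketch of the 1-D case, using a made-up array and made-up weights. With no axis given, each function reduces the whole array to a single number.

```python
import numpy as np

# Hypothetical sample data for illustration.
arr = np.array([1, 2, 3, 4, 5])

mean_val = np.mean(arr)      # arithmetic mean: 3.0
median_val = np.median(arr)  # middle value: 3.0

# Weighted average: the last element counts six times as much as the others.
avg_val = np.average(arr, weights=[1, 1, 1, 1, 6])  # (1+2+3+4+30)/10 = 4.0

var_val = np.var(arr)  # population variance: 2.0
std_val = np.std(arr)  # square root of the variance, about 1.414

print(mean_val, median_val, avg_val, var_val, std_val)
```

Note that `np.var` and `np.std` compute the population statistics by default (`ddof=0`).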
Sorting means putting elements in an ordered sequence.
An ordered sequence is any sequence whose elements follow a defined order, such as numeric or alphabetical, ascending or descending.
NumPy has a function called sort() that sorts a specified array.
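A brief sketch of `np.sort()` on made-up numeric and string arrays. The function returns a sorted copy and leaves the input unchanged; reversing the result with a slice is one simple way to get descending order.

```python
import numpy as np

# Hypothetical arrays for illustration.
numbers = np.array([3, 1, 2, 0])
words = np.array(["banana", "cherry", "apple"])

sorted_numbers = np.sort(numbers)        # ascending numeric order
sorted_words = np.sort(words)            # alphabetical order
descending = np.sort(numbers)[::-1]      # reverse the sorted copy

print(sorted_numbers.tolist())  # [0, 1, 2, 3]
print(sorted_words.tolist())    # ['apple', 'banana', 'cherry']
print(descending.tolist())      # [3, 2, 1, 0]
print(numbers.tolist())         # [3, 1, 2, 0] (original is untouched)
```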
Iterating means going through elements one by one.
Since we work with multi-dimensional arrays in NumPy, we can iterate over them with a for loop.
For a 1-D array, iteration goes through the elements one by one.
import numpy as np
arr = np.array([1, 2, 3])  # example values (assumed)
print("Original array:", arr)
print("After iteration:")
for x in arr:
    print(x)  # print each element in turn
When executed, some functions return a copy of the input array, while others return a view.
When the contents are physically stored in another location, it is called a copy. When a different view of the same memory content is provided, it is called a view.
The copy owns the data: changes made to the copy do not affect the original array, and changes made to the original array do not affect the copy.
The view does not own the data, and any changes made to the view will affect the original array…
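The difference between a copy and a view can be sketched with `ndarray.copy()` and `ndarray.view()`. The array values are made up; the `base` attribute is used here to check which array owns its data.

```python
import numpy as np

# Hypothetical array for illustration.
arr = np.array([1, 2, 3, 4, 5])

c = arr.copy()  # independent data stored in a separate location
v = arr.view()  # new array object looking at the same memory

arr[0] = 42     # change the original
v[1] = 7        # change the view

print(c.tolist())    # [1, 2, 3, 4, 5]: the copy is unaffected
print(v.tolist())    # [42, 7, 3, 4, 5]: the view sees the change to arr
print(arr.tolist())  # [42, 7, 3, 4, 5]: arr sees the change made through v

# A copy owns its data (base is None); a view refers back to the original.
print(c.base is None)  # True
print(v.base is arr)   # True
```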
In our previous article we saw how to create an array using NumPy. Once an array is created, we must be able to access its elements. In this article we’ll see how to access an array by indexing and slicing, along with some other functions that are involved in the creation of arrays.
We can access an array through array indexing, by referring to its index number.
Indexing in NumPy starts at 0.
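A small sketch of zero-based indexing and slicing, using made-up arrays. Negative indices count from the end, and a 2-D array takes one index per dimension.

```python
import numpy as np

# Hypothetical arrays for illustration.
arr = np.array([10, 20, 30, 40, 50])
grid = np.array([[1, 2, 3], [4, 5, 6]])

first = arr[0]     # index 0 is the first element: 10
last = arr[-1]     # negative index counts from the end: 50
middle = arr[1:4]  # slice from index 1 up to (not including) 4: 20, 30, 40
item = grid[1, 2]  # row 1, column 2 of the 2-D array: 6

print(first, last, middle.tolist(), item)
```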