PANDAS FOUNDATIONS

Reading and cleaning the data

Case study
Comparing observed weather data from two sources

Source: National Oceanic & Atmospheric Administration, www.noaa.gov/climate


Climate normals of Austin, TX from 1981-2010

Source: National Oceanic & Atmospheric Administration, www.noaa.gov/climate


Weather data of Austin, TX from 2011

Source: National Oceanic & Atmospheric Administration, www.noaa.gov/climate


Reminder: read_csv()
Useful keyword options
names: assigning column labels
index_col: assigning index
parse_dates: parsing datetimes
na_values: parsing NaNs
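
The four keyword options above can be combined in a single call. The following is a minimal sketch, not the course's own code: the in-memory file, the column names, and the `-9999` missing-value marker are all assumptions made for illustration.

```python
import io

import pandas as pd

# Hypothetical raw file: no header row, and -9999 marks a missing reading.
raw = io.StringIO(
    "2011-01-01 00:00,51.0,38.0\n"
    "2011-01-01 01:00,-9999,37.5\n"
    "2011-01-01 02:00,49.5,-9999\n"
)

weather = pd.read_csv(
    raw,
    names=["Date", "Temperature", "DewPoint"],  # assign column labels
    index_col="Date",                           # use the Date column as the index
    parse_dates=True,                           # parse the index as datetimes
    na_values=["-9999"],                        # treat the marker as NaN
)

print(weather)
print(weather.index.dtype)
```

With a datetime index in place, the date-based selections covered later in the course become available on the resulting DataFrame.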

Let's practice!

Statistical exploratory data analysis

Reminder: time series


Index selection by datetime
Partial datetime selection
Slicing ranges of datetimes

In [1]: climate2010.loc['2010-05-31 22:00:00'] # Single datetime

In [2]: climate2010.loc['2010-06-01'] # Entire day

In [3]: climate2010.loc['2010-04'] # Entire month

In [4]: climate2010.loc['2010-09':'2010-10'] # 2 months
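
To try these selections without the course data, here is a minimal sketch on a synthetic hourly DataFrame. The `Temperature` values are made up, and the sketch uses `.loc` for every lookup because label-based row selection through `.loc` works across pandas versions.

```python
import numpy as np
import pandas as pd

# Synthetic hourly readings covering all of 2010 (illustrative values only).
index = pd.date_range("2010-01-01 00:00", "2010-12-31 23:00", freq="h")
climate2010 = pd.DataFrame(
    {"Temperature": np.linspace(40.0, 90.0, len(index))},
    index=index,
)

one_reading = climate2010.loc["2010-05-31 22:00:00"]  # the row at that timestamp
one_day = climate2010.loc["2010-06-01"]               # every hour of June 1
one_month = climate2010.loc["2010-04"]                # every hour of April
two_months = climate2010.loc["2010-09":"2010-10"]     # September through October

print(len(one_day), len(one_month), len(two_months))
```

Note that a datetime slice includes both endpoints, so the last selection runs through the final hour of October.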


Reminder: statistics methods


Methods for computing statistics:
describe(): summary
mean(): average
count(): counting entries
median(): median
std(): standard deviation
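
As a quick illustration of the methods listed above, the following sketch runs them on a small hand-made Series. The values are invented for the example; the point to notice is that these methods skip missing values by default.

```python
import pandas as pd

# Five hypothetical temperature readings, one of them missing.
temps = pd.Series([50.0, 52.0, None, 58.0, 60.0], name="Temperature")

n = temps.count()         # number of non-missing entries
average = temps.mean()    # NaN is skipped by default
middle = temps.median()   # median of the four valid readings
spread = temps.std()      # sample standard deviation (ddof=1)

print(n, average, middle, spread)
print(temps.describe())   # count, mean, std, min, quartiles, max in one call
```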

Let's practice!

Visual exploratory data analysis

Line plots in pandas


In [1]: import matplotlib.pyplot as plt

In [2]: climate2010.Temperature['2010-07'].plot()

In [3]: plt.title('Temperature (July 2010)')

In [4]: plt.show()

Histograms in pandas
In [5]: climate2010['DewPoint'].plot(kind='hist', bins=30)

In [6]: plt.title('Dew Point distribution (2010)')

In [7]: plt.show()

Box plots in pandas


In [8]: climate2010['DewPoint'].plot(kind='box')

In [9]: plt.title('Dew Point distribution (2010)')

In [10]: plt.show()

Subplots in pandas
In [11]: climate2010.plot(kind='hist', density=True, subplots=True)

In [12]: plt.show()
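
For a version that runs without the course data, here is a minimal sketch of normalized histogram subplots on synthetic columns. The column names and random values are assumptions, and the sketch passes `density=True`, the keyword current matplotlib releases accept for normalized histograms. The `Agg` backend is selected only so the code runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; omit this in a notebook

import numpy as np
import pandas as pd

# Synthetic readings standing in for two weather columns (illustrative only).
rng = np.random.default_rng(0)
climate = pd.DataFrame(
    {
        "Temperature": rng.normal(70.0, 10.0, 500),
        "DewPoint": rng.normal(55.0, 8.0, 500),
    }
)

# One normalized histogram per column, each drawn on its own Axes.
axes = climate.plot(kind="hist", density=True, subplots=True, bins=30)

print(len(axes))  # one Axes object per column
# In an interactive session, call matplotlib.pyplot.show() to display the figure.
```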

Let's practice!

Final thoughts

You can now


Import many types of datasets and deal with import issues
Export data to facilitate collaborative data science
Perform statistical and visual EDA natively in pandas

See you in the next course!
