Pandas

Pandas provides data structures and functionality to quickly manipulate and analyze data. The key to understanding Pandas for machine learning is understanding the Series and DataFrame data structures.

import numpy
import pandas

Series

A Series is a one-dimensional array of data whose rows are labeled by an index. The labels can be any values, such as the strings used below; a time axis is one common choice.

myarray = numpy.array([1,2,3])
rowname = ['a', 'b', 'c']
myseries = pandas.Series(myarray, index=rowname)
print(myseries)
a    1
b    2
c    3
dtype: int64
print(myseries.iloc[0])
1
print(myseries['a'])
1
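
As a minimal sketch of the two access styles, the snippet below selects the same element by label and by position, then builds a second Series with a date index. The values, the variable names and the start date are assumptions chosen only for illustration.

```python
import numpy
import pandas

# Hypothetical values and labels, chosen only for illustration.
values = numpy.array([10, 20, 30])
series = pandas.Series(values, index=['a', 'b', 'c'])

# .loc selects by label, .iloc selects by integer position.
by_label = series.loc['b']
by_position = series.iloc[1]
print(by_label, by_position)

# The index can also hold dates, which gives the Series a time axis.
dates = pandas.date_range('2024-01-01', periods=3, freq='D')
dated = pandas.Series(values, index=dates)
first_day = dated.loc[pandas.Timestamp('2024-01-01')]
print(first_day)
```

Using .loc and .iloc explicitly avoids any ambiguity about whether a value in square brackets is meant as a label or as a position.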

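DataFrame

A DataFrame is a two-dimensional, table-like data structure in which both the rows and the columns can be labeled. A column can be accessed as an attribute or by name in square brackets.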

myarray = numpy.array([[1,2,3], [4,5,6]])
rows = ['a', 'b']
columns = ['one', 'two', 'three']
mydataframe = pandas.DataFrame(myarray, index=rows, columns=columns)
print(mydataframe)
   one  two  three
a    1    2      3
b    4    5      6
print(f'Method 1:\nColumn one:\n{mydataframe.one}')
Method 1:
Column one:
a    1
b    4
Name: one, dtype: int64
print('Method 2:\nColumn one:\n%s' % mydataframe['one'])
Method 2:
Column one:
a    1
b    4
Name: one, dtype: int64
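
Column access is only one way into a DataFrame. The sketch below, which rebuilds the same small table under hypothetical variable names, shows how rows, single cells and several columns at once might be selected with .loc, .iloc and a list of column names.

```python
import numpy
import pandas

# Same data as the example above, rebuilt so this snippet runs on its own.
data = numpy.array([[1, 2, 3], [4, 5, 6]])
frame = pandas.DataFrame(data, index=['a', 'b'], columns=['one', 'two', 'three'])

# A row selected by label or by position comes back as a Series.
row_b = frame.loc['b']
first_row = frame.iloc[0]

# A row label plus a column label selects a single cell.
cell = frame.loc['a', 'three']

# A list of column names returns a smaller DataFrame.
subset = frame[['one', 'three']]

print(row_b)
print(first_row)
print(cell)
print(subset)
```

Selecting a single row or column yields a Series, while selecting several columns keeps the result a DataFrame.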