import numpy
import pandas
Pandas
Pandas provides data structures and functionality to quickly manipulate and analyze data. The key to understanding Pandas for machine learning is understanding the Series and DataFrame data structures.
Series
A Series is a one-dimensional array of data whose rows are labeled by an index. The labels can be strings, integers, or timestamps.
myarray = numpy.array([1, 2, 3])
rowname = ['a', 'b', 'c']
myseries = pandas.Series(myarray, index=rowname)
print(myseries)
a    1
b    2
c    3
dtype: int64
print(myseries.iloc[0])
1
print(myseries['a'])
1
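The two lookups above reach the same element, once by position and once by label. As a minimal sketch, the explicit .iloc and .loc accessors make that distinction visible in the code. The data and the names values, labels, and series are illustrative assumptions, not part of the original example.

```python
import numpy
import pandas

# Hypothetical data for illustration: the index labels are plain strings.
values = numpy.array([10, 20, 30])
labels = ['x', 'y', 'z']
series = pandas.Series(values, index=labels)

# .loc selects by index label, .iloc selects by integer position.
by_label = series.loc['y']
by_position = series.iloc[1]
print(by_label, by_position)
```

Using .loc and .iloc avoids ambiguity when the index labels are themselves integers.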
DataFrame
A DataFrame is a two-dimensional table of data where both the rows and the columns can be labeled.
myarray = numpy.array([[1, 2, 3], [4, 5, 6]])
rows = ['a', 'b']
columns = ['one', 'two', 'three']
mydataframe = pandas.DataFrame(myarray, index=rows, columns=columns)
print(mydataframe)
   one  two  three
a    1    2      3
b    4    5      6
print(f'Method 1:\nColumn one:\n{mydataframe.one}')
Method 1:
Column one:
a    1
b    4
Name: one, dtype: int64
print('Method 2:\nColumn one:\n%s' % mydataframe['one'])
Method 2:
Column one:
a    1
b    4
Name: one, dtype: int64
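Both methods above select a column. Rows and single cells can be selected in a similar way; the following is a minimal sketch that rebuilds the same DataFrame and uses the .loc and .iloc accessors. The variable names row_a, cell, and first_row are illustrative assumptions.

```python
import numpy
import pandas

# Rebuild the DataFrame from the example above so this block runs on its own.
myarray = numpy.array([[1, 2, 3], [4, 5, 6]])
mydataframe = pandas.DataFrame(myarray, index=['a', 'b'],
                               columns=['one', 'two', 'three'])

row_a = mydataframe.loc['a']         # row by label, returned as a Series
cell = mydataframe.loc['b', 'two']   # single value by row and column label
first_row = mydataframe.iloc[0]      # row by integer position

print(row_a.tolist(), cell, first_row.tolist())
```

Here .loc works with the row and column labels, while .iloc works with integer positions regardless of the labels.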