import numpy
import pandas

Pandas
Pandas provides data structures and functionality to quickly manipulate and analyze data. The key to understanding Pandas for machine learning is understanding the Series and DataFrame data structures.
Series
A Series is a one-dimensional array of data whose rows are labeled by an index. The labels can be strings, integers, or timestamps; a time axis is a common choice but not a requirement.
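myarray = numpy.array([1, 2, 3])
rowname = ['a', 'b', 'c']
myseries = pandas.Series(myarray, index=rowname)
print(myseries)
a    1
b    2
c    3
dtype: int64

# Access by position; plain myseries[0] on a label-indexed Series is deprecated in recent pandas versions
print(myseries.iloc[0])
1

# Access by label
print(myseries['a'])
1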
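To make the difference between access by position and access by label explicit, here is a minimal sketch that rebuilds the same Series and uses the `.iloc` and `.loc` indexers. The variable names `first_by_position`, `first_by_label`, and `subset` are illustrative and do not come from the original example.

```python
import numpy
import pandas

# Same data as in the example above: three values labeled 'a', 'b', 'c'.
myseries = pandas.Series(numpy.array([1, 2, 3]), index=['a', 'b', 'c'])

first_by_position = myseries.iloc[0]   # element at position 0
first_by_label = myseries.loc['a']     # element whose label is 'a'
subset = myseries.loc[['a', 'c']]      # a list of labels returns a new Series

print(first_by_position, first_by_label)
print(subset.tolist())
```

Using `.iloc` for positions and `.loc` for labels avoids ambiguity when the index itself contains integers.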
DataFrame
myarray = numpy.array([[1, 2, 3], [4, 5, 6]])
rows = ['a', 'b']
columns = ['one', 'two', 'three']
mydataframe = pandas.DataFrame(myarray, index=rows, columns=columns)
print(mydataframe)
   one  two  three
a    1    2      3
b    4    5      6

print(f'Method 1:\nColumn one:\n{mydataframe.one}')
Method 1:
Column one:
a    1
b    4
Name: one, dtype: int64

print('Method 2:\nColumn one:\n%s' % mydataframe['one'])
Method 2:
Column one:
a    1
b    4
Name: one, dtype: int64
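Both methods above select a column. As a complementary sketch, the snippet below rebuilds the same DataFrame and shows that a column is itself a Series, and that rows and single cells can be selected with `.loc` (by label) and `.iloc` (by position). The names `column_one`, `row_a`, `cell`, and `first_row` are illustrative only.

```python
import numpy
import pandas

# Same 2x3 table as in the example above.
mydataframe = pandas.DataFrame(
    numpy.array([[1, 2, 3], [4, 5, 6]]),
    index=['a', 'b'],
    columns=['one', 'two', 'three'],
)

column_one = mydataframe['one']        # a Series indexed by the row labels
row_a = mydataframe.loc['a']           # a Series indexed by the column labels
cell = mydataframe.loc['b', 'three']   # a single value by row and column label
first_row = mydataframe.iloc[0]        # the same row as 'a', selected by position

print(type(column_one).__name__)
print(row_a.tolist())
print(cell)
```

Bracket access (`mydataframe['one']`) works for any column name, whereas attribute access (`mydataframe.one`) only works when the name is a valid Python identifier that does not clash with an existing DataFrame attribute or method.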