Datasets

Iris Flower Dataset

https://www.kaggle.com/datasets/arshid/iris-flower-dataset

Boston Housing Dataset

https://www.kaggle.com/code/prasadperera/the-boston-housing-dataset/data

Oil Spill Dataset

https://www.kaggle.com/datasets/ashrafkhan94/oil-spill?select=oil-spill.csv

Horse Colic Dataset

https://raw.githubusercontent.com/jbrownlee/Datasets/master/horse-colic.csv

https://www.kaggle.com/datasets/uciml/horse-colic

Breast Cancer Categorical Dataset

https://raw.githubusercontent.com/jbrownlee/Datasets/master/breast-cancer.csv

Pima Indians Dataset

The Pima Indians dataset is used to demonstrate data loading in this lesson, and it will be used again in many of the lessons to come. It contains medical records for Pima Indian patients, together with whether or not each patient had an onset of diabetes within five years. As such, it is a binary classification problem. It is a good dataset for demonstration because all of the input attributes are numeric and the output variable to be predicted is binary (0 or 1).
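
Because every input attribute is numeric and the output is 0 or 1, loading a dataset of this shape can be sketched with the Python standard library alone. The example below is a minimal sketch, not the lesson's own loader: the function name `load_numeric_csv` is hypothetical, and it assumes a headerless CSV with the class label in the last column. The two sample rows are made-up placeholders, not real patient records.

```python
import csv
import io


def load_numeric_csv(source):
    """Read a headerless CSV of numeric columns.

    Assumption: the last column holds the binary class label (0 or 1)
    and every other column is a numeric input attribute.
    """
    inputs, labels = [], []
    for row in csv.reader(source):
        if not row:  # skip blank lines
            continue
        values = [float(cell) for cell in row]
        inputs.append(values[:-1])      # all columns except the last
        labels.append(int(values[-1]))  # last column as an integer label
    return inputs, labels


# Made-up placeholder rows (NOT real records), used only to exercise the loader.
sample = io.StringIO("1.0,85.0,0\n8.0,183.0,1\n")
X, y = load_numeric_csv(sample)
print(len(X), "rows,", len(X[0]), "input attributes, labels:", y)
```

To load a real file, pass an open file handle instead of the in-memory sample, for example one obtained with `open(path, newline="")`, where `path` points to your local copy of the dataset.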