OOP in a Data Science Pipeline on a Toy Dataset (Titanic Survival Prediction)

Lucas chang
Jan 30, 2021

Class: almost anything can be modeled as a class. A human is a class, an animal is a class, data processing is a class. A class contains attributes (the data it holds) and functions (what it can do).

A Calculator class has attributes, such as the calculator's name, and functions, such as add, minus, times, and division.
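As a minimal sketch of the calculator idea, the class below has one attribute and four functions. The method names follow the text; the argument names `a` and `b` and the example calculator name are assumptions made for illustration.

```python
class Calculator:
    """A toy class with one attribute (name) and four functions."""

    def __init__(self, name):
        # Attribute: data stored on each Calculator object
        self.name = name

    def add(self, a, b):
        return a + b

    def minus(self, a, b):
        return a - b

    def times(self, a, b):
        return a * b

    def division(self, a, b):
        # Raises ZeroDivisionError when b is 0
        return a / b


calc = Calculator("my calculator")  # hypothetical name
print(calc.name)
print(calc.add(2, 3))
print(calc.division(6, 3))
```

Each function is called on an object (`calc`), and the object carries its own attribute values with it.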

Likewise, a data processing class has attributes, such as the numeric data being processed, and functions, such as showing the data's shape and columns, drawing a pairplot, and cleaning the continuous and categorical variables.

The __init__ method runs when an object is created from the class. Use it to set the initial attributes and their starting values.
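To see __init__ at work, here is a small sketch. The `Passenger` class and its attribute names are hypothetical, chosen only to match the Titanic theme. It shows an attribute set from an argument, one with a default value, and one that always starts with the same value.

```python
class Passenger:
    """Hypothetical class used only to illustrate __init__."""

    def __init__(self, name, survived=False):
        self.name = name          # set from a required argument
        self.survived = survived  # set from an argument with a default
        self.notes = []           # always starts as an empty list


p = Passenger("Alice")
print(p.name, p.survived, p.notes)

q = Passenger("Bob", survived=True)
print(q.name, q.survived, q.notes)
```

Creating the list inside __init__ gives every object its own `notes`, so changing one passenger's notes does not affect another's.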

Make a Pipeline class that contains the four functions below.

  • __init__ function: store the raw data as an attribute of the Pipeline class.
  • display data: show the column names, shape, descriptive stats….
  • plot: EDA using a pairplot
  • change column dtype: change the dtype of a column
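The list above could be sketched roughly as follows, assuming the raw data is a pandas DataFrame and the pairplot comes from seaborn. The method names (`display_data`, `plot`, `change_dtype`), their arguments, and the tiny example frame are assumptions made for illustration, not the actual Titanic data.

```python
import pandas as pd


class Pipeline:
    def __init__(self, data):
        # Keep the raw DataFrame as an attribute of the object
        self.data = data

    def display_data(self):
        # Column names, shape, and descriptive statistics
        print("Columns:", list(self.data.columns))
        print("Shape:", self.data.shape)
        print(self.data.describe())

    def plot(self):
        # EDA with a pairplot; seaborn is imported here so the other
        # methods still work when seaborn is not installed
        import seaborn as sns
        return sns.pairplot(self.data)

    def change_dtype(self, column, dtype):
        # Convert one column in place; this also changes the DataFrame
        # that was passed in, because no copy is made in __init__
        self.data[column] = self.data[column].astype(dtype)
        return self.data


# Made-up rows with Titanic-style column names (not real data)
df = pd.DataFrame({
    "Survived": [0, 1, 1],
    "Pclass": [3, 1, 2],
    "Age": [22.0, 38.0, 26.0],
})

pipe = Pipeline(df)
pipe.display_data()
pipe.change_dtype("Pclass", "category")
print(pipe.data.dtypes)
```

Calling `pipe.plot()` would draw the pairplot. It is left out of the example run so the snippet works without a plotting backend.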

Lucas chang

Graduated in applied statistics in Taiwan. Good at machine learning, text mining, deep learning, data analysis...