Cleansing the dataset

Preface

If you have built a dataset for training a classification model, you must cleanse the data before modeling. After creating the dataset, perform the following steps:

- Cleanse the dataset
  - Optionally remove variables that contain missing values
  - Remove variables that have only one unique value
  - Remove categorical variables that have a large number of levels
  - Convert character variables to categorical variables
- Split the data into a train set and a test set
- Model, evaluate, and predict
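
To make the cleansing and splitting steps concrete, here is a minimal sketch in base R. It is not the alookr implementation: the helper names `cleanse_sketch` and `split_sketch`, the `max_level_ratio` threshold (unique values divided by number of rows), and the toy data are all assumptions introduced for illustration.

```r
# Hypothetical helpers in base R, for illustration only (not alookr's code).
cleanse_sketch <- function(df, max_level_ratio = 0.1, drop_missing = FALSE) {
  # 1. Optionally drop variables that contain missing values.
  if (drop_missing) {
    has_na <- vapply(df, anyNA, logical(1))
    df <- df[, !has_na, drop = FALSE]
  }

  # 2. Drop variables that have only one unique value.
  n_unique <- vapply(df, function(x) length(unique(x)), integer(1))
  df <- df[, n_unique > 1, drop = FALSE]

  # 3. Drop character/factor variables with too many levels.
  #    "Too many" is assumed to mean: unique values / rows > max_level_ratio.
  is_cat <- vapply(df, function(x) is.character(x) || is.factor(x), logical(1))
  level_ratio <- vapply(df, function(x) length(unique(x)) / length(x), numeric(1))
  df <- df[, !(is_cat & level_ratio > max_level_ratio), drop = FALSE]

  # 4. Convert the remaining character variables to factors.
  is_chr <- vapply(df, is.character, logical(1))
  for (nm in names(df)[is_chr]) {
    df[[nm]] <- factor(df[[nm]])
  }
  df
}

# Randomly split rows into a train set and a test set.
split_sketch <- function(df, ratio = 0.7, seed = 1) {
  set.seed(seed)
  n <- nrow(df)
  train_idx <- sample.int(n, size = floor(ratio * n))
  test_idx <- setdiff(seq_len(n), train_idx)
  list(train = df[train_idx, , drop = FALSE],
       test  = df[test_idx, , drop = FALSE])
}

# Toy data (made up for this example).
toy <- data.frame(
  id       = paste0("id", 1:10),   # character, every value distinct
  constant = rep(1, 10),           # only one unique value
  grade    = rep(c("a", "b"), 5),  # character with two levels
  score    = c(1:9, NA),           # numeric with one missing value
  stringsAsFactors = FALSE
)

cleaned <- cleanse_sketch(toy, max_level_ratio = 0.5)
parts   <- split_sketch(cleaned, ratio = 0.7)
```

In this toy run, `id` and `constant` are dropped and `grade` becomes a factor. `score` is kept because removing variables with missing values is off by default; pass `drop_missing = TRUE` to remove it as well.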

The alookr package makes these steps fast and easy.

How to cleanse the dataset

For details on how to cleanse the dataset, refer to the following website.

Cleansing the dataset

Choonghyun Ryu

2026-01-11
