
Machine Learning Solutions by Jalaj Thanaki


Understanding the dataset

This section describes the input dataset used to develop the application. You can find the dataset at https://github.com/jalajthanaki/credit-risk-modelling/tree/master/data.

Let's look at the dataset and its attributes in detail. The dataset contains the following files:

  • cs-training.csv
    • This is our training dataset. Its records are used to train the machine learning models.
  • cs-test.csv
    • This is our testing dataset. Its records are used to test the machine learning models.
  • Data Dictionary.xls
    • This is our data dictionary. It describes each attribute of the dataset.
  • sampleEntry.csv
    • This file gives us an idea about the format in which we need to generate ...
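
Before working with the attributes, it can help to take a quick look at the shape of each CSV file. The following is a minimal sketch, not the book's own code: it uses only the Python standard library, the helper name summarize_csv is hypothetical, and treating empty strings and "NA" as missing-value markers is an assumption that should be checked against the data dictionary.

```python
import csv
from pathlib import Path

# File names as listed above; where they live on disk is up to you.
TRAIN_FILE = "cs-training.csv"
TEST_FILE = "cs-test.csv"


def summarize_csv(path, missing_markers=("", "NA")):
    """Return row count, column names, and missing-value counts for a CSV file.

    missing_markers is an assumption about how missing values are encoded;
    adjust it after consulting the data dictionary.
    """
    path = Path(path)
    with path.open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        columns = list(reader.fieldnames or [])
        missing = {name: 0 for name in columns}
        rows = 0
        for row in reader:
            rows += 1
            for name in columns:
                value = row.get(name)
                # A short row yields None; otherwise compare the trimmed text.
                if value is None or value.strip() in missing_markers:
                    missing[name] += 1
    return {"rows": rows, "columns": columns, "missing": missing}
```

Assuming the repository's data folder has been downloaded to a local directory named data, calling summarize_csv(Path("data") / TRAIN_FILE) returns a dictionary with the number of records, the column names, and the count of missing values per column. The same call with TEST_FILE summarizes the testing dataset.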
