When Amazon ML created the datasource, it ran a basic statistical analysis of the variables. For each variable, depending on its type, it computed the following information:
- Correlation of each attribute to the target
- Number of missing values
- Number of invalid values
- Distribution of numeric variables with histogram and box plot
- Range, mean, and median for numeric variables
- Most and least frequent categories for categorical variables
- Word counts for text variables
- Percentage of true values for binary variables
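To get a feel for what these statistics mean, here is a minimal local sketch in plain Python that computes several of them for a single column. It is not how Amazon ML computes them internally. The `summarize` and `pearson` helpers, the use of `None` for missing values, and the toy columns are all assumptions made for illustration. Invalid-value counts and histograms are left out.

```python
from collections import Counter
from statistics import mean, median


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


def summarize(values, kind, target=None):
    """Summarize one column (hypothetical helper).

    `kind` is 'numeric', 'categorical', 'binary', or 'text'.
    Missing values are represented as None (an assumption of this sketch).
    """
    present = [v for v in values if v is not None]
    stats = {"missing": len(values) - len(present)}
    if kind == "numeric":
        stats.update(min=min(present), max=max(present),
                     mean=mean(present), median=median(present))
        if target is not None:
            # Correlate only the rows where the attribute is present.
            pairs = [(v, t) for v, t in zip(values, target) if v is not None]
            stats["correlation"] = pearson(*zip(*pairs))
    elif kind == "categorical":
        ranked = Counter(present).most_common()
        stats.update(most_frequent=ranked[0][0], least_frequent=ranked[-1][0])
    elif kind == "binary":
        stats["pct_true"] = 100.0 * sum(1 for v in present if v) / len(present)
    elif kind == "text":
        stats["word_counts"] = Counter(
            w for v in present for w in v.lower().split())
    return stats


# Hypothetical toy columns, not taken from any real dataset.
price = [22, 38, None, 35, 54]
label = [0, 1, 1, 1, 0]
color = ["red", "blue", "red", "red", "green"]

price_stats = summarize(price, "numeric", target=label)
color_stats = summarize(color, "categorical")
label_stats = summarize(label, "binary")
```

Here `price_stats` reports one missing value along with the range, mean, median, and correlation with `label`, while `color_stats` and `label_stats` hold the category frequencies and the percentage of true values.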
Go to the Datasource dashboard and click the datasource you just created to open its data summary page. The left side menu lets you access data statistics for the target and different attributes, ...