This session was recorded live at Data Modeling Zone 2016. Daniel D. Gutierrez is the speaker.
Data science involves understanding and preparing the data, defining the statistical learning model, and following the Data Science Process. Statistical learning models can assume many shapes and sizes, depending on their complexity and the application for which they are designed. The first step is to understand what questions you are trying to answer for your organization. The level of detail and complexity of your questions will increase as you become more comfortable with the data science process.
In this session, Daniel covers the most important steps in the data science process, a general formula that data scientists follow when striving for best practices on a data science project: understanding the goal of the project, data access, data munging, exploratory data analysis, feature engineering, model selection, model validation, data visualization, communicating the results, and deploying the solution to production.
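As a rough illustration of how a few of these steps chain together, here is a minimal Python sketch that uses only the standard library. It is not taken from the session: the toy records, the function names (`munge`, `explore`, `fit_mean`, `fit_line`, `mse`, `run_process`), the two candidate models, and the train/holdout split are all assumptions made for the example. It covers only data munging, exploratory data analysis, model selection, and model validation.

```python
from statistics import mean

# Hypothetical toy records standing in for the "data access" step.
RAW = [
    {"x": 1.0, "y": 2.1},
    {"x": 2.0, "y": 3.9},
    {"x": 3.0, "y": None},  # missing target, dropped during munging
    {"x": 4.0, "y": 8.2},
    {"x": 5.0, "y": 9.8},
    {"x": 6.0, "y": 12.1},
    {"x": 7.0, "y": 13.9},
]


def munge(records):
    """Data munging: drop records with a missing field."""
    return [r for r in records if r["x"] is not None and r["y"] is not None]


def explore(records):
    """Exploratory data analysis: a few summary statistics."""
    return {
        "n": len(records),
        "mean_x": mean(r["x"] for r in records),
        "mean_y": mean(r["y"] for r in records),
    }


def fit_mean(train):
    """Candidate model 1: always predict the training mean of y."""
    m = mean(r["y"] for r in train)
    return lambda x: m


def fit_line(train):
    """Candidate model 2: simple least-squares line y = a + b * x."""
    xs = [r["x"] for r in train]
    ys = [r["y"] for r in train]
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def mse(model, records):
    """Model validation: mean squared error on the given records."""
    return mean((model(r["x"]) - r["y"]) ** 2 for r in records)


def run_process(raw):
    clean = munge(raw)
    summary = explore(clean)
    # Assumed split: first four clean records train, the rest validate.
    train, holdout = clean[:4], clean[4:]
    candidates = {"mean": fit_mean(train), "line": fit_line(train)}
    scores = {name: mse(model, holdout) for name, model in candidates.items()}
    # Model selection: keep the candidate with the lowest holdout error.
    best = min(scores, key=scores.get)
    return summary, scores, best


if __name__ == "__main__":
    summary, scores, best = run_process(RAW)
    print(summary)
    print(scores)
    print("selected model:", best)
```

In a real project each function would be replaced by something far more substantial, and the remaining steps (feature engineering, visualization, communication, deployment) would follow the same pattern of small, separately testable stages.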