Imputing in-stream mean or median

Filling missing values with the mean or median is a common way to handle missing data. Modeler can compute replacement values and fill them in using either the Set Globals node or the Data Audit node. Unfortunately, both are terminal nodes, so the user must run them as a separate step or from a script. Moreover, the values available for imputation are limited to the mean, mid-point, or (in the case of the Data Audit node) a constant.

In this recipe we will impute missing values with the median of a variable in-stream, without the use of @GLOBAL variables.
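
The stream itself is built from Modeler nodes, but the underlying computation can be illustrated outside Modeler. The following is a minimal Python sketch of median imputation done in a single pass over the data, with no separately stored global value. The function name impute_median, the field name AGE, and the sample records are hypothetical and are not taken from the recipe's datafile.

```python
from statistics import median


def impute_median(rows, field):
    """Return copies of rows with None in `field` replaced by the field's median.

    Missing values are represented as None (an assumption of this sketch).
    """
    # The median is computed from the observed (non-missing) values only.
    observed = [r[field] for r in rows if r[field] is not None]
    if not observed:
        # Nothing to compute a median from; return the rows unchanged.
        return [dict(r) for r in rows]
    fill = median(observed)
    # Build new records so the input rows are left untouched.
    return [{**r, field: fill if r[field] is None else r[field]} for r in rows]


# Hypothetical records with one missing AGE value.
rows = [{"AGE": 30}, {"AGE": None}, {"AGE": 50}, {"AGE": 40}]
imputed = impute_median(rows, "AGE")
print([r["AGE"] for r in imputed])
```

Here the median of the observed values 30, 50, and 40 is 40, so the missing AGE is filled with 40 while the original records stay unmodified.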

Getting ready

This recipe uses the following files:

  • Datafile: cup98lrn_reduced_vars3.sav
  • Stream file: Recipe - impute ...
