The modern machine learning methods that we have studied shot to being mainstream mainly in the 1990s. The binding factor among them was that they all use one layer of representations. For instance, decision trees just create one set of rules and apply them. Even if you add ensemble approaches, the ensembling is often shallow and only combines several ML models directly.
Here is a better-worded interpretation of these differences: