Feature Selection: Filter method, Wrapper method and Embedded method

In this post, let us explore:
  • What is feature selection?
  • Why do we need to perform feature selection?
  • Methods

What is Feature Selection?


Feature selection means selecting and retaining only the most important features for the model. It is different from feature extraction: in feature selection we keep a subset of the original features, whereas in feature extraction we create new features from the existing ones.
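To make the distinction concrete, here is a minimal sketch using NumPy. The random data, the chosen column indices and the use of a principal-component projection as the extraction step are all assumptions made for illustration; they are not part of the original post.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))  # 100 samples, 4 original features (synthetic)

# Feature selection: keep a subset of the original columns, unchanged.
selected_columns = [0, 2]  # hypothetical choice, for illustration only
X_selected = X[:, selected_columns]

# Feature extraction: build a NEW feature from all original columns.
# Here we project the centered data onto its first principal direction.
X_centered = X - X.mean(axis=0)
_, _, vt = np.linalg.svd(X_centered, full_matrices=False)
X_extracted = X_centered @ vt[:1].T

print(X_selected.shape)   # (100, 2)
print(X_extracted.shape)  # (100, 1)
```

The selected columns are still the original measurements, while the extracted column is a weighted combination of all four features.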

Why is Feature Selection important?

  • It simplifies the model: data reduction, less storage, Occam's razor and better visualization
  • Reduces training time
  • Reduces the risk of overfitting
  • Can improve the accuracy of the model
  • Mitigates the curse of dimensionality

Methods


Feature selection methods can be grouped into three categories: filter method, wrapper method and embedded method.

Figure: Three methods of feature selection

  • Filter method
In this method, features are filtered based on general characteristics of the dataset, measured by some metric such as correlation with the dependent variable. The filter method is performed without any predictive model. It is faster and is usually the better approach when the number of features is huge. It is less prone to overfitting, but it may sometimes fail to select the best features because it does not look at how the features behave inside the model.
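As a rough sketch of the filter idea, the snippet below scores each feature by its absolute Pearson correlation with the target and keeps the top k. The function names, the synthetic data and the choice of k are assumptions for illustration; other metrics (chi-square, mutual information and so on) could be used in the same way.

```python
import numpy as np

def pearson_scores(X, y):
    """Pearson correlation of every column of X with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Assumes no constant column (a zero denominator would give NaN).
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return (Xc * yc[:, None]).sum(axis=0) / denom

def select_top_k(X, y, k):
    """Indices of the k features with the largest |correlation| with y."""
    scores = np.abs(pearson_scores(X, y))
    top = np.argsort(scores)[::-1][:k]
    return np.sort(top)

# Synthetic data: only features 1 and 3 actually drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 1] - 2.0 * X[:, 3] + rng.normal(scale=0.5, size=200)

kept = select_top_k(X, y, k=2)
print(kept)
```

Note that no model is trained here: each feature is scored on its own, which is what makes the approach fast.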

  • Wrapper method
In the wrapper method, the feature selection algorithm exists as a wrapper around the predictive model algorithm and uses the same model to select the best features (more on this from this excellent research paper). Though computationally expensive and prone to overfitting, it usually gives better predictive performance.
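One common wrapper strategy is greedy forward selection. The sketch below wraps it around an ordinary least-squares model and scores each candidate subset on a held-out validation split. The model, the split, the synthetic data and all function names are assumptions for illustration, not the method of any particular paper.

```python
import numpy as np

def validation_mse(X_train, y_train, X_val, y_val, cols):
    """Fit least squares on the given columns; return MSE on the validation split."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train[:, cols]])
    A_val = np.column_stack([np.ones(len(X_val)), X_val[:, cols]])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return float(np.mean((A_val @ coef - y_val) ** 2))

def forward_selection(X_train, y_train, X_val, y_val, k):
    """Greedily add the feature that most reduces validation error, k times."""
    selected = []
    remaining = list(range(X_train.shape[1]))
    for _ in range(k):
        # The model is refit once per candidate feature: this is the costly part.
        scores = {
            j: validation_mse(X_train, y_train, X_val, y_val, selected + [j])
            for j in remaining
        }
        best = min(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
    return selected

# Synthetic data: only features 1 and 3 actually drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 1] - 2.0 * X[:, 3] + rng.normal(scale=0.5, size=200)
X_train, X_val = X[:150], X[150:]
y_train, y_val = y[:150], y[150:]

chosen = forward_selection(X_train, y_train, X_val, y_val, k=2)
print(chosen)
```

Every candidate subset requires a new model fit, which is why wrapper methods become expensive as the number of features grows.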

  • Embedded method
In the embedded method, the feature selection process is embedded in the learning or model building phase. It is less computationally expensive than the wrapper method and less prone to overfitting.
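L1-regularized regression (Lasso) is a commonly cited embedded method: the penalty drives some coefficients to exactly zero during training, so selection happens as a side effect of fitting. The following is a minimal coordinate-descent sketch in NumPy. The penalty strength alpha, the iteration count and the synthetic data are assumptions for illustration; in practice a library implementation would normally be used instead.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0 if it is within the threshold."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimize (1 / (2n)) * ||y - Xw||^2 + alpha * ||w||_1 by coordinate descent."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # Residual with feature j's current contribution removed.
            residual = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ residual / n
            w[j] = soft_threshold(rho, alpha) / col_sq[j]
    return w

# Synthetic data: only features 1 and 3 actually drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 1] - 2.0 * X[:, 3] + rng.normal(scale=0.5, size=200)

# Standardize features and center the target so no intercept is needed.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
y_centered = y - y.mean()

weights = lasso_coordinate_descent(X_std, y_centered, alpha=0.5)
selected = np.flatnonzero(weights)  # features with non-zero coefficients
print(selected)
```

A single training run yields both the fitted model and the selected features, which is why embedded methods tend to be cheaper than wrapper methods.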

Figure: Three feature selection methods in simple words

The following graphic shows popular examples of each of these three feature selection methods.

Figure: Examples for three methods of feature selection

The following table compares these three methods of feature selection.

Table: Comparison of three methods of feature selection

Summary


The filter method is faster and is useful when there is a large number of features. The wrapper method gives better performance, while the embedded method lies in between the other two.
