Family Rank: A graphical domain knowledge informed feature ranking algorithm

Authors:

Michelle Saul 1 2, Valentin Dinu 1

Abstract

Motivation

When prediction models are built from many features and relatively few samples, feature selection methods often overfit the training data, leading to the selection of irrelevant features. Incorporating domain knowledge during feature selection is one way that may help mitigate this overfitting. Here, a feature ranking algorithm called ‘Family Rank’ is presented, in which features are ranked using a combination of graphical domain knowledge and feature scores computed from empirical data.
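
The abstract does not say how the graph and the empirical scores are combined. As an illustration of the general idea only, and not the Family Rank algorithm itself, the sketch below blends each feature's empirical score with the mean score of its neighbors in a domain knowledge graph and then sorts by the blended value. The function name `rank_features`, the `weight` mixing parameter, and the neighbor-averaging rule are all assumptions made for this example.

```python
def rank_features(scores, edges, weight=0.5):
    """Illustrative graph-informed ranking (NOT the published Family Rank method).

    scores: dict mapping feature name -> empirical score (higher = more relevant)
    edges:  iterable of (feature_a, feature_b) pairs taken from a domain knowledge graph
    weight: hypothetical mixing parameter in [0, 1]; 0 ignores the graph entirely
    """
    # Build an undirected adjacency map restricted to features that have scores.
    neighbors = {feature: set() for feature in scores}
    for a, b in edges:
        if a in scores and b in scores and a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)

    blended = {}
    for feature, score in scores.items():
        if neighbors[feature]:
            neighbor_mean = sum(scores[n] for n in neighbors[feature]) / len(neighbors[feature])
        else:
            # An isolated feature has no graph evidence, so it keeps its own score.
            neighbor_mean = score
        blended[feature] = (1 - weight) * score + weight * neighbor_mean

    # Sort by descending blended score; break ties alphabetically for determinism.
    return sorted(blended, key=lambda f: (-blended[f], f))


if __name__ == "__main__":
    # Hypothetical scores and a single hypothetical edge, for demonstration only.
    example_scores = {"g1": 0.9, "g2": 0.2, "g3": 0.5, "g4": 0.4}
    print(rank_features(example_scores, edges=[]))
    print(rank_features(example_scores, edges=[("g1", "g2")]))
```

In this toy example, a weakly scored feature that is connected to a strongly scored one moves up the ranking once the edge is supplied, which is the kind of effect that graph-informed ranking is meant to produce.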
