Feature Selection

Feature selection is a core concept in machine learning that strongly affects model performance. It is the process of selecting, automatically or manually, the features that contribute most to the prediction variable or output of interest. Irrelevant features in the data can reduce model accuracy and cause the model to fit patterns in those irrelevant features.

Factor analysis of mixed data (FAMD) is a principal component method for analyzing a data set that contains both quantitative and qualitative variables. It makes it possible to analyze the similarity between individuals while taking both types of variables into account. It can also be used to explore the associations among all variables, quantitative and qualitative alike.

Package used: PCAmixdata

Based on the squared loadings obtained from the principal component analysis, the five best features (columns) for clustering are:

Weight

Potass

Fiber

Calories

Rating
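
To illustrate how a ranking by squared loadings can be produced, here is a minimal sketch in Python with NumPy. It is not the PCAmixdata workflow used above: it covers only quantitative columns, where a squared loading is the squared correlation between a variable and a principal component. For qualitative variables, a mixed-data method such as FAMD uses a correlation ratio instead, which this sketch does not implement. The data, the column names (a, b, c, noise), and the helper functions squared_loadings and top_features are all hypothetical and exist only for this example.

```python
import numpy as np


def squared_loadings(X, n_components=2):
    """Squared correlations between each column of X and the leading components.

    Assumes every column is numeric and has non-zero variance.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Standardize each column (population standard deviation, ddof=0).
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # PCA via singular value decomposition of the standardized data.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    # Loading of variable j on component k = corr(Z_j, score_k)
    #                                      = Vt[k, j] * s[k] / sqrt(n).
    loadings = (Vt.T * s) / np.sqrt(n)
    return loadings[:, :n_components] ** 2


def top_features(X, names, n_components=2, k=5):
    """Rank columns by their total squared loading on the retained components."""
    totals = squared_loadings(X, n_components).sum(axis=1)
    order = np.argsort(totals)[::-1][:k]
    return [(names[i], float(totals[i])) for i in order]


# Synthetic data (made up for this sketch): three columns driven by one
# shared latent factor, plus one column of unrelated noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=200)
X = np.column_stack([
    latent + 0.1 * rng.normal(size=200),
    latent + 0.1 * rng.normal(size=200),
    latent + 0.1 * rng.normal(size=200),
    rng.normal(size=200),
])
names = ["a", "b", "c", "noise"]

ranking = top_features(X, names, n_components=1, k=3)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

How many components to retain and how many features to keep (the n_components and k arguments here) are judgment calls. Summing the squared loadings over the retained components is one common way to score a column, but it is an assumption of this sketch rather than a statement of how the list above was derived.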