Knowledge-Based Systems
Volume 232, 2021, 107460

A modified genetic algorithm and weighted principal component analysis based feature selection and extraction strategy in agriculture

K. Aditya Shastry, Sanjay H. A.

Nitte Meenakshi Institute of Technology, Bengaluru 64, India.


Data pre-processing transforms raw data into a format suitable for machine learning (ML) techniques. Feature selection (FS) and feature extraction (FeExt) are significant components of data pre-processing. FS is the identification of relevant features that enhance the accuracy of a model. Since agricultural data contain diverse features related to climate, soil, and fertilizer, FS is particularly important, as irrelevant features may adversely affect the predictions of the resulting model. Likewise, FeExt involves the derivation of new attributes from the existing attributes. These new features retain the information in the original attributes without the redundancy. With these points in mind, this work proposes a hybrid feature selection and feature extraction strategy for agricultural data sets. A modified Genetic Algorithm (m-GA) was developed by designing a fitness function based on “Mutual Information” (MutInf) and “Root Mean Square Error” (RtMSE) to choose the features that most affect the target attribute (crop yield in this case). The selected features were then subjected to feature extraction using “weighted principal component analysis” (wgt-PCA). The extracted features were then fed into different ML models, viz. “Regression” (Reg), “Artificial Neural Networks” (ArtNN), “Adaptive Neuro Fuzzy Inference System” (ANFIS), “Ensemble of Trees” (EnT), and “Support Vector Regression” (SuVR). Trials on 3 benchmark and 8 real-world farming datasets showed that the developed hybrid feature selection and extraction technique achieved significant improvements in R², RtMSE, and “mean absolute error” (MAE) compared with FS and FeExt methods such as Correlation Analysis (CA), Singular Value Decomposition (SiVD), Genetic Algorithm (GA), and wgt-PCA.
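
The two-stage idea described above (a genetic search over feature subsets scored by mutual information and RMSE, followed by a weighted PCA on the selected features) can be illustrated with a minimal sketch. This is not the authors' m-GA or wgt-PCA, whose details the abstract does not give. Everything here is an assumption made for illustration: the function names, the `alpha` trade-off in the fitness, the histogram estimate of mutual information, the linear least-squares model used for RMSE, the truncation selection and single-point crossover, and the choice to weight standardized features by their mutual information before PCA. The data are synthetic.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram estimate of mutual information (nats) between two 1-D arrays."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def rmse_of_linear_fit(X, y):
    """RMSE of an ordinary least-squares fit (with intercept) on the given columns."""
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.sqrt(np.mean((A @ coef - y) ** 2)))

def fitness(mask, X, y, mi, alpha=0.5):
    """Assumed fitness: reward mean MI of the subset, penalise its scaled RMSE."""
    if not mask.any():
        return -np.inf  # an empty feature subset is never acceptable
    rmse = rmse_of_linear_fit(X[:, mask], y)
    return alpha * mi[mask].mean() - (1 - alpha) * rmse / y.std()

def select_features(X, y, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Plain GA over boolean feature masks; returns the best mask and per-feature MI."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    mi = np.array([mutual_information(X[:, j], y) for j in range(n)])
    pop = rng.random((pop_size, n)) < 0.5
    for _ in range(generations):
        scores = np.array([fitness(ind, X, y, mi) for ind in pop])
        parents = pop[np.argsort(scores)[::-1][: pop_size // 2]]  # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n)                    # single-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(n) < p_mut              # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, children])
    scores = np.array([fitness(ind, X, y, mi) for ind in pop])
    return pop[np.argmax(scores)], mi

def weighted_pca(X, weights, n_components=2):
    """PCA on standardized features scaled by sqrt of normalized weights (assumed scheme)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    Zw = Z * np.sqrt(weights / weights.sum())
    cov = np.atleast_2d(np.cov(Zw, rowvar=False))
    vals, vecs = np.linalg.eigh(cov)
    top = np.argsort(vals)[::-1][:n_components]         # largest eigenvalues first
    return Zw @ vecs[:, top]

# Synthetic example: only columns 0 and 2 actually drive the target.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(200, 6))
y = 3 * X[:, 0] - 2 * X[:, 2] + data_rng.normal(scale=0.1, size=200)

mask, mi = select_features(X, y)
k = min(2, int(mask.sum()))
Z = weighted_pca(X[:, mask], mi[mask], n_components=k)
print("selected columns:", np.flatnonzero(mask), "extracted shape:", Z.shape)
```

The extracted matrix `Z` would then be passed to a downstream regressor. In a real study the fitness would be evaluated on held-out data rather than on the training fit, which is omitted here to keep the sketch short.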
