A New Approach for Feature Subset Selection Based on Hadoop
Author(s):
Ramya P V, Shashikala B
Keywords:
Genetic algorithms, Hadoop, MapReduce, Parallel GAs.
Abstract
Feature selection is the process of identifying a subset of features that is useful for model construction and yields comparable results. Contemporary data repositories contain redundant and irrelevant features that harm the quality of the solution, so such features must be removed to limit their negative effect on classifier accuracy. Methods for feature selection include exhaustive search, best-first search, simulated annealing, genetic algorithms, greedy forward selection, and many others. A Genetic Algorithm (GA) is a metaheuristic search technique from the family of evolutionary algorithms, mostly used to find approximate solutions. Such heuristics are general-purpose methods, applied mainly to optimization and search problems. GAs are demanding in both computation time and resources. No frameworks exist for developing GAs that execute in parallel, even though some exist for sequential GAs. Problems of this kind can therefore be addressed using Hadoop. Apache Hadoop is one of the common platforms that can be exploited for parallel applications. It is a software framework that stores and processes Big Data on clusters of commodity hardware without requiring complex programming models. The Hadoop Distributed File System (HDFS) is fault tolerant and holds very large amounts of data across multiple machines, and Hadoop provides a command interface for interacting with HDFS. This project focuses on presenting a new approach to feature selection that uses parallel GAs on the Hadoop platform, following the MapReduce paradigm.
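
To make the idea concrete, the following is a minimal sketch of GA-based feature selection arranged in a map-then-select pattern. It is not the authors' implementation: the toy data set, the nearest-centroid fitness measure, the size penalty, and every function name (`make_data`, `accuracy`, `fitness`, `run_ga`) are assumptions chosen for illustration. Python's built-in `map` stands in for the Hadoop map phase; on a real cluster each individual would presumably be scored by a separate map task.

```python
import random


def make_data(rng, n_rows=200, n_features=8):
    """Toy data (assumed): the label depends only on features 0 and 1."""
    rows = []
    for _ in range(n_rows):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        rows.append((x, 1 if x[0] + x[1] > 0 else 0))
    return rows


def accuracy(mask, train, test):
    """Nearest-centroid accuracy using only the features switched on in mask."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0
    centroids = {}
    for label in (0, 1):
        pts = [x for x, y in train if y == label]
        if pts:
            centroids[label] = [sum(p[i] for p in pts) / len(pts) for i in idx]
    correct = 0
    for x, y in test:
        dist = {
            label: sum((x[i] - c[k]) ** 2 for k, i in enumerate(idx))
            for label, c in centroids.items()
        }
        correct += min(dist, key=dist.get) == y
    return correct / len(test)


def fitness(mask, train, test):
    """Reward accuracy, lightly penalise larger subsets (penalty is assumed)."""
    return accuracy(mask, train, test) - 0.01 * sum(mask)


def run_ga(data, n_features, pop_size=20, generations=15, seed=0):
    rng = random.Random(seed)
    split = len(data) * 3 // 4
    train, test = data[:split], data[split:]

    def mapper(mask):
        # "Map" step: each individual is scored independently of the others,
        # which is what makes this phase easy to distribute.
        return fitness(mask, train, test), mask

    population = [
        tuple(rng.randint(0, 1) for _ in range(n_features))
        for _ in range(pop_size)
    ]
    best = (float("-inf"), ())
    for _ in range(generations):
        # "Reduce"-like step: gather the scores, then select and breed.
        scored = sorted(map(mapper, population), reverse=True)
        best = max(best, scored[0])
        parents = [mask for _, mask in scored[: pop_size // 2]]
        children = [scored[0][1]]  # elitism: carry the best individual over
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # single-point crossover
            child = list(a[:cut] + b[cut:])
            for i in range(n_features):
                if rng.random() < 0.05:  # bit-flip mutation
                    child[i] = 1 - child[i]
            children.append(tuple(child))
        population = children
    return best


if __name__ == "__main__":
    data = make_data(random.Random(1))
    score, mask = run_ga(data, n_features=8)
    print("best fitness:", round(score, 3), "selected mask:", mask)
```

Each individual is a bit mask over the features, so scoring one individual never depends on another. That independence is the property a MapReduce job would exploit, while selection, crossover, and mutation need the whole scored population and therefore sit after the map step.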
Article Details
Unique Paper ID: 142390

Publication Volume & Issue: Volume 2, Issue 1

Page(s): 244 - 251