A New Approach for Feature Subset Selection Based on Hadoop

Authors: Ramya P V, Shashikala B

Unique Paper ID: 142390
Volume: 2
Issue: 1
PageNo: 244-251

Keywords: Genetic algorithms, Hadoop, MapReduce, Parallel GAs

Abstract:
Feature selection is the process of identifying a subset of features that is useful for model construction and gives compatible results. Contemporary data repositories contain redundant and irrelevant features, which harm the quality of the solution. Such features must be removed to limit their negative effect on classifier accuracy. Methods for feature selection include exhaustive search, Best fit, simulated annealing, Genetic Algorithms, greedy forward selection, and many others. Genetic Algorithms (GAs) are a metaheuristic search technique belonging to the family of evolutionary algorithms, mostly used to find approximate solutions. They are general-purpose methods applied mainly to optimization and search problems. GAs are computationally demanding in both time and resources. Although some frameworks exist for developing sequential GAs, none exist for developing GAs that execute in parallel. Hadoop can be used to address this gap. Apache Hadoop is one of the common services that can be exploited for parallel applications. It is a software framework that stores and processes Big Data on clusters of commodity hardware without requiring complex programming models. The Hadoop Distributed File System (HDFS) is fault tolerant and holds very large amounts of data across multiple machines. Hadoop provides a command interface for interacting with HDFS. This project presents a new approach to feature selection that uses parallel GAs on the Hadoop platform, following the MapReduce paradigm.
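
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and it does not use any Hadoop API: the map and reduce steps are simulated in-process. All names here (make_data, fitness, map_phase, reduce_phase, run_ga) are hypothetical, the data is synthetic, and the nearest-centroid classifier is only a stand-in for whatever classifier the paper uses. Each "mapper" scores a chunk of candidate feature masks, and the "reducer" gathers the scores and breeds the next generation.

```python
import random

def make_data(rng, n_rows=200, n_features=8, relevant=(0, 3)):
    """Synthetic data (assumption): the label depends only on `relevant` columns."""
    rows, labels = [], []
    for _ in range(n_rows):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        labels.append(1 if sum(x[i] for i in relevant) > 0 else 0)
        rows.append(x)
    return rows, labels

def fitness(mask, rows, labels, penalty=0.01):
    """Nearest-centroid accuracy on the selected columns, minus a per-feature penalty."""
    cols = [i for i, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    sums = {0: [0.0] * len(cols), 1: [0.0] * len(cols)}
    counts = {0: 0, 1: 0}
    for x, y in zip(rows, labels):
        counts[y] += 1
        for j, c in enumerate(cols):
            sums[y][j] += x[c]
    cents = {y: [s / max(counts[y], 1) for s in sums[y]] for y in (0, 1)}
    correct = 0
    for x, y in zip(rows, labels):
        dist = {k: sum((x[c] - cents[k][j]) ** 2 for j, c in enumerate(cols))
                for k in (0, 1)}
        correct += (0 if dist[0] <= dist[1] else 1) == y
    return correct / len(rows) - penalty * len(cols)

def map_phase(chunk, rows, labels):
    """Mapper: emit (mask, fitness) for every individual in this chunk."""
    return [(mask, fitness(mask, rows, labels)) for mask in chunk]

def reduce_phase(scored, rng, mutation_rate=0.05):
    """Reducer: keep the best individual, then fill the next generation
    with one-point crossover and bit-flip mutation of the top half."""
    scored = sorted(scored, key=lambda pair: pair[1], reverse=True)
    size = len(scored)
    parents = [mask for mask, _ in scored[: max(2, size // 2)]]
    next_gen = [scored[0][0]]  # elitism
    while len(next_gen) < size:
        a, b = rng.sample(parents, 2)
        cut = rng.randrange(1, len(a))
        child = a[:cut] + b[cut:]
        next_gen.append(tuple(bit ^ (rng.random() < mutation_rate) for bit in child))
    return next_gen, scored[0]

def run_ga(n_generations=15, pop_size=20, n_mappers=4, seed=0):
    rng = random.Random(seed)
    rows, labels = make_data(rng)
    n_features = len(rows[0])
    population = [tuple(rng.randint(0, 1) for _ in range(n_features))
                  for _ in range(pop_size)]
    best = None
    for _ in range(n_generations):
        # Split the population as a cluster would split input across mappers.
        chunks = [population[i::n_mappers] for i in range(n_mappers)]
        scored = [pair for chunk in chunks
                  for pair in map_phase(chunk, rows, labels)]
        population, gen_best = reduce_phase(scored, rng)
        if best is None or gen_best[1] > best[1]:
            best = gen_best
    return best  # (feature mask, fitness score)
```

Calling run_ga() returns the best (mask, score) pair seen across all generations. Fitness evaluation is the expensive step and each mask is scored independently, which is why it is the natural part to distribute across mappers in a real cluster.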

ISSN: 2349-6002
Available: https://ijirt.org/article?manuscript=142390
