Cloud Platforms (AWS, Azure, GCP), Big Data Environments, Cloudera, Metadata Migration, PySpark, SparkSQL
Abstract
This research presents a metadata framework design for big data models in Cloudera Data Lakes across the AWS, Azure, and GCP cloud platforms. The study focuses on migrating metadata using Data Vault data models, with PySpark and SparkSQL used for analysis.
As big data environments grow in complexity, accurate metadata migration becomes crucial. This study explores best practices and automation tools for efficient metadata migration in large-scale environments.
The research evaluates unique features of AWS, Azure, and GCP, including data storage, processing, security, and cost-effectiveness. It also assesses scalability and usability for managing big data in Cloudera Data Lakes with Data Vault data models.
Findings show that AWS offers extensive services and tools, while Azure and GCP provide cost-effective options. AWS also benefits from a large partner and developer network, which helps in managing big data in Cloudera Data Lakes with Data Vault models.
This study provides innovative insights into metadata framework design and the capabilities of AWS, Azure, and GCP for big data management in Cloudera Data Lakes, aiding organizations in selecting the appropriate cloud platform.
Article Details
Unique Paper ID: 159872
Publication Volume & Issue: Volume 9, Issue 12
Page(s): 536 - 551