Designing a metadata framework for big data models in Cloudera Data Lakes across AWS, Azure, and GCP

  • Unique Paper ID: 159872
  • ISSN: 2349-6002
  • Volume: 9
  • Issue: 12
  • Page No.: 536-551
  • Abstract: This research presents a metadata framework design for big data models in Cloudera Data Lakes across the AWS, Azure, and GCP cloud platforms. The study focuses on migrating metadata using Data Vault data models, with PySpark and SparkSQL used for analysis. As big data environments grow in complexity, accurate metadata migration becomes crucial, and the study explores best practices and automation tools for efficient metadata migration in large-scale environments. The research evaluates the distinguishing features of AWS, Azure, and GCP, including data storage, processing, security, and cost-effectiveness, and assesses their scalability and usability for managing big data in Cloudera Data Lakes with Data Vault data models. Findings show that AWS offers the most extensive set of services and tools and benefits from a large partner and developer network, while Azure and GCP provide cost-effective alternatives. The study offers practical insights into metadata framework design and the capabilities of each platform, helping organizations select the appropriate cloud platform for big data management in Cloudera Data Lakes.
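To make the approach concrete, the sketch below is not taken from the paper itself; it is a minimal illustration, with hypothetical table names and attributes, of how dataset metadata might be split into a Data Vault-style hub and satellite using PySpark and then queried with SparkSQL, assuming a local Spark session and a small in-memory sample of catalog metadata.

```python
# Minimal sketch only: hypothetical metadata, not the paper's actual pipeline.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("metadata-vault-sketch").getOrCreate()

# Hypothetical raw metadata as it might be extracted from a data lake catalog.
raw = spark.createDataFrame(
    [
        ("sales_orders", "s3://lake/raw/sales_orders", "parquet", "2023-01-10"),
        ("customers", "abfss://lake/raw/customers", "parquet", "2023-01-11"),
    ],
    ["dataset_name", "location", "format", "load_date"],
)

# Hub: one row per business key (dataset_name) with a deterministic hash key,
# the core Data Vault pattern for identity.
hub_dataset = (
    raw.select("dataset_name")
    .distinct()
    .withColumn("hub_dataset_hk", F.sha2(F.col("dataset_name"), 256))
    .withColumn("record_source", F.lit("source_catalog"))
)

# Satellite: descriptive attributes keyed by the hub hash key, so attribute
# history can change without touching the hub.
sat_dataset = (
    raw.withColumn("hub_dataset_hk", F.sha2(F.col("dataset_name"), 256))
    .select("hub_dataset_hk", "location", "format", "load_date")
)

# SparkSQL analysis over the satellite, e.g. counting datasets per storage format.
sat_dataset.createOrReplaceTempView("sat_dataset")
spark.sql(
    "SELECT format, COUNT(*) AS n_datasets FROM sat_dataset GROUP BY format"
).show()
```

The hub/satellite split shown here is one common Data Vault convention; the paper's actual model and column choices may differ.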
