[GCP] DataLake
GCP DataLake Tools
There are so many solutions that I put together a separate summary.
Reference: official GCP docs - https://cloud.google.com/docs/data
Data analysis
Query massive datasets with SQL, geospatial analytics, and BI tools that support ad hoc and programmatic analysis as well as data sharing.
- Analytics Hub - Share data and insights at scale across organizational boundaries with a robust security and privacy framework. 
 
- BigQuery Analytics - Maximize your data analysis investments when running analytic queries on large datasets. 
 
- BigQuery ML - Create and run machine learning (ML) models by using GoogleSQL queries, and access LLMs and Cloud AI APIs to perform artificial intelligence (AI) tasks like text generation or machine translation. 
 
- Dataproc - Perform batch processing, querying, and streaming using a managed Apache Spark and Hadoop service. 
 
- Earth Engine - A geospatial processing service that lets you perform geospatial processing at scale, powered by Google Cloud.
 
- Looker - Access, analyze, and act on an up-to-date, trusted version of your data.
 
- Looker Studio - Tell great data stories to support better business decisions. 
 
Data governance
Control and manage quality throughout the lifecycle of your data as you share it across and outside your organization.
- Data Catalog - Discover and understand your data using a fully managed and scalable data discovery and metadata management service. 
 
- Dataplex - Organize your data into lakes and zones, and automate data management and governance to power analytics at scale. 
 
- Sensitive Data Protection - Discover and redact sensitive data. 
 
- Introduction to data governance in BigQuery - Implement and enforce BigQuery data governance policies. 
 
Data ingestion
Migrate, stream, and batch-load data into a serverless, high-throughput storage architecture.
- BigQuery Data Transfer Service - Automates data movement into BigQuery on a scheduled, managed basis. Lay the foundation for a BigQuery data warehouse without writing code. 
 
- Cloud Data Fusion - Quickly build and manage data pipelines using fully managed, code-free data integration with a graphical interface. 
 
- Dataflow - Develop batch and real-time stream data processing pipelines.
 
- Dataproc - Perform batch processing, querying, and streaming using a managed Apache Spark and Hadoop service. 
 
- Dataproc Metastore - Use a fully managed Apache Hive metastore (HMS) that runs on Google Cloud to manage your data lake and metadata. 
 
- Dataproc Serverless - Use Dataproc Serverless to run Spark batch workloads without provisioning and managing your own cluster. 
 
- Datastream - A serverless and easy-to-use change data capture (CDC) and replication service. 
 
- Google Cloud Managed Service for Apache Kafka - A managed cloud service that lets you ingest Kafka streams directly into Google Cloud. 
 
- Pub/Sub - Ingest event streams from anywhere, at any scale. 
 
- Storage Transfer Service - Transfer data between cloud storage services, such as from Amazon S3 to Cloud Storage.
 
- Transfer Appliance - Ship large volumes of data to Google Cloud using rackable storage. 
 
Data orchestration
Organize and optimize your workload management chain with seamless connections across data sources and processes.
- Cloud Composer - Create, schedule, monitor, and manage workflows using a fully managed orchestration service built on Apache Airflow. 
 
- Dataform - Dataform offers an end-to-end experience that helps data teams build, version control, and orchestrate SQL workflows in BigQuery. 
 
Databases
Relational databases
- AlloyDB for PostgreSQL - A fully-managed, PostgreSQL-compatible database for demanding transactional workloads. 
 
- Spanner - Back your apps with a mission-critical, global-scale database service. 
 
- Cloud SQL - General information about the Cloud SQL options, each a fully-managed database service that helps you set up, maintain, manage, and administer your relational databases. 
 
- Cloud SQL for MySQL - A fully-managed database service that helps you set up, maintain, manage, and administer your MySQL relational databases on Google Cloud. 
 
- Cloud SQL for PostgreSQL - A fully-managed database service that helps you set up, maintain, manage, and administer your PostgreSQL relational databases on Google Cloud. 
 
- Cloud SQL for SQL Server - A managed database service that helps you set up, maintain, manage, and administer your SQL Server databases on Google Cloud. 
 
Non-relational databases
- Bigtable - Store terabytes or petabytes of data using a NoSQL wide-column database service. 
 
- Datastore - A NoSQL document database built for automatic scaling, high performance, and ease of application development. 
 
- Firestore - Add NoSQL document database access to mobile and web apps. 
 
- Memorystore for Memcached - Applications running on Google Cloud can achieve extreme performance by leveraging the scalable, available, secure, and managed Memcached service. 
 
- Memorystore for Redis - Achieve extreme performance using a managed in-memory data store service. 
 
- Memorystore for Redis Cluster - Achieve extreme performance using a managed in-memory Redis Cluster service. 
 
- Spanner Graph - Get high-performance graph database capabilities with unparalleled scalability, availability, and consistency. 
 