Google Cloud: Difference between revisions
From charlesreid1
Revision as of 23:57, 11 September 2017
Notes for the Google Cloud Data Engineer certification.
Technology stack
The following list is based on the sample case study for the GCDE certification exam: https://cloud.google.com/certification/guides/data-engineer/casestudy-flowlogistic
The case study focuses on a logistics company tracking orders and shipments via rail, truck, aircraft, and ships.
Goals:
- Implement real-time inventory tracking system that tracks locations
- Perform data analytics on order and shipment logs (structured/unstructured data) to make decisions about deploying resources, targeting customers, and expanding into markets
- Predict delays in shipments
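The real-time tracking goal above can be illustrated with a small sketch. This is not taken from the case study: the `TrackingEvent` fields, the `LocationTracker` class, and the sample shipment IDs and locations are all hypothetical, and a real system would read events from a message stream rather than from in-memory calls. The idea is only to keep the most recent known location per shipment and to ignore events that arrive out of order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackingEvent:
    """Hypothetical tracking message; field names are assumptions."""
    shipment_id: str
    timestamp: float  # seconds since epoch
    location: str     # e.g. a depot or port name


class LocationTracker:
    """Keeps only the most recent known location per shipment."""

    def __init__(self):
        self._latest = {}

    def update(self, event):
        current = self._latest.get(event.shipment_id)
        # Skip events older than the one already held (late arrivals).
        if current is None or event.timestamp >= current.timestamp:
            self._latest[event.shipment_id] = event

    def locate(self, shipment_id):
        event = self._latest.get(shipment_id)
        return event.location if event else None


tracker = LocationTracker()
tracker.update(TrackingEvent("S1", 100.0, "Rotterdam"))
tracker.update(TrackingEvent("S1", 90.0, "Hamburg"))  # late arrival, ignored
tracker.update(TrackingEvent("S2", 95.0, "Chicago"))
```

Here `tracker.locate("S1")` returns the location from the newest event, and an unknown shipment ID returns `None`.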
Requirements:
- Reliable, reproducible environment that scales
- Aggregated data in centralized data lake
- Historical data used to perform predictive analytics on future shipments
- Accurate tracking of worldwide shipments (proprietary technology)
- Improvement of business agility and speed of innovation via rapid provisioning of new resources
- Analysis and optimization for performance in the cloud
- Migration to the cloud, if all other requirements are met
Data center description:
Databases:
- SQL DB storing user data, static data
- Cassandra DB storing metadata, tracking messages
- Kafka servers for tracking-message aggregation and batch insert
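As a rough illustration of what "message aggregation and batch insert" might look like, the sketch below buffers incoming tracking messages and writes them out in groups. It uses no Kafka or Cassandra client: `BatchInserter`, the `insert_batch` callback, and the message fields are hypothetical stand-ins, and the `written` list plays the role of the database.

```python
class BatchInserter:
    """Buffers messages and hands them to a sink in fixed-size batches."""

    def __init__(self, insert_batch, batch_size=3):
        self._insert_batch = insert_batch  # callable that receives a list
        self._batch_size = batch_size
        self._buffer = []

    def add(self, message):
        self._buffer.append(message)
        # Write as soon as a full batch has accumulated.
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self):
        # Write whatever is buffered, even if it is a partial batch.
        if self._buffer:
            self._insert_batch(list(self._buffer))
            self._buffer.clear()


written = []  # stand-in for the database: one entry per batch insert
inserter = BatchInserter(written.append, batch_size=3)
for i in range(7):
    inserter.add({"shipment_id": f"S{i}", "location": "depot"})
inserter.flush()  # write the final partial batch
```

With seven messages and a batch size of three, `written` ends up holding two full batches and one partial batch. Batching trades a little latency for fewer, larger writes.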
Applications:
- Customer frontend, middleware for orders and customs
- Tomcat for Java services
- Nginx for static content
- Batch servers (?)
Storage:
- iSCSI (Internet Small Computer Systems Interface) storage for VM hosts
- Fibre Channel network for SQL server storage
- NAS (network-attached storage) for image storage, logs, and backups
Analytics:
- Hadoop/Spark servers
- Core data lake
- Data analysis workloads
Miscellaneous servers:
- Jenkins
- Monitoring of servers
- Bastion hosts
- Security scanners
- Billing software