Google Cloud/Review
From charlesreid1
Review
Review in preparation for interview:
- Components of workflow and open source tools for each step
- Highlight each step with a data engineering repository
- Individual services offered on the cloud - know the idea behind each, e.g., why there are so many database solutions
- What specific challenges do genomics researchers face, and what software and workflows do they use?
Review Process
Case study
- Start by reviewing the logistics company case study
- https://charlesreid1.com/wiki/Google_Cloud/Case_Study
Software tools
- Basic software technologies: storage, databases, distributed computation, GPUs vs CPUs, Docker/containerization
- https://charlesreid1.com/wiki/Google_Cloud
- Google Cloud Genomics
Software Quality Assurance
- GitHub Pages/10 things list (time machine)
GCDEC Review:
- 1 - https://charlesreid1.com/wiki/GCDEC/Fundamentals/Notes
- 2 - https://charlesreid1.com/wiki/GCDEC/Unstructured_Data/Notes
- 3a - https://charlesreid1.com/wiki/GCDEC/BigQuery/Notes
- 3b - https://charlesreid1.com/wiki/GCDEC/Dataflow/Notes
- 4a - https://charlesreid1.com/wiki/GCDEC/Building_Tensorflow/Notes
- 4b - https://charlesreid1.com/wiki/GCDEC/Deploying_Tensorflow/Notes
- 4c - https://charlesreid1.com/wiki/GCDEC/Engineering_Tensorflow/Notes
- 5 - GCDEC/Streaming/Notes (page not yet created)
Google Qwiklabs:
- Google Cloud Platform essentials - https://google.qwiklabs.com/quests/23?locale=en
- Scientific data processing - https://google.qwiklabs.com/quests/28?locale=en
- Data engineering - https://google.qwiklabs.com/quests/25?locale=en