2018/January/Data Engineering
From charlesreid1
Notes from January 2018 data engineering work.
This consists of review (rebooting the data engineering stuff) and coding (identifying relevant scenarios for data-engineering-scenarios).
pages
Review page: Google Cloud/Review
Project 1: 2018/January/Data Engineering/Scientific Data Processing
Project 2: 2018/January/Data Engineering/Big Data Text Processing
Project 3: 2018/January/Data Engineering/Cosmos
procedure
Expanding data-engineering-scenarios
Start with pre-made examples
Work toward fabricated experimental data
Example scenario: an imaginary factory making lots of widgets, with Kubernetes/Container Engine orchestrating a process
Focus on a particular process or set of processes, and drill into it, use it to provide multiple angles on a single concept
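As a starting point for the fabricated experimental data idea above, here is a minimal sketch of what imaginary widget-factory data could look like. Everything in it is hypothetical and made up for illustration: the WidgetReading fields, the machine names, the 50 gram target weight, and the 2 gram inspection tolerance. It uses only the Python standard library and does not touch any Google Cloud service.

```python
import random
import statistics
from dataclasses import dataclass


@dataclass
class WidgetReading:
    """One fabricated measurement from the imaginary factory (hypothetical schema)."""
    machine_id: str
    widget_id: int
    weight_grams: float
    passed_inspection: bool


def fabricate_readings(n_machines=3, widgets_per_machine=100, seed=0):
    """Generate reproducible fake widget measurements.

    Assumed for illustration: target weight 50 g, tolerance +/- 2 g,
    and a small random bias per machine.
    """
    rng = random.Random(seed)  # seeded so the fabricated data is repeatable
    readings = []
    widget_id = 0
    for m in range(n_machines):
        machine_id = f"machine-{m}"
        bias = rng.uniform(-0.5, 0.5)  # each machine drifts slightly
        for _ in range(widgets_per_machine):
            weight = rng.gauss(50.0 + bias, 1.0)
            passed = abs(weight - 50.0) <= 2.0
            readings.append(WidgetReading(machine_id, widget_id, weight, passed))
            widget_id += 1
    return readings


def summarize(readings):
    """Group readings by machine and compute count, mean weight, and pass rate."""
    by_machine = {}
    for r in readings:
        by_machine.setdefault(r.machine_id, []).append(r)
    summary = {}
    for machine_id, rs in by_machine.items():
        summary[machine_id] = {
            "count": len(rs),
            "mean_weight": statistics.mean(r.weight_grams for r in rs),
            "pass_rate": sum(r.passed_inspection for r in rs) / len(rs),
        }
    return summary


if __name__ == "__main__":
    for machine_id, stats in summarize(fabricate_readings()).items():
        print(machine_id, stats)
```

The same fabricated data set could then be reused across scenarios (storage, database, computation, containerization) to give multiple angles on a single process.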
Software tools list, with an (abstract) example for each: Google Cloud
- Storage/database/computation/GPUs vs CPUs/containerization
Software quality assurance:
- GitHub page - 10 things
- Apply style of later points to earlier points
- Clear out lorem ipsum (7-10)
links
links to notes
Notes review: GCDEC
- Case study - Google Cloud/Case Study
- 1 - GCDEC/Fundamentals/Notes
- 2 - GCDEC/Unstructured_Data/Notes
- 3a - GCDEC/BigQuery/Notes
- 3b - GCDEC/Dataflow/Notes
- 4a - GCDEC/Building_Tensorflow/Notes
- 4b - GCDEC/Deploying_Tensorflow/Notes
- 4c - GCDEC/Engineering_Tensorflow/Notes
- 5 - GCDEC/Streaming/Notes
links to codelabs
Google Codelabs:
- Main link - https://codelabs.developers.google.com/
- Kubernetes and Container Engine - https://codelabs.developers.google.com/codelabs/cloud-compute-kubernetes/index.html?index=..%2F..%2Findex#0
- Process Astronomy Data to Generate Images - https://codelabs.developers.google.com/codelabs/cloud-compute-the-cosmos/index.html?index=..%2F..%2Findex#0
- Kubernetes for Java apps - https://codelabs.developers.google.com/codelabs/cloud-springboot-kubernetes/index.html?index=..%2F..%2Findex#0
- Google Cloud Storage - https://codelabs.developers.google.com/codelabs/es003l-storage/index.html?index=..%2F..%2Findex
- Campaign finance with BigQuery - https://codelabs.developers.google.com/codelabs/cloud-bq-campaign-finance/index.html?index=..%2F..%2Findex#0
- Text processing with big data - https://codelabs.developers.google.com/codelabs/cloud-dataflow-starter/index.html?index=..%2F..%2Findex#0
- Recommendations ML - https://codelabs.developers.google.com/codelabs/cloud-accelerate-dataproc/index.html?index=..%2F..%2Findex#0
- Spark + OpenCV - https://codelabs.developers.google.com/codelabs/cloud-dataproc-opencv/index.html?index=..%2F..%2Findex
- Speech to Text - https://codelabs.developers.google.com/codelabs/cloud-speech-intro/index.html?index=..%2F..%2Findex#0
- Translate Text - https://codelabs.developers.google.com/codelabs/cloud-translation-intro/index.html?index=..%2F..%2Findex#0
Google Qwiklabs:
- Google Cloud Platform essentials - https://google.qwiklabs.com/quests/23?locale=en
- Scientific data processing - https://google.qwiklabs.com/quests/28?locale=en
- Data engineering - https://google.qwiklabs.com/quests/25?locale=en