From charlesreid1

Notes from January 2018 data engineering work.

This consists of review (rebooting the data engineering stuff) and coding (identifying relevant data engineering scenarios).
===pages===
Review page: [[Google Cloud/Review]]

Project 1: [[2018/January/Data Engineering/Scientific Data Processing]]

Project 2: [[2018/January/Data Engineering/Big Data Text Processing]]

Project 3: [[2018/January/Data Engineering/Cosmos]]


===procedure===
 
Expanding data-engineering-scenarios:
* Start with ready examples
* Work toward synthetic experimental data
* An imaginary factory... lots of widgets... Kubernetes/container engine... orchestrating a process
* Focus on a particular process or set of processes, and drill into it, use it to provide multiple angles on a single concept
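
As one possible starting point for the synthetic experimental data idea, here is a minimal sketch of an imaginary widget factory. Everything in it is an assumption made for illustration: the station names, the temperature distribution, the quality threshold, and the names <code>WidgetReading</code>, <code>generate_readings</code>, and <code>failure_rate_by_station</code> are all hypothetical and do not come from the data-engineering-scenarios code.

```python
import random
from dataclasses import dataclass


@dataclass
class WidgetReading:
    """One synthetic measurement of a widget at a factory station (hypothetical schema)."""
    widget_id: int
    station: str
    temperature_c: float
    passed_qc: bool


def generate_readings(n_widgets, stations=("press", "paint", "pack"), seed=0):
    """Generate one reading per widget per station, reproducibly from a seed."""
    rng = random.Random(seed)  # private generator so results do not depend on global state
    readings = []
    for widget_id in range(n_widgets):
        for station in stations:
            # Assumed process model: temperatures scatter around 70 C
            temperature = round(rng.gauss(70.0, 5.0), 2)
            # Assumed quality rule: a widget fails if the station ran at 80 C or hotter
            readings.append(WidgetReading(widget_id, station, temperature, temperature < 80.0))
    return readings


def failure_rate_by_station(readings):
    """Return the fraction of failed readings for each station."""
    totals = {}
    failures = {}
    for reading in readings:
        totals[reading.station] = totals.get(reading.station, 0) + 1
        if not reading.passed_qc:
            failures[reading.station] = failures.get(reading.station, 0) + 1
    return {station: failures.get(station, 0) / totals[station] for station in totals}


if __name__ == "__main__":
    data = generate_readings(1000)
    for station, rate in failure_rate_by_station(data).items():
        print(f"{station}: {rate:.3f}")
```

Fixing the seed keeps the fabricated data reproducible, so the same widgets can be pushed through storage, database, and computation examples and looked at from several angles.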
 
Software tools list, with an (abstract) example for each: [[Google Cloud]]
* Storage/database/computation/GPUs vs CPUs/containerization
 
Software quality assurance: https://git.charlesreid1.com/charlesreid1/scientific-software
* [[10 Best]]
* More informal
* Bullet points - things I've learned
* Apply style of later points to earlier points
* GitHub page - 10 things
* Clear out lorem ipsum (7-10)
 
===links===
 
====links to notes====
 
Notes review: GCDEC
* Case study - [[Google Cloud/Case Study]]
* 1 - [[GCDEC/Fundamentals/Notes]]
* 2 - [[GCDEC/Unstructured_Data/Notes]]
* 3a - [[GCDEC/BigQuery/Notes]]
* 3b - [[GCDEC/Dataflow/Notes]]
* 4a - [[GCDEC/Building_Tensorflow/Notes]]
* 4b - [[GCDEC/Deploying_Tensorflow/Notes]]
* 4c - [[GCDEC/Engineering_Tensorflow/Notes]]
* 5 - [[GCDEC/Streaming/Notes]]
 
====links to codelabs====
 
Google Codelabs:
* Main link - https://codelabs.developers.google.com/
* Kubernetes and Container Engine - https://codelabs.developers.google.com/codelabs/cloud-compute-kubernetes/index.html?index=..%2F..%2Findex#0
* Process Astronomy Data to Generate Images - https://codelabs.developers.google.com/codelabs/cloud-compute-the-cosmos/index.html?index=..%2F..%2Findex#0
* Kubernetes for Java apps - https://codelabs.developers.google.com/codelabs/cloud-springboot-kubernetes/index.html?index=..%2F..%2Findex#0
* Google Cloud Storage - https://codelabs.developers.google.com/codelabs/es003l-storage/index.html?index=..%2F..%2Findex
* Campaign finance with BigQuery - https://codelabs.developers.google.com/codelabs/cloud-bq-campaign-finance/index.html?index=..%2F..%2Findex#0
* Text processing with big data - https://codelabs.developers.google.com/codelabs/cloud-dataflow-starter/index.html?index=..%2F..%2Findex#0
* Recommendations ML - https://codelabs.developers.google.com/codelabs/cloud-accelerate-dataproc/index.html?index=..%2F..%2Findex#0
* Spark + OpenCV - https://codelabs.developers.google.com/codelabs/cloud-dataproc-opencv/index.html?index=..%2F..%2Findex
* Speech to Text - https://codelabs.developers.google.com/codelabs/cloud-speech-intro/index.html?index=..%2F..%2Findex#0
* Translate Text - https://codelabs.developers.google.com/codelabs/cloud-translation-intro/index.html?index=..%2F..%2Findex#0
 
Google Qwiklabs:
* Google Cloud Platform essentials - https://google.qwiklabs.com/quests/23?locale=en
* Scientific data processing - https://google.qwiklabs.com/quests/28?locale=en
* Data engineering - https://google.qwiklabs.com/quests/25?locale=en


===Flags===


[[Category:2018]]
[[Category:January 2018]]
[[Category:Data Engineering]]

Latest revision as of 03:09, 9 October 2019

