From charlesreid1

Latest revision as of 03:09, 9 October 2019

Notes from January 2018 data engineering work.

This consists of review (rebooting the data engineering material) and coding (identifying relevant scenarios for data-engineering-scenarios).

pages

Review page: Google Cloud/Review

Project 1: 2018/January/Data Engineering/Scientific Data Processing

Project 2: 2018/January/Data Engineering/Big Data Text Processing

Project 3: 2018/January/Data Engineering/Cosmos


procedure

Expanding data-engineering-scenarios:

  • Start with ready examples
  • Work toward synthetic experimental data
  • An imaginary factory... lots of widgets... Kubernetes/container engine... orchestrating a process
  • Focus on a particular process or set of processes, drill into it, and use it to provide multiple angles on a single concept
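
A rough sketch of what the synthetic factory data could look like, assuming a widget passes through a few stations and each pass produces one record. The station names, field names, and distribution parameters are all made up for illustration and are not taken from data-engineering-scenarios; the orchestration side (Kubernetes/container engine) is not covered here.

```python
import random
from dataclasses import dataclass


@dataclass
class WidgetReading:
    """One hypothetical record: a widget passing through one station."""
    widget_id: int
    station: str
    duration_s: float
    passed: bool


def make_readings(n_widgets, stations=("cut", "weld", "paint"), seed=0):
    """Generate one reading per widget per station, reproducibly."""
    rng = random.Random(seed)  # seeded so the same data can be regenerated
    readings = []
    for widget_id in range(n_widgets):
        for station in stations:
            # Assumed: durations roughly normal around 30 s, floored at 0.1 s
            duration = max(0.1, rng.gauss(30.0, 5.0))
            # Assumed: about 5% of passes fail inspection
            passed = rng.random() > 0.05
            readings.append(
                WidgetReading(widget_id, station, round(duration, 2), passed)
            )
    return readings


if __name__ == "__main__":
    for reading in make_readings(3)[:5]:
        print(reading)
```

Keeping the generator seeded means the same fabricated dataset can be reused across storage, database, and computation scenarios, which fits the idea of looking at one process from multiple angles.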

Software tools list, (abstract) example for each: Google Cloud

  • Storage/database/computation/GPUs vs CPUs/containerization

Software quality assurance: https://git.charlesreid1.com/charlesreid1/scientific-software

  • 10 Best
  • More informal
  • Bullet points - things I've learned
  • Apply style of later points to earlier points
  • GitHub page - 10 things
  • Clear out lorem ipsum (7-10)

links

links to notes

Notes review: GCDEC

  • 5 - GCDEC/Streaming/Notes

links to codelabs

Google Codelabs:

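Google Qwiklabs:

  • Scientific data processing - https://google.qwiklabs.com/quests/28?locale=en
  • Data engineering - https://google.qwiklabs.com/quests/25?locale=en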

Flags