Revision as of 22:34, 17 February 2018

Project Overview

The 2018 data project is an ongoing effort to figure out how to set up "painless" dashboards for monitoring metrics.

Stage 1: Collecting System Data

COMPLETED Phase 1a: Netdata

First, we set up Netdata.

Pros: Netdata has a fantastic dashboard with all kinds of stuff all ready to go.
Cons: Netdata is custom-built for monitoring compute nodes, and not for general visualization.
Netdata
Link to Netdata scripts: https://charlesreid1.com:3000/data/netdata

Netdata is a useful tool for remotely monitoring an individual machine instance, and it works very well.

NOPE Phase 1b: Prometheus

~~Second, we set up Netdata to dump to a Prometheus database.~~

Pros: Prometheus was fairly easy to integrate with Netdata.

Cons: Prometheus did not stand out as a tool, and we don't know much about how to use it.

Prometheus

~~Need to get more involved with Prometheus and/or Grafana to monitor more than one machine.~~

COMPLETED Phase 2a: MongoDB and MongoExpress

We then set up MongoDB and MongoExpress in Docker containers. MongoDB listens for incoming data on the VPN. MongoExpress is connected to MongoDB and exposes a web interface to interact with MongoDB. We used MongoDB to store edit history and page graph data from the charlesreid1 wiki.

Pros and cons:

Pros: MongoDB is a containerized solution with persistent data. MongoDB had (has?) a high setup barrier, but a low usage barrier. Very easy to do basic CRUD operations, make new databases as needed, etc.
Cons: No visualization tools baked in, need to define own tools. Collectd cannot dump to MongoDB because of a bunch of installation stupidity.

Links:

MongoDB
MongoExpress
Pywikibot
Link to MongoDB docker files: https://charlesreid1.com:3000/docker/d-mongodb
Link to MongoExpress docker files: https://charlesreid1.com:3000/docker/d-mongoexpress
Link to wiki scraping scripts: https://charlesreid1.com:3000/wiki/charlesreid1-wiki-data

NOPE Phase 2b: Collectd

~~We struggled a LOT with Collectd, mainly because we wanted to use the collectd plugin to write to MongoDB. Unfortunately, this was the only plugin that seemed impossible to install.~~

~~See Collectd page.~~

(This is all installation stupidity. I tried installing collectd with aptitude: no plugins. Then just the core: no plugins. Then I installed from source, and the MongoDB plugin did not work. I struggled to get collectd to link to MongoDB; it needed a custom config or something. Then I gave up and re-installed collectd core. The library was there, but collectd complained it couldn't find it. In the end, I totally abandoned the attempt to get collectd to talk to MongoDB. A collectd Docker container could probably fix this whole issue.)

NOPE Phase 3a: Graphite

~~Next, we deployed a Graphite container to hold time series from Collectd.~~

~~Pros and cons:~~

Pros: Containerized solution. Collectd graphite plugin worked fine.

Cons: Graphite's web interface (graphite-web, which ships alongside the Carbon ingest daemon) is utterly awful. It provides the absolute bare minimum, and it looks like it's trapped in a miserable 1998 computer prison.
~~Links:~~

~~Graphite~~

~~Link to Graphite docker files: https://charlesreid1.com:3000/docker/d-graphite~~

NOPE Phase 3b: Visualizing Graphite

A few years back we explored Cubism and Cube (difference?) as a way of visualizing time series from Graphite. It took some effort to get a basic dashboard, and Cubism is (ultimately) D3, the most frustratingly stupidly over-designed and over-complicated library ever, implemented in a totally irrational programming language.

~~So, no.~~

~~We're going to focus on Mongo, which is more transparent and more flexible for all purposes.~~

Stage 1 Conclusion: Netdata, Not Collectd

All of the struggle to get collectd working with mongo was a waste of effort, and led to the graphite distraction in the first place. A broken build procedure (collectd) led to an unknown, mediocre tool (graphite).

Ultimately, if we need to run collectd, we should interface with it through the collectd Python plugin: https://collectd.org/wiki/index.php/Plugin:Python

Stage 2: Finalizing Data Collection

Phase 4: Netdata and Mongo

Netdata provides a backend API that can be called to extract data from Netdata. MongoDB listens for API calls to store data in the database. All we need is software that will poll various Netdata instances using the Netdata API and dump that data into MongoDB. This gives much more fine-grained control over the process, schema, and storage format of the data.
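The polling step described above can be sketched roughly as follows. This is a minimal illustration, not the netdata_mongo.py script itself: the function names, the document layout (host, chart, time, values), and the example host URL are all assumptions made for this sketch. It relies on Netdata's /api/v1/data endpoint returning JSON with a "labels" list and a "data" list of rows, and on the target collection exposing insert_many(), as a pymongo collection does.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_url(base_url, chart, seconds=60):
    """Build a Netdata /api/v1/data URL for the last `seconds` of one chart."""
    query = urlencode({"chart": chart, "after": -seconds, "format": "json"})
    return f"{base_url.rstrip('/')}/api/v1/data?{query}"


def rows_to_documents(host, chart, payload):
    """Convert {"labels": [...], "data": [[...], ...]} into one dict per row.

    The resulting document layout is hypothetical, chosen for this sketch.
    """
    labels = payload["labels"]
    docs = []
    for row in payload["data"]:
        values = dict(zip(labels, row))
        docs.append({
            "host": host,
            "chart": chart,
            "time": values.pop("time"),  # Netdata puts the timestamp first
            "values": values,            # remaining dimensions, keyed by label
        })
    return docs


def poll_once(hosts, chart, collection, fetch=None):
    """Poll every Netdata host once and store the rows in `collection`.

    `hosts` maps a host name to its Netdata base URL. `collection` is any
    object with insert_many(), such as a pymongo collection. `fetch` can be
    swapped out for testing; by default it does an HTTP GET and parses JSON.
    """
    if fetch is None:
        fetch = lambda url: json.load(urlopen(url, timeout=10))
    inserted = 0
    for host, base_url in hosts.items():
        payload = fetch(build_url(base_url, chart))
        docs = rows_to_documents(host, chart, payload)
        if docs:  # insert_many() rejects an empty list in pymongo
            collection.insert_many(docs)
            inserted += len(docs)
    return inserted
```

To use this against a real setup, one would pass something like pymongo.MongoClient(...)["netdata"]["system_cpu"] as the collection and a dict such as {"jupiter": "http://10.0.0.1:19999"} as the hosts (both placeholders), then call poll_once() on a timer. Keeping the conversion in rows_to_documents() separate from the network call is what gives control over the schema and storage format.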

See Netdata#Database_Backends for info on netdata backends.

See Netdata/MongoDB/API for the script that calls the Netdata and MongoDB APIs to construct the time series database in MongoDB.

Link to script: https://charlesreid1.com:3000/data/netdata/src/master/netdata_mongo.py

File:NetdataMongodb.png

This is a (micro)service design pattern: small, lightweight, standalone daemons act as instruments that continuously take their readings and are available to be queried, but otherwise do not save or do anything with the data themselves. The data is handled by an application that queries each service it manages to collect data about those services (and coordinates them if necessary).

Stage 3: Visualizing Data

Phase 5: Grafana

The plan is to run a Grafana container and create dashboards from the data stored in MongoDB.

Link to Grafana docker files: https://charlesreid1.com:3000/docker/d-graphite

Need to fix grafana user on jupiter.

We're basically after something like this, except MongoDB instead of Prometheus: https://github.com/firehol/netdata/wiki/Netdata,-Prometheus,-and-Grafana-Stack

Next Steps

Bioconda and BioContainers: start adopting their approach to doing things, so that workflows can be replicated

Sling some cluster workflows around

Flags

@@ Line 1: / Line 1: @@
 =Project Overview=
-The 2018 data project is an ongoing effort to figure out how to set up "painless" dashboards.
+The 2018 data project is an ongoing effort to figure out how to set up "painless" dashboards for monitoring metrics.
-==Phase 1a: Netdata (Done)==
+==Stage 1: Collecting System Data==
+===COMPLETED Phase 1a: Netdata===
 First, we set up [[Netdata]].
@@ Line 13: / Line 15: @@
 Netdata is a useful tool for monitoring an individual machine instance remotely and it works excellent.
-==Phase 1b: Prometheus (Nope)==
+===NOPE Phase 1b: Prometheus===
 <s>Second, we set up [[Netdata]] to dump to a [[Prometheus]] database.
@@ Line 22: / Line 24: @@
 Need to get more involved with Prometheus and/or Grafana to monitor more than one machine.</s>
-==Phase 2: MongoDB and MongoExpress (Done)==
+===COMPLETED Phase 2a: MongoDB and MongoExpress===
 We then set up [[MongoDB]] and [[MongoExpress]] in Docker containers. MongoDB listens for incoming data on the VPN. MongoExpress is connected to MongoDB and exposes a web interface to interact with MongoDB. We used MongoDB to store edit history and page graph data from the charlesreid1 wiki.
+Pros and cons:
 * '''Pros:''' MongoDB is a containerized solution with persistent data. MongoDB had (has?) a high setup barrier, but a low usage barrier. Very easy to do basic CRUD operations, make new databases as needed, etc.
 * '''Cons:''' No visualization tools baked in, need to define own tools. Collectd cannot dump to MongoDB because of a bunch of installation stupidity.
+Links:
 * [[MongoDB]]
 * [[MongoExpress]]
@@ Line 34: / Line 40: @@
 * Link to wiki scraping scripts: https://charlesreid1.com:3000/wiki/charlesreid1-wiki-data
-==Phase 2b: Collectd (Nope)==
+===NOPE Phase 2b: Collectd===
 <s>We struggled a LOT with [[Collectd]], mainly because we wanted to use the collectd plugin to write to MongoDB. Unfortunately, this was the only plugin that seemed impossible to install.
@@ Line 42: / Line 48: @@
 (This is all installation stupidity. I tried installing collectd with aptitude, no plugins. Then the core, no plugins. Then installing from source, and MongoDB plugin did not work. Struggling to get collectd to link to MongoDB. Needed custom config or something. Then I just gave up, and re-installed collectd core, and the library was there, but it was complaining it couldn't find it. In the end, I totally abandoned the attempt to get collectd to talk to mongodb. Could probably use a collectd docker and fix this whole issue.)</s>
-==Phase 3: Graphite (Nope)==
+===NOPE Phase 3a: Graphite===
 <s>Next, we deployed a [[Graphite]] container to hold time series from [[Collectd]].
+Pros and cons:
 * '''Pros:''' Containerized solution. Collectd graphite plugin worked fine.
 * '''Cons:''' Graphite comes with Carbon (web interface), which is utter awful. It provides the absolute bare minimum, and it looks like it's trapped in a miserable 1998 computer prison.
+Links:
 * [[Graphite]]
 * Link to Graphite docker files: https://charlesreid1.com:3000/docker/d-graphite</s>
-==Phase 3b: Visualizing Graphite (Nope)==
+===NOPE Phase 3b: Visualizing Graphite===
 <s>A few years back we explored [[Cubism]] and Cube (difference?) as a way of visualizing time series from Graphite. It took some effort to get a basic dashboard, and Cubism is (ultimately) D3, the most frustratingly stupidly over-designed and over-complicated library ever, implemented in a totally irrational programming language.
@@ Line 58: / Line 68: @@
 We're going to focus on Mongo, which is more transparent and more flexible for all purposes.</s>
-==Conclusion: Netdata, Not Collectd==
+===Stage 1 Conclusion: Netdata, Not Collectd===
 All of the struggle to get collectd working with mongo was a waste of effort, and led to the graphite distraction in the first place. A broken build procedure (collectd) led to an unknown, mediocre tool (graphite).
 Ultimately, if we need to run collectd, interface with the collectd API via Python: https://collectd.org/wiki/index.php/Plugin:Python
+==Stage 2: Finalizing Data Collection==
 ==Phase 4: Netdata and Mongo==
@@ Line 78: / Line 90: @@
 This is a (micro)service design pattern - small, lightweight, standalone daemons act as instruments that continuously read whatever they read, available to be queried but otherwise not saving or doing anything with the data themselves. The data is handled by an application that queries each service it manages to collect data about those services (and coordinate if necessary).
-==Phase 5: Grafana==
+==Stage 3: Visualizing Data==
+===Phase 5: Grafana===
 [[Grafana]] container to create dashboards from it.

2018/Data Project: Difference between revisions
