From charlesreid1

 
=Project Overview=


The 2018 data project is an ongoing effort to figure out how to set up "painless" dashboards for monitoring metrics.


==Stage 1: Collecting System Data==

See [[2018/Data Project/Stage 1]]

==Phase 1: Netdata and Prometheus==

First, we set up [[Netdata]] to dump to a [[Prometheus]] database.
* '''Pros:''' Netdata has a fantastic dashboard with all kinds of stuff ready to go. Prometheus was fairly easy to integrate with Netdata.
* '''Cons:''' Netdata is custom-built for monitoring compute nodes, not for general visualization. Prometheus did not stand out as a tool, and we don't yet know much about how to use it.
* [[Netdata]]
* [[Prometheus]]

Netdata is a useful tool for monitoring an individual machine instance remotely. Monitoring more than one machine will require getting more involved with Prometheus and/or Grafana.

==Phase 2: MongoDB and MongoExpress==

We then set up [[MongoDB]] and [[MongoExpress]] in Docker containers. MongoDB listens for incoming data on the VPN. MongoExpress is connected to MongoDB and exposes a web interface to interact with MongoDB. We used MongoDB to store edit history and page graph data from the charlesreid1 wiki.
* '''Pros:''' MongoDB is a containerized solution with persistent data. MongoDB had (has?) a high setup barrier, but a low usage barrier. It is very easy to do basic CRUD operations, make new databases as needed, etc.
* '''Cons:''' No visualization tools are baked in, so we need to define our own. Collectd cannot dump to MongoDB because of installation problems with its MongoDB plugin.
* [[MongoDB]]
* [[MongoExpress]]
* [[Pywikibot]]
* Link to MongoDB docker files: https://charlesreid1.com:3000/docker/d-mongodb
* Link to MongoExpress docker files: https://charlesreid1.com:3000/docker/d-mongoexpress
* Link to wiki scraping scripts: https://charlesreid1.com:3000/wiki/charlesreid1-wiki-data

==Phase 2b: Collectd==

We struggled a LOT with [[Collectd]], mainly because we wanted to use the collectd plugin to write to MongoDB. Unfortunately, this was the only plugin that seemed impossible to install.

See [[Collectd]] page.

(This was all installation trouble. I tried installing collectd with aptitude: no plugins. Then the core: no plugins. Then I installed from source, and the MongoDB plugin did not work. I struggled to get collectd to link to MongoDB; it needed a custom config or something. Then I just gave up and re-installed the collectd core, and the library was there, but collectd complained it couldn't find it. In the end, I totally abandoned the attempt to get collectd to talk to MongoDB. A collectd Docker container could probably fix this whole issue.)

==Phase 3: Graphite and Grafana==

Next, we deployed a [[Graphite]] container to hold time series and a [[Grafana]] container to create dashboards from it.
* '''Pros:''' Containerized solution, like MongoDB
* Link to Graphite docker files: https://charlesreid1.com:3000/docker/d-graphite
* Link to Grafana docker files: https://charlesreid1.com:3000/docker/d-grafana

==Stage 2: Spy==

dahak-spy project:
* lightweight server (may want larger disk, okay if non-free)
* running mongodb
* running mongoexpress
* running prometheus
* running grafana
* running netdata

additional components before real world testing:
* netdata on the build node
* netdata python plugin from another process, monitoring.....???
* metrics:
** is snakemake running (binary yes/no)
** current stage of snakemake (adjust snakemake file to write into a dotfile)
** cpu/memory/network/disk io

netdata python plugin workflow?
* does it need to be installed and netdata restarted, or can it push data into netdata?

real yeti:
* get a yeti node
* debug the snakemake file one step at a time using already-downloaded files (faster step)
* let the snakemake file run with netdata and friends running

=Flags=
[[Category:January 2018]]
[[Category:February 2018]]
<!--
==Stage 2: Finalized Data Collection System==
===Phase 4: Netdata and Mongo===
Netdata provides a backend API that can be called to extract data from Netdata. MongoDB listens for API calls to store data in the database. All we need is software that will poll various Netdata instances using the Netdata API and dump that data into MongoDB. This gives much more fine-grained control over the process, schema, and storage format of the data.
See [[Netdata#Database_Backends]] for info on netdata backends.
See [[Netdata/MongoDB/API]] for script that calls APIs of Netdata and MongoDB to construct the time series database in MongoDB.
Link to script: https://charlesreid1.com:3000/data/netdata/src/master/netdata_mongo.py
[[Image:NetdataMongodb.png|500px]]
This is a (micro)service design pattern - small, lightweight, standalone daemons act as instruments that continuously read whatever they read, available to be queried but otherwise not saving or doing anything with the data themselves. The data is handled by an application that queries each service it manages to collect data about those services (and coordinate if necessary).
-->
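
The polling design described in the commented-out Phase 4 notes can be sketched roughly as follows. This is not the <code>netdata_mongo.py</code> script linked in those notes, only an illustration of the pattern. It assumes the Netdata <code>/api/v1/data</code> endpoint with <code>format=json</code>, which returns a <code>labels</code> list and a <code>data</code> list of rows, and it assumes the third-party <code>pymongo</code> package for storage. The database name, collection name, document schema, and host label are all hypothetical.

```python
import json
import urllib.request
from urllib.parse import urlencode


def fetch_chart(base_url: str, chart: str, seconds: int = 60) -> dict:
    """Ask one Netdata instance for the last `seconds` of data for one chart."""
    query = urlencode({"chart": chart, "after": -seconds, "format": "json"})
    with urllib.request.urlopen(f"{base_url}/api/v1/data?{query}", timeout=10) as resp:
        return json.load(resp)


def rows_to_documents(host: str, chart: str, payload: dict) -> list:
    """Turn a Netdata JSON payload into one MongoDB document per timestamp."""
    # The first label is the time column; the rest are dimension names.
    # Dots in dimension names are replaced so they are safe as MongoDB field names.
    names = [name.replace(".", "_") for name in payload["labels"][1:]]
    docs = []
    for row in payload["data"]:
        docs.append({
            "host": host,
            "chart": chart,
            "time": row[0],
            "values": dict(zip(names, row[1:])),
        })
    return docs


def store(docs: list, mongo_uri: str = "mongodb://localhost:27017",
          db: str = "netdata", collection: str = "timeseries") -> None:
    """Insert the documents into MongoDB (database and collection names are made up)."""
    from pymongo import MongoClient  # third-party; imported here so the rest runs without it
    if docs:  # insert_many rejects an empty list
        MongoClient(mongo_uri)[db][collection].insert_many(docs)
```

A poller would then loop over its Netdata instances and call something like <code>store(rows_to_documents(host, "system.cpu", fetch_chart(url, "system.cpu")))</code> on a timer. Keeping the conversion step separate from fetching and storing is what gives control over the schema and storage format.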
<!--
==Stage 3: Visualizing Data==
===Phase 5: Grafana===
A [[Grafana]] container creates dashboards from the collected data.
Link to Grafana docker files: https://charlesreid1.com:3000/docker/d-grafana
Need to fix grafana user on jupiter.
We're basically after something like this: https://github.com/firehol/netdata/wiki/Netdata,-Prometheus,-and-Grafana-Stack
-->

Latest revision as of 08:18, 3 March 2018
