Alerting and monitoring system
Existing systems
- #monitoring-unms/UISP
- Grafana/Prometheus
- public, set up 4 years ago: https://stats.nycmesh.net
- Mesh only, Omnis etc.: http://10.70.90.82:3000/dashboards
- support report generator
Requirements
- Must:
- alert the Slack team within 5 minutes of key infrastructure going offline
- Should:
- be easy to update for new equipment
- be easy to configure to notify new volunteers
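To make the "within 5 minutes" requirement concrete, here is a minimal sketch of the offline check and the Slack message body. It is an illustration only, not a chosen design: the device names, the `last_seen` mapping, and the 3-minute threshold (picked to leave headroom for delivery inside the 5-minute window) are all assumptions. The only external detail relied on is that a Slack incoming webhook accepts a JSON body with a `text` field.

```python
import json
from datetime import datetime, timedelta, timezone

# Assumption: flag a device after 3 minutes without a successful check,
# leaving headroom to deliver the alert inside the 5-minute requirement.
OFFLINE_THRESHOLD = timedelta(minutes=3)


def offline_devices(last_seen, now, threshold=OFFLINE_THRESHOLD):
    """Return sorted names of devices not seen for at least `threshold`."""
    return sorted(
        name for name, seen in last_seen.items() if now - seen >= threshold
    )


def slack_payload(names):
    """Build the JSON body for a Slack incoming webhook message."""
    return json.dumps({"text": "Offline: " + ", ".join(names)})


# Hypothetical device names and timestamps, for illustration only.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "hub-a": now - timedelta(minutes=1),
    "hub-b": now - timedelta(minutes=7),
}

down = offline_devices(last_seen, now)
if down:
    # A real deployment would POST this body to the webhook URL.
    print(slack_payload(down))
```

Keeping the list of key infrastructure in a single mapping like `last_seen` is one way to meet the "easy to update for new equipment" goal, since adding a device is a one-line change.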
Questions
- frequency? ~1 point/hour
Proposed software
Log
- prompted by this Slack discussion on Grand St. outage