A How-To Guide on Modern Monitoring and Alerting

At our organization (CCIN2P3), we are building an event-based infrastructure that pushes structured messages to different subsystems for alerting, reporting, and storage. Using syslog-ng, each message is normalized into a structured event, optionally correlated with other messages, and conditionally routed to downstream systems, including:

  • a synchronous web dashboard,
  • different asynchronous alerting systems, and
  • a searchable storage backend.
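
The normalize-then-route flow described above can be sketched in a few lines of Python. This is only an illustration, not the syslog-ng configuration itself: the line layout, the `Event` fields, the "error" tagging rule, and the destination names are all assumptions made for the example, and the correlation step is left out.

```python
import re
from dataclasses import dataclass, field

# Hypothetical line layout: "<host> <program>[<pid>]: <message>"
LINE_RE = re.compile(
    r"^(?P<host>\S+) (?P<program>[^\[\s:]+)(?:\[(?P<pid>\d+)\])?: (?P<message>.*)$"
)


@dataclass
class Event:
    """A structured event built from one raw log line (illustrative fields)."""
    host: str
    program: str
    message: str
    tags: set = field(default_factory=set)


def normalize(line):
    """Turn a raw log line into an Event, or return None if it does not parse."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    event = Event(m["host"], m["program"], m["message"])
    # Illustrative classification rule, not taken from any real rule set.
    if "error" in event.message.lower():
        event.tags.add("error")
    return event


def route(event):
    """Return the names of the downstream systems that should receive the event."""
    destinations = ["storage"]  # assumption: every event is kept searchable
    if "error" in event.tags:
        destinations += ["dashboard", "alerting"]
    return destinations
```

For instance, `route(normalize("node01 sshd[1234]: error: PAM failure"))` would send the event to all three destinations, while a routine cron message would only reach storage.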

The collected events are essentially system and application logs. Here are a few examples of interesting messages: