Tuesday, 22 September 2020

How would you design an alert management system?

I would like to learn more about best practices for designing an alert management system, that is, a system which triggers a notification once a given matching condition is fulfilled.

Examples:

  • If the number of HTTP errors over the last 15 minutes is more than 10% of all requests, send me a Slack message
  • If the number of executions over the last hour is less than 1, send me an email
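
To make the two examples concrete, here is a minimal sketch of how such conditions could be written as plain predicates over counts aggregated for a time window. Everything in it is an assumption for illustration: `WindowStats`, `error_ratio_exceeded` and `too_few_executions` are hypothetical names, and the counts are assumed to have been aggregated for the right window (15 minutes or 1 hour) before the check runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowStats:
    """Counts already aggregated over one time window (hypothetical shape)."""
    total_requests: int
    http_errors: int
    executions: int


def error_ratio_exceeded(stats: WindowStats, threshold: float = 0.10) -> bool:
    """True when HTTP errors are more than `threshold` of all requests."""
    if stats.total_requests == 0:
        # No traffic in the window: treat as "no alert" rather than divide by zero.
        return False
    return stats.http_errors / stats.total_requests > threshold


def too_few_executions(stats: WindowStats, minimum: int = 1) -> bool:
    """True when fewer than `minimum` executions happened in the window."""
    return stats.executions < minimum
```

Keeping the condition separate from the notification channel (Slack, email) means the same predicate can be reused with different delivery targets.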

The design should be technology-agnostic, but references to open source components are welcome.

Base Requirements: (these are only clarifications; an answer does not have to follow them)

  • Model - policies and events. A policy defines a rule that fires when a specific condition is matched. An event records the exact point in time at which a given policy was triggered.
  • Storage - the place where policies and events are kept
  • Scheduling - a mechanism to schedule the tasks that check policies
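
As a rough illustration of the model described above, the sketch below represents a policy as a named condition with a window and a notification channel, and an event as the record produced when that policy fires. The names `Policy`, `Event` and `evaluate`, and the idea of passing metrics as a simple mapping, are assumptions made for this example only; storage and scheduling are deliberately left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Mapping, Optional


@dataclass(frozen=True)
class Policy:
    """A rule: fires when `condition` holds for metrics covering `window`."""
    name: str
    window: timedelta
    condition: Callable[[Mapping[str, float]], bool]
    channel: str  # e.g. "slack" or "email" (hypothetical values)


@dataclass(frozen=True)
class Event:
    """The exact point in time at which a policy was triggered."""
    policy_name: str
    triggered_at: datetime
    channel: str


def evaluate(policy: Policy, metrics: Mapping[str, float], now: datetime) -> Optional[Event]:
    """Return an Event if the policy's condition matches, otherwise None."""
    if policy.condition(metrics):
        return Event(policy_name=policy.name, triggered_at=now, channel=policy.channel)
    return None
```

A scheduler would then call `evaluate` for each policy at some interval, persist any returned event, and hand it to the notifier for the event's channel.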

Questions: (Some guiding questions)

  • How many base components would you define?
  • How would you model the base entities?
  • What kind of storage would you use?
  • How would you query the storage?
  • How would you design the scheduling mechanism?
  • What is the biggest challenge when the amount of data grows?

I am also looking for a solution that scales horizontally. Note: an answer does not have to strictly follow any of the guiding questions above.
