Commit graph

84 commits

Author SHA1 Message Date
Alexandre Iooss 9e4b8c2509 prometheus: remove ipmi target
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2022-01-01 17:15:11 +01:00
Alexandre Iooss a24b473566 prometheus: reduce iLO SNMP timeout
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2022-01-01 16:45:32 +01:00
Alexandre Iooss 70c8e0ebe0 prometheus: monitor iLO resilient mem and battery 2022-01-01 16:45:10 +01:00
Alexandre Iooss 5ab3dcdac2 prometheus: use enums for iLO SNMP
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2022-01-01 12:04:01 +01:00
Alexandre Iooss 40d9108b37 prometheus: add iLO alert rules
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-12-31 21:26:27 +01:00
Alexandre Iooss 733e9f555d prometheus: add _snmp suffix to ilo target 2021-12-31 20:03:04 +01:00
Alexandre Iooss bcded46ed6 prometheus: remove JSON targets cleanup
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-12-31 19:40:22 +01:00
Alexandre Iooss fdeaa355ad prometheus: use longer timeout for iLO scraping 2021-12-31 19:39:23 +01:00
Alexandre Iooss 8c7031d059 prometheus: add iLO SNMP target
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-12-31 18:31:58 +01:00
Alexandre Iooss 50d9282316 prometheus: show failing job when machine is down
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-12-31 17:26:26 +01:00
Alexandre Iooss 265bd5fbb7 prometheus: use static targets
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-12-31 15:08:44 +01:00
Alexandre Iooss 944e200394 prometheus: add ipmi job
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-12-31 14:45:05 +01:00
Alexandre Iooss f50778ca96 prometheus: commit production alert configuration 2021-12-31 14:44:50 +01:00
Alexandre Iooss bc4dc03029 prometheus: add newline at the end of targets JSON 2021-12-31 14:44:19 +01:00
Alexandre Iooss cc2ba9ff7b prometheus: deploy targets_ipmi.json 2021-12-31 14:43:47 +01:00
Alexandre Iooss ce04f937db prometheus: call update_motd role in play 2021-11-27 19:20:32 +01:00
pz2891 cc3b4294ae Keep federated data for 4 months (120 days)
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-08-20 19:31:04 +02:00
pz2891 0bfc631465 Remove unused files
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-08-20 17:00:19 +02:00
pz2891 c5e6fbcfdf Configuration for monitoring APC PDU 2021-08-20 16:58:28 +02:00
pz2891 54b073bd02 Typo in unhealthy disk rule 2021-08-18 18:53:27 +02:00
pz2891 e6b6790f63 New rule for unhealthy disks 2021-08-13 15:24:12 +02:00
jeltz 4f2f0ffe64 Increase swap alert threshold 2021-05-19 15:32:33 +02:00
otthorn 3a600d9061 Give a name to unnamed tasks 2021-04-17 17:43:49 +02:00
pz2891 11d0b46ef0 Remove port for docker instances. Remove 'remove old files' tasks 2021-04-14 20:00:16 +02:00
pz2891 013743f910 typo in docker rules
Some checks failed
continuous-integration/drone/push Build is failing
continuous-integration/drone/pr Build is failing
2021-04-14 19:54:37 +02:00
pz2891 1b0bff4c51 Fix deployment and add prometheus groups for hosts
Some checks failed
continuous-integration/drone/push Build is failing
continuous-integration/drone/pr Build is failing
2021-04-14 19:51:47 +02:00
pz2891 fde52f2e42 Alerts repository owned by prometheus
Some checks failed
continuous-integration/drone/push Build is failing
continuous-integration/drone/pr Build is failing
2021-04-14 19:29:12 +02:00
pz2891 e4d2416722 fix typo 2021-04-14 19:27:13 +02:00
pz2891 226b55b0d1 Update alerts (remove instance, translations) 2021-04-14 19:10:42 +02:00
pz2891 5d9a6599e8 Fix some typos, in accordance with Solal's comments 2021-04-12 11:10:15 +02:00
pz2891 3320e3e0c6 Update the labels for the alert (make complete tenses) 2021-04-12 11:01:43 +02:00
pz2891 676cc716cf Modify label for the alert 2021-04-12 11:00:31 +02:00
pz2891 954e3e0892 End of yaml file (bad copy/paste) 2021-04-12 10:58:59 +02:00
pz2891 1908deee9c fix CI
Some checks failed
continuous-integration/drone/push Build is failing
2021-04-12 10:01:39 +02:00
pz2891 6c64bb214c fix CI
Some checks failed
continuous-integration/drone/push Build is failing
2021-04-11 22:01:21 +02:00
pz2891 c48fe1ae17 7% rollback for the warning 2021-04-11 20:57:53 +02:00
pz2891 9d18ebb7f1 Fix docker rules
Some checks failed
continuous-integration/drone/push Build is failing
2021-04-11 17:18:32 +02:00
pz2891 6775d9ecde Add docker rules 2021-04-11 16:43:34 +02:00
pz2891 9ebdf15bb9 Split alerts into several files 2021-04-11 15:58:35 +02:00
pz2891 dd48302585 Configure Prometheus and Prometheus federate to scrape Postgres Exporter
Some checks failed
continuous-integration/drone/push Build is failing
2021-04-10 18:01:55 +02:00
jeltz 6b2bc60589 Merge branch 'master' into add_rives_vm_master
Some checks failed
continuous-integration/drone/push Build is failing
continuous-integration/drone/pr Build is failing
2021-04-06 19:37:57 +02:00
jeltz 91817b324c Increase the alert threshold for temperatures
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-04-03 08:04:10 +02:00
jeltz 1c3127dbbe Add more node-exporter alerts
Source: https://awesome-prometheus-alerts.grep.to/rules.html
All checks were successful
continuous-integration/drone/push Build is passing
2021-04-02 22:55:51 +02:00
jeltz f80435cb31 Differentiate alerts for servers and Wi-Fi APs
All checks were successful
continuous-integration/drone/push Build is passing
2021-04-02 21:54:38 +02:00
jeltz 06f101527d Use a dynamic interval for UPS output voltage alerts
All checks were successful
continuous-integration/drone/push Build is passing
2021-04-02 13:57:34 +02:00
jeltz 83f5b35e59 Fix a filename typo
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-04-01 18:24:21 +02:00
jeltz 35286a661a Change an alert description 2021-04-01 18:24:03 +02:00
jeltz 11335a6077 Fix typo in alert description
All checks were successful
continuous-integration/drone/push Build is passing
2021-04-01 18:15:22 +02:00
jeltz bc35cd8e90 Move templates of the prometheus role 2021-04-01 09:40:22 +02:00
jeltz 5bcc428895 Remove 'instance' from description and fix typos 2021-04-01 09:36:11 +02:00