Commit graph

657 commits

Author SHA1 Message Date
Alexandre Iooss
70c8e0ebe0 prometheus: monitor iLO resilient mem and battery 2022-01-01 16:45:10 +01:00
Alexandre Iooss
5ab3dcdac2 prometheus: use enums for iLO SNMP
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2022-01-01 12:04:01 +01:00
Alexandre Iooss
40d9108b37 prometheus: add iLO alert rules
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-12-31 21:26:27 +01:00
Alexandre Iooss
2830558545 prometheus_federation: add ilo_snmp and remove django
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-12-31 20:04:24 +01:00
Alexandre Iooss
733e9f555d prometheus: add _snmp suffix to ilo target 2021-12-31 20:03:04 +01:00
Alexandre Iooss
bcded46ed6 prometheus: remove JSON targets cleanup
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-12-31 19:40:22 +01:00
Alexandre Iooss
860a26a8dc prometheus: federate ilo metrics
All checks were successful
continuous-integration/drone/push Build is passing
2021-12-31 19:39:38 +01:00
Alexandre Iooss
fdeaa355ad prometheus: use longer timeout for iLO scraping 2021-12-31 19:39:23 +01:00
Alexandre Iooss
8c7031d059 prometheus: add iLO SNMP target
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-12-31 18:31:58 +01:00
Alexandre Iooss
50d9282316 prometheus: show failing job when machine is down
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-12-31 17:26:26 +01:00
Alexandre Iooss
265bd5fbb7 prometheus: use static targets
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-12-31 15:08:44 +01:00
Alexandre Iooss
944e200394 prometheus: add ipmi job
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-12-31 14:45:05 +01:00
Alexandre Iooss
f50778ca96 prometheus: commit production alert configuration 2021-12-31 14:44:50 +01:00
Alexandre Iooss
bc4dc03029 prometheus: add newline at the end of targets JSON 2021-12-31 14:44:19 +01:00
Alexandre Iooss
cc2ba9ff7b prometheus: deploy targets_ipmi.json 2021-12-31 14:43:47 +01:00
1b9fc70649 Merge branch 'bashrc_root'
All checks were successful
continuous-integration/drone/push Build is passing
2021-12-16 05:56:57 +01:00
8dca876bbc Add a very simple bashrc for root
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-12-16 05:48:39 +01:00
515222f404 Merge pull request 'Fix SSH CA deployment' (#86) from use_ssh_ca into master
All checks were successful
continuous-integration/drone/push Build is passing
Reviewed-on: #86
2021-12-15 17:31:29 +01:00
2f3612fd8e Deploy SSH CA everywhere and set root password
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-12-15 17:15:57 +01:00
7db282fffb Fix sshd.service → ssh.service 2021-12-15 16:17:11 +01:00
11937776c8 Merge branch 'master' into borgmatic_hourly
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-12-14 07:58:13 +01:00
e6363e9668 Use the Users CA for authentication
Some checks failed
continuous-integration/drone/push Build is failing
2021-12-12 05:56:54 +01:00
a56cea369c Remove 'dokuwiki' role 2021-11-28 11:17:47 +01:00
Alexandre Iooss
3c85a2bfb2 passbolt: remove role 2021-11-28 11:13:34 +01:00
Alexandre Iooss
fd0cb811a7 postgres: fix defaults 2021-11-28 11:07:13 +01:00
Alexandre Iooss
4bd431f9c3 postgresql: variables in dict 2021-11-28 11:01:29 +01:00
Alexandre Iooss
a818fd8ed9 Rename postgresql_server to postgresql 2021-11-28 10:20:17 +01:00
Alexandre Iooss
0979370418 Add motd for most plays 2021-11-27 22:16:29 +01:00
Alexandre Iooss
14b6a68040 base: configure motd 2021-11-27 20:05:14 +01:00
Alexandre Iooss
cc6f96bbc8 borgbackup-client: call update_motd role in play 2021-11-27 20:04:05 +01:00
Alexandre Iooss
07a0429ae0 nginx: call update_motd role in play 2021-11-27 20:02:08 +01:00
Alexandre Iooss
ce04f937db prometheus: call update_motd role in play 2021-11-27 19:20:32 +01:00
Alexandre Iooss
1009298023 borgbackup_server: call update_motd role in play 2021-11-27 19:16:24 +01:00
Alexandre Iooss
ea394a01db prometheus-federate: call update_motd role in play 2021-11-27 19:16:11 +01:00
Alexandre Iooss
b82afd13d9 update_motd: use update_motd dict 2021-11-27 19:14:39 +01:00
Alexandre Iooss
a791cda652 grafana: move Aurore specific variables out of the role 2021-11-27 18:29:05 +01:00
Alexandre Iooss
fdfed1a05a grafana: remove trailing lines 2021-11-27 18:17:57 +01:00
Alexandre Iooss
e2acfd4031 grafana: single quote LDAP password 2021-11-27 18:17:57 +01:00
Alexandre Iooss
c7f94b54c8 grafana: validate gpg key 2021-11-27 18:17:57 +01:00
Alexandre Iooss
aba0370c5b Add grafana playbook and machine 2021-11-27 18:17:57 +01:00
Alexandre Iooss
3efc8179bc logrotate: restore Debian formatting 2021-11-22 18:08:25 +01:00
Alexandre Iooss
3a56439fac update_motd: remove become true 2021-11-22 18:03:09 +01:00
Alexandre Iooss
94b8f37302 rsyslog_common: remove become true 2021-11-22 18:02:53 +01:00
Alexandre Iooss
1392e3fe64 Remove cached motd 2021-11-22 18:01:21 +01:00
Alexandre Iooss
11b3738fcd ldap_client: Add one extra line to follow Debian 2021-11-22 18:00:57 +01:00
8b54121a87 Install prometheus-node-exporter-collectors 2021-09-24 01:41:01 +02:00
5d3d965112 the service does not need to be enabled 2021-09-23 19:02:26 +02:00
73e522f0c6 add exporter on bullseye 2021-09-23 18:54:06 +02:00
b31f9bd952 Retention time is now a file that will be copied
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-08-21 11:25:39 +02:00
cc3b4294ae Keep federated data 4 months (120 days)
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-08-20 19:31:04 +02:00
f17e7f7524 Add snmp pdu password to generate config 2021-08-20 18:22:00 +02:00
0bfc631465 Remove unused files
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-08-20 17:00:19 +02:00
c5e6fbcfdf Configuration for monitoring APC PDU 2021-08-20 16:58:28 +02:00
54b073bd02 Typo in unhealthy disk rule 2021-08-18 18:53:27 +02:00
e6b6790f63 New rule for unhealthy disks 2021-08-13 15:24:12 +02:00
b7ead19d50 Remove mail from re2o bug report
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-05-25 23:11:30 +02:00
bb97bca456 Increase RandomizedDelaySec when hourly = 0
Some checks reported errors
continuous-integration/drone/push Build is failing
continuous-integration/drone/pr Build encountered an error
2021-05-23 14:09:01 +02:00
9296a2ed91 Add caradoc.adm.auro.re
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-05-23 14:02:20 +02:00
4f2f0ffe64 Increase swap alert threshold 2021-05-19 15:32:33 +02:00
c8a877282f Add 9 & 10 for Debian distribution
Some checks failed
continuous-integration/drone/push Build is failing
2021-05-19 15:29:40 +02:00
c6b768e1bb Don't run borgmatic every hour if not needed
Some checks failed
continuous-integration/drone/push Build is failing
2021-05-10 13:02:45 +02:00
ceaf75f0ad Merge pull request 'Use a disk assisted queue for rsyslog' (#56) from rsyslog_queues into master
All checks were successful
continuous-integration/drone/push Build is passing
Reviewed-on: Aurore/ansible#56
2021-05-04 00:54:40 +02:00
b29e9c0e45 Configure a disk-assisted queue for output actions 2021-04-30 16:49:00 +02:00
3a600d9061 Give a name to unnamed tasks 2021-04-17 17:43:49 +02:00
11d0b46ef0 Remove port for docker instances. Remove 'remove old files' tasks 2021-04-14 20:00:16 +02:00
013743f910 typo in docker rules
Some checks failed
continuous-integration/drone/push Build is failing
continuous-integration/drone/pr Build is failing
2021-04-14 19:54:37 +02:00
1b0bff4c51 Fix deployment and add prometheus groups for hosts
Some checks failed
continuous-integration/drone/push Build is failing
continuous-integration/drone/pr Build is failing
2021-04-14 19:51:47 +02:00
fde52f2e42 Alerts repository owned by prometheus
Some checks failed
continuous-integration/drone/push Build is failing
continuous-integration/drone/pr Build is failing
2021-04-14 19:29:12 +02:00
e4d2416722 fix typo 2021-04-14 19:27:13 +02:00
226b55b0d1 Update alerts (remove instance, translations) 2021-04-14 19:10:42 +02:00
fd5ad8d5ac Merge branch 'prometheus_postgres_exporter' of https://gitea.auro.re/Aurore/ansible into prometheus_postgres_exporter
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-04-12 11:10:31 +02:00
5d9a6599e8 Fix some typos, in accordance to Solal's comments 2021-04-12 11:10:15 +02:00
3320e3e0c6 Update the labels for the alert (make complete tenses) 2021-04-12 11:01:43 +02:00
676cc716cf Modify label for the alert 2021-04-12 11:00:31 +02:00
954e3e0892 End of yaml file (bad copy/paste) 2021-04-12 10:58:59 +02:00
pz2891
8c666151d6 Merge branch 'master' into prometheus_postgres_exporter
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-04-12 10:10:17 +02:00
1908deee9c fix CI
Some checks failed
continuous-integration/drone/push Build is failing
2021-04-12 10:01:39 +02:00
e2b1f8eae5 Allow root to connect using peer authentication
Some checks failed
continuous-integration/drone/push Build is failing
continuous-integration/drone/pr Build is failing
2021-04-11 22:08:11 +02:00
6c64bb214c fix CI
Some checks failed
continuous-integration/drone/push Build is failing
2021-04-11 22:01:21 +02:00
764f0f106d Install postgres exporter when it is bullseye or buster
Some checks failed
continuous-integration/drone/push Build is failing
2021-04-11 21:38:29 +02:00
c48fe1ae17 7% rollback for the warning 2021-04-11 20:57:53 +02:00
304437da97 Remove .save file 2021-04-11 20:56:40 +02:00
9d18ebb7f1 Fix docker rules
Some checks failed
continuous-integration/drone/push Build is failing
2021-04-11 17:18:32 +02:00
6775d9ecde Add docker rules 2021-04-11 16:43:34 +02:00
9ebdf15bb9 Split alerts into several files 2021-04-11 15:58:35 +02:00
dd48302585 Configure Prometheus and Prometheus federate to scrape Postgres Exporter
Some checks failed
continuous-integration/drone/push Build is failing
2021-04-10 18:01:55 +02:00
45041be2ab Install postgres exporter 2021-04-10 17:29:50 +02:00
jeltz
6b2bc60589 Merge branch 'master' into add_rives_vm_master
Some checks failed
continuous-integration/drone/push Build is failing
continuous-integration/drone/pr Build is failing
2021-04-06 19:37:57 +02:00
91817b324c Increase the alert threshold for temperatures
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-04-03 08:04:10 +02:00
1c3127dbbe Add more node-exporter alerts
All checks were successful
continuous-integration/drone/push Build is passing
Source: https://awesome-prometheus-alerts.grep.to/rules.html
2021-04-02 22:55:51 +02:00
f80435cb31 Differentiate alerts for servers and Wi-Fi APs
All checks were successful
continuous-integration/drone/push Build is passing
2021-04-02 21:54:38 +02:00
06f101527d Use a dynamic interval for UPS output voltage alerts
All checks were successful
continuous-integration/drone/push Build is passing
2021-04-02 13:57:34 +02:00
83f5b35e59 Fix a filename typo
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-04-01 18:24:21 +02:00
35286a661a Change an alert description 2021-04-01 18:24:03 +02:00
11335a6077 Fix typo in alert description
All checks were successful
continuous-integration/drone/push Build is passing
2021-04-01 18:15:22 +02:00
083fc4da9a Fix permissions on prometheus.yml 2021-04-01 18:15:09 +02:00
a743ce09fb Move templates of the prometheus_federate role
All checks were successful
continuous-integration/drone/push Build is passing
2021-04-01 09:42:54 +02:00
bc35cd8e90 Move templates of the prometheus role 2021-04-01 09:40:22 +02:00
5bcc428895 Remove 'instance' from description and fix typos 2021-04-01 09:36:11 +02:00
eeaf0f8486 Fix syntax errors
All checks were successful
continuous-integration/drone/push Build is passing
2021-04-01 06:02:40 +02:00
e247aa3f70 Uniform labels for alerts 2021-04-01 05:21:08 +02:00
jeltz
424aa80d8f Merge pull request 'Use update_motd everywhere' (#44) from use_update_motd_everywhere into master
All checks were successful
continuous-integration/drone/push Build is passing
Reviewed-on: Aurore/ansible#44
2021-03-30 10:12:14 +02:00
ac05da7173 Use update_motd everywhere
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-03-30 10:08:21 +02:00
dff0d9922c Store log.adm.auro.re local logs in /var/log/remote 2021-03-30 10:06:25 +02:00
dd274891a5 resolve conflicts
Some checks failed
continuous-integration/drone/push Build is failing
continuous-integration/drone/pr Build is failing
2021-03-30 09:30:06 +02:00
2952c39f70 Fix issues for installing radius-rives (baq package for postgresql-client) 2021-03-30 09:20:31 +02:00
85e691a0a2 Don't store journald logs to disk
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
As they are already stored on disk by rsyslog.
2021-03-30 07:46:06 +02:00
606df65535 Cleanup logrotate role 2021-03-30 07:45:52 +02:00
3030d3bfab Fix typo: use 'Reload' instead of 'Restart' 2021-03-30 07:42:46 +02:00
f59d9ee6f0 WIP: add logrotate config for rsyslog-managed files 2021-03-30 06:01:43 +02:00
jeltz
6d74f04db4 Merge pull request 'Better distribution of backups over time' (#49) from backups into master
Reviewed-on: Aurore/ansible#49
2021-03-24 02:12:53 +01:00
21eaeb2d42 Better distribution of backups over time 2021-03-24 02:10:11 +01:00
jeltz
789c11c3e3 Merge pull request 'Cleanup borgmatic related roles' (#47) from backups into master
Reviewed-on: Aurore/ansible#47
2021-03-18 22:19:39 +01:00
a1533b7efd Fix issues for installing radius-rives (baq package for postgresql-client) 2021-03-17 20:41:46 +01:00
f662e4bd47 Remove bullseye for radius role. Add the oid for temperature of ups 2021-03-16 21:13:45 +01:00
3000f46c46 Randomize borgmatic timer 2021-03-16 15:05:29 +01:00
8524b9fa99 Fix typo 2021-03-16 14:13:12 +01:00
37582abfe1 Remove useless tasks from borgmatic_client 2021-03-16 13:47:14 +01:00
96a498c6de Break long lines in borgmatic.service unit 2021-03-16 13:46:46 +01:00
1be92bad62 Log source port for NGinx 2021-03-16 09:43:13 +01:00
01bca6597d Run borgmatic every hour 2021-03-16 09:38:51 +01:00
21a3d5af2a Add bullseye support in 'prometheus_node' 2021-03-15 10:50:40 +01:00
jeltz
4305a60639 Merge pull request 'Backups with borg and borgmatic' (#39) from backups into master
Reviewed-on: Aurore/ansible#39
2021-03-15 07:53:33 +01:00
3f3f688da4 Use 'present' instead of 'latest' (ansible-lint) 2021-03-15 07:51:48 +01:00
6713b550b6 Merge branch 'master' into backups 2021-03-15 07:50:11 +01:00
cb3ec07121 Use 'inventory_hostname' instead of 'ansible_fqdn'
While 'ansible_fqdn' can be changed by a compromised host,
'inventory_hostname' can't (hopefully).

It should therefore no longer be possible for the said host to access
the backups of another host.
2021-03-15 07:25:09 +01:00
243ec1fe9d [borgbackup_client] VaRi0u5 f1X3s 2021-03-15 01:04:42 +01:00
f15b222cdc Allow root to log as postgres 2021-03-14 23:45:36 +01:00
7480a7c565 [borgbackup_client] precedence rules and sane defaults for borg config 2021-03-14 22:02:34 +01:00
b14b359027 [borgbackup_client] add exclude path to conf 2021-03-14 19:21:15 +01:00
33a1ec02f3 [borgbackup_client] update config directory to be homogeneous 2021-03-14 19:07:02 +01:00
ebfc4f2a26 [borgbackup_client] do update cache 2021-03-14 19:03:44 +01:00
86f8b31159 Delegate facts for borgbackup_client 2021-03-14 18:44:13 +01:00
d9f1104309 Move id_remote to /etc/borgmatic 2021-03-14 18:42:26 +01:00
c6cae75031 [borgbackup_server] fix /borg permissions 2021-03-14 18:29:33 +01:00
46d10022ea [borgbackup_client] fix retention date to int and list correctly source directories 2021-03-14 18:24:36 +01:00
ff750c5b63 [borgbackup_client] remove 1 minute sleep and fix verbosity 2021-03-14 18:23:44 +01:00
2651432582 [WIP] various fixes 2021-03-14 18:22:52 +01:00
d928c7f7f0 [borgbackup_client] rename variable correctly 2021-03-14 16:11:40 +01:00
021a5ef1e8 [borgbackup_client] various fixes for ssh keys 2021-03-14 16:11:18 +01:00
c99b611b8f Various fixes 2021-03-14 14:17:36 +01:00
8112788396 [borgbackup_client] Add 'user:' in authorized_key 2021-03-14 13:18:30 +01:00
2f2f71422f [borgbackup_client] Move some handlers to tasks 2021-03-14 13:16:08 +01:00
637b74a2ad Fix some linter issues 2021-03-13 05:05:30 +01:00
f45cd77510 Merge branch 'master' into logs-first-phase 2021-03-13 05:02:30 +01:00
f6e1949c21 Adding master VM for Rives and adapt radius role for bullseye
Some checks failed
continuous-integration/drone/push Build is failing
2021-03-12 12:29:52 +01:00
965bbe62a4 [borgbackup_client] configure encryption passphrase and storage 2021-03-12 01:46:35 +01:00
3f8ffbe164 [borgbackup_client] Add borg username and group defaults 2021-03-12 00:01:11 +01:00
531f7593d2 [borgbackup_client] fix indentation
All checks were successful
continuous-integration/drone/push Build is passing
continuous-integration/drone/pr Build is passing
2021-03-11 23:37:55 +01:00
313314a674 [borgbackup_client] fix risky file permission on apt config for pinning version 2021-03-11 23:36:27 +01:00