---
- name: Ensure node_exporter wrapper exists
  become: yes
  copy:
    dest: /usr/local/bin/node_exporter_wrapper
    owner: root
    group: root
    mode: '0755'
    content: |
      #!/bin/sh
      # wrapper created by ansible - exec node_exporter with warn log level
      exec /usr/local/bin/node_exporter --collector.textfile.directory={{ p4prometheus_metrics_dir }} --web.listen-address=:9100 --log.level=warn
- name: Ensure systemd drop-in dir exists
  become: yes
  file:
    path: /etc/systemd/system/node_exporter.service.d
    state: directory
    owner: root
    group: root
    mode: '0755'
- name: Install systemd override to use wrapper
  become: yes
  copy:
    dest: /etc/systemd/system/node_exporter.service.d/override.conf
    owner: root
    group: root
    mode: '0644'
    content: |
      [Service]
      ExecStart=
      ExecStart=/usr/local/bin/node_exporter_wrapper
- name: Reload systemd
  become: yes
  ansible.builtin.systemd:
    daemon_reload: yes
- name: Restart node_exporter service
  become: yes
  ansible.builtin.systemd:
    name: node_exporter
    state: restarted
    enabled: true
- name: Wait for metrics port 9100 to be accepting connections
  ansible.builtin.wait_for:
    host: 127.0.0.1
    port: 9100
    state: started
    timeout: 15
- name: Verify node_exporter ExecStart uses wrapper
  command: systemctl show node_exporter --property=ExecStart
  register: node_exec
  changed_when: false
  failed_when: node_exec.stdout is not search('/usr/local/bin/node_exporter_wrapper')
- name: Verify wrapper enforces --log.level=warn
  command: grep -q -- '--log.level=warn' /usr/local/bin/node_exporter_wrapper
  register: warn_check
  changed_when: false
  failed_when: warn_check.rc != 0
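# Optional extra check, offered as a sketch rather than part of the original
# role: the wait_for task only proves that the port is open, so this probe
# asks the exporter for an actual HTTP response.
# Assumptions: node_exporter serves its default /metrics path on the listen
# address set in the wrapper (:9100); the task name and the metrics_probe
# variable are illustrative.
- name: Verify node_exporter metrics endpoint responds (illustrative)
  ansible.builtin.uri:
    url: http://127.0.0.1:9100/metrics
    status_code: 200
  register: metrics_probe
  changed_when: false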
| # | Change | User | Description | Committed |
|---|---|---|---|---|
| #2 | 32507 | Russell C. Jackson (Rusty) | Fix monitoring role bugs and add health check, network latency, and disk space monitoring. - Fix circular symlink (src and dest were identical) - Fix force_apt_get on generic package module (split by OS family) - Add missing become:yes on privileged tasks - Add perforce_location defaults to prevent undefined variable errors - Make case sensitivity check query live server via p4 info - Remove redundant tasks already handled by install_p4prom.sh - Remove unused handlers - Add p4 health check probe (p4 info liveness and response time) - Add network latency monitoring (ping commit server) - Add disk space monitoring with configurable warn/crit thresholds | |
| #1 | 32488 | Russell C. Jackson (Rusty) | Ansible scaffolding for the sdp - Needs work. | |