cloudstack/scripts/vm/hypervisor/kvm
James Peru d603b260c4 KVM: make storage heartbeat fence action configurable
The KVM agent's storage heartbeat scripts (kvmheartbeat.sh and
kvmspheartbeat.sh) hard-code an immediate kernel-level reboot via
'echo b > /proc/sysrq-trigger' when a heartbeat write to primary storage
times out. This bypasses all OS-level shutdown protections, drops every
running VM on the host instantly, and triggers HA cascades onto
surviving hosts.

For NFS shared storage the binary "heartbeat-write-failed = host-is-dead"
heuristic is reasonable. For LINSTOR/DRBD or other replicated local
storage, the same disk serves application I/O, replication I/O and
heartbeat I/O simultaneously, so a transient I/O contention spike can
time out the heartbeat write even though the host itself is healthy.
The result is false-positive sysrq fencing.

Adds a new agent.properties option:

    kvm.heartbeat.fence.action = reboot | graceful-reboot
                               | restart-agent | log-only
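
As a rough illustration of how a script could consume this option, the
sketch below reads the key from a properties file and falls back to
"reboot" when the key is missing or holds an unrecognised value. The
function name read_fence_action and the fallback on invalid values are
assumptions made for this example; they are not taken from the actual
patch.

```shell
#!/bin/bash
# Hypothetical helper (not from the patch): print the configured fence action.
# Usage: read_fence_action /path/to/agent.properties
read_fence_action() {
    local file="$1" value=""
    if [ -r "$file" ]; then
        # Keep the last uncommented assignment and strip the key, the '='
        # and any surrounding whitespace.
        value=$(sed -n 's/^[[:space:]]*kvm\.heartbeat\.fence\.action[[:space:]]*=[[:space:]]*//p' "$file" \
            | tail -n 1 | sed 's/[[:space:]]*$//')
    fi
    case "$value" in
        reboot|graceful-reboot|restart-agent|log-only)
            echo "$value" ;;
        *)
            # Assumption: a missing or unknown value keeps the default action.
            echo "reboot" ;;
    esac
}
```

On a typical agent host this might be called as
read_fence_action /etc/cloudstack/agent/agent.properties, assuming the
agent configuration lives at that path.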

Default value is "reboot" so existing deployments keep their current
behavior. Operators on replicated storage backends can choose a less
destructive action:

  - graceful-reboot: 'systemctl reboot' instead of sysrq, allowing VMs
    a chance to shut down cleanly
  - restart-agent: restart cloudstack-agent only, preserving running VMs
  - log-only: log the timeout and raise an alert, but take no automatic
    action
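
The mapping from action to behavior can be pictured as a simple case
dispatch. The sketch below is a dry-run illustration only: it prints
the command each action would correspond to instead of executing it.
The function name fence_command, the exact
'systemctl restart cloudstack-agent' invocation, and the fallback to
the sysrq reboot for unknown values are assumptions for this example,
not the code in kvmheartbeat.sh or kvmspheartbeat.sh.

```shell
#!/bin/bash
# Hypothetical dry-run dispatcher (not from the patch): print, but do not
# run, the command associated with a fence action.
fence_command() {
    case "$1" in
        reboot)
            # Existing behavior: immediate kernel-level reboot via sysrq.
            echo "echo b > /proc/sysrq-trigger" ;;
        graceful-reboot)
            echo "systemctl reboot" ;;
        restart-agent)
            # Assumed command; the message only says the agent is restarted.
            echo "systemctl restart cloudstack-agent" ;;
        log-only)
            # No automatic action; ':' is the shell no-op.
            echo ":" ;;
        *)
            # Assumption: unknown values behave like the default action.
            echo "echo b > /proc/sysrq-trigger" ;;
    esac
}
```

For example, fence_command graceful-reboot prints "systemctl reboot",
which makes it easy to check a configuration without touching the host.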

The existing 'reboot.host.and.alert.management.on.heartbeat.timeout'
boolean continues to function as a complete Java-side bypass.

Refs: https://github.com/apache/cloudstack/issues/13089
2026-05-01 03:08:35 +03:00
File               Last commit date           Last commit message
gpudiscovery.sh    2025-09-30 18:34:58 +05:30  Fix detection of Mi3xx GPUs (#11715)
kvmheartbeat.sh    2026-05-01 03:08:35 +03:00  KVM: make storage heartbeat fence action configurable
kvmspheartbeat.sh  2026-05-01 03:08:35 +03:00  KVM: make storage heartbeat fence action configurable
kvmvmactivity.sh   2021-05-10 16:26:55 +02:00  Improve logs on kvmvmactivity.sh (#4704)
nasbackup.sh       2025-11-14 14:23:41 +01:00  Standardize and auto add license headers for Shell files with pre-commit (#12070)
nsrkvmbackup.sh    2025-11-14 14:23:41 +01:00  Standardize and auto add license headers for Shell files with pre-commit (#12070)
nsrkvmrestore.sh   2025-11-14 14:23:41 +01:00  Standardize and auto add license headers for Shell files with pre-commit (#12070)
patch.sh           2021-10-06 21:17:41 +05:30  Adding AutoScaling for cks + CKS CoreOS EOL update + systemvmtemplate improvements (#4329)
setup_agent.sh     2024-05-28 09:01:30 +02:00  pre-commit: add hook to trim trailing whitespace (#8205)