cloudstack/plugins
Wido den Hollander b130e55088 CLOUDSTACK-9397: Add Watchdog timer to KVM Instance (#1707)
The watchdog timer adds functionality where the Hypervisor can detect if an
instance has crashed or stopped functioning.

When the Instance has the 'watchdog' daemon running, the daemon sends heartbeats
to the /dev/watchdog device.

If the hypervisor (HV) stops receiving these heartbeats, it resets the Instance.

If the Instance never sends a heartbeat, the HV takes no action. It acts only
when heartbeats that had started then stop.
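
The arming rule above (no action until a first heartbeat has been seen, then action once heartbeats stop) can be sketched as a small model. This is only an illustration of the described behaviour, not the code used by QEMU, libvirt or CloudStack; the class name `WatchdogModel` and the 30-second timeout are assumptions made for the example.

```java
import java.util.OptionalLong;

/** Simplified model of the watchdog behaviour described in this commit message. */
class WatchdogModel {
    // Hypothetical timeout; on a real device the guest's watchdog driver sets it.
    private final long timeoutMillis;
    // Empty until the guest sends its first heartbeat (timer not armed yet).
    private OptionalLong lastHeartbeat = OptionalLong.empty();

    WatchdogModel(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Called whenever the guest writes to /dev/watchdog; the first call arms the timer. */
    void heartbeat(long nowMillis) {
        lastHeartbeat = OptionalLong.of(nowMillis);
    }

    /** Returns true if the configured action (for example reset) should fire now. */
    boolean shouldTakeAction(long nowMillis) {
        // Never armed: the guest never sent a heartbeat, so nothing happens.
        if (!lastHeartbeat.isPresent()) {
            return false;
        }
        // Armed: act only once the last heartbeat is older than the timeout.
        return nowMillis - lastHeartbeat.getAsLong() > timeoutMillis;
    }

    public static void main(String[] args) {
        WatchdogModel idle = new WatchdogModel(30_000);
        System.out.println(idle.shouldTakeAction(120_000));  // false: never armed

        WatchdogModel armed = new WatchdogModel(30_000);
        armed.heartbeat(0);
        armed.heartbeat(10_000);
        System.out.println(armed.shouldTakeAction(20_000));  // false: heartbeat is recent
        System.out.println(armed.shouldTakeAction(50_000));  // true: heartbeats stopped
    }
}
```

Modelling the unarmed state as an empty `OptionalLong` keeps the "never sent a heartbeat" case distinct from "sent one long ago", which is the distinction the text draws.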

This is supported since Libvirt 0.7.3 and can be defined in the XML format as
described in the docs: https://libvirt.org/formatdomain.html#elementsWatchdog

The following element is added to the 'devices' section of the domain XML:

  <watchdog model='i6300esb' action='reset'/>

The action to take can be defined in agent.properties:

  vm.watchdog.action=reset

The model can be set the same way. The Intel i6300esb is the most commonly used.

  vm.watchdog.model=i6300esb
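
As a rough illustration of how the two properties could map onto the XML element, the sketch below reads them from a `Properties` object and renders the `<watchdog>` line. It is not the code from this commit: the class name `WatchdogConfig`, the fallback values (taken from the examples above) and the validation lists (a subset of the models and actions named in the libvirt documentation) are assumptions for the example.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

/** Hypothetical helper: turns agent.properties watchdog settings into libvirt XML. */
class WatchdogConfig {
    // Assumed subsets of the values documented for libvirt's <watchdog> element.
    static final List<String> MODELS = Arrays.asList("i6300esb", "ib700");
    static final List<String> ACTIONS =
            Arrays.asList("reset", "shutdown", "poweroff", "pause", "none", "dump");

    static String toXml(Properties props) {
        // Fallbacks mirror the example values in the commit message (an assumption).
        String model = props.getProperty("vm.watchdog.model", "i6300esb").trim();
        String action = props.getProperty("vm.watchdog.action", "reset").trim();
        if (!MODELS.contains(model)) {
            throw new IllegalArgumentException("Unsupported watchdog model: " + model);
        }
        if (!ACTIONS.contains(action)) {
            throw new IllegalArgumentException("Unsupported watchdog action: " + action);
        }
        return "<watchdog model='" + model + "' action='" + action + "'/>";
    }

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Stand-in for the contents of agent.properties.
        props.load(new StringReader("vm.watchdog.action=poweroff\n"));
        System.out.println(toXml(props));  // <watchdog model='i6300esb' action='poweroff'/>
    }
}
```

Validating both values before building the string means a typo in agent.properties fails early in the agent rather than surfacing as a libvirt error when the domain is defined.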

Signed-off-by: Wido den Hollander <wido@widodh.nl>
2017-09-28 13:56:15 +05:30
..
acl Updating pom.xml version numbers for release 4.11.0.0-SNAPSHOT 2017-07-12 12:09:38 +05:30
affinity-group-processors Updating pom.xml version numbers for release 4.11.0.0-SNAPSHOT 2017-07-12 12:09:38 +05:30
alert-handlers Updating pom.xml version numbers for release 4.11.0.0-SNAPSHOT 2017-07-12 12:09:38 +05:30
api solidfire: Add NULL checks for various objects in SolidFire integration test API (#2205) 2017-07-28 11:58:05 +02:00
ca/root-ca CLOUDSTACK-9993: Securing Agents Communications (#2239) 2017-08-28 12:15:11 +02:00
database Updating pom.xml version numbers for release 4.11.0.0-SNAPSHOT 2017-07-12 12:09:38 +05:30
dedicated-resources Updating pom.xml version numbers for release 4.11.0.0-SNAPSHOT 2017-07-12 12:09:38 +05:30
deployment-planners Updating pom.xml version numbers for release 4.11.0.0-SNAPSHOT 2017-07-12 12:09:38 +05:30
event-bus Updating pom.xml version numbers for release 4.11.0.0-SNAPSHOT 2017-07-12 12:09:38 +05:30
file-systems/netapp Updating pom.xml version numbers for release 4.11.0.0-SNAPSHOT 2017-07-12 12:09:38 +05:30
ha-planners/skip-heurestics Updating pom.xml version numbers for release 4.11.0.0-SNAPSHOT 2017-07-12 12:09:38 +05:30
host-allocators/random Updating pom.xml version numbers for release 4.11.0.0-SNAPSHOT 2017-07-12 12:09:38 +05:30
hypervisors CLOUDSTACK-9397: Add Watchdog timer to KVM Instance (#1707) 2017-09-28 13:56:15 +05:30
metrics Merge branch '4.10' 2017-07-24 12:44:25 +02:00
network-elements Merge pull request #2109 from Accelerite/CLOUDSTACK-9922 2017-08-30 15:15:19 +05:30
outofbandmanagement-drivers CLOUDSTACK-9782: Improve scheduling of jobs 2017-08-30 18:06:48 +02:00
storage CLOUDSTACK-8865: Adding SR doesn't create Storage_pool_host_ref entry for disabled host (#876) 2017-09-21 10:49:11 +05:30
storage-allocators/random Updating pom.xml version numbers for release 4.11.0.0-SNAPSHOT 2017-07-12 12:09:38 +05:30
user-authenticators CLOUDSTACK-9993: Securing Agents Communications (#2239) 2017-08-28 12:15:11 +02:00
pom.xml CLOUDSTACK-9782: Nested-oobm CloudStack plugin 2017-08-30 18:06:48 +02:00