Host-HA offers investigation, fencing and recovery mechanisms for host that for
any reason are malfunctioning. It uses Activity and Health checks to determine
current host state based on which it may degrade a host or try to recover it. On
failing to recover it, it may try to fence the host.
The core feature is implemented in a hypervisor agnostic way, with two separate
implementations of the driver/provider for Simulator and KVM hypervisors. The
framework also allows for implementation of other hypervisor specific provider
implementation in future.
The Host-HA provider implementation for KVM hypervisor uses the out-of-band
management sub-system to issue IPMI calls to reset (recover) or poweroff (fence)
a host.
The Host-HA provider implementation for Simulator provides a means of testing
and validating the core framework implementation.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This feature allows root administrators to define new roles and associate API
permissions to them.
A limited form of role-based access control for the CloudStack management server
API is provided through a properties file, commands.properties, embedded in the
WAR distribution. Therefore, customizing API permissions requires unpacking the
distribution and modifying this file consistently on all servers. The old system
also does not permit the specification of additional roles.
FS:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Dynamic+Role+Based+API+Access+Checker+for+CloudStack
DB-Backed Dynamic Role Based API Access Checker for CloudStack brings following
changes, features and use-cases:
- Moves the API access definitions from commands.properties to the mgmt server DB
- Allows defining custom roles (such as a read-only ROOT admin) beyond the
current set of four (4) roles
- All roles will resolve to one of the four known roles types (Admin, Resource
Admin, Domain Admin and User) which maintains this association by requiring
all new defined roles to specify a role type.
- Allows changes to roles and API permissions per role at runtime including additions or
removal of roles and/or modifications of permissions, without the need
of restarting management server(s)
Upgrade/installation notes:
- The feature will be enabled by default for new installations, existing
deployments will continue to use the older static role based api access checker
with an option to enable this feature
- During fresh installation or upgrade, the upgrade paths will add four default
roles based on the four default role types
- For ease of migration, at the time of upgrade commands.properties will be used
to add existing set of permissions to the default roles. cloud.account
will have a new role_id column which will be populated based on default roles
as well
Dynamic-roles migration tool: scripts/util/migrate-dynamicroles.py
- Allows admins to migrate to the dynamic role based checker at a future date
- Performs a harder one-way migrate and update
- Migrates rules from existing commands.properties file into db and deprecates it
- Enables an internal hidden switch to enable dynamic role based checker feature
Deprecate commands.properties
- Fixes apidocs and marvin to be independent of commands.properties usage
- Removes bundling of commands.properties in deb/rpm packaging
- Removes file references across codebase
Reviewed-by: John Burwell <john.burwell@shapeblue.com>
QA-by: Boris Stoyanov <boris.stoyanov@shapeblue.com>
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
CLOUDSTACK-8762: Check to confirm disk activity before starting a VMImplements a VM volume/disk file activity checker that checks if QCOW2 file
has been changed before starting the VM. This is useful as a pessimistic
approach to save VMs that were running on faulty hosts that CloudStack could
try to launch on other hosts while the host was not cleanly fenced. This is
optional and available only if you enable the settings in agent.properties
file, on per-host basis.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* pr/754:
CLOUDSTACK-8762: Confirm disk activity before starting a VM
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Implements a VM volume/disk file activity checker that checks if QCOW2 file
has been changed before starting the VM. This is useful as a pessimistic
approach to save VMs that were running on faulty hosts that CloudStack could
try to launch on other hosts while the host was not cleanly fenced. This is
optional and available only if you enable the settings in agent.properties
file, on per-host basis.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
When dumping XML use appropriate flags:
1, VIR_DOMAIN_XML_SECURE (dump security sensitive information too)
8, VIR_DOMAIN_XML_MIGRATABLE (dump XML suitable for migration)
Source:
https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainXMLFlags
This fixes CVE 2015-3252: VNC password lost during VM migration across KVM
hosts. The issue is also seen when a VM is rebooted.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This way we update the DB with the actual size of the disk after deployment from template
(cherry picked from commit 4b4c52ea77)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Conflicts:
plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/storage/LibvirtStorageAdaptor.java
KVM hosts which are actuall up, but if their agents are shutdown should be put
in disconnected state. This would avoid getting the VMs HA'd and other commands
such as deploying a VM will exclude that host and save us from errors.
The improvement is that, we first try to contact the KVM host itself. If it fails
we assume that it's disconnected, and then ask its KVM neighbours if they can
check its status. If all of the KVM neighbours tell us that it's Down and we're
unable to reach the KVM host, then the host is possibly down. In case any of the
KVM neighbours tell us that it's Up but we're unable to reach the KVM host then
we can be sure that the agent is offline but the host is running.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This closes#340
Pull average Cpu util report between polling intervals instead of since boot
instead of using values since uptime
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This closes#289
Passing the file argument to the xml break for EL 7.1, the fix removes
the argument as just passing rombar='off' with its file arg to be empty string.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This closes#290
EL7 has a different output to 'free', use /proc/meminfo instead of a tool to be
more consistent across distros
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Linux kernel supports vmxnet3, allowing it in KVM plugin would allow us to
run ESX hosts on KVM hosts using CloudStack with vmxnet3 nic which can be
passed as VM's nicAdapter detail
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This improvements checks for "guest.cpu.features" property which is a space
separated list of cpu features that is specific for a host. When added, it
will add <feature policy='require' name='{{feature-you-listed}}'/> in the
<cpu> section of the generated vm spec xml.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Dont evict template when a delete command has been sent to VMware resource for deletion of volume.
(cherry picked from commit f45e6b94ed)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
The only artifact resolved from libvirt.org was org.libvirt:libvirt:0.5.1
this artifact is now available from maven's default central repository
This closes#180
Signed-off-by: Laszlo Hornyak <laszlo.hornyak@gmail.com>
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
(cherry picked from commit 9cf31b0714)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
MigrateVMWithVolumes-
1. If ESXi host version is below 5.1, ensure destination datastore(s) is mounted on the source host, then migrate the storage and then finally migrate the VM.
If destination storage(s) is not mounted on the source host,
- In case of NFS storage mount the storage(s).
- In case of VMFS storage fail the request for migration.
2. If EXi host version is 5.1 or above, simultaneously migrate the VM and its storage to the destination host and storage(s) respectively for both NFS and VMFS storage.
(cherry picked from commit adc836cc5e)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Correctly register node info for a newly created VMware context.
(cherry picked from commit 13bdc1cef4)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
As suggested by Wido on the dev ML changing the repo to eu.ceph.com to avoid
build failures. Will revert if ceph.com is up again.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
CentOS 7 does not ship with ifconfig anymore. We should use ip commands instead.
This also works on older versions, like CentOS 6 and Ubuntu 12.x/14.x, that we
support.
(cherry picked from commit ac06ec02eb)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Conflicts:
plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/IvsVifDriver.java
- use sticky chmod 1777 on the mountpoint
- remove dead code
- port improved code for moving disk into corresponding folder from master
- for dummy worker case, check case for powered off vm state
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
It causes extreme long vm start time when managment server has slow connection to secondary storage
(cherry picked from commit 49c01e2ae4)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
While looking for an ongoing VM snapshot task, check the task status to identify if the task is still running.
(cherry picked from commit 25a4f0dc53)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Handling spaces in datastore name while extracting vmdk base name
Signed-off-by: Sateesh Chodapuneedi <sateesh@apache.org>
(cherry picked from commit aa84b05491)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>