* scaleio: prototype storage plugin
- plugin skeleton
- add storage pool, create/attach data disk
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* kvm: attach disk example
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Updated ScaleIO storage plugin to support Volume operations
* ScaleIO storage plugin - Support for VM operations and other updates
* ScaleIO storage pool plugin changes
- Added validation to check existing ScaleIO storage pool and update capacity details
- Updated resize volume for ScaleIO to pick the rounded 8GB boundary size
- Added support for setting ScaleIO storage pool statistics (bandwidthLimitInKbps, iopsLimit)
* Fixed IOPS validation and volume size update when resizing ScaleIO volume
* Removed connect/disconnect disk changes from ScaleIO storage adaptor
- ScaleIO datastore driver does map/unmap ScaleIO volume (from MS) using grant/revoke access
- Not required to map/unmap ScaleIO volume from the storage adaptor
* Updated connect disk, to wait for ScaleIO volume to become available in the KVM host
* Updated ScaleIO storage provider, pool type, url scheme and related parameters to the new "PowerFlex" brand
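For illustration only (the exact URL format and CloudMonkey syntax are assumptions, not part of this change), a PowerFlex pool would be registered with the new url scheme roughly as:
# hypothetical example; gateway address, credentials and pool name are placeholders
cmk create storagepool name=powerflex-pool-1 zoneid=<zone-uuid> scope=zone provider=PowerFlex hypervisor=KVM url='powerflex://<api-user>:<api-password>@<gateway-host>/<storage-pool-name>'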
* Fixed size rounding issue while creating PowerFlex volume and added validations to PowerFlex Gateway API client
* Updated host sdc connection check for ScaleIO/PowerFlex pool on host connect
* Updated volume snapshots support for volumes on ScaleIO/PowerFlex storage pool, and added some validations for ScaleIO disks on the host
* Added primary storage level configurable setting "storage.pool.disk.wait" to wait for disk availability
- Configure the disk availability wait time; introduced mainly for ScaleIO/PowerFlex storage pools (can be used for other managed storage as well) to wait for the disk to become available on the host before performing any operation on it
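As an illustration (CloudMonkey syntax and the storageid scope parameter are assumptions, not stated above), this primary-storage-level setting could be tuned roughly as:
# hypothetical example; assumes updateConfiguration accepts storageid for storage-scoped settings
cmk update configuration name=storage.pool.disk.wait value=90 storageid=<pool-uuid>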
* Enabled template spooling to ScaleIO/PowerFlex storage pool and creating VMs from the spooled template.
Added ScaleIO SDC limits support for volumes using offering parameters: bandwidthLimitInKbps, iopsLimit.
* Added support for VM snapshots on ScaleIO/PowerFlex storage pool
Minor improvements for IOPS (SDC Limits) configuration
* Updated access for ScaleIO/PowerFlex volumes on VM Start and Stop
Added primary storage level configurable setting "storage.pool.client.timeout" for storage API client
Enabled cluster wide storage pool support for ScaleIO/PowerFlex storage
Minor improvements for ScaleIO/PowerFlex disk access in the KVM host
* Added support for direct download of templates (raw, qcow2) on ScaleIO/PowerFlex storage pool
* Added support for config drives in host cache for KVM
- Changed configuration "vm.configdrive.primarypool.enabled" scope from Global to Zone level
- Introduced new zone level configuration "vm.configdrive.force.host.cache.use" (default: false) to force host cache for config drives
- Introduced new zone level configuration "vm.configdrive.use.host.cache.on.unsupported.pool" (default: true) to use host cache for config drives when storage pool doesn't support config drive
- Added new parameter "vm.configdrive.host.cache.location" (default: /var/cache/cloud) in KVM agent.properties for specifying the host cache path for config drives
* Updated disk access while migrating the VM with volumes on ScaleIO/PowerFlex storage pool
Changed the parameter "vm.configdrive.host.cache.location" to "host.cache.location" (default: /var/cache/cloud) in KVM agent.properties to specify the host cache path
Changes to create config drives on the "/config" directory on the host cache path
Changes to support migrating a VM with a config drive on the host cache path
* Additional changes to support migrating a VM with a config drive on the host cache (see the example settings below)
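For illustration (values, CloudMonkey syntax and the agent.properties path are assumptions, not part of these changes), the zone-level setting and the KVM agent property described above could be applied roughly as:
# hypothetical example: force host cache for config drives in a given zone
cmk update configuration name=vm.configdrive.force.host.cache.use value=true zoneid=<zone-uuid>
# hypothetical example: override the default host cache path on the KVM host
echo 'host.cache.location=/var/cache/cloud' >> /etc/cloudstack/agent/agent.properties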
* Detect virtual size from the template URL while registering direct download qcow2 (of KVM hypervisor) templates
Updated full deployment destination for preparing the network(s) on VM start
* Propagate the direct download certificates uploaded to the newly added KVM hosts
* Code improvements for ScaleIO/PowerFlex storage plugin
* Updated storage stats collection and tests for ScaleIO/PowerFlex storage plugin
* Fix for template size of direct download templates on capacity check for ScaleIO/PowerFlex storage pool
Updated data object grant and revoke access for connected SDCs to ScaleIO/PowerFlex storage pool
* Discover the template size for direct download templates using any available host from the zones specified on template registration
When no zones are specified while registering the template, template size discovery is performed using any available host, picked randomly from one of the available zones
* Maintain the config drive location and use it when required on any config drive operation (migrate, delete)
* Ensure the volume to be expunged is expunge-ready during storage cleanup
* Do not set the storage migration flag for the volumes on zone wide PowerFlex/ScaleIO pool when listing the hosts available for cross-cluster migration
* Release the VM resources when the VM is synced to Stopped state on PowerReportMissing (after the grace period)
* Added alerts for PowerFlex/ScaleIO SDC disconnection on the host(s)
* Retry VM deployment/start when the host cannot access volume/template on the ScaleIO/PowerFlex storage
* Changes to find a potential host that can access the ScaleIO/PowerFlex storage pool
* Updated ScaleIO/PowerFlex storage pool stats for checking the available capacity and usage
* Updated ScaleIO/PowerFlex volumes naming convention to avoid the naming conflicts on sharing
* Mark never-used or downloaded templates as Destroyed on deletion, without sending any DeleteCommand
- Do not trigger any DeleteCommand for never-used or downloaded templates, as these don't exist on the datastore and cannot be deleted from it
* Updated ScaleIO/PowerFlex storage pool capacity stats
* Cleanup unused templates and host entries on PowerFlex/ScaleIO storage pool deletion
* Check whether the router filesystem is writable before performing health checks
- Introduce a new test "filesystem.writable.test" to check whether the filesystem is writable
- The router health checks keep the config info at "/var/cache/cloud" and update the monitor results at "/root"; these are on different partitions, so test both locations
* Updated the router filesystem writable check to use a script instead of direct command execution
- Added new script "filesystem_writable_check.py" at /opt/cloud/bin/ to check whether the filesystem is writable
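A minimal sketch of the kind of check performed (illustrative shell only; this is not the contents of filesystem_writable_check.py):
# hypothetical sketch: try writing a marker file on each partition used by the health checks
for dir in /var/cache/cloud /root; do
  if touch "$dir/.rw_probe" 2>/dev/null; then
    rm -f "$dir/.rw_probe"; echo "$dir is writable"
  else
    echo "$dir is NOT writable"
  fi
done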
* Update volume stats (physical and virtual size) for the volumes on PowerFlex/ScaleIO storage pool
Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
Currently in CloudStack, when we click on "Acquire New IP", it
randomly acquires an IP from the pool. With this enhancement, it is
possible to select the IP from the drop-down IP list of that network.
The same applies to a VPC as well.
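A hedged example of how the selected IP might be passed on the API side (the associateIpAddress call and its 'ipaddress' parameter name are assumptions here, not stated above):
# hypothetical example; parameter names are assumptions
cmk associate ipaddress networkid=<network-uuid> ipaddress=10.1.1.15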
* Service layer changes for new way of tracking maintenance progress
* Fixes after offline code review
* Fix marvin tests
* Change state name and add documentation
* Fix test
* Fix and add more unit tests for different cases
* Fix and enhance Marvin Tests
* Fixes for corner cases
* More fixes and logging
* UI fixes
* Some minor changes and reducing VMs on host for more contained tests
* Fixed ssh client auth problem causing test failure
* Code review changes + fixes + some more logging
* Fix flaky tests by adding delays between host states
* Added fetching only enabled hosts for tests
* Make port blocking KVM specific and refactor to handle failure
* Make migrations fail by using a tagged host instead of port blocking
* Added additional check for migrating VMs
* Refactor to use single place for methods checking maintenance states
* Add missing HA config keys
* Change time value to seconds
* Change Integer to Long
* Using ConfigKey defaultValue
* Do some code refactoring
* Simplify code
Feature Specification: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95653548
Live storage migration on KVM under these conditions:
Between source and destination hosts within the same cluster
From NFS primary storage to NFS cluster-wide primary storage
Source and destination NFS storage mounted on the hosts
To enable this functionality, the database should be updated to enable the live storage migration capability for KVM when the previous conditions are met. This is due to existing conflicts between qemu and libvirt versions. This has been tested on CentOS 6 hosts.
Additional notes:
To use this feature, set storage_motion_supported=1 in the hypervisor_capability table for KVM. This is not done by default, as the feature may not work in some environments; read below.
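For example (the exact database and table name, cloud.hypervisor_capabilities, are assumptions here):
# hypothetical example: enable storage motion for KVM in the CloudStack database
mysql -u cloud -p cloud -e "UPDATE hypervisor_capabilities SET storage_motion_supported=1 WHERE hypervisor_type='KVM';"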
This feature of online storage+VM migration for KVM will only work with CentOS 6 and possibly Ubuntu as KVM hosts, but not with CentOS 7, due to:
https://bugs.centos.org/view.php?id=14026
https://bugzilla.redhat.com/show_bug.cgi?id=1219541
On CentOS 7 the error seen is: "error: unable to execute QEMU command 'migrate': this feature or command is not currently supported" (reference https://ask.openstack.org/en/question/94186/live-migration-unable-to-execute-qemu-command-migrate/). Reading through various lists, it looks like the qemu migrate feature may be available with paid versions of RHEL-EV but not on CentOS 7; however, this works with CentOS 6.
Fix for CentOS 7:
Create a repo file in /etc/yum.repos.d/:
[qemu-kvm-rhev]
name=oVirt rebuilds of qemu-kvm-rhev
baseurl=http://resources.ovirt.org/pub/ovirt-3.5/rpm/el7Server/
mirrorlist=http://resources.ovirt.org/pub/yum-repo/mirrorlist-ovirt-3.5-el7Server
enabled=1
skip_if_unavailable=1
gpgcheck=0
yum install qemu-kvm-common-ev-2.3.0-29.1.el7.x86_64 qemu-kvm-ev-2.3.0-29.1.el7.x86_64 qemu-img-ev-2.3.0-29.1.el7.x86_64
Reboot host
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Allows creating storage offerings associated with particular domain(s) and zone(s). In the create disk/storage offering form UI, a multi-select control has been added to select the desired zone(s), and the domain select element has been made multi-select.
createDiskOffering API has been modified to allow passing lists of domain and zone IDs with the keys domainids and zoneids respectively. These lists are stored in the DB in the cloud.disk_offering_details table under the 'domainids' and 'zoneids' keys as comma-separated strings of IDs. The response for the create, update and list disk offering APIs will return domainids, domainnames, zoneids and zonenames in the details object of the offering.
listDiskOfferings API has been modified to allow passing zoneid to return only offerings which are associated with the zone.
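For example (CloudMonkey syntax shown for illustration; IDs and offering values are placeholders):
# create an offering visible only to the given domains and zones
cmk create diskoffering name=fast-100g displaytext='Fast 100 GB' disksize=100 domainids=<domain-uuid-1>,<domain-uuid-2> zoneids=<zone-uuid>
# list only the offerings associated with a zone
cmk list diskofferings zoneid=<zone-uuid>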
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* Keep connection alive when on maintenance
* Refactor cancel maintenance and unit tests
* Add marvin tests
* Refactor
* Changing the way we get ssh credentials
* Add check on SSH restart and improve marvin tests
* Migrate template to target host if needed.
Fix KVM VM local storage live migration by migrating its template to the
target host if needed.
* Address reviewer comments and add a method that updates the DB template reference
* Remove deprecated Config.PrimaryStorageDownloadWait
* Code formatting of @Inject to follow checkstyle
* - Offline VM and Volume migration on Vmware hypervisor hosts
- Also add VM disk consolidation call on successful VM migrations
* Fix indentation of marvin test file and reformat against PEP8
* Fix a few comment typos
* Refactor debug messages to use String.format() when debug log level is enabled.
* Send list of commands returned by hypervisor Guru instead of explicitly selecting the first one
* Fix unhandled NPE during VM migration
* Revert back to distinct event descriptions for VM to host or storage pool migration
* Reformat test_primary_storage file against PEP-8 and Remove unused imports
* Revert back the deprecation messages in the custom StringUtils class to favour the use of the ApacheUtils
A corner case was found on 4.11.2 for #2493 leading to an infinite loop in state PrepareForMaintenance
To prevent such cases, in which failed migrations are detected but VMs are still running on the host, this feature adds a new cluster setting, host.maintenance.retries, which is the number of retries before marking the host as ErrorInMaintenance if migration errors persist.
How Has This Been Tested?
- 2 KVM hosts, pick one which has running VMs as H
- Block migrations ports on H to simulate failures on migrations:
iptables -I OUTPUT -j REJECT -m state --state NEW -m tcp -p tcp --dport 49152:49215 -m comment --comment 'test block migrations'
iptables -I OUTPUT -j REJECT -m state --state NEW -m tcp -p tcp --dport 16509 -m comment --comment 'test block migrations'
- Put host H in Maintenance
- Observe that the host stays indefinitely in the PrepareForMaintenance state (after this fix it goes into ErrorInMaintenance after retrying host.maintenance.retries times)
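As an illustration (CloudMonkey syntax and the clusterid scope parameter are assumptions, not stated above), the new cluster setting could be tuned roughly as:
# hypothetical example: allow 5 migration retries before marking the host ErrorInMaintenance
cmk update configuration name=host.maintenance.retries value=5 clusterid=<cluster-uuid>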
* CLOUDSTACK-9473: storage pool capacity check when volume is resized or migrated
The storage pool capacity check is not being called on volume resize and migrate.
This may lead to an allocated storage percentage above 100%.
Setup:
1 VMware cluster with 2 Hosts.
Executed Steps:
Applied the following global settings:
storage.overprovisioning.factor = 1
pool.storage.allocated.capacity.disablethreshold = 1
pool.storage.capacity.disablethreshold = 1
Restarted management server
Executed volume resize and migration, and observed that the storage pool capacity check is not performed on resizeVolume and migrateVolume.
Result:
Root cause analysis shows storage pool checker is not called when doing migration and resizing.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Clean up and code-format POM files
* Remove obsolete mycila license-maven-plugin
* Remove obsolete console-proxy/plugin project
* Move console-proxy-rdbconsole under console-proxy parent
* Use correct parent path for rdpconsole
* Order items alphabetically in setnextversion.sh
* Unify license header in POMs
* Alphabetic order of modules definition
* Extract all defined versions into parent pom
* Remove obsolete files: version-info.in, configure-info.in
* Remove redundant defaultGoal
* Remove useless checkstyle plugin from checkstyle project
* Order items alphabetically in pom.xml
* Add additional spaces to fix the Debian build
* Don't execute checkstyle on parent projects
* Use UTF-8 encoding in building checkstyle project
* Extract plugin versions into properties
* Execute PMD plugin on all the projects with -Penablefindbugs
* Upgrade maven plugins to latest version
* Make sure to always look for apache parent pom from repository
* Fix incorrect version grep in debian packaging
* Fix rebase conflicts
* Fix rebase conflicts
* Remove PMD for now to be fixed on another PR
Fixes the version in the POMs etc. to be consistent with the X.Y.Z.0-SNAPSHOT versioning pattern after a minor release.
Signed-off-by: Khosrow Moossavi <khos2ow@gmail.com>
Changes in PR #2508 have caused network restart to fail in a Nuage setup,
as the new VR takes the same IP as the old one, and the old VR is still running.
Nuage doesn't support multiple VMs having the same IP.
We delay provisioning the interfaces in VSD until the old VR interface is released.
* [CLOUDSTACK-10323] Allow changing disk offering during volume migration
This is a continuation of work developed on PR #2425 (CLOUDSTACK-10240), which provided root admins an override mechanism to move volumes between storage system types (local/shared) even when the disk offering would not allow such an operation. To complete the work, we will now provide a way for administrators to enter a new disk offering that can reflect the new placement of the volume. We will add an extra parameter to allow the root admin to inform a new disk offering for the volume. Therefore, when the volume is being migrated, it will be possible to replace the disk offering to reflect the new placement of the volume.
The API method will have the following parameters:
* storageid (required)
* volumeid (required)
* livemigrate(optional)
* newdiskofferingid (optional) – this is the new parameter
The expected behavior is the following:
* If “newdiskofferingid” is not provided the current behavior is maintained. Override mechanism will also keep working as we have seen so far.
* If the “newdiskofferingid” is provided by the admin, we will execute the following checks
** new disk offering mode (local/shared) must match the target storage mode. If it does not match, an exception will be thrown and the operator will receive a message indicating the problem.
** we will check if the new disk offering tags match the target storage tags. If it does not match, an exception will be thrown and the operator will receive a message indicating the problem.
** check if the target storage has the capacity for the new volume. If it does not have enough space, then an exception is thrown and the operator will receive a message indicating the problem.
** check if the size of the volume is the same as the size of the new disk offering. If it is not the same, we will ALLOW the change of the disk offering, and a warning message will be logged.
We execute the change of the Disk offering as soon as the migration of the volume finishes. Therefore, if an error happens during the migration and the volume remains in the original storage system, the disk offering will keep reflecting this situation.
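For example, assuming the parameters above belong to the migrateVolume API (CloudMonkey syntax shown for illustration; IDs are placeholders):
# hypothetical example: live-migrate a volume and swap its disk offering in one call
cmk migrate volume volumeid=<volume-uuid> storageid=<target-pool-uuid> livemigrate=true newdiskofferingid=<new-offering-uuid>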
* Code formatting
* Adding a test to cover migration with new disk offering (#4)
* Adding a test to cover migration with new disk offering
* Update test_volumes.py
* Update test_volumes.py
* fix test_11_migrate_volume_and_change_offering
* Fix typo in Java doc
* CLOUDSTACK-8855 Improve Error Message for Host Alert State
* [CLOUDSTACK-9846] create column to save the content of alert messages
Remove declaration of throws CloudRuntimeException
I also removed some unused variables and comments left behind
This closes #837
* Isolate a problematic test "smoke/test_certauthority_root"
* [CLOUDSTACK-10314] Add Text-Field to each ACL Rule
It is interesting to have a text field (e.g. CHAR-256) added to each ACL rule, which allows entering a "reason" for each FW rule created. This is valuable for customer documentation, as well as best practice for providing evidence towards auditing the system.
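A hedged example of how such a rule could be created, assuming the new text field is exposed as a 'reason' parameter on the ACL rule API (an assumption, not stated above):
# hypothetical example; the 'reason' parameter name is an assumption
cmk create networkacl aclid=<acl-list-uuid> protocol=TCP startport=443 endport=443 traffictype=Ingress action=Allow cidrlist=0.0.0.0/0 reason='Allow HTTPS for the web tier'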
* Formatting to make checkstyle happy and code cleanups