* scaleio: prototype storage plugin
- plugin skeleton
- add storage pool, create/attach data disk
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* kvm: attach disk example
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Updated ScaleIO storage plugin to support Volume operations
* ScaleIO storage plugin - Support for VM operations and other updates
* ScaleIO storage pool plugin changes
- Added validation to check for an existing ScaleIO storage pool and update its capacity details
- Updated volume resize for ScaleIO to round the size up to the 8 GB boundary (see the sketch after this list)
- Added support for setting ScaleIO storage pool statistics (bandwidthLimitInKbps, iopsLimit)
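A minimal sketch of the 8 GB boundary rounding mentioned above, assuming "rounded 8GB boundary size" means rounding the requested size up to the next 8 GiB multiple (this is not the plugin code itself):

```python
GIB = 1024 ** 3

def round_up_to_8gib(size_in_bytes):
    # ScaleIO/PowerFlex provisions volumes in 8 GiB granularity, so round
    # the requested size up to the next 8 GiB multiple.
    granularity = 8 * GIB
    return ((size_in_bytes + granularity - 1) // granularity) * granularity

# e.g. a 10 GiB request becomes a 16 GiB volume
assert round_up_to_8gib(10 * GIB) == 16 * GIB
```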
* Fixed IOPS validation and volume size update when resizing ScaleIO volume
* Removed connect/disconnect disk changes from ScaleIO storage adaptor
- ScaleIO datastore driver does map/unmap ScaleIO volume (from MS) using grant/revoke access
- Not required to map/unmap ScaleIO volume from the storage adaptor
* Updated connect disk to wait for the ScaleIO volume to become available on the KVM host
* Updated ScaleIO storage provider, pool type, URL scheme and related parameters to the new "PowerFlex" brand
* Fixed size rounding issue while creating PowerFlex volume and added validations to PowerFlex Gateway API client
* Updated the host SDC connection check for ScaleIO/PowerFlex pools on host connect
* Updated volume snapshot support for volumes on ScaleIO/PowerFlex storage pools and added validations for ScaleIO disks on the host
* Added primary storage level configurable setting "storage.pool.disk.wait" to wait for disk availability
- Configure the disk availability wait time; introduced mainly for ScaleIO/PowerFlex storage pools (can be used for other managed storage) to wait for the disk to become available on the host before performing any operation on it
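A minimal sketch of the wait behaviour this setting controls; the device path, polling interval and helper name are illustrative assumptions, not the actual agent code:

```python
import os
import time

def wait_for_disk(device_path, wait_seconds):
    # Poll until the mapped disk appears on the KVM host, or give up once
    # the configured storage.pool.disk.wait window has elapsed.
    deadline = time.time() + wait_seconds
    while time.time() < deadline:
        if os.path.exists(device_path):
            return True
        time.sleep(1)
    return False
```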
* Enabled template spooling to the ScaleIO/PowerFlex storage pool and VM creation from the spooled template.
Added ScaleIO SDC limits support for volumes using offering parameters: bandwidthLimitInKbps, iopsLimit.
* Added support for VM snapshots on ScaleIO/PowerFlex storage pool
Minor improvements for IOPS (SDC Limits) configuration
* Updated access for ScaleIO/PowerFlex volumes on VM Start and Stop
Added primary storage level configurable setting "storage.pool.client.timeout" for storage API client
Enabled cluster wide storage pool support for ScaleIO/PowerFlex storage
Minor improvements for ScaleIO/PowerFlex disk access in the KVM host
* Added support for direct download of templates (raw, qcow2) on ScaleIO/PowerFlex storage pool
* Added support for config drives in host cache for KVM
- Changed configuration "vm.configdrive.primarypool.enabled" scope from Global to Zone level
- Introduced new zone level configuration "vm.configdrive.force.host.cache.use" (default: false) to force host cache for config drives
- Introduced new zone level configuration "vm.configdrive.use.host.cache.on.unsupported.pool" (default: true) to use host cache for config drives when storage pool doesn't support config drive
- Added new parameter "vm.configdrive.host.cache.location" (default: /var/cache/cloud) in KVM agent.properties for specifying the host cache path for config drives
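A rough sketch of how these settings might interact when choosing where to build a config drive; the precedence and the secondary-storage fallback are assumptions based only on the descriptions above, not the actual CloudStack logic:

```python
def pick_config_drive_location(pool_supports_config_drive,
                               primarypool_enabled,      # vm.configdrive.primarypool.enabled
                               force_host_cache,         # vm.configdrive.force.host.cache.use
                               use_cache_on_unsupported, # vm.configdrive.use.host.cache.on.unsupported.pool
                               host_cache_path="/var/cache/cloud"):
    # Force host cache regardless of pool support when requested.
    if force_host_cache:
        return host_cache_path
    # Build the config drive on the primary storage pool when allowed and supported.
    if primarypool_enabled and pool_supports_config_drive:
        return "primary-storage-pool"
    # Fall back to the host cache for pools that cannot host config drives.
    if use_cache_on_unsupported:
        return host_cache_path
    # Otherwise keep the pre-existing behaviour (assumed: secondary storage).
    return "secondary-storage"
```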
* Updated disk access while migrating the VM with volumes on ScaleIO/PowerFlex storage pool
Changed the parameter "vm.configdrive.host.cache.location" to "host.cache.location" (default: /var/cache/cloud) in KVM agent.properties to specify the host cache path
Changes to create config drives on the "/config" directory on the host cache path
Changes to support migrating a VM with a config drive on the host cache path
* Additional changes to support migrating a VM with a config drive on the host cache
* Detect the virtual size from the template URL while registering direct download qcow2 templates (KVM hypervisor) (see the sketch below)
Updated full deployment destination for preparing the network(s) on VM start
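A minimal sketch of how a qcow2 virtual size can be detected from a URL without downloading the whole file; this illustrates the idea only and is not CloudStack's implementation. The qcow2 header stores the virtual size as a big-endian 64-bit integer at byte offset 24:

```python
import struct
import urllib.request

def qcow2_virtual_size(url):
    # Fetch only the first 32 bytes of the image via an HTTP range request.
    req = urllib.request.Request(url, headers={"Range": "bytes=0-31"})
    header = urllib.request.urlopen(req).read(32)
    if header[:4] != b"QFI\xfb":
        raise ValueError("not a qcow2 image")
    # Bytes 24..31 hold the virtual size in bytes (big-endian).
    return struct.unpack(">Q", header[24:32])[0]
```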
* Propagate the direct download certificates uploaded to the newly added KVM hosts
* Code improvements for ScaleIO/PowerFlex storage plugin
* Updated storage stats collection and tests for ScaleIO/PowerFlex storage plugin
* Fix for template size of direct download templates on capacity check for ScaleIO/PowerFlex storage pool
Updated data object grant and revoke access for connected SDCs to ScaleIO/PowerFlex storage pool
* Discover the template size for direct download templates using any available host from the zones specified on template registration
When no zones are specified at template registration, size discovery is performed using any available host, picked at random from one of the available zones
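A sketch of the host-selection rule described above; the data structures and helper name are assumptions:

```python
import random

def pick_discovery_host(zone_ids, hosts_by_zone):
    # Prefer hosts from the zones given at registration; otherwise consider
    # every available zone and pick a host at random.
    candidate_zones = list(zone_ids) if zone_ids else list(hosts_by_zone)
    random.shuffle(candidate_zones)
    for zone in candidate_zones:
        hosts = hosts_by_zone.get(zone, [])
        if hosts:
            return random.choice(hosts)
    return None
```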
* Maintain the config drive location and use it when required on any config drive operation (migrate, delete)
* Ensure the volume to be expunged is expunge-ready during storage cleanup
* Do not set the storage migration flag for the volumes on zone wide PowerFlex/ScaleIO pool when listing the hosts available for cross-cluster migration
* Release the VM's resources when the VM is synced to the Stopped state on PowerReportMissing (after the grace period)
* Added alerts for PowerFlex/ScaleIO SDC disconnection on the host(s)
* Retry VM deployment/start when the host cannot access volume/template on the ScaleIO/PowerFlex storage
* Changes to find a potential host that can access the ScaleIO/PowerFlex storage pool
* Updated ScaleIO/PowerFlex storage pool stats for checking the available capacity and usage
* Updated the ScaleIO/PowerFlex volume naming convention to avoid naming conflicts when sharing
* Mark never-used or downloaded templates as Destroyed on deletion, without sending any DeleteCommand
- Do not trigger any DeleteCommand for never-used or downloaded templates, as these don't exist and cannot be deleted from the datastore
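A sketch of the deletion rule above, with illustrative field and helper names:

```python
def delete_template(template):
    if template.never_used or not template.downloaded:
        # Nothing exists on the datastore, so only mark the entry Destroyed.
        template.state = "Destroyed"
        return
    # Normal path (hypothetical helper): remove the files, then mark Destroyed.
    send_delete_command(template)
    template.state = "Destroyed"
```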
* Updated ScaleIO/PowerFlex storage pool capacity stats
* Cleanup unused templates and host entries on PowerFlex/ScaleIO storage pool deletion
* Check whether the router filesystem is writable before performing health checks
- Introduce a new test "filesystem.writable.test" to check whether the filesystem is writable
- The router health checks keep the config info at "/var/cache/cloud" and write the monitor results at "/root"; these are on different partitions, so test both locations.
* Updated the router filesystem writable check to use a script instead of command execution
- Added new script "filesystem_writable_check.py" at /opt/cloud/bin/ to check whether the filesystem is writable (see the sketch below)
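A minimal sketch of the writable check, in the spirit of filesystem_writable_check.py (this is not the actual script):

```python
import os
import uuid

def is_writable(directory):
    # Try to create and remove a small probe file; any OSError means the
    # filesystem is read-only or otherwise unusable.
    probe = os.path.join(directory, ".fs_write_probe_" + uuid.uuid4().hex)
    try:
        with open(probe, "w") as f:
            f.write("ok")
        os.remove(probe)
        return True
    except OSError:
        return False

# Config info lives under /var/cache/cloud and monitor results under /root,
# which are on different partitions, so both locations are probed.
for path in ("/var/cache/cloud", "/root"):
    print(path, is_writable(path))
```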
* Update volume stats (physical and virtual size) for the volumes on PowerFlex/ScaleIO storage pool
Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
The BackupSync task would switch between databases to update backup usage
metrics in the cloud_usage.usage_backup table. The current framework and
its use within a ManagedContext cause database connection
(LegacyTransaction) leaks. When the thread runs frequently, the issue is
easily reproducible and can be confirmed via heap dump analysis or JMX
MBeans. This fixes the leak by moving the backup usage data update to the
usage server, publishing usage events instead of switching between
databases in a local thread inside a ManagedContextRunnable.
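The decoupled pattern, sketched loosely below; the event shape and names are assumptions, not CloudStack's usage framework:

```python
def on_backup_usage(account_id, backup_id, size_bytes, publish_event):
    # Emit a usage event from the sync thread instead of opening a second
    # database connection to cloud_usage; the usage server consumes the
    # event and persists it into usage_backup.
    publish_event({
        "type": "BACKUP.USAGE",
        "account_id": account_id,
        "backup_id": backup_id,
        "size": size_bytes,
    })
```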
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
(cherry picked from commit b54d19b3b9)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Fixes #4090
When trying to migrate a VM across two clusters, if a snapshot has been deleted and garbage collection has run to update the removed field, it is not possible to migrate the instance due to a null pointer exception.
(cherry picked from commit 6a683dcf77)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Update AncientDataMotionStrategy.java
Fix: when secondary storage usage is > 90%, volume migration across primary storage fails and the volume is lost
* Update AncientDataMotionStrategy.java
A volume is migrated across primary storage. If no secondary storage is available (or its used capacity is > 90%), the migration is canceled.
Before this change, if no secondary storage could be found, copyVolumeBetweenPools returned null.
copyAsync treated answer = null as a sign of successful task execution, so it deleted the volume on the old primary storage. This is the root cause of the data loss, because the volume was never migrated at all (see the sketch after this list).
* Removed commented-out code
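A sketch of the guard that prevents the data loss described above; the names are illustrative, not the Java code in AncientDataMotionStrategy:

```python
def copy_async(volume, copy_between_pools, delete_source_volume):
    answer = copy_between_pools(volume)
    # A missing or failed answer means the copy never happened, so the
    # source volume must be preserved and the migration canceled.
    if answer is None or not answer.result:
        return "migration canceled; source volume preserved"
    delete_source_volume(volume)
    return "migration complete"
```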
Co-authored-by: div8cn <35140268+div8cn@users.noreply.github.com>
Co-authored-by: Daan Hoogland <dahn@onecht.net>
* 4.13:
Snapshot deletion issues (#3969)
server: Cannot list affinity group if there are hosts dedicated… (#4025)
server: Search zone-wide storage pool when allocation algorithm is firstfitleastconsumed (#4002)
* Fixes snapshot deletion
* Remove legacy '@Component', it is not necessary in this bean/class.
* Fix log message missing %d and remove snapshot from DB
* Remove "dummy" boolean return statement
* Manage snapshot deletion for KVM + NFS (primary storage)
* checkstyle trailing spaces
* rename options strings to *_OPTION
* Fix typo on deleteSnapshotOnSecondaryStorage and enhance log message
* Move the snapshotDao.remove(snapshotId); (#4006)
* Fix deleteSnapshot workflow to handle both snapshots created on primary storage and snapshots backed up to secondary storage
* Fix extra space
* refactor out separate handling methods for secondary and primary (reducing returns)
* return false on unexpected error or log when expected
* != instead of ==
* secondary instead of backup storage
* init to null
* Handle snapshot deletion on primary storage. When the primary store ref is not found for the snapshot, do not fail the operation.
* Fix debug levels on log messages
Co-authored-by: GabrielBrascher <gabriel@apache.org>
Co-authored-by: Andrija Panic <45762285+andrijapanicsb@users.noreply.github.com>
Co-authored-by: Harikrishna Patnala <harikrishna.patnala@gmail.com>
Co-authored-by: nvazquez <nicovazquez90@gmail.com>
* Remove constraint for NFS storage
* Add new property on agent.properties
* Add free disk space on the host prior template download
* Add unit tests for the free space check
* Fix free space check - retrieve available size in bytes (see the sketch after this list)
* Update default location for direct download
* Improve the method to retrieve hosts to retry on depending on the destination pool type and scope
* Verify location for temporary download exists before checking free space
* In progress - refactor and extension
* Refactor and fix
* Last fixes and marvin tests
* Remove unused test file
* Improve logging
* Change default path for direct download
* Fix upload certificate
* Fix ISO failure after retry
* Fix metalink filename mismatch error
* Fix iso direct download
* Fix for direct download ISOs on local storage and shared mount point
* Last fix iso
* Fix VM migration with ISO
* Refactor volume migration to remove secondary storage intermediate
* Fix simulator issue
As previously described by PR #3929:
If vm has attached ISO, the migration fails with error message "org.libvirt.LibvirtException: Cannot access storage file /mnt/b33e5a1d-e4ea-3465-b6ac-c98dc8ff8af0/207-2-cc5fd717-2d57-3bb3-bcf6-2c930268db6c.iso"
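A minimal sketch of the pre-download free-space check described in this list; the helper name is an assumption:

```python
import os

def has_enough_space(download_dir, template_size_bytes):
    # The temporary download location must exist before its free space
    # can be measured.
    if not os.path.isdir(download_dir):
        return False
    st = os.statvfs(download_dir)
    available_bytes = st.f_bavail * st.f_frsize
    return available_bytes >= template_size_bytes
```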
By default, once we create a security group we can't change its name.
This feature introduces a new API command "updateSecurityGroup"
which allows renaming a security group. However, the name of the
"default" security group cannot be changed.
This adds support for JDK11 in CloudStack 4.14+:
- Fixes code to build against JDK11
- Bump to Debian 9 systemvmtemplate with openjdk-11
- Fix Travis to run smoketests against openjdk-11
- Use maven provided jdk11 compatible mysql-connector-java
- Remove old agent init.d scripts
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Enable PVLAN support on L2 networks
* Fix prevent null pointer on details
* Add marvin tests
* Fixes from comments
* Fix: missing pvlan type on PlugNicCommand
* Fix checks on network creation for vlans overlap
* Fix remove prefix from secondary vlan id
* Improve checks on physical network for pvlans
* Fix compatibility with previous pvlan creation
* Fix shared networks backwards pvlan compatibility
* Add ui fix for pvlan type not passed to api
* Add check for isolated vlan id overlap (see the sketch after this list)
* Include check for dynamic vlan reserved for secondary vlan
* Fix marvin tests errors
* Fix redundant imports
* Skip marvin test for pvlan if dvswitch is not present
* spelling
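A loose sketch of the kind of overlap check referred to above; the data model and the exact rules are assumptions:

```python
def pvlan_pair_conflicts(existing_pairs, primary_vlan, secondary_vlan):
    # Reject a new (primary, secondary) PVLAN pair when its isolated/secondary
    # VLAN id is already in use as a primary or secondary VLAN, or its primary
    # VLAN id is already in use as another pair's secondary, on the same
    # physical network.
    for existing_primary, existing_secondary in existing_pairs:
        if secondary_vlan in (existing_primary, existing_secondary):
            return True
        if primary_vlan == existing_secondary:
            return True
    return False
```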
Co-authored-by: Andrija Panic <45762285+andrijapanicsb@users.noreply.github.com>
* server: fix resource count of primary storage if some volumes are Expunged but not removed
Steps to reproduce the issue
(1) create a vm and stop it. check resource count of primary storage
(2) download volume. resource count of primary storage is not changed.
(3) expunge the vm, the volume will be in the Expunged state as there is a volume snapshot on secondary storage. The resource count of primary storage decreases.
(4) update resource count of the account (or domain), the resource count of primary storage is reset to the value in step (2).
* New feature: Add support to destroy/recover volumes
* Add integration test for volume destroy/recover
* marvin: check resource count of more types
* Translate messages to JP
* Update messages for CN
* Translate messages for NL
* fix two issues per Daan's comments
Co-authored-by: Andrija Panic <45762285+andrijapanicsb@users.noreply.github.com>
The VM ingestion feature allows CloudStack to discover, on-board, and import existing VMs in an infrastructure. The feature currently works only for VMware, with a hypervisor-agnostic framework that may be extended to KVM and XenServer in the future.
Add a global setting to disable creating networks with the same name in an account
Add a global setting to disable creating a network without
specifying the start and end IPv4 or IPv6 address
By default, networks with the same name can be created in an account,
which is sometimes undesirable. This change adds a global setting which
controls this behaviour; the default value is true, and setting it to
false prevents creating networks with duplicate names.
It is also possible to create a shared network without specifying the
start and end IPv4 or IPv6 addresses. This change adds a global setting
which prevents creating a shared network without specifying the start
and end IPv4 or IPv6 addresses.
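A sketch of the two validations these settings drive; the setting and field names here are illustrative only:

```python
def validate_network_create(name, existing_account_network_names,
                            start_ip, end_ip,
                            allow_duplicate_names,   # global setting, default: true
                            require_ip_range):       # global setting for shared networks
    # Reject duplicate network names in the account when duplicates are disabled.
    if not allow_duplicate_names and name in existing_account_network_names:
        raise ValueError("a network with this name already exists in the account")
    # Reject shared networks without an explicit IP range when required.
    if require_ip_range and not (start_ip and end_ip):
        raise ValueError("start and end IPv4/IPv6 addresses must be specified")
```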
If the disk size of the VM to be created is greater
than the volume size, then the exception message should
display the numeric value instead of the variable name
Fixes #3191
When a template is registered, the code stores the md5sum of the downloaded file in the vm_template table. However, this downloaded file could be deleted after template installation if it is not an actual (.qcow2, .ova, etc.) file. When the user copies a template using the copyTemplate API, the actual template file is copied across the image stores. Matching the checksum of the copied template file against the stored value from the vm_template table then results in a mismatch.
The change sets an empty checksum value for the copied template when passing it to the download service, which skips the wrong-checksum check for the copied template during install.
However, this results in a changed checksum value for the concerned template entry in the vm_template table after template install.
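A sketch of the resulting verification behaviour; the helper name is illustrative:

```python
import hashlib

def checksum_ok(path, expected_md5):
    # Copied templates are handed to the download service with an empty
    # checksum, which skips verification instead of failing against the
    # stale md5sum recorded at registration time.
    if not expected_md5:
        return True
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest() == expected_md5
```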
Co-authored-by: dahn <daan.hoogland@gmail.com>
* marvin: check resource count of more types
* New feature: add flag resource.count.running.vms.only to count resource consumption of only running vms
Stopped VMs do not actually use CPU/RAM.
A new global configuration, resource.count.running.vms.only, is added to determine whether only running VMs (including Starting/Stopping) are taken into account when calculating CPU/memory resource consumption (see the sketch after this list).
* Add integration test for resource count of only running vms
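A sketch of the accounting rule; the exact set of counted states beyond Running/Starting/Stopping is an assumption:

```python
COUNTED_STATES = {"Running", "Starting", "Stopping"}

def cpu_and_ram_usage(vms, count_running_only):
    cpu = ram = 0
    for vm in vms:
        # With resource.count.running.vms.only enabled, stopped VMs are
        # skipped since they do not actually consume CPU/RAM.
        if count_running_only and vm.state not in COUNTED_STATES:
            continue
        cpu += vm.cpu_cores
        ram += vm.ram_mb
    return cpu, ram
```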
Stop asking the user (in the upgrade documentation) to remove the trailing slash from the local KVM pool path - do it here in the upgrade path instead - so it is no longer needed in the docs for upgrades to 4.14 and onwards.
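A one-line sketch of the normalization now done in the upgrade path, assuming a simple trailing-slash strip:

```python
def normalize_local_pool_path(path):
    # Strip the trailing slash from the local KVM pool path, keeping "/" intact.
    return path if path == "/" else path.rstrip("/")
```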
When starting a VM or migrating a VM (away from a host in maintenance), CloudStack checks the capacity of all hosts and chooses one. If there are hundreds of hosts on the platform, this takes some seconds. By the time CloudStack starts/migrates the VM to the chosen host, the host's resource consumption may have changed. This normally happens when starting/migrating multiple VMs.
It is better to double-check the host capacity when starting a VM on a host.
This PR includes the fix for CPU core capacity when starting/migrating a VM.