* scaleio: prototype storage plugin
- plugin skeleton
- add storage pool, create/attach data disk
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* kvm: attach disk example
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Updated ScaleIO storage plugin to support Volume operations
* ScaleIO storage plugin - Support for VM operations and other updates
* ScaleIO storage pool plugin changes
- Added validation to check existing ScaleIO storage pool and update capacity details
- Updated resize volume for ScaleIO to pick the rounded 8GB boundary size
- Added support for setting ScaleIO storage pool statistics (bandwidthLimitInKbps, iopsLimit)
* Fixed IOPS validation and volume size update when resizing ScaleIO volume
* Removed connect/disconnect disk changes from ScaleIO storage adaptor
- ScaleIO datastore driver does map/unmap ScaleIO volume (from MS) using grant/revoke access
- Not required to map/unmap ScaleIO volume from the storage adaptor
* Updated connect disk, to wait for ScaleIO volume to become available in the KVM host
* Updated ScaleIO storage provider, pool type, url scheme and related paramters to the new "PowerFlex" brand
* Fixed size rounding issue while creating PowerFlex volume and added validations to PowerFlex Gateway API client
* Updated host sdc connection check for ScaleIO/PowerFlex pool on host connect
* Updated volume snapshots support for volumes on ScaleIO/PowerFlex storage pool and Added some validations for ScaleIO disks in host
* Added primary storage level configurable setting "storage.pool.disk.wait" to wait for disk availability
- Confiure the disk availability wait time, mainly introduced for ScaleIO/PowerFlex storage pool (can be used for other managed storages), to wait for the disk to become available in the host before performing any operation on it
* Enabled template spooling to ScaleIO/PowerFlex storage pool and create VM from the spooled template.
Added ScaleIO SDC limits support for volumes using offering parameters: bandwidthLimitInKbps, iopsLimit.
* Added support for VM snapshots on ScaleIO/PowerFlex storage pool
Minor improvements for IOPS (SDC Limits) configuration
* Updated access for ScaleIO/PowerFlex volumes on VM Start and Stop
Added primary storage level configurable setting "storage.pool.client.timeout" for storage API client
Enabled cluster wide storage pool support for ScaleIO/PowerFlex storage
Minor improvements for ScaleIO/PowerFlex disk access in the KVM host
* Added support for direct download of templates (raw, qcow2) on ScaleIO/PowerFlex storage pool
* Added support for config drives in host cache for KVM
- Changed configuration "vm.configdrive.primarypool.enabled" scope from Global to Zone level
- Introduced new zone level configuration "vm.configdrive.force.host.cache.use" (default: false) to force host cache for config drives
- Introduced new zone level configuration "vm.configdrive.use.host.cache.on.unsupported.pool" (default: true) to use host cache for config drives when storage pool doesn't support config drive
- Added new parameter "vm.configdrive.host.cache.location" (default: /var/cache/cloud) in KVM agent.properties for specifying the host cache path for config drives
* Updated disk access while migrating the VM with volumes on ScaleIO/PowerFlex storage pool
Changed the parameter "vm.configdrive.host.cache.location" to "host.cache.location" (default: /var/cache/cloud) in KVM agent.properties to specify the host cache path
Changes to create config drives on the "/config" directory on the host cache path
Changes to suppport migrate VM with config drive on the host cache path
* Additonal changes to support migrate VM with config drive on the host cache
* Detect virtual size from the template URL while registering direct download qcow2 (of KVM hypervisor) templates
Updated full deployment destination for preparing the network(s) on VM start
* Propagate the direct download certificates uploaded to the newly added KVM hosts
* Code improvements for ScaleIO/PowerFlex storage plugin
* Updated storage stats collection and tests for ScaleIO/PowerFlex storage plugin
* Fix for template size of direct download templates on capacity check for ScaleIO/PowerFlex storage pool
Updated data object grant and revoke access for connected SDCs to ScaleIO/PowerFlex storage pool
* Discover the template size for direct download templates using any available host from the zones specified on template registration
When zones are not specified while registering template, template size discovery is performed using any available host, which is picked up randomly from one of the available zones
* Maintain the config drive location and use it when required on any config drive operation (migrate, delete)
* Ensure the volume to be expunged, is expunge ready on storage cleanup
* Do not set the storage migration flag for the volumes on zone wide PowerFlex/ScaleIO pool when listing the hosts available for cross-cluster migration
* Release the VM resources when VM is sync-ed to Stopped state on PowerReportMissing (after graceful period)
* Added alerts for PowerFlex/ScaleIO SDC disconnection on the host(s)
* Retry VM deployment/start when the host cannot access volume/template on the ScaleIO/PowerFlex storage
* Changes to find a potential host that can access the ScaleIO/PowerFlex storage pool
* Updated ScaleIO/PowerFlex storage pool stats for checking the available capacity and usage
* Updated ScaleIO/PowerFlex volumes naming convention to avoid the naming conflicts on sharing
* Mark never-used or downloaded templates as Destroyed on deletion, without sending any DeleteCommand
- Do not trigger any DeleteCommand for never-used or downloaded templates as these doesn't exist and cannot be deleted from the datastore
* Updated ScaleIO/PowerFlex storage pool capacity stats
* Cleanup unused templates and host entries on PowerFlex/ScaleIO storage pool deletion
* Check the router filesystem is writable or not, before performing health checks
- Introduce a new test "filesystem.writable.test" to check the filesystem is writable or not
- The router health checks keeps the config info at "/var/cache/cloud" and updates the monitor results at "/root" for health checks, both are different partitions. So, test at both the locations.
* Updated the router filesystem writable check using script, instead cmd execution
- Added new script: "filesystem_writable_check.py" at /opt/cloud/bin/ to check the filesystem is writable or not
* Update volume stats (physical and virtual size) for the volumes on PowerFlex/ScaleIO storage pool
Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
BackupSync task would switch between databases to update backup usage
metrics in the cloud_usage.usage_backup table. The current framework
and the usage in ManagedContext causes database connection
(LegacyTransaction) leaks. When the thread runs faster, the issue is
easily reproducible and checking via heap dump analysis or using JMX
MBeans. This fixes by moving the task of backup data updation for
usage data to the usage server by publishing usage events instead of
switching between databases in a local thread while in a
ManagedContextRunnable.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
(cherry picked from commit b54d19b3b9)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* 4.13:
Snapshot deletion issues (#3969)
server: Cannot list affinity group if there are hosts dedicated… (#4025)
server: Search zone-wide storage pool when allocation algothrim is firstfitleastconsumed (#4002)
Though VMware does not support security groups, but in a basic zone with VMware and no isolation VMs should be able to deploy.
Root cause:
In case of VMware and basic zone control nic is set to 0.0.0.0 assuming control network will be shared with guest network.
But to have access to VMware instances management/private needs to be assigned to it.
Solution:
Assing a private ip even in case of basic zone VMware.
The listZonesMetrics does not return same keys are listZones as the
default response view is restricted. This fixes that by ensuring that
for root admin full response view is used.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Validate API IOPS normal and maximum read/write values.
Ensures that normal read/write cannot be greater than Maximum
read/write. Additionally, it was added a global settings
'iops.maximum.rate.length'.
'iops.maximum.rate.length' sets the maximum IOPS read/write length
(seconds) accepted; thus, preventing irrealistic values for a disk
offering (e.g. hours or days of burst IOPS). The default value is 0
(zero) and allows any IOPS maximum rate length. Example:
iops.maximum.rate.length = 3600 sets the maximum IOPS length
accepted for a disk offering as 3600 seconds (60 minutes).
* Fix log String.format message from %s to %d
* Add bytes rate validation
* Refactoring to cover Read/Write Bytes and IOPS length validation
* Fix "copy-paste" issue with bytes write rate max length
Change Response view to Full for Admin user
Co-authored-by: Pearl Dsilva <pearl.dsilva@shapeblue.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Remove constraint for NFS storage
* Add new property on agent.properties
* Add free disk space on the host prior template download
* Add unit tests for the free space check
* Fix free space check - retrieve avaiable size in bytes
* Update default location for direct download
* Improve the method to retrieve hosts to retry on depending on the destination pool type and scope
* Verify location for temporary download exists before checking free space
* In progress - refactor and extension
* Refactor and fix
* Last fixes and marvin tests
* Remove unused test file
* Improve logging
* Change default path for direct download
* Fix upload certificate
* Fix ISO failure after retry
* Fix metalink filename mismatch error
* Fix iso direct download
* Fix for direct download ISOs on local storage and shared mount point
* Last fix iso
* Fix VM migration with ISO
* Refactor volume migration to remove secondary storage intermediate
* Fix simulator issue
As previously described by PR #3929:
If vm has attached ISO, the migration fails with error message "org.libvirt.LibvirtException: Cannot access storage file /mnt/b33e5a1d-e4ea-3465-b6ac-c98dc8ff8af0/207-2-cc5fd717-2d57-3bb3-bcf6-2c930268db6c.iso"
* Userdata to display static NAT as public ip instead of VR ip
If static nat is enabled on VM then metadata service should return
the static nat instead of gateway IP.
If static not is not enabled then it should return the gateway IP
as the public IP
Test results:
Step to reproduce:
1. Create a vm
2. Ssh to vm.
3. Run the below command inside the vm
wget http://<VR public ip>/latest/meta-data/public-ipv4
Note down the output of the above command
4. Now acquire a new public and enable static NAT on that IP to this vm
5. Now run the same command mentioned above in the VM
This should display the static NAT ip instead of VR public IP
Output:
Before enabling static nat
wget http://10.10.10.40/latest/meta-data/public-ipv4
$ cat public-ipv4
10.10.10.29
After enabling static nat
wget http://10.10.10.40/latest/meta-data/public-ipv4
$ cat public-ipv4
10.11.10.30
* server: apply vm user data when release a public ip
Co-authored-by: Wei Zhou <ustcweizhou@gmail.com>
in 4.13, list sshkeypairs with keyword will ignore the search by name if name is specifed
Fixes an issue in #3098
for example,
(local) > list sshkeypairs name=wei keyword=wei filter=name
{
"count": 3,
"sshkeypair": [
{
"name": "wei3"
},
{
"name": "wei2"
},
{
"name": "wei"
}
]
}
with this patch ,it gives correct result.
(local) > list sshkeypairs name=wei keyword=wei filter=name
{
"count": 1,
"sshkeypair": [
{
"name": "wei"
}
]
}
When both routers of VPC is in MASTER state
then multiple alerts are sent equally to the number of tiers in the VPC.
If the VPC has 3 tiers then 6 alerts will be sent. This is not good
if VPC has more than 10 networks in it.
Instead of checking the router status for all the tiers in the VPC,
just check the status of the router for one tier in a VPC so that
multiple duplicate alerts can be avoided
Currently, the cloudstack sends VM password only to the first
router in the network even if its the backup and return the result.
In some cases the first router will be back up and the second will be master.
Since password server is not running in backup, when the user resets the password,
it is sent to the first router which can be backup.
In that case, the new password is not stored in the password server and users cant log in with a new password.
This change ensures that we send the password to both the routers instead
of the first router so that a new password is stored in the master router.