* Use cryptsetup w/o zeroing for encrypted scaleio - faster
Signed-off-by: Marcus Sorensen <mls@apple.com>
* Pass storage scope during KVM volume migration to avoid remotely mounting local storage
Signed-off-by: Marcus Sorensen <mls@apple.com>
* Add method to choose template pool based on scope
Signed-off-by: Marcus Sorensen <mls@apple.com>
* Clean up null check when creating migration options
Signed-off-by: Marcus Sorensen <mls@apple.com>
* ScaleIO enhancements - thin/thick encrypted, online resize
Signed-off-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Marcus Sorensen <mls@apple.com>
This PR introduces volume encryption option to service offerings and disk offerings. Fixes#136
There is a hypervisor component and a storage pool component. Hypervisors are responsible for being capable of running/using the encrypted volumes. Storage pools are responsible for being able to create, copy, resize, etc. Hypervisors will report encryption support in their details, storage pools are marked for encryption support by pool type.
The initial offering for experimental release of this feature will have support for encryption on Local, NFS, SharedMountPoint, and ScaleIO storage types.
When volumes choosing an encrypted offering are allocated to a pool, the pool type must be capable of supporting encryption and this is enforced.
When VMs are started and they have an encrypted volume, the hypervisor must be capable of supporting encryption. Also, if volumes are attached to running VMs, the attach will only work if the hypervisor supports encryption.
This change includes a few other minor changes - for example the ability to force the KVM hypervisor private IP. This was necessary in my testing of ScaleIO, where the KVM hypervisors had multiple IPs and the ScaleIO storage only functions if the hypervisor as a ScaleIO client matches IPs with what CloudStack sees as the hypervisor IP.
For experimental release of this feature, some volume workflows like extract volume and migrate volume aren't supported for encrypted volumes. In the future we could support these, as well as migrating from unencrypted to encrypted offerings, and vice versa.
It may also be possible to configure encryption specifics in the future, perhaps at the pool level or the offering level. Currently, there is only one workable encryption offering for KVM that is supported by Libvirt and Qemu for raw and qcow2 disk files, LUKS version 1. This PR ensures we at least store this encryption format associated with each volume, with the expectation that later we may have LUKS v2 volumes or something else. Thus we will have the information necessary to use each volume with Libvirt if/when other formats are introduced.
I think the most disruptive change here is probably a refactoring of the QemuImg utility to support newer flags like --object. I've tested the change against the basic Qemu 1.5.3 that comes with EL7 and I believe it is good, but it will be nice to see the results of some functional tests. Most of the other changes are limited to changing behavior only if volume encryption is requested.
Working on documentation for the CloudStack docs. One thing to note is that hypervisors that run the stock EL7 version of Qemu will not support encryption. This is tested to be detected and report properly via the CloudStack API/UI. I intend to like to have a support matrix in the CloudStack docs.
I may add a few more unit tests. I'd also like some guidance on having functional tests. I'm not sure if there's a separate framework, or if Marvin is still used, or what the current thing is.
* Add Qemu object flag to QemuImg create
* Add apache license header to new files
* Add Qemu object flag to QemuImg convert
* Set host details if hypervisor supports LUKS
* Add disk encrypt flag to APIs, diskoffering
* Schema upgrade 4.16.0.0 to 4.16.1.0 to support vol encryption
* Add Libvirt secret on disk attach, and refer to it in disk XML
* Add implementation of luks volume encryption to QCOW2 and RAW disk prep
* Start VMs that have encrypted volumes
* Add encrypt option to service offering and root volume provisioning
* Refactor volume passphrase into its own table and object
* CryptSetup, use key files to pass keys instead of command line
* Update storage types and allocators to select encryption support
* Allow agent.properties to define the hypervisor's private IP
* Implement createPhysicalDisk for ScaleIOStorageAdaptor
* UI: Add encrypt options to offerings
* UI module security updates
* Revert "UI module security updates" - belongs in base
This reverts commit a7cb7cf7f57aad38f0b5e5d67389c187b88ffd94.
* Add --target-is-zero support for QemuImg
* Allow qemu image options to be passed, API support convert encrypted
* Switch hypervisor encryption support detection to use KeyFiles
* Fixes for ScaleIO root disk encryption
* Resize root disk if it won't fit encryption header
* Use cryptsetup to prep raw root disks, when supported
* Create qcow2 formatting if necessary during initial template copy to ScaleIO
* Allow setting no cache for qemu-img during disk convert
* Use 1M sparse on qemu-img convert for zero target disks
* UI: Add volume encryption support to hypervisor details
* QemuImg use --image-opts and --object depending on version
* Only send storage commands that require encryption to hosts that support encryption
* Move host encryption detail to a static constant
* Update host selection to account for volume encryption support
Only attach volumes if encryption requirements are met
* Ensure resizeVolume won't allow changing encryption
* Catch edge cases for clearing passphrase when volume is removed
* Disable volume migration and extraction for encrypted volumes
* Register volume secret on destination host during live migration
* Fix configdrive path editing during live migration
* Ensure configdrive path is edited properly during live migration
* Pass along and store volume encryption format during creation
* Fixes for rebase
* Fix tests after rebase
* Add unit tests for DeploymentPlanningManagerImpl to support encryption
* Deployment planner tests for encryption support on last host
* Add deployment tests for encryption when calling planner
* Added Libvirt DiskDef test for encryption details
* Add test for KeyFile utility
* Add CryptSetup tests
* Add QemuImageOptionsTest
* add smoke tests for API level changes on create/list offerings
* Fix schema upgrade, do disk_offering_view first
* Fix UI to show hypervisor encryption support
* Load details into hostVO before trying to query them for encryption
* Remove whitespace in CreateNetworkOfferingTest
* Move QemuImageOptions to use constants for flag keys
* Set physical disk encrypt format during createDiskFromTemplate in KVM Agent
* Whitespace in AbstractStoragePoolAllocator
* Fix whitespace in VolumeDaoImpl
* Support old Qemu in convert
* Log how long it takes to generate a passphrase during volume creation
* Move passphrase generation to async portion of createVolume
* Revert "Allow agent.properties to define the hypervisor's private IP"
This reverts commit 6ea9377505f0e5ff9839156771a241aaa1925e70.
* Updated ScaleIO/PowerFlex storage plugin to support separate (storage) network for Host(KVM) SDC connection. (#144)
* Added smoke tests for volume encryption (in KVM). (#149)
* Updated ScaleIO pool unit tests.
* Some improvements/fixes for code smells (in ScaleIO storage plugin).
* Updated review changes for ScaleIO improvements.
* Updated host response parameter 'encryptionsupported' in the UI.
* Move passphrase generation for the volume to async portion, while deploying VM (#158)
* Move passphrase generation for the volume to async portion, while deploying VM.
* Updated logs, to include volume details.
* Fix schema upgrade, create passphrase table first
* Fixed the DB upgrade issue (as noticed in the logs below.)
DEBUG [c.c.u.d.ScriptRunner] (main:null) (logid:) CALL `cloud`.`IDEMPOTENT_ADD_FOREIGN_KEY`('cloud.volumes', 'passphrase', 'id')
ERROR [c.c.u.d.ScriptRunner] (main:null) (logid:) Error executing: CALL `cloud`.`IDEMPOTENT_ADD_FOREIGN_KEY`('cloud.volumes', 'passphrase', 'id')
ERROR [c.c.u.d.ScriptRunner] (main:null) (logid:) java.sql.SQLException: Failed to open the referenced table 'passphrase'
ERROR [c.c.u.DatabaseUpgradeChecker] (main:null) (logid:) Unable to execute upgrade script
* Fixes for snapshots with encrypted qcow2
Fixes#159#160#163
* Support create/delete encrypted snapshots of encrypted qcow2 volumes
* Select endpoints that support encryption when snapshotting encrypted volumes
* Update revert snapshot to be compatible with encrypted snapshots
* Disallow volume and template create from encrypted vols/snapshots
* Disallow VM memory snapshots on encrypted vols. Fixes#157
* Fix for TemplateManagerImpl unit test failure
* Support offline resize of encrypted volumes. Fixes#168
* Fix for resize volume unit tests
* Updated libvirt resize volume unit tests
* Support volume encryption on kvm only, and passphrase generation refactor (#169)
* Fail deploy VM when ROOT/DATA volume's offering has encryption enabled, on non-KVM hypervisors
* Fail attach volume when volume's offering has encryption enabled, on non-KVM hypervisors
* Refactor passphrase generation for volume
* Apply encryption to dest volume for live local storage migration
fixes#161
* Apply encryption to data volumes during live storage migration
Fixes#161
* Use the same encryption passphrase id for migrating volumes
* Pass secret consumer during storage migration prepare
Fix for #161
* Fixes create / delete volume snapshot issue, for stopped VMs
* Block volume snapshot if encrypted and VM is running
Fixes#159
* Block snap schedules on encrypted volumes
Fix for #159
* Support cryptsetup where luks type defaults to 2
Fixes#170
* Modify domain XML secret UUID when storage migrating VM
Fix for #172
* Remove any libvirt secrets on VM stop and post migration
Fix for #172
* Update disk profile with encryption requirement from the disk offering (#176)
Update disk profile with encryption requirement from the disk offering
and some code improvements
* Updated review changes / javadoc in ScaleIOUtil
Co-authored-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
* Randomize managed volume copy host
* Managed volume copy was always returning first host that could see storage pools
* Fix null value in logging for ScaleIOPrimaryDataStoreDriver due to if/else logic error
Signed-off-by: Marcus Sorensen <mls@apple.com>
* Use String.format for ScaleIO debug message
Signed-off-by: Marcus Sorensen <mls@apple.com>
* Update debug message for ScaleIO copy methods
Signed-off-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Marcus Sorensen <mls@apple.com>
* Fix resize volume and migrate volume to update volume path if DRS is applied on volume in datastore cluster
* Change in constructors
* Naming changes
* Remove commented code
* Refactor code for more readability
* Addressed review comments on code refactor
This adds a volume(primary) storage plugin for the Linstor SDS.
Currently it can create/delete/migrate volumes, snapshots should be possible,
but currently don't work for RAW volume types in cloudstack.
* plugin-storage-volume-linstor: notify libvirt guests about the resize
* server: do not remove volume from DB if fail to expunge it from primary storage or secondary storage
* server/VolumeApiServiceImpl.java: move to method
* update #5373
* Fix of creating volumes from snapshots without backup
When few snaphots are created onyl on primary storage, and try to create
a volume or a template from the snapshot only the first operation is
successful. Its because the snapshot is backup on secondary storage with
wrong SQL query. The problem appears on Ceph/NFS but may affects other
storage plugins.
Bypassing secondary storage is implemented only for Ceph primary storage
and it didn't cover the functionality to create volume from snapshot
which is kept only on Ceph
* Address review
* Added disk provisioning type support for VMWare
* Review changes
* Fixed unit test
* Review changes
* Added missing licenses
* Review changes
* Update StoragePoolInfo.java
Removed white space
* Review change - Getting disk provisioning strictness setting using the zone id and not the pool id
* Delete __init__.py
* Merge fix
* Fixed failing test
* Added comment about parameters
* Added error log when update fails
* Added exception when using API
* Ordering storage pool selection to prefer thick disk capable pools if available
* Removed unused parameter
* Reordering changes
* Returning storage pool details after update
* Removed multiple pool update, updated marvin test, removed duplicate enum
* Removed comment
* Removed unused import
* Removed for loop
* Added missing return statements for failed checks
* Class name change
* Null pointer
* Added more info when a deployment fails
* Null pointer
* Update api/src/main/java/org/apache/cloudstack/api/BaseListCmd.java
Co-authored-by: dahn <daan.hoogland@gmail.com>
* Small bug fix on API response and added missing bracket
* Removed datastore cluster code
* Removed unused imports, added missing signature
* Removed duplicate config key
* Revert "Added more info when a deployment fails"
This reverts commit 2486db78dc.
Co-authored-by: dahn <daan.hoogland@gmail.com>
- Added connection manager to the gateway client.
- Renew the client session on '401 Unauthorized' response.
- Refactored the gateway client calls, for GET and POST methods.
- Consume the http entity content after login/(re)authentication and close the content stream if exists.
- Updated storage pool client connection timeout configuration 'storage.pool.client.timeout' to non-dynamic.
- Added storage pool client max connections configuration 'storage.pool.client.max.connections' (default: 100) to specify the maximum connections for the ScaleIO storage pool client.
- Updated unit tests.
and blocked the attach volume operation for uploaded volume on ScaleIO/PowerFlex storage pool
* server: destroy ssvm, cpvm on last host maintenance
When a single or last UP host enters into maintenance just stopping SSVM and CPVM will leave behind VMs on hypervisor side. As these system vms will be recreated they can be destroyed.
Fixes#3719
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix methods
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* immediately destroy systemvms
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix destroy
Added bypassHostMaintenance flag in Comma.java class to allow command to be handled by host agent even when host is in maintenace.
Flag is set true only for delete commands for ssvm and cpvm.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* unit test fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix missing return statement
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix
VM should be stopped with cleanup before calling expunge else it server may through error with host in PrepareForMaintenance state.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* refactor
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* rename
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* refactor
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Added support for PowerFlex/ScaleIO (v3.5 onwards) storage pool as a primary storage in CloudStack (for KVM hypervisor) and enabled VM/Volume operations on that pool (using pool tag).
Please find more details in the FS here:
https://cwiki.apache.org/confluence/x/cDl4CQ
Documentation PR: apache/cloudstack-documentation#169
This enables support for PowerFlex/ScaleIO (v3.5 onwards) storage pool as a primary storage in CloudStack
Other improvements addressed in addition to PowerFlex/ScaleIO support:
- Added support for config drives in host cache for KVM
=> Changed configuration "vm.configdrive.primarypool.enabled" scope from Global to Zone level
=> Introduced new zone level configuration "vm.configdrive.force.host.cache.use" (default: false) to force host cache for config drives
=> Introduced new zone level configuration "vm.configdrive.use.host.cache.on.unsupported.pool" (default: true) to use host cache for config drives when storage pool doesn't support config drive
=> Added new parameter "host.cache.location" (default: /var/cache/cloud) in KVM agent.properties for specifying the host cache path and create config drives on the "/config" directory on the host cache path
=> Maintain the config drive location and use it when required on any config drive operation (migrate, delete)
- Detect virtual size from the template URL while registering direct download qcow2 (of KVM hypervisor) templates
- Updated full deployment destination for preparing the network(s) on VM start
- Propagate the direct download certificates uploaded to the newly added KVM hosts
- Discover the template size for direct download templates using any available host from the zones specified on template registration
=> When zones are not specified while registering template, template size discovery is performed using any available host, which is picked up randomly from one of the available zones
- Release the VM resources when VM is sync-ed to Stopped state on PowerReportMissing (after graceful period)
- Retry VM deployment/start when the host cannot grant access to volume/template
- Mark never-used or downloaded templates as Destroyed on deletion, without sending any DeleteCommand
=> Do not trigger any DeleteCommand for never-used or downloaded templates as these doesn't exist and cannot be deleted from the datastore
- Check the router filesystem is writable or not, before performing health checks
=> Introduce a new test "filesystem.writable.test" to check the filesystem is writable or not
=> The router health checks keeps the config info at "/var/cache/cloud" and updates the monitor results at "/root" for health checks, both are different partitions. So, test at both the locations.
=> Added new script: "filesystem_writable_check.py" at /opt/cloud/bin/ to check the filesystem is writable or not
- Fixed NPE issue, template is null for DATA disks. Copy template to target storage for ROOT disk (with template id), skip DATA disk(s)
* Addressed some issues for few operations on PowerFlex storage pool.
- Updated migration volume operation to sync the status and wait for migration to complete.
- Updated VM Snapshot naming, for uniqueness in ScaleIO volume name when more than one volume exists in the VM.
- Added sync lock while spooling managed storage template before volume creation from the template (non-direct download).
- Updated resize volume error message string.
- Blocked the below operations on PowerFlex storage pool:
-> Extract Volume
-> Create Snapshot for VMSnapshot
* Added the PowerFlex/ScaleIO client connection pool to manage the ScaleIO gateway clients, which uses a single gateway client per Powerflex/ScaleIO storage pool and renews it when the session token expires.
- The token is valid for 8 hours from the time it was created, unless there has been no activity for 10 minutes.
Reference: https://cpsdocs.dellemc.com/bundle/PF_REST_API_RG/page/GUID-92430F19-9F44-42B6-B898-87D5307AE59B.html
Other fixes included:
- Fail the VM deployment when the host specified in the deployVirtualMachine cmd is not in the right state (i.e. either Resource State is not Enabled or Status is not Up)
- Use the physical file size of the template to check the free space availability on the host, while downloading the direct download templates.
- Perform basic tests (for connectivity and file system) on router before updating the health check config data
=> Validate the basic tests (connectivity and file system check) on router
=> Cleanup the health check results when router is destroyed
* Updated PowerFlex/ScaleIO storage plugin version to 4.16.0.0
* UI Changes to support storage plugin for PowerFlex/ScaleIO storage pool.
- PowerFlex pool URL generated from the UI inputs(Gateway, Username, Password, Storage Pool) when adding "PowerFlex" Primary Storage
- Updated protocol to "custom" for PowerFlex provider
- Allow VM Snapshot for stopped VM on KVM hypervisor and PowerFlex/ScaleIO storage pool
and Minor improvements in PowerFlex/ScaleIO storage plugin code
* Added support for PowerFlex/ScaleIO volume migration across different PowerFlex storage instances.
- findStoragePoolsForMigration API returns PowerFlex pool(s) of different instance as suitable pool(s), for volume(s) on PowerFlex storage pool.
- Volume(s) with snapshots are not allowed to migrate to different PowerFlex instance.
- Volume(s) of running VM are not allowed to migrate to other PowerFlex storage pools.
- Volume migration from PowerFlex pool to Non-PowerFlex pool, and vice versa are not supported.
* Fixed change service offering smoke tests in test_service_offerings.py, test_vm_snapshots.py
* Added the PowerFlex/ScaleIO volume/snapshot name to the paths of respective CloudStack resources (Templates, Volumes, Snapshots and VM Snapshots)
* Added new response parameter “supportsStorageSnapshot” (true/false) to volume response, and Updated UI to hide the async backup option while taking snapshot for volume(s) with storage snapshot support.
* Fix to remove the duplicate zone wide pools listed while finding storage pools for migration
* Updated PowerFlex/ScaleIO volume migration checks and rollback migration on failure
* Fixed the PowerFlex/ScaleIO volume name inconsistency issue in the volume path after migration, due to rename failure
This PR addresses an error that appears when you try to add a new host. I don't even understand why there was a cast to String in the first place. I will assume some classes send HypervisorType and some send a string (empty or otherwise). Shouldn't this be addressed to use the same type everywhere? With this fix adding a new xenserver host works fine.
Co-authored-by: dahn <daan.hoogland@gmail.com>
This PR adds outputting human readable byte sizes in the management server logs, agent logs, and usage records. A non-dynamic global variable is added (display.human.readable.sizes) to control switching this feature on and off. This setting is sent to the agent on connection and is only read from the database when the management server is started up. The setting is kept in memory by the use of a static field on the NumbersUtil class and is available throughout the codebase.
Instead of seeing things like:
2020-07-23 15:31:58,593 DEBUG [c.c.a.t.Request] (AgentManager-Handler-12:null) (logid:) Seq 8-1863645820801253428: Processing: { Ans: , MgmtId: 52238089807, via: 8, Ver: v1, Flags: 10, [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-224-VM","bytesSent":"106496","bytesReceived":"0","result":"true","details":"","wait":"0",}}] }
The KB MB and GB values will be printed out:
2020-07-23 15:31:58,593 DEBUG [c.c.a.t.Request] (AgentManager-Handler-12:null) (logid:) Seq 8-1863645820801253428: Processing: { Ans: , MgmtId: 52238089807, via: 8, Ver: v1, Flags: 10, [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-224-VM","bytesSent":"(104.00 KB) 106496","bytesReceived":"(0 bytes) 0","result":"true","details":"","wait":"0",}}] }
FS: https://cwiki.apache.org/confluence/display/CLOUDSTACK/Human+Readable+Byte+sizes
Ceph used to use port 6789 (no need to specify it), but with the messenger v2
from Ceph it switched to port 3300 while 6789 still works.
librados/librbd/libvirt will automatically figure out the ports to use if none is
specified.
Therefor there is no need for CloudStack to explicitely define the port in the XML
passed to Libvirt or Qemu.
Leave blank if no port number has been defined by the user.
* create vags per cluster
* vagname in solidfire utils vag object
* fix string compare
* refactor to make use of existing map
* fix typos
* rebuild vag to iqn map after creating cluster vag
* refactor loop using java 8 stream api
* update null entry in vag to iqn map
* remove null vag to iqn mapping when creating cluster id vag
* add initiator to sf vag when adding hosts
* use cluster uuid instead of cluster id and refactor
* update null entry in vagtoiqnmap
* update sfvag list after creating new vag
* pass clusterDao to handleVagForHost
* check if initiator is not already added to the vag
* factor logic into methods
* fix typo and camel case
* fix listing clusters by zone id
Co-authored-by: Sid Kattoju <siddharthakattoju@gmail.com>
Features:
Zone-wide and cluster-wide primary storage support
VM template caching automatically on Datera, the subsequent VMs can be created instantaneously by fast cloning the root volume.
Rapid storage-native snapshot
Multiple managed primary storages can be created with a single Datera cluster to provide better management of
Total provisioned capacity
Default storage QoS values
Replica size ( 1 to 5 )
IP pool assignment for iSCSI target
Volume Placement ( hybrid, single_flash, all_flash )
Volume snapshot to VM template
Volume to VM template
Volume size increase using service policy
Volume QoS change using service policy
Enabled KVM support
New Datera app_instance name format to include ACS volume name
VM live migration
* skip geting used bytes for volumes that are not in Ready state
* updated log message
* filter snapshots by state backedup
* removed * import
* filter templates by state 'DOWNLOADED'
* refactored getUsedBytes to use O(1) queries
* querying for ready volumes instead filtering in memory
* make listByStoreIdInReadyState more generic ex listByStoreIdAndState
* updated snapshot search criteria for listByStoreIdAndState
* updated template search criteria for listByPoolIdAndState
* fixed typo in search criteria for listByTemplateAndState
* fixed typo in search criteria for templates in listByPoolIdAndState
* Migrate template to target host if needed.
Fix KVM VM local storage live migration by migrating its template to the
target host if needed.
* Address reviewer and add method that updates the DB template reference
* Remove deprecated Config.PrimaryStorageDownloadWait
* Code formating of @Inject to follow checkstyle
* Allow KVM VM live migration with ROOT volume on file
* Allow KVM VM live migration with ROOT volume on file
- Add JUnit tests
* Address reviewers and change some variable names to ease future
implementation (developers can easily guess the name and use
autocomplete)