This PR introduces a volume encryption option for service offerings and disk offerings. Fixes #136
There is a hypervisor component and a storage pool component. Hypervisors are responsible for running/using the encrypted volumes; storage pools are responsible for being able to create, copy, resize them, etc. Hypervisors report encryption support in their details, and storage pools are marked for encryption support by pool type.
The initial, experimental release of this feature supports encryption on the Local, NFS, SharedMountPoint, and ScaleIO storage types.
When a volume with an encrypted offering is allocated to a pool, the pool type must support encryption, and this is enforced.
When a VM with an encrypted volume is started, the hypervisor must support encryption. Likewise, attaching a volume to a running VM only works if the hypervisor supports encryption.
This change includes a few other minor changes, for example the ability to force the KVM hypervisor's private IP. This was necessary in my testing of ScaleIO, where the KVM hypervisors had multiple IPs and the ScaleIO storage only functions if the hypervisor, as a ScaleIO client, matches IPs with what CloudStack sees as the hypervisor IP.
For the experimental release of this feature, some volume workflows such as extract volume and migrate volume aren't supported for encrypted volumes. In the future we could support these, as well as migrating from unencrypted to encrypted offerings and vice versa.
It may also be possible to configure encryption specifics in the future, perhaps at the pool level or the offering level. Currently, there is only one workable encryption format for KVM that is supported by Libvirt and Qemu for raw and qcow2 disk files: LUKS version 1. This PR ensures we at least store this encryption format associated with each volume, with the expectation that we may later have LUKS v2 volumes or something else. Thus we will have the information necessary to use each volume with Libvirt if/when other formats are introduced.
I think the most disruptive change here is probably the refactoring of the QemuImg utility to support newer flags like --object. I've tested the change against the basic Qemu 1.5.3 that comes with EL7 and I believe it is good, but it would be nice to see the results of some functional tests. Most of the other changes are limited to changing behavior only if volume encryption is requested.
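For reference, the kind of invocation the refactored QemuImg utility builds when preparing an encrypted qcow2 looks roughly like the following (a minimal sketch; the paths, secret id, and size are illustrative, and the exact flags CloudStack emits may differ):

```
# Create a LUKS-encrypted qcow2, passing the passphrase via a --object secret
# backed by a key file rather than on the command line (paths/ids illustrative)
qemu-img create -f qcow2 \
  --object secret,id=sec0,file=/run/cloudstack/vol-1234.key \
  -o encrypt.format=luks,encrypt.key-secret=sec0 \
  /var/lib/libvirt/images/encrypted-volume.qcow2 10G
```

This style of invocation needs a Qemu new enough to understand --object and the encrypt.* creation options, which is part of why the stock EL7 Qemu mentioned below cannot support encryption.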
I am working on documentation for the CloudStack docs. One thing to note is that hypervisors running the stock EL7 version of Qemu will not support encryption; this is detected and reported properly via the CloudStack API/UI. I intend to have a support matrix in the CloudStack docs.
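A quick way to check what a host reports is via the API, for example with cloudmonkey (a sketch; the 'encryptionsupported' field is the host response parameter added later in this PR, the other fields and filter usage are just illustrative):

```
# List KVM hosts along with whether they report volume encryption support
cmk list hosts type=Routing filter=name,hypervisor,encryptionsupported
```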
I may add a few more unit tests. I'd also like some guidance on having functional tests. I'm not sure if there's a separate framework, or if Marvin is still used, or what the current thing is.
* Add Qemu object flag to QemuImg create
* Add apache license header to new files
* Add Qemu object flag to QemuImg convert
* Set host details if hypervisor supports LUKS
* Add disk encrypt flag to APIs, diskoffering
* Schema upgrade 4.16.0.0 to 4.16.1.0 to support vol encryption
* Add Libvirt secret on disk attach, and refer to it in disk XML
* Add implementation of luks volume encryption to QCOW2 and RAW disk prep
* Start VMs that have encrypted volumes
* Add encrypt option to service offering and root volume provisioning
* Refactor volume passphrase into its own table and object
* CryptSetup, use key files to pass keys instead of command line
* Update storage types and allocators to select encryption support
* Allow agent.properties to define the hypervisor's private IP
* Implement createPhysicalDisk for ScaleIOStorageAdaptor
* UI: Add encrypt options to offerings
* UI module security updates
* Revert "UI module security updates" - belongs in base
This reverts commit a7cb7cf7f57aad38f0b5e5d67389c187b88ffd94.
* Add --target-is-zero support for QemuImg
* Allow qemu image options to be passed, API support convert encrypted
* Switch hypervisor encryption support detection to use KeyFiles
* Fixes for ScaleIO root disk encryption
* Resize root disk if it won't fit encryption header
* Use cryptsetup to prep raw root disks, when supported
* Create qcow2 formatting if necessary during initial template copy to ScaleIO
* Allow setting no cache for qemu-img during disk convert
* Use 1M sparse on qemu-img convert for zero target disks
* UI: Add volume encryption support to hypervisor details
* QemuImg use --image-opts and --object depending on version
* Only send storage commands that require encryption to hosts that support encryption
* Move host encryption detail to a static constant
* Update host selection to account for volume encryption support
Only attach volumes if encryption requirements are met
* Ensure resizeVolume won't allow changing encryption
* Catch edge cases for clearing passphrase when volume is removed
* Disable volume migration and extraction for encrypted volumes
* Register volume secret on destination host during live migration
* Fix configdrive path editing during live migration
* Ensure configdrive path is edited properly during live migration
* Pass along and store volume encryption format during creation
* Fixes for rebase
* Fix tests after rebase
* Add unit tests for DeploymentPlanningManagerImpl to support encryption
* Deployment planner tests for encryption support on last host
* Add deployment tests for encryption when calling planner
* Added Libvirt DiskDef test for encryption details
* Add test for KeyFile utility
* Add CryptSetup tests
* Add QemuImageOptionsTest
* add smoke tests for API level changes on create/list offerings
* Fix schema upgrade, do disk_offering_view first
* Fix UI to show hypervisor encryption support
* Load details into hostVO before trying to query them for encryption
* Remove whitespace in CreateNetworkOfferingTest
* Move QemuImageOptions to use constants for flag keys
* Set physical disk encrypt format during createDiskFromTemplate in KVM Agent
* Whitespace in AbstractStoragePoolAllocator
* Fix whitespace in VolumeDaoImpl
* Support old Qemu in convert
* Log how long it takes to generate a passphrase during volume creation
* Move passphrase generation to async portion of createVolume
* Revert "Allow agent.properties to define the hypervisor's private IP"
This reverts commit 6ea9377505f0e5ff9839156771a241aaa1925e70.
* Updated ScaleIO/PowerFlex storage plugin to support separate (storage) network for Host(KVM) SDC connection. (#144)
* Added smoke tests for volume encryption (in KVM). (#149)
* Updated ScaleIO pool unit tests.
* Some improvements/fixes for code smells (in ScaleIO storage plugin).
* Updated review changes for ScaleIO improvements.
* Updated host response parameter 'encryptionsupported' in the UI.
* Move passphrase generation for the volume to async portion, while deploying VM (#158)
* Move passphrase generation for the volume to async portion, while deploying VM.
* Updated logs, to include volume details.
* Fix schema upgrade, create passphrase table first
* Fixed the DB upgrade issue (as noticed in the logs below.)
DEBUG [c.c.u.d.ScriptRunner] (main:null) (logid:) CALL `cloud`.`IDEMPOTENT_ADD_FOREIGN_KEY`('cloud.volumes', 'passphrase', 'id')
ERROR [c.c.u.d.ScriptRunner] (main:null) (logid:) Error executing: CALL `cloud`.`IDEMPOTENT_ADD_FOREIGN_KEY`('cloud.volumes', 'passphrase', 'id')
ERROR [c.c.u.d.ScriptRunner] (main:null) (logid:) java.sql.SQLException: Failed to open the referenced table 'passphrase'
ERROR [c.c.u.DatabaseUpgradeChecker] (main:null) (logid:) Unable to execute upgrade script
* Fixes for snapshots with encrypted qcow2
Fixes #159 #160 #163
* Support create/delete encrypted snapshots of encrypted qcow2 volumes
* Select endpoints that support encryption when snapshotting encrypted volumes
* Update revert snapshot to be compatible with encrypted snapshots
* Disallow volume and template create from encrypted vols/snapshots
* Disallow VM memory snapshots on encrypted vols. Fixes #157
* Fix for TemplateManagerImpl unit test failure
* Support offline resize of encrypted volumes. Fixes #168
* Fix for resize volume unit tests
* Updated libvirt resize volume unit tests
* Support volume encryption on kvm only, and passphrase generation refactor (#169)
* Fail deploy VM when ROOT/DATA volume's offering has encryption enabled, on non-KVM hypervisors
* Fail attach volume when volume's offering has encryption enabled, on non-KVM hypervisors
* Refactor passphrase generation for volume
* Apply encryption to dest volume for live local storage migration
Fixes #161
* Apply encryption to data volumes during live storage migration
Fixes #161
* Use the same encryption passphrase id for migrating volumes
* Pass secret consumer during storage migration prepare
Fix for #161
* Fixes create/delete volume snapshot issue for stopped VMs
* Block volume snapshot if encrypted and VM is running
Fixes #159
* Block snap schedules on encrypted volumes
Fix for #159
* Support cryptsetup where luks type defaults to 2
Fixes #170
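For context on this commit: newer cryptsetup releases default to LUKS2, while only LUKS1 is usable with Libvirt/Qemu here, so the type has to be requested explicitly. A minimal sketch (device and key file paths are illustrative):

```
# Force LUKS1 where cryptsetup defaults to LUKS2, reading the passphrase
# from a key file instead of the command line (paths illustrative)
cryptsetup luksFormat --type luks1 --key-file /run/cloudstack/vol-1234.key /dev/sdX
```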
* Modify domain XML secret UUID when storage migrating VM
Fix for #172
* Remove any libvirt secrets on VM stop and post migration
Fix for #172
* Update disk profile with encryption requirement from the disk offering (#176)
Update disk profile with encryption requirement from the disk offering
and some code improvements
* Updated review changes / javadoc in ScaleIOUtil
Co-authored-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
* Extract the IO_URING configuration into the agent.properties (#6253)
When using advanced virtualization the IO driver is not supported. The admin can decide whether to enable/disable this configuration from the agent.properties file. The default value is true.
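A sketch of what the resulting agent.properties entry could look like (the exact property key is an assumption here, not confirmed by this changelog):

```
# /etc/cloudstack/agent/agent.properties (snippet; key name assumed for illustration)
# Whether the agent may configure the io_uring IO driver for guest disks; defaults to true
enable.io.uring=true
```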
* kvm: truncate vnc password to 8 chars (#6244)
This PR truncates the VNC password of KVM VMs to 8 chars to support the latest versions of libvirt.
* merge fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* [KVM] Enable IOURING only when it is available on the host (#6399)
* [KVM] Disable IOURING by default on agents
* Refactor
* Remove agent property for iouring
* Restore property
* Refactor suse check and enable on ubuntu by default
* Refactor irrespective of guest OS
* Improvement
* Logs and new path
* Refactor condition to enable iouring
* Improve condition
* Refactor property check
* Improvement
* Doc comment
* Extend comment
* Move method
* Add log
* [KVM] Fix VM migration error due to VNC password on libvirt limiting versions (#6404)
* [KVM] Fix VM migration error due to VNC password on libvirt limiting versions
* Fix passwd value
* Simplify implementation
Co-authored-by: slavkap <51903378+slavkap@users.noreply.github.com>
Co-authored-by: Wei Zhou <weizhou@apache.org>
Co-authored-by: Nicolas Vazquez <nicovazquez90@gmail.com>
* agent: enable ssl only for kvm agent (not in system vms)
* Revert "agent: enable ssl only for kvm agent (not in system vms)"
This reverts commit b2d76bad2e.
* Revert "KVM: Enable SSL if keystore exists (#6200)"
This reverts commit 4525f8c8e7.
* KVM: Enable SSL if keystore exists in LibvirtComputingResource.java
Co-authored-by: Wei Zhou <weizhou@apache.org>
* Use base clock when detecting host CPU speed from file, to match lscpu
Allow for manually setting the CPU speed via agent.properties if all else fails
Signed-off-by: Marcus Sorensen <mls@apple.com>
* Update agent/conf/agent.properties
Co-authored-by: dahn <daan.hoogland@gmail.com>
Co-authored-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Nicolas Vazquez <nicovazquez90@gmail.com>
Co-authored-by: dahn <daan.hoogland@gmail.com>
* kvm: Use lscpu to get cpu max speed
* Fix str conversion
* Reorder
* Refactor
* Apply suggestions from code review
Co-authored-by: Daniel Augusto Veronezi Salvador <38945620+GutoVeronezi@users.noreply.github.com>
* Updated the calling method name getCpuSpeedFromCommandLscpu
* Make it more readable
Co-authored-by: Daniel Augusto Veronezi Salvador <38945620+GutoVeronezi@users.noreply.github.com>
Co-authored-by: sureshanaparti <12028987+sureshanaparti@users.noreply.github.com>
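The lscpu-based detection in the commits above amounts to parsing the max clock field that lscpu prints; the agent does this in Java, but the underlying data looks like this (a sketch, field name as printed by lscpu on hosts that expose cpufreq data):

```
# lscpu exposes a "CPU max MHz" field on hosts where cpufreq data is available
lscpu | awk -F: '/CPU max MHz/ {gsub(/ /, "", $2); print $2}'
```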
* kvm: don't force scsi controller for aarch64 VMs
This allows use of the virtio disk controller with Ceph, etc., or as defined in the VM's root disk controller setting, rather than always enforcing SCSI.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* remove test that doesn't apply now
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* address review comment
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
With diskful set to true, Linstor will fail if it cannot create local storage for the resource, which in turn would make it impossible to have a setup with just compute nodes on CloudStack.
* KVM: Add VM Settings for virtual GPU hardware type and memory
* fix method createVideoDef argument in test package
* add available options for KVM virtual GPU hardware VM setting
* fix videoRam default value
* fix: if _videoRam is 0, use the default provided by libvirt
This adds a volume (primary) storage plugin for the Linstor SDS. Currently it can create/delete/migrate volumes; snapshots should be possible, but currently don't work for RAW volume types in CloudStack.
* plugin-storage-volume-linstor: notify libvirt guests about the resize
* Fix creating volumes from snapshots without backup
When a few snapshots are created only on primary storage and you try to create a volume or a template from a snapshot, only the first operation is successful. This is because the snapshot is backed up to secondary storage with a wrong SQL query. The problem appears on Ceph/NFS but may affect other storage plugins.
Bypassing secondary storage is implemented only for Ceph primary storage, and it didn't cover the functionality to create a volume from a snapshot which is kept only on Ceph.
* Address review
* Create utility to centralize byte conversions
* Add/change toString definitions
* Create Libvirt handler to ScaleVmCommand
* Enable dynamic scaling of VMs with KVM
* Move config from interface to class and rename it
As every variable declared in an interface is already final, this move is needed to mock tests in the next commits
* Configure VM max memory and cpu cores
The values are according to service offering or global configs
* Extract dpdk configuration to a method and test it
* Extract OS desc config to a method and test it
* Extract guest resource def to a method and test it
Improve libvirt def
* Refactor LibvirtVMDef.GuestResourceDef
* Refactor ScaleVmCommand
* Improve VMInstanceVO toString()
* Refactor upgradeRunningVirtualMachine method
* Turn int variables into long on utility
* Verify if VM is scalable on KVMGuru
* Rename some KVMGuruTest's methods
* Change vm's xml to work with max memory
* Verify if service offering is dynamic before scale
* Create methods to retrieve data from domain
* Create def to hotplug memory
* Adjust the way command was scaling the VM
* Fix database persistence before executing command
* Send more info to host to improve log
* Fix var name
* Fix missing "}"
* Undo unnecessary changes
* Address review
* Fix scale validation
* Add VM prepared for dynamic scaling validation
* Refactor LibvirtScaleVmCommandWrapper and improve unit tests
* Remove duplicated method
* Add RuntimeException check
* Remove copyright from header
* Remove copyright from header
* Remove copyright from header
* Remove copyright from header
* Remove copyright from header
* Update ByteScaleUtilsTest.java
Co-authored-by: Daniel Augusto Veronezi Salvador <daniel@scclouds.com.br>
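At the libvirt level, the dynamic scaling work above boils down to raising a running guest's memory and vCPU count up to maximums declared when the VM was started; a rough illustration with virsh (CloudStack drives this through the libvirt Java bindings, so the VM name and values here are illustrative only):

```
# Grow a running guest's memory and vCPUs, bounded by the maximums in its domain XML
virsh setmem my-vm 4194304 --live     # new memory size in KiB
virsh setvcpus my-vm 4 --live         # new vCPU count
```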
* Add SharedMountPoint to KVM's supported storage pool types
* Fix live migration to iSCSI and improve logs
Co-authored-by: Daniel Augusto Veronezi Salvador <daniel@scclouds.com.br>
* Externalize KVM Agent storage's timeout configuration
* Address @nvazquez review
* Add empty line at the end of the agent.properties file
Co-authored-by: Daniel Augusto Veronezi Salvador <daniel@scclouds.com.br>
* Refactor method createVMFromSpec
* Add unit tests
* Fix test
* Extract if block to method for add extra configs to VM Domain XML
* Split travis tests trying to isolate which test is causing an error
* Override toString() method
* Update documentation
* Fix checkstyle error (line with trailing spaces)
* Change VirtualMachineTO print of object
* Add try except to find message error. Remove after test
* Fix indent
* Trying to understand what is happening in this code
* Refactor method createVMFromSpec
* Add unit tests
* Fix test
* Extract if block to method for add extra configs to VM Domain XML
* Split travis tests trying to isolate which test is causing an error
* Override toString() method
* Update documentation
* Fix checkstyle error (line with trailing spaces)
* Remove unnecessary comment
* Revert travis tests
Co-authored-by: SadiJr <17a0db2854@firemailbox.club>
* Externalize KVM Agent storage's timeout configuration
Created a class of constants for the agent properties available to configure in "agent.properties".
Created a class that provides a facility to read the agent's properties file and get its properties.
* Refactored KVMHAMonitor nested thread and changed some logs
* Added the timeout config to the agent.properties file
* Rename classes
* Rename var and remove comment
* Fix typo with word "heartbeat"
* Extract multiple methods call to variables
* Add unit tests to file handler
* Increase info about the property
* Create inner class Property
* Rename method getProperty to getPropertyValue
* Remove copyright
* Remove copyright
* Extract code to createHeartBeatCommand
* Change method access from protected to private
Co-authored-by: Daniel Augusto Veronezi Salvador <daniel@scclouds.com.br>
On newer libvirt/qemu it seems PCI hot-plugging could be an issue as
seen in:
https://www.suse.com/support/kb/doc/?id=000019383
https://bugs.launchpad.net/nova/+bug/1836065
This was found to be true on ARM64/aarch64 platform (tested on
RaspberryPi4). As per the default machine doc, it is advised to
pre-allocate PCI controllers on the machine and a pcie-to-pci-bridge based
controller for legacy PCI models:
https://libvirt.org/pci-hotplug.html#x86_64-q35
This patch introduces the concept as a workaround until a proper fix is
done (ideally in the upstream libvirt/qemu projects). Until then client
code can add 32 PCI controllers and a pcie-to-pci-bridge controller for
aarch64 platforms.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Currently there is no disk IO driver configuration for VMs running on KVM. That's OK for most cases; however, some quite interesting optimizations have recently been added with the io_uring IO driver.
Note that IO URING requires:
Qemu >= 5.0, and
Libvirt >= 6.3.0.
By using io_uring we can see a massive I/O performance improvement within Virtual Machines running from Local and/or NFS storage.
This implementation enhances the KVM disk configuration by adding a workflow for setting the disk IO driver. Additionally, if the Qemu and Libvirt versions match those required for io_uring, we set it on the VM. If there is no support for this driver, we keep the current behavior, without any IO driver configured.
Fixes: #4883
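When the Qemu and Libvirt version checks pass, the guest disk definition carries the io_uring driver attribute; a quick way to confirm this on a host (VM name illustrative):

```
# Expect a line like: <driver name='qemu' type='qcow2' io='io_uring'/>
virsh dumpxml my-vm | grep "io='io_uring'"
```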
* server: fix failed to apply userdata when enable static nat
* server: fix cannot expunge vm as applyUserdata fails
* configdrive: fix ISO not recognized when plugging a new nic
* configdrive: detach and attach configdrive ISO as it is changed when plugging a new nic or migrating a vm
* configdrive test: (1) password file does not exist in recreated ISO; (2) vm hostname should be changed after migration
* configdrive: use centos55 template with sshkey and configdrive support
* configdrive: disklabel is 'config-2' for configdrive ISO
* configdrive: use copy for configdrive ISO and move for other template/volume/iso
* configdrive: use public-keys.txt
* configdrive test: fix (1) update_template ; (2) ssh into vm by keypair
This PR intends to improve logging on agent start to facilitate troubleshooting.
Co-authored-by: Daniel Augusto Veronezi Salvador <daniel@scclouds.com.br>
Co-authored-by: sureshanaparti <12028987+sureshanaparti@users.noreply.github.com>
* vxlan: arp does not work between hosts as multicast group is communicated over physical nic instead of linux bridge
When a linux bridge is set up (refer to http://docs.cloudstack.apache.org/projects/archived-cloudstack-getting-started/en/latest/networking/vxlan.html#configure-product-to-use-vxlan-plugin) and used as the kvm traffic label of physical networks, the vms on different hosts cannot reach each other.
(1) does not work:
```
/usr/share/cloudstack-common/scripts/vm/network/vnet/modifyvxlan.sh -v 1001 -p eth1 -b brvx-1001 -o add
```
"bridge fdb" shows
```
00:00:00:00:00:00 dev vxlan1001 dst 239.0.3.233 via eth1 self permanent
```
(2) this works:
```
/usr/share/cloudstack-common/scripts/vm/network/vnet/modifyvxlan.sh -v 1001 -p cloudbr1 -b brvx-1001 -o add
```
"bridge fdb" shows
```
00:00:00:00:00:00 dev vxlan1001 dst 239.0.3.233 via cloudbr1 self permanent
```
* vxlan: fix issue if kvm network label is not set
* Fix some UEFI-related issues
1 - fix attach/detach ISO for VMs with UEFI boot type
2 - if the OS type of an ISO is categorized as "Other", the bus type of the disk
will be set to "sata"
* Simplify the validation of OS types
Support for a datastore cluster as primary storage is already there, but any changes to the datastore cluster at vCenter, like addition/removal of a datastore, are not synchronised with CloudStack directly. It requires removing the primary storage from CloudStack and adding it again.
Here, synchronisation of the datastore cluster is fixed without the need to remove or re-add the datastore cluster.
1. A new API, syncStoragePool, is introduced which takes the datastore cluster storage pool UUID as the parameter. This API checks if there are any changes in the datastore cluster and updates the management server accordingly.
2. During synchronisation, if a new child datastore is found in the datastore cluster, the management server will create a new child storage pool in the database under the datastore cluster. If the new child datastore is already added as an individual storage pool, the existing storage pool entry will be converted to a child storage pool (instead of creating a new storage pool entry).
3. During synchronisation, if an existing child datastore in CloudStack is found to be removed on vCenter, the management server removes that child datastore from the datastore cluster and makes it an individual storage pool.
The above behaviour is on par with the vCenter behaviour when adding and removing child datastores.
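A sketch of calling the new API from cloudmonkey (assuming cloudmonkey exposes syncStoragePool as "sync storagepool"; the parameter name and pool UUID are placeholders for illustration):

```
# Trigger datastore cluster synchronisation for a primary storage pool
cmk sync storagepool id=<datastore-cluster-pool-uuid>
```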