cloudstack/test/integration/smoke
Marcus Sorensen 697e12f8f7
kvm: volume encryption feature (#6522)
This PR introduces a feature designed to allow CloudStack to manage a generic volume encryption setting. The encryption is handled transparently to the guest OS, and is intended to handle VM guest data encryption at rest and possibly over the wire, though the actual encryption implementation is up to the primary storage driver.

In some cases, cloud customers who don't trust the cloud provider may still prefer to maintain their own guest-level volume encryption. For private clouds, however, this greatly simplifies running volume encryption for guests: users don't have to manage keys, deal with key servers, or make guest booting dependent on network connectivity to those servers (e.g. Tang). This matters especially where users occasionally attach/detach data disks and move them between VMs.

The feature can be thought of as having two parts - the API/control plane (which includes scheduling aspects), and the storage driver implementation.

This initial PR adds the encryption setting to disk offerings and service offerings (for root volume), and implements encryption support for KVM SharedMountPoint, NFS, Local, and ScaleIO storage pools.

NOTE: While not required, operations can be significantly sped up by ensuring that the `rng-tools` package is installed and its service is running on the management server and hypervisors. On EL hosts the service is `rngd`; on Debian it is `rng-tools`. In particular, the use of SecureRandom for generating volume passphrases can be slow if there isn't a good source of entropy. This can affect testing and build environments; otherwise it only affects users who actually use the encryption feature. If you find tests or volume creates blocking on encryption, check this first.
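As a rough aid for the check suggested in the note above, the sketch below reads the Linux kernel's entropy counter. The file `/proc/sys/kernel/random/entropy_avail` is a standard Linux interface; the helper names and the threshold of 256 bits are illustrative assumptions, not values taken from CloudStack, and recent kernels may report a constant value here.

```python
from pathlib import Path

# Standard Linux procfs location of the kernel entropy estimate (in bits).
ENTROPY_PATH = Path("/proc/sys/kernel/random/entropy_avail")


def parse_entropy(text):
    """Parse the content of entropy_avail into an integer bit count."""
    return int(text.strip())


def entropy_is_low(available_bits, threshold=256):
    """Return True if the estimate is below the (assumed) threshold."""
    return available_bits < threshold


def report():
    """Return a short status string, or None if the file is not present."""
    if not ENTROPY_PATH.exists():
        return None
    bits = parse_entropy(ENTROPY_PATH.read_text())
    state = "low, consider starting rngd/rng-tools" if entropy_is_low(bits) else "ok"
    return f"entropy_avail={bits} ({state})"
```

Calling `report()` on a management server or hypervisor gives a one-line summary; on non-Linux systems it simply returns `None`.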

### Management Server

##### API

* createDiskOffering now has an 'encrypt' Boolean
* createServiceOffering now has an 'encryptroot' Boolean. The 'root' suffix is added here in case there is ever any other need to encrypt something related to the guest configuration, like the RAM of a VM.  This has been refactored to deal with the new separation of service offering from disk offering internally.
* listDiskOfferings shows encryption support on each offering, and has an encrypt boolean to choose to list only offerings that do or do not support encryption
* listServiceOfferings shows encryption support on each offering, and has an encrypt boolean to choose to list only offerings that do or do not support encryption
* listHosts now shows encryption support of each hypervisor host via `encryptionsupported`
* Volumes themselves don't show encryption on/off; rather, the offering should be referenced. This follows the same pattern as other disk offering based settings such as the IOPS of the volume.
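
To make the new parameters concrete, here is a minimal sketch that builds unsigned query strings for the API calls listed above. The `encrypt` and `encryptroot` parameters come from this PR; the helper name `build_query`, the offering names, and the sizes are hypothetical, and request signing and the HTTP call itself are deliberately left out.

```python
from urllib.parse import urlencode


def build_query(command, **params):
    """Return an unsigned URL query string for a CloudStack API command."""
    fields = {"command": command, "response": "json"}
    for key, value in params.items():
        # Render Python booleans as lowercase 'true'/'false' strings.
        if isinstance(value, bool):
            value = "true" if value else "false"
        fields[key] = value
    return urlencode(fields)


# Disk offering whose data volumes should be encrypted.
disk_offering = build_query(
    "createDiskOffering",
    name="encrypted-10g",
    displaytext="Encrypted 10 GB",
    disksize=10,
    encrypt=True,
)

# Service offering whose root volume should be encrypted.
service_offering = build_query(
    "createServiceOffering",
    name="small-encrypted",
    displaytext="Small, encrypted root",
    cpunumber=1,
    cpuspeed=1000,
    memory=1024,
    encryptroot=True,
)

# List only the disk offerings that support encryption.
encrypted_only = build_query("listDiskOfferings", encrypt=True)
```

The resulting strings would still need an API key and signature appended before being sent to a management server.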

##### Volume functions

A decent effort has been made to ensure that the most common volume functions have either been cleanly supported or blocked. However, for the first release it is advised to mark this feature as *experimental*, as the code base is complex and there are certainly edge cases to be found.

Many of the blocked functions listed below, such as creating templates from encrypted volumes, could eventually be supported, but the effort and size of this change are already substantial.

Supported functions:
* Data Volume create
* VM root volume create
* VM root volume reinstall
* Offline volume snapshot/restore
* Migration of VM with storage (e.g. local storage VM migration)
* Resize volume
* Detach/attach volume

Blocked functions:
* Online volume snapshot
* VM snapshot w/memory
* Scheduled snapshots (would fail when VM is running)
* Disk offering migration to offerings that don't have matching encryption
* Creating template from encrypted volume
* Creating volume from encrypted volume
* Volume extraction (would we decrypt it first, or expose the key? Probably the former).

##### Primary Storage Support

For storage developers, adding encryption support involves:

1. Updating the `StoragePoolType` for your primary storage to advertise encryption support. This is used during storage allocation to match volumes that require encryption to storage that supports it.

2. Implementing the encryption feature where your `PrimaryDataStoreDriver` is called to perform volume lifecycle functions on volumes that request encryption. You are free to do what your storage supports. This could be as simple as calling a storage API with the right flag when creating a volume or, as is the case with the KVM storage types, as complex as managing volume details directly at the hypervisor host. The data objects passed to the storage driver will contain volume passphrases if encryption is requested.

##### Scheduling

For the KVM implementations specified above, we are dependent on the KVM hosts having support for volume encryption tools. As such, the host's `StartupRoutingCommand` has been modified to advertise whether the host supports encryption. This is done via a probe during agent startup that looks for a functioning `cryptsetup` and for encryption support in `qemu-img`. The result is also visible via the listHosts API and the host details in the UI. This was patterned after other features that require hypervisor support, such as UEFI.
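
The idea of the startup probe can be sketched as follows. This is a simplified illustration, not the agent's actual logic: it only checks that the two tools are on `PATH`, whereas the probe described above also verifies that they function and support encryption. The names `REQUIRED_TOOLS` and `host_supports_encryption` are hypothetical.

```python
import shutil

# Tools named in the text; the constant name is an assumption.
REQUIRED_TOOLS = ("cryptsetup", "qemu-img")


def host_supports_encryption(which=shutil.which):
    """Return True only if every required tool is found on PATH.

    `which` is injectable so the check can be exercised without the
    tools actually being installed.
    """
    return all(which(tool) is not None for tool in REQUIRED_TOOLS)
```

The boolean result plays the role of the flag a host would advertise at startup and expose as `encryptionsupported` in listHosts.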

The `EndPointSelector` interface and `DefaultEndPointSelector` have had new methods added, which allow the caller to ask for endpoints that support encryption. Storage drivers can use these to find the proper hosts to send storage commands that involve encryption. Not all volume activities require a host that supports encryption (for example, a snapshot backup is a simple file copy). This is why the interface lets the storage driver decide, rather than just passing the data objects to the `EndPointSelector` and letting the implementation decide.

VM scheduling has also been modified. When a VM start is requested and any attached volume requires encryption, hosts that don't support encryption are filtered out.
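
The filtering rule can be illustrated with a small sketch. The dictionary shapes are hypothetical stand-ins for the real host and volume objects; only the `encryptionsupported` field name mirrors the listHosts response mentioned earlier, and the `encrypt` key on a volume is an assumption made for the example.

```python
def candidate_hosts(hosts, volumes):
    """Return the hosts that are eligible to start a VM with these volumes.

    If no attached volume requires encryption, every host stays eligible;
    otherwise only hosts advertising encryption support remain.
    """
    needs_encryption = any(volume.get("encrypt") for volume in volumes)
    if not needs_encryption:
        return list(hosts)
    return [host for host in hosts if host.get("encryptionsupported")]


hosts = [
    {"name": "kvm-1", "encryptionsupported": True},
    {"name": "kvm-2", "encryptionsupported": False},
]
plain_vm = [{"name": "ROOT-1", "encrypt": False}]
encrypted_vm = [{"name": "ROOT-2", "encrypt": False}, {"name": "DATA-2", "encrypt": True}]
```

With this data, `candidate_hosts(hosts, plain_vm)` keeps both hosts, while `candidate_hosts(hosts, encrypted_vm)` keeps only `kvm-1`.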

##### DB Changes

A volume whose disk offering enables encryption will get a passphrase generated for it before its first use. This is stored in the new 'passphrase' table, and is encrypted using the CloudStack installation's standard configured DB encryption. A field has been added to the volumes table, referencing this passphrase, and a foreign key added to ensure passphrases that are referenced can't be removed from the database.  The volumes table now also contains an encryption format field, which is set by the implementer of the encryption and used as it sees fit.
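
The relationship between the two tables can be sketched with an in-memory SQLite database. This is only an illustration of the foreign-key behavior described above: CloudStack does not use SQLite, and the column names (`passphrase_id`, `encrypt_format`) and the placeholder values are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE passphrase (
        id INTEGER PRIMARY KEY,
        passphrase TEXT NOT NULL  -- stored encrypted in a real deployment
    );
    CREATE TABLE volumes (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        passphrase_id INTEGER REFERENCES passphrase(id),
        encrypt_format TEXT
    );
    """
)
conn.execute("INSERT INTO passphrase (id, passphrase) VALUES (1, 'ciphertext-placeholder')")
conn.execute(
    "INSERT INTO volumes (id, name, passphrase_id, encrypt_format) "
    "VALUES (1, 'DATA-1', 1, 'luks')"
)


def try_delete_passphrase(passphrase_id):
    """Attempt to delete a passphrase row; return False if it is still referenced."""
    try:
        conn.execute("DELETE FROM passphrase WHERE id = ?", (passphrase_id,))
        return True
    except sqlite3.IntegrityError:
        return False
```

While the volume row references the passphrase, `try_delete_passphrase(1)` is rejected by the foreign key; once the volume row is gone, the delete succeeds.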

### KVM Agent

For the supported KVM storage pool types, encryption has been implemented in QEMU itself, using its built-in LUKS storage support. This means that the storage remains encrypted all the way to the VM process and is decrypted before the block device is visible to the guest. This may not be necessary in order to implement encryption for *your* storage pool type; for example, you may have a kernel driver that decrypts before the block device appears on the system. However, it seemed like the simplest common place to terminate the encryption, and it provides the lowest surface area for decrypted guest data.

For qcow2-based storage, `qemu-img` is used to set up a qcow2 file with LUKS encryption. For block-based storage (currently just ScaleIO), the `cryptsetup` utility is used to format the block device as LUKS for data disks, while `qemu-img` and its LUKS support are used for template copy.
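
For orientation, the sketch below assembles command lines of the kind these two tools accept. The flags shown (`--object secret,...`, `-o encrypt.format=luks,...` for `qemu-img create`; `--type`, `--key-file`, `--batch-mode` for `cryptsetup luksFormat`) are standard options of those tools, but the exact invocations the agent uses are not stated here, so the helper names, paths, and argument choices are assumptions. Nothing is executed; the functions only build argument lists.

```python
def qcow2_luks_create_cmd(path, size, keyfile, secret_id="sec0"):
    """Build a qemu-img command that creates a LUKS-encrypted qcow2 file."""
    return [
        "qemu-img", "create", "-f", "qcow2",
        # Register the passphrase file as a named secret object.
        "--object", f"secret,id={secret_id},file={keyfile}",
        # Ask qcow2 to encrypt its payload with LUKS using that secret.
        "-o", f"encrypt.format=luks,encrypt.key-secret={secret_id}",
        path, size,
    ]


def block_luks_format_cmd(device, keyfile):
    """Build a cryptsetup command that formats a block device as LUKS1."""
    return [
        "cryptsetup", "luksFormat",
        "--batch-mode",          # do not prompt for confirmation
        "--type", "luks1",
        "--key-file", keyfile,
        device,
    ]
```

For example, `qcow2_luks_create_cmd("/mnt/pool/vol.qcow2", "10G", "/run/keys/vol.key")` yields an argument list that could be handed to `subprocess.run`.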

Any volume that requires encryption will contain its passphrase as a byte array when handed down to the KVM agent. Care has been taken to ensure this doesn't get logged, and it is cleared after use in an attempt to avoid exposing it before garbage collection occurs. On the agent side, this passphrase is used in two ways:

1. In cases where the volume experiences some libvirt interaction it is loaded into libvirt as an ephemeral, private secret and then referenced by secret UUID in any libvirt XML. This applies to things like VM startup, migration preparation, etc.

2. In cases where `qemu-img` needs to use this passphrase for volume operations, it is written to a `KeyFile` on the CloudStack agent's configured tmpfs and passed along. The `KeyFile` is a `Closeable`, and when it is closed, it is deleted. This allows us to wrap any volume operation in try-with-resources and have the `KeyFile` removed whether or not the operation succeeds.
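
The `KeyFile` pattern in item 2 can be mirrored in Python with a context manager, which plays the role of Java's `Closeable` plus try-with-resources. This is an analogy under stated assumptions, not the agent's code: the class below is hypothetical, and it writes to the default temporary directory unless a directory (for example a tmpfs mount) is given.

```python
import os
import tempfile


class KeyFile:
    """Write a passphrase to a private temp file and delete it on close."""

    def __init__(self, passphrase, directory=None):
        # mkstemp creates the file readable and writable only by the owner.
        fd, self.path = tempfile.mkstemp(prefix="keyfile-", dir=directory)
        with os.fdopen(fd, "wb") as handle:
            handle.write(passphrase)

    def close(self):
        """Remove the key file if it still exists; safe to call twice."""
        if os.path.exists(self.path):
            os.remove(self.path)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # never swallow exceptions from the wrapped operation


def wipe(buffer):
    """Overwrite a mutable passphrase buffer with zeros after use."""
    for index in range(len(buffer)):
        buffer[index] = 0
```

A caller would wrap the volume operation in `with KeyFile(passphrase) as key_file:` and pass `key_file.path` to the tool, then call `wipe(passphrase)`; the file is removed even if the operation raises.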

In order to support the advanced syntax required to handle encryption and passphrases with `qemu-img`, the `QemuImg` utility has been modified to support the new `--object` and `--image-opts` flags. These are modeled as `QemuObject` and `QemuImageOptions`.  These `qemu-img` flags have been designed to supersede some of the existing, older flags being used today (such as choosing file formats and paths), and an effort could be made to switch over to these wholesale. However, for now we have instead opted to keep existing functions and do some wrapping to ensure backward compatibility, so callers of `QemuImg` can choose to use either way.
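
A rough picture of how the two flags fit together is sketched below. The `--object secret,...` and `--image-opts driver=...,file.filename=...,encrypt.key-secret=...` forms are standard `qemu-img` syntax; the Python helpers are hypothetical stand-ins for the idea behind `QemuObject` and `QemuImageOptions`, not a port of them, and the sketch does not escape commas inside values.

```python
def secret_object(secret_id, keyfile):
    """Render a secret object definition for the --object flag."""
    return f"secret,id={secret_id},file={keyfile}"


def image_opts(options):
    """Render a dict as the comma-separated string --image-opts expects."""
    return ",".join(f"{key}={value}" for key, value in options.items())


def info_cmd(path, keyfile, secret_id="sec0"):
    """Build a qemu-img info command for a LUKS-encrypted qcow2 image."""
    return [
        "qemu-img", "info",
        "--object", secret_object(secret_id, keyfile),
        "--image-opts",
        image_opts({
            "driver": "qcow2",
            "file.filename": path,
            "encrypt.key-secret": secret_id,
        }),
    ]
```

With `--image-opts`, the option string takes the place of the plain filename argument, which is why the format and path move out of the older `-f` and positional forms.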

It should be noted that there are also a few different Enums that represent the encryption format for various purposes. While these are analogous in principle, they represent different things and should not be confused. For example, the supported encryption format for the `cryptsetup` utility is represented by `LuksType.LUKS`, while `QemuImg` has a `QemuImg.PhysicalDiskFormat.LUKS`.

Some additional effort could potentially be made to support advanced encryption configurations, such as choosing between LUKS1 and LUKS2 or changing cipher details. These may require changes all the way up through the control plane. However, in practice Libvirt and Qemu currently only support LUKS1. Additionally, the cipher details aren't required in order to use an encrypted volume: they're stored in the LUKS header on the volume, so there is no need to store them elsewhere. As such, we need only set the one encryption format upon volume creation, which is persisted in the volumes table and then available later as needed. In the future, when LUKS2 is standard and fully supported, we could move to it as the default; old volumes will still reference LUKS1 and have the headers on-disk to ensure they remain usable. We could also possibly support an automatic upgrade of the headers down the road, or a volume migration mechanism.

Every version of cryptsetup and qemu-img that supports encryption and was tested on variants of EL7 and Ubuntu uses the XTS-AES 256 cipher, a widely used industry-standard cipher (also used by, e.g., BitLocker and FileVault).

Signed-off-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Marcus Sorensen <mls@apple.com>
2022-09-27 10:20:59 +05:30
__init__.py Add Python flake8 linting for W291 trailing whitespace with Super-Linter (#4687) 2022-03-28 11:40:26 -03:00
test_accounts.py Fix spelling (#6161) 2022-03-29 00:21:07 -03:00
test_affinity_groups.py Add Python flake8 linting for W291 trailing whitespace with Super-Linter (#4687) 2022-03-28 11:40:26 -03:00
test_affinity_groups_projects.py Add Python flake8 linting for W291 trailing whitespace with Super-Linter (#4687) 2022-03-28 11:40:26 -03:00
test_annotations.py Enable flake8 rule W292 No newline at end of file (#6274) 2022-06-30 12:08:27 +05:30
test_async_job.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_attach_multiple_volumes.py test,xcp-ng: fix tests for VM PV driver issue (#6549) 2022-08-09 12:44:27 +05:30
test_backup_recovery_dummy.py Add Python flake8 linting for W291 trailing whitespace with Super-Linter (#4687) 2022-03-28 11:40:26 -03:00
test_certauthority_root.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_console_endpoint.py console: Console access enhancements (#6577) 2022-09-14 12:39:59 +05:30
test_create_list_domain_account_project.py Add Python flake8 linting for W291 trailing whitespace with Super-Linter (#4687) 2022-03-28 11:40:26 -03:00
test_create_network.py Fix spelling (#6597) 2022-08-03 15:43:47 +05:30
test_deploy_vgpu_enabled_vm.py Add Python flake8 linting for W291 trailing whitespace with Super-Linter (#4687) 2022-03-28 11:40:26 -03:00
test_deploy_virtio_scsi_vm.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_deploy_vm_extra_config_data.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_deploy_vm_iso.py CLOUDSTACK-10013: Fixes based on code review and test failures 2017-12-23 17:51:42 +05:30
test_deploy_vm_iso_uefi.py Adding support for RHEL8 binary-compatible variants (#5158) 2021-08-18 10:03:03 +02:00
test_deploy_vm_root_resize.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_deploy_vm_with_userdata.py Add Python flake8 linting for W291 trailing whitespace with Super-Linter (#4687) 2022-03-28 11:40:26 -03:00
test_deploy_vms_in_parallel.py Resource reservation framework (#6694) 2022-09-16 15:44:35 +05:30
test_deploy_vms_with_varied_deploymentplanners.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_diagnostics.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_direct_download.py Direct download certificates additions and improvements (#6104) 2022-04-11 22:57:23 -03:00
test_disk_offerings.py kvm: volume encryption feature (#6522) 2022-09-27 10:20:59 +05:30
test_disk_provisioning_types.py Added disk provisioning type support for VMWare (#4640) 2021-07-16 22:37:42 -03:00
test_domain_disk_offerings.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_domain_network_offerings.py tests: Fix test failures for Local storage and Basic zones (#5106) 2021-07-01 09:45:21 +05:30
test_domain_service_offerings.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_domain_vpc_offerings.py tests: Fix test failures for Local storage and Basic zones (#5106) 2021-07-01 09:45:21 +05:30
test_dynamicroles.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_enable_account_settings_for_domain.py Enable account settings to be visible under domain settings (#4215) 2021-09-29 10:29:20 +02:00
test_enable_role_based_users_in_projects.py projects: Role based users in Projects (#4128) 2020-08-13 15:45:39 +05:30
test_events_resource.py refactor: new line, lint error fix (#6529) 2022-07-05 20:27:40 +05:30
test_gateway_on_shared_networks.py Gateways after Nic update on Shared Network tests (#6355) 2022-05-05 19:53:31 -03:00
test_global_settings.py Incorrect param name caused global setting test to fail (#3821) 2020-01-24 14:31:27 +01:00
test_guest_vlan_range.py Add Python flake8 linting for W291 trailing whitespace with Super-Linter (#4687) 2022-03-28 11:40:26 -03:00
test_host_maintenance.py packaging: Adding Centos8, Ubuntu 20.04, XCPNG8.1 Support (#4068) 2020-08-17 16:28:30 +05:30
test_hostha_kvm.py Adding support for RHEL8 binary-compatible variants (#5158) 2021-08-18 10:03:03 +02:00
test_hostha_simulator.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_human_readable_logs.py Changed test failure to warning (#4264) 2020-08-25 15:29:59 +05:30
test_internal_lb.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_ipv6_infra.py test: add, refactor ipv6 network, vpc tests (#6338) 2022-07-12 12:54:53 +05:30
test_iso.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_kubernetes_clusters.py cks: upgrade k8s to 1.23.3/1.24.0 in smoke test (#6388) 2022-05-17 11:19:37 -03:00
test_kubernetes_supported_versions.py cks: upgrade k8s to 1.23.3/1.24.0 in smoke test (#6388) 2022-05-17 11:19:37 -03:00
test_list_ids_parameter.py xen: Fix volume snapshot deletion when it has child snapshots (#6296) 2022-04-22 14:36:08 -03:00
test_loadbalance.py Add Python flake8 linting for W291 trailing whitespace with Super-Linter (#4687) 2022-03-28 11:40:26 -03:00
test_login.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_metrics_api.py Mshost stats (#5588) 2022-04-22 08:48:19 -03:00
test_migration.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_multipleips_per_nic.py CLOUDSTACK-10193: Fix smoke tests failures with new systemvmtemplate 2017-12-23 09:22:44 +05:30
test_nested_virtualization.py Add Python flake8 linting for W291 trailing whitespace with Super-Linter (#4687) 2022-03-28 11:40:26 -03:00
test_network.py Enable flake8 rule W292 No newline at end of file (#6274) 2022-06-30 12:08:27 +05:30
test_network_acl.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_network_ipv6.py api: fix ipv6 firewall apis default role permissions (#6579) 2022-07-31 16:49:29 +05:30
test_network_permissions.py test,xcp-ng: fix tests for VM PV driver issue (#6549) 2022-08-09 12:44:27 +05:30
test_nic.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_nic_adapter_type.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_non_contigiousvlan.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_outofbandmanagement.py test: Frix travis failure - test_outofbandmanagement.py (#5346) 2021-08-20 13:00:34 +02:00
test_outofbandmanagement_nestedplugin.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_over_provisioning.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_password_server.py Add Python flake8 linting for W291 trailing whitespace with Super-Linter (#4687) 2022-03-28 11:40:26 -03:00
test_persistent_network.py Enable flake8 rule W292 No newline at end of file (#6274) 2022-06-30 12:08:27 +05:30
test_portable_publicip.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_portforwardingrules.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_primary_storage.py tests: Fix test failures for Local storage and Basic zones (#5106) 2021-07-01 09:45:21 +05:30
test_privategw_acl.py cloudstack: make code more inclusive 2021-06-08 15:47:20 +05:30
test_privategw_acl_ovs_gre.py OVS/GRE: bug fixes (#5446) 2021-10-03 14:47:52 +05:30
test_projects.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_public_ip_range.py Add Python flake8 linting for W291 trailing whitespace with Super-Linter (#4687) 2022-03-28 11:40:26 -03:00
test_pvlan.py New feature: give access permission of networks to other accounts in same domain (#5769) 2022-04-19 11:29:31 -03:00
test_regions.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_reset_configuration_settings.py Fix spelling (#6272) 2022-07-05 09:08:53 +02:00
test_reset_vm_on_reboot.py Add Python flake8 linting for W291 trailing whitespace with Super-Linter (#4687) 2022-03-28 11:40:26 -03:00
test_resource_accounting.py CLOUDSTACK-3009: Fix resource calculation CPU, RAM for accounts. (#3012) 2018-11-13 06:29:08 +05:30
test_resource_detail.py Add Python flake8 linting for W291 trailing whitespace with Super-Linter (#4687) 2022-03-28 11:40:26 -03:00
test_router_dhcphosts.py cleanup of unused code and cleanup of cleanup procedure (#5562) 2021-12-23 10:10:38 +05:30
test_router_dns.py CLOUDSTACK-10193: Fix smoke tests failures with new systemvmtemplate 2017-12-23 09:22:44 +05:30
test_router_dnsservice.py Add Python flake8 linting for W291 trailing whitespace with Super-Linter (#4687) 2022-03-28 11:40:26 -03:00
test_routers.py server: Use ACPI event to reboot VM on KVM, and Use 'forced' reboot option to stop and start the VM(s) (#4681) 2021-03-06 14:58:56 +05:30
test_routers_iptables_default_policy.py Add Python flake8 linting for W291 trailing whitespace with Super-Linter (#4687) 2022-03-28 11:40:26 -03:00
test_routers_network_ops.py Add Python flake8 linting for W291 trailing whitespace with Super-Linter (#4687) 2022-03-28 11:40:26 -03:00
test_scale_vm.py Enable flake8 rule W292 No newline at end of file (#6274) 2022-06-30 12:08:27 +05:30
test_secondary_storage.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_service_offerings.py kvm: volume encryption feature (#6522) 2022-09-27 10:20:59 +05:30
test_snapshots.py Add Python flake8 linting for W291 trailing whitespace with Super-Linter (#4687) 2022-03-28 11:40:26 -03:00
test_ssvm.py Add Python flake8 linting for W291 trailing whitespace with Super-Linter (#4687) 2022-03-28 11:40:26 -03:00
test_staticroles.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_storage_policy.py network: fix vm can be deployed on L2 network of other accounts (#5784) 2022-01-11 12:16:00 +05:30
test_templates.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_update_security_group.py Fix spelling (#6597) 2022-08-03 15:43:47 +05:30
test_usage.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_usage_events.py python3: Migrate Marvin and smoketests to python3 (#4727) 2021-05-04 23:19:37 +05:30
test_vm_deployment_planner.py server: Add support for new heuristics based VM Deployement for admins (#3454) 2019-07-13 12:52:48 +05:30
test_vm_life_cycle.py test: add test for importUnmanagedInstance (#6385) 2022-05-17 11:18:45 -03:00
test_vm_lifecycle_unmanage_import.py test: add test for importUnmanagedInstance (#6385) 2022-05-17 11:18:45 -03:00
test_vm_snapshot_kvm.py .github/linters: Enable flake8 W293 blank line contains whitespace (#6268) 2022-04-15 20:32:52 +05:30
test_vm_snapshots.py storage: New Dell EMC PowerFlex Plugin (formerly ScaleIO, VxFlexOS) (#4304) 2021-02-24 14:58:33 +05:30
test_volumes.py kvm: volume encryption feature (#6522) 2022-09-27 10:20:59 +05:30
test_vpc_ipv6.py test: add, refactor ipv6 network, vpc tests (#6338) 2022-07-12 12:54:53 +05:30
test_vpc_redundant.py Add Python flake8 linting for W291 trailing whitespace with Super-Linter (#4687) 2022-03-28 11:40:26 -03:00
test_vpc_router_nics.py Add Python flake8 linting for W291 trailing whitespace with Super-Linter (#4687) 2022-03-28 11:40:26 -03:00
test_vpc_vpn.py Merge remote-tracking branch 'origin/4.15' into main 2021-08-18 16:56:19 +05:30