Commit Graph

358 Commits

Author SHA1 Message Date
Rohit Yadav 5a52ca78ae
kvm: export sysinfo for arm64 domains for cloud-init to work (#8940)
This fixes a limitation for arm64/aarch64 KVM hosts to correctly export
the product name via sysconfig attribute. Without this `cloud-init`
doesn't function correctly on arm64 platforms.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-04-19 21:23:49 +02:00
Suresh Kumar Anaparti d3e020a545
Mark libvirt events experimental, add properties flag (#8825)
* Mark libvirt events experimental, add properties flag

* unit test fixes

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
2024-04-11 17:06:33 +05:30
Abhishek Kumar ff3e9bd821 engine-storage: control download redirection
Add a global setting to control whether redirection is allowed while
downloading templates and volumes

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-04-04 14:11:05 +05:30
Wei Zhou 939d0b9011 engine-storage: control download redirection
Add a global setting to control whether redirection is allowed while
downloading templates and volumes

core: some changes on SimpleHttpMultiFileDownloader
similar as HttpTemplateDownloader

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
(cherry picked from commit b1642bc3bf)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-04-04 11:19:20 +05:30
Wei Zhou a7ec8738a2
kvm: fix NPE while import KVM VMs from other hosts (#8720) 2024-03-04 09:46:28 +01:00
Harikrishna c462be1412
New API "checkVolume" to check and repair any leaks or issues reported by qemu-img check (#8577)
* Introduced a new API checkVolumeAndRepair that allows users or admins to check and repair if any leaks observed.
Currently this is supported only for KVM

* some fixes

* Added unit tests

* addressed review comments

* add repair volume while granting access

* Changed repair parameter to accept both leaks/all

* Introduced new global setting volume.check.and.repair.before.use to do volume check and repair before VM start or volume attach operations

* Added volume check and repair changes only during VM start and volume attach operations

* Refactored the names to look similar across the code

* Some code fixes

* remove unused code

* Renamed repair values

* Fixed unit tests

* changed version

* Address review comments

* Code refactored

* used volume name in logs

* Changed the API to Async and the setting scope to storage pool

* Fixed exit value handling with check volume command

* Fixed storage scope to the setting

* Fix volume format issues

* Refactored the log messages

* Fix formatting
2024-02-29 14:41:49 +05:30
Vishesh a8028eecbd
Merge remote-tracking branch 'origin/4.18' into 4.19 2024-02-13 11:44:20 +05:30
dahn 672206c312
kvm: ITCO watchdog added (#8282)
* ITCO watchdog added

* add inject-nmi action

* Update plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/LibvirtVMDef.java

Co-authored-by: Wei Zhou <weizhou@apache.org>

---------

Co-authored-by: Wei Zhou <weizhou@apache.org>
2024-02-12 08:54:39 +01:00
Wei Zhou b8904f75dd Merge remote-tracking branch 'apache/4.18' into 4.19 2024-02-05 10:08:31 +01:00
Marcus Sorensen 9f1b34aeb2
Fix libvirt domain event listener by properly processing events (#8437)
* Fix libvirt domain event listener by properly processing events

* Add javadoc for setupEventListener

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
2024-02-05 13:30:10 +05:30
Lucas Martins 1c98b5a4e5 Change Cryptsetup validation (#8482)
Co-authored-by: lucas.martins.scclouds <lucas.martins@scclouds.com.br>
2024-02-01 09:43:28 +01:00
Vishesh fedcf66de0
Externalise a few timeouts & fix timeout for hostSupportsUefi in libvirt ready command wrapper (#8547)
This PR fixes bug introduced in #8502. Timeout for script execution was set to 60 ms instead of 60s which resulted in host not getting UEFI enabled. This is a blocker for 4.19 release.

We do this by introducing a new agent parameter `agent.script.timeout` (default - 60 seconds) to use as a timeout for the script checking host's UEFI status.

We also externalize the timeout for the ReadyCommand by introducing a new global setting `ready.command.wait` (default - 60 seconds).

For ModifyStoragePoolCommand, we don't externalize the timeout to avoid confusion for the user. Since, the required timeout can vary depending on the provider in use and we are only setting the wait for default host listener for now. Instead, we reuse the global `wait` setting by dividing it by `5` making the default value of 6 minutes (1800/5 = 360s) for ModifyStoragePoolCommand.

Note: the actual time, the MS waits is twice the wait set for a Command. Check reference code below.
19250403e6/engine/orchestration/src/main/java/com/cloud/agent/manager/AgentAttache.java (L406-L442)
2024-01-27 23:36:13 +05:30
Vishesh c3b77cb7b8
Fix host stuck in connecting state (#8502)
There are a lot of test failures due to test_vm_life_cycle.py in multiple PRs due to host not available for migration of VMs.
#8438 (comment)
#8433 (comment)
#7344 (comment)

While debugging I noticed that the hosts get stuck in Connecting state because MS is waiting for a response of the ReadyCommand from the agent. Since we take a lock on connection and disconnection, restarting the agent doesn't work. To fix this, we have to restart the MS or wait for ~1 hour (default timeout).

On the agent side, it gets stuck waiting for a response from the Script execution.

To reproduce, run smoke/test_vm_life_cycle.py (TestSecuredVmMigration test class to be specific). Once the tests are complete, you will notice that some hosts are stuck in Connecting state. And restarting the agent fails due to the named lock. Locks on DB can be checked using the below query.

SELECT *
FROM performance_schema.metadata_locks
INNER JOIN performance_schema.threads ON THREAD_ID = OWNER_THREAD_ID
WHERE PROCESSLIST_ID <> CONNECTION_ID() \G;

This PR adds a wait for the ready command and a timeout to the Script execution to ensure that the thread doesn't get stuck and the named lock from database is released.
2024-01-15 13:56:34 +05:30
Nicolas Vazquez a3a4833c3e
Fixes for KVM unmanaged instances import on advanced network and VNC password (#8492)
This PR fixes a regression caused by #8465 on advanced zones, import fails with:

2024-01-10 12:13:33,234 DEBUG [o.a.c.e.o.NetworkOrchestrator] (API-Job-Executor-3:ctx-991bbe9f job-128 ctx-f49517d4) (logid:d7b8e716) Allocating nic for vm 142272e8-9e2e-407b-9d7e-e9a03b81653c in network Network {"id": 204, "name": "Isolated", "uuid": "9679fac5-e3ac-4694-a57b-beb635340f39", "networkofferingid": 10} during import
2024-01-10 12:13:33,239 ERROR [o.a.c.v.UnmanagedVMsManagerImpl] (API-Job-Executor-3:ctx-991bbe9f job-128 ctx-f49517d4) (logid:d7b8e716) Failed to import NICs while importing vm: i-2-31-VM
com.cloud.exception.InsufficientVirtualNetworkCapacityException: Unable to acquire Guest IP  address for network Network {"id": 204, "name": "Isolated", "uuid": "9679fac5-e3ac-4694-a57b-beb635340f39", "networkofferingid": 10}Scope=interface com.cloud.dc.DataCenter; id=1
	at org.apache.cloudstack.engine.orchestration.NetworkOrchestrator.importNic(NetworkOrchestrator.java:4582)
	at org.apache.cloudstack.vm.UnmanagedVMsManagerImpl.importNic(UnmanagedVMsManagerImpl.java:859)
	at org.apache.cloudstack.vm.UnmanagedVMsManagerImpl.importVirtualMachineInternal(UnmanagedVMsManagerImpl.java:1198)
	at org.apache.cloudstack.vm.UnmanagedVMsManagerImpl.importUnmanagedInstanceFromHypervisor(UnmanagedVMsManagerImpl.java:1511)
	at org.apache.cloudstack.vm.UnmanagedVMsManagerImpl.baseImportInstance(UnmanagedVMsManagerImpl.java:1342)
	at org.apache.cloudstack.vm.UnmanagedVMsManagerImpl.importUnmanagedInstance(UnmanagedVMsManagerImpl.java:1282)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

Also, addresses the VNC password field set instead of a fixed string
2024-01-12 14:14:01 +05:30
Nicolas Vazquez 59e78cbc45
Fix KVM unmanage disks path (#8483)
This PR fixes the volumes path on KVM import unmanaged instances

Fixes: #8479
2024-01-11 14:45:57 +05:30
slavkap c569fe9119
Fix KVM import and list unmanaged VMs (#8445)
VM import fixes

1 - Fix of VM insert for VMs with StorPool volumes
2 - Fix of list/insert unmanaged VMs with RBD volumes
2024-01-10 13:12:07 +05:30
kishankavala ab20b1220f
KVM Ingestion - Import Instance (#7976)
This PR adds new functionality to import KVM instances from an external host or from disk images in local or shared storage.
Doc PR: https://github.com/apache/cloudstack-documentation/pull/356
2023-12-14 13:08:56 +05:30
Abhishek Kumar 82f7abddb3 Merge remote-tracking branch 'apache/4.18' 2023-12-13 11:24:15 +05:30
Bryan Lima 3bb318bab9
kvm: Add support for cgroupv2 (#8252)
1. Problem description

In Apache CloudStack (ACS), when a VM is deployed in a host with the KVM hypervisor, an XML file is created in the assigned host, which has a property shares that defines the weight of the VM to access the host CPU. The value of this property has no unit, and it is a relative measure to calculate how much CPU a given VM will have in the host. However, this value has a limit, which depends on the version of cgroup utilized by the host's kernel. The problem lies at the range value of shares that varies between both versions: [2, 264144] for cgroups version 1; and [1, 10000] for cgroups version 2. Currently, ACS calculates the value of shares using Equation 1, presented below, where CPU is the number of cores and speed is the CPU frequency; both specified in the VM's compute offering. Therefore, if a compute offering has, for example, 6 cores at 2 GHz, the shares value will be 12000 and an exception will be thrown by libvirt if the host utilizes cgroup v2. The second version is becoming the default one in current Linux distributions; thus, it is necessary to address this limitation.

    Equation 1
    shares = CPU * speed

Fixes: #6744
2. Proposed changes

To address the problem described, we propose to apply a scale conversion considering the max shares of the host. Using the same formula currently utilized by ACS, it is possible to calculate the maximum shares of a VM for a given host. In other words, using the number of cores and the nominal speed of the host's CPU as the upper limit of shares allowed to a VM. Then, this value will be scaled to the allowed interval of [1, 10000] of cgroup v2 by using a linear scale conversion.

The VM shares would be calculated as Equation 2, presented below, where VM requested shares is the requested shares value calculated using Equation 1, cgroup upper limit is fixed with a value of 10000 (cgroups v2 upper limit), and host max shares is the maximum shares value of the host, calculated using Equation 1. Using Equation 2, the only case where a VM passes the cgroup v2 limit is when the user requests more resources than the host has, which is not possible with the current implementation of ACS.

    Equation 2
    shares = (VM requested shares * cgroup upper limit)/host max shares

To implement the proposal, the following APIs will be updated: deployVirtualMachine, migrateVirtualMachine and scaleVirtualMachine. When a VM is being deployed, a new verification will be added to find a suitable host. The max shares of each host will be calculated, and the VM calculated shares will be verified if it does not surpass the host's value. Likewise, the migration of VMs will have a similar new verification. Lastly, the scale of VMs will also have the same verification for the VM's host.

To determine the max shares of a given host, we will use the same equation currently used in ACS for calculating the shares of VMs, presented in Section 1. When Equation 1 is used to determine the maximum shares of a host, CPU is the number of cores of the host, and speed is the nominal CPU speed, i.e., considering the CPU's base frequency.

It is important to note that these changes are only for hosts with the KVM hypervisor using cgroup v2 for now.
2023-12-13 10:51:24 +05:30
Rene Glover 1031c31e6a
FiberChannel Multipath for KVM + Pure Flash Array and HPE-Primera Support (#7889)
This PR provides a new primary storage volume type called "FiberChannel" that allows access to volumes connected to hosts over fiber channel connections. It requires Multipath to provide path discovery and failover. Second, the PR adds an AdaptivePrimaryDatastoreProvider that abstracts how volumes are managed/orchestrated from the connector to communicate with the primary storage provider, using a ProviderAdapter interface, allowing the code interacting with the primary storage provider API's to be simpler and have no direct dependencies on Cloudstack code. Lastly, the PR provides an implementation of the ProviderAdapter classes for the HP Enterprise Primera line of storage solutions and the Pure Flash Array line of storage solutions.
2023-12-09 11:31:33 +05:30
Abhishek Kumar c599011ef5 Merge remote-tracking branch 'apache/4.18' 2023-12-08 18:06:15 +05:30
Wei Zhou 7ea068c4dc
kvm: fix error 'Failed to find passphrase for keystore: cloud.jks' when enable SSL for kvm agent (#7923) 2023-12-07 09:10:11 +01:00
Nicolas Vazquez 371ad9f55b
New Feature: Import VMware VMs into KVM (#7881)
This PR adds the capability in CloudStack to convert VMware Instances disk(s) to KVM using virt-v2v and import them as CloudStack instances. It enables CloudStack operators to import VMware instances from vSphere into a KVM cluster managed by CloudStack. vSphere/VMware setup might be managed by CloudStack or be a standalone setup.

    CloudStack will let the administrator select a VM from an existing VMware vCenter in the CloudStack environment or external vCenter requesting vCenter IP, Datacenter name and credentials.
    The migrated VM will be imported as a KVM instance
    The migration is done through virt-v2v: https://access.redhat.com/articles/1351473, https://www.ovirt.org/develop/release-management/features/virt/virt-v2v-integration.html
    The migration process timeout can be set by the setting convert.instance.process.timeout
    Before attempting the virt-v2v migration, CloudStack will create a clone of the source VM on VMware. The clone VM will be removed after the registration process finishes.
    CloudStack will delegate the migration action to a KVM host and the host will attempt to migrate the VM invoking virt-v2v. In case the guest OS is not supported then CloudStack will handle the error operation as a failure
    The migration process using virt-v2v may not be a fast process
    CloudStack will not perform any check about the guest OS compatibility for the virt-v2v library as indicated on: https://access.redhat.com/articles/1351473.
2023-12-07 12:59:56 +05:30
sato03 fdfbb4fad1
Prioritize hypervisor.uri configuration (#8254)
Co-authored-by: Henrique Sato <henrique.sato@scclouds.com.br>
2023-12-06 16:43:04 -03:00
Daan Hoogland 14376ce298 Merge release branch 4.18 to main
* 4.18:
  kvm: fix ide controller for rocky/alma vms (#8247)
2023-12-06 16:06:09 +01:00
Wei Zhou db6dd52f44
kvm: fix ide controller for rocky/alma vms (#8247) 2023-12-06 15:05:49 +01:00
Stephan Krug 267a457efc
Externalize KVM HA heartbeat frequency (#6892)
Co-authored-by: Stephan Krug <stephan.krug@scclouds.com.br>
Co-authored-by: GaOrtiga <49285692+GaOrtiga@users.noreply.github.com>
Co-authored-by: dahn <daan.hoogland@gmail.com>
2023-11-16 09:17:17 +01:00
Daan Hoogland 05b9b6e2e7 Merge branch '4.18' into main 2023-11-13 11:36:51 +01:00
Abhishek Kumar d0f3233fda
edge-zone,kvm,iso,cks: allow k8s deployment with direct-download iso (#8142)
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2023-11-10 13:56:05 +01:00
slavkap 2bb182c3e1
KVM Host HA enhancement for StorPool storage (#8045)
Extending the current functionality of KVM Host HA for the StorPool storage plugin and the option for easy integration for the rest of the storage plugins to support Host HA

This extension works like the current NFS storage implementation. It allows it to be used simultaneously with NFS and StorPool storage or only with StorPool primary storage.

If it is used with different primary storages like NFS and StorPool, and one of the health checks fails for storage, there is an option to report the failure to the management with the global config kvm.ha.fence.on.storage.heartbeat.failure. By default this option is disabled when enabled the Host HA service will continue with the checks on the host and eventually will fence the host
2023-11-04 12:35:37 +05:30
Daan Hoogland a15cb81c85 Merge remote-tracking branch 'apache/4.18' into main 2023-11-03 11:55:26 +01:00
Harikrishna 1e133d05c7
kvm: Handle the failures when setting up memory balloon stats period for KVM VMs (#8049) 2023-11-03 09:07:11 +01:00
John Bampton f090c77f41
misc: fix spelling (#7549)
Co-authored-by: Stephan Krug <stekrug@icloud.com>
2023-11-02 09:23:53 +01:00
Vishesh 5362bad442
Storage Management (#7949) 2023-11-01 10:46:22 +01:00
Daan Hoogland 587d1d7dba Merge remote-tracking branch 'apache/4.18' into main 2023-10-26 09:37:38 +02:00
slavkap 6ae3b73ca2
Create snapshot from VM snapshot without memory for NFS/Local storage (#8117) 2023-10-26 08:46:14 +02:00
Daan Hoogland 72cf9740f9 Merge branch '4.18' 2023-10-06 13:50:29 +02:00
Daniel Augusto Veronezi Salvador 9b8eaeea78
Fix: Convert volume to another directory instead of copying it while taking volume snapshots on KVM (#8041) 2023-10-06 09:47:34 +02:00
Marcus Sorensen 82b981854b
KVM Agent config to reserve dom0 CPUs (#7987)
This PR allows an admin to reserve some hypervisor host CPUs for system use. Another way to think of it is limiting the number of CPUs allocatable to VMs. This can be useful if the admin wants to do other things with the hypervisor's CPU, for example reserve some cores for running hyperconverged storage processes.

Co-authored-by: Marcus Sorensen <mls@apple.com>
2023-10-06 10:50:18 +05:30
Rohit Yadav 8cd7147b25 Merge remote-tracking branch 'origin/4.18' 2023-09-28 12:15:23 +05:30
Marcus Sorensen 3694667f50
Trigger out of band VM state update via libvirt event when VM stops (#7963)
* Trigger out of band VM state update via libvirt event when VM stops

* Add License headers, refactor nested try

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
2023-09-28 12:12:03 +05:30
Marcus Sorensen 221f863939
Use direct download timeout configs for URL check (#7948)
Signed-off-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Marcus Sorensen <mls@apple.com>
2023-09-28 12:11:38 +05:30
Marcus Sorensen 28c4be1cf2
Fix style for LibvirtComputingResource variable names and its dependencies (#7991)
* Fix style for LibvirtComputingResource variable names and its dependencies

* More variable name fixes

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
2023-09-27 12:38:25 +05:30
Wei Zhou 45616aaf61 Merge remote-tracking branch 'origin/4.18' 2023-09-14 14:00:01 +02:00
Marcus Sorensen f049d4d189
Increase reserve on ScaleIO disk formatting for fragmentation (#7955)
Signed-off-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Marcus Sorensen <mls@apple.com>
2023-09-14 16:43:16 +05:30
Wei Zhou f6b2a58727 Merge branch '4.18' 2023-09-07 08:56:35 +02:00
Wei Zhou 126dd5fa4c
kvm: fix live vm migration between local storage pools (#7945) 2023-09-07 08:22:37 +05:30
Nicolas Vazquez 57c61fb33c
Fix direct download https compressed qcow2 template checker (#7932)
This PR fixes an issue on direct download while registering HTTPS compressed files
Fixes: #7929
2023-09-01 08:16:03 +02:00
Daan Hoogland 7b64236469 Merge release branch 4.18 to main
* 4.18:
  server: remove registered userdata when cleanup an account (#7777)
  server: Use max secondary storage defined on the account during upload  (#7441)
  test: upgrade kubernetes versions to 1.25.0/1.26.0 (#7685)
  kvm: Added VNI Devices as normal bridge slave devs (#7836)
  noVNC: fix JP keyboard on vmware7+ which uses websocket URL (#7694)
2023-08-10 14:50:46 +02:00
fermosan fa58f59619
kvm: Added VNI Devices as normal bridge slave devs (#7836)
This will allow for VXLAN configurations to utilize tags on the physical network of a zone
2023-08-10 08:59:28 +02:00
Daan Hoogland eb31e3d795 Merge release branch 4.18 to main
* 4.18:
  Allow KVM overcommit to work without reducing minimum VM memory when vm ballooning is disabled (#7810)
2023-08-06 10:41:00 +02:00
Rohit Yadav dc5e4f3ec6
Allow KVM overcommit to work without reducing minimum VM memory when vm ballooning is disabled (#7810)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Co-authored-by: dahn <daan.hoogland@gmail.com>
Co-authored-by: Daan Hoogland <daan@onecht.net>
2023-08-06 10:39:14 +02:00
Daan Hoogland d51d8a4a13 Merge release branch 4.18 to main
* 4.18:
  UI: Filter templates by zone and hypervisor type when reinstall a VM (#7739)
  KVM: fix SSVM starting when overprovisioning memory (#7663)
  pom.xml: add property project.systemvm.template.location (#7706)
  cloudutils: fix adding rocky9 host failure due to missing /etc/sysconfig/libvirtd (#7779)
  server: get id from persisted object ReservationVO (#7785)
  search in (too) large result sets (#7766)
  ui: fix 404 error when list volumes of system vms (#7772)
  packaging: install tzdata-java on centos7/centos8 (#7768)
2023-07-31 09:04:44 +02:00
dahn d127d7939d
KVM: fix SSVM starting when overprovisioning memory (#7663) 2023-07-28 11:23:30 +02:00
Rohit Yadav 62a8f4ef72 Merge remote-tracking branch 'origin/4.18' 2023-07-24 15:57:37 +05:30
Marcus Sorensen 63216425d5
Set encrypted PowerFlex disk format correctly (#7735)
Co-authored-by: Marcus Sorensen <mls@apple.com>
2023-07-24 13:13:46 +05:30
Daan Hoogland 6bb95c0200 Merge release branch 4.18 to main
* 4.18:
  Storage and volumes statistics tasks for StorPool primary storage (#7404)
  proper storage construction (#6797)
  guarantee MAC uniqueness (#7634)
  server: allow migration of all VMs with local storage on KVM (#7656)
  Add L2 networks to Zones with SG (#7719)
2023-07-19 10:59:19 +02:00
dahn 0aade286f5
proper storage construction (#6797) 2023-07-19 10:27:20 +02:00
Rohit Yadav 5383bf64f4 Merge remote-tracking branch 'origin/4.18'
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-07-07 23:04:44 +05:30
dahn 2752c49fa7
agent: get the right controll cidr (#7580)
Fixes: #7574
2023-07-07 22:57:58 +05:30
Vishesh 594c70dde0
Sync precommit config from main (#7732)
Co-authored-by: John Bampton <jbampton@users.noreply.github.com>
Co-authored-by: dahn <daan@onecht.net>
2023-07-07 11:18:16 +02:00
Daan Hoogland 2132f46fcb Merge branch '4.18' 2023-07-06 11:24:08 +02:00
Nicolas Vazquez c733a23c90
Fix direct download URL checks (#7693)
This PR fixes the URL check for direct downloads, in the case of HTTPS URLs the certificates were not loaded into the SSL context
2023-07-06 13:47:13 +05:30
Wei Zhou 09a4a252d7 Merge remote-tracking branch 'apache/4.18' into HEAD 2023-06-21 15:08:56 +02:00
Harikrishna 40cc10a73d
Allow volume migrations in ScaleIO within and across ScaleIO storage clusters (#7408)
* Live storage migration of volume in scaleIO within same storage scaleio cluster

* Added migrate command

* Recent changes of migration across clusters

* Fixed uuid

* recent changes

* Pivot changes

* working blockcopy api in libvirt

* Checking block copy status

* Formatting code

* Fixed failures

* code refactoring and some changes

* Removed unused methods

* removed unused imports

* Unit tests to check if volume belongs to same or different storage scaleio cluster

* Unit tests for volume livemigration in ScaleIOPrimaryDataStoreDriver

* Fixed offline volume migration case and allowed encrypted volume migration

* Added more integration tests

* Support for migration of encrypted volumes across different scaleio clusters

* Fix UI notifications for migrate volume

* Data volume offline migration: save encryption details to destination volume entry

* Offline storage migration for scaleio encrypted volumes

* Allow multiple Volumes to be migrated with migrateVirtualMachineWithVolume API

* Removed unused unittests

* Removed duplicate keys in migrate volume vue file

* Fix Unit tests

* Add volume secrets if does not exists during volume migrations. secrets are getting cleared on package upgrades.

* Fix secret UUID for encrypted volume migration

* Added a null check for secret before removing

* Added more unit tests

* Fixed passphrase check

* Add image options to the encypted volume conversion
2023-06-21 11:57:05 +05:30
Wei Zhou 9d46df57f2
kvm: add vm setting for nic multiqueue number and packed virtqueues (#7333)
This PR adds two vm setting for user vms on KVM

- nic multiqueue number
- packed virtqueues enabled . optional are true and false (false by default). It requires qemu>=4.2.0 and libvirt >=6.3.0

Tested ok on ubuntu 22 and rocky 8.4
2023-05-09 15:19:26 +05:30
Rohit Yadav a2561df25b Merge remote-tracking branch 'origin/4.18' 2023-05-08 12:57:38 +05:30
Marcus Sorensen ec0f8bddf6
Support local storage live migration for direct download templates (#7453)
Co-authored-by: Marcus Sorensen <mls@apple.com>
2023-05-04 17:37:58 -03:00
Nicolas Vazquez be66eb2a35
Auto Enable/Disable KVM hosts (#7170)
* Auto Enable Disable KVM hosts

* Improve health check result

* Fix corner cases

* Script path refactor

* Fix sonar cloud reports

* Fix last code smells

* Add marvin tests

* Fix new line on agent.properties to prevent host add failures

* Send alert on auto-enable-disable and add annotations when the setting is enabled

* Address reviews

* Add a reason for enabling or disabling a host when the automatic feature is enabled

* Fix comment on the marvin test description

* Fix for disabling the feature if the admin has manually updated the host resource state before any health check result
2023-04-04 17:03:37 +05:30
John Bampton c2e17310d6
Add three more `pre-commit` checks (#7083)
Co-authored-by: dahn <daan@onecht.net>
2023-03-27 13:28:55 +02:00
João Jandre 523ab58d02
Fix PR 7131 bugs and vulnerabilities (#7140) 2023-03-21 15:06:18 +01:00
Abhishek Kumar 1a03f69a3a
cleanup: remove testing logs (#7270)
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2023-02-21 14:42:33 +01:00
Wei Zhou 8ef35466de
Tungsten: fix functional issues (#7173)
Co-authored-by: dahn <daan.hoogland@gmail.com>
2023-02-13 09:15:28 +01:00
David Jumani c774b865c9
Tungsten integration (#7065)
Co-authored-by: rtodirica <rtodirica@ena.com>
Co-authored-by: Huy Le <huylm@unitech.vn>
Co-authored-by: radu-todirica <Radu.Todirica@ness.com>
Co-authored-by: Huy Le <minh.le@ext.ewerk.com>
Co-authored-by: Simon Weller <siweller77@gmail.com>
Co-authored-by: dahn <daan@onecht.net>
2023-02-01 09:19:53 +01:00
Abhishek Kumar 028ca74fb6
ui,server,api: resource metrics improvements (#6803)
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-01-30 09:48:03 +01:00
slavkap d288bb0c78
KVM support of iothreads and IO driver policy (#6909) 2023-01-25 12:34:05 +01:00
Wei Zhou 743ebe7278
kvm: get vm disk stats for ceph disks (#7045) 2023-01-16 14:19:14 +01:00
Rohit Yadav 55d2d26449
kvm: make UEFI host check to support both Ubuntu and EL (#7084)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-01-16 14:12:53 +01:00
Stephan Krug 4d76054377
Fix volume snapshot in VM with attached ISO (#7037)
Co-authored-by: Stephan Krug <stephan.krug@scclouds.com.br>
2023-01-04 09:24:34 +01:00
Rohit Yadav fab4fc2a14 Merge remote-tracking branch 'origin/4.17' 2022-12-22 22:55:15 +05:30
Paula Oliveira 0fe2e6950e
Improving code related to the Agent properties (#6348)
Co-authored-by: Paula Zomignani Oliveira <paula@scclouds.com.br>
Co-authored-by: João Jandre <48719461+JoaoJandre@users.noreply.github.com>
Co-authored-by: GutoVeronezi <daniel@scclouds.com.br>
2022-12-22 12:00:49 +01:00
Abhishek Kumar fb22c5c3c9
kvm: correctly set vm cpu topology (#6870)
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2022-12-19 11:01:10 +01:00
Rodrigo D. Lopez 2ed7868f27
Inserts timer in check detach volume (#6508)
Co-authored-by: Lopez <rodrigo@scclouds.com.br>
Co-authored-by: Stephan Krug <stekrug@icloud.com>
2022-12-16 09:35:27 +01:00
John Bampton def7ce655d
Fix spelling (#6898)
Co-authored-by: davidjumani <dj.davidjumani1994@gmail.com>
2022-12-13 14:58:14 +01:00
GaOrtiga 684f3f4c49
Improvements and cleanup on the javadocs of QemuImg (#6917)
* Cleanup in the javadocs of QemuImg

* Update QemuImg.java

* Apply suggestions from code review

Co-authored-by: Stephan Krug <stekrug@icloud.com>

Co-authored-by: cloudstack-lab-gabriel <gabriel.fernandes@scclouds.com.br>
Co-authored-by: Stephan Krug <stekrug@icloud.com>
2022-12-06 17:59:03 -03:00
Wei Zhou a63b2aba7a
VM Autoscaling with virtual router (#6571) 2022-12-05 15:23:03 +01:00
Daniel Augusto Veronezi Salvador b8b66b7a3d
Fix typos and improve javadocs on ByteScaleUtils (#6877)
Co-authored-by: GutoVeronezi <daniel@scclouds.com.br>
2022-11-10 10:14:24 +01:00
José Flauzino 1843632c24
Fix memory stats for KVM (#6358)
Co-authored-by: joseflauzino <jose@scclouds.com.br>
2022-11-09 18:00:12 +01:00
Wei Zhou 48ffa5dc0b
Support multiple ceph monitors (#6792) 2022-10-21 10:37:30 +02:00
Daniel Augusto Veronezi Salvador 2ca164ac96
Quota custom tariffs (#5909)
Co-authored-by: GutoVeronezi <daniel@scclouds.com.br>
Co-authored-by: dahn <daan.hoogland@gmail.com>
2022-10-17 10:03:50 +02:00
Peinthor Rene ff961c9594
linstor: support QoS(IOPs) and small improvements (#6682)
This PR has 3 improvements for the Linstor primary storage driver:

- Create a separate jar of it and move all Linstor related classes into the correct project (similar to the storpool plugin)
- Add aux properties for Cloudstack volumes in Linstor to make it easier to identify them in Linstor
- Add support for IOPs settings with the Linstor storage plugin
2022-10-08 12:06:49 +05:30
Wei Zhou 6786c24138
kvm: fix backup volume snapshot fails on RBD storage (#6790)
This PR fixes the issue that volume snapshot fails on RBD storage with the following error

qemu-img: Could not open 'driver=raw,file.filename=rbd:cloudstack/test_wei.img:mon_host=10.0.32.254:auth_supported=cephx:id=cloudstack:key=AQDwHTNjjHXRKRAAJb+AToFr6x4a1AvKUc4Ksg==:rbd_default_format=2:client_mount_timeout=30': Could not open 'rbd:cloudstack/test_wei.img:mon_host=10.0.32.254:auth_supported=cephx:id=cloudstack:key=AQDwHTNjjHXRKRAAJb+AToFr6x4a1AvKUc4Ksg==:rbd_default_format=2:client_mount_timeout=30': No such file or directory

However, it works without using image options

Therefore, do not pass the image options if the image format is not QCOW2 and LUKS.
2022-10-08 11:55:33 +05:30
Marcus Sorensen 697e12f8f7
kvm: volume encryption feature (#6522)
This PR introduces a feature designed to allow CloudStack to manage a generic volume encryption setting. The encryption is handled transparently to the guest OS, and is intended to handle VM guest data encryption at rest and possibly over the wire, though the actual encryption implementation is up to the primary storage driver.

In some cases cloud customers may still prefer to maintain their own guest-level volume encryption, if they don't trust the cloud provider. However, for private cloud cases this greatly simplifies the guest OS experience in terms of running volume encryption for guests without the user having to manage keys, deal with key servers and guest booting being dependent on network connectivity to them (i.e. Tang), etc, especially in cases where users are attaching/detaching data disks and moving them between VMs occasionally.

The feature can be thought of as having two parts - the API/control plane (which includes scheduling aspects), and the storage driver implementation.

This initial PR adds the encryption setting to disk offerings and service offerings (for root volume), and implements encryption support for KVM SharedMountPoint, NFS, Local, and ScaleIO storage pools.

NOTE: While not required, operations can be significantly sped up by ensuring that hosts have the `rng-tools` package and service installed and running on the management server and hypervisors. For EL hosts the service is `rngd` and for Debian it is `rng-tools`. In particular, the use of SecureRandom for generating volume passphrases can be slow if there isn't a good source of entropy. This could affect testing and build environments, and otherwise would only affect users who actually use the encryption feature. If you find tests or volume creates blocking on encryption, check this first.

### Management Server

##### API

* createDiskOffering now has an 'encrypt' Boolean
* createServiceOffering now has an 'encryptroot' Boolean. The 'root' suffix is added here in case there is ever any other need to encrypt something related to the guest configuration, like the RAM of a VM.  This has been refactored to deal with the new separation of service offering from disk offering internally.
* listDiskOfferings shows encryption support on each offering, and has an encrypt boolean to choose to list only offerings that do or do not support encryption
* listServiceOfferings shows encryption support on each offering, and has an encrypt boolean to choose to list only offerings that do or do not support encryption
* listHosts now shows encryption support of each hypervisor host via `encryptionsupported`
* Volumes themselves don't show encryption on/off, rather the offering should be referenced. This follows the same pattern as other disk offering based settings such as the IOPS of the volume.

##### Volume functions

A decent effort has been made to ensure that the most common volume functions have either been cleanly supported or blocked. However, for the first release it is advised to mark this feature as *experimental*, as the code base is complex and there are certainly edge cases to be found.

Many of these features could eventually be supported over time, such as creating templates from encrypted volumes, but the effort and size of the change is already overwhelming.

Supported functions:
* Data Volume create
* VM root volume create
* VM root volume reinstall
* Offline volume snapshot/restore
* Migration of VM with storage (e.g. local storage VM migration)
* Resize volume
* Detach/attach volume

Blocked functions:
* Online volume snapshot
* VM snapshot w/memory
* Scheduled snapshots (would fail when VM is running)
* Disk offering migration to offerings that don't have matching encryption
* Creating template from encrypted volume
* Creating volume from encrypted volume
* Volume extraction (would we decrypt it first, or expose the key? Probably the former).

##### Primary Storage Support

For storage developers, adding encryption support involves:

1. Updating the `StoragePoolType` for your primary storage to advertise encryption support. This is used during allocation of storage to match storage types that support encryption to storage that supports it.

2. Implementing encryption feature when your `PrimaryDataStoreDriver` is called to perform volume lifecycle functions on volumes that are requesting encryption. You are free to do what your storage supports - this could be as simple as calling a storage API with the right flag when creating a volume. Or (as is the case with the KVM storage types), as complex as managing volume details directly at the hypervisor host. The data objects passed to the storage driver will contain volume passphrases, if encryption is requested.

##### Scheduling

For the KVM implementations specified above, we are dependent on the KVM hosts having support for volume encryption tools. As such, the hosts `StartupRoutingCommand` has been modified to advertise whether the host supports encryption. This is done via a probe during agent startup to look for functioning `cryptsetup` and support in `qemu-img`. This is also visible via the listHosts API and the host details in the UI.  This was patterned after other features that require hypervisor support such as UEFI.

The `EndPointSelector` interface and `DefaultEndpointSelector` have had new methods added, which allow the caller to ask for endpoints that support encryption.  This can be used by storage drivers to find the proper hosts to send storage commands that involve encryption. Not all volume activities will require a host to support encryption (for example a snapshot backup is a simple file copy), and this is the reason why the interface has been modified to allow for the storage driver to decide, rather than just passing the data objects to the EndpointSelector and letting the implementation decide.

VM scheduling has also been modified. When a VM start is requested, if any volume that requires encryption is attached, it will filter out hosts that don't support encryption.

##### DB Changes

A volume whose disk offering enables encryption will get a passphrase generated for it before its first use. This is stored in the new 'passphrase' table, and is encrypted using the CloudStack installation's standard configured DB encryption. A field has been added to the volumes table, referencing this passphrase, and a foreign key added to ensure passphrases that are referenced can't be removed from the database.  The volumes table now also contains an encryption format field, which is set by the implementer of the encryption and used as it sees fit.

#### KVM Agent

For the KVM storage pool types supported, the encryption has been implemented at Qemu itself, using the built-in LUKS storage support. This means that the storage remains encrypted all the way to the VM process, and decrypted before the block device is visible to the guest.  This may not be necessary in order to implement encryption for /your/ storage pool type, maybe you have a kernel driver that decrypts before the block device on the system, or something like that. However, it seemed like the simplest, common place to terminate the encryption, and provides the lowest surface area for decrypted guest data.

For qcow2 based storage, `qemu-img` is used to set up a qcow2 file with LUKS encryption. For block based (currently just ScaleIO storage), the `cryptsetup` utility is used to format the block device as LUKS for data disks, but `qemu-img` and its LUKS support is used for template copy.

Any volume that requires encryption will contain a passphrase ID as a byte array when handed down to the KVM agent. Care has been taken to ensure this doesn't get logged, and it is cleared after use in attempt to avoid exposing it before garbage collection occurs.  On the agent side, this passphrase is used in two ways:

1. In cases where the volume experiences some libvirt interaction it is loaded into libvirt as an ephemeral, private secret and then referenced by secret UUID in any libvirt XML. This applies to things like VM startup, migration preparation, etc.

2. In cases where `qemu-img` needs to use this passphrase for volume operations, it is written to a `KeyFile` on the cloudstack agent's configured tmpfs and passed along. The `KeyFile` is a `Closeable` and when it is closed, it is deleted. This allows us to try-with-resources any volume operations and get the KeyFile removed regardless.

In order to support the advanced syntax required to handle encryption and passphrases with `qemu-img`, the `QemuImg` utility has been modified to support the new `--object` and `--image-opts` flags. These are modeled as `QemuObject` and `QemuImageOptions`.  These `qemu-img` flags have been designed to supersede some of the existing, older flags being used today (such as choosing file formats and paths), and an effort could be made to switch over to these wholesale. However, for now we have instead opted to keep existing functions and do some wrapping to ensure backward compatibility, so callers of `QemuImg` can choose to use either way.

It should be noted that there are also a few different Enums that represent the encryption format for various purposes. While these are analogous in principle, they represent different things and should not be confused. For example, the supported encryption format strings for the `cryptsetup` utility has `LuksType.LUKS` while `QemuImg` has a `QemuImg.PhysicalDiskFormat.LUKS`.

Some additional effort could potentially be made to support advanced encryption configurations, such as choosing between LUKS1 and LUKS2 or changing cipher details. These may require changes all the way up through the control plane. However, in practice Libvirt and Qemu currently only support LUKS1 today. Additionally, the cipher details aren't required in order to use an encrypted volume, as they're stored in the LUKS header on the volume there is no need to store these elsewhere.  As such, we need only set the one encryption format upon volume creation, which is persisted in the volumes table and then available later as needed.  In the future when LUKS2 is standard and fully supported, we could move to it as the default and old volumes will still reference LUKS1 and have the headers on-disk to ensure they remain usable. We could also possibly support an automatic upgrade of the headers down the road, or a volume migration mechanism.

Every version of cryptsetup and qemu-img tested on variants of EL7 and Ubuntu that support encryption use the XTS-AES 256 cipher, which is the leading industry standard and widely used cipher today (e.g. BitLocker and FileVault).

Signed-off-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Marcus Sorensen <mls@apple.com>
2022-09-27 10:20:59 +05:30
Abhishek Kumar 687a21c116 Merge remote-tracking branch 'apache/4.17' into main 2022-09-06 18:47:47 +05:30
Abhishek Kumar b831f23f5f
kvm: add libvirt host capabilities method for cpu speed retrieval (#6696)
Fixes #6680

While finding CPU speed for KVM host following methods will be used in the same order:
1. lscpu
2. value in /sys/devices/system/cpu/cpu0/cpufreq/base_frequency
3. virsh capabilities
4. libvirt nodeinfo

This will allow correct value for AMD based hosts when first two methods doesn't give a value
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2022-09-06 16:45:05 +05:30
Marcus Sorensen f23a4db6d2
kvm: Add usermode interface option to Libvirt Domain XML builder (#6640)
This PR provides constructors and the associated changes to use LibvirtVMDef for creating user mode network interfaces.

While this isn't used directly in the CloudStack KVM agent today, it could be used in the future for e.g. pod networking/management networks without needing to assign a pod IP. The VIF driver used by the CloudStack Agent is also pluggable, so this allows plugin code to create user mode network interfaces as well.

Note that the user mode network already exists in the GuestNetType enum, but wasn't usable prior to this change.

Also included unit test to ensure we continue to create the expected XML.

Additionally, this uncovered a null pointer on _networkRateKBps and this PR fixes it. The decision to add bandwidth throttling assumes this field is not null and simply checks for > 0.

Signed-off-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Marcus Sorensen <mls@apple.com>
2022-08-18 13:14:50 +05:30
Abhishek Kumar d9b770eb48 Merge remote-tracking branch 'apache/4.17' into main 2022-08-12 23:44:42 +05:30
Ruben Bosch 696b93f421
kvm: update host memory stats (#6622)
Fixes #6621

Each time getMemStat() is called, a static value is returned. This value
should instead be refreshed to return the actual memory used.

Co-authored-by: Ruben Bosch <ruben.bosch@cldin.eu>
2022-08-12 17:14:04 +05:30
Rohit Yadav 840c3f6a7a Merge remote-tracking branch 'origin/4.17' 2022-08-10 23:11:09 +02:00
slavkap 76f52af8f3
removed the use of SharedMountPoint storage type for the StorPool plugin (#6552)
Fixes #6455

The default storage adaptor - LibvirtStorageAdaptor - is used by different storage types and doesn't use the annotation @StorageAdaptorInfo. In this case, a storage plugin that wants to adopt one of the predefined storage pool types will override the default behaviour. If fixing the issue in general (for new storage plugins or current ones that want to reuse the existing storage pool types) would affect all volume/snapshot/VM cases. This will lead to the need of extensive testing for each storage plugin for which we don't have the resources to do it. That's why this patch fixes the old behaviour for the SharedMountPoint by adding a new storage pool type for the StorPool plugin.
2022-08-10 14:41:32 +05:30