OAuth2, the industry-standard authorization framework, simplifies the process of
granting access to resources. CloudStack supports OAuth2 authentication, wherein users can log in to
CloudStack without using a username and password. Support for the Google and GitHub providers has been added.
Other OAuth2 providers can be easily integrated with CloudStack using its plugin framework.
The login page shows the provider options when OAuth2 is enabled and the corresponding providers are configured.
An "OAuth configuration" sub-section is present under "Configuration", where admins can register the corresponding
OAuth providers.
This pull request (PR) implements a Distributed Resource Scheduler (DRS) for a CloudStack cluster. The primary objective of this feature is to enable automatic resource optimization and workload balancing within the cluster by live migrating VMs as per the configuration.
Administrators can also execute DRS manually for a cluster, using the UI or the API.
Adds support for two algorithms - condensed & balanced. Algorithms are pluggable, allowing ACS administrators customized control over scheduling.
### Implementation
There are three top-level components:
#### Scheduler
A timer task which:
- generates DRS plans for clusters
- processes DRS plans
- removes old DRS plan records
#### DRS Execution
We go through each VM in the cluster and use the specified algorithm to check whether DRS is required and to calculate the cost, benefit & improvement of migrating that VM to another host in the cluster. On the basis of cost, benefit & improvement, the best migration is selected for the current iteration and the VM is migrated (see the sketch below). The maximum number of iterations (live migrations) possible on the cluster is defined by drs.iterations, which is expressed as a percentage (a value between 0 and 1) of the total number of workloads.
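
A rough sketch of this selection loop follows. The class and method names are simplified stand-ins for illustration, not the actual scheduler code:

```java
// Hypothetical sketch of one DRS execution; not the actual CloudStack classes.
import java.util.List;

public class DrsExecutionSketch {

    static class Migration {
        final String vmId;
        final String destHostId;
        final double cost;
        final double benefit;

        Migration(String vmId, String destHostId, double cost, double benefit) {
            this.vmId = vmId;
            this.destHostId = destHostId;
            this.cost = cost;
            this.benefit = benefit;
        }

        double improvement() {
            return benefit - cost;
        }
    }

    // iterationsFraction corresponds to drs.iterations: a fraction of the workload count.
    void execute(List<String> vmsInCluster, double iterationsFraction) {
        long maxIterations = Math.round(iterationsFraction * vmsInCluster.size());
        for (long i = 0; i < maxIterations; i++) {
            Migration best = null;
            for (String vmId : vmsInCluster) {
                // Ask the configured algorithm for the best candidate move of this VM.
                Migration candidate = bestMigrationFor(vmId);
                if (candidate != null && (best == null || candidate.improvement() > best.improvement())) {
                    best = candidate;
                }
            }
            if (best == null || best.improvement() <= 0) {
                break; // no remaining migration improves the cluster
            }
            liveMigrate(best); // perform the selected live migration, then iterate again
        }
    }

    Migration bestMigrationFor(String vmId) {
        return null; // placeholder: evaluates cost/benefit of each candidate host for this VM
    }

    void liveMigrate(Migration migration) {
        // placeholder: issue the live migration for the chosen VM/host pair
    }
}
```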
#### Algorithm
Every algorithm implements two methods (see the sketch below):
- needsDrs - to check whether DRS is required for the cluster
- getMetrics - to calculate the cost, benefit & improvement of migrating a VM to another host
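
The pluggable contract is roughly the following. This is a simplified sketch; the real interface carries more context (such as cluster and host metrics), and its exact signatures may differ:

```java
// Simplified sketch of the pluggable DRS algorithm contract; parameter and return
// types are illustrative, not the exact signatures introduced by this PR.
public interface ClusterDrsAlgorithmSketch {

    // Whether the cluster is imbalanced enough (per its configured DRS level) to need a DRS run.
    boolean needsDrs(long clusterId);

    // Cost, benefit and improvement of migrating the given VM to the given destination host,
    // returned as { cost, benefit, improvement }.
    double[] getMetrics(long clusterId, long vmId, long destinationHostId);
}
```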
#### Algorithms
- Condensed - packs all the VMs on the minimum number of hosts in the cluster.
- Balanced - distributes the VMs evenly across the hosts in the cluster.

Algorithms use drs.level to decide the amount of imbalance to allow in the cluster.
### APIs Added
`listClusterDrsPlan`
- id - ID of the DRS plan to list
- clusterid - list plans for the given cluster ID

`generateClusterDrsPlan`
- id - cluster ID
- iterations - the maximum number of iterations in a DRS job, defined as a percentage (a value between 0 and 1) of the total number of workloads. Defaults to the value of the cluster's drs.iterations setting.

`executeClusterDrsPlan`
- id - ID of the cluster for which the DRS plan is to be executed.
- migrateto - specifies the mapping between a VM and the host to migrate that VM to. Format: migrateto[vm-index].vm=<uuid>&migrateto[vm-index].host=<uuid>. An illustrative parameter map is shown below.
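
For illustration, the indexed migrateto map expands into flat request parameters. The snippet below (placeholder UUIDs only) simply prints the key/value pairs a client would send to executeClusterDrsPlan:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ExecuteClusterDrsPlanParams {
    public static void main(String[] args) {
        // Two VM-to-host mappings for executeClusterDrsPlan; values are placeholder UUIDs.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("id", "<cluster-uuid>");
        params.put("migrateto[0].vm", "<vm-uuid-1>");
        params.put("migrateto[0].host", "<host-uuid-1>");
        params.put("migrateto[1].vm", "<vm-uuid-2>");
        params.put("migrateto[1].host", "<host-uuid-2>");
        params.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```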
### Config Keys Added

| Name | Key | Scope | Default Value | Description |
|------|-----|-------|---------------|-------------|
| ClusterDrsPlanExpireInterval | drs.plan.expire.interval | Global | 30 days | The interval in days after which old DRS records will be cleaned up. |
| ClusterDrsEnabled | drs.automatic.enable | Cluster | false | Enable/disable automatic DRS on a cluster. |
| ClusterDrsInterval | drs.automatic.interval | Cluster | 60 minutes | The interval in minutes after which a periodic background thread will schedule DRS for a cluster. |
| ClusterDrsIterations | drs.max.migrations | Cluster | 50 | Maximum number of live migrations in a DRS execution. |
| ClusterDrsAlgorithm | drs.algorithm | Cluster | condensed | DRS algorithm to execute on the cluster. This PR implements two algorithms - balanced & condensed. |
| ClusterDrsLevel | drs.imbalance | Cluster | 0.5 | Percentage (as a value between 0.0 and 1.0) of imbalance allowed in the cluster. 1.0 means no imbalance is allowed and 0.0 means imbalance is allowed. |
| ClusterDrsMetric | drs.imbalance.metric | Cluster | memory | The cluster imbalance metric to use when checking the drs.imbalance threshold. Possible values are memory and cpu. |
This PR adds new functionality to copy snapshots across zones and to take snapshots for multiple zones.
The copy functionality is similar to template copy: the source zone acts as the web server from which the destination zone(s) can download the snapshot files. For this purpose, a new API - `copySnapshot` - has been added. The copySnapshot response returns the zone and download details from the first destination zone of the request. This behaviour is similar to the `copyTemplate` API.
In a similar manner, multiple zones can be selected while taking snapshots or creating snapshot policies. In this case, the snapshot is taken in the base zone (in which the volume is present) and then copied to the additional zones. A new parameter - `zoneids` - has been added to the `createSnapshot` and `createSnapshotPolicy` APIs.
As snapshots can be present in multiple zones (secondary stores), a new parameter `zoneid` has been added to delete the snapshot copy in a specific zone.
`listSnapshots` API has been updated to allow listing snapshot entries for different zones/datastores. New parameters - `showUnique`, `locationType` have been added.
Events generated during snapshot operations will now be linked to the snapshot itself rather than the volume of the snapshot.
`listSnapshotPolicies` and `createSnapshotPolicy` APIs will return zone details of the zones in which backup will be scheduled for the policy.
----
New API added
`copySnapshot`
Request and response params updated for APIs
```
- listSnapshots
- deleteSnapshot
- createTemplate
- listZones
- listSnapshotPolicies
- createSnapshotPolicy
```
UI updated for
- Snapshot detail view
- Create snapshot form
- Create snapshot policy form
- Create volume (from snapshot) form
- Create template (from snapshot) form
Doc PR: https://github.com/apache/cloudstack-documentation/pull/344
PR: https://github.com/apache/cloudstack/pull/7873
Making a diskful resource was meant as an optimization,
but it cannot work on non-hyperconverged setups,
as the storage nodes (diskful) are not part of the CloudStack cluster.
A TODO was overlooked and never implemented,
which could trigger the following bug:
if Linstor didn't create a resource (diskless or diskful) on
the node chosen by CloudStack, it would not be able to copy the
template data there; it even seems that no error was
triggered and the new template file silently became
empty/corrupt.
This PR allows an admin to reserve some hypervisor host CPUs for system use. Another way to think of it is limiting the number of CPUs allocatable to VMs. This can be useful if the admin wants to do other things with the hypervisor's CPUs, for example reserving some cores for running hyperconverged storage processes.
Co-authored-by: Marcus Sorensen <mls@apple.com>
* Trigger out of band VM state update via libvirt event when VM stops
* Add License headers, refactor nested try
---------
Co-authored-by: Marcus Sorensen <mls@apple.com>
* Fix style for LibvirtComputingResource variable names and its dependencies
* More variable name fixes
---------
Co-authored-by: Marcus Sorensen <mls@apple.com>
This fixes the following cases in which the Solidfire storage integration
caused issues when using Solidfire data disks with VMware:
1. Take Volume Snapshot of Solidfire data disk
2. Delete an active Instance with Solidfire data disk attached
3. Attach used existing Solidfire data disk to a running/stopped VM
4. Stop and Start an instance with Solidfire data disks attached
5. Expand disk by resizing Solidfire data disk by providing size
6. Expand disk by changing disk offering for the Solidfire data disk
Additional changes:
- Use VMFS6 as managed datastore type if the host supports it
- Refactor detection and splitting of managed storage ds name in storage
processor
- Restrict storage rescanning for managed datastore when resizing
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* VMware: add support for 8.0b (8.0.0.2)
* VMware 8: add new guest os mappings in VirtualMachineGuestOsIdentifier
The full list can be found at https://developer.vmware.com/apis/1355/vsphere
* VMware: get guest os mappings of parent version
* VMware8: remove guest os mappings for 8.0.0.2
* VMware8: fix code smells
* vmware: remove annotations in VmwareVmImplementerTest which caused 0.0% code coverage
* VMware8: add a unit test case
* VMware: add support for 8.0c (8.0.0.3)
* VMware8: move to CloudStackVersion.getVMwareParentVersion
* VMware: add support for 8.0u1 (8.0.1.0)
* Copy engine/schema/src/main/java/com/cloud/upgrade/GuestOsMapper.java from PR 6979
* Copy engine/schema/src/main/java/com/cloud/storage/dao/GuestOSHypervisorDao.java from PR 6979
* VMware: ignore the last number in VMware versions
* VMware: copy guest os mapping from 8.0 to 8.0.1
* VMware: add unit tests in VmwareVmImplementerTest.java
* Copy engine/schema/src/test/java/com/cloud/upgrade/GuestOsMapperTest.java from PR 6979
* VMware8: retry vm poweron if fails due to exception "File system specific implementation of Ioctl[file] failed"
This fixes a weird issue on VMware 8. When powering on a VM, it sometimes fails with the error
2023-04-27 07:04:43,207 ERROR [c.c.h.v.r.VmwareResource] (DirectAgent-442:ctx-cdd42b03 10.0.32.133, job-105/job-106, cmd: StartCommand) (logid:8a24a607) StartCommand failed due to [Exception: java.lang.RuntimeException
Message: File system specific implementation of Ioctl[file] failed
].
java.lang.RuntimeException: File system specific implementation of Ioctl[file] failed
at com.cloud.hypervisor.vmware.util.VmwareClient.waitForTask(VmwareClient.java:426)
at com.cloud.hypervisor.vmware.mo.VirtualMachineMO.powerOn(VirtualMachineMO.java:288)
in vmware.log on the ESXi host, it shows
2023-04-27T09:20:41.713Z In(05)+ vmx - Power on failure messages: File system specific implementation of Ioctl[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of Ioctl[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of LookupAndOpen[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of LookupAndOpen[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of LookupAndOpen[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of LookupAndOpen[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of LookupAndOpen[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of LookupAndOpen[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of LookupAndOpen[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of LookupAndOpen[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of LookupAndOpen[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of LookupAndOpen[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of LookupAndOpen[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of LookupAndOpen[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of Ioctl[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - File system specific implementation of Ioctl[file] failed
2023-04-27T09:20:41.713Z In(05)+ vmx - Failed to lock the file
2023-04-27T09:20:41.713Z In(05)+ vmx - Cannot open the disk '/vmfs/volumes/7b29c876-ac102328/i-2-167-VM/ROOT-167.vmdk' or one of the snapshot disks it depends on.
2023-04-27T09:20:41.713Z In(05)+ vmx - Module 'Disk' power on failed.
2023-04-27T09:20:41.713Z In(05)+ vmx - Failed to start the virtual machine.
There is a KB article for it, but I still do not know why it happens or how to fix it.
https://kb.vmware.com/s/article/1004232
* VMware: extract to method powerOnVM
* vmware: fix mistake in logs
* vmware8: use curl instead of wget to fix test failures
Traceback (most recent call last):
File "/root/test_internal_lb.py", line 555, in test_01_internallb_roundrobin_1VPC_3VM_HTTP_port80
self.execute_internallb_roundrobin_tests(vpc_offering)
File "/root/test_internal_lb.py", line 641, in execute_internallb_roundrobin_tests
client_vm, applb.sourceipaddress, max_http_requests)
File "/root/test_internal_lb.py", line 497, in run_ssh_test_accross_hosts
(e, clienthost.public_ip))
AssertionError: list index out of range: SSH failed for VM with IP Address: 10.0.52.187
and
sshClient: DEBUG: {Cmd: /usr/bin/wget -T3 -qO- --user=admin --password=password http://10.1.2.253:8081/admin?stats via Host: 10.0.52.188} {returns: ["/usr/bin/wget: '/usr/lib/libpcre.so.1' is not an ELF file", "/usr/bin/wget: can't load library 'libpcre.so.1'"]}
* VMware: correct guest OS names in hypervisor mappings for VMware 8.0
el9 and variants were introduced by https://github.com/apache/cloudstack/pull/7059
they are supported with guest os identifiers since VMware 8.0
see https://vdc-repo.vmware.com/vmwb-repository/dcr-public/c476b64b-c93c-4b21-9d76-be14da0148f9/04ca12ad-59b9-4e1c-8232-fd3d4276e52c/SDK/vsphere-ws/docs/ReferenceGuide/vim.vm.GuestOsDescriptor.GuestOsIdentifier.html
* VMware: add Ubuntu 20.04 and 22.04 support for vmware 7.0+
* PR7380: only add guest os mappings for Ubuntu 20.04
* PR7380: Correct RHEL9 guest os names and others for VMware 8.0
* PR7380: correct guest os names on 8.0.0.1 as well
* PR7380: remove Windows 12 and Windows Server 2025 which are not released yet
### Description
Design document: https://cwiki.apache.org/confluence/display/CLOUDSTACK/%5BDRAFT%5D+Minimal+changes+to+allow+new+dynamic+hypervisor+type%3A+Custom+Hypervisor
This PR introduces the minimal changes to add a new hypervisor type (internally named Custom in the codebase, with a configurable display name), allowing an external hypervisor plugin to be written as a Custom Hypervisor for CloudStack.
The custom hypervisor name is set by the setting 'hypervisor.custom.display.name'. The new hypervisor type does not affect the behaviour of any CloudStack operation; it simply introduces a new hypervisor type into the system.
CloudStack does not have any means to dynamically add new hypervisor types. The hypervisor types are internally preset by an enum defined within the CloudStack codebase, and unless a new version supports a new hypervisor, it is not possible to add a host of a hypervisor that is not part of the enum. It is, however, possible to implement minimal changes in CloudStack to support a new hypervisor plugin that may be developed privately.
This PR is initial work on allowing new dynamic hypervisor types (it adds a new element to the HypervisorType enum, but allows a variable display name for the hypervisor).
##### Proposed Future work:
Replace the HypervisorType from a fixed enum to an extensible registry mechanism, registered from the hypervisor plugin
#### Feature Specifications
- The new hypervisor type is internally named 'Custom' to the CloudStack services (management server and agent services, database records).
- A new global setting 'hypervisor.custom.display.name' allows administrators to set the display name of the hypervisor type. The display name will be shown in the CloudStack UI and API.
- In case the 'hypervisor.list' setting contains the display name of the new hypervisor type, the setting value is automatically updated after the 'hypervisor.custom.display.name' setting is updated.
- The new Custom hypervisor type supports:
- Direct downloads (the ability to download templates into primary storage from the hypervisor hosts without using secondary storage)
- Local storage (use hypervisor hosts local storage as primary storage)
- Template format: RAW format (the templates to be registered on the new hypervisor type must be in RAW format)
- The UI is also extended to display the new hypervisor type and the supported features listed above.
- The above are the minimal changes for CloudStack to support the new hypervisor type, which can be tested by integrating the plugin codebase with this feature.
#### Use cases
This PR allows cloud administrators to test custom hypervisor plugin implementations and easily integrate them into CloudStack as a new hypervisor type ("Custom"), reducing the implementation effort to only the hypervisor-specific storage/networking support and the hypervisor resource that communicates with the management server.
- CloudStack admin should be able to create a zone for the new custom hypervisor and add clusters, hosts into the zone with normal operations
- CloudStack users should be able to execute normal VMs/volumes/network/storage operations on VMs/volumes running on the custom hypervisor hosts
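
As a purely illustrative sketch of the display-name handling described above (HypervisorType.Custom and the 'hypervisor.custom.display.name' setting come from this PR; the resolver class and method below are hypothetical, not part of the change):

```java
// Hypothetical illustration: map the configured display name back to the internal
// Custom hypervisor type. HypervisorType.Custom is introduced by this PR; this
// resolve helper itself is not.
import com.cloud.hypervisor.Hypervisor.HypervisorType;

public class CustomHypervisorNameResolver {
    private final String customDisplayName; // value of 'hypervisor.custom.display.name'

    public CustomHypervisorNameResolver(String customDisplayName) {
        this.customDisplayName = customDisplayName;
    }

    public HypervisorType resolve(String name) {
        // The API/UI may present the configured display name; internally the type is always Custom.
        if (name != null && name.equalsIgnoreCase(customDisplayName)) {
            return HypervisorType.Custom;
        }
        return HypervisorType.getType(name);
    }
}
```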
Steps to reproduce the issue:
- git clone https://github.com/apache/cloudstack.git
- cd cloudstack
- rm -rf .git/
- run `mvn -P developer,systemvm clean install`
Without this PR, it fails with the following error:
```
> [ 8/10] RUN mvn -Pdeveloper -Dsimulator -DskipTests clean install:
668.1 [ERROR] Failed to execute goal pl.project13.maven:git-commit-id-plugin:4.9.10:revision (get-the-git-infos) on project cloud-plugin-storage-volume-storpool: .git directory is not found! Please specify a valid [dotGitDirectory] in your pom.xml -> [Help 1]
```
* 4.18:
server: remove registered userdata when cleanup an account (#7777)
server: Use max secondary storage defined on the account during upload (#7441)
test: upgrade kubernetes versions to 1.25.0/1.26.0 (#7685)
kvm: Added VNI Devices as normal bridge slave devs (#7836)
noVNC: fix JP keyboard on vmware7+ which uses websocket URL (#7694)
* 4.18:
UI: allow new keys for VM details (#7793)
Refactoring StorPool's smoke tests (#7392)
UI: decode userdata in EditVM dialog (#7796)
packaging: unalias cp before package upgrade (#7722)
make NoopDbUpgrade do a systemvm template check (#7564)
UI unit test: fix expected values (#7792)
* Removed the hardcoded StorPool endpoint from tests
- removed the hardcoded endpoint of StorPool primary storage from tests
- added the git commit information into the maven build
* Convert indents to spaces
* update git-commit-id-plugin version
* 4.18:
UI: Filter templates by zone and hypervisor type when reinstall a VM (#7739)
KVM: fix SSVM starting when overprovisioning memory (#7663)
pom.xml: add property project.systemvm.template.location (#7706)
cloudutils: fix adding rocky9 host failure due to missing /etc/sysconfig/libvirtd (#7779)
server: get id from persisted object ReservationVO (#7785)
search in (too) large result sets (#7766)
ui: fix 404 error when list volumes of system vms (#7772)
packaging: install tzdata-java on centos7/centos8 (#7768)
* 4.18:
SSVM: 'allow from' private IP in other SSVMs if the public IP is in allowed internal sites cidrs (#7288)
eof added to StorPoolStatsCollector (#7754)
* 4.18:
Storage and volumes statistics tasks for StorPool primary storage (#7404)
proper storage construction (#6797)
guarantee MAC uniqueness (#7634)
server: allow migration of all VMs with local storage on KVM (#7656)
Add L2 networks to Zones with SG (#7719)
Currently, ACS can continue to show an imported instance/VM as an unmanaged instance if the name and internalCSName (custom attribute, cloud.vm.internal.name) differ for the instance/VM on vCenter. While filtering managed instances from the instance list received from the ESXi host, this PR also checks whether the internal name of the instance is in the managed instance names list.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
There are tools, such as cluster-api, which create and manage Kubernetes clusters on CloudStack. This PR adds the option to add unmanaged Kubernetes clusters, which are not managed by the CKS plugin. This helps provide a consolidated view of unmanaged clusters in CloudStack. The changes ensure that operations for managed clusters are not executed on unmanaged clusters.
Two new APIs have also been added:
1. addVirtualMachinesToKubernetesCluster - to add VMs to unmanaged clusters.
2. removeVirtualMachinesFromKubernetesCluster - to remove VMs from unmanaged clusters.
Two APIs have been updated:
1. createKubernetesCluster - KUBERNETES_VERSION_ID, SERVICE_OFFERING_ID and SIZE are no longer required for unmanaged clusters. Adds an additional parameter, managed, which is true by default.
2. listKubernetesClusters - adds a managed parameter to filter on the managed field.
Co-authored-by: Pearl Dsilva <pearl1594@gmail.com>
Co-authored-by: dahn <daan.hoogland@gmail.com>
This PR fixes an intermittent issue where the SDC id (local_path) gets deleted and is not repopulated when the host connects back again.
The fix is to remove the code that deletes the records from the storage_pool_host_ref table. The entry is updated anyway if the SDC ID changes during an agent restart, which is required in order to pick up the new connections. The host delete scenario was quickly verified to check the behaviour of the storage_pool_host_ref entries; they are deleted as expected.
* Guest OS mapping improvements
- Checks the OS mapping name in hypervisor (VMware, XenServer)
- Displays guest OS mappings in UI
* Added API getHypervisorGuestOsNames to list the guest OS names in the hypervisor, and code improvements
* Some static analysis fixes
* Removed commented code in listview
* Guest OS list
* UI changes for adding guest os and mappings
* Added guest os mappings in guest os form
* Added new filter to guest os mapping
* Name and description changes
* VMware Host and cluster MO unit tests
* CheckGuestOsMapping command and answer unit tests
* GetHypervisorGuestOsNames command and answer unit tests
* VmwareResource unit tests
* GuestOsMapper unit tests
* icon changes
* Addressed review comments
* Renaming fixes
* Removed comments
* marvin tests for guest os operations
* Added marvin tests for OS mappings
* Document links and UI improvements
* Added deduplication for the list guest OS API
* Fixed linter failure
* Few bug fixes and UI changes
* Few improvements
* Addressed code smells
* Fixed UI issues after rebase
---------
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
Co-authored-by: Harikrishna Patnala <harikrishna.patnala@gmail.com>
Supported Virtual machine operations:
- live migration of VM to another host
- virtual machine snapshots (group snapshot without memory)
- revert VM snapshot
- delete VM snapshot
Supported Volume operations:
- attach/detach volume
- live migrate volume between two StorPool primary storages
- volume snapshot
- delete snapshot
- revert snapshot
* Live storage migration of volume in scaleIO within same storage scaleio cluster
* Added migrate command
* Recent changes of migration across clusters
* Fixed uuid
* recent changes
* Pivot changes
* working blockcopy api in libvirt
* Checking block copy status
* Formatting code
* Fixed failures
* code refactoring and some changes
* Removed unused methods
* removed unused imports
* Unit tests to check if volume belongs to same or different storage scaleio cluster
* Unit tests for volume livemigration in ScaleIOPrimaryDataStoreDriver
* Fixed offline volume migration case and allowed encrypted volume migration
* Added more integration tests
* Support for migration of encrypted volumes across different scaleio clusters
* Fix UI notifications for migrate volume
* Data volume offline migration: save encryption details to destination volume entry
* Offline storage migration for scaleio encrypted volumes
* Allow multiple Volumes to be migrated with migrateVirtualMachineWithVolume API
* Removed unused unittests
* Removed duplicate keys in migrate volume vue file
* Fix Unit tests
* Add volume secrets during volume migrations if they do not exist; secrets get cleared on package upgrades
* Fix secret UUID for encrypted volume migration
* Added a null check for secret before removing
* Added more unit tests
* Fixed passphrase check
* Add image options to the encrypted volume conversion
This change allows CKS to be enabled by default on new installs. It does not affect server behaviour or performance, but helps highlight Kubernetes support in CloudStack.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
JaCoCo, used for code coverage calculation in the project, doesn't support PowerMockito classes.
This PR attempts to reduce the usage of PowerMockito.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
When a host is not tagged, its maintenance status is reported in the
cloudstack_hosts_total metric: maintenance_enabled is OFFLINE,
maintenance_disabled is ONLINE.
When a host is tagged, its maintenance status is now also verified to
ensure consistent behaviour.
In the Prometheus exporter, the maintenance status is not checked for the cloudstack_hosts_total_by_tag metric, while it is checked for the cloudstack_hosts_total metric.
Whether hosts are classified by tag or not, the metrics should be the same.
Fixes: #7470
This PR adds two VM settings for user VMs on KVM:
- NIC multiqueue number
- packed virtqueues enabled. Optional values are true and false (false by default). It requires qemu >= 4.2.0 and libvirt >= 6.3.0.
Tested OK on Ubuntu 22 and Rocky 8.4.
This PR fixes #7362 and also updates other search criteria to use the name as an exact match where a keyword parameter is also present.
UI changes were made for the roles search to use keyword instead of name.
* Auto Enable Disable KVM hosts
* Improve health check result
* Fix corner cases
* Script path refactor
* Fix sonar cloud reports
* Fix last code smells
* Add marvin tests
* Fix new line on agent.properties to prevent host add failures
* Send alert on auto-enable-disable and add annotations when the setting is enabled
* Address reviews
* Add a reason for enabling or disabling a host when the automatic feature is enabled
* Fix comment on the marvin test description
* Fix for disabling the feature if the admin has manually updated the host resource state before any health check result
There are multiple ways in which a SAML response can be formatted, especially when encryption is enabled. This PR removes the hardcoded EncryptedKeyResolver (InlineEncryptedKeyResolver) in favor of a ChainingEncryptedKeyResolver, which tries multiple resolvers. It preserves InlineEncryptedKeyResolver as the first option but adds EncryptedElementTypeEncryptedKeyResolver to the chain of resolvers to try.
ChainingEncryptedKeyResolver is a bit finicky in that you can't provide it a list of resolvers, you can only fetch its internal list and add to it.
Theoretically we could add all of the resolver types to the chain, but for now just preserving the ones known to be in use.
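
A rough sketch of the resulting chain, assuming the OpenSAML 2.x classes used by the SAML plugin (the surrounding Decrypter wiring is omitted):

```java
import org.opensaml.saml2.encryption.EncryptedElementTypeEncryptedKeyResolver;
import org.opensaml.xml.encryption.ChainingEncryptedKeyResolver;
import org.opensaml.xml.encryption.InlineEncryptedKeyResolver;

public class SamlKeyResolverChain {
    public static ChainingEncryptedKeyResolver build() {
        ChainingEncryptedKeyResolver chain = new ChainingEncryptedKeyResolver();
        // ChainingEncryptedKeyResolver exposes its internal list rather than accepting one
        // in the constructor, so resolvers are appended to it directly.
        chain.getResolverChain().add(new InlineEncryptedKeyResolver());               // previous behaviour, tried first
        chain.getResolverChain().add(new EncryptedElementTypeEncryptedKeyResolver()); // added by this PR
        return chain;
    }
}
```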
Co-authored-by: Marcus Sorensen <mls@apple.com>
* 4.17:
api: fix new password is applied on host when update host password with update_passwd_on_host=false (#7092)
CKS: remove details when delete a cks cluster (#7104)
api/server: add project id/name in ssh keypair response (#7100)
Fixes #6983
When multiple command classes exist for an API, ApiServer returns the command class for the User role only when ResponseView is set to Restricted in the annotation.
This PR sets the Restricted ResponseView for the ListVMsMetrics class. It also adds a smoke test with a User role account for the listVirtualMachinesMetrics API.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
PR #5774 introduced a functionality to allow operators to choose in which Veeam's repository, if more than one is configured, ACS' clone job will be executed. However, a change was missing in the PR and caused the errors reported in #6599.
This PR addresses the fix for #6599.
Co-authored-by: SadiJr <sadi@scclouds.com.br>
Fixes #6786
listVirtualMachinesMetrics does not support some of the params that are supported by the admin API call listVirtualMachines.
These parameters are used in UI.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohityadav89@gmail.com>
Co-authored-by: Daan Hoogland <daan@onecht.net>
* Cleanup in the javadocs of QemuImg
* Update QemuImg.java
* Apply suggestions from code review
Co-authored-by: Stephan Krug <stekrug@icloud.com>
Co-authored-by: cloudstack-lab-gabriel <gabriel.fernandes@scclouds.com.br>
Co-authored-by: Stephan Krug <stekrug@icloud.com>
* Add quota plugin to account/domain scope
* Add check in quota usage calculation to skip accounts with quota disabled
* Set quota config enabled default to true
* Fix if condition
* Update condition to use primitive boolean expression
Co-authored-by: dahn <daan.hoogland@gmail.com>
* Remove unused var
* Add quota state as a column in the Quota Summary view
* Remove trailing spaces
* Address review
Co-authored-by: dahn <daan.hoogland@gmail.com>
This PR has 3 improvements for the Linstor primary storage driver:
- Create a separate jar for it and move all Linstor-related classes into the correct project (similar to the StorPool plugin)
- Add aux properties for CloudStack volumes in Linstor to make it easier to identify them in Linstor
- Add support for IOPS settings with the Linstor storage plugin
This PR fixes the issue that volume snapshot fails on RBD storage with the following error
qemu-img: Could not open 'driver=raw,file.filename=rbd:cloudstack/test_wei.img:mon_host=10.0.32.254:auth_supported=cephx:id=cloudstack:key=AQDwHTNjjHXRKRAAJb+AToFr6x4a1AvKUc4Ksg==:rbd_default_format=2:client_mount_timeout=30': Could not open 'rbd:cloudstack/test_wei.img:mon_host=10.0.32.254:auth_supported=cephx:id=cloudstack:key=AQDwHTNjjHXRKRAAJb+AToFr6x4a1AvKUc4Ksg==:rbd_default_format=2:client_mount_timeout=30': No such file or directory
However, it works without using image options.
Therefore, do not pass the image options if the image format is neither QCOW2 nor LUKS.
This PR fixes #6739 (for PowerFlex/ScaleIO only; Datera still needs to be addressed), which can occur if the last host the VM ran on is deleted from CloudStack. At the point the VM is deleted, CloudStack attempts to make a final call to revoke access to volumes, passing the last host the VM ran on. If this host is gone, we get an error and are unable to delete the VM.
It's possible that there is a more holistic fix, by identifying all of the places where revokeAccess() is called and checking for a null host. It's also possible that other storage plugins don't even need host information to revoke access to volumes yet still rely on this call being made. Therefore this fix is applied only to the ScaleIOPrimaryDataStoreDriver, skipping the access revocation when there is no host to revoke access for; this should also protect us when a new part of the code uses revokeAccess() in the future.
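
Conceptually the guard looks like this (a self-contained sketch with stand-in types, not the actual CloudStack interfaces):

```java
// Sketch of the null-host guard added for ScaleIO/PowerFlex; the interfaces below are
// simplified stand-ins so the example is self-contained.
public class RevokeAccessGuardSketch {

    interface HostLike { String getUuid(); }
    interface VolumeLike { String getUuid(); }

    void revokeAccess(VolumeLike volume, HostLike lastHost) {
        if (lastHost == null) {
            // The last host the VM ran on was deleted from CloudStack; there is
            // nothing to unmap on the storage side, so skip instead of failing.
            System.out.println("Skipping access revocation for volume " + volume.getUuid()
                    + ": no host to revoke access for");
            return;
        }
        // ... existing logic that unmaps the volume from the host's SDC ...
    }
}
```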
Signed-off-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Marcus Sorensen <mls@apple.com>
* Export count of total/up/down hosts by tags
* Export count of vms by state and host tag.
* Add host tags to host cpu/cores/memory usage in Prometheus exporter
* Cloudstack Prometheus exporter: Add allocated capacity group by host tag.
* Show count of Active domains on grafana.
* Show count of Active accounts and vms by size on grafana
* Use a prepared statement to query the database for the number of VMs that use a specific tag.
* Extract repeated codes to new methods.
This PR introduces a feature designed to allow CloudStack to manage a generic volume encryption setting. The encryption is handled transparently to the guest OS, and is intended to handle VM guest data encryption at rest and possibly over the wire, though the actual encryption implementation is up to the primary storage driver.
In some cases cloud customers may still prefer to maintain their own guest-level volume encryption, if they don't trust the cloud provider. However, for private cloud cases this greatly simplifies the guest OS experience in terms of running volume encryption for guests without the user having to manage keys, deal with key servers and guest booting being dependent on network connectivity to them (i.e. Tang), etc, especially in cases where users are attaching/detaching data disks and moving them between VMs occasionally.
The feature can be thought of as having two parts - the API/control plane (which includes scheduling aspects), and the storage driver implementation.
This initial PR adds the encryption setting to disk offerings and service offerings (for root volume), and implements encryption support for KVM SharedMountPoint, NFS, Local, and ScaleIO storage pools.
NOTE: While not required, operations can be significantly sped up by ensuring that hosts have the `rng-tools` package and service installed and running on the management server and hypervisors. For EL hosts the service is `rngd` and for Debian it is `rng-tools`. In particular, the use of SecureRandom for generating volume passphrases can be slow if there isn't a good source of entropy. This could affect testing and build environments, and otherwise would only affect users who actually use the encryption feature. If you find tests or volume creates blocking on encryption, check this first.
### Management Server
##### API
* createDiskOffering now has an 'encrypt' Boolean
* createServiceOffering now has an 'encryptroot' Boolean. The 'root' suffix is added here in case there is ever any other need to encrypt something related to the guest configuration, like the RAM of a VM. This has been refactored to deal with the new separation of service offering from disk offering internally.
* listDiskOfferings shows encryption support on each offering, and has an encrypt boolean to choose to list only offerings that do or do not support encryption
* listServiceOfferings shows encryption support on each offering, and has an encrypt boolean to choose to list only offerings that do or do not support encryption
* listHosts now shows encryption support of each hypervisor host via `encryptionsupported`
* Volumes themselves don't show encryption on/off, rather the offering should be referenced. This follows the same pattern as other disk offering based settings such as the IOPS of the volume.
##### Volume functions
A decent effort has been made to ensure that the most common volume functions have either been cleanly supported or blocked. However, for the first release it is advised to mark this feature as *experimental*, as the code base is complex and there are certainly edge cases to be found.
Many of these features could eventually be supported over time, such as creating templates from encrypted volumes, but the effort and size of the change is already overwhelming.
Supported functions:
* Data Volume create
* VM root volume create
* VM root volume reinstall
* Offline volume snapshot/restore
* Migration of VM with storage (e.g. local storage VM migration)
* Resize volume
* Detach/attach volume
Blocked functions:
* Online volume snapshot
* VM snapshot w/memory
* Scheduled snapshots (would fail when VM is running)
* Disk offering migration to offerings that don't have matching encryption
* Creating template from encrypted volume
* Creating volume from encrypted volume
* Volume extraction (would we decrypt it first, or expose the key? Probably the former).
##### Primary Storage Support
For storage developers, adding encryption support involves:
1. Updating the `StoragePoolType` for your primary storage to advertise encryption support. This is used during allocation of storage to match storage types that support encryption to storage that supports it.
2. Implementing encryption feature when your `PrimaryDataStoreDriver` is called to perform volume lifecycle functions on volumes that are requesting encryption. You are free to do what your storage supports - this could be as simple as calling a storage API with the right flag when creating a volume. Or (as is the case with the KVM storage types), as complex as managing volume details directly at the hypervisor host. The data objects passed to the storage driver will contain volume passphrases, if encryption is requested.
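
A hypothetical sketch of the driver-side branch described in point 2 (the accessor names and flow are assumptions for illustration, not the exact CloudStack API):

```java
// Hypothetical sketch of how a PrimaryDataStoreDriver might branch on an encryption
// request during volume creation. The accessors (getPassphrase, getEncryptFormat)
// and the surrounding flow are assumptions, not the real interfaces.
public class EncryptionAwareDriverSketch {

    interface VolumeLike {
        byte[] getPassphrase();     // set only when the disk offering requests encryption
        String getEncryptFormat();  // e.g. "LUKS", recorded by the implementer
    }

    void createVolume(VolumeLike volume) {
        if (volume.getPassphrase() != null && volume.getPassphrase().length > 0) {
            // Storage-specific path: e.g. pass an "encrypted" flag to the array API,
            // or send the command to a host that advertises encryption support so it
            // can run qemu-img/cryptsetup with the passphrase.
            createEncryptedVolume(volume);
        } else {
            createPlainVolume(volume);
        }
    }

    void createEncryptedVolume(VolumeLike volume) { /* storage specific */ }
    void createPlainVolume(VolumeLike volume) { /* storage specific */ }
}
```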
##### Scheduling
For the KVM implementations specified above, we are dependent on the KVM hosts having support for volume encryption tools. As such, the host's `StartupRoutingCommand` has been modified to advertise whether the host supports encryption. This is done via a probe during agent startup to look for functioning `cryptsetup` and support in `qemu-img`. This is also visible via the listHosts API and the host details in the UI. This was patterned after other features that require hypervisor support, such as UEFI.
The `EndPointSelector` interface and `DefaultEndpointSelector` have had new methods added, which allow the caller to ask for endpoints that support encryption. This can be used by storage drivers to find the proper hosts to send storage commands that involve encryption. Not all volume activities will require a host to support encryption (for example a snapshot backup is a simple file copy), and this is the reason why the interface has been modified to allow for the storage driver to decide, rather than just passing the data objects to the EndpointSelector and letting the implementation decide.
VM scheduling has also been modified. When a VM start is requested, if any volume that requires encryption is attached, it will filter out hosts that don't support encryption.
##### DB Changes
A volume whose disk offering enables encryption will get a passphrase generated for it before its first use. This is stored in the new 'passphrase' table, and is encrypted using the CloudStack installation's standard configured DB encryption. A field has been added to the volumes table, referencing this passphrase, and a foreign key added to ensure passphrases that are referenced can't be removed from the database. The volumes table now also contains an encryption format field, which is set by the implementer of the encryption and used as it sees fit.
#### KVM Agent
For the KVM storage pool types supported, the encryption has been implemented at Qemu itself, using the built-in LUKS storage support. This means that the storage remains encrypted all the way to the VM process, and decrypted before the block device is visible to the guest. This may not be necessary in order to implement encryption for /your/ storage pool type, maybe you have a kernel driver that decrypts before the block device on the system, or something like that. However, it seemed like the simplest, common place to terminate the encryption, and provides the lowest surface area for decrypted guest data.
For qcow2 based storage, `qemu-img` is used to set up a qcow2 file with LUKS encryption. For block based (currently just ScaleIO storage), the `cryptsetup` utility is used to format the block device as LUKS for data disks, but `qemu-img` and its LUKS support is used for template copy.
Any volume that requires encryption will contain a passphrase ID as a byte array when handed down to the KVM agent. Care has been taken to ensure this doesn't get logged, and it is cleared after use in an attempt to avoid exposing it before garbage collection occurs. On the agent side, this passphrase is used in two ways:
1. In cases where the volume experiences some libvirt interaction it is loaded into libvirt as an ephemeral, private secret and then referenced by secret UUID in any libvirt XML. This applies to things like VM startup, migration preparation, etc.
2. In cases where `qemu-img` needs to use this passphrase for volume operations, it is written to a `KeyFile` on the cloudstack agent's configured tmpfs and passed along. The `KeyFile` is a `Closeable` and when it is closed, it is deleted. This allows us to try-with-resources any volume operations and get the KeyFile removed regardless.
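
The try-with-resources pattern described in point 2 looks roughly like this (a sketch: the KeyFile shown is a stand-in for the real agent-side helper, and its constructor is an assumption):

```java
// Sketch only: KeyFile below is a stand-in for the agent-side helper described above;
// the constructor (temp directory + passphrase bytes) is an assumption for illustration.
import java.io.IOException;
import java.nio.file.Path;

public class KeyFileUsageSketch {

    static class KeyFile implements AutoCloseable {
        KeyFile(Path tmpDir, byte[] passphrase) throws IOException { /* writes passphrase to tmpfs */ }
        Path getPath() { return null; /* path handed to qemu-img as a secret object */ }
        @Override public void close() throws IOException { /* deletes the file */ }
    }

    void resizeEncryptedVolume(Path tmpDir, byte[] passphrase) throws IOException {
        // The key file only exists for the duration of the qemu-img operation and is
        // removed when the try block exits, whether the operation succeeds or fails.
        try (KeyFile keyFile = new KeyFile(tmpDir, passphrase)) {
            runQemuImgWithSecret(keyFile.getPath());
        }
    }

    void runQemuImgWithSecret(Path keyFilePath) { /* qemu-img ... --object secret,file=<keyFilePath> */ }
}
```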
In order to support the advanced syntax required to handle encryption and passphrases with `qemu-img`, the `QemuImg` utility has been modified to support the new `--object` and `--image-opts` flags. These are modeled as `QemuObject` and `QemuImageOptions`. These `qemu-img` flags have been designed to supersede some of the existing, older flags being used today (such as choosing file formats and paths), and an effort could be made to switch over to these wholesale. However, for now we have instead opted to keep existing functions and do some wrapping to ensure backward compatibility, so callers of `QemuImg` can choose to use either way.
It should be noted that there are also a few different Enums that represent the encryption format for various purposes. While these are analogous in principle, they represent different things and should not be confused. For example, the supported encryption format strings for the `cryptsetup` utility has `LuksType.LUKS` while `QemuImg` has a `QemuImg.PhysicalDiskFormat.LUKS`.
Some additional effort could potentially be made to support advanced encryption configurations, such as choosing between LUKS1 and LUKS2 or changing cipher details. These may require changes all the way up through the control plane. However, in practice Libvirt and Qemu currently only support LUKS1 today. Additionally, the cipher details aren't required in order to use an encrypted volume; as they're stored in the LUKS header on the volume, there is no need to store them elsewhere. As such, we need only set the one encryption format upon volume creation, which is persisted in the volumes table and then available later as needed. In the future when LUKS2 is standard and fully supported, we could move to it as the default and old volumes will still reference LUKS1 and have the headers on-disk to ensure they remain usable. We could also possibly support an automatic upgrade of the headers down the road, or a volume migration mechanism.
Every version of cryptsetup and qemu-img tested on variants of EL7 and Ubuntu that support encryption use the XTS-AES 256 cipher, which is the leading industry standard and widely used cipher today (e.g. BitLocker and FileVault).
Signed-off-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Marcus Sorensen <mls@apple.com>
Adds an option to provide custom DNS servers for isolated networks, shared networks and VPC tiers.
New API parameters have been added to the createNetwork API, along with the corresponding response parameters.
Doc PR: apache/cloudstack-documentation#276
Fixes #6680
While finding the CPU speed for a KVM host, the following methods will be used in this order:
1. lscpu
2. value in /sys/devices/system/cpu/cpu0/cpufreq/base_frequency
3. virsh capabilities
4. libvirt nodeinfo
This will allow a correct value to be reported for AMD-based hosts when the first two methods don't give a value.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Fixes #6657
Fixes k8s cluster node VM deployment when the underlying hypervisor host has multiple host tags and the service offering used for the cluster deployment does not contain all of those host tags.
This PR provides constructors and the associated changes to use LibvirtVMDef for creating user mode network interfaces.
While this isn't used directly in the CloudStack KVM agent today, it could be used in the future for e.g. pod networking/management networks without needing to assign a pod IP. The VIF driver used by the CloudStack Agent is also pluggable, so this allows plugin code to create user mode network interfaces as well.
Note that the user mode network already exists in the GuestNetType enum, but wasn't usable prior to this change.
Also included is a unit test to ensure we continue to create the expected XML.
Additionally, this uncovered a null pointer on _networkRateKBps, which this PR fixes. The decision to add bandwidth throttling assumed this field was not null and simply checked for > 0.
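
For reference, a user mode NIC definition boils down to the following XML shape. This is a self-contained stand-in, not the real InterfaceDef API added by the PR; only the user guest net type and the general `<interface type='user'>` structure are taken from the description above:

```java
// Self-contained sketch of the intent: a user mode interface definition for libvirt.
// The real LibvirtVMDef.InterfaceDef API added by this PR may differ.
public class UserModeNicSketch {

    enum GuestNetType { BRIDGE, NETWORK, USER } // subset; USER already exists in LibvirtVMDef

    static String defUserNet(GuestNetType type, String macAddress, String model) {
        StringBuilder xml = new StringBuilder();
        xml.append("<interface type='").append(type.name().toLowerCase()).append("'>\n");
        xml.append("  <mac address='").append(macAddress).append("'/>\n");
        xml.append("  <model type='").append(model).append("'/>\n");
        xml.append("</interface>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        // e.g. a management NIC that needs no bridge and no pod IP assignment
        System.out.println(defUserNet(GuestNetType.USER, "02:00:4c:5f:00:01", "virtio"));
    }
}
```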
Signed-off-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Marcus Sorensen <mls@apple.com>
Fixes #4314
A failure in attaching the k8s ISO is seen when VMware DRS is enabled; the log reported that the VM was not found. This fix tries to find the VM on peer hosts when the VM is not found on the given host.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Fixes #6621
Each time getMemStat() is called, a static value is returned. This value
should instead be refreshed to return the actual memory used.
Co-authored-by: Ruben Bosch <ruben.bosch@cldin.eu>
Fixes issue #6465 where listing backup restore points fails with Veeam version v11.0.1.1261.
Though this version is not fully supported for backup and recovery, existing backups and restore points for the VMs can continue to work with Veeam v11.0.1.1261. I've created a separate ticket, #6554, to fully support this version.
Fixes #6455
The default storage adaptor - LibvirtStorageAdaptor - is used by different storage types and doesn't use the @StorageAdaptorInfo annotation. In this case, a storage plugin that wants to adopt one of the predefined storage pool types overrides the default behaviour. Fixing the issue in general (for new storage plugins, or for current ones that want to reuse the existing storage pool types) would affect all volume/snapshot/VM cases and would require extensive testing for each storage plugin, for which we don't have the resources. That's why this patch fixes the old behaviour for SharedMountPoint by adding a new storage pool type for the StorPool plugin.
This PR fixes issue #6209, where the snapshot revert operation fails after certain volume operations like migrate VM with volume / migrate volume / reinstall VM.
The root cause is that after these volume operations, the primary storage entry for the volume gets deleted. The fix fetches the primary datastore entry with respect to the volume and continues the operation.
* add global setting to allow parallel execution on vmware
* cleanup setting distribution for vmware.create.full.clone
* query setting in vmware guru
* don't touch other hypervisors' commands
* guru hierarchy cleanup
Fixes #6514
On the latest systemvm template used for CKS, /usr/sbin is not present in $PATH for the normal user used during upgrade. This leads to a failure of the blkid command. Because of this, during a k8s version upgrade the ISO cannot be mounted on the k8s cluster VMs and the upgrade process is not carried out.
This PR fixes mounting of the k8s version ISO and also makes the script return a failure when ISO mounting fails.
The same failure is not seen during deployment of the cluster because the setup-kube-system workflow is executed as the ROOT user, which has a different value for $PATH.
From /etc/login.defs:
ENV_SUPATH PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
ENV_PATH PATH=/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
This PR fixes issue #6427: a SAML request must be appended to an IdP URL as a query param with an ampersand if the URL already contains a question mark, as opposed to always assuming that IdP URLs don't have any query params.
Google's IdP URL for instance looks like this: https://accounts.google.com/o/saml2/idp?idpid=<ID>, therefore the expected redirect URL would be https://accounts.google.com/o/saml2/idp?idpid=<ID>&SAMLRequest=<SAMLRequest>
This code change is backwards compatible with the current behaviour.
* Fix VMware VM migration with volume in case of local storage
* Break the loop once target host is found
* Code optimisations in getting the target host guid for local storage
* Fixed code smells and added unit test
* Add more logs to migrate VM process in KVM
* Remove unused imports
* Verify if debug is enabled before writing the log string
* Fix conflicts
Co-authored-by: Daniel Augusto Veronezi Salvador <daniel@scclouds.com.br>
Co-authored-by: SadiJr <sadi@scclouds.com.br>
This PR enhances the existing PowerFlex/ScaleIO storage plugin to support a separate (storage) network for the Hosts (KVM)/Storage connection, mainly the SDC (ScaleIO Data Client) connection.
* cks: Fix when deployed on a nw without internet access
* Revert "cks: Fix when deployed on a nw without internet access"
This reverts commit 40e3338001.
* cks: Fix issue when creating cluster in nw without internet access
* Fix extract snapshot from VM snapshot on KVM
* Fix validation expression - does not need to escape the slash
Co-authored-by: GutoVeronezi <daniel@scclouds.com.br>
* login/-out constants
* no request listener
* store session as value, using id as key
* Apply suggestions from sonarcloud.io code review
three instances of unsafe parameters to logging
* new sonar issues
* sonar issues
* kvm: truncate vnc password to 8 chars (#6244)
This PR truncates the vnc password of kvm vms to 8 chars to support latest versions of libvirt.
* Use lang3 string utils
Co-authored-by: Wei Zhou <weizhou@apache.org>
* agent: enable ssl only for kvm agent (not in system vms)
* Revert "agent: enable ssl only for kvm agent (not in system vms)"
This reverts commit b2d76bad2e.
* Revert "KVM: Enable SSL if keystore exists (#6200)"
This reverts commit 4525f8c8e7.
* KVM: Enable SSL if keystore exists in LibvirtComputingResource.java
Fixes: #6346
Move LDAP embedded server dependencies to test scope so they aren't packaged in final management server jar.
Co-authored-by: Marcus Sorensen <mls@apple.com>
* Improve log when live patching fails
* change patching path from /tmp to /var/cache/clou
* add iptable rule for console proxy (novnc)
* temporary template paths
* revert pom xml to original paths