Added caching for ConfigKey value retrievals based on the Caffeine
in-memory caching library.
https://github.com/ben-manes/caffeine
Currently, the cache expiry time is 1 minute, and each update of a
config key invalidates its cache entry.
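A minimal sketch of that behaviour with Caffeine (the load/persist helpers are hypothetical; the actual ConfigKey wiring differs):

```java
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

import java.time.Duration;

public class ConfigValueCache {
    // Entries expire 1 minute after being written.
    private final Cache<String, String> cache = Caffeine.newBuilder()
            .expireAfterWrite(Duration.ofMinutes(1))
            .build();

    public String getValue(String key) {
        // Load from the database on a miss; cached for subsequent calls.
        return cache.get(key, this::loadValueFromDb);
    }

    public void updateValue(String key, String value) {
        persistToDb(key, value);
        // Each update of the config key invalidates the cached entry.
        cache.invalidate(key);
    }

    private String loadValueFromDb(String key) { /* hypothetical helper */ return null; }
    private void persistToDb(String key, String value) { /* hypothetical helper */ }
}
```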
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
- mTLS implementation for cluster service communication
- Listen only on the specified cluster node IP address instead of all interfaces
- Validate that incoming cluster service requests come from peer management servers, based on the server certificate's DNS name, which can be set through the global config ca.framework.cert.management.custom.san (see the sketch after this list)
- Hardening of KVM command wrapper script execution
- Improve API server integration port check
- cloudstack-management.default: do not ship JMX configuration when it is not needed. JMX is used for instrumentation; users who need it should enable it explicitly
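The DNS-name check against the configured SAN can be sketched roughly as below, assuming the peer's X509Certificate has already been extracted from the TLS session (illustrative only; the real validation lives in the CA framework):

```java
import java.security.cert.CertificateParsingException;
import java.security.cert.X509Certificate;
import java.util.Collection;
import java.util.List;

public final class PeerSanValidator {
    private static final int DNS_NAME = 2; // SAN type 2 = dNSName (RFC 5280)

    public static boolean matchesConfiguredSan(X509Certificate cert, String configuredSan)
            throws CertificateParsingException {
        Collection<List<?>> sans = cert.getSubjectAlternativeNames();
        if (sans == null) {
            return false;
        }
        for (List<?> san : sans) {
            // Each entry is [type, value]; accept only a matching dNSName.
            if (((Integer) san.get(0)) == DNS_NAME
                    && configuredSan.equalsIgnoreCase((String) san.get(1))) {
                return true;
            }
        }
        return false;
    }
}
```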
Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Wei Zhou <weizhou@apache.org>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
(cherry picked from commit 4f5561937c)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Mitigation for non-scalable Powerflex/ScaleIO clients
- Added ScaleIOSDCManager to manage SDC connections: it checks the clients limit and prepares/unprepares SDC on the hosts.
- Added commands to prepare and unprepare storage clients, which start and stop the SDC service respectively on the hosts.
- Introduced the config 'storage.pool.connected.clients.limit' at the storage level for client limits, currently supported for PowerFlex only.
* tests issue fixed
* refactor / improvements
* lock on the PowerFlex system id while checking the connections limit
* updated the PowerFlex system id lock to be held until SDC preparation completes (see the sketch after this list)
* Added custom stats support for storage pool, through listStoragePools API
* code improvements, and unit tests
* Update config 'storage.pool.connected.clients.limit' to dynamic, and some improvements
* Stop SDC on host after migration if no volumes mapped to host
* Wait for SDC to connect after scini service start, and some log improvements
* Do not throw exception (log it) when SDC is not connected while revoking access for the powerflex volume
* some log improvements
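An illustrative sketch, not the actual ScaleIOSDCManager, of serializing the limit check and SDC preparation per PowerFlex system id:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

public class SdcConnectionGate {
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    /** Holds the per-systemId lock across the limit check and SDC preparation. */
    boolean prepareSdcIfUnderLimit(String systemId, int connectedClients, int limit) {
        ReentrantLock lock = locks.computeIfAbsent(systemId, id -> new ReentrantLock());
        lock.lock();
        try {
            if (limit > 0 && connectedClients >= limit) {
                return false; // storage.pool.connected.clients.limit reached
            }
            prepareSdc(systemId); // hypothetical: start scini and wait for connection
            return true;
        } finally {
            lock.unlock();
        }
    }

    private void prepareSdc(String systemId) { /* hypothetical helper */ }
}
```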
* kvm: Add support for cgroupv2 (#8252)
1. Problem description
In Apache CloudStack (ACS), when a VM is deployed on a host with the KVM hypervisor, an XML file is created on the assigned host, which has a property shares that defines the weight of the VM for access to the host CPU. The value of this property has no unit; it is a relative measure used to calculate how much CPU a given VM will get on the host. However, this value has a limit, which depends on the version of cgroups used by the host's kernel. The problem lies in the range of valid shares values, which varies between the two versions: [2, 262144] for cgroups version 1, and [1, 10000] for cgroups version 2. Currently, ACS calculates the value of shares using Equation 1, presented below, where CPU is the number of cores and speed is the CPU frequency, both specified in the VM's compute offering. Therefore, if a compute offering has, for example, 6 cores at 2 GHz, the shares value will be 12000 and an exception will be thrown by libvirt if the host uses cgroup v2. The second version is becoming the default in current Linux distributions; thus, it is necessary to address this limitation.
Equation 1
shares = CPU * speed
Fixes: #6744
2. Proposed changes
To address the problem described, we propose to apply a scale conversion considering the max shares of the host. Using the same formula currently utilized by ACS, it is possible to calculate the maximum shares of a VM for a given host. In other words, using the number of cores and the nominal speed of the host's CPU as the upper limit of shares allowed to a VM. Then, this value will be scaled to the allowed interval of [1, 10000] of cgroup v2 by using a linear scale conversion.
The VM shares would be calculated as Equation 2, presented below, where VM requested shares is the requested shares value calculated using Equation 1, cgroup upper limit is fixed with a value of 10000 (cgroups v2 upper limit), and host max shares is the maximum shares value of the host, calculated using Equation 1. Using Equation 2, the only case where a VM passes the cgroup v2 limit is when the user requests more resources than the host has, which is not possible with the current implementation of ACS.
Equation 2
shares = (VM requested shares * cgroup upper limit)/host max shares
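As a worked example of the two equations (the host figures here are illustrative):

```java
public class SharesScaler {
    private static final int CGROUP_V2_UPPER_LIMIT = 10000;

    /** Equation 1: shares = CPU * speed. */
    static int requestedShares(int cpuCores, int speedMhz) {
        return cpuCores * speedMhz;
    }

    /** Equation 2: scale the requested shares into cgroup v2's [1, 10000] range. */
    static int scaledShares(int vmRequestedShares, int hostMaxShares) {
        return (vmRequestedShares * CGROUP_V2_UPPER_LIMIT) / hostMaxShares;
    }

    public static void main(String[] args) {
        int vmShares = requestedShares(6, 2000);       // 6 cores at 2 GHz -> 12000
        int hostMaxShares = requestedShares(32, 2000); // host: 32 cores at 2 GHz -> 64000
        System.out.println(scaledShares(vmShares, hostMaxShares)); // 1875, within [1, 10000]
    }
}
```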
To implement the proposal, the following APIs will be updated: deployVirtualMachine, migrateVirtualMachine and scaleVirtualMachine. When a VM is being deployed, a new verification will be added to find a suitable host: the max shares of each host will be calculated, and the VM's calculated shares will be checked to verify that they do not surpass the host's value. Likewise, VM migration will have a similar new verification. Lastly, VM scaling will apply the same verification to the VM's host.
To determine the max shares of a given host, we will use the same equation currently used in ACS for calculating the shares of VMs, presented in Section 1. When Equation 1 is used to determine the maximum shares of a host, CPU is the number of cores of the host, and speed is the nominal CPU speed, i.e., considering the CPU's base frequency.
It is important to note that these changes are only for hosts with the KVM hypervisor using cgroup v2 for now.
* Update overcommit ratio during live VM migration
* minor refactoring
---------
Co-authored-by: Bryan Lima <42067040+BryanMLima@users.noreply.github.com>
This PR introduces the functionality of purging removed DB entries for CloudStack entities (currently only for VirtualMachine).
There would be three mechanisms for purging removed resources:
- Background task - CloudStack will run a background task which runs at a defined interval. Other parameters for this task can be controlled with new global settings.
- API - New API `purgeExpungedResources`. It will allow passing the following parameters - resourcetype, batchsize, startdate, enddate
- Config for service offering. Service offerings can be created with purgeresources parameter which would allow purging resources immediately on expunge.
The following new global settings have been added:
- `expunged.resources.purge.enabled`: Default: false. Whether to run a background task to purge the DB records of the expunged resources.
- `expunged.resources.purge.resources`: Default: (empty). A comma-separated list of resource types that will be considered by the background task to purge the DB records of the expunged resources. Currently only VirtualMachine is supported. An empty value will result in considering all resource types for purging.
- `expunged.resources.purge.interval`: Default: 86400. Interval (in seconds) for the background task to purge the DB records of the expunged resources.
- `expunged.resources.purge.delay`: Default: 300. Initial delay (in seconds) to start the background task to purge the DB records of the expunged resources task.
- `expunged.resources.purge.batch.size`: Default: 50. Batch size to be used during purging of the DB records of the expunged resources.
- `expunged.resources.purge.start.time`: Default: (empty). Start time to be used by the background task to purge the DB records of the expunged resources. Use format `yyyy-MM-dd` or `yyyy-MM-dd HH:mm:ss`.
- `expunged.resources.purge.keep.past.days`: Default: 30. The number of days before the background task's execution time for which the DB records of expunged resources must not be purged. To allow purging DB records of expunged resources right up to the task's execution time, set the value to zero (see the sketch after this list).
- `expunged.resource.purge.job.delay`: Default: 180. Delay (in seconds) to execute the purging of the DB records of an expunged resource initiated by the configuration in the offering. The minimum value is 180 seconds; if a lower value is set, the minimum value will be used.
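As a rough sketch of how the keep.past.days cutoff could be derived (illustrative names, not the actual implementation):

```java
import java.util.Calendar;
import java.util.Date;

public class PurgeCutoff {
    static Date purgeCutoff(Date taskExecutionTime, int keepPastDays) {
        Calendar cal = Calendar.getInstance();
        cal.setTime(taskExecutionTime);
        cal.add(Calendar.DAY_OF_MONTH, -keepPastDays);
        // Records expunged before this instant are eligible for purging;
        // keepPastDays = 0 allows purging up to the execution time itself.
        return cal.getTime();
    }
}
```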
Upstream PRs:
https://github.com/apache/cloudstack/pull/8999
https://github.com/apache/cloudstack-documentation/pull/397
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
* ScaleIO volume live migration - use usable bytes from source disk to format the destination disk
* Don't abort block copy job when cur,end = 0
* code improvements
* Merge two HostTagVO and HostTagDaoImpl
* Apple FR76: dynamic host tags
* Revert "Apple FR76: dynamic host tags"
This reverts commit 01b93a873f167018c4fafd0744c0de07ae4de4ed.
* Apple FR76: Implicit host tags
* Apple FR76: address Abhishek's comments
* Apple FR76: move updateImplicitTags
* Apple FR76: add since to other two responses
* Update 8929: add unit test in LibvirtComputingResourceTest
* Update variable names
* Update FR76: add explicithosttags in response
* Update FR76 UI: Update explicit host tags
* Update 8929: remove host tags and change labels on UI
* Update: ui polish for host tags
* fix since in responses
* Update 8929: fix UI error if no host tags
Found CPU and DB hotspots in the handling of agent ping commands; these
add idle load when there is a high number of hosts. By design, there
isn't any quick win here. However, the power sync report/handling could
be improved so it doesn't need to kick in for every ping command
received.
A few more areas are marked in the codebase.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Also add a flag to disable on-the-fly metrics computation when the
list metrics APIs for zones and clusters are called.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Add ability to set cpu.threadspercore similar to existing cpu.corespersocket
* Add license to new test file
* Add tests to handle some edge cases
* Add some edge test cases to CPU topology
* Rework logic on KVM CPU topology, handle more cases
* Add more test cases
* Add more test cases
* Update plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/LibvirtComputingResource.java
Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
* Added cpu.threadspercore detail in listDetailOptions response (for KVM hypervisor)
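A hedged sketch of how the two details could combine into a libvirt CPU topology (the real LibvirtComputingResource logic handles more edge cases):

```java
public class CpuTopology {
    /** Returns {sockets, cores, threads} for the given vCPU count. */
    static int[] topology(int vcpus, int coresPerSocket, int threadsPerCore) {
        int perSocket = coresPerSocket * threadsPerCore;
        if (perSocket <= 0 || vcpus % perSocket != 0) {
            // Fall back to a flat topology when the details don't divide evenly.
            return new int[] {1, vcpus, 1};
        }
        return new int[] {vcpus / perSocket, coresPerSocket, threadsPerCore};
    }
    // e.g. topology(12, 3, 2) -> {2, 3, 2}: 2 sockets x 3 cores x 2 threads = 12 vCPUs
}
```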
---------
Co-authored-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
Add a global setting to control whether redirection is allowed while
downloading templates and volumes
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* Support KVM storage implementations controlling logical/physical block io size
* Support custom block size during disk attach
---------
Co-authored-by: Marcus Sorensen <mls@apple.com>
* Use join instead of views for filtering volumes
* Use join instead of views for filtering events
* Use join instead of views for filtering accounts
* Use join instead of views for filtering domains
* Use join instead of views for filtering hosts
* Use join instead of views for filtering storage pools
* Use join instead of views for filtering service offerings
* Use join instead of views for filtering disk offerings
* Remove unused code
* Fix unit test
* Use disk_offering instead of disk_offering_view in service_offering_view
* Fixup
* Fix listing of diskoffering & serviceoffering
* Use constants instead of strings
* Make changes to prevent sql injection
* Remove commented code
* Prevent n+1 queries for template's response
* remove unused import
* refactor some code
* Add missing check for service offering's join with disk offering
* Fix n+1 queries for storage pool metrics (see the JDBC sketch after this list)
* Remove n+1 queries from list accounts
* Remove unused imports
* remove todo
* Remove unused import
* Fixup query generation for nested joins
* Fixups
* Fix DB exception on ClientPreparedStatement
* events,alerts: Add missing indexes (#366)
* Fixup
* Fix resource count discrepancies
* Fixup while removing vm
* Fix discrepancies when starting VMs
* Fixup tests
* Fixups
* Don't take lock when amount is negative
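The n+1 and SQL-injection items above boil down to the pattern below; a generic plain-JDBC illustration over a hypothetical schema, not CloudStack's actual search builders:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class JoinInsteadOfNPlusOne {
    // n+1 pattern: SELECT the volumes, then issue one extra SELECT per volume
    // for its pool. The joined alternative fetches everything in one query,
    // with the filter bound via PreparedStatement (also the standard guard
    // against SQL injection):
    static void listVolumesWithPools(Connection conn, String state) throws SQLException {
        String sql = "SELECT v.id, v.name, p.name AS pool_name "
                + "FROM volume v "
                + "JOIN storage_pool p ON p.id = v.pool_id "
                + "WHERE v.state = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, state);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getLong("id") + " " + rs.getString("pool_name"));
                }
            }
        }
    }
}
```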
Adds:
- Fix for volume limit checks for disk offerings with multiple tags - When a VM is deployed with multiple disks whose offerings have multiple tags, the resource limit check may falter, as it currently checks each disk offering individually. With this change, if offerings d1 and d2 for volumes v1 and v2 both have tag1, the server will check volume limits for tag tag1 using the combined size of v1 and v2.
- Fix for template tag hosts in the random host allocator - May affect the combined use of template tags, service offering tags and the random host allocator. The current random host allocator code falters while trying to find a host allocation. This was found and fixed during the addition of the unit test here: https://github.com/shapeblue/cloudstack-apple/pull/378/files#diff-bbf9baea014e5cc1dfe9e7d13467c9857208cfe65e93883721d88a6f0452f912
- Unit tests for changes in api,server,ui: tagged resource limits #327
* Introduced a new API "checkVolumeAndRepair" that allows users or admins to check volumes and repair any observed leaks (see the sketch after this list).
Currently this is supported only for KVM
* some fixes
* Added unit tests
* addressed review comments
* add repair volume while granting access
* Changed repair parameter to accept both leaks/all values
* Introduced new global setting volume.check.and.repair.before.use to do volume check and repair before VM start or volume attach operations
* Added volume check and repair changes only during VM start and volume attach operations
* Refactored the names to look similar across the code
* Some code fixes
* remove unused code
* Renamed repair values
* Addressed review comments
* code refactored
* used volume name in logs
* Changed the API to Async and the setting scope to storage pool
* Fixed exit value handling with check volume command
* Fixed storage scope to the setting
* Fixed volume format issues
* Refactored the log messages
* Fix formatting
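A hedged sketch of the KVM-side check/repair step using qemu-img (the "-r leaks" / "-r all" flags are qemu-img's own; the wiring here is illustrative, not the actual agent code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class VolumeCheck {
    static int checkVolume(String volumePath, String repair) throws Exception {
        List<String> cmd = new ArrayList<>(List.of("qemu-img", "check"));
        if (repair != null) {
            cmd.add("-r");
            cmd.add(repair); // "leaks" or "all", mirroring the API's repair parameter
        }
        cmd.add(volumePath);
        Process p = new ProcessBuilder(cmd).inheritIO().start();
        if (!p.waitFor(5, TimeUnit.MINUTES)) { // illustrative timeout
            p.destroyForcibly();
            throw new IllegalStateException("qemu-img check timed out");
        }
        // The exit value matters: qemu-img check returns non-zero for
        // corruption/leaks, so it must be interpreted rather than treated
        // as an outright command failure.
        return p.exitValue();
    }
}
```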
* Fix host stuck in connecting state (#8502)
There are a lot of test failures from test_vm_life_cycle.py in multiple PRs because no host is available for migration of VMs.
#8438 (comment)
#8433 (comment)
#7344 (comment)
While debugging, I noticed that the hosts get stuck in the Connecting state because the MS is waiting for a response to the ReadyCommand from the agent. Since we take a lock on connection and disconnection, restarting the agent doesn't work; to recover, we have to restart the MS or wait ~1 hour (the default timeout).
On the agent side, it gets stuck waiting for a response from the Script execution.
To reproduce, run smoke/test_vm_life_cycle.py (the TestSecuredVmMigration test class, to be specific). Once the tests complete, you will notice that some hosts are stuck in the Connecting state, and restarting the agent fails due to the named lock. Locks in the DB can be checked using the query below.
SELECT *
FROM performance_schema.metadata_locks
INNER JOIN performance_schema.threads ON THREAD_ID = OWNER_THREAD_ID
WHERE PROCESSLIST_ID <> CONNECTION_ID() \G;
This PR adds a wait for the ready command and a timeout to the Script execution to ensure that the thread doesn't get stuck and the named lock in the database is released.
* Externalise a few timeouts & fix timeout for hostSupportsUefi in libvirt ready command wrapper (#8547)
This PR fixes a bug introduced in #8502: the timeout for script execution was set to 60 ms instead of 60 s, which resulted in hosts not getting UEFI enabled. This is a blocker for the 4.19 release.
We fix this by introducing a new agent parameter `agent.script.timeout` (default: 60 seconds) to use as the timeout for the script checking the host's UEFI status (a sketch follows the reference link below).
We also externalize the timeout for the ReadyCommand by introducing a new global setting `ready.command.wait` (default: 60 seconds).
For ModifyStoragePoolCommand, we don't externalize the timeout, to avoid confusion for the user, since the required timeout can vary depending on the provider in use and we are only setting the wait for the default host listener for now. Instead, we reuse the global `wait` setting divided by 5, making the default value 6 minutes (1800/5 = 360s) for ModifyStoragePoolCommand.
Note: the actual time the MS waits is twice the wait set for a Command. Check the reference code below.
19250403e6/engine/orchestration/src/main/java/com/cloud/agent/manager/AgentAttache.java (L406-L442)
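A sketch of the unit bug and its fix (names are illustrative; the agent's Script class differs):

```java
import java.util.concurrent.TimeUnit;

public class ScriptTimeout {
    static boolean runWithTimeout(ProcessBuilder pb, long timeoutSeconds) throws Exception {
        Process p = pb.start();
        // Buggy variant: p.waitFor(60, TimeUnit.MILLISECONDS) -- expires almost
        // immediately, so the UEFI probe "fails" on healthy hosts.
        boolean finished = p.waitFor(timeoutSeconds, TimeUnit.SECONDS);
        if (!finished) {
            p.destroyForcibly(); // release the thread so the named lock can be freed
        }
        return finished;
    }
}
```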
* fixup
* Check if a volume on the datastore requires access for migration, and grant/revoke volume access if required
* Updated default implementation for requiresAccessForMigration method in PrimaryDataStoreDriver
Introduces the concept of tagged resource limits. Limits can be enforced on accounts and domains for the deployment of entities for a tagged resource. Currently, tagged resource limits can be used for the following resource types:
Host limits
- user_vm
- cpu
- memory
Storage limits
- volume
- primary_storage
The following global settings can be used to specify the tags for which limits need to be enforced:
- Host: resource.limit.host.tags
- Storage: resource.limit.storage.tags
Options for specifying tagged resource limits and viewing tagged resource usage are available in the UI.
Enhances the use of templatetag for VM deployment and template creation
Adds an option to list disk offerings with a suitability flag for a virtual machine. A new parameter named virtualmachineid has been added to the listDiskOfferings API which, when passed, returns the suitableforvirtualmachine param in the response.
* StoragePoolType as a class
* Fix agent side StoragePoolType enum to class
* Handle StoragePoolType for StoragePoolJoinVO
* Since StoragePoolType is a class, it cannot be converted by the @Enumerated annotation.
Implemented a converter class and logic to utilize the @Convert annotation (see the sketch after this list).
* Fix UserVMJoinVO for StoragePoolType
* fixed missing imports
* Since StoragePoolType is a class, it cannot be converted by the @Enumerated annotation.
Implemented a converter class and logic to utilize the @Convert annotation.
* Fixed equals for the enum.
* removed not needed try/catch for prepareAttribute
* Added license to the file.
* Implemented "supportsPhysicalDiskCopy" for storage adaptor. (#352)
Co-authored-by: mprokopchuk <mprokopchuk@apple.com>
* Add javadoc to StoragePoolType class
* Add unit test for StoragePoolType comparisons
* StoragePoolType "==" and ".equals()" fix.
* Fix for abstract storage adaptor set up issue
* review comments
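A hedged sketch of the @Convert approach once StoragePoolType is a class rather than an enum (the stand-in type below is illustrative, not the actual CloudStack class):

```java
import javax.persistence.AttributeConverter;
import javax.persistence.Converter;

class StoragePoolType { // stand-in for the real class-based type
    private final String name;
    StoragePoolType(String name) { this.name = name; }
    String getName() { return name; }

    @Override
    public boolean equals(Object o) { // value equality, so "==" misuse is avoided
        return o instanceof StoragePoolType
                && ((StoragePoolType) o).name.equals(name);
    }

    @Override
    public int hashCode() { return name.hashCode(); }
}

@Converter
public class StoragePoolTypeConverter implements AttributeConverter<StoragePoolType, String> {
    @Override
    public String convertToDatabaseColumn(StoragePoolType type) {
        return type == null ? null : type.getName();
    }

    @Override
    public StoragePoolType convertToEntityAttribute(String name) {
        return name == null ? null : new StoragePoolType(name);
    }
}
```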
---------
Co-authored-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: mprokopchuk <mprokopchuk@apple.com>
Co-authored-by: mprokopchuk <mprokopchuk@gmail.com>
Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
Making a diskful resource was meant as an optimization,
but it cannot work on non-hyperconverged setups,
as the storage nodes (diskful) are not part of the CloudStack cluster.
(cherry picked from commit 67cb9b9e40)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
A TODO was overlooked and never implemented,
which could trigger the following bug:
if Linstor didn't create a resource (diskless or diskful) on
the node chosen by CloudStack, it would not be able to copy the
template data there; it seems no error was even
triggered, and the new template file silently became
empty/corrupt.
(cherry picked from commit 4a86a0d233)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Trigger out of band VM state update via libvirt event when VM stops
* Add License headers, refactor nested try
---------
Co-authored-by: Marcus Sorensen <mls@apple.com>
(cherry picked from commit 3694667f50)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Design document: https://cwiki.apache.org/confluence/display/CLOUDSTACK/%5BDRAFT%5D+Minimal+changes+to+allow+new+dynamic+hypervisor+type%3A+Custom+Hypervisor
This PR introduces the minimal changes to add a new hypervisor type (internally named Custom in the codebase, with a configurable display name), allowing an external hypervisor plugin to be written as a Custom Hypervisor for CloudStack
The custom hypervisor name is set by the setting 'hypervisor.custom.display.name'. The new hypervisor type does not affect the behaviour of any CloudStack operation; it simply introduces a new hypervisor type into the system.
CloudStack does not have any means to dynamically add new hypervisor types. The hypervisor types are internally preset by an enum defined within the CloudStack codebase, and unless a new version supports a new hypervisor, it is not possible to add a host of a hypervisor that is not part of the enum. It is, however, possible to implement minimal changes in CloudStack to support a new hypervisor plugin that may be developed privately
This PR is an initial work on allowing new dynamic hypervisor types (it adds a new element to the HypervisorType enum, but allows a variable display name for the hypervisor)
It replaces the fixed HypervisorType enum with an extensible registry mechanism, registered from the hypervisor plugin
- The new hypervisor type is internally named 'Custom' to the CloudStack services (management server and agent services, database records).
- A new global setting ‘hypervisor.custom.display.name’ allows administrators to set the display name of the hypervisor type. The display name will be shown in the CloudStack UI and API.
- In case the ‘hypervisor.list’ setting contains the display name of the new hypervisor type, the setting value is automatically updated after the ‘hypervisor.custom.display.name’ setting is updated.
- The new Custom hypervisor type supports:
- Direct downloads (the ability to download templates into primary storage from the hypervisor hosts without using secondary storage)
- Local storage (use hypervisor hosts local storage as primary storage)
- Template format: RAW format (the templates to be registered on the new hypervisor type must be in RAW format)
- The UI is also extended to display the new hypervisor type and the supported features listed above.
- The above are the minimal changes for CloudStack to support the new hypervisor type, which can be tested by integrating the plugin codebase with this feature.
This PR allows cloud administrators to test custom hypervisor plugin implementations in CloudStack and easily integrate them as a new hypervisor type ("Custom"), reducing the implementation effort to only the hypervisor-specific storage/networking support and the hypervisor resource that communicates with the management server.
- CloudStack admin should be able to create a zone for the new custom hypervisor and add clusters, hosts into the zone with normal operations
- CloudStack users should be able to execute normal VMs/volumes/network/storage operations on VMs/volumes running on the custom hypervisor hosts
(cherry picked from commit 8b5ba13b81)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>