Commit Graph

36162 Commits

Author SHA1 Message Date
Abhishek Kumar 35ed30bd51 continuation of 1d47e4d4ae
list host IDs instead of complete row where possible

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-16 18:03:06 +05:30
Abhishek Kumar 33321f00ce template list fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-16 14:28:51 +05:30
Abhishek Kumar 99115b9f09 server: cache cluster host type retrievals during connections
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-13 18:05:54 +05:30
Abhishek Kumar 97ddd17f94 fix related to 38d6c4e7e7
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-13 17:40:20 +05:30
Abhishek Kumar 1d47e4d4ae engine-schema,server,plugins: list host IDs instead of whole row where
applicable

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-13 16:19:43 +05:30
Abhishek Kumar fe4ef05053 server,engine-schema: use single query to list host capacities during
host capacity update

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-13 16:17:47 +05:30
Abhishek Kumar 98b27a409d engine-schema: fix get host type count searchcriteria
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-13 15:06:33 +05:30
Abhishek Kumar 38d6c4e7e7 optimize finding ready systemvm template for zone
Retrieving templates with an inner join on the host table turned out to be
more expensive than finding the distinct hypervisor types for a zone using
the host table and then finding templates for those types.

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-13 12:10:47 +05:30
Abhishek Kumar e1a5bd9ef2 improve agentlb sort when host list not needed
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-13 10:25:31 +05:30
Abhishek Kumar a1ee64344d address host/cluster dao listall
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-12 17:23:44 +05:30
Abhishek Kumar ad275e7a36 remove dead code
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-12 17:21:00 +05:30
Abhishek Kumar eb74974685 more test fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-12 17:00:57 +05:30
Abhishek Kumar 9e5c99ef9e fix tests from a78a2508e9
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-12 16:58:18 +05:30
Abhishek Kumar a78a2508e9 server: refactor MS list retrieval for agent connect
During agent join and while changing configs - host and indirect.agent.lb.algorithm, optimize calling DB for zone's host list

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-12 16:00:17 +05:30
Abhishek Kumar 68bab20d24 VMInstanceDao.updatePowerState refactor
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-12 15:47:00 +05:30
Abhishek Kumar 1d11605787 remove todo as configkey caching is implemented
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-11 13:27:17 +05:30
Abhishek Kumar d5a774c736 import fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-11 11:44:48 +05:30
Abhishek Kumar de60fb64e8 fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-10 15:35:14 +05:30
Abhishek Kumar 9074c4b6ad address process vm power state report for transitioning VMs
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-10 15:22:16 +05:30
Abhishek Kumar 3e098b87a9 fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-10 15:22:02 +05:30
Abhishek Kumar df137fc387 refactor
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-10 10:22:59 +05:30
Abhishek Kumar 61764aba1f cache and executors refactoring
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-09 19:39:50 +05:30
Abhishek Kumar e798ab30b3 cache api permission in DynamicRoleBasedAPIAccessChecker
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-06 14:28:30 +05:30
Abhishek Kumar 8f6c657159 optimize scanStalledVms procedure
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-06 14:27:56 +05:30
Abhishek Kumar af53644a0b utils: add wrapper for the loading cache
Follow up for #9628
Creates a utility class LazyCache which currently wraps Caffeine library Cache class.

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-05 13:29:22 +05:30
Abhishek Kumar 8ee5e6a99a refactor transitioning vm process report
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-04 18:35:23 +05:30
Abhishek Kumar 060a8ca623 fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-03 17:56:03 +05:30
Abhishek Kumar 4f1eeae9f7 server: DownloadListener - add caching for processConnect StartupCommand
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-03 16:25:28 +05:30
Abhishek Kumar 1be848da25 server: PingRoutingCommand - enable scanStalledVm
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-03 16:03:11 +05:30
Abhishek Kumar 337add8fb9 server: PingRoutingCommand - apply some optimizations
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-03 16:01:46 +05:30
Abhishek Kumar 013ebfaf46 Merge remote-tracking branch 'apple/apple-base418' into scalability-improvements 2024-09-03 15:58:30 +05:30
Abhishek Kumar 4400e02a1b
framework/config,server: configkey caching (#472)
Added caching for ConfigKey value retrievals based on the Caffeine
in-memory caching library.
https://github.com/ben-manes/caffeine
Currently, expire time for a cache is 1 minute and each update of the
config key invalidates the cache.

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-03 15:53:08 +05:30
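The caching behaviour described in the commit above (a 1-minute expiry, with an update invalidating the cached value) can be illustrated with a minimal sketch. This is not the CloudStack implementation, which is Java and built on Caffeine; the class `ExpiringConfigCache`, its `loader` callback, and the injectable `clock` are hypothetical names introduced here, and the sketch assumes that an update invalidates only the entry for the updated key.

```python
import time
from typing import Callable, Dict, Tuple


class ExpiringConfigCache:
    """Illustrative time-based cache; all names here are hypothetical."""

    def __init__(self, loader: Callable[[str], str],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader      # fetches the value from the backing store
        self._ttl = ttl_seconds    # 60 s mirrors the 1-minute expiry in the commit
        self._clock = clock        # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str) -> str:
        """Return the cached value, reloading it if missing or expired."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]
        value = self._loader(key)
        self._entries[key] = (value, now + self._ttl)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a cached entry; intended to be called when the key is updated."""
        self._entries.pop(key, None)
```

Between updates, repeated reads of the same key reach the backing store at most once per expiry window, which is the saving the commit is after.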
mprokopchuk 74ceba1f00
Merge pull request #474 from shapeblue/powerflex_cross_cluster_data_volume_migration
Provide encryption key for DATA volume type (in addition to ROOT) to copy volume.
2024-08-29 09:12:41 -07:00
Abhishek Kumar 3f80cd3c66 optimize db list.size() cases
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-08-27 13:06:25 +05:30
mprokopchuk 8890e71052 Provide encryption key for DATA volume type (in addition to ROOT) to copy volume. 2024-08-13 12:09:41 -07:00
Rohit Yadav a794462da1
server, api: account and api entity access improvements (#470)
Backports CVE fix from upstream 4.18.2.3

https://cloudstack.apache.org/blog/security-release-advisory-4.19.1.1-4.18.2.3

(cherry picked from commit e7dce2bcce)

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Fabricio Duarte <fabricio.duarte.jr@gmail.com>
Co-authored-by: nvazquez <nicovazquez90@gmail.com>
2024-08-07 21:18:24 +05:30
Abhishek Kumar a7516bbd55
test: improve purge expunged resources b/g task testcase (#467)
* test: improve purge expunged resources b/g task testcase

Failures were seen while purging expunged resources via the
background task during different test runs. This PR tries to make sure
b/g task execution is not skipped after an MS restart in a multi-MS
environment. It also updates expunged.resources.purge.interval to allow
running the task again in 60s.

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* new string formatting

---------

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-07-30 11:25:08 +05:30
Abhishek Kumar e676b80052 revert fc2e4ffd12
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-07-29 13:08:28 +05:30
Abhishek Kumar fc2e4ffd12 server: refactor listNetworks api database retrievals (#9184)
* server: refactor listNetworks api database retrievals

* fixes

* remove unused methods

* imports

* fix empty searchcriteria issue

* refactor

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-07-22 16:14:00 +05:30
Abhishek Kumar 5e98405b38 Merge remote-tracking branch 'apple/apple-base418' into scalability-improvements 2024-07-22 16:12:19 +05:30
Suresh Kumar Anaparti d1faa59677
Back port fixes from upstream 4.19 (#466)
* Fixed src datastore on copy check for PowerFlex/ScaleIO storage driver (#9310)

* Ignore non-managed pools for storage pool access preparation (#9376)
2024-07-19 09:38:11 +05:30
Suresh Kumar Anaparti 5c682677fc
Support resource name / displaytext with unicode / emoji chars, and SQL exception msg improvements (#460)
* Don't send sql exception/query from dao to upper layer, log it and send only the error message

* Updated charset to utf8mb4, for display_name column/user_vm table and job_result column/async_job table to support unicode chars & emojis

* Added API arg validator for RFC-compliant domain names, to validate a VM's host name

* Updated user resources name / display name column's charset to utf8mb4

* Check and update char set for affinity group name to utf8mb4, from the data migration in upgrade path

* Updated backup offering name column charset to utf8mb4

* Added unit tests for vm host/domain name validation

* Added smoke test to check resource name for vm, volume, service & disk offering, template, iso, account(first/lastname)

* Updated resource annotation charset to utf8mb4

* Updated some resources description charset to utf8mb4
2024-07-19 09:35:18 +05:30
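The host name validation mentioned in the commit above can be sketched as follows. The commit does not say which RFC the validator follows, so this example assumes the common RFC 1123 rules (labels of 1 to 63 letters, digits, or hyphens, not starting or ending with a hyphen, and at most 253 characters overall); the function name `is_valid_domain_name` is hypothetical and is not the validator added by the commit.

```python
import re

# One DNS label: alphanumeric at both ends, hyphens allowed inside, 1 to 63 chars.
_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_domain_name(name: str) -> bool:
    """Check a host/domain name against RFC 1123-style rules (an assumption)."""
    if not name or len(name) > 253:
        return False
    # Every dot-separated label must match in full; empty labels are rejected.
    return all(_LABEL.fullmatch(label) for label in name.split("."))
```

Under these rules, a display name may carry emoji once the column uses utf8mb4, while the host name remains restricted to ASCII letters, digits, and hyphens.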
Rohit Yadav a142359784
saml: make default signature check mandatory
Backport https://github.com/apache/cloudstack/pull/9357
2024-07-12 09:40:59 +05:30
Rohit Yadav b46e4d4bbf
framework/cluster: improve cluster service and integration API service (#465)
- mTLS implementation for cluster service communication
- Listen only on the specified cluster node IP address instead of all interfaces
- Validate that incoming cluster service requests are from peer management servers based on the server's certificate DNS name, which can be set through the global config ca.framework.cert.management.custom.san
- Hardening of KVM command wrapper script execution
- Improve API server integration port check
- cloudstack-management.default: don't have JMX configuration if not needed. JMX is used for instrumentation; users who need to use it should enable it explicitly

Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Wei Zhou <weizhou@apache.org>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
(cherry picked from commit 4f5561937c)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-07-09 09:03:40 +05:30
Vishesh c6d35b31ca
Log stdout to a file (#399)
* Log stdout to a file

* Add logrotation

* Fixup permissions in log file

* Remove info logs in stdout

* Change output file names

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>

* Fix logrotate config

* Disable logging to stdout

---------

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-07-03 20:51:20 +05:30
Marcus Sorensen 23a0faf729
Apply upstream SAML sig check from #9219 (#463)
Co-authored-by: Marcus Sorensen <mls@apple.com>
2024-07-01 09:33:40 +05:30
Abhishek Kumar 08246e05ed
server,test: fix resourceid for VOLUME.DESTROY in restore VM (#9032) (#454)
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-06-28 12:01:55 +05:30
Suresh Kumar Anaparti be87b1a668
FR74: Mitigation for non-scalable ScaleIO clients (#447)
* Mitigation for non-scalable Powerflex/ScaleIO clients
- Added ScaleIOSDCManager to manage SDC connections, checks clients limit, prepare and unprepare SDC on the hosts.
- Added commands for prepare and unprepare storage clients to prepare/start and stop SDC service respectively on the hosts.
- Introduced config 'storage.pool.connected.clients.limit' at storage level for client limits, currently supported for Powerflex only.

* tests issue fixed

* refactor / improvements

* lock with powerflex systemid while checking connections limit

* updated powerflex systemid lock to hold till sdc preparation

* Added custom stats support for storage pool, through listStoragePools API

* code improvements, and unit tests

* Update config 'storage.pool.connected.clients.limit' to dynamic, and some improvements

* Stop SDC on host after migration if no volumes mapped to host

* Wait for SDC to connect after scini service start, and some log improvements

* Do not throw exception (log it) when SDC is not connected while revoking access for the powerflex volume

* some log improvements
2024-06-27 18:47:50 +05:30
Vishesh c2de75744e
kvm: Add support for cgroupv2 (#8252) (#459)
* kvm: Add support for cgroupv2 (#8252)

1. Problem description

In Apache CloudStack (ACS), when a VM is deployed in a host with the KVM hypervisor, an XML file is created in the assigned host, which has a property shares that defines the weight of the VM to access the host CPU. The value of this property has no unit, and it is a relative measure to calculate how much CPU a given VM will have in the host. However, this value has a limit, which depends on the version of cgroup utilized by the host's kernel. The problem lies in the range of shares, which differs between the two versions: [2, 262144] for cgroups version 1; and [1, 10000] for cgroups version 2. Currently, ACS calculates the value of shares using Equation 1, presented below, where CPU is the number of cores and speed is the CPU frequency; both specified in the VM's compute offering. Therefore, if a compute offering has, for example, 6 cores at 2 GHz, the shares value will be 12000 and an exception will be thrown by libvirt if the host utilizes cgroup v2. The second version is becoming the default one in current Linux distributions; thus, it is necessary to address this limitation.

    Equation 1
    shares = CPU * speed

Fixes: #6744
2. Proposed changes

To address the problem described, we propose to apply a scale conversion considering the max shares of the host. Using the same formula currently utilized by ACS, it is possible to calculate the maximum shares of a VM for a given host. In other words, using the number of cores and the nominal speed of the host's CPU as the upper limit of shares allowed to a VM. Then, this value will be scaled to the allowed interval of [1, 10000] of cgroup v2 by using a linear scale conversion.

The VM shares would be calculated as Equation 2, presented below, where VM requested shares is the requested shares value calculated using Equation 1, cgroup upper limit is fixed with a value of 10000 (cgroups v2 upper limit), and host max shares is the maximum shares value of the host, calculated using Equation 1. Using Equation 2, the only case where a VM passes the cgroup v2 limit is when the user requests more resources than the host has, which is not possible with the current implementation of ACS.

    Equation 2
    shares = (VM requested shares * cgroup upper limit)/host max shares

To implement the proposal, the following APIs will be updated: deployVirtualMachine, migrateVirtualMachine and scaleVirtualMachine. When a VM is being deployed, a new verification will be added to find a suitable host. The max shares of each host will be calculated, and the VM's calculated shares will be checked to ensure they do not surpass the host's value. Likewise, the migration of VMs will have a similar new verification. Lastly, the scaling of VMs will also have the same verification for the VM's host.

To determine the max shares of a given host, we will use the same equation currently used in ACS for calculating the shares of VMs, presented in Section 1. When Equation 1 is used to determine the maximum shares of a host, CPU is the number of cores of the host, and speed is the nominal CPU speed, i.e., considering the CPU's base frequency.

It is important to note that these changes are only for hosts with the KVM hypervisor using cgroup v2 for now.

* Update overcommit ratio during live VM migration

* minor refactoring

---------

Co-authored-by: Bryan Lima <42067040+BryanMLima@users.noreply.github.com>
2024-06-27 12:22:17 +05:30
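Equations 1 and 2 from the cgroup v2 commit above can be worked through with a short sketch. It assumes speed is expressed in MHz (consistent with the commit's example of 6 cores at 2 GHz giving 12000), and it adds integer division and a lower clamp of 1 as assumptions, since the commit message does not specify rounding. The function names are hypothetical and do not come from the CloudStack code base.

```python
CGROUP_V2_UPPER_LIMIT = 10000  # cgroup v2 upper limit stated in the commit message


def requested_shares(cores: int, speed_mhz: int) -> int:
    """Equation 1: shares = CPU * speed."""
    return cores * speed_mhz


def scaled_shares(vm_cores: int, vm_speed_mhz: int,
                  host_cores: int, host_speed_mhz: int) -> int:
    """Equation 2: scale the VM's shares into the cgroup v2 range [1, 10000]."""
    vm = requested_shares(vm_cores, vm_speed_mhz)
    host_max = requested_shares(host_cores, host_speed_mhz)
    if host_max <= 0:
        raise ValueError("host max shares must be positive")
    if vm > host_max:
        # The commit notes this case is rejected before a host is chosen.
        raise ValueError("VM requests more CPU than the host provides")
    # Integer division and the clamp to 1 are assumptions of this sketch.
    return max(1, (vm * CGROUP_V2_UPPER_LIMIT) // host_max)


# 6 cores at 2000 MHz on a 32-core, 2000 MHz host:
# Equation 1 gives 12000 for the VM and 64000 for the host,
# so Equation 2 gives 12000 * 10000 / 64000 = 1875.
print(scaled_shares(6, 2000, 32, 2000))  # prints 1875
```

A VM that requests the host's entire CPU capacity maps exactly to 10000, so the scaled value cannot exceed the cgroup v2 limit as long as the host-capacity check holds.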
Vishesh 7ed43e3e43
Let network guru decide if ipv6 cidr size can't be equal to 64 (#462) 2024-06-27 12:20:49 +05:30