Commit Graph

36083 Commits

Author SHA1 Message Date
Rohit Yadav de82aa8e91 engine/orchestration: wrap db txn in try-with, only fetch id
Optimises a DB query that seems to run for every Ping command, where
all columns are fetched but only the `id` column is used.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:38 +05:30
Rohit Yadav c01aad6ba8 server: count hosts rather than get all hosts in capacity scans
This refactors hotspot code to fetch just the count of hosts rather than
all the host VOs for a zone during capacity scans for systemvms.
This reduces CPU and DB load in really large (10k+ hosts) environments.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:38 +05:30
Rohit Yadav 2a48d71909 server: don't go into O(n^2) loop for non-XenServer hosts
The loop was introduced in https://github.com/apache/cloudstack/pull/1403;
this change gates the logic to XenServer, the only place it would run at
all. The specific code is only applicable for XenServer and SolidFire
(https://youtu.be/YQ3pBeL-WaA?si=ed_gT_A8lZYJiEh).

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:38 +05:30
Rohit Yadav 47163df2ff
framework/config: make logic in ::value() defensive (#449)
This adds a null check on `s_depot.global()`, which can cause an NPE in
unit tests where s_depot is not null but the underlying config dao is
null (not mocked or initialised), so that `s_depot.global()` returns
null.
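
As a rough illustration of the guard described above, the sketch below uses hypothetical stand-in types (`ConfigDao`, `ConfigDepot`, `DefensiveConfigKey`), not the actual framework/config classes. It assumes the intended behaviour is to fall back to a default when either the depot or the DAO it returns is null.

```java
// Hypothetical stand-ins for illustration; these are not CloudStack's real types.
interface ConfigDao {
    String findValue(String key);
}

interface ConfigDepot {
    ConfigDao global(); // may return null when the DAO is not mocked or initialised
}

final class DefensiveConfigKey {
    static ConfigDepot depot; // plays the role of s_depot in the commit message

    private final String key;
    private final String defaultValue;

    DefensiveConfigKey(String key, String defaultValue) {
        this.key = key;
        this.defaultValue = defaultValue;
    }

    String value() {
        // Guard both the depot and the DAO it hands out before dereferencing.
        ConfigDao dao = (depot != null) ? depot.global() : null;
        if (dao == null) {
            return defaultValue;
        }
        String stored = dao.findValue(key);
        return (stored != null) ? stored : defaultValue;
    }
}
```

With this shape, a unit test that sets the depot but leaves its DAO unmocked gets the default value back rather than an NPE.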

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:20:37 +05:30
Vishesh c3eba5e213
Fix exceeding of resource limits with powerflex (#443)
* Fix exceeding of resource limits with powerflex

* Fix for volume prepare during VM start

* resolve comments

* Add e2e tests

* Fixup

* Update e2e tests

* minor refactoring

* refactoring

* fixup

---------

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-05-08 20:54:54 +05:30
Vishesh 2f4cea6dca
Fix message publish in transaction (#438)
* Fix message publish in transaction

* Resolve comments
2024-05-07 13:27:19 +05:30
Vishesh 04a589d013
Fixup e2e test_restore_vm (#445)
* Fixup e2e test_restore_vm

* Fix template's size attribute

* Resolve comments
2024-05-07 12:59:42 +05:30
Vishesh 7fae1fc747
Fix restore VM with allocated root disk (#441)
* Fix restore VM with allocated root disk

* Add e2e test for restore vm

* Add more checks for e2e test
2024-04-29 12:18:55 +05:30
Vishesh 9ab786c18a
Fix: Update rootdisksize detail on restore VM (#440)
* Fix: Update rootdisksize detail on restore VM

* Update server/src/main/java/com/cloud/vm/UserVmManagerImpl.java

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>

* minor fixup

---------

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-04-29 12:14:44 +05:30
Vishesh 1b54edd9de
Fix resource limit checks and increment/decrements for different operations (#430)
* Fix resource limit checks and increment/decrements for different operations

* Fixup

* More fixups

* fixup

* Refactor code

* Resolve comments

* Some minor code refactoring

* Fixup

* fixup

* Fix method name

* Fixup

* Fixup listing
2024-04-24 17:56:33 +05:30
Vishesh c21b6d8b52
Update volume's passphrase to null if diskOffering doesn't support encryption (#428) 2024-04-23 09:46:20 -06:00
Vishesh 93e66c52dc
Fix null pointer exception in restore VM (#431) 2024-04-23 09:41:39 -06:00
Marcus Sorensen e630d7afea
Update netty version for compatibility/staying current (#433)
Co-authored-by: Marcus Sorensen <mls@apple.com>
2024-04-19 21:20:18 +05:30
Vishesh 1b52bebd08
Fix error message for checkVolume command (#409) 2024-04-17 17:27:08 +05:30
Marcus Sorensen 3a058f3a18
Introduce scheduled executor wrapper with dynamic interval (#424)
* Introduce scheduled executor wrapper with dynamic interval

* Resolve comments

* Add validations

* Add validation for configkey

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Vishesh <vishesh92@gmail.com>
2024-04-17 14:24:31 +05:30
Vishesh fd9325a86d
Speed up resource count calculation (#425)
* Speed up resource count calculation

* server: remove supportedOwner from Resource.ResourceType (#7416)

* Refactor resource count calculation

* Start transaction for updateCountByDeltaForIds

---------

Co-authored-by: GaOrtiga <49285692+GaOrtiga@users.noreply.github.com>
2024-04-17 14:21:07 +05:30
Vishesh 26c1741af5
Fix listStoragePoolsMetricsCmd (#419) 2024-04-16 15:51:53 +05:30
Vishesh 1b7f33d0e1
This PR fixes the build issue on apple-base418 (#429) 2024-04-12 15:43:23 +02:00
Vishesh 0501678478
Allow overriding root diskoffering id & size, and expunge old root disk while restoring VM (#401)
* Allow overriding root diskoffering id & size while restoring VM

* UI changes

* Allow expunging of old disk while restoring a VM

* Apply suggestions from code review

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>

* resolve comments

* Fixup

* Rename some variables

* Resolve comments

* Address comments

* Duplicate volume's details while duplicating volume

* Allow setting IOPS for the new volume

* minor cleanup

* fixup

* Add checks for template size

* Replace strings for IOPS with constants

* Fix saveVolumeDetails method

* Fixup

* Fixup UI styling

---------

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-04-12 17:47:16 +05:30
Vishesh 8d0915c4c9
Change iops on offering change (#416)
* Change IOPS on disk offering change

* Remove iops & bandwidth limits before copying template

* minor refactor

* Handle diskOfferingDetails

* Fixup
2024-04-11 16:59:57 +05:30
Marcus Sorensen 227dc5e86a
Add ability to set cpu.threadspercore similar to existing cpu.corespersocket (#411)
* Add ability to set cpu.threadspercore similar to existing cpu.corespersocket

* Add license to new test file

* Add tests to handle some edge cases

* Add some edge test cases to CPU topology

* Rework logic on KVM CPU topology, handle more cases

* Add more test cases

* Add more test cases

* Update plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/LibvirtComputingResource.java

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>

* Added cpu.threadspercore detail in listDetailOptions response (for KVM hypervisor)

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-04-10 09:58:03 -06:00
Marcus Sorensen 631b0960f3
Allow kvm storage plugin to customize diskdef, add geometry (#402)
* Allow kvm storage plugin to customize diskdef, add geometry

* formatting update

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-04-08 09:28:59 -06:00
Marcus Sorensen f896586925
Update version to 4.18.1.1 (#417)
* Update version to 4.18.1.1

* Update changelog

* Update changelog

* Update changelog

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
2024-04-08 09:27:57 -06:00
Rohit Yadav 0c23820c7c
Merge pull request #414 from shapeblue/security-backport418
Backport upstream security fixes to apple-base418
2024-04-03 19:58:28 +05:30
Marcus Sorensen ac4b030759
Mark libvirt events experimental, add properties flag (#404)
Co-authored-by: Marcus Sorensen <mls@apple.com>
2024-04-03 08:03:26 -06:00
Vishesh c09cea5d86
Fix: check root disk offering tagged limits during VM deploy (#415) 2024-04-03 18:41:24 +05:30
Wei Zhou 21a03ae4da upgrade: fix upgrade from 4.18.1.0 to 4.18.2.0-SNAPSHOT (#7959)
The upgrade from 4.18.1.0 to 4.18.2.0-SNAPSHOT failed with the error

```
2023-09-12 16:12:19,003 INFO  [c.c.u.DatabaseUpgradeChecker] (main:null) (logid:) DB version = 4.18.1.0 Code Version = 4.18.2.0
2023-09-12 16:12:19,004 INFO  [c.c.u.DatabaseUpgradeChecker] (main:null) (logid:) Database upgrade must be performed from 4.18.1.0 to 4.18.2.0
2023-09-12 16:12:19,036 DEBUG [c.c.u.DatabaseUpgradeChecker] (main:null) (logid:) Running upgrade Upgrade41800to41810 to upgrade from 4.18.0.0-4.18.1.0 to 4.18.1.0
...
2023-09-12 16:12:19,041 DEBUG [c.c.u.d.ScriptRunner] (main:null) (logid:) -- Schema upgrade from 4.18.0.0 to 4.18.1.0
...
2023-09-12 16:12:21,602 DEBUG [c.c.u.d.DatabaseAccessObject] (main:null) (logid:) Statement: CREATE INDEX i_cluster_details__name on cluster_details (name)
2023-09-12 16:12:21,663 DEBUG [c.c.u.d.DatabaseAccessObject] (main:null) (logid:) Created index i_cluster_details__name
2023-09-12 16:12:21,673 DEBUG [c.c.u.d.T.Transaction] (main:null) (logid:) Rolling back the transaction: Time = 2632 Name =  Upgrade; called by -TransactionLegacy.rollback:888-TransactionLegacy.removeUpTo:831-TransactionLegacy.close:655-TransactionContextInterceptor.invoke:36-ReflectiveMethodInvocation.proceed:175-ExposeInvocationInterceptor.invoke:97-ReflectiveMethodInvocation.proceed:186-JdkDynamicAopProxy.invoke:215-$Proxy30.persist:-1-DatabaseUpgradeChecker.upgrade:319-DatabaseUpgradeChecker.check:403-CloudStackExtendedLifeCycle.checkIntegrity:64
```

It succeeded with this change.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
(cherry picked from commit a88a47989369af204ea6ee8a5fd190311f43c74c)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-04-03 15:50:56 +05:30
Vishesh 8eee0ec213
Fix getRepair method in checkVolume command (#408)
* Fix getRepair method in checkVolume command

* Add License
2024-04-02 17:37:39 +05:30
Vishesh 5137c196c2
HypervisorType as a class (#393)
* HypervisorType as a class

* Fixup

* fixup

* Add missing annotation

* Resolve comments

* Handle parallels typo
2024-04-02 17:35:16 +05:30
Abhishek Kumar 292c0eb291 fix test failure
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-04-01 14:05:03 +05:30
Abhishek Kumar 996ae9a959 engine-storage: control download redirection
Add a global setting to control whether redirection is allowed while
downloading templates and volumes

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-04-01 09:23:17 +05:30
dahn cfaac2a67e api: client verification in servlet
This introduces new global settings to handle how client address checks
are handled by the API layer:

proxy.header.verify: enables/disables checking of ipaddresses from a
                     proxy set header
proxy.header.names: a list of names to check for allowed ipaddresses
                    from a proxy set header.
proxy.cidr: a list of cidrs for which "proxy.header.names" are
            honoured if the "Remote_Addr" is in this list.
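
One possible reading of how the three settings interact is sketched below. The class and method names (`ProxyHeaderSketch`, `clientAddress`) are hypothetical, the sketch handles IPv4 only, and it assumes the first configured header that is present wins; the actual servlet code may differ.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the actual API servlet code.
final class ProxyHeaderSketch {
    // Convert a dotted-quad IPv4 address to a 32-bit integer.
    static int toInt(String ipv4) {
        String[] parts = ipv4.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + ipv4);
        }
        int value = 0;
        for (String part : parts) {
            value = (value << 8) | (Integer.parseInt(part) & 0xFF);
        }
        return value;
    }

    // True if the address falls inside the CIDR block, e.g. "10.0.0.0/8".
    static boolean inCidr(String ipv4, String cidr) {
        String[] pieces = cidr.split("/");
        int prefix = Integer.parseInt(pieces[1]);
        // A shift by 32 is a no-op in Java, so /0 needs its own case.
        int mask = (prefix == 0) ? 0 : -1 << (32 - prefix);
        return (toInt(ipv4) & mask) == (toInt(pieces[0]) & mask);
    }

    // Parameters mirror proxy.header.verify, proxy.header.names and proxy.cidr.
    static String clientAddress(boolean verify, List<String> headerNames, List<String> proxyCidrs,
                                String remoteAddr, Map<String, String> headers) {
        // Headers are honoured only when the direct peer is a trusted proxy.
        if (verify && proxyCidrs.stream().anyMatch(c -> inCidr(remoteAddr, c))) {
            for (String name : headerNames) {
                String forwarded = headers.get(name);
                if (forwarded != null && !forwarded.isEmpty()) {
                    return forwarded.trim();
                }
            }
        }
        return remoteAddr;
    }
}
```

The point of the design is that a proxy-set header is ignored unless the request actually arrived from an address inside proxy.cidr, so an arbitrary client cannot spoof its address by sending the header itself.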

(cherry picked from commit b65546636d)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-03-31 22:03:04 +05:30
Wei Zhou 2b93886934 server: fix security issues caused by extraconfig on KVM
- Move allow.additional.vm.configuration.list.kvm from Global to Account setting
- Disallow VM details starting with "extraconfig" when deploying VMs
- Skip changes to VM details starting with "extraconfig" when updating VM settings
- Allow only extraconfig for DPDK in service offering details
- Check if extraconfig values in VM details are supported when starting VMs
- Check if extraconfig values in service offering details are supported when starting VMs
- Disallow adding/editing/updating VM settings for extraconfig in the UI

(cherry picked from commit e6e4fe16fb)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-03-31 22:02:26 +05:30
Vishesh 98d021faed
server: skip password policies check on empty password (#8370) (#396)
This PR changes the password.policy.regex default value to empty. With an empty value for the configuration, the regex is skipped during the password policy check; it is only checked when the configuration is set to something other than a blank string.
This way, when creating a user in org.apache.cloudstack.ldap.LdapAuthenticator#authenticate(), we won't get an error by default, as an empty value for the password is passed.
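
The skip-on-blank behaviour described above can be sketched as follows. `PasswordRegexPolicySketch` and `passesRegexPolicy` are hypothetical names for illustration, and the sketch assumes the configured regex must match the whole password.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; not the actual password policy implementation.
final class PasswordRegexPolicySketch {
    // Returns true when the password is acceptable under the configured regex.
    // A null or blank regex means the regex check is skipped entirely.
    static boolean passesRegexPolicy(String password, String configuredRegex) {
        if (configuredRegex == null || configuredRegex.trim().isEmpty()) {
            return true;
        }
        return password != null && Pattern.matches(configuredRegex, password);
    }
}
```

Under this sketch an empty password passes while the setting is blank, which is the LDAP case the commit message mentions, and is rejected once a regex such as `.{8,}` is configured.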

Co-authored-by: João Jandre <48719461+JoaoJandre@users.noreply.github.com>
2024-03-27 13:44:28 +05:30
Marcus Sorensen 6a28cb33ff
update mysql dependency version (#394)
* update mysql dependency version

* Enable scrollTolerantForwardOnly property in the DB connection, to preserve the legacy behavior of Connector - tolerating backward and absolute cursor movements on result sets of type ResultSet.TYPE_FORWARD_ONLY

References:
https://dev.mysql.com/doc/relnotes/connector-j/en/news-8-0-24.html
https://dev.mysql.com/doc/connector-j/en/connector-j-connp-props-result-sets.html#cj-conn-prop_scrollTolerantForwardOnly

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-03-21 18:08:53 +05:30
Vishesh a1122d175e
Add missing indexes for vmstats (#391) 2024-03-21 14:04:50 +05:30
Vishesh 5b3a81c2a3
Fix failing test (#400) 2024-03-15 16:19:39 +05:30
Marcus Sorensen 98dda22a83
Support KVM storage implementations controlling logical/physical block size (#390)
* Support KVM storage implementations controlling logical/physical block io size

* Support custom block size during disk attach

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
2024-03-15 14:06:33 +05:30
Vishesh 4c6c8216d5
Use join instead of views (#365)
* Use join instead of views for filtering volumes

* Use join instead of views for filtering events

* Use join instead of views for filtering accounts

* Use join instead of views for filtering domains

* Use join instead of views for filtering hosts

* Use join instead of views for filtering storage pools

* Use join instead of views for filtering service offerings

* Use join instead of views for filtering disk offerings

* Remove unused code

* Fix unit test

* Use disk_offering instead of disk_offering_view in service_offering_view

* Fixup

* Fix listing of diskoffering & serviceoffering

* Use constants instead of strings

* Make changes to prevent sql injection

* Remove commented code

* Prevent n+1 queries for template's response

* remove unused import

* refactor some code

* Add missing check for service offering's join with disk offering

* Fix n+1 queries for storage pool metrics

* Remove n+1 queries from list accounts

* Remove unused imports

* remove todo

* Remove unused import

* Fixup query generation for nested joins

* Fixups

* Fix DB exception on ClientPreparedStatement

* events,alerts: Add missing indexes (#366)

* Fixup
2024-03-14 17:49:35 +05:30
Marcus Sorensen bf4ea0d59f
Storage drivers to decide if they need data motion for zone-wide use (#392)
* Storage drivers to decide if they need data motion for zone-wide use

* Apply fixes in resolving PrimaryDataStore

* add tests

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix imports

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

---------

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-03-14 10:53:24 +05:30
Vishesh ba3284bdc5
Fix resource count discrepancies (#376)
* Fix resource count discrepancies

* Fixup while removing vm

* Fix discrepancies when starting VMs

* Fixup tests

* Fixups

* Don't take lock when amount is negative
2024-03-13 18:22:34 +05:30
Abhishek Kumar 1510b44f03
backport: add more unit tests and fix related to #327 (#378)
Adds:

- Fix for volume limit checks for disk offerings with multiple tags - When a VM is deployed with multiple disks having offerings with multiple tags, the resource limit check may falter as it currently checks each disk offering individually. With this change, if offerings d1 and d2 for volumes v1 and v2 both have tag1, the server will check volume limits for tag1 using the combined size of v1 and v2.
- Fix for template tag hosts in random host allocator - May affect use of template tag, service offering tags and random host allocator together. The current code for the random host allocator falters while trying to find the host allocation. This was found and fixed during the addition of the unit test here, https://github.com/shapeblue/cloudstack-apple/pull/378/files#diff-bbf9baea014e5cc1dfe9e7d13467c9857208cfe65e93883721d88a6f0452f912
- Unit tests for changes in api,server,ui: tagged resource limits #327
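
The combined-size check from the first item above can be sketched roughly as below. All names (`TaggedVolumeLimitSketch`, `VolumeRequest`, `requestedBytesPerTag`, `withinLimits`) are hypothetical, and treating a tag with no configured limit as unlimited is an assumption of the sketch, not something stated in the commit.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the actual resource limit service.
final class TaggedVolumeLimitSketch {
    // A requested volume: its size and the tags of its disk offering.
    static final class VolumeRequest {
        final long sizeBytes;
        final List<String> offeringTags;

        VolumeRequest(long sizeBytes, List<String> offeringTags) {
            this.sizeBytes = sizeBytes;
            this.offeringTags = offeringTags;
        }
    }

    // Sum the requested sizes per tag across all volumes of one deployment.
    static Map<String, Long> requestedBytesPerTag(List<VolumeRequest> volumes) {
        Map<String, Long> totals = new HashMap<>();
        for (VolumeRequest volume : volumes) {
            // De-duplicate tags so a repeated tag is not counted twice for one volume.
            for (String tag : new HashSet<>(volume.offeringTags)) {
                totals.merge(tag, volume.sizeBytes, Long::sum);
            }
        }
        return totals;
    }

    // True when, for every tag, current usage plus the combined request fits the limit.
    // Assumption: a tag with no entry in limitBytes is unlimited.
    static boolean withinLimits(List<VolumeRequest> volumes, Map<String, Long> usedBytes,
                                Map<String, Long> limitBytes) {
        for (Map.Entry<String, Long> entry : requestedBytesPerTag(volumes).entrySet()) {
            Long limit = limitBytes.get(entry.getKey());
            if (limit == null) {
                continue;
            }
            long used = usedBytes.getOrDefault(entry.getKey(), 0L);
            if (used + entry.getValue() > limit) {
                return false;
            }
        }
        return true;
    }
}
```

For example, two volumes of 6 units each that share tag1 would each pass a limit of 10 when checked alone, but fail when checked against their combined size of 12.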
2024-03-01 17:22:14 +05:30
Suresh Kumar Anaparti ae6d0fb2d6
Storage pool stats update (#383)
* Update PowerFlex storage stats on host connect (if any changes in capacity / used bytes)

* Sync the pool stats in DB with the actual stats from stats collector

* Updated capacityBytes check

* Revert "Updated capacityBytes check"

This reverts commit 3ffb17b2c4b3c794e5d0dbf4108d43255b4fbcca.

* Revert "Update PowerFlex storage stats on host connect (if any changes in capacity / used bytes)"

This reverts commit 9e473aed4c589b91f62cbe2fd135dc25e0adc1c3.
2024-02-29 15:26:00 +05:30
Harikrishna 747d1101c1
New API "checkVolume" to check and repair any leaks or repair all issues (#362)
* Introduced a new API "checkVolumeAndRepair" that allows users or admins to check for leaks and repair any that are observed.
Currently this is supported only for KVM

* some fixes

* Added unit tests

* addressed review comments

* add repair volume while granting access

* Changed repair parameter to accept both leaks/all values

* Introduced new global setting volume.check.and.repair.before.use to do volume check and repair before VM start or volume attach operations

* Added volume check and repair changes only during VM start and volume attach operations

* Refactored the names to look similar across the code

* Some code fixes

* remove unused code

* Renamed repair values

* Addressed review comments

* code refactored

* used volume name in logs

* Changed the API to Async and the setting scope to storage pool

* Fixed exit value handling with check volume command

* Fixed storage scope to the setting

* Fixed volume format issues

* Refactored the log messages

* Fix formatting
2024-02-29 14:40:40 +05:30
anniejili 2df750c2f4
Fixed query param for getDomainReservation. (#388)
Co-authored-by: Annie Li <ji_li@apple.com>
2024-02-28 08:10:16 -07:00
Vishesh f30e07b312
Fix host stuck in connecting state (#375)
* Fix host stuck in connecting state (#8502)

There are a lot of test failures due to test_vm_life_cycle.py in multiple PRs due to host not available for migration of VMs.
#8438 (comment)
#8433 (comment)
#7344 (comment)

While debugging I noticed that the hosts get stuck in Connecting state because MS is waiting for a response of the ReadyCommand from the agent. Since we take a lock on connection and disconnection, restarting the agent doesn't work. To fix this, we have to restart the MS or wait for ~1 hour (default timeout).

On the agent side, it gets stuck waiting for a response from the Script execution.

To reproduce, run smoke/test_vm_life_cycle.py (TestSecuredVmMigration test class to be specific). Once the tests are complete, you will notice that some hosts are stuck in Connecting state. And restarting the agent fails due to the named lock. Locks on DB can be checked using the below query.

SELECT *
FROM performance_schema.metadata_locks
INNER JOIN performance_schema.threads ON THREAD_ID = OWNER_THREAD_ID
WHERE PROCESSLIST_ID <> CONNECTION_ID() \G;

This PR adds a wait for the ready command and a timeout to the Script execution to ensure that the thread doesn't get stuck and the named lock from database is released.

* Externalise a few timeouts & fix timeout for hostSupportsUefi in libvirt ready command wrapper (#8547)

This PR fixes a bug introduced in #8502. The timeout for script execution was set to 60 ms instead of 60 s, which resulted in the host not getting UEFI enabled. This is a blocker for the 4.19 release.

We do this by introducing a new agent parameter `agent.script.timeout` (default - 60 seconds) to use as a timeout for the script checking host's UEFI status.

We also externalize the timeout for the ReadyCommand by introducing a new global setting `ready.command.wait` (default - 60 seconds).

For ModifyStoragePoolCommand, we don't externalize the timeout, to avoid confusing the user, since the required timeout can vary depending on the provider in use and we are only setting the wait for the default host listener for now. Instead, we reuse the global `wait` setting divided by `5`, making the default value 6 minutes (1800/5 = 360s) for ModifyStoragePoolCommand.

Note: the actual time the MS waits is twice the wait set for a Command. Check the reference code below.
19250403e6/engine/orchestration/src/main/java/com/cloud/agent/manager/AgentAttache.java (L406-L442)

* fixup
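
The timeout arithmetic described for ModifyStoragePoolCommand can be written out as a small sketch. `CommandWaitSketch` and its method names are hypothetical; only the figures (a global `wait` of 1800 seconds, division by 5, and the management server waiting twice the command's wait) come from the commit message.

```java
import java.time.Duration;

// Hypothetical sketch of the arithmetic only; not the agent manager code.
final class CommandWaitSketch {
    // Default of the global `wait` setting, per the commit message.
    static final Duration DEFAULT_GLOBAL_WAIT = Duration.ofSeconds(1800);

    // ModifyStoragePoolCommand reuses the global wait divided by 5.
    static Duration modifyStoragePoolWait(Duration globalWait) {
        return globalWait.dividedBy(5);
    }

    // Per the note, the management server effectively waits twice the command's wait.
    static Duration effectiveWait(Duration commandWait) {
        return commandWait.multipliedBy(2);
    }
}
```

With the default setting this gives a command wait of 360 seconds and an effective wait of 720 seconds. Carrying the unit in a `Duration` rather than a bare number is one way to avoid the kind of milliseconds-versus-seconds mix-up that #8547 fixes.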
2024-02-21 13:44:53 +05:30
Suresh Kumar Anaparti 89f93746ac
Storage plugin support to check if volume on datastore requires access for migration (#380)
* Check if volume on datastore requires access for migration, and grant/revoke volume access if requires

* Updated default implementation for requiresAccessForMigration method in PrimaryDataStoreDriver
2024-02-20 11:32:32 -07:00
Vishesh 8b01c0aa62
Update VM's state if powerstate & state are not in sync (#368)
* Update VM's state if powerstate & state are not in sync

* Add unit tests

* some code improvements for instance / power state check

* Update power state after vm stop confirmation (as power state 'PowerOn' is kept after vm stop and not updated later)

* Reset the power state update counter before migrate, to allow power state sync to proper state / host

* Do not consider transitional states (Starting, Stopping) to check power state sync

* set powerstate to off for all vm types

---------

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-02-20 14:44:04 +05:30
anniejili 30d908c580
Added vm uuid as part of error response when vm create fails after vm entity is persisted. (#350)
* Added vm uuid as part of error response when vm create fails after vm entity is persisted

* Fixed styling issue

* Fixed styling issue.

* Fix unit tests

* Fixed merge conflicts.

* Fixed merge conflicts.

---------

Co-authored-by: Annie Li <ji_li@apple.com>
Co-authored-by: Harikrishna Patnala <harikrishna.patnala@gmail.com>
2024-02-09 00:02:02 +05:30
Abhishek Kumar 6a9cdedda4
api,server,ui: tagged resource limits (#327)
Introduces the concept of tagged resource limits. Limits can be enforced on accounts and domains for the deployment of entities for a tagged resource. Current tagged resource limits can be used for the following resource types,

Host limits

    user_vm
    cpu
    memory

Storage limits

    volume
    primary_storage

The following global settings can be used to specify tags for which limits need to be enforced,

    Host: resource.limit.host.tags
    Storage: resource.limit.storage.tags

Option for specifying tagged resource limits and viewing tagged resource usage are made available in the UI.

Enhances use of templatetag for VM deployment and template creation

Adds an option to list disk offerings with a suitability flag for a virtual machine. A new parameter named virtualmachineid has been added to the listDiskOfferings API which, when passed, returns the suitableforvirtualmachine param in the response.
2024-02-07 17:35:15 +05:30