Commit Graph

36048 Commits

Author SHA1 Message Date
Vishesh a1122d175e
Add missing indexes for vmstats (#391) 2024-03-21 14:04:50 +05:30
Vishesh 5b3a81c2a3
Fix failing test (#400) 2024-03-15 16:19:39 +05:30
Marcus Sorensen 98dda22a83
Support KVM storage implementations controlling logical/physical block size (#390)
* Support KVM storage implementations controlling logical/physical block io size

* Support custom block size during disk attach

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
2024-03-15 14:06:33 +05:30
Vishesh 4c6c8216d5
Use join instead of views (#365)
* Use join instead of views for filtering volumes

* Use join instead of views for filtering events

* Use join instead of views for filtering accounts

* Use join instead of views for filtering domains

* Use join instead of views for filtering hosts

* Use join instead of views for filtering storage pools

* Use join instead of views for filtering service offerings

* Use join instead of views for filtering disk offerings

* Remove unused code

* Fix unit test

* Use disk_offering instead of disk_offering_view in service_offering_view

* Fixup

* Fix listing of diskoffering & serviceoffering

* Use constants instead of strings

* Make changes to prevent sql injection

* Remove commented code

* Prevent n+1 queries for template's response

* remove unused import

* refactor some code

* Add missing check for service offering's join with disk offering

* Fix n+1 queries for storage pool metrics

* Remove n+1 queries from list accounts

* Remove unused imports

* remove todo

* Remove unused import

* Fixup query generation for nested joins

* Fixups

* Fix DB exception on ClientPreparedStatement

* events,alerts: Add missing indexes (#366)

* Fixup
2024-03-14 17:49:35 +05:30
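
The "prevent n+1 queries" and "prevent sql injection" items in commit 4c6c8216d5 above can be illustrated with a minimal sketch. It assumes an in-memory SQLite database and hypothetical `account` and `domain` tables; the table names, column names and functions are illustrative only and are not taken from the CloudStack code.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database (hypothetical schema, for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE domain (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE account (id INTEGER PRIMARY KEY, name TEXT, domain_id INTEGER);
        INSERT INTO domain VALUES (1, 'ROOT'), (2, 'dev');
        INSERT INTO account VALUES (1, 'admin', 1), (2, 'alice', 2), (3, 'bob', 2);
        """
    )
    return conn


def list_accounts_with_domain(conn, domain_name):
    """Fetch accounts and their domain in one JOIN instead of one query per row.

    The filter value is passed as a bound parameter, never concatenated
    into the SQL string.
    """
    sql = (
        "SELECT a.name, d.name FROM account a "
        "JOIN domain d ON d.id = a.domain_id "
        "WHERE d.name = ? ORDER BY a.name"
    )
    return conn.execute(sql, (domain_name,)).fetchall()
```

A single call such as `list_accounts_with_domain(make_demo_db(), "dev")` issues one query regardless of how many accounts match, and a value containing quote characters is treated as plain data rather than as SQL.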
Marcus Sorensen bf4ea0d59f
Storage drivers to decide if they need data motion for zone-wide use (#392)
* Storage drivers to decide if they need data motion for zone-wide use

* Apply fixes in resolving PrimaryDataStore

* add tests

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix imports

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

---------

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-03-14 10:53:24 +05:30
Vishesh ba3284bdc5
Fix resource count discrepancies (#376)
* Fix resource count discrepancies

* Fixup while removing vm

* Fix discrepancies when starting VMs

* Fixup tests

* Fixups

* Don't take lock when amount is negative
2024-03-13 18:22:34 +05:30
Abhishek Kumar 1510b44f03
backport: add more unit tests and fix related to #327 (#378)
Adds:

- Fix for volume limit checks for disk offerings with multiple tags - When a VM is deployed with multiple disks whose offerings have multiple tags, the resource limit check may falter because it currently checks each disk offering individually. With this change, if offerings d1 and d2 for volumes v1 and v2 both have tag1, the server will check volume limits for tag1 using the combined size of v1 and v2.
- Fix for template tag hosts in random host allocator - May affect use of template tag, service offering tags and random host allocator together. The current code for the random host allocator falters while trying to find the host allocation. This was found and fixed during the addition of the unit test here, https://github.com/shapeblue/cloudstack-apple/pull/378/files#diff-bbf9baea014e5cc1dfe9e7d13467c9857208cfe65e93883721d88a6f0452f912
- Unit tests for changes in api,server,ui: tagged resource limits #327
2024-03-01 17:22:14 +05:30
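
The combined-size check described in commit 1510b44f03 above can be sketched roughly as follows. This is an illustration under stated assumptions, not the server code: the function names, the `(size, tags)` tuple shape and the `limits` mapping are all hypothetical.

```python
from collections import defaultdict


def combined_size_per_tag(volumes):
    """Sum volume sizes per tag.

    volumes: iterable of (size, tags) pairs, where tags are the tags of the
    volume's disk offering (hypothetical representation).
    """
    totals = defaultdict(int)
    for size, tags in volumes:
        for tag in set(tags):  # count each volume once per distinct tag
            totals[tag] += size
    return dict(totals)


def tags_over_limit(volumes, limits):
    """Return the sorted tags whose combined requested size exceeds the limit."""
    totals = combined_size_per_tag(volumes)
    return sorted(
        tag for tag, total in totals.items()
        if tag in limits and total > limits[tag]
    )
```

With volumes v1 (size 10, tags tag1 and tag2) and v2 (size 20, tag tag1), the total checked against the tag1 limit is 30, whereas checking each offering on its own would only ever see 10 or 20.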
Suresh Kumar Anaparti ae6d0fb2d6
Storage pool stats update (#383)
* Update PowerFlex storage stats on host connect (if any changes in capacity / used bytes)

* Sync the pool stats in DB with the actual stats from stats collector

* Updated capacityBytes check

* Revert "Updated capacityBytes check"

This reverts commit 3ffb17b2c4b3c794e5d0dbf4108d43255b4fbcca.

* Revert "Update PowerFlex storage stats on host connect (if any changes in capacity / used bytes)"

This reverts commit 9e473aed4c589b91f62cbe2fd135dc25e0adc1c3.
2024-02-29 15:26:00 +05:30
Harikrishna 747d1101c1
New API "checkVolume" to check and repair any leaks or repair all issues (#362)
* Introduced a new API "checkVolumeAndRepair" that allows users or admins to check for and repair any leaks observed.
Currently this is supported only for KVM.

* some fixes

* Added unit tests

* addressed review comments

* add repair volume while granting access

* Changed repair parameter to accept both leaks/all values

* Introduced new global setting volume.check.and.repair.before.use to do volume check and repair before VM start or volume attach operations

* Added volume check and repair changes only during VM start and volume attach operations

* Refactored the names to look similar across the code

* Some code fixes

* remove unused code

* Renamed repair values

* Addressed review comments

* code refactored

* used volume name in logs

* Changed the API to Async and the setting scope to storage pool

* Fixed exit value handling with check volume command

* Fixed storage scope to the setting

* Fixed volume format issues

* Refactored the log messages

* Fix formatting
2024-02-29 14:40:40 +05:30
anniejili 2df750c2f4
Fixed query param for getDomainReservation. (#388)
Co-authored-by: Annie Li <ji_li@apple.com>
2024-02-28 08:10:16 -07:00
Vishesh f30e07b312
Fix host stuck in connecting state (#375)
* Fix host stuck in connecting state (#8502)

There are a lot of test failures due to test_vm_life_cycle.py in multiple PRs due to host not available for migration of VMs.
#8438 (comment)
#8433 (comment)
#7344 (comment)

While debugging I noticed that the hosts get stuck in Connecting state because MS is waiting for a response of the ReadyCommand from the agent. Since we take a lock on connection and disconnection, restarting the agent doesn't work. To fix this, we have to restart the MS or wait for ~1 hour (default timeout).

On the agent side, it gets stuck waiting for a response from the Script execution.

To reproduce, run smoke/test_vm_life_cycle.py (TestSecuredVmMigration test class to be specific). Once the tests are complete, you will notice that some hosts are stuck in Connecting state. And restarting the agent fails due to the named lock. Locks on DB can be checked using the below query.

SELECT *
FROM performance_schema.metadata_locks
INNER JOIN performance_schema.threads ON THREAD_ID = OWNER_THREAD_ID
WHERE PROCESSLIST_ID <> CONNECTION_ID() \G;

This PR adds a wait for the ready command and a timeout to the Script execution to ensure that the thread doesn't get stuck and the named lock from database is released.

* Externalise a few timeouts & fix timeout for hostSupportsUefi in libvirt ready command wrapper (#8547)

This PR fixes a bug introduced in #8502. The timeout for script execution was set to 60 ms instead of 60 s, which resulted in the host not getting UEFI enabled. This is a blocker for the 4.19 release.

We do this by introducing a new agent parameter `agent.script.timeout` (default - 60 seconds) to use as a timeout for the script checking host's UEFI status.

We also externalize the timeout for the ReadyCommand by introducing a new global setting `ready.command.wait` (default - 60 seconds).

For ModifyStoragePoolCommand, we don't externalize the timeout, to avoid confusing the user, since the required timeout can vary depending on the provider in use and we are only setting the wait for the default host listener for now. Instead, we reuse the global `wait` setting divided by `5`, making the default value 6 minutes (1800/5 = 360s) for ModifyStoragePoolCommand.

Note: the actual time the MS waits is twice the wait set for a Command. See the reference code below.
19250403e6/engine/orchestration/src/main/java/com/cloud/agent/manager/AgentAttache.java (L406-L442)

* fixup
2024-02-21 13:44:53 +05:30
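
The timeout arithmetic in commit f30e07b312 above can be worked through in a short sketch. The function names are hypothetical; the numbers (global `wait` of 1800, division by 5, the MS waiting twice the command wait, 60 s versus 60 ms) come from the commit message. That the script timeout is expressed in milliseconds is an assumption inferred from the "60 ms instead of 60s" description.

```python
GLOBAL_WAIT_SECONDS = 1800  # value implied by "1800/5 = 360s" in the commit message


def modify_storage_pool_wait(global_wait=GLOBAL_WAIT_SECONDS):
    """Wait used for ModifyStoragePoolCommand: the global wait divided by 5."""
    return global_wait // 5


def effective_ms_wait(command_wait):
    """Per the note, the MS actually waits twice the wait set on a command."""
    return 2 * command_wait


def seconds_to_millis(seconds):
    """Convert seconds to milliseconds; skipping this step turns 60 s into 60 ms."""
    return seconds * 1000
```

Under these assumptions the ModifyStoragePoolCommand wait is 360 s, the management server may block for up to 720 s on it, and a 60-second script timeout has to be passed as 60000 when the callee expects milliseconds.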
Suresh Kumar Anaparti 89f93746ac
Storage plugin support to check if volume on datastore requires access for migration (#380)
* Check if volume on datastore requires access for migration, and grant/revoke volume access if required

* Updated default implementation for requiresAccessForMigration method in PrimaryDataStoreDriver
2024-02-20 11:32:32 -07:00
Vishesh 8b01c0aa62
Update VM's state if powerstate & state are not in sync (#368)
* Update VM's state if powerstate & state are not in sync

* Add unit tests

* some code improvements for instance / power state check

* Update power state after vm stop confirmation (as power state 'PowerOn' is kept after vm stop and not updated later)

* Reset the power state update counter before migrate, to allow power state sync to proper state / host

* Do not consider transitional states (Starting, Stopping) to check power state sync

* set powerstate to off for all vm types

---------

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-02-20 14:44:04 +05:30
anniejili 30d908c580
Added vm uuid as part of error response when vm create fails after vm entity is persisted. (#350)
* Added vm uuid as part of error response when vm create fails after vm entity is persisted

* Fixed styling issue

* Fixed styling issue.

* Fix unit tests

* Fixed merge conflicts.

* Fixed merge conflicts.

---------

Co-authored-by: Annie Li <ji_li@apple.com>
Co-authored-by: Harikrishna Patnala <harikrishna.patnala@gmail.com>
2024-02-09 00:02:02 +05:30
Abhishek Kumar 6a9cdedda4
api,server,ui: tagged resource limits (#327)
Introduces the concept of tagged resource limits. Limits can be enforced on accounts and domains for the deployment of entities for a tagged resource. Current tagged resource limits can be used for the following resource types,

Host limits

    user_vm
    cpu
    memory

Storage limits

    volume
    primary_storage

The following global settings can be used to specify the tags for which limits need to be enforced:

    Host: resource.limit.host.tags
    Storage: resource.limit.storage.tags

Option for specifying tagged resource limits and viewing tagged resource usage are made available in the UI.

Enhances use of templatetag for VM deployment and template creation

Adds an option to list disk offerings with a suitability flag for a virtual machine. A new parameter named virtualmachineid has been added to the listDiskOfferings API which, when passed, returns the suitableforvirtualmachine param in the response.
2024-02-07 17:35:15 +05:30
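
As a rough illustration of the new `virtualmachineid` parameter from commit 6a9cdedda4 above, the sketch below only builds the query string for a listDiskOfferings call. The `command` and `response` keys follow the usual CloudStack query convention but are not stated in the commit message, so treat them as assumptions; authentication, request signing and the endpoint URL are deliberately left out, and the function name is hypothetical.

```python
from urllib.parse import urlencode


def list_disk_offerings_query(vm_id):
    """Build the query string for listDiskOfferings with a VM id (unsigned)."""
    params = {
        "command": "listDiskOfferings",   # assumed query key, not in the commit
        "virtualmachineid": vm_id,        # parameter added by this commit
        "response": "json",               # assumed query key, not in the commit
    }
    return urlencode(params)
```

Each disk offering in the response to such a request would then be expected to carry the `suitableforvirtualmachine` field described in the commit message.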
Marcus Sorensen f49265c14c
Fix missing code from backport of 4.16 version of dom0 CPU reserve (#374)
Co-authored-by: Marcus Sorensen <mls@apple.com>
2024-02-05 11:53:45 +05:30
Suresh Kumar Anaparti b44710c8a9
Pass StoragePoolType object for poolType dao attribute - fixes conversion to DB column (#371) 2024-02-02 14:10:02 +05:30
Marcus Sorensen e610d2c54c
Fix libvirt domain event listener by properly processing events (#364)
Co-authored-by: Marcus Sorensen <mls@apple.com>
2024-01-29 14:46:20 +05:30
Suresh Kumar Anaparti 0201e0af95
Allocate new ROOT volume (on restore virtual machine operation) only when resource count increment succeeds (#367)
* Allocate new volume on restore virtual machine operation when resource count increment succeeds
- keep them in transaction, and fail operation if resource count increment fails

* Added some (negative) unit tests for restore vm
2024-01-29 14:43:24 +05:30
Marcus Sorensen 40dd867198
Apple base418 storagepooltype as class (#351)
* StoragePoolType as a class

* Fix agent side StoragePoolType enum to class

* Handle StoragePoolType for StoragePoolJoinVO

* Since StoragePoolType is a class, it cannot be converted by @Enumerated annotation.
Implemented converter class and logic to utilize @Convert annotation.

* Fix UserVMJoinVO for StoragePoolType

* fixed missing imports

* Since StoragePoolType is a class, it cannot be converted by @Enumerated annotation.
Implemented converter class and logic to utilize @Convert annotation.

* Fixed equals for the enum.

* removed not needed try/catch for prepareAttribute

* Added license to the file.

* Implemented "supportsPhysicalDiskCopy" for storage adaptor. (#352)

Co-authored-by: mprokopchuk <mprokopchuk@apple.com>

* Add javadoc to StoragePoolType class

* Add unit test for StoragePoolType comparisons

* StoragePoolType "==" and ".equals()" fix.

* Fix for abstract storage adaptor set up issue

* review comments

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: mprokopchuk <mprokopchuk@apple.com>
Co-authored-by: mprokopchuk <mprokopchuk@gmail.com>
Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-01-25 14:58:44 +05:30
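
The `"=="` versus `.equals()` item in commit 40dd867198 above reflects a general consequence of turning an enum into a class: two instances with the same name are no longer guaranteed to be the same object. The sketch below shows the idea in Python, where `is` plays the role of Java's `==` and `==` plays the role of `.equals()`. It is not the actual Java class; the registry, `value_of` and the string round trip are assumptions made for illustration.

```python
class StoragePoolType:
    """Enum-like value class: equality is by name, not by object identity."""

    _registry = {}

    def __init__(self, name):
        self.name = name
        # Keep the first instance seen for each name.
        StoragePoolType._registry.setdefault(name, self)

    @classmethod
    def value_of(cls, name):
        """Return the registered instance for name, creating one if needed."""
        existing = cls._registry.get(name)
        return existing if existing is not None else cls(name)

    def __eq__(self, other):
        return isinstance(other, StoragePoolType) and self.name == other.name

    def __hash__(self):
        return hash(self.name)

    def __str__(self):
        # The string form is what a DB converter would store in the column.
        return self.name
```

Here `StoragePoolType("NetworkFilesystem")` and `StoragePoolType.value_of("NetworkFilesystem")` can be distinct objects, so identity comparison fails while value comparison succeeds, which is the kind of discrepancy an equality fix has to account for.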
Vishesh 47e53eceed
E2e test resource listing (#363)
* Use dualzones for ci github actions

* Update advdualzone.cfg to be similar to advanced.cfg & fixup test_metrics_api.py

* Add e2e tests for listing of accounts, disk_offerings, domains, hosts, service_offerings, storage_pools, volumes

* Add test for listing volumes with tags filter

* Add check for existing volumes in test_list_volumes

* Wait for volumes to be deleted on cleanup

* Filter out volumes in Destroy state before checking the count of volumes
2024-01-25 14:52:06 +05:30
Suresh Kumar Anaparti 7fef155621
Remove sensitive params (VmPassword, etc) from VMWork log (#369)
* Remove sensitive params (VmPassword, etc) from VMWork log

* Added unit tests

* review comments
2024-01-24 17:49:20 +05:30
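
Dropping sensitive parameters before logging, as commit 7fef155621 above describes, might look roughly like this. The key list and the function name are hypothetical; only `VmPassword` is named in the commit message.

```python
# Hypothetical list of keys to drop; only "VmPassword" appears in the commit message.
SENSITIVE_KEYS = frozenset({"vmpassword", "password"})


def strip_sensitive(params, sensitive=SENSITIVE_KEYS):
    """Return a copy of params without sensitive keys (case-insensitive match).

    The input mapping is left untouched so the real values stay available
    to the code that needs them; only the logged copy is reduced.
    """
    return {k: v for k, v in params.items() if k.lower() not in sensitive}
```

A caller would log `strip_sensitive(params)` instead of `params`, so the password never reaches the log file.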
Suresh Kumar Anaparti e704b6e492
Fix reorder/list pools when cluster details are not set (#358)
* Fix reorder/list pools when cluster details are not set

* minor code improvements

* added unit tests
2024-01-18 15:29:00 +05:30
kishankavala 99939d22a7
CleanUp Async Jobs after mgmt server maintenance (#356)
* Cleanup Volume AsyncJob after mgmt server stop

* Clean Up Vm async job resources during mgmt server stop

* Use State.isTransitional method to identify transition states

* Add cleanup for Network Async Job

* Add license

* Added RevertSnapshotting to volume transition state. Fixed spacing code style

* Added transitional flag in Volume state

* Updated network event for failed job, (re)added cleanup for volumes created from snapshots, and some code improvements

* Added java doc for volume state constructor

* Fixed cleanup SNAPSHOT_ID entry in volume details for failed volumes created from snapshots

---------

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-01-09 17:54:26 +05:30
Suresh Kumar Anaparti 6f4cf60fab
Updated jetty maxFormContentSize value to 1048576 bytes (default is 200000 bytes), to support user data up to 1048576 bytes (#360)
* Updated jetty maxFormContentSize value to 1048576 bytes (default is 200000 bytes)

* Updated content size config using 'max.form.content.size' parameter in server.properties

* Updated content size config parameter (in server.properties) to 'request.content.size'
2024-01-03 18:04:11 +05:30
Harikrishna 28be74e0b9
Add lock mechanism considering template id, pool id, host id (#345)
* Add lock mechanism considering template id, pool id, host id

* Added missing lock
2023-12-06 13:51:25 +05:30
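
A lock that "considers template id, pool id, host id", as in commit 28be74e0b9 above, can be sketched as a lock keyed by that triple, so that work on different combinations does not serialize. This is an in-process illustration with hypothetical names; the commit message does not say how the actual lock is implemented or what operation it guards.

```python
import threading
from contextlib import contextmanager

_locks = {}
_locks_guard = threading.Lock()  # protects the _locks dictionary itself


def lock_for(template_id, pool_id, host_id):
    """Return the lock for this (template, pool, host) triple, creating it once."""
    key = (template_id, pool_id, host_id)
    with _locks_guard:
        return _locks.setdefault(key, threading.Lock())


@contextmanager
def keyed_lock(template_id, pool_id, host_id):
    """Hold the triple's lock for the duration of the with-block."""
    lock = lock_for(template_id, pool_id, host_id)
    with lock:
        yield
```

Two callers using the same triple contend for one lock, while a caller with a different host id obtains a separate lock and proceeds independently.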
anniejili af4e657aee
Clear pool id if volume in allocated state (#341)
Co-authored-by: Annie Li <ji_li@apple.com>
2023-11-21 15:42:43 +05:30
Vishesh 63a4efa4c9
Use UserVmDao for listVirtualMachines API to increase performance (#343) 2023-11-10 13:08:30 +05:30
Wei Zhou c32d2fa990 CKS: fix wrong format of cluster size on UI (#8182)
(cherry picked from commit e6f048bc2e)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-11-07 21:00:23 +05:30
Rohit Yadav 7260204447 ui: Admin, account and project dashboard improvements
This backports only the dashboard changes from
https://github.com/apache/cloudstack/pull/7956 and
5d9ae31f1b

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
(cherry picked from commit 3376f94886)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-11-07 20:59:10 +05:30
Vishesh b9c3752ce0
Fix: Select another pod if all hosts in the pod become unavailable (#339) 2023-11-07 15:11:21 +01:00
Vishesh a7c7a33131
Apple base418 agent lock during reconnect (#340)
Co-authored-by: Marcus Sorensen <mls@apple.com>
2023-11-03 16:56:15 +01:00
Abhishek Kumar 42131fdd16 ui: fix bulk delete template from zones (#8118)
Fixes #8083

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
(cherry picked from commit e199678101)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-10-26 12:21:07 +05:30
Abhishek Kumar bf3dff2f57 marvin,test: fix directdownload template checksum test (#8096)
* marvin,test: fix directdownload template checksum test

During failure while deploying a VM with wrong checksum template, VM may be left in Error state. This PR adds code to delete such VM.

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* remove unnecessary logs

---------

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
(cherry picked from commit a2ec1f3777)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-10-26 12:21:02 +05:30
Abhishek Kumar 0c96202c7d ui: correctly show volume physical size (#8119)
Fixes #8073

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
(cherry picked from commit f62b634033)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-10-26 12:20:57 +05:30
Peinthor Rene 8875a94242 linstor: fix template copy on non hyperconverged setups (#8114)
Making a diskful resource was meant as an optimization,
but cannot work on non hyperconverged setups,
as the storage nodes (diskful) are not part of the cloudstack cluster.

(cherry picked from commit 67cb9b9e40)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-10-26 12:20:51 +05:30
Abhishek Kumar f4b9e6c988 test: add test for standalone snapshot (#8104)
Fixes #8034

Adds the following test for a backed-up snapshot (original template and VM deleted beforehand):
- Create volume from snapshot
- Create a template from the snapshot and deploy a VM using it

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
(cherry picked from commit 540c7b802f)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-10-26 12:20:43 +05:30
Harikrishna 198e48c7c5 Fix VM snapshot size during storage capacity check (#8101)
(cherry picked from commit 0183e25279)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-10-26 12:20:37 +05:30
Harikrishna 4c7a81bd82 Fix UUID for child datastores in all cases (#8057)
(cherry picked from commit 76ab621a5a)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-10-26 12:20:32 +05:30
Peinthor Rene d49b0c4253 linstor: Fix template volume missing on copy node (#8082)
A TODO was overlooked and never implemented,
which could trigger the following bug:

If Linstor didn't create a resource (diskless or diskful) on
the node chosen by cloudstack, it would not be able to copy the
template data there; it even seems no error was
triggered and the new template file silently just became
empty/corrupt.

(cherry picked from commit 4a86a0d233)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-10-26 12:20:28 +05:30
Abhishek Kumar 03fa5799e6 test,refactor: fix test_project_resources cleanup (#8097)
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
(cherry picked from commit 065abe2a3b)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-10-26 12:20:20 +05:30
Abhishek Kumar 6f925f0022 kvm: fix direct download template size (#8093)
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
(cherry picked from commit ba24a18f27)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-10-26 12:20:11 +05:30
Harikrishna d1849a4033 Fix NPE if global setting implicit.host.tags is set to null (#8066)
(cherry picked from commit fb3a2ecb57)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-10-26 12:19:21 +05:30
slavkap a768c96a6d Create snapshot from VM snapshot without memory for NFS/Local storage (#8117)
(cherry picked from commit 6ae3b73ca2)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-10-26 12:18:56 +05:30
Nicolas Vazquez 3e2717424d Address review comments (#338)
This adds the missing commit to the fix #335 from the upstream PR:
apache/cloudstack#7977

(cherry picked from commit b5f77f9c3b53af7e8b05730da9807a2c9eb017a5)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-10-12 14:20:00 +05:30
Harikrishna 84fee7b896 ui: Fix non admin logouts (#8065)
If a non-admin user logs out from a session, the login page does not load completely. A few starter APIs such as listIds fail and show an unauthorised access notification on the login page. Also, if SAML is enabled, it does not get enabled because the corresponding APIs fail. The user needs to refresh the browser to recover.

(cherry picked from commit 8b281284a2)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-10-11 21:03:09 +05:30
Wei Zhou f570934482 .github: run Sonar Check only on PRs from apache/cloudstack branches (#8058)
This PR fixes #8050

(cherry picked from commit 864a195868)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-10-11 21:02:57 +05:30
Harikrishna db54a09860 Default value of force should be false for template delete operation (#7731)
* default value of force should be false

* Added force flag in tests

(cherry picked from commit a9f3af85cb)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-10-11 21:02:48 +05:30
Wei Zhou 846cc2f26c systemvm: remove config in /etc/pam.d/systemd-user to fix user@0.service (#8048)
the service `user@0.service` fails in system vms and virtual routers

This PR removes a change to fix memory leak of SSH connections in the systemvm templates with old linux kernel.

(cherry picked from commit e290ac5451)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-10-11 21:02:44 +05:30
Rohit Yadav 18e7276df0 storage: allow VM snapshots without memory for KVM when global setting allows (#8062)
This removes the conditional logic where a comment notes to remove it
after PR #5297 is merged that is applicable for ACS 4.18+. Only when the
global setting is enabled and memory isn't selected, VM snapshot could
be allowed for VMs on KVM that have qemu-guest-agent running.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
(cherry picked from commit 8350ce5aa4)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-10-11 21:01:31 +05:30