Commit Graph

4515 Commits

Author SHA1 Message Date
Pearl Dsilva 79b1490c9c Merge branch 'nsx-integration' of https://github.com/apache/cloudstack into nsx-reorder-acl 2024-02-06 12:56:32 -05:00
Pearl Dsilva 51a740c8dd fix reordering of acl rules on all networks that it is associated to 2024-02-06 12:45:53 -05:00
Pearl Dsilva 8d48e305a5 [WIP] NSX: Add support to re-order ACL rules (NSX FW rules) 2024-02-06 12:45:38 -05:00
nvazquez 09636cfcc8
Fix NSX plugin pom XML 2024-02-03 18:16:37 -03:00
nvazquez d5efb869fd
Merge branch 'main' into nsx-integration 2024-02-03 17:37:25 -03:00
nvazquez 8021f0cf3c
Fix unit tests 2024-02-03 17:15:15 -03:00
Pearl Dsilva ba77dbd56e
NSX: Fix ACL rule removal on replacement and fix rule order (#11) 2024-02-03 17:15:05 -03:00
nvazquez aac547b769
Fix unit test 2024-02-03 17:14:51 -03:00
Pearl Dsilva 7c6c9e62ec
NSX: Improve NSX resource cleanup process (#3) 2024-02-03 17:14:46 -03:00
Pearl Dsilva 9313d39315
Nsx: Support internal LB (#4)
* NSX: Support internal LB service in NSX

* add lb removal logic

* Fix UI issue hiding internal LB tab

* Refactor method name

---------

Co-authored-by: nvazquez <nicovazquez90@gmail.com>
2024-02-03 17:14:39 -03:00
Pearl Dsilva 8beaa44895
Nsx vpc routed mode (#5)
* NSX: Fix VPC routed mode

* NSX: VPC route mode

* remove unnecessary changes
2024-02-03 17:14:29 -03:00
Abhishek Kumar 7dffbc6e47 Updating pom.xml version numbers for release 4.20.0.0-SNAPSHOT
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-02-02 18:16:37 +05:30
Abhishek Kumar cf0d436fc8 Merge remote-tracking branch 'apache/4.19' into main 2024-02-02 18:15:21 +05:30
Abhishek Kumar a7b97ff3b0 Updating pom.xml version numbers for release 4.19.1.0-SNAPSHOT
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-02-02 18:06:04 +05:30
Lucas Martins 39e0a8e8d4
Change Cryptsetup validation (#8482)
Co-authored-by: lucas.martins.scclouds <lucas.martins@scclouds.com.br>
2024-01-31 10:23:53 -03:00
Abhishek Kumar 2746225b99 Updating pom.xml version numbers for release 4.19.0.0
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-01-29 10:21:52 +05:30
Vishesh fedcf66de0
Externalise a few timeouts & fix timeout for hostSupportsUefi in libvirt ready command wrapper (#8547)
This PR fixes bug introduced in #8502. Timeout for script execution was set to 60 ms instead of 60s which resulted in host not getting UEFI enabled. This is a blocker for 4.19 release.

We do this by introducing a new agent parameter `agent.script.timeout` (default - 60 seconds) to use as a timeout for the script checking host's UEFI status.

We also externalize the timeout for the ReadyCommand by introducing a new global setting `ready.command.wait` (default - 60 seconds).

For ModifyStoragePoolCommand, we don't externalize the timeout to avoid confusion for the user. Since, the required timeout can vary depending on the provider in use and we are only setting the wait for default host listener for now. Instead, we reuse the global `wait` setting by dividing it by `5` making the default value of 6 minutes (1800/5 = 360s) for ModifyStoragePoolCommand.

Note: the actual time, the MS waits is twice the wait set for a Command. Check reference code below.
19250403e6/engine/orchestration/src/main/java/com/cloud/agent/manager/AgentAttache.java (L406-L442)
2024-01-27 23:36:13 +05:30
Pearl Dsilva 5a4f38c2fc
NSX: Add retry logic with sleep to delete segments (#8554)
* NSX: Add retry logic with sleep to delete segments

* add logs
2024-01-23 11:36:20 -03:00
Pearl Dsilva 80365c8333
NSX: Fix Routed Mode for Isolated and VPC networks (#8534)
* NSX: Fix Routed Mode for Isolated and VPC networks

* NSX: Fix Routed mode - add checks for ports added for FW rules

* clean up code

* fix build failure
2024-01-23 08:13:24 -05:00
Pearl Dsilva 19ae12a05a
NSX: Add passive monitor for NSX LB to test whether a server is available (#8533)
* NSX: Add passive monitor for NSX LB to test whether a server is available

* Add active monitors too

* fix build failure
2024-01-21 22:18:05 -03:00
Nicolas Vazquez f01bb5d440
NSX: Improve segment deletion process (#8538) 2024-01-19 16:59:05 -03:00
Pearl Dsilva 330c99ca57 fix test failure 2024-01-19 12:53:23 -05:00
nvazquez 33fc9d8443
Merge branch 'main' into nsx-integration 2024-01-19 12:56:42 -03:00
Pearl Dsilva 080f171c6d
NSX: Cleanup NSX resources during k8s cluster cleanup (#8528) 2024-01-19 12:48:08 -03:00
Nicolas Vazquez 8d42ca8ccf
Use project version on pom dependencies (#8529)
This PR fixes the POM dependencies from a hardcoded value to the project.version property on dependencies
2024-01-18 20:16:06 +05:30
Vishesh c3b77cb7b8
Fix host stuck in connecting state (#8502)
There are a lot of test failures due to test_vm_life_cycle.py in multiple PRs due to host not available for migration of VMs.
#8438 (comment)
#8433 (comment)
#7344 (comment)

While debugging I noticed that the hosts get stuck in Connecting state because MS is waiting for a response of the ReadyCommand from the agent. Since we take a lock on connection and disconnection, restarting the agent doesn't work. To fix this, we have to restart the MS or wait for ~1 hour (default timeout).

On the agent side, it gets stuck waiting for a response from the Script execution.

To reproduce, run smoke/test_vm_life_cycle.py (TestSecuredVmMigration test class to be specific). Once the tests are complete, you will notice that some hosts are stuck in Connecting state. And restarting the agent fails due to the named lock. Locks on DB can be checked using the below query.

SELECT *
FROM performance_schema.metadata_locks
INNER JOIN performance_schema.threads ON THREAD_ID = OWNER_THREAD_ID
WHERE PROCESSLIST_ID <> CONNECTION_ID() \G;

This PR adds a wait for the ready command and a timeout to the Script execution to ensure that the thread doesn't get stuck and the named lock from database is released.
2024-01-15 13:56:34 +05:30
nvazquez 2b05dd93a1
Merge branch 'main' into nsx-integration 2024-01-12 14:13:53 -03:00
Pearl Dsilva b7af40413b
CKS: Add action to during firewall rule creation (#8498) 2024-01-12 14:07:32 -03:00
Nicolas Vazquez a3a4833c3e
Fixes for KVM unmanaged instances import on advanced network and VNC password (#8492)
This PR fixes a regression caused by #8465 on advanced zones, import fails with:

2024-01-10 12:13:33,234 DEBUG [o.a.c.e.o.NetworkOrchestrator] (API-Job-Executor-3:ctx-991bbe9f job-128 ctx-f49517d4) (logid:d7b8e716) Allocating nic for vm 142272e8-9e2e-407b-9d7e-e9a03b81653c in network Network {"id": 204, "name": "Isolated", "uuid": "9679fac5-e3ac-4694-a57b-beb635340f39", "networkofferingid": 10} during import
2024-01-10 12:13:33,239 ERROR [o.a.c.v.UnmanagedVMsManagerImpl] (API-Job-Executor-3:ctx-991bbe9f job-128 ctx-f49517d4) (logid:d7b8e716) Failed to import NICs while importing vm: i-2-31-VM
com.cloud.exception.InsufficientVirtualNetworkCapacityException: Unable to acquire Guest IP  address for network Network {"id": 204, "name": "Isolated", "uuid": "9679fac5-e3ac-4694-a57b-beb635340f39", "networkofferingid": 10}Scope=interface com.cloud.dc.DataCenter; id=1
	at org.apache.cloudstack.engine.orchestration.NetworkOrchestrator.importNic(NetworkOrchestrator.java:4582)
	at org.apache.cloudstack.vm.UnmanagedVMsManagerImpl.importNic(UnmanagedVMsManagerImpl.java:859)
	at org.apache.cloudstack.vm.UnmanagedVMsManagerImpl.importVirtualMachineInternal(UnmanagedVMsManagerImpl.java:1198)
	at org.apache.cloudstack.vm.UnmanagedVMsManagerImpl.importUnmanagedInstanceFromHypervisor(UnmanagedVMsManagerImpl.java:1511)
	at org.apache.cloudstack.vm.UnmanagedVMsManagerImpl.baseImportInstance(UnmanagedVMsManagerImpl.java:1342)
	at org.apache.cloudstack.vm.UnmanagedVMsManagerImpl.importUnmanagedInstance(UnmanagedVMsManagerImpl.java:1282)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

Also, addresses the VNC password field set instead of a fixed string
2024-01-12 14:14:01 +05:30
Nicolas Vazquez 59e78cbc45
Fix KVM unmanage disks path (#8483)
This PR fixes the volumes path on KVM import unmanaged instances

Fixes: #8479
2024-01-11 14:45:57 +05:30
Vishesh 4f40eae1c4
DRS: Use free metrics insteado of used for computation (#8458)
This PR makes changes to use cluster's free metrics instead of used while computing imbalance for the cluster. This allows DRS to run for clusters where hosts doesn't have the same amount of metrics.
2024-01-10 17:52:46 +05:30
slavkap c569fe9119
Fix KVM import and list unmanaged VMs (#8445)
VM import fixes

1 - Fix of VM insert for VMs with StorPool volumes
2 - Fix of list/insert unmanaged VMs with RBD volumes
2024-01-10 13:12:07 +05:30
Abhishek Kumar d6ac91f2df
minio: fix store user creation (#8425)
To prevent errors during multi-user access, use account UUID to create/access user on the provider side. Also, update the existing secret key for a user that already exists.
2024-01-09 17:44:11 +05:30
Pearl Dsilva 68da68c09d
NSX: Fix code smells (#8436)
* NSX: Fix code smells

* Add changes to service creation logic
2024-01-08 17:50:45 -03:00
Nicolas Vazquez 886c071a6c
[NSX] Add more unit tests (#8431)
* [NSX] Add more unit tests

* More tests

* Fix build errors
2024-01-02 21:57:49 -03:00
Pearl Dsilva 4ce7f64ebd
NSX: Fix code smells and reported bugs (#8409)
* NSX: Fix code smells and reported bugs

* fox override issue

* remove unused imports

* fix test

* refactor code to reduce complexity

* add lisence

* cleanup

* fix build failure

* fix build failure

* address comments

* test - add config to ignore certain files from test coverage

* test exclusion of classes from test cov

* rever pom changes
2024-01-02 14:46:08 -03:00
Pearl Dsilva 7fa33a0831
NSX: Add more unit tests (#8381)
* NSX : Unit tests

* remove unused imports

* remove unused import causing build failure

* fix build failures due to unused imports

* fix build failure

* fix test assertion

* remove unused imports

* remove unused import
2023-12-28 18:33:43 -03:00
Abhishek Kumar 2253a33c1e Merge remote-tracking branch 'apache/4.18' 2023-12-20 08:58:30 +05:30
Wei Zhou ab70108f15
CKS: create Security Groups for CKS clusters of each account (#8316)
This PR fixes #7684

The security groups contain the same rules for port 22 and 6443, no need to recreate for each CKS cluster.
2023-12-20 08:57:27 +05:30
Pearl Dsilva 2b896a3a21 fix security hotspots 2023-12-18 11:08:39 -05:00
Pearl Dsilva 7cabc66f1c Merge branch 'main' of https://github.com/apache/cloudstack into nsx-integration 2023-12-18 10:02:05 -05:00
Pearl Dsilva 7288ac458f
NSX: Add unit tests to increase coverage (#8355)
* NSX: Add unit tests

* cleanup unused imports

* add more unit tests

* add tests for publicnsxnetworkguru

* add license

* fix build failures

* address sonar comment
2023-12-18 09:02:47 -05:00
John Bampton dda672503f
Remove unneeded duplicate words (#8358)
This PR removes some unneeded duplicate words.
2023-12-15 17:13:32 +05:30
Nicolas Vazquez 4457c62ad3
[NSX] Address SonarCloud Bugs (#8341)
* [NSX] Address SonarCloud Bugs

* Fix NSX API connection issues
2023-12-14 09:38:24 -03:00
nvazquez 7814829a48
Merge branch 'main' into nsx-integration 2023-12-14 09:37:14 -03:00
kishankavala ab20b1220f
KVM Ingestion - Import Instance (#7976)
This PR adds new functionality to import KVM instances from an external host or from disk images in local or shared storage.
Doc PR: https://github.com/apache/cloudstack-documentation/pull/356
2023-12-14 13:08:56 +05:30
Abhishek Kumar 82f7abddb3 Merge remote-tracking branch 'apache/4.18' 2023-12-13 11:24:15 +05:30
Bryan Lima 3bb318bab9
kvm: Add support for cgroupv2 (#8252)
1. Problem description

In Apache CloudStack (ACS), when a VM is deployed in a host with the KVM hypervisor, an XML file is created in the assigned host, which has a property shares that defines the weight of the VM to access the host CPU. The value of this property has no unit, and it is a relative measure to calculate how much CPU a given VM will have in the host. However, this value has a limit, which depends on the version of cgroup utilized by the host's kernel. The problem lies at the range value of shares that varies between both versions: [2, 264144] for cgroups version 1; and [1, 10000] for cgroups version 2. Currently, ACS calculates the value of shares using Equation 1, presented below, where CPU is the number of cores and speed is the CPU frequency; both specified in the VM's compute offering. Therefore, if a compute offering has, for example, 6 cores at 2 GHz, the shares value will be 12000 and an exception will be thrown by libvirt if the host utilizes cgroup v2. The second version is becoming the default one in current Linux distributions; thus, it is necessary to address this limitation.

    Equation 1
    shares = CPU * speed

Fixes: #6744
2. Proposed changes

To address the problem described, we propose to apply a scale conversion considering the max shares of the host. Using the same formula currently utilized by ACS, it is possible to calculate the maximum shares of a VM for a given host. In other words, using the number of cores and the nominal speed of the host's CPU as the upper limit of shares allowed to a VM. Then, this value will be scaled to the allowed interval of [1, 10000] of cgroup v2 by using a linear scale conversion.

The VM shares would be calculated as Equation 2, presented below, where VM requested shares is the requested shares value calculated using Equation 1, cgroup upper limit is fixed with a value of 10000 (cgroups v2 upper limit), and host max shares is the maximum shares value of the host, calculated using Equation 1. Using Equation 2, the only case where a VM passes the cgroup v2 limit is when the user requests more resources than the host has, which is not possible with the current implementation of ACS.

    Equation 2
    shares = (VM requested shares * cgroup upper limit)/host max shares

To implement the proposal, the following APIs will be updated: deployVirtualMachine, migrateVirtualMachine and scaleVirtualMachine. When a VM is being deployed, a new verification will be added to find a suitable host. The max shares of each host will be calculated, and the VM calculated shares will be verified if it does not surpass the host's value. Likewise, the migration of VMs will have a similar new verification. Lastly, the scale of VMs will also have the same verification for the VM's host.

To determine the max shares of a given host, we will use the same equation currently used in ACS for calculating the shares of VMs, presented in Section 1. When Equation 1 is used to determine the maximum shares of a host, CPU is the number of cores of the host, and speed is the nominal CPU speed, i.e., considering the CPU's base frequency.

It is important to note that these changes are only for hosts with the KVM hypervisor using cgroup v2 for now.
2023-12-13 10:51:24 +05:30
Nicolas Vazquez 27a3d61729
Fix unmanage VM marvin tests and small UI fixes for import (#8338)
This PR fixes the failing smoke test for test_vm_lifecycle_unmanage_import.py for Vmware and adds a small UI fix on the import wizard
2023-12-13 10:25:05 +05:30
Abhishek Kumar 080a5aee00 Merge remote-tracking branch 'apache/4.18' 2023-12-12 17:01:52 +05:30