Commit Graph

1278 Commits

Author SHA1 Message Date
Harikrishna Patnala dbda673e1f Updating pom.xml version numbers for release 4.23.0.0-SNAPSHOT
Signed-off-by: Harikrishna Patnala <harikrishna.patnala@gmail.com>
2025-11-05 16:54:39 +05:30
Harikrishna Patnala d160731b9f Updating pom.xml version numbers for release 4.22.1.0-SNAPSHOT
Signed-off-by: Harikrishna Patnala <harikrishna.patnala@gmail.com>
2025-11-05 16:07:07 +05:30
Harikrishna Patnala 71f47d6130 Updating pom.xml version numbers for release 4.22.0.0
Signed-off-by: Harikrishna Patnala <harikrishna.patnala@gmail.com>
2025-10-30 19:23:56 +05:30
Wei Zhou e333ce9782
Updating pom.xml version numbers for release 4.20.3.0-SNAPSHOT 2025-10-24 09:13:19 +02:00
Wei Zhou 4dc3931233
Updating pom.xml version numbers for release 4.20.2.0
Signed-off-by: Wei Zhou <weizhou@apache.org>
2025-10-16 11:42:56 +02:00
Harikrishna Patnala 8b9f5fd8f9 Merge branch '4.20' 2025-10-16 13:39:40 +05:30
Wei Zhou 86cad79c15
importvm: fix IP address allocation on Shared networks (#11811) 2025-10-13 08:16:46 +02:00
Suresh Kumar Anaparti f67b738eb3
Migrate volume improvements, to bypass secondary storage when copy volume between pools is allowed directly (#11625)
* Migrate volume improvements, to bypass secondary storage when copy volume between pools is allowed directly

* Bypass secondary storage for copy volume between zone-wide pools and
- local storage on host in the same zone
- cluser-wide pools in the same zone

* Bypass secondary storage for volumes on ceph/rdb pool when the scope permits

* Fix dest disk format while migrating volume from ceph/rbd to nfs, and some code improvements

* unit tests

* Update suitable disk offering(s) for volume(s) after migrate VM with volumes when change in pool type (shared or local)

Currently, Migrate VM with volume(s) bypasses the service and disk offerings of the volumes, as the target pools for migration are specified,
which ignores the offerings. Offering change is required when pool type (shared or local) is changed, mainly
- when volume on shared pool is migrated to local pool
- when volume on local pool is migrated to shared pool

* Update with proper message while migrate volume when target pool and offering type mismatches (both are not shared/local)

* Consider host scope first during endpoint selection while copying between primary storages

* Update disk offering count (for listDiskOfferings api) while removing offerings with tags mismatch with storage tags
2025-10-09 16:00:46 +05:30
Vishesh d2615bb142
Add support for providing userdata to system VMs (#11654)
This PR adds support for specifying user data (cloud-init) for system VMs via Zone Scoped global settings. This allows the operators to customize the System VMs and setup monitoring, logging or execute any custom commands.

We set the user data from the global setting in /var/cache/cloud/cmdline, and use the NoCloud datasource to process user data. cloud-init service is still disabled in the system VMs and it's executed as part of the cloud-postinit service which executes the postinit.sh script.

Added global settings:
systemvm.userdata.enabled - Disabled by default. Needs to be enabled to utilize the feature.
console.proxy.vm.userdata - UUID of the User data to be used for Console Proxy
secstorage.vm.userdata - UUID of the User data to be used for Secondary Storage VM
virtual.router.userdata - UUID of the User data to be used for Virtual Routers
2025-10-08 10:44:26 +05:30
Manoj Kumar 9bcd98876d
Make kvm domain persistent when unmanaged from CS (#11541)
CS creates transient KVM domain.xml. When instance is unmanaged from CS, explicit dump of domain has to be taken to manage is outside of CS.

With this PR

    domainXML gets backed up and becomes persistent for further management of Instance.
    Stopped instance also can be unmanaged, last host for instance is considered for defining domain
    hostid param is supported in unmanageVirtualMachine API for KVM hypervisor and for stopped Instances
    hostid field in response of unmanageVirtualMachine, representing host used for unmanage operation
    Disable unmanaging instance with config drive, can unmanage from API using forced=true param for KVM
2025-10-07 10:32:33 +05:30
Abhisar Sinha 23c9e83047
Create Instance from backup on another Zone (DRaaS use case) (#11560)
* draas initial changes

* Added option to enable disaster recovery on a backup respository. Added UpdateBackupRepositoryCmd api.

* Added timeout for mount operation in backup restore configurable via global setting

* Addressed review comments

* fix for simulator test failures

* Added UT for coverage

* Fix create instance from backup ui for other providers

* Added events to add/update backup repository

* Fix race in fetchZones

* One more fix in fetchZones in DeployVMFromBackup.vue

* Fix zone selection in createNetwork via Create Instance from backup form.

* Allow template/iso selection in create instance from backup ui

* rename draasenabled to crosszoneinstancecreation

* Added Cross-zone instance creation in test_backup_recovery_nas.py

* Added UT in BackupManagerTest and UserVmManagerImplTest

* Integration test added for Cross-zone instance creation in test_backup_recovery_nas.py
2025-09-25 13:28:29 +05:30
dahn aca8732102
[router] make a distinction between fatal errors, warnings and unknown as healthcheck result (#10710)
* [routers] distiction between fatal failure and warning or unknown on healthchecks

* UI status for router health checks

* status from scripts varied

* automation signalled errors

* revert removal of update sql

* upgradeversion

* move config item and further cleanup

* handling services better

* backwards compatible response

---------

Co-authored-by: Daan Hoogland <dahn@apache.org>
2025-09-22 11:39:05 +05:30
vishesh92 ada750e391
Merge branch '4.20' 2025-09-17 14:26:06 +05:30
Wei Zhou 7c7497c624
Merge remote-tracking branch 'apache/4.19' into 4.20 2025-09-15 10:19:27 +02:00
Suresh Kumar Anaparti 6d16ac2113
ScaleIO/PowerFlex smoke tests improvements, and some fixes (#11554)
* ScaleIO/PowerFlex smoke tests improvements, and some fixes

* Fix test_volumes.py, encrypted volume size check (for powerflex volumes)

* Fix test_over_provisioning.py (over provisioning supported for powerflex)

* Update vm snapshot tests

* Update volume size delta in primary storage resource count for user vm volumes only
The VR volumes resource count for PowerFlex volumes is updated here, resulting in resource count discrepancy
(which is re-calculated through ResourceCountCheckTask later, and skips the VR volumes)

* Fix test_import_unmanage_volumes.py (unsupported for powerflex)

* Fix test_sharedfs_lifecycle.py (volume size check for powerflex)

* Update powerflex.connect.on.demand config default to true
2025-09-12 16:17:20 +02:00
Wei Zhou 70a4503ea1
Merge remote-tracking branch 'apache/4.20' 2025-09-11 14:04:52 +02:00
Nicolas Vazquez 036fd00170
kvm: Fix NPE in case host UEFI detail is not set on agent connection (#11610) 2025-09-11 10:40:08 +02:00
Vitor Hugo Homem Marzarotto 2e113e5ed7
Change log level of AgentHandler#processRequest() (#10869)
Co-authored-by: Vitor Hugo Homem Marzarotto <vitor.marzarotto@scclouds.com.br>
2025-09-10 11:06:16 +02:00
Suresh Kumar Anaparti 1033be4b31
Updating pom.xml version numbers for release 4.22.0.0-SNAPSHOT
Signed-off-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
2025-08-28 12:00:42 +05:30
Suresh Kumar Anaparti f9513b47bf
Updating pom.xml version numbers for release 4.21.0.0
Signed-off-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
2025-08-22 11:42:37 +05:30
slavkap 1272b13087
Fix of create a template from a StorPool snapshot on another zone (#11490)
* Fix of create template from snapshot on another zone

When a snapshot has a copy on StorPool primary storage in another zone, but the original snapshot resides on secondary storage, creating a template from the copied snapshot results in the template being created in the first zone.
If the snapshot.backup.to.secondary setting is disabled, and a user creates a volume or template from a snapshot, the snapshot is temporarily backed up to secondary storage during the operation. After the operation, this backup should be deleted. However, the snapshot currently remains on both primary and secondary storage.

* update snapshot info depending on the data store role
2025-08-22 00:53:55 +05:30
Suresh Kumar Anaparti e0bc8c3b1a
Merge branch '4.20' 2025-08-21 18:35:34 +05:30
Pearl Dsilva 6e59f4f4cc
Fix deployment of CKS clusters in Basic zone (#11457) 2025-08-21 18:32:01 +05:30
Suresh Kumar Anaparti f671461d4c
Fix for create template from snapshot (for snapshots on primary storage and storage doesn't support create snapshot to template directly) (#11452)
* Fix for create template from snapshot

* code improvements, for create volume from snapshot
2025-08-15 22:17:03 +05:30
Suresh Kumar Anaparti 4c3f29de1e
Agent manager connection handling improvements (#11376)
* Agent manager connection handling improvements

* Fix to send LB check interval in ready command
2025-08-05 15:07:02 +05:30
slavkap e5f61164b3
Support of snapshot copy to primary storage in different zones. (#9478)
* Support of snapshot copy to different StorPool primary storage between zones
2025-08-04 16:35:16 +05:30
Abhisar Sinha a87c5c2b3a
Create new Instance from VM backup (#10140)
This feature adds the ability to create a new instance from a VM backup for dummy, NAS and Veeam backup providers. It works even if the original instance used to create the backup was expunged or unmanaged. There are two parts to this functionality:
Saving all configuration details that the VM had at the time of taking the backup. And using them to create an instance from backup.
Enabling a user to expunge/unmanage an instance that has backups.
2025-07-31 15:47:22 +05:30
Vishesh f6ad184ea2
Feature: Add support for GPU with KVM hosts (#11143)
This PR allows attaching of GPU devices via PCI, mdev or VF to an Instance for KVM.

It allows the operator to discover the GPU devices on the KVM host and create a Compute Offering with GPU support based on the available GPU devices on the host. Once the operator has created the Compute offering, it can be used by users to launch Instances with GPU devices.
2025-07-29 13:46:24 +05:30
Harikrishna cca8b2fef9
Extensions Framework & Orchestrate Anything (#9752)
The Extensions Framework in Apache CloudStack is designed to provide a flexible and standardised mechanism for integrating external systems and custom workflows into CloudStack’s orchestration process. By defining structured hook points during key operations—such as virtual machine deployment, resource preparation, and lifecycle events—the framework allows administrators and developers to extend CloudStack’s behaviour without modifying its core codebase.
2025-07-28 10:41:17 +05:30
Pearl Dsilva 0d4147f3f6
Netris Network Plugin Integration with CloudStack (#10458)
The Netris Plugin introduces Netris as a network service provider in CloudStack to be able to create and manage Virtual Private Clouds (VPCs) in CloudStack, being able to orchestrate the following network functionalities:

- Network segmentation with Netris-VXLAN isolation method
- Routing between "public" IP and network segments with an ACS ROUTED mode offering
- SourceNAT, DNAT, 1:1 NAT between "public" IP and network segments with an ACS NATTED mode offering
- Routing between VPC network segments (tiers in ACS nomenclature)
- Access Lists (ACLs) between VPC tiers and "public" network (TCP, UDP, ICMP) both as global egress rules and "public" IP specific ingress rules.
- ACLs between VPC network tiers (TCP, UDP, ICMP)
- External load balancing – between VPC network tiers and "public" IP
- Internal load balancing – between VPC network tiers
- CloudStack Virtual Router services (DHCP, DNS, UserData, Password Injection, etc…)
2025-07-25 15:26:42 +05:30
Abhishek Kumar 83bccead3d
schema, refactor: rename cloud.user_vm_details to cloud.vm_instance_details (#10736)
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
Co-authored-by: dahn <daan@onecht.net>
2025-07-24 12:08:29 +02:00
slavkap 54b44cc316
KVM: Option to deploy a VM with existing volume/snapshot (#10503)
* Option to deploy a VM with existing volume/snapshot

* smoke test changes

check if the hypervisor is KVM
check if the primary storage's scope is ZONE wide

* skip all tests if the storage isn't Zone-Wide and the hypervisor isn't KVM

* support StorPool tags

add StorPool tags to a volume created from snapshot or to a volume which
will be attached as a ROOT to a new VM

* Add StorPool tags on the new ROOT volume

* Add the StorPool's tags when volume is created from a snapshot or a
volume is attached as a ROOT to a VM

* Addressed review
2025-07-14 15:10:45 +05:30
roopsai f9588960d4
Refactor: Replace sleep() with wait() (#10504) 2025-07-11 15:20:28 +02:00
Daan Hoogland 3e3a0c0678 Merge branch '4.20' 2025-07-03 15:29:05 +02:00
Suresh Kumar Anaparti be22bfe2c9
Management Server - Prepare for Maintenance and Cancel Maintenance improvements (#10995)
* Management Server - Prepare for Maintenance and Cancel Maintenance improvements:
- Added new setting 'management.server.maintenance.ignore.maintenance.hosts' to ignore hosts in maintenance states  while preparing management server for maintenance. This skips agent transfer and agents count check for hosts in maintenance.
- Rebalance indirect agents after cancel maintenance, using rebalance parameter in cancelMaintenance API
- Force maintenance after maintenance window timeout, using forced parameter in prepareForMaintenance API.
- Propagate 'indirect.agent.lb.check.interval' setting change to the host agents.

* rebases fixes

* code improvements, cleanup

* [UI] Set rebalance true by default in cancel maintenance dialog

* Update MS state after executing cluster cmd in the target MS, and some code improvements

* code improvements

* Ensure the host lb algorithm 'shuffle' is applied once before disabling the indirect agent lb check background task
2025-07-03 12:17:04 +05:30
Nicolas Vazquez 75147b7811
[Vmware to KVM Migration] Display virt-v2v and ovftool versions for supported hosts for migration (#11019)
* [Vmware to KVM Migration] Display virt-v2v and ovftool versions for supported hosts for migration

* Fix UI display

* Address review comments

* Fix ovftool and version display - also display versions on host details view
2025-06-23 12:49:51 +02:00
Pearl Dsilva 379ee07d88 Updating pom.xml version numbers for release 4.19.4.0-SNAPSHOT
Signed-off-by: Pearl Dsilva <pearl1594@gmail.com>
2025-06-06 18:00:09 +05:30
Pearl Dsilva b5e2c181f9 Updating pom.xml version numbers for release 4.20.2.0-SNAPSHOT
Signed-off-by: Pearl Dsilva <pearl1594@gmail.com>
2025-06-06 15:38:12 +05:30
Pearl Dsilva c61a5eb430 Updating pom.xml version numbers for release 4.20.1.0
Signed-off-by: Pearl Dsilva <pearl1594@gmail.com>
2025-05-30 12:43:00 +05:30
Daan Hoogland 0c7d47138d Updating pom.xml version numbers for release 4.19.3.0
Signed-off-by: Daan Hoogland <daan@onecht.net>
2025-05-30 09:08:58 +02:00
Daan Hoogland 650b5ec3da Merge branch '4.20' 2025-05-27 18:18:39 +02:00
Pearl Dsilva 16fc2cd1f0 Merge branch '4.19' of https://github.com/apache/cloudstack into 4.20 2025-05-27 19:27:33 +05:30
dahn bb79f0b727
engine/schema: create default network offering for vpc tier with conserve_mode=1 for fresh installation (#10744) (#10843)
Co-authored-by: Wei Zhou <weizhou@apache.org>
2025-05-27 08:17:49 +02:00
Wei Zhou 842b2f8c24
Merge remote-tracking branch 'apache/4.20' 2025-05-19 21:25:37 +02:00
Wei Zhou 5444261902
test: fix several simulator CI failures (#10890)
* test: fix several simulator CI failures

* Inject dataStoreProviderManager
2025-05-19 18:33:14 +02:00
Harikrishna b17808bfba
Introducing Storage Access Groups for better management for host and storage connections (#10381)
* Introducing Storage Access Groups to define the host and storage pool connections

In CloudStack, when a primary storage is added at the Zone or Cluster scope, it is by default connected to all hosts within that scope. This default behavior can be refined using storage access groups, which allow operators to control and limit which hosts can access specific storage pools.

Storage access groups can be assigned to hosts, clusters, pods, zones, and primary storage pools. When a storage access group is set on a cluster/pod/zone, all hosts within that scope inherit the group. Connectivity between a host and a storage pool is then governed by whether they share the same storage access group.

A storage pool with a storage access group will connect only to hosts that have the same storage access group. A storage pool without a storage access group will connect to all hosts, including those with or without a storage access group.
2025-05-19 11:33:29 +05:30
Daan Hoogland 8f8c685d17 Merge branch '4.19' into 4.20 2025-05-16 15:51:37 +02:00
Manoj Kumar d5ba23c848
Introduce volume allocation algorithm global configuration (#10696) 2025-05-16 14:06:42 +02:00
slavkap c183fc9859
Prevent data corruption for StorPool volumes (#10799) 2025-05-16 10:02:33 +02:00
Suresh Kumar Anaparti 95489b8bdd
Direct agents rebalance improvements with multiple management server nodes (#10674)
Sometimes hypervisor hosts (direct agents) stuck with Disconnect state during agent rebalancing activity across multiple management server nodes. This issue was noticed during frequent restart of the management server nodes in the cluster.

When there are multiple management server nodes in a cluster, if one or more nodes are shutdown/start/restart, CloudStack will rebalance the hosts among the remaining nodes or move the nodes to the newly joined management server nodes. During the rebalancing period multiple operations could happen including:

- DirectAgentScan at interval of configured direct.agent.scan.interval
- AgentRebalanceScan to identify and schedule rebalance agents
- TransferAgentScan to transfer the host from original owner to future owner

**Current Rebalance behavior**

1. For hosts that have AgentAttache && not forForward but in Disconnect state, CloudStack simply ignore these hosts without trying to ping again or update the status of the host.
2. For hosts that have AgentAttache && forForward, CloudStack removes the agent but still try to loadDirectlyConnectedHost.

**Improved Rebalance behavior**
During DirectAgentScan: scanDirectAgentToLoad(),  identify hosts that for self-managed hosts that are in Disconnect state (disconnected after pingtimeout).

1. For hosts that have AgentAttache and is forForward, CloudStack should remove the agent
2. For hosts that have AgentAttache and is not forForward but in Disconnect state, CloudStack should try to investigate and update the status to Up if host is pingable.
3. For hosts that don't have AgentAttache, CloudStack should try to loadDirectlyConnectedHost.
2025-05-13 17:47:46 +05:30
João Jandre 6fdaf51ddc
KVM incremental snapshot feature (#9270)
* KVM incremental snapshot feature

* fix log

* fix merge issues

* fix creation of folder

* fix snapshot update

* Check for hypervisor type during parent search

* fix some small bugs

* fix tests

* Address reviews

* do not remove storPool snapshots

* add support for downloading diff snaps

* Add multiple zones support

* make copied snapshots have normal names

* address reviews

* Fix in progress

* continue fix

* Fix bulk delete

* change log to trace

* Start fix on multiple secondary storages for a single zone

* Fix multiple secondary storages for a single zone

* Fix tests

* fix log

* remove bitmaps when deleting snapshots

* minor fixes

* update sql to new file

* Fix merge issues

* Create new snap chain when changing configuration

* add verification

* Fix snapshot operation selector

* fix bitmap removal

* fix chain on different storages

* address reviews

* fix small issue

* fix test

---------

Co-authored-by: João Jandre <joao@scclouds.com.br>
2025-05-12 10:50:30 -03:00
Pearl Dsilva 1e5d133033 Merge branch '4.20' of https://github.com/apache/cloudstack 2025-05-12 13:12:09 +05:30
Pearl Dsilva a21f912be3 Merge branch '4.19' of https://github.com/apache/cloudstack into 4.20 2025-05-12 12:41:34 +05:30
Wei Zhou 7e2aa0efe4
engine/schema: create default network offering for vpc tier with conserve_mode=1 for fresh installation (#10744) 2025-05-09 13:51:43 +05:30
Abhishek Kumar 919c9797cc
server: prevent duplicate HA works and alerts (#10624)
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2025-05-06 10:42:30 +02:00
Wei Zhou fd74895ad0
New feature: Reconcile commands (CopyCommand, MigrateCommand, MigrateVolumeCommand) (#10514) 2025-05-02 09:15:03 +02:00
Daan Hoogland d7d9d131b2 Merge branch '4.20' 2025-05-01 15:44:09 +02:00
Suresh Kumar Anaparti 9f229600e6
Add new config (non-dynamic) for agent connections monitor thread, and keep timeunit to secs (in sync with the earlier Wait config) (#10525) 2025-04-28 15:32:03 +02:00
Pearl Dsilva 2df1ac5106 Merge branch '4.20' of https://github.com/apache/cloudstack 2025-04-28 12:15:48 +05:30
Pearl Dsilva 0785ba046e Merge branch '4.19' of https://github.com/apache/cloudstack into 4.20 2025-04-28 11:10:08 +05:30
Fabricio Duarte 9d263cd71b
Network Usage event model adjustments (#10755) 2025-04-26 17:35:28 +02:00
Abhishek Kumar 12c077d704
api,ui: multi arch improvements (#10289) 2025-04-25 11:02:27 +02:00
Daan Hoogland 3c75d9363b Merge branch '4.20' 2025-04-17 15:59:41 +02:00
Daan Hoogland d7765343ef Merge branch '4.19' into 4.20 2025-04-17 15:40:10 +02:00
Wei Zhou 7b68615bd9
HA: set correct hostId of HA work for vm migration (#10591) 2025-04-17 10:02:46 +02:00
Fabricio Duarte ac6b1b382c
Migrate public templates that have URLs on data migration across secondary storages (#10364)
Co-authored-by: Fabricio Duarte <fabricio.duarte@scclouds.com.br>
2025-04-15 13:48:45 +02:00
Suresh Kumar Anaparti 9dceae4614
MS maintenance improvements (#10417)
* Update last agents during ms maintenance, and some code improvements

* Send 503 (Service Unavailable) response status when maintenance or shutdown is initiated
[Any load balancer in the clustered environment can avoid routing requests to this MS node]

* Migrate systemvm agents before routing host agents, and some code improvements

* Added events for ms maintenance and shutdown operations

* Added the following ms maintenance and shutdown improvements

- block new agent connections during prepare for maintenance of ms

- maintain avoids ms list

- propagate updated management servers list and lb algorithm in host and indirect.agent.lb.algorithm settings respectively, to systemvm (non-routing) agents

- updated setup ms list and migrate agent connections to executor service

- migrate agent connection through executor, and send the answer to the ms host that initiated the migration

- re-initialize ssl handshake executor if it is shutdown

- don't allow prepare for maintenance or shutdown when other management server nodes are in preparing states

- don't allow trigger shutdown when management server is up and other management server nodes are in preparing states

- stop agent connections monitor on ms maintenance

- update avoid ms list in ready command

- updated connected host from the client connection

- update last agents in ms metrics from the database

- updated some agent config descriptions

- update last management server in the hosts during shutdown

- added agents and lastagents in management server response

- updated management server maintenance & shutdown unit tests

- some code improvements

* refactored code / addressed comments

* removed shutdown testcase (maybe, calling System.exit)

* Revert "removed shutdown testcase (maybe, calling System.exit)"

This reverts commit e14b071715.

* avoid system.exit during shutdown test

* code improvements

* testcase fix

* Fix cutoff time in agent connections monitor thread
2025-03-19 14:18:05 +05:30
Abhishek Kumar 1c1dad977e Merge remote-tracking branch 'apache/4.20' 2025-03-06 09:55:27 +05:30
Pearl Dsilva 3aabedd447
UI: Proper explanation for the global setting to avoid ambiguity (#10042) 2025-03-04 15:07:43 +01:00
Pearl Dsilva bdae23ed53
Fix listing disk offerings for newly created VMs that haven't yet been started (#10476) 2025-02-28 10:24:23 -05:00
Pearl Dsilva 3a28a87483 Merge branch '4.20' of https://github.com/apache/cloudstack 2025-02-27 11:20:25 -05:00
Abhishek Kumar e8ac477e9f
engine/orchestration: fix missing vm powerstate update vm state (#10407)
* engine/orchestration: fix missing vm powerstate update vm state

Fixes #10406
VMs were not moving to Stopped state when PowerReportMissing is processed.

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* add unit tests

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* add license

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* add lenient

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

---------

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2025-02-25 15:50:27 +05:30
Daan Hoogland 4a3686297d Updating pom.xml version numbers for release 4.19.3.0-SNAPSHOT
Signed-off-by: Daan Hoogland <daan@onecht.net>
2025-02-25 10:43:11 +01:00
Daan Hoogland 24b7c66251 Merge branch '4.20' 2025-02-24 14:33:12 +01:00
Nicole Schmidt c80b8860e4
Fix hostId verification on unsuccessful expunge operation (#10418) 2025-02-20 09:11:53 -05:00
Daan Hoogland 4e321d4356 Updating pom.xml version numbers for release 4.19.2.0
Signed-off-by: Daan Hoogland <daan@onecht.net>
2025-02-20 09:32:07 +01:00
Daan Hoogland 0dcb8da03a Merge branch '4.20' 2025-02-12 16:54:05 +01:00
Daan Hoogland 4f3e8e8c5a Merge branch '4.19' into 4.20 2025-02-12 15:00:51 +01:00
Rene Glover 3337f425ff
Primera pure patches & various small fixes (#10132)
Co-authored-by: GLOVER RENE <rg9975@cs419-mgmtserver.rg9975nprd.app.ecp.att.com>
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
2025-02-07 13:19:34 +01:00
Daan Hoogland 2654890e86 Merge branch '4.20' 2025-02-01 21:20:08 +01:00
Daan Hoogland 085bd3bda5 Merge branch '4.19' into 4.20 2025-02-01 17:51:50 +01:00
Abhishek Kumar 0b5a5e8043
api,agent,server,engine-schema: scalability improvements (#9840)
* api,agent,server,engine-schema: scalability improvements

Following changes and improvements have been added:

- Improvements in handling of PingRoutingCommand

    1. Added global config - `vm.sync.power.state.transitioning`, default value: true, to control syncing of power states for transitioning VMs. This can be set to false to prevent computation of transitioning state VMs.
    2. Improved VirtualMachinePowerStateSync to allow power state sync for host VMs in a batch
    3. Optimized scanning stalled VMs

- Added option to set worker threads for capacity calculation using config - `capacity.calculate.workers`

- Added caching framework based on Caffeine in-memory caching library, https://github.com/ben-manes/caffeine

- Added caching for account/use role API access with expiration after write can be configured using config - `dynamic.apichecker.cache.period`. If set to zero then there will be no caching. Default is 0.

- Added caching for account/use role API access with expiration after write set to 60 seconds.

- Added caching for some recurring DB retrievals

    1. CapacityManager - listing service offerings - beneficial in host capacity calculation
    2. LibvirtServerDiscoverer existing host for the cluster - beneficial for host joins
    3. DownloadListener - hypervisors for zone - beneficial for host joins
    5. VirtualMachineManagerImpl - VMs in progress- beneficial for processing stalled VMs during PingRoutingCommands

- Optimized MS list retrieval for agent connect

- Optimize finding ready systemvm template for zone

- Database retrieval optimisations - fix and refactor for cases where only IDs or counts are used mainly for hosts and other infra entities. Also similar cases for VMs and other entities related to host concerning background tasks

- Changes in agent-agentmanager connection with NIO client-server classes

    1. Optimized the use of the executor service
    2. Refactore Agent class to better handle connections.
    3. Do SSL handshakes within worker threads
    5. Added global configs to control the behaviour depending on the infra. SSL handshake could be a bottleneck during agent connections. Configs - `agent.ssl.handshake.min.workers` and `agent.ssl.handshake.max.workers` can be used to control number of new connections management server handles at a time. `agent.ssl.handshake.timeout` can be used to set number of seconds after which SSL handshake times out at MS end.
    6. On agent side backoff and sslhandshake timeout can be controlled by agent properties. `backoff.seconds` and `ssl.handshake.timeout` properties can be used.

- Improvements in StatsCollection - minimize DB retrievals.

- Improvements in DeploymentPlanner allow for the retrieval of only desired host fields and fewer retrievals.

- Improvements in hosts connection for a storage pool. Added config - `storage.pool.host.connect.workers` to control the number of worker threads that can be used to connect hosts to a storage pool. Worker thread approach is followed currently only for NFS and ScaleIO pools.

- Minor improvements in resource limit calculations wrt DB retrievals

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* test1, domaindetails, capacitymanager fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* test2 - agent tests

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* capacitymanagertest fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* change

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix missing changes

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* address comments

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* revert marvin/setup.py

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix indent

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* use space in sql

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* address duplicate

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* update host logs

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* revert e36c6a5d07

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix npe in capacity calculation

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* move schema changes to 4.20.1 upgrade

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* build fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* address comments

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix build

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* add some more tests

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* checkstyle fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* remove unnecessary mocks

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* build fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* replace statics

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* engine/orchestration,utils: limit number of concurrent new agent
connections

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* refactor - remove unused

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* unregister closed connections, monitor & cleanup

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* add check for outdated vm filter in power sync

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* agent: synchronize sendRequest wait

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

---------

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2025-02-01 12:28:41 +05:30
Wei Zhou fbb1ff78d6
Static Routes: fix check on wrong global configuration (#10066) 2025-01-31 11:04:13 +01:00
Suresh Kumar Anaparti 3b108b968f
Support for Management Server Maintenance Mode (#9854)
* Support for Management Server Maintenance

- New APIs: prepareForMaintenance and cancelMaintenance, with required parameter - managementserverid.

- New management server states for maintenance: PreparingForMaintenance, Maintenance.

- listHosts API with optional parameter – managementserverid, to list the hosts connected to the management server.

- Support management server maintenance when more than one active management servers available.

- Triggers transfer agents to other available management servers for maintenance, new agent command MigrateAgentConnectionCommand to initiate transfer of indirect agents.

- New global config 'management.server.maintenance.timeout', to set the timeout (in mins) for the management server maintenance window, default: 60 mins.

- UI changes: Prepare and Cancel Maintenance in Management Server section, Connected Agents tab, New fields for hosts and management servers.

* Updated pending jobs check timer task with ScheduledExecutorService

* keep maintenance state on trigger shutdown call when ms is in maintenance

* add pending jobs count to ms response

* during ms heartbeat, update state to up only when it's down

* allow vm work jobs of async job created before prepare for maintenance

* Revert "keep maintenance state on trigger shutdown call when ms is in maintenance"

This reverts commit 607e13364679eac897f4d146bb3325ea7a61ba17.

* skip maintenance test when multiple management servers are not available, and not configured in host setting for kvm
2025-01-29 13:31:15 +05:30
Daan Hoogland 048649d351 Merge release branch 4.20 to main
* 4.20:
  server: investigate pending HA work when executing in new MS session (#10167)
  extra null guard (#10264)
2025-01-28 14:34:19 +01:00
Daan Hoogland 717ce981d4 Merge release branch 4.19 to 4.20
* 4.19:
  extra null guard (#10264)
2025-01-28 14:33:49 +01:00
Abhishek Kumar 33a37da9ec
server: investigate pending HA work when executing in new MS session (#10167)
For HA work items that are created for host state change, checks must be
done when execution is called in a new management server session.

A new column, reason, has been added in cloud.op_ha_work table to track
the reason for HA work.

When HighAvailabilityManager starts it finds and puts all pending HA
work items in Investigating state. During execution of the HA work if it
is found in investigating state, checks are done to verify if the work
is still valid. If the jobs is found to be invalid it is cancelled.

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2025-01-28 14:39:31 +05:30
dahn f652ad0d98
extra null guard (#10264) 2025-01-27 14:14:31 +01:00
Daan Hoogland 98f5663954 Merge branch '4.20' 2025-01-24 17:10:43 +01:00
Daan Hoogland 34d2a3bc86 Merge branch '4.19' into 4.20 2025-01-24 17:01:42 +01:00
dahn 0a77eb7f85
deal with NPE during host reconnect (#10158)
* log to see what command is being processed

* exception names
2025-01-24 15:39:56 +05:30
Daan Hoogland fadb39ece7 Merge release branch 4.20 to main
* 4.20:
  merge errors fixed
  Restrict the migration of volumes attached to VMs in Starting state (#9725)
  server, plugin: enhance storage stats for IOPS (#10034)
  Introducing granular command timeouts global setting (#9659)
  Improve logging to include more identifiable information (#9873)
2025-01-08 14:01:19 +01:00
Harikrishna 9bc283e5c2
Introducing granular command timeouts global setting (#9659)
* Introducing granular command timeouts global setting

* fix marvin tests

* Fixed log messages

* some more log message fix

* Fix empty value setting

* Converted the global setting to non-dynamic

* set wait on command only when granular wait is defined. This is to keep the backward compatibility

* Improve error logging
2025-01-07 17:06:32 +05:30
Vishesh a4224e58cc
Improve logging to include more identifiable information (#9873)
* Improve logging to include more identifiable information for kvm plugin

* Update logging for scaleio plugin

* Improve logging to include more identifiable information for default volume storage plugin

* Improve logging to include more identifiable information for agent managers

* Improve logging to include more identifiable information for Listeners

* Replace ids with objects or uuids


* Improve logging to include more identifiable information for engine

* Improve logging to include more identifiable information for server

* Fixups in engine

* Improve logging to include more identifiable information for plugins

* Improve logging to include more identifiable information for Cmd classes

* Fix toString method for StorageFilterTO.java
2025-01-06 16:42:37 +05:30
Daan Hoogland 9295a1624d Merge release branch 4.20 to main
* 4.20:
  VR: apply iptables rules when add/remove static routes (#10064)
  Certificate and VM hostname validation improvements (#10051)
  set ulimit for server according to redhat spec (#10040)
  kvm-storage: provide isVMMigrate information to storage plugins (#10093)
  Allow config drive deletion of migrated VM, on host maintenance (#10045)
  linstor: improve heartbeat check with also asking linstor (#10105)
  server: simplify role change validation (#9173)
  UI: create VPC network offering with conserve mode (#10082)
  server: fix typo removeaccessvpn in VirtualRouterElement (#10086)
  UI: remove duplicated Instance Name in Public IP details page (#10087)
  UI: Fixes in the Usage UI (#10000)
  SAML2: add cookie with HttpOnly too #10013 (#10047)
  ui: Allow font-awesome icon usage and optimise icon size inconsistency (#9744)
2024-12-20 14:37:49 +01:00
Daan Hoogland b7f0aac519 Merge branch '4.19' into 4.20 2024-12-20 14:34:39 +01:00
Suresh Kumar Anaparti b4ad04badf
Allow config drive deletion of migrated VM, on host maintenance (#10045) 2024-12-18 09:12:28 +01:00
João Jandre d9774a8462 Updating pom.xml version numbers for release 4.21.0.0-SNAPSHOT
Signed-off-by: João Jandre <48719461+JoaoJandre@users.noreply.github.com>
2024-11-27 11:47:06 -03:00
João Jandre c63c7ee63e Updating pom.xml version numbers for release 4.20.1.0-SNAPSHOT
Signed-off-by: João Jandre <48719461+JoaoJandre@users.noreply.github.com>
2024-11-27 11:40:45 -03:00
João Jandre 2fe3fcef7c Updating pom.xml version numbers for release 4.20.0.0
Signed-off-by: João Jandre <48719461+JoaoJandre@users.noreply.github.com>
2024-11-19 08:54:07 -03:00