Commit Graph

727 Commits

Author SHA1 Message Date
Daan Hoogland 22da57f922 Merge branch '4.22' 2025-12-22 14:13:50 +01:00
Daan Hoogland 55ab7c5589 Merge branch '4.20' into 4.22 2025-12-22 13:23:37 +01:00
vladimirpetrov b394b5ba74
Fix terms, typos and grammar mistakes in the API, error messages, events, etc. (#7857)
This PR aligns the use of terminology, renaming VM / virtual machine references to 'Instance' and also capitalising the terms Templates, Network, Snapshot, User, Account in CloudStack APIs, error and log messages, events, tooltips, etc. Many typos, grammar and spelling mistakes were fixed, also terms like IPv4, VPN, VPC, etc. were properly capitalised. Some error messages were cleaned for better readability. The test cases, expecting some exception strings were adjusted accordingly.

Here is the wiki page, describing the changes in details:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Object+Naming+and+Title+Case+Convention

---------

Co-authored-by: Manoj Kumar <manojkr.itbhu@gmail.com>
Co-authored-by: Harikrishna <harikrishna.patnala@gmail.com>
2025-12-22 15:18:58 +05:30
dahn c81295439f
removed code in comments (#11145) 2025-12-08 16:31:48 +01:00
Suresh Kumar Anaparti b0d74fe00c
Merge branch '4.22' 2025-12-05 18:59:03 +05:30
Suresh Kumar Anaparti a0ba2aaf3f
Merge branch '4.20' into 4.22 2025-12-05 18:41:18 +05:30
Suresh Kumar Anaparti e4414d1c44
Fix agent wait before reconnect (#12153) 2025-12-03 11:19:47 +05:30
Daan Hoogland 9032fe3fb5 merge LTS branch 4.22 into main 2025-11-26 11:55:50 +01:00
John Bampton 4ed86a2627
pre-commit upgrade codespell; fix spelling; (#10144) 2025-11-14 14:17:10 +01:00
Suresh Kumar Anaparti 2dd1e6d786
Enable UEFI on KVM hosts (by default), and configure with some default settings (#11740) 2025-11-07 14:54:02 +01:00
Harikrishna Patnala dbda673e1f Updating pom.xml version numbers for release 4.23.0.0-SNAPSHOT
Signed-off-by: Harikrishna Patnala <harikrishna.patnala@gmail.com>
2025-11-05 16:54:39 +05:30
Harikrishna Patnala d160731b9f Updating pom.xml version numbers for release 4.22.1.0-SNAPSHOT
Signed-off-by: Harikrishna Patnala <harikrishna.patnala@gmail.com>
2025-11-05 16:07:07 +05:30
Harikrishna Patnala 71f47d6130 Updating pom.xml version numbers for release 4.22.0.0
Signed-off-by: Harikrishna Patnala <harikrishna.patnala@gmail.com>
2025-10-30 19:23:56 +05:30
Wei Zhou e333ce9782
Updating pom.xml version numbers for release 4.20.3.0-SNAPSHOT 2025-10-24 09:13:19 +02:00
Wei Zhou 4dc3931233
Updating pom.xml version numbers for release 4.20.2.0
Signed-off-by: Wei Zhou <weizhou@apache.org>
2025-10-16 11:42:56 +02:00
Harikrishna Patnala 8b9f5fd8f9 Merge branch '4.20' 2025-10-16 13:39:40 +05:30
Rohit Yadav 6f931dbd00
agent: increase timeout for host arch retrieval (#11254) (#11822)
Cherry-picked from 44f80648a9

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2025-10-14 10:53:45 +02:00
Nicolas Vazquez b106d6e190
VMware to KVM Migrations improvements (#11594)
* Add source VM name on virt-v2v migration log entries

* Improve the feedback by displaying the running importing tasks

* Add source VM name prefix on more conversion logs

* Improve listing and also list completed tasks

* Pass extra parameters to virt-v2v if administrator allows via global setting

* Add Force converting directly to storage pool option

* Refactor based on review comments

* Add properties for env vars for the instance conversion

* Add separate component for Import VM Tasks

* applying copilot suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Fix importing unmanaged instances due to incorrect internal name

* Add VM prefix on each log operation for conversion

* Log the original VM name instead of the cloned VM in case of cloning

* Allow searching storage pool by UUID after conversion to support SharedMountPoint

* Fix search pools logic

* Improve UI and add checks for force convert to pool parameter

* Support Local storage when forceconverttopool is set to true

* Add config key to for allowed extra params and add validation

* Fix params lists

* Fix compile error

* Remove extra stubbings

* Fix extra params execution

---------

Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
2025-10-10 20:00:29 -03:00
John Bampton f9a72de500
misc: pre-commit auto remove unneeded trailing whitespace (#11289) 2025-09-17 14:21:56 +02:00
John Bampton 57309314a1
pre-commit: clean up Python flake8 excludes with black (#9793) 2025-09-17 12:40:56 +02:00
Wei Zhou ca0c3530ad
utils: add UuidUtils.nameUUIDFromBytes (#11136)
* utils: add UuidUtils.nameUUIDFromBytes

* Fix PR 13922
2025-09-01 08:10:31 +02:00
Suresh Kumar Anaparti 1033be4b31
Updating pom.xml version numbers for release 4.22.0.0-SNAPSHOT
Signed-off-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
2025-08-28 12:00:42 +05:30
Suresh Kumar Anaparti f9513b47bf
Updating pom.xml version numbers for release 4.21.0.0
Signed-off-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
2025-08-22 11:42:37 +05:30
Bernardo De Marco Gonçalves ba2d70ab21
[KVM] CPU Features for System VMs (#10964)
* CPU features for System VMs

* Apply guest.cpu.features for System VMs
2025-08-15 20:02:50 +05:30
Suresh Kumar Anaparti 4c3f29de1e
Agent manager connection handling improvements (#11376)
* Agent manager connection handling improvements

* Fix to send LB check interval in ready command
2025-08-05 15:07:02 +05:30
Abhishek Kumar 44f80648a9
agent: increase timeout for host arch retrieval (#11254)
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2025-08-01 17:42:08 +05:30
João Jandre 5ea1ada59a
Allow full clone volumes with thin provisioning in KVM (#11177)
It adds a configuration called create.full.clone to the agent.properties file. When set to true, all QCOW2 volumes created will be full-clone. If false (default), the current behavior remains, where only FAT and SPARSE volumes are full-clone and THIN volumes are linked-clone.
2025-07-31 16:12:17 +05:30
Suresh Kumar Anaparti 712492230a
Shutdown MS maintenance jobs when finished (#11330) 2025-07-31 10:00:29 +05:30
Vishesh f6ad184ea2
Feature: Add support for GPU with KVM hosts (#11143)
This PR allows attaching of GPU devices via PCI, mdev or VF to an Instance for KVM.

It allows the operator to discover the GPU devices on the KVM host and create a Compute Offering with GPU support based on the available GPU devices on the host. Once the operator has created the Compute offering, it can be used by users to launch Instances with GPU devices.
2025-07-29 13:46:24 +05:30
Suresh Kumar Anaparti c94f75c7ea
PowerFlex/ScaleIO - Wait after SDC service start/restart/stop, and retry to fetch SDC id/guid (#11099)
* [PowerFlex/ScaleIO] Added wait time after SDC service start/restart/stop, and retries to fetch SDC id/guid

* Added agent property 'powerflex.sdc.service.wait' for the time (in secs) to wait after SDC service start/restart/stop

* code improvements
2025-07-16 12:32:09 +05:30
João Jandre 53eb2c5b9b
File-based disk-only VM snapshot with KVM as hypervisor (#10632)
Co-authored-by: João Jandre <joao@scclouds.com.br>
Co-authored-by: Fabricio Duarte <fabricio.duarte.jr@gmail.com>
2025-07-16 08:54:07 +02:00
Suresh Kumar Anaparti 3220eb442a
PowerFlex/ScaleIO - MDM and host SDC connection enhancements (#11047)
* Cumulative enhancements fix for ScaleIO: MDM add/remove, Host prepare/unprepare, validate Storage Pool can be created in Agent.

- Implemented validation to fail Host disconnect from Storage Pool if there are Volumes attached and SDC client MDM removal requires scini service to be restarted
- Implemented Storage Pool validation by checking whether MDM addresses from configuration file and from memory (using CLI) matches, otherwise file ModifyStoragePool command.
- Introduced configuration key to apply timeout after making MDM changes for ScaleIO: powerflex.mdm.change.apply.timeout.ms (default 1000ms)
- Implemented logic to apply timeout after making MDM changes for ScaleIO in prepare and unprepare logic
- Added detection of MDM removal support via CLI
- If MDM removal support via CLI supported then use CLI, fall back to edit drv_cfg.txt and restart scini instead

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
Co-authored-by: mprokopchuk <mprokopchuk@apple.com>
2025-07-16 08:25:28 +02:00
Suresh Kumar Anaparti be22bfe2c9
Management Server - Prepare for Maintenance and Cancel Maintenance improvements (#10995)
* Management Server - Prepare for Maintenance and Cancel Maintenance improvements:
- Added new setting 'management.server.maintenance.ignore.maintenance.hosts' to ignore hosts in maintenance states  while preparing management server for maintenance. This skips agent transfer and agents count check for hosts in maintenance.
- Rebalance indirect agents after cancel maintenance, using rebalance parameter in cancelMaintenance API
- Force maintenance after maintenance window timeout, using forced parameter in prepareForMaintenance API.
- Propagate 'indirect.agent.lb.check.interval' setting change to the host agents.

* rebases fixes

* code improvements, cleanup

* [UI] Set rebalance true by default in cancel maintenance dialog

* Update MS state after executing cluster cmd in the target MS, and some code improvements

* code improvements

* Ensure the host lb algorithm 'shuffle' is applied once before disabling the indirect agent lb check background task
2025-07-03 12:17:04 +05:30
Pearl Dsilva b5e2c181f9 Updating pom.xml version numbers for release 4.20.2.0-SNAPSHOT
Signed-off-by: Pearl Dsilva <pearl1594@gmail.com>
2025-06-06 15:38:12 +05:30
Pearl Dsilva c61a5eb430 Updating pom.xml version numbers for release 4.20.1.0
Signed-off-by: Pearl Dsilva <pearl1594@gmail.com>
2025-05-30 12:43:00 +05:30
João Jandre 6fdaf51ddc
KVM incremental snapshot feature (#9270)
* KVM incremental snapshot feature

* fix log

* fix merge issues

* fix creation of folder

* fix snapshot update

* Check for hypervisor type during parent search

* fix some small bugs

* fix tests

* Address reviews

* do not remove storPool snapshots

* add support for downloading diff snaps

* Add multiple zones support

* make copied snapshots have normal names

* address reviews

* Fix in progress

* continue fix

* Fix bulk delete

* change log to trace

* Start fix on multiple secondary storages for a single zone

* Fix multiple secondary storages for a single zone

* Fix tests

* fix log

* remove bitmaps when deleting snapshots

* minor fixes

* update sql to new file

* Fix merge issues

* Create new snap chain when changing configuration

* add verification

* Fix snapshot operation selector

* fix bitmap removal

* fix chain on different storages

* address reviews

* fix small issue

* fix test

---------

Co-authored-by: João Jandre <joao@scclouds.com.br>
2025-05-12 10:50:30 -03:00
Wei Zhou fd74895ad0
New feature: Reconcile commands (CopyCommand, MigrateCommand, MigrateVolumeCommand) (#10514) 2025-05-02 09:15:03 +02:00
Daan Hoogland 8af021c6f6 Merge branch '4.20' 2025-03-27 17:03:13 +01:00
Rohit Yadav 6b4adbb20a
Preview-Experimental Support EL10 as Management Server and KVM host (#10496)
* cloudstack: add support for EL10
This adds support for Fedora 40 and (upcoming) EL10 distro to be used
as mgmt/usage server, mysql/nfs & KVM host. Python3 version has changed
to 3.12.9 which isn't automatically determining the python-path.
* python: WIP code, this fails right now
Need to discuss/check if we can skip this code. Where/how is cgroup
setup used with KVM agent.
* prep cloudutils to be EL10 ready
Fixes issue for Fedora, it was running old EL6 hooks which isn't
applicable for modern Fedora version that are closer to EL8/9/10
2025-03-26 11:23:35 -04:00
Suresh Kumar Anaparti 9dceae4614
MS maintenance improvements (#10417)
* Update last agents during ms maintenance, and some code improvements

* Send 503 (Service Unavailable) response status when maintenance or shutdown is initiated
[Any load balancer in the clustered environment can avoid routing requests to this MS node]

* Migrate systemvm agents before routing host agents, and some code improvements

* Added events for ms maintenance and shutdown operations

* Added the following ms maintenance and shutdown improvements

- block new agent connections during prepare for maintenance of ms

- maintain avoids ms list

- propagate updated management servers list and lb algorithm in host and indirect.agent.lb.algorithm settings respectively, to systemvm (non-routing) agents

- updated setup ms list and migrate agent connections to executor service

- migrate agent connection through executor, and send the answer to the ms host that initiated the migration

- re-initialize ssl handshake executor if it is shutdown

- don't allow prepare for maintenance or shutdown when other management server nodes are in preparing states

- don't allow trigger shutdown when management server is up and other management server nodes are in preparing states

- stop agent connections monitor on ms maintenance

- update avoid ms list in ready command

- updated connected host from the client connection

- update last agents in ms metrics from the database

- updated some agent config descriptions

- update last management server in the hosts during shutdown

- added agents and lastagents in management server response

- updated management server maintenance & shutdown unit tests

- some code improvements

* refactored code / addressed comments

* removed shutdown testcase (maybe, calling System.exit)

* Revert "removed shutdown testcase (maybe, calling System.exit)"

This reverts commit e14b071715.

* avoid system.exit during shutdown test

* code improvements

* testcase fix

* Fix cutoff time in agent connections monitor thread
2025-03-19 14:18:05 +05:30
Daan Hoogland 2654890e86 Merge branch '4.20' 2025-02-01 21:20:08 +01:00
Abhishek Kumar 0b5a5e8043
api,agent,server,engine-schema: scalability improvements (#9840)
* api,agent,server,engine-schema: scalability improvements

Following changes and improvements have been added:

- Improvements in handling of PingRoutingCommand

    1. Added global config - `vm.sync.power.state.transitioning`, default value: true, to control syncing of power states for transitioning VMs. This can be set to false to prevent computation of transitioning state VMs.
    2. Improved VirtualMachinePowerStateSync to allow power state sync for host VMs in a batch
    3. Optimized scanning stalled VMs

- Added option to set worker threads for capacity calculation using config - `capacity.calculate.workers`

- Added caching framework based on Caffeine in-memory caching library, https://github.com/ben-manes/caffeine

- Added caching for account/use role API access with expiration after write can be configured using config - `dynamic.apichecker.cache.period`. If set to zero then there will be no caching. Default is 0.

- Added caching for account/use role API access with expiration after write set to 60 seconds.

- Added caching for some recurring DB retrievals

    1. CapacityManager - listing service offerings - beneficial in host capacity calculation
    2. LibvirtServerDiscoverer existing host for the cluster - beneficial for host joins
    3. DownloadListener - hypervisors for zone - beneficial for host joins
    5. VirtualMachineManagerImpl - VMs in progress- beneficial for processing stalled VMs during PingRoutingCommands

- Optimized MS list retrieval for agent connect

- Optimize finding ready systemvm template for zone

- Database retrieval optimisations - fix and refactor for cases where only IDs or counts are used mainly for hosts and other infra entities. Also similar cases for VMs and other entities related to host concerning background tasks

- Changes in agent-agentmanager connection with NIO client-server classes

    1. Optimized the use of the executor service
    2. Refactore Agent class to better handle connections.
    3. Do SSL handshakes within worker threads
    5. Added global configs to control the behaviour depending on the infra. SSL handshake could be a bottleneck during agent connections. Configs - `agent.ssl.handshake.min.workers` and `agent.ssl.handshake.max.workers` can be used to control number of new connections management server handles at a time. `agent.ssl.handshake.timeout` can be used to set number of seconds after which SSL handshake times out at MS end.
    6. On agent side backoff and sslhandshake timeout can be controlled by agent properties. `backoff.seconds` and `ssl.handshake.timeout` properties can be used.

- Improvements in StatsCollection - minimize DB retrievals.

- Improvements in DeploymentPlanner allow for the retrieval of only desired host fields and fewer retrievals.

- Improvements in hosts connection for a storage pool. Added config - `storage.pool.host.connect.workers` to control the number of worker threads that can be used to connect hosts to a storage pool. Worker thread approach is followed currently only for NFS and ScaleIO pools.

- Minor improvements in resource limit calculations wrt DB retrievals

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* test1, domaindetails, capacitymanager fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* test2 - agent tests

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* capacitymanagertest fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* change

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix missing changes

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* address comments

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* revert marvin/setup.py

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix indent

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* use space in sql

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* address duplicate

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* update host logs

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* revert e36c6a5d07

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix npe in capacity calculation

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* move schema changes to 4.20.1 upgrade

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* build fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* address comments

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix build

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* add some more tests

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* checkstyle fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* remove unnecessary mocks

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* build fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* replace statics

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* engine/orchestration,utils: limit number of concurrent new agent
connections

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* refactor - remove unused

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* unregister closed connections, monitor & cleanup

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* add check for outdated vm filter in power sync

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* agent: synchronize sendRequest wait

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

---------

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2025-02-01 12:28:41 +05:30
niyamsw 5df15a7aa6
KVM/s390x Support: Add support for KVM on s390x architecture (#10038)
Signed-off-by: Niyam Siwach <niyam@ibm.com>
Signed-off-by: Himanshu Mishra <Himanshu.Mishra2@ibm.com>
2025-01-29 17:34:16 +01:00
Suresh Kumar Anaparti 3b108b968f
Support for Management Server Maintenance Mode (#9854)
* Support for Management Server Maintenance

- New APIs: prepareForMaintenance and cancelMaintenance, with required parameter - managementserverid.

- New management server states for maintenance: PreparingForMaintenance, Maintenance.

- listHosts API with optional parameter – managementserverid, to list the hosts connected to the management server.

- Support management server maintenance when more than one active management servers available.

- Triggers transfer agents to other available management servers for maintenance, new agent command MigrateAgentConnectionCommand to initiate transfer of indirect agents.

- New global config 'management.server.maintenance.timeout', to set the timeout (in mins) for the management server maintenance window, default: 60 mins.

- UI changes: Prepare and Cancel Maintenance in Management Server section, Connected Agents tab, New fields for hosts and management servers.

* Updated pending jobs check timer task with ScheduledExecutorService

* keep maintenance state on trigger shutdown call when ms is in maintenance

* add pending jobs count to ms response

* during ms heartbeat, update state to up only when it's down

* allow vm work jobs of async job created before prepare for maintenance

* Revert "keep maintenance state on trigger shutdown call when ms is in maintenance"

This reverts commit 607e13364679eac897f4d146bb3325ea7a61ba17.

* skip maintenance test when multiple management servers are not available, and not configured in host setting for kvm
2025-01-29 13:31:15 +05:30
Daan Hoogland 98f5663954 Merge branch '4.20' 2025-01-24 17:10:43 +01:00
Nicolas Vazquez 7e295ec4e1
[KVM] Add watchdog model none to disable use of watchdogs on KVM agent (#10203) 2025-01-23 13:15:02 +01:00
Daan Hoogland fadb39ece7 Merge release branch 4.20 to main
* 4.20:
  merge errors fixed
  Restrict the migration of volumes attached to VMs in Starting state (#9725)
  server, plugin: enhance storage stats for IOPS (#10034)
  Introducing granular command timeouts global setting (#9659)
  Improve logging to include more identifiable information (#9873)
2025-01-08 14:01:19 +01:00
Vishesh a4224e58cc
Improve logging to include more identifiable information (#9873)
* Improve logging to include more identifiable information for kvm plugin

* Update logging for scaleio plugin

* Improve logging to include more identifiable information for default volume storage plugin

* Improve logging to include more identifiable information for agent managers

* Improve logging to include more identifiable information for Listeners

* Replace ids with objects or uuids


* Improve logging to include more identifiable information for engine

* Improve logging to include more identifiable information for server

* Fixups in engine

* Improve logging to include more identifiable information for plugins

* Improve logging to include more identifiable information for Cmd classes

* Fix toString method for StorageFilterTO.java
2025-01-06 16:42:37 +05:30
Bernardo De Marco Gonçalves f75a194c09
Persist IP addresses related to VM access via CPVM (#9534) 2024-12-10 11:43:17 +01:00
João Jandre d9774a8462 Updating pom.xml version numbers for release 4.21.0.0-SNAPSHOT
Signed-off-by: João Jandre <48719461+JoaoJandre@users.noreply.github.com>
2024-11-27 11:47:06 -03:00