This PR enhances the existing CLVM implementation which was based on the deprecated CLVM technology which was based on corosync/pacemaker. With RHEL 7 having reached EOL, CLVM seems to be broken. CLVM supports RAW volumes on LVM , where as CLVM_NG support QCOW2 on LVM.
Further details: https://cwiki.apache.org/confluence/display/CLOUDSTACK/Modernized+CLVM%3A+Enhancements+and+CLVM_NG+support
NOTE: On testing - it was identified that incremental snapshots for clvm-ng do not work as expected. As of now it's been removed from scope. So, CLVM and CLVM_NG would only support full snapshots.
* add support for proper cleanup of snapshots and prevent vol snapshot of running vm
* remove snap vol restriction for sunning vms
* refactor clvm code
* add support for live migration
* add support for migrating lvm lock
* clvm deletion called explicitly
* made necessary changes to allow migration of lock and deletion of detached volumes
* fix create vol from snap and attach
* add support to revert snapshot for clvm
* add support to revert snapshot for clvm
* make zero fill configurable
* make setting non-dynamic & fix test
* fix locking at vol/vm creation
* fix revert snapshot format type and handle revert snapshot functionality for clvm
* 1. Create clvmlockmanager and move common code \n
2. handle attaching volumes to stopped VMs \n
3. Handle lock transfer when VM is started on another host
* add license
* remove command/answer classes from sonar coverage check
* add support for new gen clvm with template (qcow2) backing
* Add support for clvm_ng - which allows qcow2 on block storage , linked clones, etc
* fix test and use physical size + 50% of virtual size for backing file, while virtual size + pe for disk
* migrate clvm volumes as full clone and allow migration from clvm to nfs
* fix clvm_ng to nfs migration, and handle overhead calc
* support live migration from clvm_ng to nfs and vice-versa
* add support to migrate to and from clvm to nfs
* fix creation of volume on destination host during migration to clvm/clvm-ng
* support live vm migration between clvm -> clvm-ng (vice-versa), nfs -> clvm (vice-versa) and nfs->clvm-ng (vice-versa)
* add unit tests for clvm/clvm_ng operations
* Add support for incremental volume snapshots for clvm_ng
* prevent snapshot backup for incremental clvm_ng snaps, fix build failure, add unit tests
* fix lockhost on creation of volumes from snap and fix bitmap issue when migrating a vol with incremental snap
* restrict pre and post migration commands to only kvm hosts where vm has CLVM/CLVM-NG volumes
* evist lock tracking - use lvs command to get lock host than DB
* add test for pre/post migration
* Create a CLVM storage adaptor
* update existing clvm get stats method
* fix precommit check failure
* Apply suggestions from code review
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
* Apply suggestions from code review
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
* improve lock host retrieval logic and quicker retrival using db host as first check point and then fanning out
* add proper support for resizing of clvm_ng which calculated PE correctly for qcow2 metadata
* fallback to full snapshots for clvm-ng - incremental not supported in 4.23
* expunge volume detail of lock host on vm expunge
* if vmmigration with volume is done to the same clvm volume group, then dont do data transfer, just lock transfer and vm
* add clvm pools with deterministic uuid , so as to prevent adding the same pool twic
* added a small improvement to factor in a senario when lv is inactive on all hosts, could happen in storage outage issue
* address comment - extract common code for endpoint identification if clvm pool type
* Address comments - add early return guard to reduce indentation
* minor improvement - when migrating vm with volumes, if there's a failures, change the clvm vols to exclusive on source from shared, and on success, change dest vol to exclusive only for cross-pool migration
* cleanup unused code and tests for incremental snaps for clvmng and other cleanups
* allow storage browser to list lv in clvm, fix clvm shrink, overprovisioning factor isnt used for clvm pools - so set it to 1 and prevented display of provisioning type for clvm
* no need to have locktransfercommand to execute in sequence
* increase lv cmd timeouts to consider cluster load
---------
Co-authored-by: Pearl Dsilva <pearl1954@gmail.com>
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
This PR introduces the initial implementation of Veeam integration support for KVM in CloudStack by adding a UHAPI-compatible server and image server components.
Veeam Backup & Replication interacts with virtualization platforms using its Universal Hypervisor API (UHAPI). To enable backup and restore workflows for CloudStack-managed KVM environments, this change introduces a UHAPI server that exposes CloudStack resources through a UHAPI-compatible interface.
In addition to the control plane APIs, an image server component is introduced to handle the data transfer operations required during backup and restore workflows.
The integration consists of two main components:
1. UHAPI Server (Control Plane) named CloudStack Veeam Control Service
A lightweight UHAPI server runs inside the CloudStack management server and exposes endpoints under:
/ovirt-engine
- /api - For APIs
- /sso - For authentication
- /services/pki-resource - For certificates
This server provides inventory discovery APIs required by Veeam and translates CloudStack resources into the structures expected by UHAPI.
The server:
- exposes infrastructure inventory
- handles authentication and session tokens
- maps CloudStack resources to UHAPI-compatible representations
2. Image Server (Data Plane) named CloudStack Image Service
A separate image server component is introduced to handle backup and restore data transfer operations.
This component:
- serves disk image data during backup
- receives image data during restore operations
- exposes endpoints used by Veeam worker components
- integrates with CloudStack storage to read and write VM disk data
The separation between both these components server ensures that:
- metadata APIs and control operations remain lightweight
- bulk image transfer operations are handled independently
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Abhisar Sinha <63767682+abh1sar@users.noreply.github.com>
Co-authored-by: abh1sar <abhisar.sinha@gmail.com>
Co-authored-by: Wei Zhou <weizhou@apache.org>
* Refactoring Allocator classes
* Break into smaller methods random and firfit allocators.
* Added unit tests for random and firstfit allocators
* Move random allocator from cloud-plugins to cloud-server
* Add BaseAllocator abstract class for duplicate code
* Add missing license
* Add missing license to unit test file
* Remove host allocator random dependency
* Change exception message on smoke tests
* Remove conditional as it was never actually reached in the original flow
* Fix tests
* Fix flipped parameters
* Fix NPE while listing hosts for migration when suitableHosts is null
* Remove unnecessary stubbings
* Fix checkstyle
* Remove unnecessary file
* Rename exception error messages
* Apply suggestions from code review
Co-authored-by: Fabricio Duarte <fabricio.duarte.jr@gmail.com>
* Rename UserVmDetailVO references to VMInstanceDetailVO
* Remove unused imports
* Add new line at EOF
* Remove unnecessary random allocator pom
* Fix GPU allocation mistake
* Fix failing tests
---------
Co-authored-by: Fabricio Duarte <fabricio.duarte@scclouds.com.br>
Co-authored-by: Fabricio Duarte <fabricio.duarte.jr@gmail.com>
* initial attempt at network.loadbalancer.haproxy.idle.timeout implementation
* implement test cases
* move idleTimeout configuration test to its own test case
* Allow copy of templates from secondary storages of other zone when adding a new secondary storage
* Add API param and UI changes on add secondary storage page
* Make copy template across zones non blocking
* Code fixes
* unused imports
* Add copy template flag in zone wizard and remove NFS checks
* Fix UI
* Label fixes
* code optimizations
* code refactoring
* missing changes
* Combine template copy and download into a single asynchronous operation
* unused import and fixed conflicts
* unused code
* update config message
* Fix configuration setting value on add secondary storage page
* Removed unused code
* Update unit tests
* Cleanup userconcentratedpod_random and userconcentratedpod_firstfit allocation algorithm
* use firstfit instead of random for userconcentratedpod_firstfit
This PR aligns the use of terminology, renaming VM / virtual machine references to 'Instance' and also capitalising the terms Templates, Network, Snapshot, User, Account in CloudStack APIs, error and log messages, events, tooltips, etc. Many typos, grammar and spelling mistakes were fixed, also terms like IPv4, VPN, VPC, etc. were properly capitalised. Some error messages were cleaned for better readability. The test cases, expecting some exception strings were adjusted accordingly.
Here is the wiki page, describing the changes in details:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Object+Naming+and+Title+Case+Convention
---------
Co-authored-by: Manoj Kumar <manojkr.itbhu@gmail.com>
Co-authored-by: Harikrishna <harikrishna.patnala@gmail.com>
* 4.22:
Update templateConfig.sh to not break with directorys with space on t… (#10898)
Fix VM and volume metrics listing regressions (#12284)
packaging: use latest cmk release link directly (#11429)
api:rename RegisterCmd.java => RegisterUserKeyCmd.java (#12259)
Prioritize copying templates from other secondary storages instead of downloading them (#10363)
Show time correctly in the backup schedule UI (#12012)
kvm: use preallocation option for fat disk resize (#11986)
Python exception processing static routes fixed (#11967)
KVM memballooning requires free page reporting and autodeflate (#11932)
api: create/register/upload template with empty template tag (#12234)
* Migrate volume improvements, to bypass secondary storage when copy volume between pools is allowed directly
* Bypass secondary storage for copy volume between zone-wide pools and
- local storage on host in the same zone
- cluser-wide pools in the same zone
* Bypass secondary storage for volumes on ceph/rdb pool when the scope permits
* Fix dest disk format while migrating volume from ceph/rbd to nfs, and some code improvements
* unit tests
* Update suitable disk offering(s) for volume(s) after migrate VM with volumes when change in pool type (shared or local)
Currently, Migrate VM with volume(s) bypasses the service and disk offerings of the volumes, as the target pools for migration are specified,
which ignores the offerings. Offering change is required when pool type (shared or local) is changed, mainly
- when volume on shared pool is migrated to local pool
- when volume on local pool is migrated to shared pool
* Update with proper message while migrate volume when target pool and offering type mismatches (both are not shared/local)
* Consider host scope first during endpoint selection while copying between primary storages
* Update disk offering count (for listDiskOfferings api) while removing offerings with tags mismatch with storage tags
* Return details of the storage pool in the response including url, and update capacityBytes and capacityIops if applicable while creating storage pool
* Added capacitybytes parameter to the storage pool response in sync with the capacityiops response parameter and createStoragePool cmd request parameter (existing disksizetotal parameter in the storage pool response can be deprecated)
* Don't keep url in details
* Persist the capacityBytes and capacityIops in the storage_pool_details table while creating storage pool as well, for consistency - as these are updated with during update storage pool
* rebase with main fixes
This PR adds support for specifying user data (cloud-init) for system VMs via Zone Scoped global settings. This allows the operators to customize the System VMs and setup monitoring, logging or execute any custom commands.
We set the user data from the global setting in /var/cache/cloud/cmdline, and use the NoCloud datasource to process user data. cloud-init service is still disabled in the system VMs and it's executed as part of the cloud-postinit service which executes the postinit.sh script.
Added global settings:
systemvm.userdata.enabled - Disabled by default. Needs to be enabled to utilize the feature.
console.proxy.vm.userdata - UUID of the User data to be used for Console Proxy
secstorage.vm.userdata - UUID of the User data to be used for Secondary Storage VM
virtual.router.userdata - UUID of the User data to be used for Virtual Routers
CS creates transient KVM domain.xml. When instance is unmanaged from CS, explicit dump of domain has to be taken to manage is outside of CS.
With this PR
domainXML gets backed up and becomes persistent for further management of Instance.
Stopped instance also can be unmanaged, last host for instance is considered for defining domain
hostid param is supported in unmanageVirtualMachine API for KVM hypervisor and for stopped Instances
hostid field in response of unmanageVirtualMachine, representing host used for unmanage operation
Disable unmanaging instance with config drive, can unmanage from API using forced=true param for KVM
* Remove allocated snapshots / vm snapshots on start
* Check and Cleanup snapshots / vm snapshots on MS start
* rebase fixes
* Update volume state (from Snapshotting) on MS start when its snapshot job not finished and snapshot in Creating state
* [routers] distiction between fatal failure and warning or unknown on healthchecks
* UI status for router health checks
* status from scripts varied
* automation signalled errors
* revert removal of update sql
* upgradeversion
* move config item and further cleanup
* handling services better
* backwards compatible response
---------
Co-authored-by: Daan Hoogland <dahn@apache.org>
This feature adds the ability to create a new instance from a VM backup for dummy, NAS and Veeam backup providers. It works even if the original instance used to create the backup was expunged or unmanaged. There are two parts to this functionality:
Saving all configuration details that the VM had at the time of taking the backup. And using them to create an instance from backup.
Enabling a user to expunge/unmanage an instance that has backups.
The Netris Plugin introduces Netris as a network service provider in CloudStack to be able to create and manage Virtual Private Clouds (VPCs) in CloudStack, being able to orchestrate the following network functionalities:
- Network segmentation with Netris-VXLAN isolation method
- Routing between "public" IP and network segments with an ACS ROUTED mode offering
- SourceNAT, DNAT, 1:1 NAT between "public" IP and network segments with an ACS NATTED mode offering
- Routing between VPC network segments (tiers in ACS nomenclature)
- Access Lists (ACLs) between VPC tiers and "public" network (TCP, UDP, ICMP) both as global egress rules and "public" IP specific ingress rules.
- ACLs between VPC network tiers (TCP, UDP, ICMP)
- External load balancing – between VPC network tiers and "public" IP
- Internal load balancing – between VPC network tiers
- CloudStack Virtual Router services (DHCP, DNS, UserData, Password Injection, etc…)
* Option to deploy a VM with existing volume/snapshot
* smoke test changes
check if the hypervisor is KVM
check if the primary storage's scope is ZONE wide
* skip all tests if the storage isn't Zone-Wide and the hypervisor isn't KVM
* support StorPool tags
add StorPool tags to a volume created from snapshot or to a volume which
will be attached as a ROOT to a new VM
* Add StorPool tags on the new ROOT volume
* Add the StorPool's tags when volume is created from a snapshot or a
volume is attached as a ROOT to a VM
* Addressed review
* Introducing Storage Access Groups to define the host and storage pool connections
In CloudStack, when a primary storage is added at the Zone or Cluster scope, it is by default connected to all hosts within that scope. This default behavior can be refined using storage access groups, which allow operators to control and limit which hosts can access specific storage pools.
Storage access groups can be assigned to hosts, clusters, pods, zones, and primary storage pools. When a storage access group is set on a cluster/pod/zone, all hosts within that scope inherit the group. Connectivity between a host and a storage pool is then governed by whether they share the same storage access group.
A storage pool with a storage access group will connect only to hosts that have the same storage access group. A storage pool without a storage access group will connect to all hosts, including those with or without a storage access group.