Adds the delete-with-chain-repair semantics agreed in the RFC review:
scripts/vm/hypervisor/kvm/nasbackup.sh
- New '-o rebase' operation: rebases an existing on-NAS qcow2 onto
a new backing parent. Uses a SAFE rebase (no -u) so the target
absorbs blocks of the about-to-be-deleted parent before the
backing pointer is moved up to the grandparent. Writes the new
backing reference relative to the target's directory so it
survives mount-point changes (see the sketch below).
- New CLI flags --rebase-target, --rebase-new-backing (both passed
mount-relative).
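For illustration, a minimal sketch of the safe-rebase step (variable names, argument order, and directory layout are assumptions, not the script's actual interface):
```
# Hypothetical inputs; the real script derives these from its CLI flags.
MOUNT_DIR="$1"      # ephemeral NAS mount point (mktemp -d)
TARGET="$2"         # mount-relative path of the qcow2 to re-point
NEW_BACKING="$3"    # mount-relative path of the grandparent backing file

target_abs="${MOUNT_DIR}/${TARGET}"
backing_abs="${MOUNT_DIR}/${NEW_BACKING}"
[ -f "$backing_abs" ] || { echo "new backing file missing" >&2; exit 1; }

# Express the backing reference relative to the target's own directory,
# e.g. ../<parent-ts>/root.<uuid>.qcow2, so it survives remounts.
rel_backing=$(realpath --relative-to="$(dirname "$target_abs")" "$backing_abs")

# SAFE rebase (no -u): qemu-img first copies any blocks the target still
# reads from its current parent, then rewrites the backing pointer.
qemu-img rebase -F qcow2 -b "$rel_backing" "$target_abs"
```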
RebaseBackupCommand + LibvirtRebaseBackupCommandWrapper
- New agent command that wraps the script's rebase operation. The
provider sends one of these per child that needs re-pointing.
NASBackupProvider.deleteBackup
- Now plans the chain repair before touching files via
computeChainRepair():
  * No chain metadata -> single-file delete (legacy behaviour)
  * Tail incremental -> single delete, no rebase
  * Middle incremental -> rebase the immediate child onto our parent, then delete; shift chain_position of all later descendants by -1 (see the sketch below)
  * Full with descendants -> refuse unless forced=true; with forced=true, delete the full plus every descendant, newest-first
- Updates parent_backup_id, chain_position metadata in
backup_details after each rebase so the model in the DB matches
the on-disk chain.
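As a worked example of the middle-incremental case (file names are hypothetical), given a chain full.qcow2 <- inc1.qcow2 <- inc2.qcow2 where inc1 is being deleted:
```
cd /mnt/nas/<backup-dir>

# 1. Safe-rebase the immediate child onto the grandparent so inc2 absorbs
#    whatever blocks it still read from inc1.
qemu-img rebase -F qcow2 -b full.qcow2 inc2.qcow2

# 2. Only now is inc1 unreferenced and safe to remove.
rm inc1.qcow2

# 3. The provider then decrements chain_position for inc2 and all later
#    descendants in backup_details.
```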
This implements the cascade-delete behaviour requested in @abh1sar's
review point #7.
Refs: apache/cloudstack#12899
Two changes that together let an incremental NAS backup be restored
without manual chain assembly:
scripts/vm/hypervisor/kvm/nasbackup.sh
- qemu-img rebase now writes a backing-file path that is RELATIVE to
the new qcow2's directory (e.g. ../<parent-ts>/root.<uuid>.qcow2)
rather than the absolute path on the current mount point. NAS mount
points are ephemeral (mktemp -d), so an absolute reference would
not resolve when the backup is re-mounted at restore time. Relative
references are resolved by qemu-img against the file's own
directory, so the chain stays valid no matter where the NAS is
mounted next (see the check sketched below).
- Verifies the parent file exists on the NAS before rebasing.
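A quick way to sanity-check the result (mount point and file names are hypothetical):
```
# Remount the NAS at a fresh, random location and walk the chain.
mnt=$(mktemp -d)
mount -t nfs nas:/backups "$mnt"

# --backing-chain makes qemu-img open every backing file in turn; relative
# references resolve against each file's own directory, so this succeeds
# regardless of where the share is mounted.
qemu-img info --backing-chain "$mnt/<child-ts>/root.<uuid>.qcow2"
```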
LibvirtRestoreBackupCommandWrapper
- For file-based primary storage (local, NFS-file), the existing
code rsync'd the source qcow2 to the volume. That copies only the
differential blocks of an incremental, leaving a volume whose
backing-file reference points at a path the primary storage host
doesn't have. Now: detect a backing chain via qemu-img info JSON
and flatten via 'qemu-img convert -O qcow2', which follows the
chain and produces a self-contained qcow2 (sketched after this
list). Full backups continue to use rsync (faster, no chain to
flatten).
- The block-storage path (RBD/Linstor) already used qemu-img convert
via the QemuImg helper, which auto-flattens chains, so that path
needed no change.
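The decision the wrapper makes, sketched in shell form (paths are hypothetical; the real wrapper drives this from Java):
```
src="/mnt/<backup>/root.<uuid>.qcow2"
dst="/var/lib/libvirt/images/<volume-uuid>"

backing=$(qemu-img info --output=json "$src" | jq -r '."backing-filename" // empty')

if [ -n "$backing" ]; then
    # Incremental: convert follows the backing chain and emits a
    # self-contained qcow2 with no external references.
    qemu-img convert -O qcow2 "$src" "$dst"
else
    # Full backup: nothing to flatten, a plain copy is faster.
    rsync -a "$src" "$dst"
fi
```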
Refs: apache/cloudstack#12899
CloudStack rebuilds the libvirt domain XML on every VM start, which means
persistent QEMU dirty bitmaps don't survive a stop/start cycle. Rather
than hooking into the VM start lifecycle (intrusive across the
orchestration layer), this commit handles the missing bitmap *lazily* at
the next backup attempt:
nasbackup.sh
- When -M incremental is requested, the script first checks
`virsh checkpoint-list` for the parent bitmap. If absent, it
recreates the checkpoint on the running domain so libvirt accepts
the <incremental> reference (see the sketch after this list). The
next incremental will be larger
than usual (it captures all writes since recreate, not since the
previous incremental) but is correct; subsequent ones return to
normal size.
- On recreation, emits BITMAP_RECREATED=<name> on stdout for the
orchestrator to record.
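A minimal sketch of the lazy recheck (domain and bitmap names are hypothetical, and the checkpoint XML is reduced to the minimum; the actual script may differ):
```
DOMAIN="i-2-10-VM"
PARENT="cs-backup-3"

if ! virsh checkpoint-list "$DOMAIN" --name | grep -qx "$PARENT"; then
    # Re-register the checkpoint (and its dirty bitmap) on the running
    # domain; it tracks writes from this moment on, which is why the
    # next incremental is larger than usual.
    virsh checkpoint-create "$DOMAIN" /dev/stdin <<EOF
<domaincheckpoint>
  <name>$PARENT</name>
</domaincheckpoint>
EOF
    echo "BITMAP_RECREATED=$PARENT"
fi
```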
BackupAnswer
+ bitmapRecreated field surfaced from the agent.
LibvirtTakeBackupCommandWrapper
- Strips BITMAP_RECREATED= line from stdout before size parsing.
- Sets answer.setBitmapRecreated(...).
NASBackupChainKeys
+ BITMAP_RECREATED key for backup_details.
NASBackupProvider
- When the agent reports a recreated bitmap, persists it under
backup_details and logs an info-level message so operators can
correlate larger-than-usual incrementals with VM restarts.
This satisfies the bitmap-loss-on-VM-restart concern from the RFC review
without touching VirtualMachineManager / StartCommand / agent lifecycle.
Refs: apache/cloudstack#12899
Adds four new optional CLI flags to nasbackup.sh:
-M|--mode <full|incremental>
--bitmap-new <name> (checkpoint to create with this backup)
--bitmap-parent <name> (incremental: parent bitmap to read changes since)
--parent-path <path> (incremental: parent backup file for rebase)
Behavior:
- When -M is omitted, behavior is unchanged (legacy full-only, no checkpoint
created), so existing callers are not affected.
- With -M full + --bitmap-new, a full backup is taken AND a libvirt
checkpoint of that name is registered atomically (via backup-begin's
--checkpointxml), giving the next incremental its starting bitmap.
- With -M incremental, libvirt's <incremental> element references the
parent bitmap; only changed blocks are written. After completion,
qemu-img rebase wires the new file to its parent so the chain on the
NAS is self-describing for restore.
- Stopped VMs cannot use backup-begin; if -M incremental is requested
while the VM is stopped, the script falls back to a full and emits
INCREMENTAL_FALLBACK= on stderr so the orchestrator can record it
correctly in the chain.
- The script echoes BITMAP_CREATED=<name> on success so the Java caller
can store it under backup_details (NASBackupChainKeys.BITMAP_NAME).
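For orientation, hypothetical invocations of the two modes (the -o operation value and the trailing arguments are placeholders, not taken from the script):
```
# Full backup that also registers the checkpoint the next incremental
# will read changes against.
nasbackup.sh -o backup -M full --bitmap-new cs-backup-1 ...

# Incremental against that checkpoint; only dirtied blocks are written,
# then the new file is rebased onto its parent.
nasbackup.sh -o backup -M incremental \
    --bitmap-parent cs-backup-1 --bitmap-new cs-backup-2 \
    --parent-path <parent-ts>/root.<uuid>.qcow2 ...
```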
Works across local file, NFS-file, and LINSTOR primary storage. Ceph RBD
running-VM support is a pre-existing limitation of this script, not
affected by this change.
Refs: apache/cloudstack#12899
* Fix domain parsing for GPU
* Add Display controller to GPU class check
This adds support for the AMD Instinct MI2xx accelerator cards in the discovery script.
Co-authored-by: Piet Braat <piet@phiea.nl>
* extension/proxmox: improve host vm power reporting
Add `statuses` action in extensions to report VM power states
This PR introduces support for retrieving the power state of all VMs on a host directly from an extension using the new `statuses` action.
When available, this provides a single aggregated response, reducing the need for multiple calls.
If the extension does not implement `statuses`, the server will gracefully fall back to querying individual VMs using the existing `status` action.
This helps CloudStack update a VM's host after out-of-band migrations.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* address review
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
---------
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* XenServer 8.4/XCP-ng 8.3: Support vTPM
* fix issue
* add log for Windows 11 or other guest OSes that require vTPM
* remove secure bootmode requirement
* Fix uefi setting on host for xenserver 8.4
This PR introduces console access support for instances deployed using Orchestrator Extensions, available via either VNC or a direct URL.
- CloudStack queries the extension using the getconsole action.
- For VNC-based access, the extension must return host/port/ticket details. CloudStack then forwards these to the Console Proxy VM (CPVM) in the instance’s zone. It is assumed that the CPVM can reach the specified host and port.
- For direct URL access, the extension returns a console URL with the protocol set to `direct`. The URL is then provided directly to the user.
- The built-in Proxmox Orchestrator Extension now supports console access via VNC. The extension calls the Proxmox API to fetch console details and returns them in the required format.
Also, adds changes to send caller details to the extension payload.
```
# cat /var/lib/cloudstack/management/extensions/Proxmox/02b650f6-bb98-49cb-8cac-82b7a78f43a2.json | jq
{
"caller": {
"roleid": "6b86674b-7e61-11f0-ba77-1e00c8000158",
"rolename": "Root Admin",
"name": "admin",
"roletype": "Admin",
"id": "93567ed9-7e61-11f0-ba77-1e00c8000158",
"type": "ADMIN"
},
"virtualmachineid": "126f4562-1f0f-4313-875e-6150cabeb72f",
...
```
Documentation PR: https://github.com/apache/cloudstack-documentation/pull/560
---------
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* scripts: fix external provision to use correct power state
The valid states are poweron and poweroff.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* strip string while processing powerstate for HyperV
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* ignore warning that spills over to the extension output string
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
---------
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This feature adds the ability to create a new instance from a VM backup for the dummy, NAS and Veeam backup providers. It works even if the original instance used to create the backup was expunged or unmanaged. There are two parts to this functionality:
- Saving all configuration details that the VM had at the time of taking the backup, and using them to create an instance from the backup.
- Enabling a user to expunge/unmanage an instance that has backups.
This PR allows attaching GPU devices to an Instance via PCI, mdev or VF on KVM.
It allows the operator to discover the GPU devices on the KVM host and create a Compute Offering with GPU support based on the available GPU devices on the host. Once the operator has created the Compute offering, it can be used by users to launch Instances with GPU devices.
The Extensions Framework in Apache CloudStack is designed to provide a flexible and standardised mechanism for integrating external systems and custom workflows into CloudStack’s orchestration process. By defining structured hook points during key operations—such as virtual machine deployment, resource preparation, and lifecycle events—the framework allows administrators and developers to extend CloudStack’s behaviour without modifying its core codebase.
* NAS B&R Plugin enhancements
* Prevent printing mount opts, which may include a password, by removing them from the response
* revert marvin change
* add sanity checks to validate minimum qemu and libvirt versions
* check if the user running the script is part of the libvirt group
* revert changes of restore expunged VM
* add code coverage ignore file
* remove check
* fix issue with listing schedules and add defensive checks
* redirect logs to agent log file
* add some more debugging
* remove test file
* prevent deletion of CKS cluster when VMs are associated with a backup offering
* delete all snapshot policies when a backup offering is disassociated from a VM
* Fix `updateTemplatePermission` when the UI is set to a language other than English (#9766)
* Fix updateTemplatePermission UI in non-English language
* Improve fix
---------
* Add nobrl to the mount opts for the CIFS file system
* Fix restoration of VM / volumes with cifs
* add cifs-utils for EL8
* add cifs-utils for ubuntu cloudstack-agent
* fix syntax error
* remove the required constraint on both vmid and id params for the delete backup schedule command
This is a simple NAS backup plugin for KVM which may later be expanded to other hypervisors. This backup plugin aims to use shared NAS storage on KVM hosts, such as NFS (or CephFS and others in future), to back up fully cloned VMs for backup & restore operations. This may NOT be as efficient and performant as some of the other B&R providers, but may be useful for some KVM environments that are okay with having only full-instance backups and limited functionality.
Design & Implementation follows the `networker` B&R plugin, which is simply:
- Implement B&R plugin interfaces
- Use the cmd-answer pattern to execute backup and restore operations on the KVM host when the VM is running (or needs to be restored); instead of a B&R API client, it relies on answers from the KVM agent, which executes the operations
- Backups are full VM domain snapshots, copied to VM-specific folders on a NAS target (NFS) along with a domain XML
- Backup uses the libvirt feature https://libvirt.org/kbase/live_full_disk_backup.html, orchestrated via a virsh/bash script (nasbackup.sh) as libvirt-java lacks the bindings; a minimal sketch follows this list
- Supported instance volume storage for restore operations: NFS & local storage
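A minimal sketch of that libvirt flow (domain name, disk target, and paths are hypothetical; the actual script differs in detail):
```
DOMAIN="i-2-10-VM"
DEST="/mnt/nas/${DOMAIN}/$(date +%s)"
mkdir -p "$DEST"

cat > /tmp/backup.xml <<EOF
<domainbackup>
  <disks>
    <disk name='vda' type='file'>
      <driver type='qcow2'/>
      <target file='$DEST/root.qcow2'/>
    </disk>
  </disks>
</domainbackup>
EOF

# Push-mode backup: libvirt writes a full copy of the disk to the target
# while the VM keeps running (the job completes asynchronously).
virsh backup-begin "$DOMAIN" --backupxml /tmp/backup.xml

# Keep the domain XML alongside the disks for restore.
virsh dumpxml "$DOMAIN" > "$DEST/domain.xml"
```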
Refer to the doc PR for feature limitations and usage details:
https://github.com/apache/cloudstack-documentation/pull/429
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Co-authored-by: Pearl Dsilva <pearl1594@gmail.com>
Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
Extends the current KVM Host HA functionality to the StorPool storage plugin, with an option for easy integration so the rest of the storage plugins can support Host HA.
This extension works like the current NFS storage implementation. It can be used simultaneously with NFS and StorPool storage, or with StorPool primary storage only.
If it is used with different primary storages such as NFS and StorPool, and one of the storage health checks fails, there is an option to report the failure to the management server via the global config kvm.ha.fence.on.storage.heartbeat.failure. This option is disabled by default; when enabled, the Host HA service will continue with the checks on the host and will eventually fence it.
On Oracle Linux 9.0, the version shows as undefined and even Host.OS shows as "Red".
This change fixes the script to use '/etc/os-release' in such cases.
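The gist of the fallback (a sketch only; the actual script logic differs):
```
# /etc/os-release provides unambiguous fields even where parsing
# /etc/redhat-release yields "Red" and an undefined version.
if [ -r /etc/os-release ]; then
    . /etc/os-release
    echo "OS: $NAME, version: $VERSION_ID"   # e.g. "Oracle Linux Server" 9.0
fi
```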
Signed-off-by: Abhishek Kumar <abhishek.kumar@shapeblue.com>
* Improve log when live patching fails
* change patching path from /tmp to /var/cache/clou
* add iptable rule for console proxy (novnc)
* temporary template paths
* revert pom xml to original paths
* Support for live patching systemVMs and deprecating systemVM.iso. Includes:
- fix systemVM template version
- Include agent.zip, cloud-scripts.tgz to the commons package
- Support for live-patching systemVMs - CPVM, SSVM, Routers
- Fix Unit test
- Remove systemvm.iso dependency
* The following commit:
- refactors logic added to support SystemVM deployment on KVM
- Adds support to copy specific files (required for patching) to the hosts on Xenserver
- Modifies vmops method - createFileInDomr to take cleanup param
- Adds configurable sleep param to CitrixResourceBase::connect(), used to verify if telnet to a specific port is possible (if sleep is 0, then default to _sleep = 10000ms)
- Adds Command/Answer for patch systemVMs on XenServer/Xcp
* Support to patch SystemVMs - VMware
- Remove attaching systemvm.iso to systemVMs
- Modify / Refactor VMware start command to copy patch related files to the systemvms
- cleanup
* Commit comprises:
- remove docker from systemvm template - use containerd as container runtime
- update create-k8s-binaries script to use ctr for all docker operations
- Update userdata sent to the k8s nodes
- update cksnode script, run during patching of the cks/k8s nodes
* Add ssh to k8s nodes details in the Access tab on the UI
* test
* Refactor ca/cert patching logic
* Commit comprises the following changes:
- Use restart network/VPC API to patch routers
- use livePatch API support patching of only cpvm/ssvm
- add timeout to the keystore setup/import script
* remove all references of systemvm.iso
* Fix keystore-cert-import invocation + refactor cert timeout in CP/SS VMs
* fix script timeout
* Refactor cert patching for systemVMs + update keystore-cert-import script + patch-sysvms script + remove patchSysvmCommand from networkelementcommand
* remove commented code + change core user to cloud for cks nodes
* Update ownership of ssh directory
* NEED TO DISCUSS - add on the fly template conversion as an ExecStartPre action (systemd)
* Add UI changes + move changes from patch file to runcmd
* test: validate performance for template modification during seeding
* create vms folder in cloudstack-commons directory - debian rules
* remove logic for on the fly template convert + update k8s test
* fix syntax issue - causing issue with shared network tests
* Code cleanup
* refactor patching logic - certs
* move logic of fixing rootdiskcontroller from upgrade to kubernetes service
* add livepatch option to restart network & vpc
* smooth upgrade of cks clusters
* add cgroup config for containerd
* add systemd config for kubelet
* add additional info during image registry config
* address comments
* add temp links of download.cloudstack.org
* address part of the comments
* address comments
* update containerd config - as version has upgraded to 1.5 from 1.4.12 in 4.17.0
* address comments - simplify
* fix vue3 related icon changes
* allow network commands when router template version is lower but is patched
* add internal LB to the list of routers to be patched on network restart with live patch
* add unit tests for API param validations and new helper utilities - file scp & checksum validations
* perform patching only for non-user i.e., system VMs
* add test to validate params
* remove unused import
* add column to domain_router to display software version and support networkrestart with livePatch from router view
* 'Requires upgrade' column now considers the package (cloud-scripts) checksum to identify true/false
* use router software version instead of checksum
* show N/A if no software version reported i.e., in upgraded envs
* fix deb failure
* update pom to official links of systemVM template