cloudstack

Commit Graph

Author	SHA1	Message	Date
James Peru	b8d069e127	feat(backup): cascade-delete + chain repair for NAS incrementals Adds the delete-with-chain-repair semantics agreed in the RFC review: scripts/vm/hypervisor/kvm/nasbackup.sh - New '-o rebase' operation: rebases an existing on-NAS qcow2 onto a new backing parent. Uses a SAFE rebase (no -u) so the target absorbs blocks of the about-to-be-deleted parent before the backing pointer is moved up to the grandparent. Writes the new backing reference relative to the target's directory so it survives mount-point changes. - New CLI flags --rebase-target, --rebase-new-backing (both passed mount-relative). RebaseBackupCommand + LibvirtRebaseBackupCommandWrapper - New agent command that wraps the script's rebase operation. The provider sends one of these per child that needs re-pointing. NASBackupProvider.deleteBackup - Now plans the chain repair before touching files via computeChainRepair(): * No chain metadata -> single-file delete (legacy behaviour) * Tail incremental -> single delete, no rebase * Middle incremental -> rebase immediate child onto our parent, then delete; shift chain_position of all later descendants by -1 * Full with descendants -> refuse unless forced=true; with forced=true delete full + every descendant newest-first - Updates parent_backup_id, chain_position metadata in backup_details after each rebase so the model in the DB matches the on-disk chain. This implements the cascade-delete behaviour requested in @abh1sar's review point #7. Refs: apache/cloudstack#12899	2026-04-27 19:24:02 +03:00
James Peru	39303fbf88	feat(backup): restore path follows incremental backing-chain Two changes that together let an incremental NAS backup be restored without manual chain assembly: scripts/vm/hypervisor/kvm/nasbackup.sh - qemu-img rebase now writes a backing-file path that is RELATIVE to the new qcow2's directory (e.g. ../<parent-ts>/root.<uuid>.qcow2) rather than the absolute path on the current mount point. NAS mount points are ephemeral (mktemp -d), so an absolute reference would not resolve when the backup is re-mounted at restore time. Relative references are resolved by qemu-img against the file's own directory, so the chain stays valid no matter where the NAS is mounted next. - Verifies the parent file exists on the NAS before rebasing. LibvirtRestoreBackupCommandWrapper - For file-based primary storage (local, NFS-file), the existing code rsync'd the source qcow2 to the volume. That copies only the differential blocks of an incremental, leaving a volume whose backing-file reference points at a path the primary storage host doesn't have. Now: detect a backing-chain via qemu-img info JSON and flatten via 'qemu-img convert -O qcow2', which follows the chain and produces a self-contained qcow2. Full backups continue to use rsync (faster, no chain to flatten). - The block-storage path (RBD/Linstor) already used qemu-img convert via the QemuImg helper, which auto-flattens chains, so that path needed no change. Refs: apache/cloudstack#12899	2026-04-27 19:18:33 +03:00
James Peru	43e2f7504a	feat(backup): on-demand bitmap recreation for incremental NAS backup CloudStack rebuilds the libvirt domain XML on every VM start, which means persistent QEMU dirty bitmaps don't survive a stop/start cycle. Rather than hooking into the VM start lifecycle (intrusive across the orchestration layer), this commit handles the missing bitmap lazily at the next backup attempt: nasbackup.sh - When -M incremental is requested, the script first checks `virsh checkpoint-list` for the parent bitmap. If absent, it recreates the checkpoint on the running domain so libvirt accepts the <incremental> reference. The next incremental will be larger than usual (it captures all writes since recreate, not since the previous incremental) but is correct; subsequent ones return to normal size. - On recreation, emits BITMAP_RECREATED=<name> on stdout for the orchestrator to record. BackupAnswer + bitmapRecreated field surfaced from the agent. LibvirtTakeBackupCommandWrapper - Strips BITMAP_RECREATED= line from stdout before size parsing. - Sets answer.setBitmapRecreated(...). NASBackupChainKeys + BITMAP_RECREATED key for backup_details. NASBackupProvider - When the agent reports a recreated bitmap, persists it under backup_details and logs an info-level message so operators can correlate larger-than-usual incrementals with VM restarts. This satisfies the bitmap-loss-on-VM-restart concern from the RFC review without touching VirtualMachineManager / StartCommand / agent lifecycle. Refs: apache/cloudstack#12899	2026-04-27 19:10:46 +03:00
James Peru	fbb916b254	feat(backup): nasbackup.sh full+incremental modes via backup-begin Adds three new optional CLI flags to nasbackup.sh: -M\|--mode <full\|incremental> --bitmap-new <name> (checkpoint to create with this backup) --bitmap-parent <name> (incremental: parent bitmap to read changes since) --parent-path <path> (incremental: parent backup file for rebase) Behavior: - When -M is omitted, behavior is unchanged (legacy full-only, no checkpoint created), so existing callers are not affected. - With -M full + --bitmap-new, a full backup is taken AND a libvirt checkpoint of that name is registered atomically (via backup-begin's --checkpointxml), giving the next incremental its starting bitmap. - With -M incremental, libvirt's <incremental> element references the parent bitmap; only changed blocks are written. After completion, qemu-img rebase wires the new file to its parent so the chain on the NAS is self-describing for restore. - Stopped VMs cannot use backup-begin; if -M incremental is requested while VM is stopped, the script falls back to a full and emits INCREMENTAL_FALLBACK= on stderr so the orchestrator can record it correctly in the chain. - The script echoes BITMAP_CREATED=<name> on success so the Java caller can store it under backup_details (NASBackupChainKeys.BITMAP_NAME). Works across local file, NFS-file, and LINSTOR primary storage. Ceph RBD running-VM support is a pre-existing limitation of this script, not affected by this change. Refs: apache/cloudstack#12899	2026-04-27 18:53:20 +03:00
Daan Hoogland	82bfa9fb3f	Merge branch '4.22'	2026-04-14 14:50:44 +02:00
Wei Zhou	e297644ce1	KVM: Enable HA heartbeat on ShareMountPoint (#12773)	2026-04-10 14:12:40 +05:30
Suresh Kumar Anaparti	11538df710	Merge branch '4.22'	2026-04-10 12:02:40 +05:30
Vishesh	416679fae1	Fix domain parsing for GPU & add Display controller in the supported PCI class (#12981) * Fix domain parsing for GPU * Add Display controller to GPU class check this adds support for the amd instinct mi2xx accelerator cards in the discovery script. Co-authored-by: Piet Braat <piet@phiea.nl>	2026-04-10 09:23:07 +05:30
Suresh Kumar Anaparti	c3614098da	Merge branch '4.22'	2026-04-08 18:09:43 +05:30
Abhisar Sinha	03de62bf38	Support Linstor Primary Storage for NAS BnR (#12796)	2026-04-08 15:14:20 +05:30
John Bampton	39126a4339	Standardize and auto add license headers for Shell files with pre-commit (#12070) * Add shebang to shell scripts	2025-11-14 14:23:41 +01:00
Suresh Kumar Anaparti	b7a11cb203	NAS backup provider: Support restore from backup to volumes on Ceph storage pool(s), and take backup for stopped instances with volumes on Ceph storage pool(s) (#11684) Co-authored-by: Abhisar Sinha <63767682+abh1sar@users.noreply.github.com>	2025-10-06 09:13:28 +02:00
dk-blackfuel	d60f455b00	Fix detection of Mi3xx GPUs (#11715)	2025-09-30 18:34:58 +05:30
Vishesh	2c493d1933	Add support for nvidia vGPU support with vendor specific framework (#11432)	2025-08-15 15:54:11 +05:30
Vishesh	bcd738caa6	Fix GPU discovery script to make it run with mdev for SR-IOV enabled devices (#11340)	2025-07-31 18:29:35 +05:30
Abhisar Sinha	a87c5c2b3a	Create new Instance from VM backup (#10140 ) This feature adds the ability to create a new instance from a VM backup for dummy, NAS and Veeam backup providers. It works even if the original instance used to create the backup was expunged or unmanaged. There are two parts to this functionality: Saving all configuration details that the VM had at the time of taking the backup. And using them to create an instance from backup. Enabling a user to expunge/unmanage an instance that has backups.	2025-07-31 15:47:22 +05:30
Vishesh	f6ad184ea2	Feature: Add support for GPU with KVM hosts (#11143 ) This PR allows attaching of GPU devices via PCI, mdev or VF to an Instance for KVM. It allows the operator to discover the GPU devices on the KVM host and create a Compute Offering with GPU support based on the available GPU devices on the host. Once the operator has created the Compute offering, it can be used by users to launch Instances with GPU devices.	2025-07-29 13:46:24 +05:30
Pearl Dsilva	7f4e6a9d51	NAS B&R Plugin enhancements (#9666 ) * NAS B&R Plugin enhancements * Prevent printing mount opts which may include password by removing from response * revert marvin change * add sanity checks to validate minimum qemu and libvirt versions * check is user running script is part of libvirt group * revert changes of retore expunged VM * add code coverage ignore file * remove check * issue with listing schedules and add defensive checks * redirect logs to agent log file * add some more debugging * remove test file * prevent deletion of cks cluster when vms associated to a backup offering * delete all snapshot policies when bkp offering is disassociated from a VM * Fix `updateTemplatePermission` when the UI is set to a language other than English (#9766) * Fix updateTemplatePermission UI in non-english language * Improve fix --------- * Add nobrl in the mountopts for cifs file system * Fix restoration of VM / volumes with cifs * add cifs utils for el8 * add cifs-utils for ubuntu cloudstack-agent * syntax error * remove required constraint on both vmid and id params for the delete bkp schedule command	2025-03-04 11:32:09 -05:00
Rohit Yadav	85765c3125	backup: simple NAS backup plugin for KVM (#9451 ) This is a simple NAS backup plugin for KVM which may be later expanded for other hypervisors. This backup plugin aims to use shared NAS storage on KVM hosts such as NFS (or CephFS and others in future), which is used to backup fully cloned VMs for backup & restore operations. This may NOT be as efficient and performant as some of the other B&R providers, but maybe useful for some KVM environments who are okay to only have full-instance backups and limited functionality. Design & Implementation follows the `networker` B&R plugin, which is simply: - Implement B&R plugin interfaces - Use cmd-answer pattern to execute backup and restore operations on KVM host when VM is running (or needs to be restored) - instead of a B&R API client, relies on answers from KVM agent which executes the operations - Backups are full VM domain snapshots, copied to a VM-specific folders on a NAS target (NFS) along with a domain XML - Backup uses libvirt feature: https://libvirt.org/kbase/live_full_disk_backup.html orchestrated via virsh/bash script (nasbackup.sh) as the libvirt-java lacks the bindings - Supported instance volume storage for restore operations: NFS & local storage Refer the doc PR for feature limitations and usage details: https://github.com/apache/cloudstack-documentation/pull/429 Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com> Co-authored-by: Pearl Dsilva <pearl1594@gmail.com> Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com> Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>	2024-09-05 22:19:13 +05:30
John Bampton	28e8e2d009	pre-commit: add hook to trim trailing whitespace (#8205)	2024-05-28 09:01:30 +02:00
slavkap	2bb182c3e1	KVM Host HA enhancement for StorPool storage (#8045 ) Extending the current functionality of KVM Host HA for the StorPool storage plugin and the option for easy integration for the rest of the storage plugins to support Host HA This extension works like the current NFS storage implementation. It allows it to be used simultaneously with NFS and StorPool storage or only with StorPool primary storage. If it is used with different primary storages like NFS and StorPool, and one of the health checks fails for storage, there is an option to report the failure to the management with the global config kvm.ha.fence.on.storage.heartbeat.failure. By default this option is disabled when enabled the Host HA service will continue with the checks on the host and eventually will fence the host	2023-11-04 12:35:37 +05:30
John Bampton	6f4503488b	pre-commit: apply `end-of-file-fixer` to all files (#7551)	2023-08-02 13:47:21 +02:00
fermosan	9009dd1db8	Emc networker b&r (#6550) Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>	2023-01-09 15:46:25 +01:00
John Bampton	6401c850b7	Fix spelling (#6064) * Fix spelling - `interupted` to `interrupted` - `paramter` to `parameter` * Fix more typos	2022-03-08 13:02:35 -03:00
Gabriel Beims Bräscher	b4db3db617	Use default timeout and retransmission values for the NFS mount. (#6019) This also allows the mount command to apply NFS mount custom values set by ADMINS via '/etc/nfsmount.conf'.	2022-03-02 09:07:08 -03:00
davidjumani	6ac834a358	Adding AutoScaling for cks + CKS CoreOS EOL update + systemvmtemplate improvements (#4329 ) Adding AutoScaling support for cks Kubernetes PR : kubernetes/autoscaler#3629 Also replaces CoreOS with Debian Fixes #4198 Co-authored-by: Pearl Dsilva <pearl1594@gmail.com> Co-authored-by: Pearl Dsilva <pearl.dsilva@shapeblue.com> Co-authored-by: Wei Zhou <w.zhou@global.leaseweb.com> Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>	2021-10-06 21:17:41 +05:30
Daniel Augusto Veronezi Salvador	82df04ecc8	Improve HA logs (#5241) Co-authored-by: GutoVeronezi <daniel@scclouds.com.br>	2021-07-30 21:13:16 +02:00
Daniel Augusto Veronezi Salvador	99f2919ef4	Improve logs on kvmvmactivity.sh (#4704) Co-authored-by: Daniel Augusto Veronezi Salvador <daniel@scclouds.com.br>	2021-05-10 16:26:55 +02:00
Spaceman1984	23fa647985	kvm: sending std output to dev/null to prevent garbage output (#4123 ) When scripts/vm/hypervisor/kvm/kvmvmactivity.sh is called with an incorrect file name, an error is printed which is then interpreted as output from the script. When an incorrect file name is passed the script prints out: stat: cannot stat ‘b51d7336-d964-44ee-be60-bf62783dabc’: No such file or directory =====> DEAD <====== The KVMHAVMActivityChecker.java checkingHB() process is expecting just =====> DEAD <====== but gets the unexpected error message and interprets the file as alive.	2020-06-04 08:17:59 +05:30
Rohit Yadav	9ff819da2c	systemvm: new qemu-guest-agent based patching for KVM (#3278 ) This introduces a new patching script for patching systemvms on KVM using qemu-guest-agent that runs inside the systemvm on startup. This also removes the vport device which was previously used by the legacy patching script and instead uses the modern and new uniform guest agent vport for host-guest communication. Also updates the sytemvmtemplate build config to use the latest Debian 9.9.0 iso. Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>	2019-05-10 23:42:19 +05:30
Rohit Yadav	c6e53f6cc6	kvm: reset KVM host on heartbeat failure (#2984 ) On actual testing, I could see that kvmheartbeat.sh script fails on NFS server failure and stops the agent only. Any HA VMs could be launched in different hosts, and recovery of NFS server could lead to a state where a HA enabled VM runs on two hosts and can potentially cause disk corruptions. In most cases, VM disk corruption will be worse than VM downtime. I've kept the sleep interval between check/rounds but reduced it to 10s. The change in behaviour was introduced in #2722. Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>	2018-10-30 15:13:59 +05:30
Slair1	023dcec5ef	CLOUDSTACK-10310 Fix KVM reboot on storage issue (#2722)	2018-08-20 10:28:03 +02:00
Rohit Yadav	b0d7844cf0	CLOUDSTACK-10109: Fix regression from PR #2295 (#2394 ) This fixes regression introduced in PR #2295: - Pass assign=true to fetch new public IP - Use wait_until instead of sleep+wait in tests - Loop through list of public IP ranges to match the systemvm gateway - Fix potential NPE seen when adding simulator host(s) - Removes aria2 installation from setup_agent.sh using yum, it's already dependency for cloudstack-agent package Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>	2018-01-10 00:44:00 +05:30
Nicolas Vazquez	e86bb41e0e	CLOUDSTACK-10146: Bypass Secondary Storage for KVM templates (#2379) This feature allows using templates and ISOs avoiding secondary storage as intermediate cache on KVM. The virtual machine deployment process is enhanced to supported bypassed registered templates and ISOs, delegating the work of downloading them to primary storage to the KVM agent instead of the SSVM agent. Template and ISO registration: - When hypervisor is KVM, a checkbox is displayed with 'Direct Download' label. - API methods registerTemplate and registerISO are both extended with this new parameter directdownload. - On template or ISO registration, no download job is sent to SSVM agent, CloudStack would only persist an entry on template_store_ref indicating that template or ISO has been marked as 'Direct Download' (bypassing Secondary Storage). These entries are persisted as: template_id = Template or ISO id on vm_template table store_id NULL download_state = BYPASSED state = Ready (Note: these entries allow users to deploy virtual machine from registered templates or ISOs) - An URL validation command is sent to a random KVM host to check if template/ISO location can be reached. Metalink are also supported by this feature. In case of a metalink, it is fetched and URL check is performed on each of its URLs. - Checksum should be provided as indicated on #2246: {ALGORITHM}CHKSUMHASH - After template or ISO is registered, it would be displayed in the UI Virtual machine deployment: When a 'Direct Download' template is selected for deployment, CloudStack would delegate template downloading to destination storage pool via destination host by a new pluggable download manager. Download manager would handle template downloading depending on URL protocol. In case of HTTP, request headers can be set by the user via vm_template_details. Those details should be persisted as: Key: HTTP_HEADER Value: HEADERNAME:HEADERVALUE In case of HTTPS, a new API method is added uploadTemplateDirectDownloadCertificate to allow user importing a client certificate into all KVM hosts' keystore before deployment. After template or ISO is downloaded to primary storage, usual entry would be persisted on template_spool_ref indicating the mapping between template/ISO and storage pool.	2018-01-09 12:22:18 +05:30
Boris Stoyanov	f917ab660e	CLOUDSTACK-9782: Improve host HA tests - All tests should pass on KVM, Simulator - Add test cases covering FSM state transitions and actions Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>	2017-08-30 18:06:48 +02:00
Rohit Yadav	212e5ccfa7	CLOUDSTACK-9782: Host HA and KVM HA provider Host-HA offers investigation, fencing and recovery mechanisms for host that for any reason are malfunctioning. It uses Activity and Health checks to determine current host state based on which it may degrade a host or try to recover it. On failing to recover it, it may try to fence the host. The core feature is implemented in a hypervisor agnostic way, with two separate implementations of the driver/provider for Simulator and KVM hypervisors. The framework also allows for implementation of other hypervisor specific provider implementation in future. The Host-HA provider implementation for KVM hypervisor uses the out-of-band management sub-system to issue IPMI calls to reset (recover) or poweroff (fence) a host. The Host-HA provider implementation for Simulator provides a means of testing and validating the core framework implementation. Signed-off-by: Abhinandan Prateek <abhinandan.prateek@shapeblue.com> Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>	2017-08-30 18:06:48 +02:00
Daan Hoogland	70ef0788c9	CLOUDSTACK-9408: Fix download urls in sql and scripts This fixes the agreed upon url on download.cloudstack.org in various sql files and misc scripts. Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>	2017-04-20 12:33:33 +05:30
Sverrir Berg	751d3552dc	patchviasocket improve error handling more detailed error if host file not found or cannot be opened using mkstemp and mkdtemp for improved security improve resource cleanup in error conditions in unit test	2016-05-20 15:42:34 +00:00
Sverrir A. Berg	0acd3c12a2	Convert patchviasocket to python (removes perl dependency for KVM agent) As requested here: https://github.com/apache/cloudstack/pull/1495 No scripts are using perl so that install requirement can be removed. The new scripts are using standard python packages only. Includes extensive unit test.	2016-05-20 15:42:34 +00:00
Remi Bergsma	87fdb521f0	CLOUDSTACK-8443: don't try to fix co-mounted cgroups This setting works on CentOS 6 / RHEL 6 but does nothing, as "cpu" cgroup is not mounted. On CentOS 7 / RHEL 7 systemd does mount cgroups and "cpu" is co-mounted with "cpuacct". Hence, if we specify "cpu" then this results in an error because it can only use them both, or none. By removing the setting, we rely on the default of qemu, which is: cgroup_controllers = ["cpu", "devices", "memory", "blkio", "cpuacct", "net_cls"] Only if they are really mounted, they will be used. So, this will work on both version 6 and 7. The 'fix script' didn't work well, as after a reboot you'd still have qemu throwing errors. Now we can handle the co-mounted cgroups.	2015-08-24 15:49:40 +02:00
Remi Bergsma	d1cb4c7d50	RHEL 7 and CentOS 7 need the same fix	2015-08-19 16:30:24 +02:00
Remi Bergsma	14013d5d1b	fixing white space and formatting	2015-08-19 16:24:44 +02:00
Remi Bergsma	7bce656b40	make sure sync cannot block reboot The recent discussed improvement has the risk that if 'sync' hangs, the reboot may be delayed in the same way as the 'reboot' command would do. To work around, we're adding a 5 second timeout. If it cannot sync in 5 seconds, it will not succeed anyway and we should proceed the reset. @snuf: Could we use your OVM3 heartbeat script for other hypervisors as well? One way to do it seems like a nice idea :-)	2015-04-09 12:18:21 +02:00
Remi Bergsma	c59308b0ee	write logfile just before rebooting the host As discussed with @wido @pyr and @nuxro added an extra log line. Tested it and it logs fine (tested to local disk) when syncing first: Apr 3 15:31:23 mcctest2 heartbeat: kvmheartbeat.sh system because it was unable to write the heartbeat to the storage By the way, it did also log to the agent.log but this extra log has the benefit of ending up in the system log so you'll probably find it easier there. Existing logs: 2015-04-03 15:27:23,943 WARN [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 0 2015-04-03 15:28:23,944 WARN [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 1 2015-04-03 15:29:23,946 WARN [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 2 2015-04-03 15:30:23,948 WARN [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 3 2015-04-03 15:31:23,950 WARN [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 4 2015-04-03 15:31:23,950 WARN [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout; reboot the host This closes #145 Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>	2015-04-04 14:17:37 +05:30
Remi Bergsma	2b41f98346	reboot much faster in case of storage failure When storage cannot be reached, it does not make sense to reboot as it will try to flush buffers, umount NFS mounts, etc. This will not work and thus cause a long delay. With this change, the box will reboot immediately (like pressing the reset button).	2015-04-01 19:45:16 +02:00
Kishan Kavala	4f3de024de	Add script to ensure cgroups are not co-mounted in rhel7/lxc. If required, script will unmount co-mounted cgroups and remount them separately	2014-09-11 14:34:40 +05:30
tuna	c7dab82dc4	move cloudstack_pluginlib	2013-12-09 23:33:15 +07:00
tuna	3df8b912fc	add kvm support & LB service	2013-12-09 23:33:14 +07:00
Sheng Yang	83c13fcf27	CLOUDSTACK-2614: Fix the permission of patchviasocket.pl It's non-executable now, which cause trouble on deb package.	2013-05-29 14:24:49 -07:00
Marcus Sorensen	f66b9b570f	Send only \n rather than \r\n to agent socket when sending cmdline to system VMS BUG-ID: CLOUDSTACK-1732 Signed-off-by: Marcus Sorensen <marcus@betterservers.com> 1365622030 -0600	2013-04-10 13:27:10 -06:00
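
The delete planning described in commit b8d069e127 can be sketched as follows. This is a minimal illustration in Python, not the Java code from that commit: the `Backup` and `RepairPlan` types, the `descendants` helper, and the function signature are hypothetical, and the sketch assumes a strictly linear chain in which a full backup is the entry with no parent. Including the rebased child in the position shift, and deleting the full backup last in the forced case, are interpretations of the commit message.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Backup:
    # Hypothetical in-memory model; field names mirror the metadata keys
    # mentioned in the commit message (parent_backup_id, chain_position).
    id: str
    parent_backup_id: Optional[str] = None
    chain_position: Optional[int] = None  # None stands for "no chain metadata"


@dataclass
class RepairPlan:
    delete: List[str] = field(default_factory=list)              # ids to delete, in order
    rebase: List[Tuple[str, str]] = field(default_factory=list)  # (child id, new parent id)
    shift: List[str] = field(default_factory=list)               # ids whose chain_position drops by 1


def descendants(backups: Dict[str, Backup], root_id: str) -> List[str]:
    """Follow child links from root_id and return descendant ids, oldest first.

    Assumes a linear, cycle-free chain (at most one child per backup).
    """
    out: List[str] = []
    current = root_id
    while True:
        child = next((b for b in backups.values() if b.parent_backup_id == current), None)
        if child is None:
            return out
        out.append(child.id)
        current = child.id


def compute_chain_repair(backups: Dict[str, Backup], target_id: str, forced: bool = False) -> RepairPlan:
    """Plan the chain repair for deleting target_id before any file is touched."""
    target = backups[target_id]

    # No chain metadata: legacy single-file delete.
    if target.chain_position is None:
        return RepairPlan(delete=[target_id])

    later = descendants(backups, target_id)

    # Tail of the chain (or a full with no descendants): single delete, no rebase.
    if not later:
        return RepairPlan(delete=[target_id])

    # Full backup with descendants: refuse unless forced, else delete newest-first.
    if target.parent_backup_id is None:
        if not forced:
            raise ValueError(f"backup {target_id} has descendants; pass forced=True to delete the chain")
        return RepairPlan(delete=list(reversed(later)) + [target_id])

    # Middle incremental: rebase the immediate child onto our parent, delete,
    # then shift the chain position of everything after the deleted backup.
    return RepairPlan(
        delete=[target_id],
        rebase=[(later[0], target.parent_backup_id)],
        shift=later,
    )
```

For a chain full, inc1, inc2, inc3, deleting inc1 yields one rebase of inc2 onto full and a position shift for inc2 and inc3, while deleting full without `forced=True` is refused.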


76 Commits