Commit Graph

38748 Commits

Author SHA1 Message Date
James Peru 39303fbf88 feat(backup): restore path follows incremental backing-chain
Two changes that together let an incremental NAS backup be restored
without manual chain assembly:

  scripts/vm/hypervisor/kvm/nasbackup.sh
    - qemu-img rebase now writes a backing-file path that is RELATIVE to
      the new qcow2's directory (e.g. ../<parent-ts>/root.<uuid>.qcow2)
      rather than the absolute path on the current mount point. NAS mount
      points are ephemeral (mktemp -d), so an absolute reference would
      not resolve when the backup is re-mounted at restore time. Relative
      references are resolved by qemu-img against the file's own
      directory, so the chain stays valid no matter where the NAS is
      mounted next.
    - Verifies the parent file exists on the NAS before rebasing.

  LibvirtRestoreBackupCommandWrapper
    - For file-based primary storage (local, NFS-file), the existing
      code rsync'd the source qcow2 to the volume. That copies only the
      differential blocks of an incremental, leaving a volume whose
      backing-file reference points at a path the primary storage host
      doesn't have. Now: detect a backing-chain via qemu-img info JSON
      and flatten via 'qemu-img convert -O qcow2', which follows the
      chain and produces a self-contained qcow2. Full backups continue
      to use rsync (faster, no chain to flatten).
    - The block-storage path (RBD/Linstor) already used qemu-img convert
      via the QemuImg helper, which auto-flattens chains, so that path
      needed no change.

Refs: apache/cloudstack#12899
2026-04-27 19:18:33 +03:00
James Peru 43e2f7504a feat(backup): on-demand bitmap recreation for incremental NAS backup
CloudStack rebuilds the libvirt domain XML on every VM start, which means
persistent QEMU dirty bitmaps don't survive a stop/start cycle. Rather
than hooking into the VM start lifecycle (intrusive across the
orchestration layer), this commit handles the missing bitmap *lazily* at
the next backup attempt:

  nasbackup.sh
    - When -M incremental is requested, the script first checks
      `virsh checkpoint-list` for the parent bitmap. If absent, it
      recreates the checkpoint on the running domain so libvirt accepts
      the <incremental> reference. The next incremental will be larger
      than usual (it captures all writes since recreate, not since the
      previous incremental) but is correct; subsequent ones return to
      normal size.
    - On recreation, emits BITMAP_RECREATED=<name> on stdout for the
      orchestrator to record.

  BackupAnswer
    + bitmapRecreated field surfaced from the agent.

  LibvirtTakeBackupCommandWrapper
    - Strips BITMAP_RECREATED= line from stdout before size parsing.
    - Sets answer.setBitmapRecreated(...).

  NASBackupChainKeys
    + BITMAP_RECREATED key for backup_details.

  NASBackupProvider
    - When the agent reports a recreated bitmap, persists it under
      backup_details and logs an info-level message so operators can
      correlate larger-than-usual incrementals with VM restarts.

This satisfies the bitmap-loss-on-VM-restart concern from the RFC review
without touching VirtualMachineManager / StartCommand / agent lifecycle.

Refs: apache/cloudstack#12899
2026-04-27 19:10:46 +03:00
James Peru 1f2aebca36 feat(backup): orchestrate full vs incremental in NAS provider
Adds the Java side of the incremental NAS backup feature:

  TakeBackupCommand
    + mode, bitmapNew, bitmapParent, parentPath fields (null for legacy
      callers — script preserves its existing behaviour when these are
      omitted).

  BackupAnswer
    + bitmapCreated (echoed by the agent on success)
    + incrementalFallback (true when an incremental was requested but the
      agent had to fall back to full because the VM was stopped).

  LibvirtTakeBackupCommandWrapper
    - Forwards the new fields to nasbackup.sh.
    - Strips the new BITMAP_CREATED= / INCREMENTAL_FALLBACK= marker lines
      out of stdout before the existing numeric-suffix size parser runs,
      so the script can keep the same "size as last line(s)" contract.
    - Surfaces both markers on the BackupAnswer.

  NASBackupProvider
    - decideChain(vm) walks backup_details (chain_id, chain_position,
      bitmap_name) for the latest BackedUp backup of the VM and decides:
        * Stopped VM      -> full (libvirt backup-begin needs running QEMU)
        * No prior chain  -> full (chain_position=0)
        * chain_position+1 >= nas.backup.full.every -> new full
        * otherwise       -> incremental, parent=last bitmap
    - Generates timestamp-based bitmap names ("backup-<epoch>") matching
      what the script then registers as the libvirt checkpoint name.
    - persistChainMetadata() writes parent_backup_id, bitmap_name,
      chain_id, chain_position, type into the existing backup_details
      key/value table (per the RFC review — no new columns on backups).
    - Honours the agent's INCREMENTAL_FALLBACK= signal: re-records the
      backup as a full and starts a fresh chain.
    - createBackupObject() now takes a type argument so the BackupVO
      reflects the actual decision instead of always being "FULL".

Refs: apache/cloudstack#12899
2026-04-27 19:07:24 +03:00
James Peru fbb916b254 feat(backup): nasbackup.sh full+incremental modes via backup-begin
Adds three new optional CLI flags to nasbackup.sh:
  -M|--mode <full|incremental>
  --bitmap-new <name>          (checkpoint to create with this backup)
  --bitmap-parent <name>       (incremental: parent bitmap to read changes since)
  --parent-path <path>         (incremental: parent backup file for rebase)

Behavior:
  - When -M is omitted, behavior is unchanged (legacy full-only, no checkpoint
    created), so existing callers are not affected.
  - With -M full + --bitmap-new, a full backup is taken AND a libvirt
    checkpoint of that name is registered atomically (via backup-begin's
    --checkpointxml), giving the next incremental its starting bitmap.
  - With -M incremental, libvirt's <incremental> element references the
    parent bitmap; only changed blocks are written. After completion,
    qemu-img rebase wires the new file to its parent so the chain on the
    NAS is self-describing for restore.
  - Stopped VMs cannot use backup-begin; if -M incremental is requested
    while VM is stopped, the script falls back to a full and emits
    INCREMENTAL_FALLBACK= on stderr so the orchestrator can record it
    correctly in the chain.
  - The script echoes BITMAP_CREATED=<name> on success so the Java caller
    can store it under backup_details (NASBackupChainKeys.BITMAP_NAME).

Works across local file, NFS-file, and LINSTOR primary storage. Ceph RBD
running-VM support is a pre-existing limitation of this script, not
affected by this change.

Refs: apache/cloudstack#12899
2026-04-27 18:53:20 +03:00
James Peru 1981469099 feat(backup): add chain-metadata keys + nas.backup.full.every config
NASBackupChainKeys defines the keys this provider stores under the
existing backup_details kv table (parent_backup_id, bitmap_name,
chain_id, chain_position, type). This keeps the backups table
provider-agnostic per the RFC review.

nas.backup.full.every is a zone-scoped ConfigKey that controls how
often a full backup is taken; the remaining backups in the cycle are
incremental. Counts backups (not days), so it works for hourly,
daily, and ad-hoc schedules. Default 10. Set to 1 to disable
incrementals (every backup is full).

Refs: apache/cloudstack#12899
2026-04-27 18:49:38 +03:00
James Peru f2a9202d74 docs: add RFC for incremental NAS backup support (KVM)
Adds the design document for incremental NAS backups using QEMU dirty
bitmaps and libvirt's backup-begin API. Reduces daily backup storage
80-95% for large VMs.

Refs: apache/cloudstack#12899
2026-04-27 18:46:22 +03:00
Henrique Sato 6f4445c5c1
Add offering preset variables for `Network` and `VPC` Quota tariffs (#11810)
* Add offering preset variable to Network and VPC tariffs

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Fabricio Duarte <fabricio.duarte.jr@gmail.com>

* Add tests

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Fabricio Duarte <fabricio.duarte.jr@gmail.com>
2026-04-27 09:36:37 -03:00
Suresh Kumar Anaparti 856d83a15e
Merge branch '4.22' 2026-04-23 23:53:24 +05:30
dahn 64ac0822b4
merge conflict fixes (#13046)
* merge conflict fixes

* fix pre-commit issue

Co-authored-by: Daan Hoogland <dahn@apache.org>
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
2026-04-23 23:46:54 +05:30
Nicolas Vazquez be89e6f7c3
[KVM] Reorder migration logs to prevent populating agent logs on migrations (#12883)
* Move logs for values of the migration settings out of the loop

* Apply suggestions from code review

Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>

---------

Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
2026-04-17 23:39:19 -03:00
Henrique Sato 3166e64891
Add support for new variables to the GUI whitelabel runtime system (#12760)
* Add support for new variables to the GUI whitelabel runtime system

* Address review
2026-04-17 10:59:50 -03:00
Wei Zhou f820d0125d
fix end of files and codespell errors 2026-04-17 13:58:21 +02:00
Wei Zhou 6c1437b7dd
fix end of file schema-42200to42210.sql 2026-04-17 13:56:17 +02:00
Daniil Zhyliaiev 4df32ae79f
fix: NsxResource.executeRequest DeleteNsxNatRuleCommand comparison bug (#12833)
Fixes an issue in NsxResource.executeRequest where Network.Service
comparison failed when DeleteNsxNatRuleCommand was executed in a
different process. Due to serialization/deserialization, the
deserialized Network.Service instance was not equal to the static
instances Network.Service.StaticNat and Network.Service.PortForwarding,
causing the comparison to always return false.

Co-authored-by: Andrey Volchkov <avolchkov@playtika.com>
(cherry picked from commit 30dd234b00)
2026-04-17 04:53:36 +05:30
Suresh Kumar Anaparti 2d6280b9da
Merge branch '4.22' 2026-04-17 04:35:25 +05:30
Suresh Kumar Anaparti 13a2c7793c
Merge branch '4.20' into 4.22 2026-04-17 03:12:33 +05:30
Brad House - Nexthop 83f705ddc5
Static Routes with nexthop non-functional for private gateways (#12859)
* Fix static routes to be added to PBR tables in VPC routers

Static routes were only being added to the main routing table, but
policy-based routing (PBR) is active on VPC routers. This caused
traffic coming in from specific interfaces to not find the static
routes, as they use interface-specific routing tables (Table_ethX).

This fix:
- Adds a helper method to find which interface a gateway belongs to
  by matching the gateway IP against configured interface subnets
- Modifies route add/delete operations to update both the main table
  and the appropriate interface-specific PBR table
- Uses existing CsAddress databag metadata to avoid OS queries
- Handles both add and revoke operations for proper cleanup
- Adds comprehensive logging for troubleshooting

Fixes #12857

* Add iptables FORWARD rules for nexthop-based static routes

When static routes use nexthop (gateway) instead of referencing a
private gateway's public IP, the iptables FORWARD rules were not
being generated. This caused traffic to be dropped by ACLs.

This fix:
- Adds a shared helper CsHelper.find_device_for_gateway() to determine
  which interface a gateway belongs to by checking subnet membership
- Updates CsStaticRoutes to use the shared helper instead of duplicating
  the device-finding logic
- Modifies CsAddress firewall rule generation to handle both old-style
  (ip_address-based) and new-style (nexthop-based) static routes
- Generates the required FORWARD and PREROUTING rules for nexthop routes:
  * -A PREROUTING -s <network> ! -d <interface_ip>/32 -i <dev> -j ACL_OUTBOUND_<dev>
  * -A FORWARD -d <network> -o <dev> -j ACL_INBOUND_<dev>
  * -A FORWARD -d <network> -o <dev> -m state --state RELATED,ESTABLISHED -j ACCEPT

Fixes the second part of #12857

* network matching grep fix, don't let 1.2.3.4/32 match 11.2.3.4/32
2026-04-16 16:15:43 +05:30
Brad House 6e810989b6
HAProxy Configuration: network.loadbalancer.haproxy.idle.timeout (#12586)
* initial attempt at network.loadbalancer.haproxy.idle.timeout implementation

* implement test cases

* move idleTimeout configuration test to its own test case
2026-04-16 14:49:54 +05:30
Daniil Zhyliaiev e0fe953791
fix: NSX SDK list operations are pageable: the API returns a non-null and non-empty (#12834)
`cursor` field when more pages are available. The previous implementation only
fetched the first page and ignored pagination.

This change updates the list retrieval flow to:
- follow the `cursor` chain until no further pages exist
- accumulate items from all pages
- return a single merged result to the caller

This ensures that list operations return the complete dataset rather than just
the first page.

Co-authored-by: Andrey Volchkov <avolchkov@playtika.com>
2026-04-16 14:15:30 +05:30
Daniil Zhyliaiev 05c59630e0
fix: LB Creation avoid 404 API errors due to non-needed patches (#12835) 2026-04-16 13:58:20 +05:30
Wei Zhou 1fc4cb90bf
Routed VR: accept packets from related and established connections (#12986) 2026-04-15 15:36:26 +05:30
Abhishek Kumar c6936889f5
server: prevent adding vm compute details when not applicable (#12637) 2026-04-15 10:41:20 +02:00
Daan Hoogland f5e75771bc merge forwards fix 2026-04-15 09:58:27 +02:00
Daan Hoogland c298f8f360 Merge release branch 4.22.0.1 to 4.22
* tag '4.22.0.1':
  Implement limit validations on updateBucket
  Address reviews
2026-04-15 08:58:24 +02:00
Fabricio Duarte 2511fdffaa Implement limit validations on updateBucket 2026-04-15 08:53:37 +02:00
Fabricio Duarte 13842a626d Address reviews 2026-04-15 08:52:35 +02:00
Nicolas Vazquez 160876c6d7
Fix: API Thread held forever during force deleting across MS (#12968) 2026-04-15 08:41:26 +02:00
Erik Böck 5013cf2af6
Fix user password reset mail template value (#12882)
* Fix default user password reset email template

* improve readabilty

* change update query

* Specify database for update

* Fix SQL statement

* Use CONCAT_WS sql method to create multiline string

---------

Co-authored-by: GaOrtiga <49285692+GaOrtiga@users.noreply.github.com>
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
2026-04-15 10:06:39 +05:30
Harikrishna 0c86899cc1
Added VDDK support in VMware to KVM migrations (#12970) 2026-04-14 22:33:01 +05:30
Daan Hoogland 82bfa9fb3f Merge branch '4.22' 2026-04-14 14:50:44 +02:00
Daan Hoogland 23f633ae83 Merge tag '4.22.0.1' into 4.22 2026-04-14 13:15:14 +02:00
Daan Hoogland 1085da4ef8 Merge commit '19b4ef106931aa1d6a8fed06984009d86760e4de' into 4.22 2026-04-14 13:15:05 +02:00
Suresh Kumar Anaparti d75acb6efc
Fix rollback disk snapshots on instance snapshot failure (#12949) 2026-04-14 15:21:05 +05:30
Suresh Kumar Anaparti 38abe2df0b
Allow list async jobs by resource type alone (#13011) 2026-04-14 15:20:13 +05:30
Suresh Kumar Anaparti feb6076930
Remove unused config consoleproxy.cmd.port (#12807)
* Remove unused config 'consoleproxy.cmd.port'

* Remove the config key

---------

Co-authored-by: dahn <daan@onecht.net>
2026-04-14 13:40:00 +05:30
julien-vaz 161b4177c2
Add logs for storage pools reordering (#10419)
Co-authored-by: Julien Hervot de Mattos Vaz <julien.vaz@scclouds.com.br>
2026-04-14 09:51:05 +02:00
Henrique Sato ed575cc0a1
New config.json variable to set the ACS default language (#12863)
* New config.json variable to set the ACS default language

* Address review
2026-04-13 14:37:45 -03:00
Jtolelo ae455ee193 VPC restart cleanup for Public networks with multi-CIDR data (#12622)
* Fix VPC restart with multi-CIDR networks: handle comma-separated CIDR in NetworkVO.equals()

When a network has multiple CIDRs (e.g. '192.168.2.0/24,160.0.0.0/24'),
NetworkVO.equals() passes the raw comma-separated string to
NetUtils.isNetworkAWithinNetworkB() which expects a single CIDR,
causing 'cidr is not formatted correctly' error during VPC restart
with cleanup=true.

Extract only the first CIDR value before passing to NetUtils.

* Fix root cause: skip CIDR/gateway updates for Public traffic type networks

addCidrAndGatewayForIpv4/Ipv6 (introduced by PR #11249) was called for all
network types without checking if the network is Public. This caused
comma-separated CIDRs to be stored on Public networks, which then triggered
'cidr is not formatted correctly' errors during VPC restart.

Add TrafficType.Public guard in both the VLAN creation (addCidr) and
VLAN deletion (removeCidr) paths in ConfigurationManagerImpl.

* Sanitize legacy network-level addressing fields for Public networks

---------

Co-authored-by: dahn <daan@onecht.net>
2026-04-13 15:40:26 +02:00
Suresh Kumar Anaparti 47c5bb8ee7 Support list/query async jobs by resource (#12983)
* Add resource filtering to async job query commands

* Fix logical condition in AsyncJobDaoImpl and ResourceIdSupport

* resource type case-insensitive validation

* fix resource type and id search

---------

Co-authored-by: mprokopchuk <mprokopchuk@gmail.com>
Co-authored-by: mprokopchuk <mprokopchuk@apple.com>
2026-04-13 15:39:49 +02:00
sandeeplocharla 5b696c0ec7
Create, Delete, Enable, Disable, Enter, Cancel maintenance of Primary StoragePool with ONTAP storage (#12563)
* Create & Delete, Enable & Disable, Enter & Cancel maintenance of Primary StoragePool with ONTAP storage
Co-authored-by: Rajiv Jain <Rajiv.Jain@netapp.com>

Create & Delete, Enable & Disable, Enter & Cancel maintenance of Primary StoragePool with ONTAP storage
Co-authored-by: Rajiv Jain<rajiv1@netapp.com>

Edited readme file

Fixed license check issue

Removed dependency that's not conforming with ACF guidelines

* Fixed the initial review comments

* Fixed some rebase issues

---------

Co-authored-by: Locharla, Sandeep <Sandeep.Locharla@netapp.com>
2026-04-13 08:38:15 -03:00
Abhisar Sinha 8eb162cb99 Updating pom.xml version numbers for release 4.20.4.0-SNAPSHOT 2026-04-13 15:48:18 +05:30
Daan Hoogland d6f4fc3ac4 Updating pom.xml version numbers for release 22.0.1 2026-04-13 11:53:00 +02:00
Abhishek Kumar 19b4ef1069 server: reserve backup, bucket resource limits during operations
Changes to check resource limits with reservations for the following
resource types:
- backup
- backup_storage
- bnucket
- object_storage

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2026-04-13 11:21:41 +02:00
Fabricio Duarte 9f57a4dd19
Unhide setting `js.interpretation.enabled` (#12605)
* Unhide setting 'js.interpretation.enabled'

* Fix grammar mistake
2026-04-10 23:45:07 -03:00
João Jandre 7c7b2ae75d
Fix KVM incremental volume snapshot creation (#12666) 2026-04-11 00:12:44 +05:30
Manoj Kumar b196e97cc3
Prevent deletion of account and domain if either of them has deleted protected instance (#12901) 2026-04-10 15:51:22 +02:00
Abhisar Sinha df7ff97271
Create volume on a specified storage pool (#12966) 2026-04-10 14:27:39 +02:00
Wei Zhou 273699cf56
kvm: fix wrong CheckVirtualMachineAnswer when vm does not exist (#12928)
* kvm: fix wrong CheckVirtualMachineAnswer when vm does not exist

* kvm: add LibvirtCheckVirtualMachineCommandWrapperTest

Co-authored-by: dahn <daan.hoogland@gmail.com>
2026-04-10 16:01:29 +05:30
poddm 8f3c6fad7a
set snapcpg config on copy (#12955) 2026-04-10 15:18:45 +05:30
Bernardo De Marco Gonçalves 27e4d979f1
Clean up backup references to their schedules when the schedules are deleted (#12401)
* clean up backup schedule references after their deletion

* drop unused column

* address reviews
2026-04-10 14:51:52 +05:30