Commit Graph

38744 Commits

Author SHA1 Message Date
Eugenio Grosso 723bf1445f kvm/flasharray: address review feedback on NVMe-TCP PR
Apply the review comments from the first round on #13061:

* FlashArrayAdapter.snapshot() and both getSnapshot() entry points now
  wrap the returned FlashArrayVolume in withAddressType(). Without this,
  snapshots taken against an NVMe-TCP pool had the constructor-default
  AddressType.FIBERWWN and ProviderSnapshot.getAddress() emitted an FC
  style WWN instead of the NVMe EUI-128, which the adaptive driver then
  persisted as the snapshot path. Verified end-to-end against Purity 6.7.7:
  a fresh NVMe-TCP snapshot now lands with install_path starting 006c...,
  matching the source volume's EUI (previously it was 6-24a9370...).

* FlashArrayAdapter.attach() - retry path after 'Connection already
  exists' no longer requires a hostgroup-scoped match for NVMe-TCP. If
  hostgroup is not configured, or the existing connection is host-scoped,
  fall back to matching by host name, same as the Fibre Channel branch.
  Also normalize the 'volume lun is not found' message when no
  connection list is returned.

* FlashArrayAdapter.attach() - initial 'Volume attach did not return lun
  information' exception message now mentions both lun (FC) and nsid
  (NVMe-TCP) so the error is not misleading on NVMe deployments.

* FlashArrayAdapter.getVolumeByAddress() - validate the EUI-128 length
  before slicing. A short/malformed address used to throw
  StringIndexOutOfBoundsException deep inside getFlashArrayItem and be
  swallowed as 'not found'; now a clear RuntimeException is raised with
  the expected vs actual length.

* FlashArrayVolume.getAddress() - same defensive check when building an
  EUI-128 from the FlashArray volume serial; if the serial is shorter
  than 24 hex chars, fail with a clear message instead of SIOOBE.

* MultipathNVMeOFAdapterBase.connectPhysicalDisk() - Integer.parseInt of
  the STORAGE_POOL_DISK_WAIT detail is now guarded; a non-numeric value
  falls back to the default rather than aborting the connect.

* MultipathNVMeOFAdapterBase.rescanAllControllers() - honour the boolean
  return from Process.waitFor(). If an nvme ns-rescan invocation does
  not complete in NS_RESCAN_TIMEOUT_SECS we destroyForcibly() it, so
  hung nvme-cli processes do not accumulate while the namespace poll
  loop retries.

* NVMeTCPAdapter - rename LOGGER_NVMETCP to LOGGER to match the naming
  convention used in the other KVM adapters.
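
The two MultipathNVMeOFAdapterBase items above (guarded parsing of the
wait detail and the bounded wait on nvme ns-rescan) could look roughly
like the sketch below. It is an illustration only, not the code from
this commit: the class name, both method names, and the default of 60
seconds are hypothetical. Only Integer.parseInt,
Process.waitFor(long, TimeUnit), and Process.destroyForcibly() are
standard Java APIs.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.TimeUnit;

class RescanSketch {
    // Hypothetical default; the commit message does not state the real value.
    static final int DEFAULT_DISK_WAIT_SECS = 60;

    /** Parse a pool detail as an int; fall back to the default on null or non-numeric input. */
    static int parseWaitOrDefault(String detail) {
        if (detail == null) {
            return DEFAULT_DISK_WAIT_SECS;
        }
        try {
            return Integer.parseInt(detail.trim());
        } catch (NumberFormatException e) {
            return DEFAULT_DISK_WAIT_SECS;
        }
    }

    /**
     * Run a command and wait at most timeoutSecs for it to exit.
     * Returns true if it exited in time; otherwise kills it and returns false.
     */
    static boolean runWithTimeout(List<String> command, long timeoutSecs)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true)
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                .start();
        // waitFor(long, TimeUnit) returns false when the timeout elapses first.
        if (!process.waitFor(timeoutSecs, TimeUnit.SECONDS)) {
            process.destroyForcibly();
            return false;
        }
        return true;
    }
}
```

Checking the boolean from waitFor is what lets the caller tell "exited"
apart from "still running", so a hung child can be killed before the
next poll iteration starts another one.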

Signed-off-by: Eugenio Grosso <eugenio.grosso@gmail.com>
2026-04-23 12:21:31 +00:00
Eugenio Grosso c0cdfa41da kvm: implement copyPhysicalDisk on MultipathNVMeOFAdapterBase
The NVMe-oF KVM adapter refused every template copy request from the
adaptive storage orchestrator with UnsupportedOperationException, which
made it impossible to use an NVMe-TCP pool as primary storage for a VM
root disk: every deploy that landed a root volume on the pool failed
as soon as CloudStack tried to lay down the template.

Implement it the same way FiberChannel (SCSI) does: the storage provider
creates and connects a raw namespace ahead of time, then the adapter
resolves the host-side /dev/disk/by-id/nvme-eui.<NGUID> path via the
existing getPhysicalDisk plumbing (which will nvme ns-rescan and wait
for the symlink if the kernel has not yet picked it up) and qemu-img
converts the source image into the raw block device.
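
As a rough illustration of the last step, the sketch below builds a
qemu-img command line that writes a source image to the raw namespace
device. It is not the adapter's implementation: the class and method
names and the parameters are hypothetical, and the encryption check
simply mirrors the rejection described in this commit message. The
qemu-img convert options -f and -O are the standard ones.

```java
import java.util.List;

class CopySketch {
    /**
     * Build a qemu-img invocation that converts sourcePath (in sourceFormat)
     * into the raw NVMe namespace identified by eui.
     */
    static List<String> buildConvertCommand(String sourceFormat, String sourcePath,
                                            String eui, boolean encrypted) {
        if (encrypted) {
            // Assumption: encrypted source or destination volumes are refused up front.
            throw new UnsupportedOperationException(
                    "user-space encrypted volumes are not supported on this pool");
        }
        // udev symlink for the namespace, as described in the commit message.
        String target = "/dev/disk/by-id/nvme-eui." + eui;
        return List.of("qemu-img", "convert",
                "-f", sourceFormat,   // format of the source image
                "-O", "raw",          // write raw data to the block device
                sourcePath, target);
    }
}
```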

User-space encrypted source or destination volumes are rejected: the
FlashArray already encrypts at rest, and stacking qemu-img LUKS on top
of a hostgroup-scoped namespace shared between hosts is not sensible.
Source encryption would also break on migration because the
passphrase does not travel.

With this change a CloudStack KVM VM can have its ROOT volume on an
NVMe-TCP pool (tested end-to-end on 4.23-SNAPSHOT against Purity 6.7.7:
template copy, first boot, live migrate with data disk, VM snapshot
with quiesce, and revert all work).

Signed-off-by: Eugenio Grosso <eugenio.grosso@gmail.com>
2026-04-22 20:52:05 +00:00
Eugenio Grosso ff03d9f4f3 docs: note NVMe-TCP support on the FlashArray adaptive plugin in PendingReleaseNotes 2026-04-20 22:50:54 +00:00
Eugenio Grosso b27512c431 adaptive: pick NVMeTCP pool type when transport=nvme-tcp
The adaptive storage framework hard-coded FiberChannel as the KVM-side
pool type for every provider it fronts. With a separate NVMeTCP pool
type now available (and a dedicated NVMe-oF adapter on the KVM side),
teach the lifecycle to route a pool to the right adapter based on a
transport= URL parameter:

  https://user:pass@host/api?...&transport=nvme-tcp

  -> StoragePoolType.NVMeTCP -> NVMeTCPAdapter on the KVM host

When the query parameter is absent the default stays FiberChannel, so
existing FC deployments on Primera or FlashArray continue to work
unchanged.
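
The routing rule can be sketched as below. This is a simplified
stand-in, not the code in AdaptiveDataStoreLifeCycleImpl: the class,
the local PoolType enum, and the method name are hypothetical, and the
sketch assumes the value must match nvme-tcp exactly.

```java
import java.net.URI;

class TransportSketch {
    // Local stand-in for the two relevant StoragePoolType values.
    enum PoolType { FiberChannel, NVMeTCP }

    /** Pick the pool type from the transport= query parameter; default to FiberChannel. */
    static PoolType poolTypeFor(String url) {
        String query = URI.create(url).getRawQuery();
        if (query != null) {
            for (String pair : query.split("&")) {
                String[] kv = pair.split("=", 2);
                if (kv.length == 2 && kv[0].equals("transport") && kv[1].equals("nvme-tcp")) {
                    return PoolType.NVMeTCP;
                }
            }
        }
        // Parameter absent or any other value: keep the existing default.
        return PoolType.FiberChannel;
    }
}
```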

The choice is made in the shared AdaptiveDataStoreLifeCycleImpl rather
than inside each vendor plugin so every adaptive provider (FlashArray,
Primera, any future one) speaks the same configuration vocabulary.
2026-04-20 22:50:09 +00:00
Eugenio Grosso 20ba972e78 kvm: add MultipathNVMeOFAdapterBase and NVMeTCPAdapter
Introduce an NVMe-over-Fabrics counterpart to the existing
MultipathSCSIAdapterBase / FiberChannelAdapter pair.

NVMe-oF is conceptually distinct from SCSI - it speaks the NVMe command
set, identifies namespaces by EUI-128 NGUIDs, and is multipathed by the
kernel natively rather than by device-mapper - so keeping it out of the
SCSI code path avoids special-casing inside every method that handles
volume paths, connect, disconnect, or size lookup.

MultipathNVMeOFAdapterBase (abstract)

  * Parses volume paths of the form
        type=NVMETCP; address=<eui>; connid.<host>=<nsid>; ...
    into an AddressInfo whose path is
        /dev/disk/by-id/nvme-eui.<eui>
    which is the udev symlink the kernel emits for every NVMe namespace.

  * connectPhysicalDisk polls the udev path and, on every iteration,
    triggers nvme ns-rescan on all local NVMe controllers, to cover
    target/firmware combinations that do not send an asynchronous event
    notification when a new namespace is mapped.

  * disconnectPhysicalDisk is a no-op; the kernel drops the namespace
    when the target removes the host-group connection. The
    ByPath variant only claims paths starting with
    /dev/disk/by-id/nvme-eui. so foreign paths still fall through to
    other adapters.

  * Delegates getPhysicalDisk, isConnected, and getPhysicalDiskSize to
    plain test -b / blockdev --getsize64 calls - no SCSI rescan, no dm
    multipath, no multipath-map cleanup timer.

  * createPhysicalDisk / createTemplateFromDisk / listPhysicalDisks /
    copyPhysicalDisk all throw UnsupportedOperationException - these
    are the responsibility of the storage provider, not the KVM
    adapter, same as the SCSI base.
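
For illustration, the path parsing described in the first item above
could be sketched as follows. The class and method names are
hypothetical and the real AddressInfo type is not reproduced here; the
sketch only shows splitting the key=value tokens and deriving the udev
path from the address field.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PathSketch {
    /** Split "k1=v1; k2=v2; ..." into an ordered map, ignoring empty tokens. */
    static Map<String, String> parseFields(String volumePath) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String token : volumePath.split(";")) {
            String[] kv = token.trim().split("=", 2);
            if (kv.length == 2 && !kv[0].isEmpty()) {
                fields.put(kv[0].trim(), kv[1].trim());
            }
        }
        return fields;
    }

    /** Resolve the host-side device path for an NVMETCP volume path. */
    static String devicePath(String volumePath) {
        Map<String, String> fields = parseFields(volumePath);
        if (!"NVMETCP".equals(fields.get("type"))) {
            throw new IllegalArgumentException("not an NVMETCP volume path: " + volumePath);
        }
        String address = fields.get("address");
        if (address == null || address.isEmpty()) {
            throw new IllegalArgumentException("volume path has no address: " + volumePath);
        }
        return "/dev/disk/by-id/nvme-eui." + address;
    }
}
```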

MultipathNVMeOFPool

  * KVMStoragePool mirror of MultipathSCSIPool. Defaults to
    Storage.StoragePoolType.NVMeTCP in the parameterless-fallback
    constructor.

NVMeTCPAdapter

  * Concrete adapter that registers itself for
    Storage.StoragePoolType.NVMeTCP via the reflection-based scan in
    KVMStoragePoolManager. Carries no logic of its own beyond binding
    the base to the pool type.

A similar MultipathNVMeOFAdapterBase-derived NVMeRoCEAdapter (or
NVMeFCAdapter) can later be introduced with one concrete subclass and a
new pool-type value; the base does not assume any particular
fabric-level transport.
2026-04-20 22:44:38 +00:00
Eugenio Grosso 7d1ec8ff8a storage: add NVMeTCP storage pool type
NVMe-oF over TCP (NVMe-TCP) is conceptually a separate storage fabric
from Fibre Channel / iSCSI: it speaks the NVMe command set rather than
SCSI, identifies namespaces by EUI-128 NGUIDs rather than WWNs, and on
Linux is multipathed natively by the nvme driver rather than by
device-mapper multipath. Giving it its own StoragePoolType lets the
KVM agent dispatch the adaptive driver to a dedicated NVMe-oF adapter
(added in the next commit) without polluting the existing Fibre Channel
code path.

The new value is wired into the same format-routing and derivePath
fall-through paths that already special-case FiberChannel in
KVMStorageProcessor: NVMe-TCP volumes are also RAW and carry their
device path in DataObjectTO.path rather than in a managedStoreTarget
detail.
2026-04-20 22:39:43 +00:00
Eugenio Grosso 1b44cfa604 flasharray: support NVMe-TCP transport
Teach FlashArrayAdapter to talk to a pool over NVMe over TCP instead of
Fibre Channel.

The transport is selected from a new transport= option on the storage
pool URL (or the equivalent storage_pool_details entry), e.g.

    https://user:pass@fa:443/api?pod=cs&transport=nvme-tcp&hostgroup=cluster1

Defaults remain Fibre Channel / WWN addressing when transport is absent
or anything other than nvme-tcp, so existing FC pools are unaffected.

Beyond the transport parsing itself the adapter now:

  * Tracks a per-pool volumeAddressType (AddressType.NVMETCP or
    FIBERWWN) and stamps every volume it hands back to the framework
    with it (withAddressType), so the adaptive driver path stores the
    correct type=... field in the CloudStack volume path (used later
    by the KVM driver to locate the device).

  * Attaches pod-backed NVMe-TCP volumes at the host-group level
    (POST /connections?host_group_names=...) instead of per-host, so
    the array assigns a consistent NSID to every member host; falls
    back to per-host attach for FC or when no hostgroup is configured.

  * Tolerates a missing nsid in the FlashArray connections response
    for NVMe-TCP - Purity does not return one for host-group NVMe
    connections; the namespace is identified on the host by EUI-128
    from FlashArrayVolume.getAddress(), so a placeholder value is
    returned to the caller purely for informational tracking.

  * Resolves NVMETCP addresses back to volumes in getVolumeByAddress
    by reversing the EUI-128 layout (strip optional eui. prefix, drop
    leading 00 and the embedded Pure OUI).

  * Indexes NVMe connections in getConnectionIdMap by host name (the
    array returns one entry per host inside a host-group connection),
    so connid.<hostname> tokens in the path still match in
    parseAndValidatePath on the KVM side.
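
The address reversal in getVolumeByAddress can be sketched as below.
This is an assumption-laden illustration rather than the adapter code:
the class and method names are hypothetical, and the offsets assume
the 32-hex-character layout of 00, 14 serial characters, the 6-character
Pure OUI, then the remaining 10 serial characters.

```java
class EuiReverseSketch {
    // 2 ("00") + 14 (serial head) + 6 (OUI) + 10 (serial tail)
    static final int EUI_HEX_LENGTH = 32;

    /** Recover the 24-character volume serial from an EUI-128 address. */
    static String serialFromEui(String address) {
        // The eui. prefix is optional.
        String eui = address.startsWith("eui.") ? address.substring(4) : address;
        if (eui.length() != EUI_HEX_LENGTH) {
            throw new IllegalArgumentException("unexpected EUI-128 length: expected "
                    + EUI_HEX_LENGTH + " hex characters, got " + eui.length());
        }
        // Skip the leading "00", keep 14 characters, skip the OUI, keep the last 10.
        return eui.substring(2, 16) + eui.substring(22, 32);
    }
}
```

Checking the length before slicing keeps a malformed address from
surfacing as a StringIndexOutOfBoundsException further down the call
chain.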

Followed by a matching adaptive/KVM driver change (separate commit).
2026-04-20 22:26:05 +00:00
Eugenio Grosso c3c0f0cedd adapter: add NVMETCP address type and FlashArrayConnection.nsid
Preparatory data-model changes for NVMe-TCP support on the adaptive
storage framework. No behaviour change for existing Fibre Channel
users - the extra enum value, field, and getter/setter are only
exercised by callers that explicitly use them.

ProviderVolume.AddressType gains an NVMETCP value alongside FIBERWWN,
so adapters can declare that a volume is addressed by an NVMe EUI-128
(NGUID) rather than a SCSI WWN.

FlashArrayVolume.getAddress() produces the NGUID layout expected by
the Linux kernel for a FlashArray NVMe namespace:

    00 + serial[0:14] + 24a937 (Pure 6-hex OUI) + serial[14:24]

which matches the /dev/disk/by-id/nvme-eui.<id> symlink emitted by
udev. Fibre Channel callers (addressType != NVMETCP) still get the
existing 6 + 24a9370 + serial form.
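
A minimal sketch of the NVMe layout above, assuming a 24-hex-character
serial. The class and method names are hypothetical, and lowercasing
the serial is an assumption made here so the result matches the
lowercase udev symlink; the commit message does not say how the real
getAddress() handles case.

```java
import java.util.Locale;

class EuiBuildSketch {
    static final String PURE_OUI = "24a937";

    /** Build 00 + serial[0:14] + OUI + serial[14:24] from a volume serial. */
    static String nvmeEui(String serial) {
        if (serial == null || serial.length() < 24) {
            throw new IllegalArgumentException(
                    "serial must have at least 24 hex characters to build an EUI-128");
        }
        // Assumption: normalise to lowercase to match the udev symlink name.
        String s = serial.toLowerCase(Locale.ROOT);
        return "00" + s.substring(0, 14) + PURE_OUI + s.substring(14, 24);
    }
}
```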

FlashArrayConnection gains an nsid field to carry the namespace id the
FlashArray REST API attaches to host-group-scoped NVMe connections,
when it is present.
2026-04-20 22:06:00 +00:00
Henrique Sato 3166e64891
Add support for new variables to the GUI whitelabel runtime system (#12760)
* Add support for new variables to the GUI whitelabel runtime system

* Address review
2026-04-17 10:59:50 -03:00
Wei Zhou f820d0125d
fix end of files and codespell errors 2026-04-17 13:58:21 +02:00
Suresh Kumar Anaparti 2d6280b9da
Merge branch '4.22' 2026-04-17 04:35:25 +05:30
Suresh Kumar Anaparti 13a2c7793c
Merge branch '4.20' into 4.22 2026-04-17 03:12:33 +05:30
Brad House - Nexthop 83f705ddc5
Static Routes with nexthop non-functional for private gateways (#12859)
* Fix static routes to be added to PBR tables in VPC routers

Static routes were only being added to the main routing table, but
policy-based routing (PBR) is active on VPC routers. This caused
traffic coming in from specific interfaces to not find the static
routes, as they use interface-specific routing tables (Table_ethX).

This fix:
- Adds a helper method to find which interface a gateway belongs to
  by matching the gateway IP against configured interface subnets
- Modifies route add/delete operations to update both the main table
  and the appropriate interface-specific PBR table
- Uses existing CsAddress databag metadata to avoid OS queries
- Handles both add and revoke operations for proper cleanup
- Adds comprehensive logging for troubleshooting

Fixes #12857

* Add iptables FORWARD rules for nexthop-based static routes

When static routes use nexthop (gateway) instead of referencing a
private gateway's public IP, the iptables FORWARD rules were not
being generated. This caused traffic to be dropped by ACLs.

This fix:
- Adds a shared helper CsHelper.find_device_for_gateway() to determine
  which interface a gateway belongs to by checking subnet membership
- Updates CsStaticRoutes to use the shared helper instead of duplicating
  the device-finding logic
- Modifies CsAddress firewall rule generation to handle both old-style
  (ip_address-based) and new-style (nexthop-based) static routes
- Generates the required FORWARD and PREROUTING rules for nexthop routes:
  * -A PREROUTING -s <network> ! -d <interface_ip>/32 -i <dev> -j ACL_OUTBOUND_<dev>
  * -A FORWARD -d <network> -o <dev> -j ACL_INBOUND_<dev>
  * -A FORWARD -d <network> -o <dev> -m state --state RELATED,ESTABLISHED -j ACCEPT

Fixes the second part of #12857

* network matching grep fix, don't let 1.2.3.4/32 match 11.2.3.4/32
2026-04-16 16:15:43 +05:30
Brad House 6e810989b6
HAProxy Configuration: network.loadbalancer.haproxy.idle.timeout (#12586)
* initial attempt at network.loadbalancer.haproxy.idle.timeout implementation

* implement test cases

* move idleTimeout configuration test to its own test case
2026-04-16 14:49:54 +05:30
Daniil Zhyliaiev e0fe953791
fix: NSX SDK list operations are pageable: the API returns a non-null and non-empty (#12834)
`cursor` field when more pages are available. The previous implementation only
fetched the first page and ignored pagination.

This change updates the list retrieval flow to:
- follow the `cursor` chain until no further pages exist
- accumulate items from all pages
- return a single merged result to the caller

This ensures that list operations return the complete dataset rather than just
the first page.
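
The cursor-following flow can be sketched generically as below. This
does not use the NSX SDK: the Page class, the fetchAll method, and the
page-fetching function are hypothetical stand-ins for whatever list
call and result type the SDK provides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class CursorPagingSketch {
    /** One page of results plus the cursor for the next page (null or empty when done). */
    static final class Page<T> {
        final List<T> items;
        final String cursor;

        Page(List<T> items, String cursor) {
            this.items = items;
            this.cursor = cursor;
        }
    }

    /** Call fetchPage repeatedly, following the cursor chain, and merge all items. */
    static <T> List<T> fetchAll(Function<String, Page<T>> fetchPage) {
        List<T> all = new ArrayList<>();
        String cursor = null; // the first request carries no cursor
        do {
            Page<T> page = fetchPage.apply(cursor);
            all.addAll(page.items);
            cursor = page.cursor;
        } while (cursor != null && !cursor.isEmpty());
        return all;
    }
}
```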

Co-authored-by: Andrey Volchkov <avolchkov@playtika.com>
2026-04-16 14:15:30 +05:30
Daniil Zhyliaiev 05c59630e0
fix: LB Creation avoid 404 API errors due to non-needed patches (#12835) 2026-04-16 13:58:20 +05:30
Wei Zhou 1fc4cb90bf
Routed VR: accept packets from related and established connections (#12986) 2026-04-15 15:36:26 +05:30
Abhishek Kumar c6936889f5
server: prevent adding vm compute details when not applicable (#12637) 2026-04-15 10:41:20 +02:00
Daan Hoogland f5e75771bc merge forwards fix 2026-04-15 09:58:27 +02:00
Daan Hoogland c298f8f360 Merge release branch 4.22.0.1 to 4.22
* tag '4.22.0.1':
  Implement limit validations on updateBucket
  Address reviews
2026-04-15 08:58:24 +02:00
Fabricio Duarte 2511fdffaa Implement limit validations on updateBucket 2026-04-15 08:53:37 +02:00
Fabricio Duarte 13842a626d Address reviews 2026-04-15 08:52:35 +02:00
Nicolas Vazquez 160876c6d7
Fix: API Thread held forever during force deleting across MS (#12968) 2026-04-15 08:41:26 +02:00
Erik Böck 5013cf2af6
Fix user password reset mail template value (#12882)
* Fix default user password reset email template

* improve readability

* change update query

* Specify database for update

* Fix SQL statement

* Use CONCAT_WS sql method to create multiline string

---------

Co-authored-by: GaOrtiga <49285692+GaOrtiga@users.noreply.github.com>
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
2026-04-15 10:06:39 +05:30
Harikrishna 0c86899cc1
Added VDDK support in VMware to KVM migrations (#12970) 2026-04-14 22:33:01 +05:30
Daan Hoogland 82bfa9fb3f Merge branch '4.22' 2026-04-14 14:50:44 +02:00
Daan Hoogland 23f633ae83 Merge tag '4.22.0.1' into 4.22 2026-04-14 13:15:14 +02:00
Daan Hoogland 1085da4ef8 Merge commit '19b4ef106931aa1d6a8fed06984009d86760e4de' into 4.22 2026-04-14 13:15:05 +02:00
Suresh Kumar Anaparti d75acb6efc
Fix rollback disk snapshots on instance snapshot failure (#12949) 2026-04-14 15:21:05 +05:30
Suresh Kumar Anaparti 38abe2df0b
Allow list async jobs by resource type alone (#13011) 2026-04-14 15:20:13 +05:30
Suresh Kumar Anaparti feb6076930
Remove unused config consoleproxy.cmd.port (#12807)
* Remove unused config 'consoleproxy.cmd.port'

* Remove the config key

---------

Co-authored-by: dahn <daan@onecht.net>
2026-04-14 13:40:00 +05:30
julien-vaz 161b4177c2
Add logs for storage pools reordering (#10419)
Co-authored-by: Julien Hervot de Mattos Vaz <julien.vaz@scclouds.com.br>
2026-04-14 09:51:05 +02:00
Henrique Sato ed575cc0a1
New config.json variable to set the ACS default language (#12863)
* New config.json variable to set the ACS default language

* Address review
2026-04-13 14:37:45 -03:00
Jtolelo ae455ee193 VPC restart cleanup for Public networks with multi-CIDR data (#12622)
* Fix VPC restart with multi-CIDR networks: handle comma-separated CIDR in NetworkVO.equals()

When a network has multiple CIDRs (e.g. '192.168.2.0/24,160.0.0.0/24'),
NetworkVO.equals() passes the raw comma-separated string to
NetUtils.isNetworkAWithinNetworkB() which expects a single CIDR,
causing 'cidr is not formatted correctly' error during VPC restart
with cleanup=true.

Extract only the first CIDR value before passing to NetUtils.

* Fix root cause: skip CIDR/gateway updates for Public traffic type networks

addCidrAndGatewayForIpv4/Ipv6 (introduced by PR #11249) was called for all
network types without checking if the network is Public. This caused
comma-separated CIDRs to be stored on Public networks, which then triggered
'cidr is not formatted correctly' errors during VPC restart.

Add TrafficType.Public guard in both the VLAN creation (addCidr) and
VLAN deletion (removeCidr) paths in ConfigurationManagerImpl.

* Sanitize legacy network-level addressing fields for Public networks
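
The first item above (take only the first CIDR before handing the value
to a single-CIDR check) amounts to something like the sketch below. The
class and method names are hypothetical; the actual change lives in
NetworkVO.equals().

```java
class CidrSketch {
    /** Return the first entry of a comma-separated CIDR list, or null for null input. */
    static String firstCidr(String cidrs) {
        if (cidrs == null) {
            return null;
        }
        int comma = cidrs.indexOf(',');
        return (comma < 0 ? cidrs : cidrs.substring(0, comma)).trim();
    }
}
```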

---------

Co-authored-by: dahn <daan@onecht.net>
2026-04-13 15:40:26 +02:00
Suresh Kumar Anaparti 47c5bb8ee7 Support list/query async jobs by resource (#12983)
* Add resource filtering to async job query commands

* Fix logical condition in AsyncJobDaoImpl and ResourceIdSupport

* resource type case-insensitive validation

* fix resource type and id search

---------

Co-authored-by: mprokopchuk <mprokopchuk@gmail.com>
Co-authored-by: mprokopchuk <mprokopchuk@apple.com>
2026-04-13 15:39:49 +02:00
sandeeplocharla 5b696c0ec7
Create, Delete, Enable, Disable, Enter, Cancel maintenance of Primary StoragePool with ONTAP storage (#12563)
* Create & Delete, Enable & Disable, Enter & Cancel maintenance of Primary StoragePool with ONTAP storage
Co-authored-by: Rajiv Jain <Rajiv.Jain@netapp.com>

Create & Delete, Enable & Disable, Enter & Cancel maintenance of Primary StoragePool with ONTAP storage
Co-authored-by: Rajiv Jain<rajiv1@netapp.com>

Edited readme file

Fixed license check issue

Removed dependency that's not conforming with ACF guidelines

* Fixed the initial review comments

* Fixed some rebase issues

---------

Co-authored-by: Locharla, Sandeep <Sandeep.Locharla@netapp.com>
2026-04-13 08:38:15 -03:00
Abhisar Sinha 8eb162cb99 Updating pom.xml version numbers for release 4.20.4.0-SNAPSHOT 2026-04-13 15:48:18 +05:30
Daan Hoogland d6f4fc3ac4 Updating pom.xml version numbers for release 22.0.1 2026-04-13 11:53:00 +02:00
Abhishek Kumar 19b4ef1069 server: reserve backup, bucket resource limits during operations
Changes to check resource limits with reservations for the following
resource types:
- backup
- backup_storage
- bucket
- object_storage

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2026-04-13 11:21:41 +02:00
Fabricio Duarte 9f57a4dd19
Unhide setting `js.interpretation.enabled` (#12605)
* Unhide setting 'js.interpretation.enabled'

* Fix grammar mistake
2026-04-10 23:45:07 -03:00
João Jandre 7c7b2ae75d
Fix KVM incremental volume snapshot creation (#12666) 2026-04-11 00:12:44 +05:30
Manoj Kumar b196e97cc3
Prevent deletion of account and domain if either of them has deleted protected instance (#12901) 2026-04-10 15:51:22 +02:00
Abhisar Sinha df7ff97271
Create volume on a specified storage pool (#12966) 2026-04-10 14:27:39 +02:00
Wei Zhou 273699cf56
kvm: fix wrong CheckVirtualMachineAnswer when vm does not exist (#12928)
* kvm: fix wrong CheckVirtualMachineAnswer when vm does not exist

* kvm: add LibvirtCheckVirtualMachineCommandWrapperTest

Co-authored-by: dahn <daan.hoogland@gmail.com>
2026-04-10 16:01:29 +05:30
poddm 8f3c6fad7a
set snapcpg config on copy (#12955) 2026-04-10 15:18:45 +05:30
Bernardo De Marco Gonçalves 27e4d979f1
Clean up backup references to their schedules when the schedules are deleted (#12401)
* clean up backup schedule references after their deletion

* drop unused column

* address reviews
2026-04-10 14:51:52 +05:30
Vishesh 80ee7f183f
Fix six package incompatibility with EL10 (#12799) 2026-04-10 14:47:49 +05:30
Wei Zhou e297644ce1
KVM: Enable HA heartbeat on ShareMountPoint (#12773) 2026-04-10 14:12:40 +05:30
Suresh Kumar Anaparti 11538df710
Merge branch '4.22' 2026-04-10 12:02:40 +05:30
João Jandre 2a60305792
Fix snapshot chaining on Xen (#12597) 2026-04-10 11:05:26 +05:30