* Host HA code improvements
* Fix to not cancel VM HA items when Host HA is enabled & inspection in progress, and some code improvements
- When a Host HA inspection is in progress, the investigator returns the Host Status as Up, which cancels the VM HA items
- Don't cancel the VM HA items, instead reschedule them to try again later
* Consider the Recovered/Available Host HA state along with the agent connection status when determining whether a Host HA inspection is in progress, and some code improvements
* Refactoring Allocator classes
* Break random and firstfit allocators into smaller methods
* Added unit tests for random and firstfit allocators
* Move random allocator from cloud-plugins to cloud-server
* Add BaseAllocator abstract class for duplicate code
* Add missing license
* Add missing license to unit test file
* Remove host allocator random dependency
* Change exception message on smoke tests
* Remove conditional as it was never actually reached in the original flow
* Fix tests
* Fix flipped parameters
* Fix NPE while listing hosts for migration when suitableHosts is null
* Remove unnecessary stubbings
* Fix checkstyle
* Remove unnecessary file
* Rename exception error messages
* Apply suggestions from code review
Co-authored-by: Fabricio Duarte <fabricio.duarte.jr@gmail.com>
* Rename UserVmDetailVO references to VMInstanceDetailVO
* Remove unused imports
* Add new line at EOF
* Remove unnecessary random allocator pom
* Fix GPU allocation mistake
* Fix failing tests
---------
Co-authored-by: Fabricio Duarte <fabricio.duarte@scclouds.com.br>
Co-authored-by: Fabricio Duarte <fabricio.duarte.jr@gmail.com>
* Linstor: fix create volume from snapshot on primary storage
When creating a volume from a snapshot on Linstor primary storage
(with lin.backup.snapshots=false), the operation fails with:
"Only the following image types are currently supported: VHD, OVA,
QCOW2, RAW (for PowerFlex and FiberChannel)"
Root cause: the Linstor driver does not handle SNAPSHOT -> VOLUME in
its canCopy()/copyAsync() methods. This causes DataMotionServiceImpl
to fall through to StorageSystemDataMotionStrategy (selected because
Linstor advertises STORAGE_SYSTEM_SNAPSHOT=true). That strategy's
verifyFormatWithPoolType() rejects RAW format for Linstor pools,
since RAW is only allowed for PowerFlex and FiberChannel.
Additionally, VolumeOrchestrator.createVolumeFromSnapshot() attempts
to back up the snapshot to secondary storage when the storage plugin
does not advertise CAN_CREATE_TEMPLATE_FROM_SNAPSHOT. This backup
fails because the snapshot only exists on Linstor primary storage.
Fix:
- Add CAN_CREATE_TEMPLATE_FROM_SNAPSHOT capability so the
orchestrator skips the backup-to-secondary path
- Add canCopySnapshotToVolumeCond() to match SNAPSHOT -> VOLUME
when both are on the same Linstor primary store
- Wire it into canCopy() to intercept at DataMotionServiceImpl
before strategy selection, bypassing StorageSystemDataMotionStrategy
- Implement copySnapshotToVolume() which delegates to the existing
createResourceFromSnapshot() for native Linstor snapshot restore
This follows the same pattern used by the StorPool plugin, which
handles SNAPSHOT -> VOLUME directly in its driver rather than going
through StorageSystemDataMotionStrategy.
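The same-store SNAPSHOT -> VOLUME match described above can be sketched as
follows. This is a minimal Python illustration, not the Linstor driver's Java
code: the DataObject dataclass, its fields, and the role strings are
hypothetical stand-ins chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataObject:
    # Hypothetical stand-in for a CloudStack data object (assumption).
    object_type: str  # e.g. "SNAPSHOT" or "VOLUME"
    store_id: int     # id of the data store holding the object
    store_role: str   # e.g. "Primary" or "Image"


def can_copy_snapshot_to_volume(src: DataObject, dst: DataObject) -> bool:
    """Return True only for SNAPSHOT -> VOLUME on the same primary store."""
    return (
        src.object_type == "SNAPSHOT"
        and dst.object_type == "VOLUME"
        and src.store_role == "Primary"
        and dst.store_role == "Primary"
        and src.store_id == dst.store_id
    )
```

Any other combination (different stores, snapshot on secondary) returns False
and would be left to the regular strategy selection.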
Tested on CloudStack 4.22 with Linstor LVM_THIN storage, creating
a volume from a 1TB CNPG Postgres database snapshot. Volume creates
successfully with correct path and deletes cleanly.
* Let CloudRuntimeException propagate from copySnapshotToVolume
Remove try/catch in copySnapshotToVolume so that CloudRuntimeException
from createResourceFromSnapshot propagates to the caller, ensuring
CloudStack properly notices and reports the failure.
* Fix CAN_CREATE_TEMPLATE_FROM_SNAPSHOT breaking template creation
Setting CAN_CREATE_TEMPLATE_FROM_SNAPSHOT unconditionally to true
caused createTemplate from snapshot to take the StorPool-specific
code path in TemplateManagerImpl, which sends a CopyCommand to a
system VM that Linstor cannot handle.
Fix: make CAN_CREATE_TEMPLATE_FROM_SNAPSHOT conditional on the same
flag as STORAGE_SYSTEM_SNAPSHOT (!BackupSnapshots). When snapshots
are backed up to secondary (the default), the old template creation
flow works. When snapshots stay on primary, the direct path is used.
Also fix checkstyle: remove unused DataObject import in test.
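The conditional capability described above amounts to deriving both flags from
the same setting. A minimal Python sketch, assuming capabilities are reported
as a string-to-string map (the function name and the "true"/"false" encoding
are assumptions for illustration):

```python
def linstor_capabilities(backup_snapshots: bool) -> dict:
    """Derive both snapshot-related capabilities from one setting.

    backup_snapshots mirrors lin.backup.snapshots: True means snapshots are
    copied to secondary storage, False means they stay on primary.
    """
    snapshots_stay_on_primary = not backup_snapshots
    flag = str(snapshots_stay_on_primary).lower()
    return {
        "STORAGE_SYSTEM_SNAPSHOT": flag,
        # Tied to the same flag so template creation keeps the old flow
        # whenever snapshots are backed up to secondary storage.
        "CAN_CREATE_TEMPLATE_FROM_SNAPSHOT": flag,
    }
```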
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
AddPrimaryStorage previously pinned the protocol to FiberChannel whenever the operator picked the FlashArray provider, leaving NVMe-TCP backends only reachable by hand-crafting the URL with ?transport=nvme-tcp. Surface the choice in the form:
- protocols dropdown for FlashArray now offers FiberChannel and NVMeTCP (Primera stays FC-only).
- when NVMeTCP is selected, the submit handler appends transport=nvme-tcp to the FlashArray URL so the adaptive lifecycle pivot in pickPoolType() resolves StoragePoolType.NVMeTCP server-side.
- the generic Path field, already hidden for FiberChannel, is also hidden for NVMeTCP for parity.
Signed-off-by: Eugenio Grosso <eugenio.grosso@gmail.com>
* Fix bulk power state query missing VM lifecycle state field
The IdsPowerStateSelectSearch partial select did not include the VM
lifecycle state, causing isPowerStateInSyncWithInstanceState to always
return true when state was null. This prevented retry of failed
StopCommands on subsequent ping cycles.
* Add defensive check for instance host ID to prevent NPE
Co-authored-by: Sachin R Doddaguni <s_rudrappadoddagu@apple.com>
Co-authored-by: nvazquez <nicovazquez90@gmail.com>
Apply the review comments from the first round on #13061:
* FlashArrayAdapter.snapshot() and both getSnapshot() entry points now
wrap the returned FlashArrayVolume in withAddressType(). Without this,
snapshots taken against an NVMe-TCP pool had the constructor-default
AddressType.FIBERWWN and ProviderSnapshot.getAddress() emitted an FC
style WWN instead of the NVMe EUI-128, which the adaptive driver then
persisted as the snapshot path. Verified end-to-end against Purity 6.7.7:
a fresh NVMe-TCP snapshot now lands with install_path starting 006c...,
matching the source volume's EUI (previously it was 6-24a9370...).
* FlashArrayAdapter.attach() - retry path after 'Connection already
exists' no longer requires a hostgroup-scoped match for NVMe-TCP. If
hostgroup is not configured, or the existing connection is host-scoped,
fall back to matching by host name, same as the Fibre Channel branch.
Also normalize the 'volume lun is not found' message when no
connection list is returned.
* FlashArrayAdapter.attach() - initial 'Volume attach did not return lun
information' exception message now mentions both lun (FC) and nsid
(NVMe-TCP) so the error is not misleading on NVMe deployments.
* FlashArrayAdapter.getVolumeByAddress() - validate the EUI-128 length
before slicing. A short/malformed address used to throw
StringIndexOutOfBoundsException deep inside getFlashArrayItem and be
swallowed as 'not found'; now a clear RuntimeException is raised with
the expected vs actual length.
* FlashArrayVolume.getAddress() - same defensive check when building an
EUI-128 from the FlashArray volume serial; if the serial is shorter
than 24 hex chars, fail with a clear message instead of SIOOBE.
* MultipathNVMeOFAdapterBase.connectPhysicalDisk() - Integer.parseInt of
the STORAGE_POOL_DISK_WAIT detail is now guarded; a non-numeric value
falls back to the default rather than aborting the connect.
* MultipathNVMeOFAdapterBase.rescanAllControllers() - honour the boolean
return from Process.waitFor(). If an nvme ns-rescan invocation does
not complete in NS_RESCAN_TIMEOUT_SECS we destroyForcibly() it, so
hung nvme-cli processes do not accumulate while the namespace poll
loop retries.
* NVMeTCPAdapter - rename LOGGER_NVMETCP to LOGGER to match the naming
convention used in the other KVM adapters.
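The two guards in MultipathNVMeOFAdapterBase (numeric fallback and bounded
wait with forced kill) follow a common pattern. A minimal Python sketch of
that pattern, not the Java code itself; the default of 60 seconds and both
function names are assumptions made for the example:

```python
import subprocess

DEFAULT_DISK_WAIT_SECS = 60  # assumed default, not taken from the source


def parse_wait_secs(raw, default=DEFAULT_DISK_WAIT_SECS):
    """Parse a pool detail as an int, falling back instead of raising."""
    try:
        return int(raw)
    except (TypeError, ValueError):
        return default


def run_with_timeout(argv, timeout_secs):
    """Run a command; kill it if it outlives the timeout.

    Returns True when the process finished in time, False when it was killed.
    """
    proc = subprocess.Popen(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        proc.wait(timeout=timeout_secs)
        return True
    except subprocess.TimeoutExpired:
        proc.kill()   # analogous to destroyForcibly() in the Java code
        proc.wait()   # reap the child so it does not linger as a zombie
        return False
```

Killing and reaping the child on timeout is what keeps hung processes from
accumulating while an outer poll loop keeps retrying.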
Signed-off-by: Eugenio Grosso <eugenio.grosso@gmail.com>
The NVMe-oF KVM adapter refused every template copy request from the
adaptive storage orchestrator with UnsupportedOperationException, which
made it impossible to use an NVMe-TCP pool as primary storage for a VM
root disk: every deploy that landed a root volume on the pool failed
as soon as CloudStack tried to lay down the template.
Implement it the same way FiberChannel (SCSI) does: the storage provider
creates and connects a raw namespace ahead of time, then the adapter
resolves the host-side /dev/disk/by-id/nvme-eui.<NGUID> path via the
existing getPhysicalDisk plumbing (which will run nvme ns-rescan and wait
for the symlink if the kernel has not yet picked it up) and qemu-img
converts the source image into the raw block device.
User-space encrypted source or destination volumes are rejected: the
FlashArray already encrypts at rest and layering qemu-img LUKS on top
of a hostgroup-scoped namespace shared between hosts is not sensible.
Source encryption would also break on migration because the
passphrase does not travel.
With this change a CloudStack KVM VM can have its ROOT volume on an
NVMe-TCP pool (tested end-to-end on 4.23-SNAPSHOT against Purity 6.7.7:
template copy, first boot, live migrate with data disk, VM snapshot
with quiesce, and revert all work).
Signed-off-by: Eugenio Grosso <eugenio.grosso@gmail.com>
The adaptive storage framework hard-coded FiberChannel as the KVM-side
pool type for every provider it fronts. With a separate NVMeTCP pool
type now available (and a dedicated NVMe-oF adapter on the KVM side),
teach the lifecycle to route a pool to the right adapter based on a
transport= URL parameter:
https://user:pass@host/api?...&transport=nvme-tcp
-> StoragePoolType.NVMeTCP -> NVMeTCPAdapter on the KVM host
When the query parameter is absent the default stays FiberChannel, so
existing FC deployments on Primera or FlashArray continue to work
unchanged.
The choice is made in the shared AdaptiveDataStoreLifeCycleImpl rather
than inside each vendor plugin so every adaptive provider (FlashArray,
Primera, any future one) speaks the same configuration vocabulary.
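The routing rule above can be sketched in a few lines. This is a Python
illustration of the decision only, not the AdaptiveDataStoreLifeCycleImpl
code; the function name and the case-insensitive comparison are assumptions:

```python
from urllib.parse import parse_qs, urlsplit


def pick_pool_type(url: str) -> str:
    """Map the transport= query parameter of a pool URL to a pool type name."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs returns a list per key; take the first value if present.
    transport = query.get("transport", [""])[0].strip().lower()
    # Anything other than nvme-tcp (including absence) keeps the FC default.
    return "NVMeTCP" if transport == "nvme-tcp" else "FiberChannel"
```

For example, pick_pool_type("https://user:pass@host/api?pod=cs&transport=nvme-tcp")
yields "NVMeTCP", while the same URL without the parameter yields
"FiberChannel".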
Introduce an NVMe-over-Fabrics counterpart to the existing
MultipathSCSIAdapterBase / FiberChannelAdapter pair.
NVMe-oF is conceptually distinct from SCSI - it speaks the NVMe command
set, identifies namespaces by EUI-128 NGUIDs, and is multipathed by the
kernel natively rather than by device-mapper - so keeping it out of the
SCSI code path avoids special-casing inside every method that handles
volume paths, connect, disconnect, or size lookup.
MultipathNVMeOFAdapterBase (abstract)
* Parses volume paths of the form
type=NVMETCP; address=<eui>; connid.<host>=<nsid>; ...
into an AddressInfo whose path is
/dev/disk/by-id/nvme-eui.<eui>
which is the udev symlink the kernel emits for every NVMe namespace.
* connectPhysicalDisk polls the udev path and, on every iteration,
triggers nvme ns-rescan on all local NVMe controllers, to cover
target/firmware combinations that do not send an asynchronous event
notification when a new namespace is mapped.
* disconnectPhysicalDisk is a no-op; the kernel drops the namespace
when the target removes the host-group connection. The
ByPath variant only claims paths starting with
/dev/disk/by-id/nvme-eui. so foreign paths still fall through to
other adapters.
* Delegates getPhysicalDisk, isConnected, and getPhysicalDiskSize to
plain test -b / blockdev --getsize64 calls - no SCSI rescan, no dm
multipath, no multipath-map cleanup timer.
* createPhysicalDisk / createTemplateFromDisk / listPhysicalDisks /
copyPhysicalDisk all throw UnsupportedOperationException - these
are the responsibility of the storage provider, not the KVM
adapter, same as the SCSI base.
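The path parsing done by the base class can be illustrated with a short Python
sketch. It only mirrors the format described above; the function name, the
returned dictionary shape, and the error handling are assumptions rather than
the adapter's actual Java API:

```python
NVME_BY_ID_PREFIX = "/dev/disk/by-id/nvme-eui."


def parse_volume_path(volume_path: str) -> dict:
    """Parse 'type=NVMETCP; address=<eui>; connid.<host>=<nsid>; ...'."""
    fields = {}
    for token in volume_path.split(";"):
        token = token.strip()
        if "=" not in token:
            continue  # skip empty or malformed tokens
        key, value = token.split("=", 1)
        fields[key.strip()] = value.strip()

    if fields.get("type", "").upper() != "NVMETCP":
        raise ValueError("not an NVMe-TCP volume path: %r" % volume_path)
    address = fields.get("address")
    if not address:
        raise ValueError("missing address in volume path: %r" % volume_path)

    # connid.<host>=<nsid> tokens, keyed by host name.
    connections = {
        key[len("connid."):]: value
        for key, value in fields.items()
        if key.startswith("connid.")
    }
    return {
        "address": address,
        "path": NVME_BY_ID_PREFIX + address,
        "connections": connections,
    }
```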
MultipathNVMeOFPool
* KVMStoragePool mirror of MultipathSCSIPool. Defaults to
Storage.StoragePoolType.NVMeTCP in the parameterless-fallback
constructor.
NVMeTCPAdapter
* Concrete adapter that registers itself for
Storage.StoragePoolType.NVMeTCP via the reflection-based scan in
KVMStoragePoolManager. Carries no logic of its own beyond binding
the base to the pool type.
A similar MultipathNVMeOFAdapterBase-derived NVMeRoCEAdapter (or
NVMeFCAdapter) can later be added by adding one concrete subclass and a
new pool-type value; the base does not assume any particular
fabric-level transport.
NVMe-oF over TCP (NVMe-TCP) is conceptually a separate storage fabric
from Fibre Channel / iSCSI: it speaks the NVMe command set rather than
SCSI, identifies namespaces by EUI-128 NGUIDs rather than WWNs, and on
Linux is multipathed natively by the nvme driver rather than by
device-mapper multipath. Giving it its own StoragePoolType lets the
KVM agent dispatch the adaptive driver to a dedicated NVMe-oF adapter
(added in the next commit) without polluting the existing Fibre Channel
code path.
The new value is wired into the same format-routing and derivePath
fall-through paths that already special-case FiberChannel in
KVMStorageProcessor: NVMe-TCP volumes are also RAW and carry their
device path in DataObjectTO.path rather than in a managedStoreTarget
detail.
Teach FlashArrayAdapter to talk to a pool over NVMe over TCP instead of
Fibre Channel.
The transport is selected from a new transport= option on the storage
pool URL (or the equivalent storage_pool_details entry), e.g.
https://user:pass@fa:443/api?pod=cs&transport=nvme-tcp&hostgroup=cluster1
Defaults remain Fibre Channel / WWN addressing when transport is absent
or anything other than nvme-tcp, so existing FC pools are unaffected.
Beyond the transport parsing itself the adapter now:
* Tracks a per-pool volumeAddressType (AddressType.NVMETCP or
FIBERWWN) and stamps every volume it hands back to the framework
with it (withAddressType), so the adaptive driver path stores the
correct type=... field in the CloudStack volume path (used later
by the KVM driver to locate the device).
* Attaches pod-backed NVMe-TCP volumes at the host-group level
(POST /connections?host_group_names=...) instead of per-host, so
the array assigns a consistent NSID to every member host; falls
back to per-host attach for FC or when no hostgroup is configured.
* Tolerates a missing nsid in the FlashArray connections response
for NVMe-TCP - Purity does not return one for host-group NVMe
connections; the namespace is identified on the host by EUI-128
from FlashArrayVolume.getAddress(), so a placeholder value is
returned to the caller purely for informational tracking.
* Resolves NVMETCP addresses back to volumes in getVolumeByAddress
by reversing the EUI-128 layout (strip optional eui. prefix, drop
leading 00 and the embedded Pure OUI).
* Indexes NVMe connections in getConnectionIdMap by host name (the
array returns one entry per host inside a host-group connection),
so connid.<hostname> tokens in the path still match in
parseAndValidatePath on the KVM side.
Followed by a matching adaptive/KVM driver change (separate commit).
Preparatory data-model changes for NVMe-TCP support on the adaptive
storage framework. No behaviour change for existing Fibre Channel
users - the extra enum value, field, and getter/setter are only
exercised by callers that explicitly use them.
ProviderVolume.AddressType gains an NVMETCP value alongside FIBERWWN,
so adapters can declare that a volume is addressed by an NVMe EUI-128
(NGUID) rather than a SCSI WWN.
FlashArrayVolume.getAddress() produces the NGUID layout expected by
the Linux kernel for a FlashArray NVMe namespace:
00 + serial[0:14] + 24a937 (Pure 6-hex OUI) + serial[14:24]
which matches the /dev/disk/by-id/nvme-eui.<id> symlink emitted by
udev. Fibre Channel callers (addressType != NVMETCP) still get the
existing 6 + 24a9370 + serial form.
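The two address layouts, and the reverse mapping from an EUI-128 back to the
serial, can be sketched as below. This is a Python illustration of the layout
stated above, not FlashArrayVolume itself; the function names, the lower-casing
of the serial, and the exact error messages are assumptions:

```python
PURE_OUI = "24a937"  # Pure 6-hex OUI, as given in the layout above
SERIAL_LEN = 24
EUI_LEN = 32


def nvme_eui(serial: str) -> str:
    """00 + serial[0:14] + OUI + serial[14:24]."""
    s = serial.lower()
    if len(s) < SERIAL_LEN:
        raise ValueError(
            "serial too short: expected %d hex chars, got %d" % (SERIAL_LEN, len(s))
        )
    return "00" + s[0:14] + PURE_OUI + s[14:24]


def fc_wwn(serial: str) -> str:
    """6 + 24a9370 + serial (Fibre Channel form)."""
    return "6" + PURE_OUI + "0" + serial.lower()


def serial_from_eui(address: str) -> str:
    """Reverse nvme_eui(): strip optional 'eui.', leading 00 and the OUI."""
    a = address.lower()
    if a.startswith("eui."):
        a = a[len("eui."):]
    if len(a) != EUI_LEN:
        raise ValueError(
            "bad EUI-128 length: expected %d, got %d" % (EUI_LEN, len(a))
        )
    if a[16:22] != PURE_OUI:
        raise ValueError("EUI-128 does not embed the expected OUI")
    return a[2:16] + a[22:32]
```

Checking the length before slicing is what turns a malformed address into a
clear error rather than an index failure deep in the lookup.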
FlashArrayConnection gains an nsid field to carry the namespace id the
FlashArray REST API attaches to host-group-scoped NVMe connections,
when it is present.
* Move logs for values of the migration settings out of the loop
* Apply suggestions from code review
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
---------
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
Fixes an issue in NsxResource.executeRequest where Network.Service
comparison failed when DeleteNsxNatRuleCommand was executed in a
different process. Due to serialization/deserialization, the
deserialized Network.Service instance was not equal to the static
instances Network.Service.StaticNat and Network.Service.PortForwarding,
causing the comparison to always return false.
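The failure mode is the usual one for constant-like objects without value
equality: a round trip through serialization produces a new instance. The
Python sketch below reproduces the effect with a hypothetical Service class
and a JSON round trip; comparing by name is shown as one possible remedy and
is an assumption, since the exact fix is not spelled out here.

```python
import json


class Service:
    """Hypothetical stand-in: no __eq__, so == falls back to identity."""

    def __init__(self, name):
        self.name = name


STATIC_NAT = Service("StaticNat")
PORT_FORWARDING = Service("PortForwarding")


def round_trip(service):
    """Simulate sending the object to another process and back."""
    payload = json.dumps({"name": service.name})
    return Service(**json.loads(payload))


def is_nat_service_by_instance(service):
    # Breaks after deserialization: the copy is a different object.
    return service == STATIC_NAT or service == PORT_FORWARDING


def is_nat_service_by_name(service):
    # Survives deserialization: names are plain values.
    return service.name in (STATIC_NAT.name, PORT_FORWARDING.name)
```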
Co-authored-by: Andrey Volchkov <avolchkov@playtika.com>
(cherry picked from commit 30dd234b00)
* Fix static routes to be added to PBR tables in VPC routers
Static routes were only being added to the main routing table, but
policy-based routing (PBR) is active on VPC routers. This caused
traffic coming in from specific interfaces to not find the static
routes, as they use interface-specific routing tables (Table_ethX).
This fix:
- Adds a helper method to find which interface a gateway belongs to
by matching the gateway IP against configured interface subnets
- Modifies route add/delete operations to update both the main table
and the appropriate interface-specific PBR table
- Uses existing CsAddress databag metadata to avoid OS queries
- Handles both add and revoke operations for proper cleanup
- Adds comprehensive logging for troubleshooting
Fixes #12857
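The gateway-to-interface lookup by subnet membership can be sketched with the
standard ipaddress module. This is an illustration under assumptions, not the
router script itself: the interfaces mapping (device name to address/prefix)
is a simplified stand-in for the CsAddress databag, and tables_for_route is a
hypothetical helper.

```python
import ipaddress


def find_device_for_gateway(gateway, interfaces):
    """Return the device whose configured subnet contains the gateway IP.

    interfaces maps device name -> "address/prefix", e.g. "10.1.2.1/24".
    Returns None when no configured subnet contains the gateway.
    """
    gw = ipaddress.ip_address(gateway)
    for device, cidr in interfaces.items():
        if gw in ipaddress.ip_interface(cidr).network:
            return device
    return None


def tables_for_route(gateway, interfaces):
    """Routing tables a static route via this gateway should be written to."""
    device = find_device_for_gateway(gateway, interfaces)
    if device is None:
        return ["main"]
    return ["main", "Table_" + device]
```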
* Add iptables FORWARD rules for nexthop-based static routes
When static routes use nexthop (gateway) instead of referencing a
private gateway's public IP, the iptables FORWARD rules were not
being generated. This caused traffic to be dropped by ACLs.
This fix:
- Adds a shared helper CsHelper.find_device_for_gateway() to determine
which interface a gateway belongs to by checking subnet membership
- Updates CsStaticRoutes to use the shared helper instead of duplicating
the device-finding logic
- Modifies CsAddress firewall rule generation to handle both old-style
(ip_address-based) and new-style (nexthop-based) static routes
- Generates the required FORWARD and PREROUTING rules for nexthop routes:
* -A PREROUTING -s <network> ! -d <interface_ip>/32 -i <dev> -j ACL_OUTBOUND_<dev>
* -A FORWARD -d <network> -o <dev> -j ACL_INBOUND_<dev>
* -A FORWARD -d <network> -o <dev> -m state --state RELATED,ESTABLISHED -j ACCEPT
Fixes the second part of #12857
* network matching grep fix, don't let 1.2.3.4/32 match 11.2.3.4/32
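The mismatch comes from substring matching: 1.2.3.4/32 is a substring of
11.2.3.4/32. A minimal Python sketch of the difference, assuming rules are
whitespace-separated strings; the function names are hypothetical and the
actual fix may use a different pattern:

```python
def rule_mentions_network_loose(rule: str, network: str) -> bool:
    """Substring match: wrongly treats 1.2.3.4/32 as present in 11.2.3.4/32."""
    return network in rule


def rule_mentions_network_exact(rule: str, network: str) -> bool:
    """Token match: the network must appear as a whole field of the rule."""
    return network in rule.split()
```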
* initial attempt at network.loadbalancer.haproxy.idle.timeout implementation
* implement test cases
* move idleTimeout configuration test to its own test case