Companion to the NIC-level URI override -- when an extension declares
vif.binding=lswitch, the Network row itself should advertise
``broadcast_domain_type=Lswitch`` and ``broadcast_uri=ovn://cs-net-<id>``
so listNetworks / details views are consistent with what the OVN
control plane represents.
Without this hook the GuestNetworkGuru still allocated a VLAN at
design time, which leaks back into the UI:
VLAN/VNI: 138
Broadcast URI: vlan://138
Apply the override on a successful ``implement-network`` script return
in NetworkExtensionElement.implement(). The VLAN that the guru
allocated stays as a ghost in op_dc_vnet_alloc -- it is never used on
the wire because the VIF attaches to br-int and traffic flows through
OVN's logical pipeline over geneve. Releasing the VLAN back to the
pool would require intercepting the design phase and is out of scope
for this hook.
Verified end-to-end: i-2-27-VM on network 216 now lists
networks.broadcast_uri = ovn://cs-net-216
networks.broadcast_domain_type = Lswitch
nics.broadcast_uri = ovn://cs-net-216
nics.isolation_uri = ovn://cs-net-216
The OVN NB LSP / OVS iface-id / OVN SB Port_Binding remain bound, as
before.
Delta 1 already overrides nic.setBroadcastType(Lswitch) on the
NicProfile during prepare() so the KVM agent picks the OVS Lswitch
path. But the underlying nics row still carried the cosmetic
``vlan://<id>`` URI allocated by GuestNetworkGuru at design-time, which
is misleading on listNics / DB queries: a NIC sitting on an OVN
logical switch should not advertise a VLAN URI.
Override broadcast_uri and isolation_uri on the NicProfile to
``ovn://cs-net-<networkId>`` (the convention used by the legacy
ovn-plugin) and persist the same on the nics row via nicDao.update.
The VLAN that the guru allocated stays as a ghost in
op_dc_vnet_alloc -- it is never used on the wire because the VIF
attaches to br-int and traffic flows through OVN's logical pipeline
over geneve. Releasing the VLAN back to the pool would require
intercepting the design phase, which is out of scope for this hook.
Verified end-to-end: i-2-24-VM on network 214 now lists
broadcast_uri = ovn://cs-net-214
isolation_uri = ovn://cs-net-214
and the OVN NB LSP / OVS iface-id / OVN SB Port_Binding remain
correctly bound to the chassis, as before.
CloudStack's existing OvsVifDriver already binds NICs correctly when the
NicProfile's BroadcastDomainType is Lswitch: it emits libvirt
<virtualport type='openvswitch' interfaceid='<nic.getUuid()>'/> and
libvirt sets external_ids:iface-id atomically with tap creation. No
agent patch is required for OVS-backed extensions to consume this path
-- they just need (a) a way to opt in, and (b) nic.getUuid() carried in
per-NIC script commands so the SDN-side port identifier can match.
Add the framework hooks to enable this without any KVM agent change:
* ExtensionHelper.VIF_BINDING_DETAIL_KEY ("vif.binding") -- new
top-level extension detail. Currently supported value: "lswitch".
* NetworkExtensionElement.prepare(...) -- when the extension owning the
NIC's network declares vif.binding=lswitch in its extension_details,
override nic.setBroadcastType(Networks.BroadcastDomainType.Lswitch).
OvsVifDriver on the KVM agent then picks the existing Lswitch path
unchanged. Without the opt-in, the previous default (typically Vlan)
is preserved -- existing reference extensions like network-namespace
are unaffected.
* NetworkExtensionElement.getNicUuidArgs(network, nic) -- helper that
returns ["--nic-uuid", "<uuid>"] only when vif.binding=lswitch is
declared. Wired into add-dhcp-entry, remove-dhcp-entry,
add-dns-entry, save-vm-data, save-password, save-userdata,
save-sshkey, and save-hypervisor-hostname. Extensions that do not
declare the hint never see --nic-uuid, so backwards-compatible.
* README -- new section "VIF Binding for OVS-backed Extensions"
documenting the contract end-to-end: cmk createExtension snippet,
what prepare() does, how --nic-uuid flows, why the extension never
writes iface-id remotely on the boot path. Also notes the new
argument in the add-dhcp-entry table.
Result: an OVN extension (or any future OVS-backed extension) gets
correct VIF binding by adding a single detail key at extension creation
time. No host-side agent patch, no libvirt patch, no OVS schema
change.
* Host HA code improvements
* Fix to not cancel VM HA items when Host HA is enabled & inspection in progress, and some code improvements
- When Host HA inspection in progress, the investigor returns the Host Status as Up which cancels the VM HA items
- Don't cancel the VM HA items, instead reschedule them to try again later
* Changes to consider Recovered/Available Host HA state along with the agent connection status to determine the Host HA inspection in progress or not, and some code improvements
* Refactoring Allocator classes
* Break into smaller methods random and firfit allocators.
* Added unit tests for random and firstfit allocators
* Move random allocator from cloud-plugins to cloud-server
* Add BaseAllocator abstract class for duplicate code
* Add missing license
* Add missing license to unit test file
* Remove host allocator random dependency
* Change exception message on smoke tests
* Remove conditional as it was never actually reached in the original flow
* Fix tests
* Fix flipped parameters
* Fix NPE while listing hosts for migration when suitableHosts is null
* Remove unnecessary stubbings
* Fix checkstyle
* Remove unnecessary file
* Rename exception error messages
* Apply suggestions from code review
Co-authored-by: Fabricio Duarte <fabricio.duarte.jr@gmail.com>
* Rename UserVmDetailVO references to VMInstanceDetailVO
* Remove unused imports
* Add new line at EOF
* Remove unnecessary random allocator pom
* Fix GPU allocation mistake
* Fix failing tests
---------
Co-authored-by: Fabricio Duarte <fabricio.duarte@scclouds.com.br>
Co-authored-by: Fabricio Duarte <fabricio.duarte.jr@gmail.com>
* Linstor: fix create volume from snapshot on primary storage
When creating a volume from a snapshot on Linstor primary storage
(with lin.backup.snapshots=false), the operation fails with:
"Only the following image types are currently supported: VHD, OVA,
QCOW2, RAW (for PowerFlex and FiberChannel)"
Root cause: the Linstor driver does not handle SNAPSHOT -> VOLUME in
its canCopy()/copyAsync() methods. This causes DataMotionServiceImpl
to fall through to StorageSystemDataMotionStrategy (selected because
Linstor advertises STORAGE_SYSTEM_SNAPSHOT=true). That strategy's
verifyFormatWithPoolType() rejects RAW format for Linstor pools,
since RAW is only allowed for PowerFlex and FiberChannel.
Additionally, VolumeOrchestrator.createVolumeFromSnapshot() attempts
to back up the snapshot to secondary storage when the storage plugin
does not advertise CAN_CREATE_TEMPLATE_FROM_SNAPSHOT. This backup
fails because the snapshot only exists on Linstor primary storage.
Fix:
- Add CAN_CREATE_TEMPLATE_FROM_SNAPSHOT capability so the
orchestrator skips the backup-to-secondary path
- Add canCopySnapshotToVolumeCond() to match SNAPSHOT -> VOLUME
when both are on the same Linstor primary store
- Wire it into canCopy() to intercept at DataMotionServiceImpl
before strategy selection, bypassing StorageSystemDataMotionStrategy
- Implement copySnapshotToVolume() which delegates to the existing
createResourceFromSnapshot() for native Linstor snapshot restore
This follows the same pattern used by the StorPool plugin, which
handles SNAPSHOT -> VOLUME directly in its driver rather than going
through StorageSystemDataMotionStrategy.
Tested on CloudStack 4.22 with Linstor LVM_THIN storage, creating
a volume from a 1TB CNPG Postgres database snapshot. Volume creates
successfully with correct path and deletes cleanly.
* Let CloudRuntimeException propagate from copySnapshotToVolume
Remove try/catch in copySnapshotToVolume so that CloudRuntimeException
from createResourceFromSnapshot propagates to the caller, ensuring
CloudStack properly notices and reports the failure.
* Fix CAN_CREATE_TEMPLATE_FROM_SNAPSHOT breaking template creation
Setting CAN_CREATE_TEMPLATE_FROM_SNAPSHOT unconditionally to true
caused createTemplate from snapshot to take the StorPool-specific
code path in TemplateManagerImpl, which sends a CopyCommand to a
system VM that Linstor cannot handle.
Fix: make CAN_CREATE_TEMPLATE_FROM_SNAPSHOT conditional on the same
flag as STORAGE_SYSTEM_SNAPSHOT (!BackupSnapshots). When snapshots
are backed up to secondary (the default), the old template creation
flow works. When snapshots stay on primary, the direct path is used.
Also fix checkstyle: remove unused DataObject import in test.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Fix bulk power state query missing VM lifecycle state field
The IdsPowerStateSelectSearch partial select did not include the VM
lifecycle state, causing isPowerStateInSyncWithInstanceState to always
return true when state was null. This prevented retry of failed
StopCommands on subsequent ping cycles.
* Add defensive check for instance host ID to prevent NPE
Co-authored-by: Sachin R Doddaguni <s_rudrappadoddagu@apple.com>
Co-authored-by: nvazquez <nicovazquez90@gmail.com>