We didn't account for caching the volume stats per used Linstor cluster,
so the first Linstor cluster that was asked prevented caching
for all the others and null was returned for them.
Now we have an invalidation counter for each Linstor cluster and
also store the cache result with the Linstor cluster address prefixed.
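A minimal sketch of the idea, with hypothetical names rather than the actual driver code: cache entries are prefixed with the controller address and each cluster gets its own invalidation counter, so invalidating one cluster no longer affects the others.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical per-cluster stats cache, not the actual driver code.
class ClusterStatsCache {
    // cache key: "<controllerAddress>:<generation>:<statKey>"
    private final Map<String, Long> cache = new ConcurrentHashMap<>();
    // one invalidation counter (generation) per Linstor controller address
    private final Map<String, AtomicLong> generations = new ConcurrentHashMap<>();

    private long generation(String controller) {
        return generations.computeIfAbsent(controller, c -> new AtomicLong()).get();
    }

    Long get(String controller, String statKey) {
        return cache.get(controller + ":" + generation(controller) + ":" + statKey);
    }

    void put(String controller, String statKey, long value) {
        cache.put(controller + ":" + generation(controller) + ":" + statKey, value);
    }

    // Invalidate only this cluster's entries; other clusters keep their cached
    // stats, and entries of older generations are simply never read again.
    void invalidate(String controller) {
        generations.computeIfAbsent(controller, c -> new AtomicLong()).incrementAndGet();
    }
}
```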
Somehow deleteDatastore was never implemented, which meant that
templates were not cleaned up on datastore delete and
agents were never informed about storage pool removal.
If a -rst resource wasn't deleted because of a failed copy,
a recurring snapshot attempt couldn't be made, because the
"old" -rst resource was still there. To prevent this we now always
try to remove the -rst resource beforehand; if it doesn't exist, that is a no-op.
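A sketch of that cleanup, assuming a hypothetical client interface (the real driver talks to Linstor via java-linstor):

```java
// Hypothetical client interface for illustration; the real driver uses java-linstor.
interface LinstorClient {
    boolean resourceDefinitionExists(String rscName);
    void deleteResourceDefinition(String rscName);
}

class SnapshotRestoreHelper {
    private final LinstorClient client;

    SnapshotRestoreHelper(LinstorClient client) {
        this.client = client;
    }

    // Always try to drop a leftover "<volume>-rst" resource before a new attempt;
    // if it does not exist this is a no-op, so a previously failed copy can no
    // longer block recurring snapshot attempts.
    void cleanupRestoreResource(String volumeName) {
        String rstName = volumeName + "-rst";
        if (client.resourceDefinitionExists(rstName)) {
            client.deleteResourceDefinition(rstName);
        }
    }
}
```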
Linstor in particular can use this information to allow dual volume access
only for live migration and not enable it in general,
since general enablement can and will lead to data corruption if for some reason
2 VMs get started on 2 different hosts.
If a node doesn't have a DRBD connection to another node,
additionally ask the Linstor controller whether that node is alive.
Otherwise we would simply have answered no, even though the node might still be alive.
This is always the case in a non-hyperconverged setup.
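Roughly like the following sketch, with assumed helper interfaces for the DRBD view and the controller query:

```java
// Assumed helper interfaces; the plugin queries DRBD state and the Linstor
// controller in its own way.
class NodeAliveChecker {
    interface DrbdView { boolean hasConnectionTo(String node); }
    interface Controller { boolean isNodeOnline(String node); }

    private final DrbdView drbd;
    private final Controller controller;

    NodeAliveChecker(DrbdView drbd, Controller controller) {
        this.drbd = drbd;
        this.controller = controller;
    }

    // Without a DRBD connection (always the case between diskless nodes in a
    // non-hyperconverged setup) we would otherwise report the peer as dead,
    // so additionally ask the Linstor controller.
    boolean isPeerAlive(String peerNode) {
        if (drbd.hasConnectionTo(peerNode)) {
            return true;
        }
        return controller.isNodeOnline(peerNode);
    }
}
```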
If a secondary storage pool was used by e.g.
2 concurrent snapshot->template actions,
the first action to finish removed the netfs mount
point out from under the other action.
Now storage pool usage is ref-counted and the pool is only
deleted once there are no more users.
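A minimal sketch of the ref-counting idea (illustrative names, not the actual KVM storage code):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative ref-counted pool registry; names do not match the real KVM code.
class SecondaryPoolRegistry {
    private final Map<String, AtomicInteger> users = new ConcurrentHashMap<>();

    // Called when an action (e.g. a snapshot->template copy) starts using the pool.
    void acquire(String poolUuid) {
        users.computeIfAbsent(poolUuid, u -> new AtomicInteger()).incrementAndGet();
    }

    // Called when an action finishes; only the last user removes the netfs mount point.
    void release(String poolUuid, Runnable unmount) {
        AtomicInteger count = users.get(poolUuid);
        if (count != null && count.decrementAndGet() <= 0) {
            users.remove(poolUuid);
            unmount.run();
        }
    }
}
```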
In non-hyperconverged setups, diskless nodes don't have a connection
to each other, so setting properties on that connection had no effect.
Now it is checked whether a connection exists
between the live migration nodes, and if not,
allow-two-primaries is set on the resource-definition level.
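A sketch of that decision, with assumed helper methods; the property key shown is the usual DRBD net option path:

```java
// Helper method names are assumptions; the option set is allow-two-primaries.
class TwoPrimariesHelper {
    interface Linstor {
        boolean resourceConnectionExists(String rsc, String nodeA, String nodeB);
        void setOnResourceConnection(String rsc, String nodeA, String nodeB, String key, String value);
        void setOnResourceDefinition(String rsc, String key, String value);
    }

    private static final String ALLOW_TWO_PRIMARIES = "DrbdOptions/Net/allow-two-primaries";

    private final Linstor linstor;

    TwoPrimariesHelper(Linstor linstor) {
        this.linstor = linstor;
    }

    // Prefer the resource-connection between the two migration nodes; diskless
    // nodes in a non-hyperconverged setup have no such connection, so fall back
    // to the resource-definition level.
    void allowTwoPrimaries(String rsc, String srcNode, String dstNode) {
        if (linstor.resourceConnectionExists(rsc, srcNode, dstNode)) {
            linstor.setOnResourceConnection(rsc, srcNode, dstNode, ALLOW_TWO_PRIMARIES, "yes");
        } else {
            linstor.setOnResourceDefinition(rsc, ALLOW_TWO_PRIMARIES, "yes");
        }
    }
}
```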
client.setBasePath() overwrote the Linstor controller IP/host
for all current users of the client. This is essentially a race condition
that triggered as soon as 2 different primary storages
were configured with different Linstor controllers.
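A sketch of the fix direction, with stand-in types: give every primary storage its own client bound to its own controller instead of mutating one shared client's base path.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: one client per Linstor controller instead of one shared,
// mutable client whose base path gets overwritten by concurrent users.
class LinstorClientFactory {
    // Minimal stand-in for the generated API client.
    static class ApiClient {
        private final String basePath;
        ApiClient(String basePath) { this.basePath = basePath; }
        String getBasePath() { return basePath; }
    }

    private final Map<String, ApiClient> clients = new ConcurrentHashMap<>();

    // Each primary storage gets (and keeps) the client for its own controller.
    ApiClient forController(String controllerUrl) {
        return clients.computeIfAbsent(controllerUrl, ApiClient::new);
    }
}
```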
* New feature: Change storage pool scope
* Added checks for Ceph/RBD
* Update op_host_capacity table on primary storage scope change
* Storage pool scope change integration test
* Pull 8875: Addressed review comments
* Pull 8875: remove storage checks, AbstractPrimaryStorageLifeCycleImpl class
* Pull 8875: Fixed integration test failure
* Pull 8875: Review comments
* Pull 8875: review comments + broke changeStoragePoolScope into smaller functions
* Added UT for changeStoragePoolScope
* Rename AbstractPrimaryDataStoreLifeCycleImpl to BasePrimaryDataStoreLifeCycleImpl
* Pull 8875: Dao review comments
* Pull 8875: Rename changeStoragePoolScope.vue to ChangeStoragePoolScope.vue
* Pull 8875: Created a new smoke test file + A single warning msg in the UI
* Pull 8875: Added cleanup in test_primary_storage_scope.py
* Pull 8875: Typo in en.json
* Pull 8875: cleanup array in test_primary_storage_scope.py
* Pull 8875: Removing extra whitespace at EOF of StorageManagerImplTest
* Pull 8875: Added UT for PrimaryDataStoreHelper and BasePrimaryDataStoreLifeCycleImpl
* Pull 8875: Added license header
* Pull 8875: Fixed sql query for vmstates
* Pull 8875: Changed icon plus info on disabled mode in apidoc
* Pull 8875: Change scope should not work for local storage
* Pull 8875: Change scope completion event
* Pull 8875: Added api findAffectedVmsForStorageScopeChange
* Pull 8875: Added UT for findAffectedVmsForStorageScopeChange and removed listByPoolIdVMStatesNotInCluster
* Pull 8875: Review comments + Vm name in response
* Pull 8875: listByVmsNotInClusterUsingPool was returning duplicate VM entries because of multiple volumes in the VM satisfying the criteria
* Pull 8875: fixed listAffectedVmsForStorageScopeChange UT
* listAffectedVmsForStorageScopeChange should work if the pool is not disabled
* Fix listAffectedVmsForStorageScopeChangeTest UT
* Pull 8875: add volume.removed not null check in VmsNotInClusterUsingPool query
* Pull 8875: minor refactoring in changeStoragePoolScopeToCluster
* Update server/src/main/java/com/cloud/storage/StorageManagerImpl.java
* fix eof
* changeStoragePoolScopeToZone should connect pool to all Up hosts
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
* linstor: update to java-linstor 0.5.1
* linstor: Support VM-Instance Disk snapshots
This adds VM-Instance disk snapshot support for
Linstor primary storage. Instance snapshots are stored on
the used Linstor storage pool backend and can be converted
into regular volume snapshots and also reverted.
VM instance snapshots are not fully atomic, but with the
create-multi-snapshot feature they are as close to it as it gets:
the snapshots of all volumes are taken in the same devicemanager run.
disconnectPhysicalDisk(String, KVMStoragePool) seems to call the plugin
with the resource name instead of the device path, so we also have
to search for resource names when cleaning up.
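Something along these lines, with assumed lookup helpers: accept either a device path or a resource name and resolve it before tearing anything down.

```java
// Assumed lookup helpers: accept either a device path or a Linstor resource name.
class DisconnectHelper {
    interface Lookup {
        boolean isKnownDevicePath(String path);
        String devicePathForResource(String rscName); // null if no such resource
    }

    private final Lookup lookup;

    DisconnectHelper(Lookup lookup) {
        this.lookup = lookup;
    }

    // disconnectPhysicalDisk() may hand us a resource name instead of a device
    // path, so try both interpretations before cleaning up.
    String resolveDevicePath(String pathOrResourceName) {
        if (lookup.isKnownDevicePath(pathOrResourceName)) {
            return pathOrResourceName;
        }
        return lookup.devicePathForResource(pathOrResourceName);
    }
}
```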
For live migration we need the allow-two-primaries option,
but we don't know for sure whether we are called for a migration operation.
Now we also check whether any of the resources is already in use somewhere and
only then set the option.
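As a sketch, with hypothetical method names:

```java
// Hypothetical method names; only enable allow-two-primaries when the resource
// is already in use somewhere, which hints at a live migration rather than a
// plain first attach.
class MigrationOptionHelper {
    interface Linstor {
        boolean isResourceInUseAnywhere(String rscName);
        void setAllowTwoPrimaries(String rscName);
    }

    private final Linstor linstor;

    MigrationOptionHelper(Linstor linstor) {
        this.linstor = linstor;
    }

    void prepareForPossibleMigration(String rscName) {
        if (linstor.isResourceInUseAnywhere(rscName)) {
            linstor.setAllowTwoPrimaries(rscName);
        }
    }
}
```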
* linstor: Outline get storagepools from resourcegroup into function
* linstor: move getHostname() to kvm/Pool and reimplement
* linstor: implement CloudStack HA support
* linstor: Add util method getBestErrorMessage from main
* linstor: failed removal of allow-two-primaries is not a fatal error
* linstor: Fix failure if a Linstor node is down while migrating
If a Linstor node is down while a resource is being migrated, setting
allow-two-primaries will fail because we can't reach the downed node. The
property is still set on the other nodes, though, and the migration should work.
We now just report an error instead of failing completely.
If the storage nodes cannot be accessed, we now create a temporary resource from the snapshot and copy that data to the secondary storage. Revert works the same way, except that we now additionally look for any Linstor agent node.
This also enables backup snapshots by default now.
This whole BackupSnapshot functionality was introduced in 4.19,
so I would be happy if this could still be merged.
This PR adds a config option for the Linstor primary storage driver that allows you to automatically back up
volume snapshots to the secondary storage.
Additionally, it no longer mangles the needed java-linstor dependency into client.jar, but instead just copies
java-linstor.jar into lib.
The config option is called lin.backup.snapshots and defaults to false.
The scope of this change should be limited, as it only touches the Linstor driver, and part of copyAsync
was implemented with 2 new Linstor-specific commands.
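How the driver could consult such a flag, as a sketch only (the real option is registered through CloudStack's configuration framework and read per primary storage):

```java
// Illustrative flow only; the real option is wired through CloudStack's
// configuration framework.
class SnapshotBackupPolicy {
    interface Config { boolean getBoolean(String key, boolean defaultValue); }

    static final String BACKUP_SNAPSHOTS_KEY = "lin.backup.snapshots";

    private final Config config;

    SnapshotBackupPolicy(Config config) {
        this.config = config;
    }

    // Defaults to false: volume snapshots stay on the Linstor backend unless
    // the operator opts into copying them to secondary storage.
    boolean shouldBackupToSecondary() {
        return config.getBoolean(BACKUP_SNAPSHOTS_KEY, false);
    }
}
```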
This extends the current KVM Host HA functionality to the StorPool storage plugin and offers an easy integration path for the remaining storage plugins to support Host HA.
The extension works like the current NFS storage implementation. It can be used simultaneously with NFS and StorPool storage, or with StorPool primary storage only.
If different primary storages such as NFS and StorPool are used and one of the storage health checks fails, the failure can be reported to the management server via the global config kvm.ha.fence.on.storage.heartbeat.failure. By default this option is disabled; when enabled, the Host HA service will continue with the checks on the host and eventually fence it.
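A sketch of the decision flow, with illustrative names; the actual checks live in the KVM HA provider:

```java
import java.util.List;

// Illustrative aggregation of per-storage heartbeat results for KVM Host HA.
class StorageHeartbeatAggregator {
    interface HeartbeatCheck { boolean isHealthy(); }

    // mirrors the global setting kvm.ha.fence.on.storage.heartbeat.failure
    private final boolean fenceOnStorageHeartbeatFailure;

    StorageHeartbeatAggregator(boolean fenceOnStorageHeartbeatFailure) {
        this.fenceOnStorageHeartbeatFailure = fenceOnStorageHeartbeatFailure;
    }

    // With the flag disabled (the default) a single failing storage is not
    // escalated; with it enabled the failure is reported and the remaining
    // host checks decide whether the host eventually gets fenced.
    boolean reportFailure(List<HeartbeatCheck> checks) {
        boolean anyFailed = checks.stream().anyMatch(c -> !c.isHealthy());
        return anyFailed && fenceOnStorageHeartbeatFailure;
    }
}
```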
Making a diskful resource was meant as an optimization,
but it cannot work in non-hyperconverged setups,
as the (diskful) storage nodes are not part of the CloudStack cluster.
A TODO was overlooked and never implemented,
which could trigger the following bug:
if Linstor didn't create a resource (diskless or diskful) on
the node chosen by CloudStack, the template data could not be
copied there; it even seems that no error was
triggered and the new template file silently ended up
empty/corrupt.
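A sketch of the missing check, with assumed helper methods: make sure the chosen node actually has a usable resource before copying template data, and fail loudly if it still does not.

```java
// Illustrative guard; the actual fix lives in the Linstor template copy path.
class TemplateCopyGuard {
    interface Linstor {
        boolean resourceExistsOnNode(String rscName, String node);
        void createDisklessResource(String rscName, String node);
    }

    private final Linstor linstor;

    TemplateCopyGuard(Linstor linstor) {
        this.linstor = linstor;
    }

    // Ensure the chosen node can actually see the volume before copying
    // template data, instead of silently producing an empty/corrupt template.
    void ensureResourceOnNode(String rscName, String node) {
        if (!linstor.resourceExistsOnNode(rscName, node)) {
            linstor.createDisklessResource(rscName, node);
            if (!linstor.resourceExistsOnNode(rscName, node)) {
                throw new IllegalStateException(
                    "Linstor resource " + rscName + " not available on node " + node);
            }
        }
    }
}
```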