cloudstack

Commit Graph

Author	SHA1	Message	Date
Suresh Kumar Anaparti	4359198904	KVM Host HA improvements - Fix to not cancel VM HA items when Host HA inspection in progress, and some code improvements (#13088) * Host HA code improvements * Fix to not cancel VM HA items when Host HA is enabled & inspection in progress, and some code improvements - When Host HA inspection is in progress, the investigator returns the Host Status as Up, which cancels the VM HA items - Don't cancel the VM HA items, instead reschedule them to try again later * Changes to consider Recovered/Available Host HA state along with the agent connection status to determine whether Host HA inspection is in progress, and some code improvements	2026-05-08 19:50:50 +05:30
Wei Zhou	e297644ce1	KVM: Enable HA heartbeat on ShareMountPoint (#12773)	2026-04-10 14:12:40 +05:30
Abhishek Kumar	33a37da9ec	server: investigate pending HA work when executing in new MS session (#10167) For HA work items that are created for host state change, checks must be done when execution is called in a new management server session. A new column, reason, has been added in cloud.op_ha_work table to track the reason for HA work. When HighAvailabilityManager starts, it finds and puts all pending HA work items in Investigating state. During execution of the HA work, if it is found in Investigating state, checks are done to verify if the work is still valid. If the job is found to be invalid, it is cancelled. Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>	2025-01-28 14:39:31 +05:30
Suresh Kumar Anaparti	330ed25a6c	Support to enable/disable VM High Availability manager and related alerts (#10118) - Adds new config 'vm.ha.enabled' with Zone scope, to enable/disable VM High Availability manager. This is enabled by default (for backward compatibility). When enabled, the VM HA WorkItems (for VM Stop, Restart, Migration, Destroy) can be created and the scheduled items are executed. When disabled, new VM HA WorkItems are not allowed and the scheduled items are retried until max retries configured at 'vm.ha.migration.max.retries' (executed in case HA is re-enabled during retry attempts), and then purged after 'time.between.failures' by the cleanup thread that runs regularly at 'time.between.cleanup'. - Adds new config 'vm.ha.alerts.enabled' with Zone scope, to enable/disable alerts for the VM HA operations. This is enabled by default.	2024-12-26 17:45:32 +05:30
Abhishek Kumar	3e6900ac1a	api,server: purge expunged resources (#8999) This PR introduces the functionality of purging removed DB entries for CloudStack entities (currently only for VirtualMachine). There would be three mechanisms for purging removed resources: Background task - CloudStack will run a background task which runs at a defined interval. Other parameters for this task can be controlled with new global settings. API - New admin-only API purgeExpungedResources. It will allow passing the following parameters - resourcetype, batchsize, startdate, enddate. Currently, API is not supported in the UI. Config for service offering - Service offerings can be created with purgeresources parameter which would allow purging resources immediately on expunge. Following new global settings have been added: expunged.resources.purge.enabled: Default: false. Whether to run a background task to purge the expunged resources. expunged.resources.purge.resources: Default: (empty). A comma-separated list of resource types that will be considered by the background task to purge the expunged resources. Currently only VirtualMachine is supported. An empty value will result in considering all resource types for purging. expunged.resources.purge.interval: Default: 86400. Interval (in seconds) for the background task to purge the expunged resources. expunged.resources.purge.delay: Default: 300. Initial delay (in seconds) to start the background task to purge the expunged resources task. expunged.resources.purge.batch.size: Default: 50. Batch size to be used during expunged resources purging. expunged.resources.purge.start.time: Default: (empty). Start time to be used by the background task to purge the expunged resources. Use format yyyy-MM-dd or yyyy-MM-dd HH:mm:ss. expunged.resources.purge.keep.past.days: Default: 30. The number of days in the past from the execution time of the background task to purge the expunged resources for which the expunged resources must not be purged. To enable purging expunged resource till the execution of the background task, set the value to zero. expunged.resource.purge.job.delay: Default: 180. Delay (in seconds) to execute the purging of an expunged resource initiated by the configuration in the offering. Minimum value should be 180 seconds and if a lower value is set then the minimum value will be used. Documentation PR: apache/cloudstack-documentation#397 Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com> Co-authored-by: Wei Zhou <weizhou@apache.org> Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>	2024-06-20 11:34:44 +05:30
slavkap	2bb182c3e1	KVM Host HA enhancement for StorPool storage (#8045) Extending the current functionality of KVM Host HA for the StorPool storage plugin and the option for easy integration for the rest of the storage plugins to support Host HA. This extension works like the current NFS storage implementation. It allows it to be used simultaneously with NFS and StorPool storage or only with StorPool primary storage. If it is used with different primary storages like NFS and StorPool, and one of the health checks fails for storage, there is an option to report the failure to the management with the global config kvm.ha.fence.on.storage.heartbeat.failure. By default this option is disabled; when enabled, the Host HA service will continue with the checks on the host and eventually will fence the host	2023-11-04 12:35:37 +05:30
John Bampton	4eb110af73	Remove unneeded duplicate words (#7850)	2023-09-18 13:16:33 +02:00
Wei Zhou	e2183ed666	forceha: fix two issues when (1) stop vm from inside (2) force remove host (#4647) * forceha: fix vm is not started if it is powered off from inside steps to reproduce the issue (1) make sure force.ha is true in global setting. if not, change it to true, and restart mgt server (2) create a service offering, ha is not enabled (3) create a vm (4) log into the vm, and power off via cli. expected result: vm is started again by cloudstack actual result: vm is not started. * forceha: fix vms are still running if host is force-removed when host can be force removed, however vms are stopped in cloudstack, but not stopped on host ``` (localcloud) 🐱 > delete host id="a5625393-444d-4d0a-b31d-62baf88a8be1" forced=true { "success": true }``` after some minutes, vms are still running on host ``` root@mgt01:~# ssh node63 virsh list Id Name State --------------------------- 1 i-2-19-VM running 2 i-2-11-VM running ``` error messages are ``` Cannot transmit host 2 to Enabled state com.cloud.utils.fsm.NoTransitionException: No next resource state found for current state = Enabled event = DeleteHost at com.cloud.resource.ResourceManagerImpl.resourceStateTransitTo(ResourceManagerImpl.java:1216) at com.cloud.resource.ResourceManagerImpl$1.doInTransactionWithoutResult(ResourceManagerImpl.java:907) ``` * forceha: Make ForceHA dynamic	2021-05-14 23:14:39 +05:30
dahn	4780a27255	Add missing HA config keys (#3776) (#3814) * Add missing HA config keys (#3776) * merge conflict-bugs fixed Co-authored-by: mdominka <50666672+mdominka@users.noreply.github.com>	2020-01-15 12:24:05 +01:00
Anurag Awasthi	4b43c2684f	Better tracking host maintenance and handling of migration jobs (#3425) * Service layer changes for new way of tracking maintenance progress * Fixes after offline code review * Fix marvin tests * Change state name and add documentation * Fix test * Fix and add more unit tests for different cases * Fix and enhance Marvin Tests * Fixes for corner cases * More fixes and logging * UI fixes * Some minor changes and reducing VMs on host for more contained tests * Fixed ssh client auth problem causing test failure * Code review changes + fixes + some more logging * Fix flaky tests by adding delays between host states * Added fetching only enabled hosts for tests * Make port blocking KVM specific and refactor to handle failure * Make failing migrations due to tagged host instead of port blocking * Added additional check for migrating VMs * Refactor to use single place for methods checking maintenance states	2019-12-19 16:36:20 +01:00
Sven Vogel	cf6e616d5b	Revert "Add missing HA config keys (#3737)" (#3774) This reverts commit `16527f1eb0`.	2019-12-18 14:54:27 +01:00
mdominka	16527f1eb0	Add missing HA config keys (#3737) * Add missing HA config keys * Change time value to seconds * Change Integer to Long * Using ConfigKey defaultValue * Do some code refactoring * Simplify code	2019-12-17 15:24:53 +01:00
Marc-Aurèle Brothier	893a88d225	CLOUDSTACK-10105: Use maven standard project structure in all projects (#2283) Remove maven standard module (which only a few were using) and get rid of maven customization for the projects structure. - moved all directories to src/main/java, src/main/resources, src/main/scripts, src/test/java, src/test/resources - grep scan to search for src/com and src/org left over - grep for <project>/scripts to fix pom.xml configuration - remove custom <build> configuration in pom.xml Signed-off-by: Marc-Aurèle Brothier <m@brothier.org>	2018-01-20 03:19:27 +05:30

13 Commits