This PR aligns the use of terminology, renaming VM / virtual machine references to 'Instance' and also capitalising the terms Templates, Network, Snapshot, User, Account in CloudStack APIs, error and log messages, events, tooltips, etc. Many typos, grammar and spelling mistakes were fixed, also terms like IPv4, VPN, VPC, etc. were properly capitalised. Some error messages were cleaned for better readability. The test cases, expecting some exception strings were adjusted accordingly.
Here is the wiki page, describing the changes in details:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Object+Naming+and+Title+Case+Convention
---------
Co-authored-by: Manoj Kumar <manojkr.itbhu@gmail.com>
Co-authored-by: Harikrishna <harikrishna.patnala@gmail.com>
For HA work items that are created for host state change, checks must be
done when execution is called in a new management server session.
A new column, reason, has been added in cloud.op_ha_work table to track
the reason for HA work.
When HighAvailabilityManager starts it finds and puts all pending HA
work items in Investigating state. During execution of the HA work if it
is found in investigating state, checks are done to verify if the work
is still valid. If the jobs is found to be invalid it is cancelled.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* Improve logging to include more identifiable information for kvm plugin
* Update logging for scaleio plugin
* Improve logging to include more identifiable information for default volume storage plugin
* Improve logging to include more identifiable information for agent managers
* Improve logging to include more identifiable information for Listeners
* Replace ids with objects or uuids
* Improve logging to include more identifiable information for engine
* Improve logging to include more identifiable information for server
* Fixups in engine
* Improve logging to include more identifiable information for plugins
* Improve logging to include more identifiable information for Cmd classes
* Fix toString method for StorageFilterTO.java
- Adds new config 'vm.ha.enabled' with Zone scope, to enable/disable VM High Availability manager. This is enable by default (for backward compatibilty).
When enabled, the VM HA WorkItems (for VM Stop, Restart, Migration, Destroy) can be created and the scheduled items are executed.
When disabled, new VM HA WorkItems are not allowed and the scheduled items are retried until max retries configured at 'vm.ha.migration.max.retries' (executed in case HA is re-enabled during retry attempts), and then purged after 'time.between.failures' by the cleanup thread that runs regularly at 'time.between.cleanup'.
- Adds new config 'vm.ha.alerts.enabled' with Zone scope, to enable/disable alerts for the VM HA operations. This is enabled by default.
This PR introduces the functionality of purging removed DB entries for CloudStack entities (currently only for VirtualMachine). There would be three mechanisms for purging removed resources:
Background task - CloudStack will run a background task which runs at a defined interval. Other parameters for this task can be controlled with new global settings.
API - New admin-only API purgeExpungedResources. It will allow passing the following parameters - resourcetype, batchsize, startdate, enddate. Currently, API is not supported in the UI.
Config for service offering. Service offerings can be created with purgeresources parameter which would allow purging resources immediately on expunge.
Following new global settings have been added:
expunged.resources.purge.enabled: Default: false. Whether to run a background task to purge the expunged resources
expunged.resources.purge.resources: Default: (empty). A comma-separated list of resource types that will be considered by the background task to purge the expunged resources. Currently only VirtualMachine is supported. An empty "value will result in considering all resource types for purging
expunged.resources.purge.interval: Default: 86400. Interval (in seconds) for the background task to purge the expunged resources
expunged.resources.purge.delay: Default: 300. Initial delay (in seconds) to start the background task to purge the expunged resources task.
expunged.resources.purge.batch.size: Default: 50. Batch size to be used during expunged resources purging.
expunged.resources.purge.start.time: Default: (empty). Start time to be used by the background task to purge the expunged resources. Use format yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.
expunged.resources.purge.keep.past.days: Default: 30. The number of days in the past from the execution time of the background task to purge the expunged resources for which the expunged resources must not be purged. To enable purging expunged resource till the execution of the background task, set the value to zero.
expunged.resource.purge.job.delay: Default: 180. Delay (in seconds) to execute the purging of an expunged resource initiated by the configuration in the offering. Minimum value should be 180 seconds and if a lower value is set then the minimum value will be used.
Documentation PR: apache/cloudstack-documentation#397
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Wei Zhou <weizhou@apache.org>
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
* Normalize logs
All classes that could have their loggers inherited from their fathers had their own loggers deleted;
Most loggers didn't have to be static, so most of them were normalized so that they wouldn't be;
All loggers are protected now;
Static logger's name are now 'LOGGER';
Non-static logger's name are now 'logger';
New class DbUpgradeAbstractImpl created so that all Upgraders extend it and inherit its logger
* Upgrade log4j
* fix errors caused by the merge
* Refactor cglibThrowableRenderer functionality to log4j2 and upgrade the last configuration files
* fix sonarcloud bug
* Fix errors caused by merge, remove some unused loggers, and rename a variable that was mistakenly renamed on the normalization commit
* Readd snmpTrapAppender, remove TestAppender
* Regenerate changes
* regenerate changes
* refactor last custom appender
* fix systemvm configuration xml
* Regenerate changes
* Regenerate changes
* regenerate changes
* Regenerate changes
* regenerate changes
* regenerate changes
* regenerate changes
* Fix utils pom
* fix some tests
* regenerate changes
* Fix jar being printed on exception
* fix logging in system VMs, fix commands not having log4j2 classpath.
* regenerate changes
* Fix some unwanted renomeations
* fix end of file
* regenerate changes
* regenerate changes
* fix merge error
* regenerate changes
* fix tests
* regenerate changes
* regenerate changes
* regenerate changes
* regenerate changes
* regenerate changes
* regenerate changes
* regenerate changes
* readd reload4j to tungsten as juniper depends on it
* Regenerate changes
* regenerate changes
* regenerate changes
* regenerate changes
* regenerate changes
* re-add reload4j dependency to network-contrail, as juniper depends on it
* regenerate changes
* regenerate changes
* regenerate changes
* fix typo
* regenerate changes
* regenerate changes
* Fix end of files
* regenerate changes
* add logj42 to cloud-utils-SHADED.jar
* regenerate changes
* regenerate changes
* regenerate changes
* regenerate changes
* regenerate changes
* regenerate changes
* regenerate changes
* regenerate changes
* Regenerate changes
* Regenerate changes
* Regenerate changes
* regenerate changes
* Regenerate changes
* regenerate changes
* Regenerate changes
* Regenerate changes
* Regenerate changes
* regenerate changes
* Regenerate changes
* Regenerate changes
* fix some tests
* Regenerate changes
* Regenerate changes
* fix test
* Regenerate changes
* Regenerate changes
Extending the current functionality of KVM Host HA for the StorPool storage plugin and the option for easy integration for the rest of the storage plugins to support Host HA
This extension works like the current NFS storage implementation. It allows it to be used simultaneously with NFS and StorPool storage or only with StorPool primary storage.
If it is used with different primary storages like NFS and StorPool, and one of the health checks fails for storage, there is an option to report the failure to the management with the global config kvm.ha.fence.on.storage.heartbeat.failure. By default this option is disabled when enabled the Host HA service will continue with the checks on the host and eventually will fence the host
* alert: Send alert for ha'ed vm's
When ha is performed on vm's send the alert for it so that
its for admins to know which vm's got ha'ed else its time
consuming to get those details from logs
* feedback change
Co-authored-by: Rakesh Venkatesh <rakeshv@apache.org>
Currently, our compute offerings and disk offerings are tightly coupled with respect to many aspects. For example, if a compute offering is created, a corresponding disk offering entry is also created with the same ID as the reference. Also creating compute offering takes few disk-related parameters which anyway goes to the corresponding disk offering only. I think this design was initially made to address compute offering for the root volume created from a template. Also changing the offering of a volume is tightly coupled with storage tags and has to be done in different APIs either migrateVolume or resizeVolume. Changing of disk offering should be seamless and should consider new storage tags, new size and place the volume in appropriate state as defined in disk offering.
more details are mentioned here https://cwiki.apache.org/confluence/display/CLOUDSTACK/Compute+offering+and+disk+offering+refactoring
* Schema changes and disk offering column change from "type" to "compute_only"
* Few more changes
* Decoupled service offering and disk offering
* Remove diskofferingid from vminstance VO
* Decouple service offering and disk offering states
* diskoffering getsize() is only for strict disk offerings
* Fix deployVM flow
* Added new API params to compute offering creation
* Add diskofferingstrictness to serviceoffering vo under quota
* Added overrideDiskOfferingId parameter in deploy VM API which will override disk offering for the root disk both in template and ISO case
Added diskSizeStrictness parameter in create Disk offering API which will decide whether to restrict resize or disk offering change of a volume
* Fix User vm response to show proper service offering and disk offerings
* Added disk size strictness in disk offering response
* Added disk offering strictness to the service offering response
* Remove comments
* Added UI changes for Disk offering strictness in add compute offering form and Disk size strictness in add disk offering form
* Added diskoffering details to the service offering response
* Added UI changes in deployvm wizard to accept override disk offering id
* Fix delete compute offering
* Fix VM deployment from custom service offering
* Move uselocalstorage column access from service offering to disk offering
* UI: Separated compute and disk releated parameters in add compute offering wizard, also added association to disk offering
* Fixed diskoffering automatic selection on add compute offering wizard
* UI: move compute only toggle button outside the box in add compute offering wizard
* Added volumeId parameter to listDiskOfferings API and the disksizestrictness flag of the current disk offering is honored while list disk offerings
* Added configuration parameter to decide whether to check volume tags on the destination storagepool during migration
* Added disk offering change checks during resize volume operation
* Added new API changeofferingforVolume API and corresponding changes
* Add UI form for changeOfferingForVolume API
* Fix UI conflicts
* Fix service offering usage as disk offering
* Fix unit test failures
* fix user_vm_view
* Addressed review comments
* Fixed service_offering_view
* Fix service offering edit flow
* Fix service offering constructor to address custom offering
* Fix domain_router_view to get proper service offering id
* Removed unused import
* Addressed review comments and fixed update service offering flow with storage tags
* Added marvin test cases for checking disk offering strictness
* review comments addressed
* Remove system_use column from disk offering join
* update volume_view to update system_use column from service offering and not disk offering
* Fix changeOfferingForVolume API for custom disk offering
* Fix global setting implementation
* Fix list volumes, after changing system_use column from disk offering to service offering in volume_view
* Changes for override root disk offering in deployvm wizard in case of custom offering
* Fix a unit test case
* Fixed recent unit test cases with new serviceofferingvo constructor
* Fix unit test in VolumeApiServiceImpl
* Added storage id for the list disk offering API and corresponding UI changes in migrateVolume and changeOfferingForVolume flow
* Rename global configuration parameter from storage.pool.tags.disk.offering.strictness to match.storage.pool.tags.with.disk.offering
* Fix smoke test failures
* Added tool tip for migrate volume UI form
* Address review comments and fix UI form of deploy VM in case of ISO.
* Fixed resize volume UI form for data disk
* UI changes to disable override root disk size when override root disk offering is enabled
* UI fix in deploy vm wizard
* Fix listdiskoffering after rebasing with main
* Fixed UI in migrate and changeofferingfor volume to handle empty disk offering list
Removed the volume's current disk offering from listDiskOffering response list
* Added custom Iops to resize volume form and removed the current disk offering during change offering for volume UI form
* Fix false response on updateDiskOffering API
* Added search field for changeofferingforvolume UI form
* Fix resize volume and migrate volume to update volume path if DRS is applied on volume in datastore cluster
* Removed DB changes from 4.16 upgrade file
* Resolving merge conflicts with main 4.17
* Added support for auto migration and auto resize of the root volume upon changing the service offering for VM.
* UI: Added automigrate checkbox in scale VM form
* Addes since attributes to new API params
* Added shrinkOK parameter to changeofferingforvolume API
* Added shrinkOk param to UI in changeOfferingforVolume form
* Added shrinkOk flag to scaleVM and changeServiceForVirtualMachines and UI form
* Removed old foreign key constraint on IDs of service offering and disk offering
* Allow resize and automigrate of root volume if required in all cases of service offering change
* Allow only resize to higher disk size from UI
* Fixing vue syntax error
* Make UI changes to provide root disk size box when the linked disk offering is of custom
* Converted from check box to toggle in scale VM, changeoffering, resize and migrate volume forms
* Fix resize volume operation to update the VM settings
* Fix migratevolume form to pick selected storage pool id in list diskofferings API
Inclusivity changes for CloudStack
- Change default git branch name from 'master' to 'main' (post renaming/changing default git branch to 'main' in git repo)
- Rename some offensive words/terms as appropriate for inclusiveness.
This PR updates the default git branch to 'main', as part of #4887.
Signed-off-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* server: destroy ssvm, cpvm on last host maintenance
When a single or last UP host enters into maintenance just stopping SSVM and CPVM will leave behind VMs on hypervisor side. As these system vms will be recreated they can be destroyed.
Fixes#3719
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix methods
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* immediately destroy systemvms
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix destroy
Added bypassHostMaintenance flag in Comma.java class to allow command to be handled by host agent even when host is in maintenace.
Flag is set true only for delete commands for ssvm and cpvm.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* unit test fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix missing return statement
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix
VM should be stopped with cleanup before calling expunge else it server may through error with host in PrepareForMaintenance state.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* refactor
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* rename
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* refactor
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* forceha: fix vm is not started if it is poweroff from inside
steps to reproduce the issue
(1) make sure force.ha is true in global setting. if not, change it to true, and restart mgt server
(2) create a service offering , ha is not enabled
(3) create a vm
(4) log into the vm, and power off via cli.
expected result: vm is started again by cloudstack
actual result: vm is not started.
* forceha: fix vms are still running if host is force-removed
when host can be force removed, however vms are stopped in cloudstack, but not stopped on host
```
(localcloud) 🐱 > delete host id="a5625393-444d-4d0a-b31d-62baf88a8be1" forced=true
{
"success": true
}```
after some minutes, vms are still runnning on host
```
root@mgt01:~# ssh node63 virsh list
Id Name State
---------------------------
1 i-2-19-VM running
2 i-2-11-VM running
```
error message are
```
Cannot transmit host 2 to Enabled state
com.cloud.utils.fsm.NoTransitionException: No next resource state found for current state = Enabled event = DeleteHost
at com.cloud.resource.ResourceManagerImpl.resourceStateTransitTo(ResourceManagerImpl.java:1216)
at com.cloud.resource.ResourceManagerImpl$1.doInTransactionWithoutResult(ResourceManagerImpl.java:907)
```
* forceha: Make ForceHA dynamic
* Service layer changes for new way of tracking maintanence progress
* Fixes after offline code review
* Fix marvin tests
* Change state name and add documentation
* Fix test
* Fix and add more unit tests for different caseS
* Fix and enhance Marvin Tests
* Fixes for corner cases
* More fixes and logging
* UI fixes
* Some minor changes and reducing VMs on host for more contained tests
* Fixed ssh client auth problem causing test failure
* Code review changes + fixes + some more logging
* Fix flaky tests by adding delays between host states
* Added fetching only enabled hosts for tests
* Make port blocking KVM specific and refactor to handle failure
* Make failing migrations due to tagged host instead of port blocking
* Added additional check for migrating VMs
* Refactor to use single place for methods checking maintenance states
* Add missing HA config keys
* Change time value to seconds
* Change Integer to Long
* Using ConfigKey defaultValue
* Do some code refactoring
* Simplify code
* api: add command to list management servers
* api: add number of mangement servers in listInfrastructure command
* ui: add block for mangement servers on infra page
* api name resolution method cleanup
These boolean-return methods are named as "getXXX".
Other boolean-return methods are named as "isXXX".
Considering there methods will return boolean values, it should be more clear and consistent to rename them as "isXXX".
(rebase #2602 and #2816)
Remove maven standard module (which only a few were using) and get ride of maven customization for the projects structure.
- moved all directories to src/main/java, src/main/resources, src/main/scripts, src/test/java, src/test/resources
- grep scan to search for src/com and src/org left over
- grep for <project>/scripts to fix pom.xml configuration
- remove custom <build> configuration in pom.xml
Signed-off-by: Marc-Aurèle Brothier <m@brothier.org>