* api,agent,server,engine-schema: scalability improvements
Following changes and improvements have been added:
- Improvements in handling of PingRoutingCommand
1. Added global config - `vm.sync.power.state.transitioning`, default value: true, to control syncing of power states for transitioning VMs. This can be set to false to prevent computation of transitioning state VMs.
2. Improved VirtualMachinePowerStateSync to allow power state sync for host VMs in a batch
3. Optimized scanning stalled VMs
- Added option to set worker threads for capacity calculation using config - `capacity.calculate.workers`
- Added caching framework based on Caffeine in-memory caching library, https://github.com/ben-manes/caffeine
- Added caching for account/use role API access with expiration after write can be configured using config - `dynamic.apichecker.cache.period`. If set to zero then there will be no caching. Default is 0.
- Added caching for account/use role API access with expiration after write set to 60 seconds.
- Added caching for some recurring DB retrievals
1. CapacityManager - listing service offerings - beneficial in host capacity calculation
2. LibvirtServerDiscoverer existing host for the cluster - beneficial for host joins
3. DownloadListener - hypervisors for zone - beneficial for host joins
5. VirtualMachineManagerImpl - VMs in progress- beneficial for processing stalled VMs during PingRoutingCommands
- Optimized MS list retrieval for agent connect
- Optimize finding ready systemvm template for zone
- Database retrieval optimisations - fix and refactor for cases where only IDs or counts are used mainly for hosts and other infra entities. Also similar cases for VMs and other entities related to host concerning background tasks
- Changes in agent-agentmanager connection with NIO client-server classes
1. Optimized the use of the executor service
2. Refactore Agent class to better handle connections.
3. Do SSL handshakes within worker threads
5. Added global configs to control the behaviour depending on the infra. SSL handshake could be a bottleneck during agent connections. Configs - `agent.ssl.handshake.min.workers` and `agent.ssl.handshake.max.workers` can be used to control number of new connections management server handles at a time. `agent.ssl.handshake.timeout` can be used to set number of seconds after which SSL handshake times out at MS end.
6. On agent side backoff and sslhandshake timeout can be controlled by agent properties. `backoff.seconds` and `ssl.handshake.timeout` properties can be used.
- Improvements in StatsCollection - minimize DB retrievals.
- Improvements in DeploymentPlanner allow for the retrieval of only desired host fields and fewer retrievals.
- Improvements in hosts connection for a storage pool. Added config - `storage.pool.host.connect.workers` to control the number of worker threads that can be used to connect hosts to a storage pool. Worker thread approach is followed currently only for NFS and ScaleIO pools.
- Minor improvements in resource limit calculations wrt DB retrievals
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* test1, domaindetails, capacitymanager fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* test2 - agent tests
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* capacitymanagertest fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* change
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix missing changes
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* address comments
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* revert marvin/setup.py
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix indent
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* use space in sql
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* address duplicate
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* update host logs
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* revert e36c6a5d07
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix npe in capacity calculation
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* move schema changes to 4.20.1 upgrade
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* build fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* address comments
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix build
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* add some more tests
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* checkstyle fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* remove unnecessary mocks
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* build fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* replace statics
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* engine/orchestration,utils: limit number of concurrent new agent
connections
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* refactor - remove unused
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* unregister closed connections, monitor & cleanup
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* add check for outdated vm filter in power sync
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* agent: synchronize sendRequest wait
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
---------
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Improve logging to include more identifiable information for kvm plugin
* Update logging for scaleio plugin
* Improve logging to include more identifiable information for default volume storage plugin
* Improve logging to include more identifiable information for agent managers
* Improve logging to include more identifiable information for Listeners
* Replace ids with objects or uuids
* Improve logging to include more identifiable information for engine
* Improve logging to include more identifiable information for server
* Fixups in engine
* Improve logging to include more identifiable information for plugins
* Improve logging to include more identifiable information for Cmd classes
* Fix toString method for StorageFilterTO.java
* xenserver: attach regular iso with configdrive
Fixes#7902
This PR allows attaching a regular ISO to a VM when it already has the
config drive ISO attached.
Config-drive ISO is now attached with the SR name-label
<VM-NAME>-CONFIGDRIVE-ISO. While regular ISOs continue to attach with SR
name-label <VM-NAME>-ISO. VM which already have the configdrive ISO
attached before this fix will return an appropriate error and will need
to be stopped-start.
* Update extraconfig for platform param in xen/xcpng
* Fix map param key, not to replace '-' with '_' (replace only applicable to param / map-param)
* Added unit tests
* Add license for tests file
* Normalize logs
All classes that could have their loggers inherited from their fathers had their own loggers deleted;
Most loggers didn't have to be static, so most of them were normalized so that they wouldn't be;
All loggers are protected now;
Static logger's name are now 'LOGGER';
Non-static logger's name are now 'logger';
New class DbUpgradeAbstractImpl created so that all Upgraders extend it and inherit its logger
* Upgrade log4j
* fix errors caused by the merge
* Refactor cglibThrowableRenderer functionality to log4j2 and upgrade the last configuration files
* fix sonarcloud bug
* Fix errors caused by merge, remove some unused loggers, and rename a variable that was mistakenly renamed on the normalization commit
* Readd snmpTrapAppender, remove TestAppender
* Regenerate changes
* regenerate changes
* refactor last custom appender
* fix systemvm configuration xml
* Regenerate changes
* Regenerate changes
* regenerate changes
* Regenerate changes
* regenerate changes
* regenerate changes
* regenerate changes
* Fix utils pom
* fix some tests
* regenerate changes
* Fix jar being printed on exception
* fix logging in system VMs, fix commands not having log4j2 classpath.
* regenerate changes
* Fix some unwanted renomeations
* fix end of file
* regenerate changes
* regenerate changes
* fix merge error
* regenerate changes
* fix tests
* regenerate changes
* regenerate changes
* regenerate changes
* regenerate changes
* regenerate changes
* regenerate changes
* regenerate changes
* readd reload4j to tungsten as juniper depends on it
* Regenerate changes
* regenerate changes
* regenerate changes
* regenerate changes
* regenerate changes
* re-add reload4j dependency to network-contrail, as juniper depends on it
* regenerate changes
* regenerate changes
* regenerate changes
* fix typo
* regenerate changes
* regenerate changes
* Fix end of files
* regenerate changes
* add logj42 to cloud-utils-SHADED.jar
* regenerate changes
* regenerate changes
* regenerate changes
* regenerate changes
* regenerate changes
* regenerate changes
* regenerate changes
* regenerate changes
* Regenerate changes
* Regenerate changes
* Regenerate changes
* regenerate changes
* Regenerate changes
* regenerate changes
* Regenerate changes
* Regenerate changes
* Regenerate changes
* regenerate changes
* Regenerate changes
* Regenerate changes
* fix some tests
* Regenerate changes
* Regenerate changes
* fix test
* Regenerate changes
* Regenerate changes
* Guest OS mapping improvements
- Checks the OS mapping name in hypervisor (VMware, XenServer)
- Displays guest OS mappings in UI
* Added API getHypervisorGuestOsNames to list the guest OS names in the hypervisor, and code improvements
* Some static analysis fixes
* Removed commented code in listview
* Guest OS list
* UI changes for adding guest os and mappings
* Added guest os mappings in guest os form
* Added new filter to guest os mapping
* Name and description changes
* VMWare Host and cluster MO unit tests
* CheckGuestOsMapping command and answer unit tests
* GetHypervisorGuestOsNames command and answer unit tests
* VmwareResource unitests
* GuestOsMapper unittests
* icon changes
* Addressed review comments
* Renaming fixes
* Removed comments
* marvin tests for guest os operations
* Added marvin tests for OS mappings
* Document links and UI improvements
* Added deduplication for the list guest OS API
* Fixed linter failure
* Few bug fixes and UI changes
* Few improvements
* Addressed code smells
* Fixed UI issues after rebase
---------
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
Co-authored-by: Harikrishna Patnala <harikrishna.patnala@gmail.com>
* add global setting to allow parallel execution on vmware
* cleanup setting distribution for vmware.create.full.clone
* query setting in vmware guru
* don´t touch other hypervisor's commands
* guru hierarchy cleanup
* Improve log when live patching fails
* change patching path from /tmp to /var/cache/clou
* add iptable rule for console proxy (novnc)
* temporary template paths
* revert pom xml to original paths
* Support for live patching systemVMs and deprecating systemVM.iso. Includes:
- fix systemVM template version
- Include agent.zip, cloud-scripts.tgz to the commons package
- Support for live-patching systemVMs - CPVM, SSVM, Routers
- Fix Unit test
- Remove systemvm.iso dependency
* The following commit:
- refactors logic added to support SystemVM deployment on KVM
- Adds support to copy specific files (required for patching) to the hosts on Xenserver
- Modifies vmops method - createFileInDomr to take cleanup param
- Adds configuratble sleep param to CitrixResourceBase::connect() used to verify if telnet to specifc port is possible (if sleep is 0, then default to _sleep = 10000ms)
- Adds Command/Answer for patch systemVMs on XenServer/Xcp
* - Support to patch SystemVMs - VMWare
- Remove attaching systemvm.iso to systemVMs
- Modify / Refactor VMware start command to copy patch related files to the systemvms
- cleanup
* Commit comprises of:
- remove docker from systemvm template - use containerd as container runtime
- update create-k8s-binaries script to use ctr for all docker operations
- Update userdata sent to the k8s nodes
- update cksnode script, run during patching of the cks/k8s nodes
* Add ssh to k8s nodes details in the Access tab on the UI
* test
* Refactor ca/cert patching logic
* Commit comprises of the following changes:
- Use restart network/VPC API to patch routers
- use livePatch API support patching of only cpvm/ssvm
- add timeout to the keystore setup/import script
* remove all references of systemvm.iso
* Fix keystore-cert-import invocation + refactor cert timeout in CP/SS VMs
* fix script timeout
* Refactor cert patching for systemVMs + update keystore-cert-import script + patch-sysvms script + remove patchSysvmCommand from networkelementcommand
* remove commented code + change core user to cloud for cks nodes
* Update ownership of ssh directory
* NEED TO DISCUSS - add on the fly template conversion as an ExecStartPre action (systemd)
* Add UI changes + move changes from patch file to runcmd
* test: validate performance for template modification during seeding
* create vms folder in cloudstack-commons directory - debian rules
* remove logic for on the fly template convert + update k8s test
* fix syntax issue - causing issue with shared network tests
* Code cleanup
* refactor patching logic - certs
* move logic of fixing rootdiskcontroller from upgrade to kubernetes service
* add livepatch option to restart network & vpc
* smooth upgrade of cks clusters
* Support for live patching systemVMs and deprecating systemVM.iso. Includes:
- fix systemVM template version
- Include agent.zip, cloud-scripts.tgz to the commons package
- Support for live-patching systemVMs - CPVM, SSVM, Routers
- Fix Unit test
- Remove systemvm.iso dependency
* The following commit:
- refactors logic added to support SystemVM deployment on KVM
- Adds support to copy specific files (required for patching) to the hosts on Xenserver
- Modifies vmops method - createFileInDomr to take cleanup param
- Adds configuratble sleep param to CitrixResourceBase::connect() used to verify if telnet to specifc port is possible (if sleep is 0, then default to _sleep = 10000ms)
- Adds Command/Answer for patch systemVMs on XenServer/Xcp
* - Support to patch SystemVMs - VMWare
- Remove attaching systemvm.iso to systemVMs
- Modify / Refactor VMware start command to copy patch related files to the systemvms
- cleanup
* Commit comprises of:
- remove docker from systemvm template - use containerd as container runtime
- update create-k8s-binaries script to use ctr for all docker operations
- Update userdata sent to the k8s nodes
- update cksnode script, run during patching of the cks/k8s nodes
* Add ssh to k8s nodes details in the Access tab on the UI
* test
* Refactor ca/cert patching logic
* Commit comprises of the following changes:
- Use restart network/VPC API to patch routers
- use livePatch API support patching of only cpvm/ssvm
- add timeout to the keystore setup/import script
* remove all references of systemvm.iso
* Fix keystore-cert-import invocation + refactor cert timeout in CP/SS VMs
* fix script timeout
* Refactor cert patching for systemVMs + update keystore-cert-import script + patch-sysvms script + remove patchSysvmCommand from networkelementcommand
* remove commented code + change core user to cloud for cks nodes
* Update ownership of ssh directory
* NEED TO DISCUSS - add on the fly template conversion as an ExecStartPre action (systemd)
* Add UI changes + move changes from patch file to runcmd
* test: validate performance for template modification during seeding
* create vms folder in cloudstack-commons directory - debian rules
* remove logic for on the fly template convert + update k8s test
* fix syntax issue - causing issue with shared network tests
* Code cleanup
* add cgroup config for containerd
* add systemd config for kubelet
* add additional info during image registry config
* address comments
* add temp links of download.cloudstack.org
* address part of the comments
* address comments
* update containerd config - as version has upgraded to 1.5 from 1.4.12 in 4.17.0
* address comments - simplify
* fix vue3 related icon changes
* allow network commands when router template version is lower but is patched
* add internal LB to the list of routers to be patched on network restart with live patch
* add unit tests for API param validations and new helper utilities - file scp & checksum validations
* perform patching only for non-user i.e., system VMs
* add test to validate params
* remove unused import
* add column to domain_router to display software version and support networkrestart with livePatch from router view
* Requires upgrade column to consider package (cloud-scripts) checksum to identify if true/false
* use router software version instead of checksum
* show N/A if no software version reported i.e., in upgraded envs
* fix deb failure
* update pom to official links of systemVM template
* Add persistence of VM stats
* Fix API 'since' attribute
* Add license
* Address GutoVeronezi's reviews
* Fix the order of VM stats in the API response
* Fix msid in VM stats data
* Fix disk stats and add minor improvements
* Add log message
* Build string using ReflectionToStringBuilderUtils
* Rerun checks
Co-authored-by: joseflauzino <jose@scclouds.com.br>
* Add NFS version to mount command
* Remove extra line
* Extend NFS version to mount secondary storage
* Unused import
* Refactor NFS version to be granular
* Make use of the ConfigKey on the NFS version setting value
Currently, our compute offerings and disk offerings are tightly coupled with respect to many aspects. For example, if a compute offering is created, a corresponding disk offering entry is also created with the same ID as the reference. Also creating compute offering takes few disk-related parameters which anyway goes to the corresponding disk offering only. I think this design was initially made to address compute offering for the root volume created from a template. Also changing the offering of a volume is tightly coupled with storage tags and has to be done in different APIs either migrateVolume or resizeVolume. Changing of disk offering should be seamless and should consider new storage tags, new size and place the volume in appropriate state as defined in disk offering.
more details are mentioned here https://cwiki.apache.org/confluence/display/CLOUDSTACK/Compute+offering+and+disk+offering+refactoring
* Schema changes and disk offering column change from "type" to "compute_only"
* Few more changes
* Decoupled service offering and disk offering
* Remove diskofferingid from vminstance VO
* Decouple service offering and disk offering states
* diskoffering getsize() is only for strict disk offerings
* Fix deployVM flow
* Added new API params to compute offering creation
* Add diskofferingstrictness to serviceoffering vo under quota
* Added overrideDiskOfferingId parameter in deploy VM API which will override disk offering for the root disk both in template and ISO case
Added diskSizeStrictness parameter in create Disk offering API which will decide whether to restrict resize or disk offering change of a volume
* Fix User vm response to show proper service offering and disk offerings
* Added disk size strictness in disk offering response
* Added disk offering strictness to the service offering response
* Remove comments
* Added UI changes for Disk offering strictness in add compute offering form and Disk size strictness in add disk offering form
* Added diskoffering details to the service offering response
* Added UI changes in deployvm wizard to accept override disk offering id
* Fix delete compute offering
* Fix VM deployment from custom service offering
* Move uselocalstorage column access from service offering to disk offering
* UI: Separated compute and disk releated parameters in add compute offering wizard, also added association to disk offering
* Fixed diskoffering automatic selection on add compute offering wizard
* UI: move compute only toggle button outside the box in add compute offering wizard
* Added volumeId parameter to listDiskOfferings API and the disksizestrictness flag of the current disk offering is honored while list disk offerings
* Added configuration parameter to decide whether to check volume tags on the destination storagepool during migration
* Added disk offering change checks during resize volume operation
* Added new API changeofferingforVolume API and corresponding changes
* Add UI form for changeOfferingForVolume API
* Fix UI conflicts
* Fix service offering usage as disk offering
* Fix unit test failures
* fix user_vm_view
* Addressed review comments
* Fixed service_offering_view
* Fix service offering edit flow
* Fix service offering constructor to address custom offering
* Fix domain_router_view to get proper service offering id
* Removed unused import
* Addressed review comments and fixed update service offering flow with storage tags
* Added marvin test cases for checking disk offering strictness
* review comments addressed
* Remove system_use column from disk offering join
* update volume_view to update system_use column from service offering and not disk offering
* Fix changeOfferingForVolume API for custom disk offering
* Fix global setting implementation
* Fix list volumes, after changing system_use column from disk offering to service offering in volume_view
* Changes for override root disk offering in deployvm wizard in case of custom offering
* Fix a unit test case
* Fixed recent unit test cases with new serviceofferingvo constructor
* Fix unit test in VolumeApiServiceImpl
* Added storage id for the list disk offering API and corresponding UI changes in migrateVolume and changeOfferingForVolume flow
* Rename global configuration parameter from storage.pool.tags.disk.offering.strictness to match.storage.pool.tags.with.disk.offering
* Fix smoke test failures
* Added tool tip for migrate volume UI form
* Address review comments and fix UI form of deploy VM in case of ISO.
* Fixed resize volume UI form for data disk
* UI changes to disable override root disk size when override root disk offering is enabled
* UI fix in deploy vm wizard
* Fix listdiskoffering after rebasing with main
* Fixed UI in migrate and changeofferingfor volume to handle empty disk offering list
Removed the volume's current disk offering from listDiskOffering response list
* Added custom Iops to resize volume form and removed the current disk offering during change offering for volume UI form
* Fix false response on updateDiskOffering API
* Added search field for changeofferingforvolume UI form
* Fix resize volume and migrate volume to update volume path if DRS is applied on volume in datastore cluster
* Removed DB changes from 4.16 upgrade file
* Resolving merge conflicts with main 4.17
* Added support for auto migration and auto resize of the root volume upon changing the service offering for VM.
* UI: Added automigrate checkbox in scale VM form
* Addes since attributes to new API params
* Added shrinkOK parameter to changeofferingforvolume API
* Added shrinkOk param to UI in changeOfferingforVolume form
* Added shrinkOk flag to scaleVM and changeServiceForVirtualMachines and UI form
* Removed old foreign key constraint on IDs of service offering and disk offering
* Allow resize and automigrate of root volume if required in all cases of service offering change
* Allow only resize to higher disk size from UI
* Fixing vue syntax error
* Make UI changes to provide root disk size box when the linked disk offering is of custom
* Converted from check box to toggle in scale VM, changeoffering, resize and migrate volume forms
* Fix resize volume operation to update the VM settings
* Fix migratevolume form to pick selected storage pool id in list diskofferings API
* xcp-ng: fix vm boot options
Use value boot option values from VM details directly similar to other hypervisor plugins instead of relying on boot properties explicitly set for VirtualMachineTO.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Harikrishna Patnala <harikrishna.patnala@gmail.com>