Commit Graph

9 Commits

Author SHA1 Message Date
Abhishek Kumar 0b5a5e8043
api,agent,server,engine-schema: scalability improvements (#9840)
* api,agent,server,engine-schema: scalability improvements

Following changes and improvements have been added:

- Improvements in handling of PingRoutingCommand

    1. Added global config - `vm.sync.power.state.transitioning`, default value: true, to control syncing of power states for transitioning VMs. This can be set to false to prevent computation of transitioning state VMs.
    2. Improved VirtualMachinePowerStateSync to allow power state sync for host VMs in a batch
    3. Optimized scanning stalled VMs

- Added option to set worker threads for capacity calculation using config - `capacity.calculate.workers`

- Added caching framework based on Caffeine in-memory caching library, https://github.com/ben-manes/caffeine

- Added caching for account/use role API access with expiration after write can be configured using config - `dynamic.apichecker.cache.period`. If set to zero then there will be no caching. Default is 0.

- Added caching for account/use role API access with expiration after write set to 60 seconds.

- Added caching for some recurring DB retrievals

    1. CapacityManager - listing service offerings - beneficial in host capacity calculation
    2. LibvirtServerDiscoverer existing host for the cluster - beneficial for host joins
    3. DownloadListener - hypervisors for zone - beneficial for host joins
    5. VirtualMachineManagerImpl - VMs in progress- beneficial for processing stalled VMs during PingRoutingCommands

- Optimized MS list retrieval for agent connect

- Optimize finding ready systemvm template for zone

- Database retrieval optimisations - fix and refactor for cases where only IDs or counts are used mainly for hosts and other infra entities. Also similar cases for VMs and other entities related to host concerning background tasks

- Changes in agent-agentmanager connection with NIO client-server classes

    1. Optimized the use of the executor service
    2. Refactore Agent class to better handle connections.
    3. Do SSL handshakes within worker threads
    5. Added global configs to control the behaviour depending on the infra. SSL handshake could be a bottleneck during agent connections. Configs - `agent.ssl.handshake.min.workers` and `agent.ssl.handshake.max.workers` can be used to control number of new connections management server handles at a time. `agent.ssl.handshake.timeout` can be used to set number of seconds after which SSL handshake times out at MS end.
    6. On agent side backoff and sslhandshake timeout can be controlled by agent properties. `backoff.seconds` and `ssl.handshake.timeout` properties can be used.

- Improvements in StatsCollection - minimize DB retrievals.

- Improvements in DeploymentPlanner allow for the retrieval of only desired host fields and fewer retrievals.

- Improvements in hosts connection for a storage pool. Added config - `storage.pool.host.connect.workers` to control the number of worker threads that can be used to connect hosts to a storage pool. Worker thread approach is followed currently only for NFS and ScaleIO pools.

- Minor improvements in resource limit calculations wrt DB retrievals

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* test1, domaindetails, capacitymanager fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* test2 - agent tests

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* capacitymanagertest fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* change

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix missing changes

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* address comments

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* revert marvin/setup.py

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix indent

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* use space in sql

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* address duplicate

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* update host logs

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* revert e36c6a5d07

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix npe in capacity calculation

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* move schema changes to 4.20.1 upgrade

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* build fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* address comments

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix build

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* add some more tests

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* checkstyle fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* remove unnecessary mocks

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* build fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* replace statics

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* engine/orchestration,utils: limit number of concurrent new agent
connections

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* refactor - remove unused

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* unregister closed connections, monitor & cleanup

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* add check for outdated vm filter in power sync

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* agent: synchronize sendRequest wait

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

---------

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2025-02-01 12:28:41 +05:30
John Bampton c923e673cf
pre-commit: add `XML` files to the `trailing-whitespace` check (#9131) 2024-07-12 09:42:54 +02:00
Abhishek Kumar 3e6900ac1a
api,server: purge expunged resources (#8999)
This PR introduces the functionality of purging removed DB entries for CloudStack entities (currently only for VirtualMachine). There would be three mechanisms for purging removed resources:

    Background task - CloudStack will run a background task which runs at a defined interval. Other parameters for this task can be controlled with new global settings.
    API - New admin-only API purgeExpungedResources. It will allow passing the following parameters - resourcetype, batchsize, startdate, enddate. Currently, API is not supported in the UI.
    Config for service offering. Service offerings can be created with purgeresources parameter which would allow purging resources immediately on expunge.

Following new global settings have been added:

    expunged.resources.purge.enabled: Default: false. Whether to run a background task to purge the expunged resources
    expunged.resources.purge.resources: Default: (empty). A comma-separated list of resource types that will be considered by the background task to purge the expunged resources. Currently only VirtualMachine is supported. An empty "value will result in considering all resource types for purging
    expunged.resources.purge.interval: Default: 86400. Interval (in seconds) for the background task to purge the expunged resources
    expunged.resources.purge.delay: Default: 300. Initial delay (in seconds) to start the background task to purge the expunged resources task.
    expunged.resources.purge.batch.size: Default: 50. Batch size to be used during expunged resources purging.
    expunged.resources.purge.start.time: Default: (empty). Start time to be used by the background task to purge the expunged resources. Use format yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.
    expunged.resources.purge.keep.past.days: Default: 30. The number of days in the past from the execution time of the background task to purge the expunged resources for which the expunged resources must not be purged. To enable purging expunged resource till the execution of the background task, set the value to zero.
    expunged.resource.purge.job.delay: Default: 180. Delay (in seconds) to execute the purging of an expunged resource initiated by the configuration in the offering. Minimum value should be 180 seconds and if a lower value is set then the minimum value will be used.

Documentation PR: apache/cloudstack-documentation#397

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Wei Zhou <weizhou@apache.org>
Co-authored-by: Suresh Kumar Anaparti <sureshkumar.anaparti@gmail.com>
2024-06-20 11:34:44 +05:30
João Jandre 49cecaed06
Normalize loggers and upgrade log4j 1.2 to log4j 2.19 (#7131)
* Normalize logs

All classes that could have their loggers inherited from their fathers had their own loggers deleted;
Most loggers didn't have to be static, so most of them were normalized so that they wouldn't be;
All loggers are protected now;
Static logger's name are now 'LOGGER';
Non-static logger's name are now 'logger';
New class DbUpgradeAbstractImpl created so that all Upgraders extend it and inherit its logger

* Upgrade log4j

* fix errors caused by the merge

* Refactor cglibThrowableRenderer functionality to log4j2 and upgrade the last configuration files

* fix sonarcloud bug

* Fix errors caused by merge, remove some unused loggers, and rename a variable that was mistakenly renamed on the normalization commit

* Readd snmpTrapAppender, remove TestAppender

* Regenerate changes

* regenerate changes

* refactor last custom appender

* fix systemvm configuration xml

* Regenerate changes

* Regenerate changes

* regenerate changes

* Regenerate changes

* regenerate changes

* regenerate changes

* regenerate changes

* Fix utils pom

* fix some tests

* regenerate changes

* Fix jar being printed on exception

* fix logging in system VMs, fix commands not having log4j2 classpath.

* regenerate changes

* Fix some unwanted renomeations

* fix end of file

* regenerate changes

* regenerate changes

* fix merge error

* regenerate changes

* fix tests

* regenerate changes

* regenerate changes

* regenerate changes

* regenerate changes

* regenerate changes

* regenerate changes

* regenerate changes

* readd reload4j to tungsten as juniper depends on it

* Regenerate changes

* regenerate changes

* regenerate changes

* regenerate changes

* regenerate changes

* re-add reload4j dependency to network-contrail, as juniper depends on it

* regenerate changes

* regenerate changes

* regenerate changes

* fix typo

* regenerate changes

* regenerate changes

* Fix end of files

* regenerate changes

* add logj42 to cloud-utils-SHADED.jar

* regenerate changes

* regenerate changes

* regenerate changes

* regenerate changes

* regenerate changes

* regenerate changes

* regenerate changes

* regenerate changes

* Regenerate changes

* Regenerate changes

* Regenerate changes

* regenerate changes

* Regenerate changes

* regenerate changes

* Regenerate changes

* Regenerate changes

* Regenerate changes

* regenerate changes

* Regenerate changes

* Regenerate changes

* fix some tests

* Regenerate changes

* Regenerate changes

* fix test

* Regenerate changes

* Regenerate changes
2024-02-08 09:55:41 -03:00
kishankavala 80bbb29abf
CleanUp Async Jobs after mgmt server maintenance (#8394)
This PR fixes moves resources stuck in transition state during async job cleanup

Problem:
During maintenance of the management server, other servers in the cluster or the same server after a restart initiate async job cleanup. However, this process leaves resources in a transitional state. The only recovery option currently available is to make direct database changes.

Solution:
This PR introduces a resolution by changing Volume, Virtual Machine, and Network resources from their transitional states. This adjustment enables the reattempt of failed operations without the need for manual database modifications.
2024-01-19 13:26:25 +05:30
John Bampton 6f4503488b
pre-commit: apply `end-of-file-fixer` to all files (#7551) 2023-08-02 13:47:21 +02:00
Suresh Kumar Anaparti d8c7e34b38
Improve global settings UI to be more intuitive/logical (#5797)
Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
Co-authored-by: nvazquez <nicovazquez90@gmail.com>
Co-authored-by: davidjumani <dj.davidjumani1994@gmail.com>
Co-authored-by: dahn <daan.hoogland@gmail.com>
Co-authored-by: dahn <daan@onecht.net>
2023-01-31 11:23:43 +01:00
Gabriel Beims Bräscher 1ba6a49fa4
Fix JsonSyntaxException when creating API command response #4355 (#4387)
* Add missing \" on obfuscatePassword

* Add tests for AsyncJobManagerImpl.obfuscatePassword

* fix for "password},"
2020-10-20 16:49:55 +05:30
Marc-Aurèle Brothier 893a88d225 CLOUDSTACK-10105: Use maven standard project structure in all projects (#2283)
Remove maven standard module (which only a few were using) and get ride of maven customization for the projects structure.

- moved all directories to src/main/java, src/main/resources, src/main/scripts, src/test/java, src/test/resources
- grep scan to search for src/com and src/org left over
- grep for <project>/scripts to fix pom.xml configuration
- remove custom <build> configuration in pom.xml

Signed-off-by: Marc-Aurèle Brothier <m@brothier.org>
2018-01-20 03:19:27 +05:30