Commit Graph

11655 Commits

Author SHA1 Message Date
Abhishek Kumar 4400e02a1b
framework/config,server: configkey caching (#472)
Added caching for ConfigKey value retrievals based on the Caffeine
in-memory caching library.
https://github.com/ben-manes/caffeine
Currently, expire time for a cache is 1 minute and each update of the
config key invalidates the cache.

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-03 15:53:08 +05:30
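The caching behaviour described for 4400e02a1b (entries expire after 1 minute, and an update of a config key invalidates its cached value) can be sketched as follows. This is a minimal Python illustration, not Caffeine and not the CloudStack code; `TtlCache`, `loader` and `clock` are hypothetical names introduced here.

```python
import time
from typing import Callable, Dict, Tuple


class TtlCache:
    """Tiny expire-after-write cache; a stand-in for a Caffeine cache, not its API."""

    def __init__(self, loader: Callable[[str], str], ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader      # fetches the value from the backing store (assumed)
        self._ttl = ttl_seconds    # 60 s mirrors the 1 minute expiry in the commit message
        self._clock = clock        # injectable so expiry can be exercised without sleeping
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]        # still fresh: serve from the cache
        value = self._loader(key)  # missing or expired: reload and remember when
        self._entries[key] = (now, value)
        return value

    def invalidate(self, key: str) -> None:
        """Call this whenever the config key is updated."""
        self._entries.pop(key, None)
```

Between an update and the matching `invalidate` call a reader could still see the old value, which is why the update path is expected to invalidate immediately.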
Abhishek Kumar e676b80052 revert fc2e4ffd12
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-07-29 13:08:28 +05:30
Abhishek Kumar fc2e4ffd12 server: refactor listNetworks api database retrievals (#9184)
* server: refactor listNetworks api database retrievals

* fixes

* remove unused methods

* imports

* fix empty searchcriteria issue

* refactor

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-07-22 16:14:00 +05:30
Abhishek Kumar 5e98405b38 Merge remote-tracking branch 'apple/apple-base418' into scalability-improvements 2024-07-22 16:12:19 +05:30
Suresh Kumar Anaparti d1faa59677
Back port fixes from upstream 4.19 (#466)
* Fixed src datastore on copy check for PowerFlex/ScaleIO storage driver (#9310)

* Ignore non-managed pools for storage pool access preparation (#9376)
2024-07-19 09:38:11 +05:30
Suresh Kumar Anaparti 5c682677fc
Support resource name / displaytext with unicode / emoji chars, and SQL exception msg improvements (#460)
* Don't send sql exception/query from dao to upper layer, log it and send only the error message

* Updated charset to utf8mb4, for display_name column/user_vm table and job_result column/async_job table to support unicode chars & emojis

* Added API arg validator for RFC compliance domain name, to validate VM's host name

* Updated user resources name / display name column's charset to utf8mb4

* Check and update char set for affinity group name to utf8mb4, from the data migration in upgrade path

* Updated backup offering name column charset to utf8mb4

* Added unit tests for vm host/domain name validation

* Added smoke test to check resource name for vm, volume, service & disk offering, template, iso, account(first/lastname)

* Updated resource annotation charset to utf8mb4

* Updated some resources description charset to utf8mb4
2024-07-19 09:35:18 +05:30
Rohit Yadav b46e4d4bbf
framework/cluster: improve cluster service and integration API service (#465)
- mTLS implementation for cluster service communication
- Listen only on the specified cluster node IP address instead of all interfaces
- Validate that incoming cluster service requests are from peer management servers, based on the server's certificate DNS name, which can be set through the global config ca.framework.cert.management.custom.san
- Hardening of KVM command wrapper script execution
- Improve API server integration port check
- cloudstack-management.default: don't have JMX configuration if not needed. JMX is used for instrumentation; users who need to use it should enable it explicitly

Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Wei Zhou <weizhou@apache.org>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
(cherry picked from commit 4f5561937c)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-07-09 09:03:40 +05:30
Vishesh c6d35b31ca
Log stdout to a file (#399)
* Log stdout to a file

* Add logrotation

* Fixup permissions in log file

* Remove info logs in stdout

* Change output file names

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>

* Fix logrotate config

* Disable logging to stdout

---------

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-07-03 20:51:20 +05:30
Abhishek Kumar 08246e05ed
server,test: fix resourceid for VOLUME.DESTROY in restore VM (#9032) (#454)
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-06-28 12:01:55 +05:30
Suresh Kumar Anaparti be87b1a668
FR74: Mitigation for non-scalable ScaleIO clients (#447)
* Mitigation for non-scalable Powerflex/ScaleIO clients
- Added ScaleIOSDCManager to manage SDC connections, checks clients limit, prepare and unprepare SDC on the hosts.
- Added commands for prepare and unprepare storage clients to prepare/start and stop SDC service respectively on the hosts.
- Introduced config 'storage.pool.connected.clients.limit' at storage level for client limits, currently support for Powerflex only.

* tests issue fixed

* refactor / improvements

* lock with powerflex systemid while checking connections limit

* updated powerflex systemid lock to hold till sdc preparation

* Added custom stats support for storage pool, through listStoragePools API

* code improvements, and unit tests

* Update config 'storage.pool.connected.clients.limit' to dynamic, and some improvements

* Stop SDC on host after migration if no volumes mapped to host

* Wait for SDC to connect after scini service start, and some log improvements

* Do not throw exception (log it) when SDC is not connected while revoking access for the powerflex volume

* some log improvements
2024-06-27 18:47:50 +05:30
Vishesh 7ed43e3e43
Let network guru decide if ipv6 cidr size can't be equal to 64 (#462) 2024-06-27 12:20:49 +05:30
Vishesh 8be18e587f
FR75 Enforce strict host tag checking (#421)
* Enforce strict host tag checking

* Add e2e tests

* Add more information to error log

* Fix e2e test

* Update global settings description

* fixup

* Fix e2e test teardown
2024-06-25 14:38:59 +05:30
Abhishek Kumar 8f88103a29
FR72 - api,server: purge expunged resources (#405)
This PR introduces the functionality of purging removed DB entries for CloudStack entities (currently only for VirtualMachine).
There would be three mechanisms for purging removed resources:
- Background task - CloudStack will run a background task which runs at a defined interval. Other parameters for this task can be controlled with new global settings.
- API - New API `purgeExpungedResources`. It will allow passing the following parameters - resourcetype, batchsize, startdate, enddate
- Config for service offering. Service offerings can be created with purgeresources parameter which would allow purging resources immediately on expunge.

Following new global settings have been added:
- `expunged.resources.purge.enabled`: Default: false. Whether to run a background task to purge the DB records of the expunged resources.
- `expunged.resources.purge.resources`: Default: (empty). A comma-separated list of resource types that will be considered by the background task to purge the DB records of the expunged resources. Currently only VirtualMachine is supported. An empty value will result in considering all resource types for purging.
- `expunged.resources.purge.interval`: Default: 86400. Interval (in seconds) for the background task to purge the DB records of the expunged resources.
- `expunged.resources.purge.delay`: Default: 300. Initial delay (in seconds) to start the background task to purge the DB records of the expunged resources task.
- `expunged.resources.purge.batch.size`: Default: 50. Batch size to be used during purging of the DB records of the expunged resources.
- `expunged.resources.purge.start.time`: Default: (empty). Start time to be used by the background task to purge the DB records of the expunged resources. Use format `yyyy-MM-dd` or `yyyy-MM-dd HH:mm:ss`.
- `expunged.resources.purge.keep.past.days`: Default: 30. The number of days in the past from the execution time of the background task to purge the DB records of the expunged resources for which the expunged resources must not be purged. To enable purging DB records of the expunged resource till the execution of the background task, set the value to zero.
- `expunged.resource.purge.job.delay`: Default: 180. Delay (in seconds) to execute the purging of the DB records of an expunged resource initiated by the configuration in the offering. Minimum value should be 180 seconds and if a lower value is set then the minimum value will be used.

Upstream PRs:
https://github.com/apache/cloudstack/pull/8999
https://github.com/apache/cloudstack-documentation/pull/397

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-06-19 12:59:50 +05:30
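Two of the rules stated for 8f88103a29 are simple arithmetic: the per-offering job delay is raised to at least 180 seconds, and `expunged.resources.purge.keep.past.days` moves the purge cutoff back from the execution time of the background task. A minimal Python sketch, in which both function names are hypothetical and whether the boundary instant itself is kept or purged is an assumption:

```python
from datetime import datetime, timedelta

# Minimum stated for expunged.resource.purge.job.delay in the commit message.
MIN_PURGE_JOB_DELAY_SECONDS = 180


def effective_job_delay(configured_seconds: int) -> int:
    """A configured value below the minimum is replaced by the minimum."""
    return max(configured_seconds, MIN_PURGE_JOB_DELAY_SECONDS)


def purge_cutoff(run_time: datetime, keep_past_days: int = 30) -> datetime:
    """Resources expunged before this instant are candidates for purging.

    keep_past_days=0 makes everything expunged up to the run time eligible.
    """
    return run_time - timedelta(days=keep_past_days)
```

For example, `effective_job_delay(60)` yields 180, and a run on 2024-06-19 with the default of 30 days gives a cutoff of 2024-05-20.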
Suresh Kumar Anaparti 04091abc0d
User data content size validation, register managed user data using POST call from UI, and related code improvements (#361)
* Validate user data with actual length, and some code improvements

* Ignore if user data is not set (don't fail)

* Validate user data after finalizing it

* Updated registerUserData API using POST call from UI, to support user data up to 1048576 bytes

* Apply suggestions from code review

* Added logs for user data

* Addressed review comments

* Check user data length with base64 encoded data, and some code improvements
2024-06-19 12:54:32 +05:30
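The checks listed for 04091abc0d (ignore unset user data, measure the length of the base64-encoded data, allow up to 1048576 bytes over POST) could look roughly like this. It is a Python sketch under those stated assumptions; `validate_user_data` is a hypothetical helper, not the CloudStack method.

```python
import base64

# Limit quoted in the commit message for user data sent with a POST call.
MAX_POST_USER_DATA_BYTES = 1048576


def validate_user_data(encoded: str, max_bytes: int = MAX_POST_USER_DATA_BYTES) -> bytes:
    """Return the decoded user data, or raise ValueError if it is unacceptable."""
    if not encoded:
        return b""  # user data not set: nothing to validate, do not fail
    # The length is checked on the base64-encoded form, as the last bullet describes.
    if len(encoded.encode("utf-8")) > max_bytes:
        raise ValueError("user data exceeds %d bytes" % max_bytes)
    try:
        # validate=True rejects characters outside the base64 alphabet.
        return base64.b64decode(encoded, validate=True)
    except ValueError as exc:  # binascii.Error is a subclass of ValueError
        raise ValueError("user data is not valid base64") from exc
```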
Abhishek Kumar 256051af1d
server: fix resource reservation leakage (#456)
* server: fix resource reservation leakage

Fixes #453

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* refactor

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* Fix resource reservation leftover entries (#455)

* Resolve comments

* Address comments

---------

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Vishesh <vishesh92@gmail.com>
2024-06-10 12:29:45 +05:30
Wei Zhou e065c93c3f
Apple FR76: Implicit host tags (#427)
* Merge two HostTagVO and HostTagDaoImpl

* Apple FR76: dynamic host tags

* Revert "Apple FR76: dynamic host tags"

This reverts commit 01b93a873f167018c4fafd0744c0de07ae4de4ed.

* Apple FR76: Implicit host tags

* Apple FR76: address Abhishek's comments

* Apple FR76: move updateImplicitTags

* Apple FR76: add since to other two responses

* Update 8929: add unit test in LibvirtComputingResourceTest

* Update variable names

* Update FR76: add explicithosttags in response

* Update FR76 UI: Update explicit host tags

* Update 8929: remove host tags and change labels on UI

* Update: ui polish for host tags

* fix since in responses

* Update 8929: fix UI error if no host tags
2024-05-30 17:20:37 +05:30
Rohit Yadav b03d1382e6 fix unit tests failures
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-23 10:23:32 +05:30
Rohit Yadav c3867a941f more fixmes and todos
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:39 +05:30
Rohit Yadav 7a7f1e2b6e FIXME/TODO: CPU and DB hotspot found
Found these CPU and DB hotspots that handle agent ping commands; they
add idle load when there is a high number of hosts. By design, there
isn't any quick win here. However, the power sync report/handling could
be improved, so it doesn't need to kick in for every ping command
received.

A few more areas are marked in the codebase.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:39 +05:30
Rohit Yadav 5603bf9c1a engine: optimise CPU and DB hotspot to return enabled hypervisors in the zone
This refactors a ResourceManager::listAvailHypervisorInZone method
that should return unique hypervisors for which existing hosts are Up
and processed. We can approximate this by assuming that those hosts
would have setup their hypervisor-specific systemvmtemplates. In a given
environment there wouldn't be thousands of systemvmtemplates, but can
have thousands of hosts. So, instead of scanning the entire cloud.host
table, we can make a calculated guess by returning unique hypervisors of
systemvm templates which are ready. This method was used in
::processConnect() when an agent joins, to speed up its handling.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:39 +05:30
Rohit Yadav 35462dc96d server: fix full table scanning for listHosts API
The type parameter isn't a keyword, but a simple listHosts API call with
type=Routing runs SELECT COUNT(*) FROM host WHERE host.type LIKE
'%Routing' AND host.removed IS NULL; ... which causes an unnecessary
full table scan.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:38 +05:30
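The effect described for 35462dc96d can be reproduced in miniature. The Python sketch below uses an in-memory SQLite database as a stand-in for the MySQL `host` table, so the planner output is SQLite's and only illustrates the general point that a leading-wildcard LIKE cannot use an index on the column while an equality match can. The index name and the reduced table layout are assumptions.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str) -> str:
    """Return the planner's description of how it would run the statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(str(row[-1]) for row in rows)


conn = sqlite3.connect(":memory:")
# Reduced, hypothetical version of the host table: only the columns in the query.
conn.execute("CREATE TABLE host (id INTEGER PRIMARY KEY, type TEXT, removed TEXT)")
conn.execute("CREATE INDEX i_host__type ON host (type)")  # hypothetical index name
conn.executemany(
    "INSERT INTO host (type, removed) VALUES (?, NULL)",
    [("Routing",), ("Routing",), ("SecondaryStorage",)],
)

LIKE_QUERY = "SELECT COUNT(*) FROM host WHERE type LIKE '%Routing' AND removed IS NULL"
EXACT_QUERY = "SELECT COUNT(*) FROM host WHERE type = 'Routing' AND removed IS NULL"

like_plan = query_plan(conn, LIKE_QUERY)    # expected to describe a scan of host
exact_plan = query_plan(conn, EXACT_QUERY)  # expected to describe an index search
print(like_plan)
print(exact_plan)
```

Both queries return the same count here; only the access path differs.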
Rohit Yadav 5750e56be5 server: improve DB optimisation, indexing and reduce table scans
In this example commit, we look at:
- Adding missing indexes to speed up queries
- Reduce table scans by optimising sql query and using indexes
- Optimising sql queries to remove duplicate rows (use of distinct)
- Reduce CPU and DB load by using jprofiler to optimise both sql query
  and CPU hotspots

server: reduce CPU and DB load caused by systemvm ::isZoneReady()
For this case, the sql query was causing a large number of table scans
only to determine if the zone has any available pool+host to launch
systemvms. Accordingly, the code, sql queries and indexes were
optimised to lower both DB scans and mgmt server CPU load.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:38 +05:30
Rohit Yadav 3a0927a568 server: trace logs for security groups listener
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:38 +05:30
Rohit Yadav 607911562e server: fix NPE, compare known versus unknown in equals()
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:38 +05:30
Rohit Yadav c01aad6ba8 server: count hosts rather than get all hosts in capacity scans
This refactors hotspot code to fetch just the count of hosts rather than
all the host VOs for a zone, during capacity scans for systemvms.
This reduces CPU and DB load in really large (10k+ hosts) environments.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:38 +05:30
Rohit Yadav 2a48d71909 server: don't go into O(n^2) loop for non-XenServer hosts
This gates the logic introduced in
https://github.com/apache/cloudstack/pull/1403 to XenServer only, the
one place where it would run at all. The specific code is only
applicable for XenServer and SolidFire
(https://youtu.be/YQ3pBeL-WaA?si=ed_gT_A8lZYJiEh).

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:38 +05:30
Vishesh c3eba5e213
Fix exceeding of resource limits with powerflex (#443)
* Fix exceeding of resource limits with powerflex

* Fix for volume prepare during VM start

* resolve comments

* Add e2e tests

* Fixup

* Update e2e tests

* minor refactoring

* refactoring

* fixup

---------

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-05-08 20:54:54 +05:30
Vishesh 2f4cea6dca
Fix message publish in transaction (#438)
* Fix message publish in transaction

* Resolve comments
2024-05-07 13:27:19 +05:30
Vishesh 7fae1fc747
Fix restore VM with allocated root disk (#441)
* Fix restore VM with allocated root disk

* Add e2e test for restore vm

* Add more checks for e2e test
2024-04-29 12:18:55 +05:30
Vishesh 9ab786c18a
Fix: Update rootdisksize detail on restore VM (#440)
* Fix: Update rootdisksize detail on restore VM

* Update server/src/main/java/com/cloud/vm/UserVmManagerImpl.java

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>

* minor fixup

---------

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-04-29 12:14:44 +05:30
Vishesh 1b54edd9de
Fix resource limit checks and increment/decrements for different operations (#430)
* Fix resource limit checks and increment/decrements for different operations

* Fixup

* More fixups

* fixup

* Refactor code

* Resolve comments

* Some minor code refactoring

* Fixup

* fixup

* Fix method name

* Fixup

* Fixup listing
2024-04-24 17:56:33 +05:30
Vishesh c21b6d8b52
Update volume's passphrase to null if diskOffering doesn't support encryption (#428) 2024-04-23 09:46:20 -06:00
Vishesh 93e66c52dc
Fix null pointer exception in restore VM (#431) 2024-04-23 09:41:39 -06:00
Vishesh 1b52bebd08
Fix error message for checkVolume command (#409) 2024-04-17 17:27:08 +05:30
Marcus Sorensen 3a058f3a18
Introduce scheduled executor wrapper with dynamic interval (#424)
* Introduce scheduled executor wrapper with dynamic interval

* Resolve comments

* Add validations

* Add validation for configkey

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Vishesh <vishesh92@gmail.com>
2024-04-17 14:24:31 +05:30
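The idea behind 3a058f3a18, a scheduled task whose interval can change while it is running, can be sketched by re-reading the interval before every wait instead of fixing it at scheduling time. This Python loop is only an illustration of that idea; the function, its parameters and the positive-interval check are assumptions, not the wrapper added by the commit.

```python
import time
from typing import Callable


def run_with_dynamic_interval(task: Callable[[], None],
                              interval_supplier: Callable[[], float],
                              iterations: int,
                              sleep: Callable[[float], None] = time.sleep) -> None:
    """Run task repeatedly, asking interval_supplier for the delay each cycle."""
    for _ in range(iterations):
        task()
        # Re-read the interval every cycle so a changed setting takes effect
        # on the very next wait, without rescheduling the task.
        interval = interval_supplier()
        if interval <= 0:
            raise ValueError("interval must be positive")
        sleep(interval)
```

Passing a recording function as `sleep` makes the behaviour observable without waiting: if the supplier's value drops from 5 to 1 during the second run, the recorded waits are 5, 1, 1.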
Vishesh fd9325a86d
Speed up resource count calculation (#425)
* Speed up resource count calculation

* server: remove supportedOwner from Resource.ResourceType (#7416)

* Refactor resource count calculation

* Start transaction for updateCountByDeltaForIds

---------

Co-authored-by: GaOrtiga <49285692+GaOrtiga@users.noreply.github.com>
2024-04-17 14:21:07 +05:30
Vishesh 0501678478
Allow overriding root diskoffering id & size, and expunge old root disk while restoring VM (#401)
* Allow overriding root diskoffering id & size while restoring VM

* UI changes

* Allow expunging of old disk while restoring a VM

* Apply suggestions from code review

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>

* resolve comments

* Fixup

* Rename some variables

* Resolve comments

* Address comments

* Duplicate volume's details while duplicating volume

* Allow setting IOPS for the new volume

* minor cleanup

* fixup

* Add checks for template size

* Replace strings for IOPS with constants

* Fix saveVolumeDetails method

* Fixup

* Fixup UI styling

---------

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-04-12 17:47:16 +05:30
Vishesh 8d0915c4c9
Change iops on offering change (#416)
* Change IOPS on disk offering change

* Remove iops & bandwidth limits before copying template

* minor refactor

* Handle diskOfferingDetails

* Fixup
2024-04-11 16:59:57 +05:30
Marcus Sorensen 227dc5e86a
Add ability to set cpu.threadspercore similar to existing cpu.corespersocket (#411)
* Add ability to set cpu.threadspercore similar to existing cpu.corespersocket

* Add license to new test file

* Add tests to handle some edge cases

* Add some edge test cases to CPU topology

* Rework logic on KVM CPU topology, handle more cases

* Add more test cases

* Add more test cases

* Update plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/LibvirtComputingResource.java

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>

* Added cpu.threadspercore detail in listDetailOptions response (for KVM hypervisor)

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-04-10 09:58:03 -06:00
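For 227dc5e86a, the relationship between the vCPU count and the two details can be shown with a small calculation: sockets x cores per socket x threads per core must equal the number of vCPUs. The Python sketch below is a simplified assumption of that arithmetic; the edge-case handling in LibvirtComputingResource mentioned in the bullets may well differ, and `cpu_topology` is a hypothetical name.

```python
from typing import Optional, Tuple


def cpu_topology(vcpus: int, cores_per_socket: int = 1,
                 threads_per_core: int = 1) -> Optional[Tuple[int, int, int]]:
    """Return (sockets, cores, threads), or None if the values do not fit.

    None stands for "fall back to the default topology" in this sketch.
    """
    if vcpus < 1 or cores_per_socket < 1 or threads_per_core < 1:
        return None
    per_socket = cores_per_socket * threads_per_core
    if vcpus % per_socket != 0:
        return None  # vCPUs cannot be split evenly across whole sockets
    return (vcpus // per_socket, cores_per_socket, threads_per_core)
```

With 8 vCPUs, `cpu.corespersocket=2` and `cpu.threadspercore=2` this gives 2 sockets; 6 vCPUs with 4 cores and 2 threads does not divide evenly and returns None.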
Marcus Sorensen f896586925
Update version to 4.18.1.1 (#417)
* Update version to 4.18.1.1

* Update changelog

* Update changelog

* Update changelog

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
2024-04-08 09:27:57 -06:00
Rohit Yadav 0c23820c7c
Merge pull request #414 from shapeblue/security-backport418
Backport upstream security fixes to apple-base418
2024-04-03 19:58:28 +05:30
Vishesh c09cea5d86
Fix: check root disk offering tagged limits during VM deploy (#415) 2024-04-03 18:41:24 +05:30
Vishesh 5137c196c2
HypervisorType as a class (#393)
* HypervisorType as a class

* Fixup

* fixup

* Add missing annotation

* Resolve comments

* Handle parallels typo
2024-04-02 17:35:16 +05:30
Abhishek Kumar 292c0eb291 fix test failure
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-04-01 14:05:03 +05:30
Abhishek Kumar 996ae9a959 engine-storage: control download redirection
Add a global setting to control whether redirection is allowed while
downloading templates and volumes

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-04-01 09:23:17 +05:30
dahn cfaac2a67e api: client verification in servlet
This introduces new global settings to handle how client address checks
are handled by the API layer:

proxy.header.verify: enables/disables checking of ipaddresses from a
                     proxy set header
proxy.header.names: a list of names to check for allowed ipaddresses
                    from a proxy set header.
proxy.cidr: a list of cidrs for which "proxy.header.names" are
            honoured if the "Remote_Addr" is in this list.

(cherry picked from commit b65546636d)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-03-31 22:03:04 +05:30
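How the three settings from cfaac2a67e interact can be sketched in Python with the standard `ipaddress` module. This is an interpretation of the commit message, not the servlet code: `client_address` is a hypothetical function, and ignoring the headers entirely when verification is disabled is an assumption.

```python
import ipaddress
from typing import Iterable, Mapping


def client_address(remote_addr: str, headers: Mapping[str, str], verify: bool,
                   header_names: Iterable[str], proxy_cidrs: Iterable[str]) -> str:
    """Pick the address to treat as the client's for address checks."""
    if not verify:                      # proxy.header.verify disabled (assumed: ignore headers)
        return remote_addr
    remote = ipaddress.ip_address(remote_addr)
    trusted = any(remote in ipaddress.ip_network(cidr, strict=False)
                  for cidr in proxy_cidrs)          # proxy.cidr
    if not trusted:
        return remote_addr              # sender is not a known proxy: ignore its headers
    for name in header_names:           # proxy.header.names, in order
        value = headers.get(name)
        if not value:
            continue
        candidate = value.split(",")[0].strip()     # first entry of a comma-separated list
        try:
            ipaddress.ip_address(candidate)
        except ValueError:
            continue                    # not an IP address: try the next header
        return candidate
    return remote_addr
```

The point of the CIDR check is that a forwarded-for style header is only believed when the connection itself comes from a trusted proxy; otherwise any client could claim an arbitrary address.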
Wei Zhou 2b93886934 server: fix security issues caused by extraconfig on KVM
- Move allow.additional.vm.configuration.list.kvm from Global to Account setting
- Disallow VM details starting with "extraconfig" when deploying VMs
- Skip changes to VM details starting with "extraconfig" when updating VM settings
- Allow only extraconfig for DPDK in service offering details
- Check if extraconfig values in VM details are supported when starting VMs
- Check if extraconfig values in service offering details are supported when starting VMs
- Disallow adding/editing/updating VM settings for extraconfig in the UI

(cherry picked from commit e6e4fe16fb)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-03-31 22:02:26 +05:30
Vishesh 98d021faed
server: skip password policies check on empty password (#8370) (#396)
This PR changes the password.policy.regex default value to empty. With an empty value, the regex is skipped during the password policy check; it is only checked when the configuration is set to something other than a blank string.
This way, when creating a user on org.apache.cloudstack.ldap.LdapAuthenticator#authenticate() we won't get an error by default, as an empty value for the password is passed.

Co-authored-by: João Jandre <48719461+JoaoJandre@users.noreply.github.com>
2024-03-27 13:44:28 +05:30
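The rule described for 98d021faed, skip the regex when `password.policy.regex` is blank, fits in a few lines. A Python sketch, where `password_matches_policy` is a hypothetical name and matching the whole password (rather than a prefix) is an assumption:

```python
import re


def password_matches_policy(password: str, policy_regex: str) -> bool:
    """Return True if the password passes the regex part of the policy."""
    if not policy_regex.strip():
        return True  # blank setting: the regex check is skipped entirely
    return re.fullmatch(policy_regex, password) is not None
```

With the default empty regex, the empty password passed during LDAP user creation no longer fails this check.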
Vishesh 5b3a81c2a3
Fix failing test (#400) 2024-03-15 16:19:39 +05:30
Vishesh 4c6c8216d5
Use join instead of views (#365)
* Use join instead of views for filtering volumes

* Use join instead of views for filtering events

* Use join instead of views for filtering accounts

* Use join instead of views for filtering domains

* Use join instead of views for filtering hosts

* Use join instead of views for filtering storage pools

* Use join instead of views for filtering service offerings

* Use join instead of views for filtering disk offerings

* Remove unused code

* Fix unit test

* Use disk_offering instead of disk_offering_view in service_offering_view

* Fixup

* Fix listing of diskoffering & serviceoffering

* Use constants instead of strings

* Make changes to prevent sql injection

* Remove commented code

* Prevent n+1 queries for template's response

* remove unused import

* refactor some code

* Add missing check for service offering's join with disk offering

* Fix n+1 queries for storage pool metrics

* Remove n+1 queries from list accounts

* Remove unused imports

* remove todo

* Remove unused import

* Fixup query generation for nested joins

* Fixups

* Fix DB exception on ClientPreparedStatement

* events,alerts: Add missing indexes (#366)

* Fixup
2024-03-14 17:49:35 +05:30