Commit Graph

36102 Commits

Author SHA1 Message Date
Rohit Yadav 3883dbe9a0 schema: force index on user_view_view
In environments with a large number of shared networks or IP addresses
(10k+), this view causes millions of table scans on the user_ip_address
table, which severely slows down listVM APIs etc.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:39 +05:30
Rohit Yadav f57f244863 schema: speed up network offering created table scans
Using a function in the view was causing too many scans, as many rows as
the number of domains and zones. This reduces table scans where left
joins happen using sub-queries. The effect is slightly faster create
network API performance.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:39 +05:30
Rohit Yadav c3867a941f more fixmes and todos
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:39 +05:30
Rohit Yadav 7a7f1e2b6e FIXME/TODO: CPU and DB hotspot found
Found CPU and DB hotspots in the code that handles agent ping commands;
these add idle load when there is a high number of hosts. By design,
there isn't any quick win here. However, the power sync report/handling
could be improved so that it doesn't need to kick in for every ping
command received.

A few more areas are marked in the codebase.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:39 +05:30
Rohit Yadav 5603bf9c1a engine: optimise CPU and DB hotspot to return enabled hypervisors in the zone
This refactors the ResourceManager::listAvailHypervisorInZone method,
which should return unique hypervisors for which existing hosts are Up
and processed. We can approximate this by assuming that those hosts
would have set up their hypervisor-specific systemvm templates. In a
given environment there wouldn't be thousands of systemvm templates, but
there can be thousands of hosts. So, instead of scanning the entire
cloud.host table, we can make a calculated guess by returning the unique
hypervisors of systemvm templates which are ready. This method is used in
::processConnect() when an agent joins, so the change speeds up that
handling.
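
The approximation described above can be sketched roughly as follows. This is a minimal illustration only: `TemplateRow` and `availableHypervisors` are hypothetical names, not the classes or methods touched by this commit, and the in-memory list stands in for a query over ready systemvm templates.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a systemvm template row; not CloudStack's actual VO.
class TemplateRow {
    final String hypervisorType;
    final boolean ready;

    TemplateRow(String hypervisorType, boolean ready) {
        this.hypervisorType = hypervisorType;
        this.ready = ready;
    }
}

class HypervisorGuess {
    // Returns the unique hypervisor types of templates that are ready,
    // in first-seen order, without looking at any host rows.
    static Set<String> availableHypervisors(List<TemplateRow> systemVmTemplates) {
        Set<String> result = new LinkedHashSet<>();
        for (TemplateRow t : systemVmTemplates) {
            if (t.ready) {
                result.add(t.hypervisorType);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<TemplateRow> templates = Arrays.asList(
                new TemplateRow("KVM", true),
                new TemplateRow("KVM", true),
                new TemplateRow("VMware", false),
                new TemplateRow("XenServer", true));
        System.out.println(availableHypervisors(templates));
    }
}
```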

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:39 +05:30
Rohit Yadav 8a320b807d engine/schema: cluster dao method query optimisation
Replace list.size() by doing getCount() instead.
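
A rough sketch of why counting is cheaper than listing, using a hypothetical in-memory `ClusterStore` rather than the real DAO (whose method signatures may differ). The `rowsMaterialised` counter is only there to make the cost visible.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for a DAO; names are illustrative only.
class ClusterStore {
    private final List<String> clusters = new ArrayList<>();
    int rowsMaterialised = 0; // how many rows were copied out to callers

    void add(String name) {
        clusters.add(name);
    }

    // Copies every row out; calling list.size() on the result pays for all of them.
    List<String> listAll() {
        rowsMaterialised += clusters.size();
        return new ArrayList<>(clusters);
    }

    // Returns only the number of rows; a SQL-backed DAO would issue a COUNT query here.
    int getCount() {
        return clusters.size();
    }
}

class CountDemo {
    public static void main(String[] args) {
        ClusterStore store = new ClusterStore();
        store.add("c1");
        store.add("c2");
        store.add("c3");
        int viaCount = store.getCount();
        int viaList = store.listAll().size();
        System.out.println(viaCount + " " + viaList + " rows copied: " + store.rowsMaterialised);
    }
}
```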

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:39 +05:30
Rohit Yadav 696927455f framework/db: use HikariCP instead of dbcp2
Replaces the dbcp2 connection pool library with the more performant
HikariCP. With this change, unit tests are failing but the build is
passing.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:39 +05:30
Rohit Yadav f21a00b2de framework/db: use lightweight-ping
As per the docs, Connector/J treats a query prefixed with /* ping */
(for example before SELECT 1) as a lightweight application ping to the
server:
https://dev.mysql.com/doc/connector-j/en/connector-j-usagenotes-j2ee-concepts-connection-pooling.html

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:39 +05:30
Rohit Yadav 1c02166d29 framework/db: don't use validation query as connector is JDBC4 compliant
Per docs, if the mysql connector is JDBC4 compliant then it should use
the Connection.isValid API to test a connection.
(https://docs.oracle.com/javase/8/docs/api/java/sql/Connection.html#isValid-int-)

This would significantly reduce query lag and improve API throughput, as
for every SQL query one or two SELECT 1 are performed every time a
Connection is given to application logic.

This should only be accepted when the driver is JDBC4 compliant.
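
A minimal sketch of checking a connection with `Connection.isValid(int)` instead of a validation query. The `ConnectionCheck` class and the proxy-based `fakeConnection` helper are hypothetical, added only so the example runs without a database; they are not part of this change.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;

class ConnectionCheck {
    // Asks the JDBC 4 driver whether the connection is still usable,
    // rather than running a validation query such as SELECT 1.
    static boolean isUsable(Connection conn, int timeoutSeconds) {
        try {
            return conn != null && conn.isValid(timeoutSeconds);
        } catch (SQLException e) {
            // Per the JDBC API, isValid throws if the timeout is negative.
            return false;
        }
    }

    // Fake connection for illustration only; it answers isValid and nothing else.
    static Connection fakeConnection(boolean valid) {
        return (Connection) Proxy.newProxyInstance(
                ConnectionCheck.class.getClassLoader(),
                new Class<?>[] {Connection.class},
                (proxy, method, args) ->
                        "isValid".equals(method.getName()) ? Boolean.valueOf(valid) : null);
    }

    public static void main(String[] args) {
        System.out.println(isUsable(fakeConnection(true), 1));
        System.out.println(isUsable(fakeConnection(false), 1));
        System.out.println(isUsable(null, 1));
    }
}
```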

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:39 +05:30
Rohit Yadav 90afcf2f85 metrics: optimise code and query to get summed cpu sockets
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:38 +05:30
Rohit Yadav 35462dc96d server: fix full table scanning for listHosts API
The type parameter isn't a keyword, but a simple listHosts API call with
type=Routing runs SELECT COUNT(*) FROM host WHERE host.type LIKE
'%Routing' AND host.removed IS NULL; ... which causes an unnecessary
full table scan.
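
To illustrate the difference, here is a small sketch that builds both forms of the query. The `HostTypeFilter` class is hypothetical and the exact-match form is only one plausible way to avoid the scan (assuming an index on host.type exists); it is not necessarily how the fix is implemented.

```java
class HostTypeFilter {
    // Builds the count query for a host type filter.
    // exactMatch=false mirrors the slow query quoted in the commit message.
    static String countQuery(boolean exactMatch) {
        String typePredicate = exactMatch
                ? "host.type = ?"      // equality can use an index on host.type, if one exists
                : "host.type LIKE ?";  // bound to '%Routing'; a leading wildcard cannot use a B-tree index
        return "SELECT COUNT(*) FROM host WHERE " + typePredicate + " AND host.removed IS NULL";
    }

    // Value to bind to the single placeholder in the query above.
    static String bindValue(String type, boolean exactMatch) {
        return exactMatch ? type : "%" + type;
    }

    public static void main(String[] args) {
        System.out.println(countQuery(false) + "  [" + bindValue("Routing", false) + "]");
        System.out.println(countQuery(true) + "  [" + bindValue("Routing", true) + "]");
    }
}
```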

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:38 +05:30
Rohit Yadav 076a712fbe schema: add indexes that save DB from too many scans
Speeds up several APIs, esp host and VM listing APIs and VM deployment

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:38 +05:30
Rohit Yadav 54accfdc0a schema: add missing index to reduce table scans
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:38 +05:30
Rohit Yadav 5750e56be5 server: improve DB optimisation, indexing and reduce table scans
In this example commit, we look at:
- Adding missing indexes to speed up queries
- Reduce table scans by optimising sql query and using indexes
- Optimising sql queries to remove duplicate rows (use of distinct)
- Reduce CPU and DB load by using jprofiler to optimise both sql query
  and CPU hotspots

server: reduce CPU and DB load caused by systemvm ::isZoneReady()
For this case, the sql query was causing a large number of table scans
only to determine if the zone has any available pool+host to launch
systemvms. Accordingly, the code and sql queries, along with index
optimisations, were changed to lower both DB scans and mgmt server CPU
load.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:38 +05:30
Rohit Yadav 3a0927a568 server: trace logs for security groups listener
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:38 +05:30
Rohit Yadav 607911562e server: fix NPE, compare known versus unknown in equals()
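The pattern named in the title can be sketched as below: call equals() on the value that is known to be non-null, so that a null argument yields false instead of an NPE. The class and constant names are hypothetical and do not come from the changed code.

```java
class NullSafeEquals {
    static final String ROUTING = "Routing";

    // Unsafe form: throws NullPointerException when type is null.
    static boolean unsafeIsRouting(String type) {
        return type.equals(ROUTING);
    }

    // Safe form: the known constant is the receiver, so a null argument just yields false.
    static boolean isRouting(String type) {
        return ROUTING.equals(type);
    }

    public static void main(String[] args) {
        System.out.println(isRouting("Routing"));
        System.out.println(isRouting(null));
        try {
            unsafeIsRouting(null);
        } catch (NullPointerException e) {
            System.out.println("unsafe form threw NPE");
        }
    }
}
```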
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:38 +05:30
Rohit Yadav 807cd6a830 metrics: speed up list zones and cluster metrics APIs
Also add a flag to disable on-the-fly metrics computation when the
list metrics APIs for zones and clusters are called.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:38 +05:30
Rohit Yadav 72b841567e ui: add disconnected hosts filter and improve admin dashboard
Adds disconnected as a host filter in the UI.
Improves the capacity dashboard for admins in large environments.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:38 +05:30
Rohit Yadav 5484d3c7e6 orchestration: optimise vm list fetching excluding those reported
This optimises the sql query and iterator to simply return the list of
VMs excluding those in the received report.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:38 +05:30
Rohit Yadav de82aa8e91 engine/orchestration: wrap db txn in try-with, only fetch id
Optimises a DB query that seems to run for every Ping command, where
all columns are fetched but only the `id` column is used.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:38 +05:30
Rohit Yadav c01aad6ba8 server: count hosts rather than get all hosts in capacity scans
This refactors hotspot code to fetch just the count of hosts rather than
all the host VOs for a zone during capacity scans for systemvms.
This reduces CPU and DB load in really large (10k+ hosts) environments.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:38 +05:30
Rohit Yadav 2a48d71909 server: don't go into O(n^2) loop for non-XenServer hosts
The logic was introduced in https://github.com/apache/cloudstack/pull/1403;
this change gates it so that it only runs for XenServer. The specific
code is only applicable to XenServer and SolidFire
(https://youtu.be/YQ3pBeL-WaA?si=ed_gT_A8lZYJiEh).

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:22:38 +05:30
Rohit Yadav 47163df2ff
framework/config: make logic in ::value() defensive (#449)
This adds a null check on s_depot.global(), which can cause an NPE in
unit tests where s_depot is not null but the underlying config dao is
null (not mocked or initialised), so that `s_depot.global()` returns
null.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-05-22 20:20:37 +05:30
Vishesh c3eba5e213
Fix exceeding of resource limits with powerflex (#443)
* Fix exceeding of resource limits with powerflex

* Fix for volume prepare during VM start

* resolve comments

* Add e2e tests

* Fixup

* Update e2e tests

* minor refactoring

* refactoring

* fixup

---------

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-05-08 20:54:54 +05:30
Vishesh 2f4cea6dca
Fix message publish in transaction (#438)
* Fix message publish in transaction

* Resolve comments
2024-05-07 13:27:19 +05:30
Vishesh 04a589d013
Fixup e2e test_restore_vm (#445)
* Fixup e2e test_restore_vm

* Fix template's size attribute

* Resolve comments
2024-05-07 12:59:42 +05:30
Vishesh 7fae1fc747
Fix restore VM with allocated root disk (#441)
* Fix restore VM with allocated root disk

* Add e2e test for restore vm

* Add more checks for e2e test
2024-04-29 12:18:55 +05:30
Vishesh 9ab786c18a
Fix: Update rootdisksize detail on restore VM (#440)
* Fix: Update rootdisksize detail on restore VM

* Update server/src/main/java/com/cloud/vm/UserVmManagerImpl.java

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>

* minor fixup

---------

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-04-29 12:14:44 +05:30
Vishesh 1b54edd9de
Fix resource limit checks and increment/decrements for different operations (#430)
* Fix resource limit checks and increment/decrements for different operations

* Fixup

* More fixups

* fixup

* Refactor code

* Resolve comments

* Some minor code refactoring

* Fixup

* fixup

* Fix method name

* Fixup

* Fixup listing
2024-04-24 17:56:33 +05:30
Vishesh c21b6d8b52
Update volume's passphrase to null if diskOffering doesn't support encryption (#428)
2024-04-23 09:46:20 -06:00
Vishesh 93e66c52dc
Fix null pointer exception in restore VM (#431)
2024-04-23 09:41:39 -06:00
Marcus Sorensen e630d7afea
Update netty version for compatibility/staying current (#433)
Co-authored-by: Marcus Sorensen <mls@apple.com>
2024-04-19 21:20:18 +05:30
Vishesh 1b52bebd08
Fix error message for checkVolume command (#409)
2024-04-17 17:27:08 +05:30
Marcus Sorensen 3a058f3a18
Introduce scheduled executor wrapper with dynamic interval (#424)
* Introduce scheduled executor wrapper with dynamic interval

* Resolve comments

* Add validations

* Add validation for configkey

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Vishesh <vishesh92@gmail.com>
2024-04-17 14:24:31 +05:30
Vishesh fd9325a86d
Speed up resource count calculation (#425)
* Speed up resource count calculation

* server: remove supportedOwner from Resource.ResourceType (#7416)

* Refactor resource count calculation

* Start transaction for updateCountByDeltaForIds

---------

Co-authored-by: GaOrtiga <49285692+GaOrtiga@users.noreply.github.com>
2024-04-17 14:21:07 +05:30
Vishesh 26c1741af5
Fix listStoragePoolsMetricsCmd (#419)
2024-04-16 15:51:53 +05:30
Vishesh 1b7f33d0e1
This PR fixes the build issue on apple-base418 (#429)
2024-04-12 15:43:23 +02:00
Vishesh 0501678478
Allow overriding root diskoffering id & size, and expunge old root disk while restoring VM (#401)
* Allow overriding root diskoffering id & size while restoring VM

* UI changes

* Allow expunging of old disk while restoring a VM

* Apply suggestions from code review

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>

* resolve comments

* Fixup

* Rename some variables

* Resolve comments

* Address comments

* Duplicate volume's details while duplicating volume

* Allow setting IOPS for the new volume

* minor cleanup

* fixup

* Add checks for template size

* Replace strings for IOPS with constants

* Fix saveVolumeDetails method

* Fixup

* Fixup UI styling

---------

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-04-12 17:47:16 +05:30
Vishesh 8d0915c4c9
Change iops on offering change (#416)
* Change IOPS on disk offering change

* Remove iops & bandwidth limits before copying template

* minor refactor

* Handle diskOfferingDetails

* Fixup
2024-04-11 16:59:57 +05:30
Marcus Sorensen 227dc5e86a
Add ability to set cpu.threadspercore similar to existing cpu.corespersocket (#411)
* Add ability to set cpu.threadspercore similar to existing cpu.corespersocket

* Add license to new test file

* Add tests to handle some edge cases

* Add some edge test cases to CPU topology

* Rework logic on KVM CPU topology, handle more cases

* Add more test cases

* Add more test cases

* Update plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/LibvirtComputingResource.java

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>

* Added cpu.threadspercore detail in listDetailOptions response (for KVM hypervisor)

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-04-10 09:58:03 -06:00
Marcus Sorensen 631b0960f3
Allow kvm storage plugin to customize diskdef, add geometry (#402)
* Allow kvm storage plugin to customize diskdef, add geometry

* formatting update

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-04-08 09:28:59 -06:00
Marcus Sorensen f896586925
Update version to 4.18.1.1 (#417)
* Update version to 4.18.1.1

* Update changelog

* Update changelog

* Update changelog

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
2024-04-08 09:27:57 -06:00
Rohit Yadav 0c23820c7c
Merge pull request #414 from shapeblue/security-backport418
Backport upstream security fixes to apple-base418
2024-04-03 19:58:28 +05:30
Marcus Sorensen ac4b030759
Mark libvirt events experimental, add properties flag (#404)
Co-authored-by: Marcus Sorensen <mls@apple.com>
2024-04-03 08:03:26 -06:00
Vishesh c09cea5d86
Fix: check root disk offering tagged limits during VM deploy (#415)
2024-04-03 18:41:24 +05:30
Wei Zhou 21a03ae4da upgrade: fix upgrade from 4.18.1.0 to 4.18.2.0-SNAPSHOT (#7959)
The upgrade from 4.18.1.0 to 4.18.2.0-SNAPSHOT failed with this error:

```
2023-09-12 16:12:19,003 INFO  [c.c.u.DatabaseUpgradeChecker] (main:null) (logid:) DB version = 4.18.1.0 Code Version = 4.18.2.0
2023-09-12 16:12:19,004 INFO  [c.c.u.DatabaseUpgradeChecker] (main:null) (logid:) Database upgrade must be performed from 4.18.1.0 to 4.18.2.0
2023-09-12 16:12:19,036 DEBUG [c.c.u.DatabaseUpgradeChecker] (main:null) (logid:) Running upgrade Upgrade41800to41810 to upgrade from 4.18.0.0-4.18.1.0 to 4.18.1.0
...
2023-09-12 16:12:19,041 DEBUG [c.c.u.d.ScriptRunner] (main:null) (logid:) -- Schema upgrade from 4.18.0.0 to 4.18.1.0
...
2023-09-12 16:12:21,602 DEBUG [c.c.u.d.DatabaseAccessObject] (main:null) (logid:) Statement: CREATE INDEX i_cluster_details__name on cluster_details (name)
2023-09-12 16:12:21,663 DEBUG [c.c.u.d.DatabaseAccessObject] (main:null) (logid:) Created index i_cluster_details__name
2023-09-12 16:12:21,673 DEBUG [c.c.u.d.T.Transaction] (main:null) (logid:) Rolling back the transaction: Time = 2632 Name =  Upgrade; called by -TransactionLegacy.rollback:888-TransactionLegacy.removeUpTo:831-TransactionLegacy.close:655-TransactionContextInterceptor.invoke:36-ReflectiveMethodInvocation.proceed:175-ExposeInvocationInterceptor.invoke:97-ReflectiveMethodInvocation.proceed:186-JdkDynamicAopProxy.invoke:215-$Proxy30.persist:-1-DatabaseUpgradeChecker.upgrade:319-DatabaseUpgradeChecker.check:403-CloudStackExtendedLifeCycle.checkIntegrity:64
```

It succeeded with this change.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
(cherry picked from commit a88a47989369af204ea6ee8a5fd190311f43c74c)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2024-04-03 15:50:56 +05:30
Vishesh 8eee0ec213
Fix getRepair method in checkVolume command (#408)
* Fix getRepair method in checkVolume command

* Add License
2024-04-02 17:37:39 +05:30
Vishesh 5137c196c2
HypervisorType as a class (#393)
* HypervisorType as a class

* Fixup

* fixup

* Add missing annotation

* Resolve comments

* Handle parallels typo
2024-04-02 17:35:16 +05:30
Abhishek Kumar 292c0eb291 fix test failure
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-04-01 14:05:03 +05:30
Abhishek Kumar 996ae9a959 engine-storage: control download redirection
Add a global setting to control whether redirection is allowed while
downloading templates and volumes.

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-04-01 09:23:17 +05:30