Commit Graph

654 Commits

Author SHA1 Message Date
Abhishek Kumar 52c483d91d revert to constant backoff
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-11 12:19:50 +05:30
Abhishek Kumar 3ebc6c3d1e agent ssl handshake timeout
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-09 17:19:18 +05:30
Abhishek Kumar a74403bd97 revert logging changes
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-10-04 16:49:39 +05:30
Abhishek Kumar adae7c88b8 changes
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-27 13:00:38 +05:30
Abhishek Kumar fa50740514 wip
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-26 12:10:38 +05:30
Abhishek Kumar 1652086ff9 wip changes for agent reconnection
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2024-09-25 18:33:55 +05:30
mprokopchuk e0d6066935 Bumped pom version to 4.18.1.2 (to add migration SQL script) 2024-08-15 17:55:00 -07:00
Vishesh c6d35b31ca
Log stdout to a file (#399)
* Log stdout to a file

* Add logrotation

* Fixup permissions in log file

* Remove info logs in stdout

* Change output file names

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>

* Fix logrotate config

* Disable logging to stdout

---------

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2024-07-03 20:51:20 +05:30
Wei Zhou e065c93c3f
Apple FR76: Implicit host tags (#427)
* Merge two HostTagVO and HostTagDaoImpl

* Apple FR76: dynamic host tags

* Revert "Apple FR76: dynamic host tags"

This reverts commit 01b93a873f167018c4fafd0744c0de07ae4de4ed.

* Apple FR76: Implicit host tags

* Apple FR76: address Abhishek's comments

* Apple FR76: move updateImplicitTags

* Apple FR76: add since to other two responses

* Update 8929: add unit test in LibvirtComputingResourceTest

* Update variable names

* Update FR76: add explicithosttags in response

* Update FR76 UI: Update explicit host tags

* Update 8929: remove host tags and change labels on UI

* Update: ui polish for host tags

* fix since in responses

* Update 8929: fix UI error if no host tags
2024-05-30 17:20:37 +05:30
Marcus Sorensen f896586925
Update version to 4.18.1.1 (#417)
* Update version to 4.18.1.1

* Update changelog

* Update changelog

* Update changelog

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
2024-04-08 09:27:57 -06:00
Marcus Sorensen ac4b030759
Mark libvirt events experimental, add properties flag (#404)
Co-authored-by: Marcus Sorensen <mls@apple.com>
2024-04-03 08:03:26 -06:00
Vishesh f30e07b312
Fix host stuck in connecting state (#375)
* Fix host stuck in connecting state (#8502)

There are a lot of test failures due to test_vm_life_cycle.py in multiple PRs due to host not available for migration of VMs.
#8438 (comment)
#8433 (comment)
#7344 (comment)

While debugging I noticed that the hosts get stuck in Connecting state because MS is waiting for a response of the ReadyCommand from the agent. Since we take a lock on connection and disconnection, restarting the agent doesn't work. To fix this, we have to restart the MS or wait for ~1 hour (default timeout).

On the agent side, it gets stuck waiting for a response from the Script execution.

To reproduce, run smoke/test_vm_life_cycle.py (TestSecuredVmMigration test class to be specific). Once the tests are complete, you will notice that some hosts are stuck in Connecting state. And restarting the agent fails due to the named lock. Locks on DB can be checked using the below query.

SELECT *
FROM performance_schema.metadata_locks
INNER JOIN performance_schema.threads ON THREAD_ID = OWNER_THREAD_ID
WHERE PROCESSLIST_ID <> CONNECTION_ID() \G;

This PR adds a wait for the ready command and a timeout to the Script execution to ensure that the thread doesn't get stuck and the named lock from database is released.

* Externalise a few timeouts & fix timeout for hostSupportsUefi in libvirt ready command wrapper (#8547)

This PR fixes bug introduced in #8502. Timeout for script execution was set to 60 ms instead of 60s which resulted in host not getting UEFI enabled. This is a blocker for 4.19 release.

We do this by introducing a new agent parameter `agent.script.timeout` (default - 60 seconds) to use as a timeout for the script checking host's UEFI status.

We also externalize the timeout for the ReadyCommand by introducing a new global setting `ready.command.wait` (default - 60 seconds).

For ModifyStoragePoolCommand, we don't externalize the timeout to avoid confusion for the user. Since, the required timeout can vary depending on the provider in use and we are only setting the wait for default host listener for now. Instead, we reuse the global `wait` setting by dividing it by `5` making the default value of 6 minutes (1800/5 = 360s) for ModifyStoragePoolCommand.

Note: the actual time, the MS waits is twice the wait set for a Command. Check reference code below.
19250403e6/engine/orchestration/src/main/java/com/cloud/agent/manager/AgentAttache.java (L406-L442)

* fixup
2024-02-21 13:44:53 +05:30
Marcus Sorensen f49265c14c
Fix missing code from backport of 4.16 version of dom0 CPU reserve (#374)
Co-authored-by: Marcus Sorensen <mls@apple.com>
2024-02-05 11:53:45 +05:30
Vishesh a7c7a33131
Apple base418 agent lock during reconnect (#340)
Co-authored-by: Marcus Sorensen <mls@apple.com>
2023-11-03 16:56:15 +01:00
Marcus Sorensen c01ed90569 Trigger out of band VM state update via libvirt event when VM stops (#7963)
* Trigger out of band VM state update via libvirt event when VM stops

* Add License headers, refactor nested try

---------

Co-authored-by: Marcus Sorensen <mls@apple.com>
(cherry picked from commit 3694667f50)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-09-28 12:22:15 +05:30
Marcus Sorensen 33a5c159a3 KVM Agent config to reserve dom0 CPUs (#326)
Cherry-picked from 1a722edfce6234319c5eaecf3cd0076fa9690267

Co-authored-by: Marcus Sorensen <mls@apple.com>
2023-09-27 17:15:40 +05:30
Nicolas Vazquez 20952b4842 Auto Enable/Disable KVM hosts (#7170)
* Auto Enable Disable KVM hosts

* Improve health check result

* Fix corner cases

* Script path refactor

* Fix sonar cloud reports

* Fix last code smells

* Add marvin tests

* Fix new line on agent.properties to prevent host add failures

* Send alert on auto-enable-disable and add annotations when the setting is enabled

* Address reviews

* Add a reason for enabling or disabling a host when the automatic feature is enabled

* Fix comment on the marvin test description

* Fix for disabling the feature if the admin has manually updated the host resource state before any health check result

(cherry picked from commit be66eb2a35)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2023-09-15 13:15:59 +05:30
Wei Zhou 4bdff06acd Updating pom.xml version numbers for release 4.18.1.0
Signed-off-by: Wei Zhou <weizhou@apache.org>
2023-09-07 08:50:50 +02:00
dahn 2752c49fa7
agent: get the right controll cidr (#7580)
Fixes: #7574
2023-07-07 22:57:58 +05:30
Nicolas Vazquez c733a23c90
Fix direct download URL checks (#7693)
This PR fixes the URL check for direct downloads, in the case of HTTPS URLs the certificates were not loaded into the SSL context
2023-07-06 13:47:13 +05:30
Marcus Sorensen bdd5363314
Qemu migration hook: check for source length before using element 0 (#7482)
Co-authored-by: Marcus Sorensen <mls@apple.com>
2023-05-08 11:59:42 +05:30
Daan Hoogland 05cda2729f Updating pom.xml version numbers for release 4.18.1.0-SNAPSHOT
Signed-off-by: Daan Hoogland <daan@onecht.net>
2023-03-15 19:38:14 +01:00
Daan Hoogland 0574087284 Updating pom.xml version numbers for release 4.18.0.0
Signed-off-by: Daan Hoogland <daan@onecht.net>
2023-03-11 09:35:41 +01:00
Nicolas Vazquez eac357cb77
kvm: Secure KVM VNC Console Access Using the CA Framework (#7015)
This PR allows securing the console access through CloudStack to the virtual machines running on KVM. The secure access is achieved through the generated certificates for the CA Framework in CloudStack, that provides mutual TLS connections between agents. These certificates are used to also secure the connection between the console proxies and the VNC ports for VM console access.

This feature is only supported on the KVM hypervisor

Design Document: https://cwiki.apache.org/confluence/display/CLOUDSTACK/Secure+KVM+VNC+connection+using+the+CA+framework
2023-01-27 17:22:06 +05:30
slavkap d288bb0c78
KVM support of iothreads and IO driver policy (#6909) 2023-01-25 12:34:05 +01:00
John Bampton 52c321a0c6
Fix spelling (#7087) 2023-01-16 10:56:07 +01:00
Paula Oliveira 0fe2e6950e
Improving code related to the Agent properties (#6348)
Co-authored-by: Paula Zomignani Oliveira <paula@scclouds.com.br>
Co-authored-by: João Jandre <48719461+JoaoJandre@users.noreply.github.com>
Co-authored-by: GutoVeronezi <daniel@scclouds.com.br>
2022-12-22 12:00:49 +01:00
José Flauzino 1843632c24
Fix memory stats for KVM (#6358)
Co-authored-by: joseflauzino <jose@scclouds.com.br>
2022-11-09 18:00:12 +01:00
Nicolas Vazquez b2fbe7bb12
console: Console access enhancements (#6577)
This PR creates a new API createConsoleAccess to create VM console URL allowing it to connect using other UI implementations. To avoid reply attacks, the console access is enhanced to use a one time token per session

New configuration added:
consoleproxy.extra.security.validation.enabled: Enable/disable extra security validation for console proxy using a token

Documentation PR: apache/cloudstack-documentation#284
2022-09-14 12:39:59 +05:30
Daan Hoogland a470f3353a Merge branch '4.17' 2022-07-05 09:11:45 +02:00
John Bampton 7d23a0a759
Fix spelling (#6272) 2022-07-05 09:08:53 +02:00
nvazquez 0bcc609f05
Updating pom.xml version numbers for release 4.18.0.0-SNAPSHOT
Signed-off-by: nvazquez <nicovazquez90@gmail.com>
2022-06-06 12:25:35 -03:00
nvazquez 038a669d6b
Updating pom.xml version numbers for release 4.17.1.0-SNAPSHOT
Signed-off-by: nvazquez <nicovazquez90@gmail.com>
2022-06-06 12:19:44 -03:00
nvazquez c56220fcf2
Updating pom.xml version numbers for release 4.17.0.0
Signed-off-by: nvazquez <nicovazquez90@gmail.com>
2022-05-31 14:33:47 -03:00
nvazquez 96594aec28
Merge branch '4.16' 2022-05-23 08:16:52 -03:00
Nicolas Vazquez dc975dff95
[KVM] Enable IOURING only when it is available on the host (#6399)
* [KVM] Disable IOURING by default on agents

* Refactor

* Remove agent property for iouring

* Restore property

* Refactor suse check and enable on ubuntu by default

* Refactor irrespective of guest OS

* Improvement

* Logs and new path

* Refactor condition to enable iouring

* Improve condition

* Refactor property check

* Improvement

* Doc comment

* Extend comment

* Move method

* Add log
2022-05-23 08:11:14 -03:00
Wei Zhou 8f39a049bb
agent: enable ssl only for kvm agent (not in system vms) (#6371)
* agent: enable ssl only for kvm agent (not in system vms)

* Revert "agent: enable ssl only for kvm agent (not in system vms)"

This reverts commit b2d76bad2e.

* Revert "KVM: Enable SSL if keystore exists (#6200)"

This reverts commit 4525f8c8e7.

* KVM: Enable SSL if keystore exists in LibvirtComputingResource.java
2022-05-12 07:01:55 -03:00
Wei Zhou 4525f8c8e7
KVM: Enable SSL if keystore exists (#6200)
* KVM: Enable SSL if keystore exists

* Update #6200: add logs if no passphrase or no keystore
2022-04-22 11:51:21 -03:00
Pearl Dsilva 830f3061bc
SystemVM optimizations (#5831)
* Support for live patching systemVMs and deprecating systemVM.iso. Includes:
- fix systemVM template version
- Include agent.zip, cloud-scripts.tgz to the commons package
- Support for live-patching systemVMs - CPVM, SSVM, Routers
- Fix Unit test
- Remove systemvm.iso dependency

* The following commit:
- refactors logic added to support SystemVM deployment on KVM
- Adds support to copy specific files (required for patching) to the hosts on Xenserver
- Modifies vmops method - createFileInDomr to take cleanup param
- Adds configuratble sleep param to CitrixResourceBase::connect() used to verify if telnet to specifc port is possible (if sleep is 0, then default to _sleep = 10000ms)
- Adds Command/Answer for patch systemVMs on XenServer/Xcp

* - Support to patch SystemVMs - VMWare
- Remove attaching systemvm.iso to systemVMs
- Modify / Refactor VMware start command to copy patch related files to the systemvms
- cleanup

* Commit comprises of:
- remove docker from systemvm template - use containerd as container runtime
- update create-k8s-binaries script to use ctr for all docker operations
- Update userdata sent to the k8s nodes
- update cksnode script, run during patching of the cks/k8s nodes

* Add ssh to k8s nodes details in the Access tab on the UI

* test

* Refactor ca/cert patching logic

* Commit comprises of the following changes:
- Use restart network/VPC API to patch routers
- use livePatch API support patching of only cpvm/ssvm
- add timeout to the keystore setup/import script

* remove all references of systemvm.iso

* Fix keystore-cert-import invocation + refactor cert timeout in CP/SS VMs

* fix script timeout

* Refactor cert patching for systemVMs + update keystore-cert-import script + patch-sysvms script + remove patchSysvmCommand from networkelementcommand

* remove commented code + change core user to cloud for cks nodes

* Update ownership of ssh directory

* NEED TO DISCUSS - add on the fly template conversion as an ExecStartPre action (systemd)

* Add UI changes + move changes from patch file to runcmd

* test: validate performance for template modification during seeding

* create vms folder in cloudstack-commons directory - debian rules

* remove logic for on the fly template convert + update k8s test

* fix syntax issue - causing issue with shared network tests

* Code cleanup

* refactor patching logic - certs

* move logic of fixing rootdiskcontroller from upgrade to kubernetes service

* add livepatch option to restart network & vpc

* smooth upgrade of cks clusters

* Support for live patching systemVMs and deprecating systemVM.iso. Includes:
- fix systemVM template version
- Include agent.zip, cloud-scripts.tgz to the commons package
- Support for live-patching systemVMs - CPVM, SSVM, Routers
- Fix Unit test
- Remove systemvm.iso dependency

* The following commit:
- refactors logic added to support SystemVM deployment on KVM
- Adds support to copy specific files (required for patching) to the hosts on Xenserver
- Modifies vmops method - createFileInDomr to take cleanup param
- Adds configuratble sleep param to CitrixResourceBase::connect() used to verify if telnet to specifc port is possible (if sleep is 0, then default to _sleep = 10000ms)
- Adds Command/Answer for patch systemVMs on XenServer/Xcp

* - Support to patch SystemVMs - VMWare
- Remove attaching systemvm.iso to systemVMs
- Modify / Refactor VMware start command to copy patch related files to the systemvms
- cleanup

* Commit comprises of:
- remove docker from systemvm template - use containerd as container runtime
- update create-k8s-binaries script to use ctr for all docker operations
- Update userdata sent to the k8s nodes
- update cksnode script, run during patching of the cks/k8s nodes

* Add ssh to k8s nodes details in the Access tab on the UI

* test

* Refactor ca/cert patching logic

* Commit comprises of the following changes:
- Use restart network/VPC API to patch routers
- use livePatch API support patching of only cpvm/ssvm
- add timeout to the keystore setup/import script

* remove all references of systemvm.iso

* Fix keystore-cert-import invocation + refactor cert timeout in CP/SS VMs

* fix script timeout

* Refactor cert patching for systemVMs + update keystore-cert-import script + patch-sysvms script + remove patchSysvmCommand from networkelementcommand

* remove commented code + change core user to cloud for cks nodes

* Update ownership of ssh directory

* NEED TO DISCUSS - add on the fly template conversion as an ExecStartPre action (systemd)

* Add UI changes + move changes from patch file to runcmd

* test: validate performance for template modification during seeding

* create vms folder in cloudstack-commons directory - debian rules

* remove logic for on the fly template convert + update k8s test

* fix syntax issue - causing issue with shared network tests

* Code cleanup

* add cgroup config for containerd

* add systemd config for kubelet

* add additional info during image registry config

* address comments

* add temp links of download.cloudstack.org

* address part of the comments

* address comments

* update containerd config - as version has upgraded to 1.5 from 1.4.12 in 4.17.0

* address comments - simplify

* fix vue3 related icon changes

* allow network commands when router template version is lower but is patched

* add internal LB to the list of routers to be patched on network restart with live patch

* add unit tests for API param validations and new helper utilities - file scp & checksum validations

* perform patching only for non-user i.e., system VMs

* add test to validate params

* remove unused import

* add column to domain_router to display software version and support networkrestart with livePatch from router view

* Requires upgrade column to consider package (cloud-scripts) checksum to identify if true/false

* use router software version instead of checksum

* show N/A if no software version reported i.e., in upgraded envs

* fix deb failure

* update pom to official links of systemVM template
2022-04-21 13:40:19 -03:00
nvazquez 65dc2df896
Merge branch '4.16' 2022-04-13 08:45:48 -03:00
slavkap 42a92dcdd3
Extract the IO_URING configuration into the agent.properties (#6253)
When using advanced virtualization the IO Driver is not supported. The
admin will decide if want to enable/disable this configuration from
agent.properties file. The default value is true
2022-04-13 08:43:35 -03:00
Nicolas Vazquez 2f2d6cbe38
KVM: Enhance CPU speed detection on hosts (#6175)
* KVM: Enhance CPU speed detection on hosts

* Update agent/conf/agent.properties

Co-authored-by: dahn <daan.hoogland@gmail.com>

Co-authored-by: dahn <daan.hoogland@gmail.com>
2022-04-01 12:12:34 -03:00
John Bampton 15937369fe
Fix spelling (#6161)
* Fix spelling

* Fix spelling
2022-03-29 00:21:07 -03:00
John Bampton 6401c850b7
Fix spelling (#6064)
* Fix spelling

- `interupted` to `interrupted`
- `paramter` to `parameter`

* Fix more typos
2022-03-08 13:02:35 -03:00
Suresh Kumar Anaparti bc70535ee5
Updating pom.xml version numbers for release 4.16.2.0-SNAPSHOT
Signed-off-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2022-03-03 18:15:33 +05:30
Suresh Kumar Anaparti cad9332082
Updating pom.xml version numbers for release 4.16.1.0
Signed-off-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2022-02-25 19:01:16 +05:30
Suresh Kumar Anaparti b50542a11c
Merge branch '4.16' into main 2022-02-15 19:26:04 +05:30
Pearl Dsilva e0a5df50ce
CKS Enhancements and SystemVM template upgrade improvements (#5863)
* This PR/commit comprises of the following:
- Support to fallback on the older systemVM template in case of no change in template across ACS versions
- Update core user to cloud in CKS
- Display details of accessing CKS nodes in the UI - K8s Access tab
- Update systemvm template from debian 11 to debian 11.2
- Update letsencrypt cert
- Remove docker dependency as from ACS 4.16 onward k8s has deprecated support for docker - use containerd as container runtime

* support for private registry - containerd

* Enable updating template type (only) for system owned templates via UI

* edit indents

* Address comments and move cmd from patch file to cloud-init runcmd

* temporary change

* update k8s test to use k8s version 1.21.5 (instead of 1.21.3 - due to https://github.com/kubernetes/kubernetes/pull/104530)

* support for private registry - containerd

* Enable updating template type (only) for system owned templates via UI

* smooth upgrade of cks clusters

* update pom file with temp download.cloudstack.org testing links

* fix pom

* add cgroup config for containerd

* add systemd config for kubelet

* add additional info during image registry config

* update to official links
2022-02-15 18:27:14 +05:30
Daniel Augusto Veronezi Salvador b4aabadc4d
Replace string libraries with org.apache.commons.lang3.StringUtils (#5386)
* Replace google lib for lang3 and adjust methods calls

* Replace string libs by lang3

* Prohibit others string libs

Co-authored-by: GutoVeronezi <daniel@scclouds.com.br>
2021-11-18 13:41:48 +05:30
nicolas 3f79436840
Updating pom.xml version numbers for release 4.17.0.0-SNAPSHOT
Signed-off-by: nicolas <nicovazquez90@gmail.com>
2021-11-09 22:55:52 -03:00