When a Instance is (attempted to be) started in KVM Host the Agent
should not worry about the allocated memory on this host.
To make a proper judgement we need to take more into account:
- Memory Overcommit ratio
- Host reserved memory
- Host overcommit memory
The Management Server has all the information and the DeploymentPlanner
has to make the decision if a Instance should and can be started on a
Host, not the host itself.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
This fixes#2763 by moving a post cert-renewal class for kvm
plugin/hypervisor to src/main/java. The regression is due to change
in file-system layout due to maven standard refactoring on master and
issue was not caught during forward-merging of a PR from 4.11 branch.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Cleaup and code-formatting POM files
* Remove obsolete mycila license-maven-plugin
* Remove obsolete console-proxy/plugin project
* Move console-proxy-rdbconsole under console-proxy parent
* Use correct parent path for rdpconsole
* Order alphabetally items in setnextversion.sh
* Unifiy License header in POMs
* Alphabetic order of modules definition
* Extract all defined versions into parent pom
* Remove obsolete files: version-info.in, configure-info.in
* Remove redundant defaultGoal
* Remove useless checkstyle plugin from checkstyle project
* Order alphabetally items in pom.xml
* Add aditional SPACEs to fix debian build
* Don't execute checkstyle on parent projects
* Use UTF-8 encoding in building checkstyle project
* Extract plugin versions into properties
* Execute PMD plugin on all the projects with -Penablefindbugs
* Upgrade maven plugins to latest version
* Make sure to always look for apache parent pom from repository
* Fix incorrect version grep in debian packaging
* Fix rebase conflicts
* Fix rebase conflicts
* Remove PMD for now to be fixed on another PR
This is a new feature for CS that allows Admin users improved
troubleshooting of network issues in CloudStack hosted networks.
Description: For troubleshooting purposes, CloudStack administrators may wish to execute network utility commands remotely on system VMs, or request system VMs to ping/traceroute/arping to specific addresses over specific interfaces. An API command to provide such functionalities is being developed without altering any existing APIs. The targeted system VMs for this feature are the Virtual Router (VR), Secondary Storage VM (SSVM) and the Console Proxy VM (CPVM).
FS:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/CloudStack+Remote+Diagnostics+API
ML discussion:
https://markmail.org/message/xt7owmb2c6iw7tva
Fixes the version in pom etc. to be consistent with versioning pattern as X.Y.Z.0-SNAPSHOT after a minor release.
Signed-off-by: Khosrow Moossavi <khos2ow@gmail.com>
This ensure that fewer mount points are made on hosts for either
primary storagepools or secondary storagepools.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* 4.11:
comment on unencryption
ui: fix create VPC dialog box failure when zone is SG enabled (#2704)
CLOUDSTACK-10381: Fix password reset / reset ssh key with ConfigDrive
isisnot=
extra message
debug message
imports
update without decrypt doesn't work
set unsensitive attributes as not 'Secure'
remove old config artifacts from update path
Now the KVM agent checks whether a storage pool is mounted or not mounted before calling storagePoolCreateXML().
Signed-off-by: Kai Takahashi <k-takahashi@creationline.com>
Changes in PR #2508 have caused network restart to fail in a Nuage setup,
as the new VR takes the same IP as the old one, and the old VR is still running.
Nuage doesn't support multiple VM's having the same IP.
We delay provisioning the interfaces in VSD until the old VR interface is released.
This introduces a new global setting `vm.configdrive.primarypool.enabled` to toggle creation/hosting of config drive iso files on primary storage, the default will be false causing them to be hosted on secondary storage. The current support is limited from hypervisor resource side and in current implementation limited to `KVM` only. The next big change is that config drive is created at a temporary location by management server and shipped to either KVM or SSVM agent via cmd-answer pattern, the data of which is not logged in logs. This saves us from adding genisoimage dependency on cloudstack-agent pkg.
The APIs to reset ssh public key, password and user-data (via update VM API) requires that VM should be shutdown. Therefore, in the refactoring I removed the case of updation of existing ISO. If there are objections I'll re-put the strategy to detach+attach new config iso as a way of updation. In the refactored implementation, the folder name is changed to lower-cased configdrive. And during VM start, migration or shutdown/removal if primary storage is enable for use, the KVM agent will handle cleanup tasks otherwise SSVM agent will handle them.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
In 4.11.0, I added the ability to online migrate volumes from NFS to managed storage. This actually works for Ceph to managed storage in a private 4.8 branch, as well. I thought I had brought along all of the necessary code from that private 4.8 branch to make Ceph to managed storage functional in 4.11.0, but missed one piece (which is fixed by this PR).
* CLOUDSTACK-9184: Fixes#2631 VMware dvs portgroup autogrowth
This deprecates the vmware.ports.per.dvportgroup global setting.
The vSphere Auto Expand feature (introduced in vSphere 5.0) will take
care of dynamically increasing/decreasing the dvPorts when running out
of distributed ports . But in case of vSphere 4.1/4.0 (If used), as this
feature is not there, the new default value (=> 8) have an impact in the
existing deployments. Action item for vSphere 4.1/4.0: Admin should
modify the global configuration setting "vmware.ports.per.dvportgroup"
from 8 to any number based on their environment because the proposal
default value of 8 would be very less without auto expand feature in
general. The current default value of 256 may not need immediate
modification after deployment, but 8 would be very less which means
admin need to update immediately after upgrade.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
These Boolean-return methods are named "getXXX", but other Boolean-return methods are named "isXXX", such as the following two methods. They will return boolean values, rename them as "isXXX" should be more clear than "getXXX".
* CLOUDSTACK-10147 Disabled Xenserver Cluster can still deploy VM's. Added code to skip disabled clusters when selecting a host (#2442)
(cherry picked from commit c3488a51db)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* CLOUDSTACK-10318: Bug on sorting ACL rules list in chrome (#2478)
(cherry picked from commit 4412563f19)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* CLOUDSTACK-10284:Creating a snapshot from VM Snapshot generates error if hypervisor is not KVM.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* CLOUDSTACK-10221: Allow IPv6 when creating a Basic Network (#2397)
Since CloudStack 4.10 Basic Networking supports IPv6 and thus
should be allowed to be specified when creating a network.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
(cherry picked from commit 9733a10ecd)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* CLOUDSTACK-10214: Unable to remove local primary storage (#2390)
Allow admins to remove primary storage pool.
Cherry-picked from eba2e1d8a1
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* dateutil: constistency of tzdate input and output (#2392)
Signed-off-by: Yoan Blanc <yoan.blanc@exoscale.ch>
Signed-off-by: Daan Hoogland <daan.hoogland@shapeblue.com>
(cherry picked from commit 2ad5202823)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* CLOUDSTACK-10054:Volume download times out in 3600 seconds (#2244)
(cherry picked from commit bb607d07a9)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* When creating a new account (via domain admin) it is possible to select “root admin” as the role for the new user (#2606)
* create account with domain admin showing 'root admin' role
Domain admins should not be able to assign the role of root admin to new users. Therefore, the role ‘root admin’ (or any other of the same type) should not be visible to domain admins.
* License and formatting
* Break long sentence into multiple lines
* Fix wording of method 'getCurrentAccount'
* fix typo in variable name
* [CLOUDSTACK-10259] Missing float part of secondary storage data in listAccounts
* [CLOUDSTACK-9338] ACS not accounting resources of VMs with custom service offering
ACS is accounting the resources properly when deploying VMs with custom service offerings. However, there are other methods (such as updateResourceCount) that do not execute the resource accounting properly, and these methods update the resource count for an account in the database. Therefore, if a user deploys VMs with custom service offerings, and later this user calls the “updateResourceCount” method, it (the method) will only account for VMs with normal service offerings, and update this as the number of resources used by the account. This will result in a smaller number of resources to be accounted for the given account than the real used value. The problem becomes worse because if the user starts to delete these VMs, it is possible to reach negative values of resources allocated (breaking all of the resource limiting for accounts). This is a very serious attack vector for public cloud providers!
* [CLOUDSTACK-10230] User should not be able to use removed “Guest OS type” (#2404)
* [CLOUDSTACK-10230] User is able to change to “Guest OS type” that has been removed
Users are able to change the OS type of VMs to “Guest OS type” that has been removed. This becomes a security issue when we try to force users to use HVM VMs (Meltdown/Spectre thing). A removed “guest os type” should not be usable by any users in the cloud.
* Remove trailing lines that are breaking build due to checkstyle compliance
* Remove unused imports
* fix classes that were in the wrong folder structure
* Updates to capacity management
This adds and allows Ubuntu 18.04 to be used as KVM host. In addition,
on the UI when hypervisor version key is missing, this adds and display
the host os and version detail which is useful to show the KVM host
os and version.
When cache mode 'none' is used for empty cdrom drives, systemvms
and guest VMs fail to start on newer libvirtd such as Ubuntu bionic.
The fix is ensure that cachemode is not declared when drives are empty
upon starting of the VM. Similar issue logged at redhat here:
https://bugzilla.redhat.com/show_bug.cgi?id=1342999
The workaround is to ensure that we don't configure cachemode for
cdrom devices at all. This also fixes live VM migration issue.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This fixes SAML2 certificate encoding/decoding issue due to refactoring
regression introduced in 7ce54bf7a8 that
did not account for base64 based encoding/decoding. The changes
effectively restore the same logic as used in previous versions.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This ensures that certificate setup includes all the IP addresses (v4
and v6) when a (KVM) host is added to CloudStack. This fixes#2530.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* [CLOUDSTACK-5235] Force users to enter old password when updating password
* Formatting for checkstyle
* Remove an unused import in AccountManagerImpl
* Apply Nitin's suggestions
* Change 'oldPassword' to 'currentPassword'
* Second review of Resmo
* Fix typos found by Nitin
The three methods are named as "setXXX", actually, they are not simple setter or getter.
They are further renamed as "generateXXX" with dahn's comments.
These three methods are not direct getter or list.
They try to find the target objects with the related arguments.
So that, renaming them as "findXXX" should be more intuitive.
This adds support for XenServer 7.3 and 7.4, and XCP-ng 7.4 version as hypervisor hosts. Fixes#2523.
This also fixes the issue of 4.11 VRs stuck in starting for up-to 10mins, before they come up online.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* fix https://issues.apache.org/jira/browse/CLOUDSTACK-10356
* del patch file
* Update ResourceCountDaoImpl.java
* fix some format
* fix code
* fix error message in VolumeOrchestrator
* add check null stmt
* del import unuse class
* use BooleanUtils to check Boolean
* fix error message
* delete unuse function
* delete the deprecated function updateDomainCount
* add error log and throw exception in ProjectManagerImpl.java
CloudStack SSO (using security.singlesignon.key) does not work anymore with CloudStack 4.11, since commit 9988c26, which introduced a regression due to a refactoring: every API request that is not "validated" generates the same error (401 - Unauthorized) and invalidates the session.
However, CloudStack UI executes a call to listConfigurations in method bypassLoginCheck. A non-admin user does not have the permissions to execute this request, which causes an error 401:
{"listconfigurationsresponse":{"uuidList":[],"errorcode":401,"errortext":"unable to verify user credentials and/or request signature"}}
The session (already created by SSO) is then invalidated and the user cannot access to CloudStack UI (error "Session Expired").
Before 9988c26 (up to CloudStack 4.10), an error 432 was returned (and ignored):
{"errorresponse":{"uuidList":[],"errorcode":432,"cserrorcode":9999,"errortext":"The user is not allowed to request the API command or the API command does not exist"}}
Even if the call to listConfigurations was removed, another call to listIdps also lead to an error 401 for user accounts if the SAML plugin is not enabled.
This pull request aims to fix the SSO issue, by restoring errors 432 (instead of 401 + invalidate session) for commands not available. However, if an API command is explicitly denied using ACLs or if the session key is incorrect, it still generates an error 401 and invalidates the session.
This extends securing of KVM hosts to securing of libvirt on KVM
host as well for TLS enabled live VM migration. To simplify implementation
securing of host implies that both host and libvirtd processes are
secured with management server's CA plugin issued certificates.
Based on whether keystore and certificates files are available at
/etc/cloudstack/agent, the KVM agent determines whether to use TLS or
TCP based uris for live VM migration. It is also enforced that a secured
host will allow live VM migration to/from other secured host, and an
unsecured hosts will allow live VM migration to/from other unsecured
host only.
Post upgrade the KVM agent on startup will expose its security state
(secured detail is sent as true or false) to the managements server that
gets saved in host_details for the host. This host detail can be accesed
via the listHosts response, and in the UI unsecured KVM hosts will show
up with the host state of ‘unsecured’. Further, a button has been added
that allows admins to provision/renew certificates to KVM hosts and can
be used to secure any unsecured KVM host.
The `cloudstack-setup-agent` was modified to accept a new flag `-s`
which will reconfigure libvirtd with following settings:
listen_tcp=0
listen_tls=1
tcp_port="16509"
tls_port="16514"
auth_tcp="none"
auth_tls="none"
key_file = "/etc/pki/libvirt/private/serverkey.pem"
cert_file = "/etc/pki/libvirt/servercert.pem"
ca_file = "/etc/pki/CA/cacert.pem"
For a connected KVM host agent, when the certificate are
renewed/provisioned a background task is scheduled that waits until all
of the agent tasks finish after which libvirt process is restarted and
finally the agent is restarted via AgentShell.
There are no API or DB changes.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Add stack traces information
* update stack trace info
* update stack trace to make them consistent
* update stack traces
* update stacktraces
* update stacktraces for other similar situations
* fix some other situations
* enhance other situations
* [CLOUDSTACK-10226] CloudStack is not importing Local storage properly
CloudStack is importing as Local storage any XenServer SR that is of type LVM or EXT. This causes a problem when one wants to use both Direct attach storage and local storage. Moreover, CloudStack was not importing all of the local storage that a host has available when local storage is enabled. It was only importing the First SR it sees.
To fix the first problem we started ignoring SRs that have the flag shared=true when discovering local storages. SRs configured to be shared are used as direct attached storage, and therefore should not be imported again as local ones.
To fix the second problem, we started loading all Local storage and importing them accordingly to ACS.
* Cleanups and formatting
* translate groovy test for ADLdapUserManagerImpl to java
* fixed by returning the actual result instead of false
* unit test case for manual mapped user in ldap
* [CLOUDSTACK-10241] Duplicated file SRs being created in XenServer pools
Due to a race condition between multiple management servers, in some rare cases, CloudStack is creating multiple file SRs to the same secondary folder. This causes a problem when introducing the SR to the XenServer pools, as “there will be VDIs with duplicated UUIDs“. The VDIs are the same, but they are seen in different SRs, and therefore cause an error.
The solution to avoid race conditions between management servers is to use a deterministic srUuid for the file SR to be created (we are leaving XenServer with the burden of managing race conditions). The UUID is based on the SR file path and is generated using UUID#nameUUIDFromBytes. Therefore, if there is an SR with the generated UUID, this means that some other management server has just created it. An exception will occur and it will contain a message saying 'Db_exn.Uniqueness_constraint_violation'. In these unlikely events, we catch the exception and use the method retrieveAlreadyConfiguredSrWithoutException to get the SR that has already been created for the given mount point.
Several fixes addressed:
- Dettach ISO fails when trying to detach a direct download ISO
- Fix for metalink support on SSVM agents (this closes CLOUDSTACK-10238)
- Reinstall VM from bypassed registered template (this closes CLOUDSTACK-10250)
- Fix upload certificate error message even though operation was successful
- Fix metalink download, checksum retry logic and metalink SSVM downloader
The new CA framework introduced basic support for comma-separated
list of management servers for agent, which makes an external LB
unnecessary.
This extends that feature to implement LB sorting algorithms that
sorts the management server list before they are sent to the agents.
This adds a central intelligence in the management server and adds
additional enhancements to Agent class to be algorithm aware and
have a background mechanism to check/fallback to preferred management
server (assumed as the first in the list). This is support for any
indirect agent such as the KVM, CPVM and SSVM agent, and would
provide support for management server host migration during upgrade
(when instead of in-place, new hosts are used to setup new mgmt server).
This FR introduces two new global settings:
- `indirect.agent.lb.algorithm`: The algorithm for the indirect agent LB.
- `indirect.agent.lb.check.interval`: The preferred host check interval
for the agent's background task that checks and switches to agent's
preferred host.
The indirect.agent.lb.algorithm supports following algorithm options:
- static: use the list as provided.
- roundrobin: evenly spreads hosts across management servers based on
host's id.
- shuffle: (pseudo) randomly sorts the list (not recommended for production).
Any changes to the global settings - `indirect.agent.lb.algorithm` and
`host` does not require restarting of the mangement server(s) and the
agents. A message bus based system dynamically reacts to change in these
global settings and propagates them to all connected agents.
Comma-separated management server list is propagated to agents on
following cases:
- Addition of a host (including ssvm, cpvm systevms).
- Connection or reconnection by the agents to a management server.
- After admin changes the 'host' and/or the
'indirect.agent.lb.algorithm' global settings.
On the agent side, the 'host' setting is saved in its properties file as:
`host=<comma separated addresses>@<algorithm name>`.
First the agent connects to the management server and sends its current
management server list, which is compared by the management server and
in case of failure a new/update list is sent for the agent to persist.
From the agent's perspective, the first address in the propagated list
will be considered the preferred host. A new background task can be
activated by configuring the `indirect.agent.lb.check.interval` which is
a cluster level global setting from CloudStack and admins can also
override this by configuring the 'host.lb.check.interval' in the
`agent.properties` file.
Every time agent gets a ms-host list and the algorithm, the host specific
background check interval is also sent and it dynamically reconfigures
the background task without need to restart agents.
Note: The 'static' and 'roundrobin' algorithms, strictly checks for the
order as expected by them, however, the 'shuffle' algorithm just checks
for content and not the order of the comma separate ms host addresses.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* CLOUDSTACK-8855 Improve Error Message for Host Alert State
* [CLOUDSTACK-9846] create column to save the content of alert messages
Remove declaration of throws CloudRuntimeException
I also removed some unused variables and comments left behind
This closes#837
* Isolate a problematic test "smoke/test_certauthority_root"
* Refactored nuage tests
Added simulator support for ConfigDrive
Allow all nuage tests to run against simulator
Refactored nuage tests to remove code duplication
* Move test data from test_data.py to nuage_test_data.py
Nuage test data is now contained in nuage_test_data.py instead of
test_data.py
Removed all nuage test data from nuage_test_data.py
* CLOUD-1252 fixed cleanup of vpc tier network
* Import libVSD into the codebase
* CLOUDSTACK-1253: Volumes are not expunged in simulator
* Fixed some merge issues in test_nuage_vsp_mngd_subnets test
* Implement GetVolumeStatsCommand in Simulator
* Add vspk as marvin nuagevsp dependency, after removing libVSD dependency
* correct libVSD files for license purposes
pep8 pyflakes compliant
This deprecates and remove TLS 1.0 and 1.1 from preferred list of
protocols and keeps only TLSv1.2.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Update the displayText of XenServer ISO when it already exist in the DB
Besides updating the ISO display text, I also created unit test cases for 'createXenServerToolsIsoEntryInDatabase' and 'getActualIsoTemplate' methods.
* Formatting and cleanups for checkstyle of changed classes
L2 network refused to be designed on VXLAN physical network. Add fix for vxlan issue.
Add condition for L2 networks which do not allow specifying vlan.
* 4.11:
CLOUDSTACK-10306: Upgrade to VMware 6.5 vim jar dependency (#2467)
CLOUDSTACK-10298: fix for recreation of an earlier deleted Nuage managed network (#2460)
There is a race condition in the monitoring of the migration process on KVM. If the monitor wakes up in the tight window after the migration succeeds, but before the migration thread terminates, the monitor will get a LibvirtException “Domain not found: no domain with matching uuid” when checking on the migration status. This in turn causes CloudStack to sync the VM state to stop, in which it issues a defensive StopCommand to ensure it is correctly synced.
Fix: Prevent LibvirtException: "Domain not found" caused by the call to dm.getInfo()
CLOUDSTACK-10269: On deletion of role set name to null (#2444)
CLOUDSTACK-10146 checksum in java instead of script (#2405)
CLOUDSTACK-10222: Clean snaphosts from primary storage when taking (#2398)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
When user creates a snapshot (manual or recurring), snapshot remains on
the primary storage, even if the snapshot is transferred successfully to
secondary storage. This is causing issues because XenServer can only hold
a limited number of snapshots in its VDI chain, preventing the user from
creating new snapshots after some time, when too many old snapshots are
present on the primary storage.
Automate dynamic roles migration for missing props file
- In case commands.properties file is missing, enables dynamic roles.
- Adds a new -D or --default flag to migrate-dynamicroles.py script
to simply update the global setting and use the default role-rule
permissions.
- Add warning message, ask admins to move to dynamic roles during upgrade
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>