After upgrade existing environments to 4.11, ConfigDrive cannot be enabled for existing zones due to missing entry on 'physical_network_service_providers' table.
Prevents errors while migrating VM from ISO:
Test 1: Deploy VM from ISO -> Live migrate VM to another host -> ERROR
Test 2: Register ISO using Direct Download on KVM -> Deploy VM from ISO -> Live migrate VM to another host -> ERROR
- Prevent NullPointerException migrating VM from ISO
- Prevent mount secondary storage on ISO direct downloads on KVM
This force stops old VRs when performing rolling restart with
cleanup=true. This will ensure that VRs are powered off quickly than
wait longer for the normal ACPI shutdown. During testing, it was found
on VMware where VM stops are slow compared to XenServer and KVM.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Unify checksum API output for templates and ISOs: not list the checksum algorithm on:
KVM direct downloads
On in progress normal template downloads. The algorithm is shown on the listtemplates API, but after it is downloaded it is not shown anymore.
This adds a global setting for admins who may not want the rolling
restart of routers or are seeing any issues around it. In future, this
setting may be removed.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This fixes#2719 where private gateway IP might be incorrectly
programmed on a guest network nic. The VR would now check ipassoc
requests by mac addresses than provided nic/device id in case they are
wrong.
The root cause is that the device id information is lost when aggregated
commands are created upon starting of a new VPC VR, without the correct
device id in ip_associations json it mis-programs the VR.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* CLOUDSTACK-9473: storage pool capacity check when volume is resized or migrated
Storage pool checker is not being called on resize and migrate volume.
This may lead to allocated percentage of storage above 100%.
Setup:
1 VMware cluster with 2 Hosts.
Executed Steps:
Applied the following global settings:
storage.overprovisioning.factor = 1
pool.storage.allocated.capacity.disablethreshold = 1
pool.storage.capacity.disablethreshold = 1
Restarted management server
Executed Resize and migrate pool and Observed that Storage pool checker is not performed on resizeVolume and migrateVolume.
Result:
Root cause analysis shows storage pool checker is not called when doing migration and resizing.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This re-introduces the rebooting of VR after setup of nics/macs in
case of VMware. It also adds a minor enhancement to show the console
esp. for root admins when VRs and systemvms are in starting state.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Previously, the ethernet device index was used as rt_table index and
packet marking id/integer. With eth0 that is sometimes used as link-local
interface, the rt_table index `0` would fail as `0` is already defined
as a catchall (unspecified). The fwmarking on packets on eth0 with 0x0
would also fail. This fixes the routing issues, by adding 100 to the
ethernet device index so the value is a non-zero, for example then the
relationship between rt_table index and ethernet would be like:
100 -> Table_eth0 -> eth0 -> fwmark 100 or 0x64
101 -> Table_eth1 -> eth1 -> fwmark 101 or 0x65
102 -> Table_eth2 -> eth2 -> fwmark 102 or 0x66
This would maintain the legacy design of routing based on packet mark
and appropriate routing table rules per table/ids. This also fixes a
minor NPE issue around listing of snapshots.
This also backports fixes to smoketests from master.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Fixes the version in pom etc. to be consistent with versioning pattern as X.Y.Z.0-SNAPSHOT after a minor release.
Signed-off-by: Khosrow Moossavi <khos2ow@gmail.com>
As rolling restart does not deallocate an IP before configuring it on a new VR, the code must allow it to be reused on a non-redundant VPCs gateway nic.
In crease ping counts to reduce intermittent failures in smoketests.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This ensure that fewer mount points are made on hosts for either
primary storagepools or secondary storagepools.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Not possible to deploy an Advanced zone with Security Groups, and VXLAN isolation method on KVM. Exception: "Unable to convert network offering with specified id to network profile" is logged.
Changes in PR #2508 have caused network restart to fail in a Nuage setup,
as the new VR takes the same IP as the old one, and the old VR is still running.
Nuage doesn't support multiple VM's having the same IP.
We delay provisioning the interfaces in VSD until the old VR interface is released.
* Create unit test cases for 'ConfigDriveBuilder' class
* add method 'getProgramToGenerateIso' as suggested by rohit and Daan
* fix encoding for base64 to StandardCharsets.US_ASCII
* fix MockServerTest.testIsMockServerCanUpgradeConnectionToSsl()
This is another method that is causing Jenkins to fail for almost a month
This fixes and ensures that every VM has its capacity individually
calculated, with the initial override of 1.0f as overcommit ratio.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
When configuring a pre-setup primary storage we can enter the name-label of the storage that is going to be used by ACS and is already set up in the host. The problem is that we can use any String of characters there, and this String does not need to be a UUID. When listing volumes from a primary storage that has such conditions, the list will return all of the volumes in the cloud because the “API framework” will ignore that value as it is not a UUID type.
This introduces a new global setting `vm.configdrive.primarypool.enabled` to toggle creation/hosting of config drive iso files on primary storage, the default will be false causing them to be hosted on secondary storage. The current support is limited from hypervisor resource side and in current implementation limited to `KVM` only. The next big change is that config drive is created at a temporary location by management server and shipped to either KVM or SSVM agent via cmd-answer pattern, the data of which is not logged in logs. This saves us from adding genisoimage dependency on cloudstack-agent pkg.
The APIs to reset ssh public key, password and user-data (via update VM API) requires that VM should be shutdown. Therefore, in the refactoring I removed the case of updation of existing ISO. If there are objections I'll re-put the strategy to detach+attach new config iso as a way of updation. In the refactored implementation, the folder name is changed to lower-cased configdrive. And during VM start, migration or shutdown/removal if primary storage is enable for use, the KVM agent will handle cleanup tasks otherwise SSVM agent will handle them.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Primary Storage count for an account does not decrease when a Data Disk is deleted
When a data disk is created and not attached in a running VM, the "deleteVolume" will not decrement the count for used primary storage in the VMs accounting information. The property that is not being decremented is called "primarystoragetotal"; this information can be retrieved via "listAccounts" API method.
Steps to reproduce this issue:
1 - Create an account, deploy a VM in it
2 - Check the primary storage count for the account with listAccounts API
3 - Create a data disk
4 - Check the primary storage count for the account with listAccounts API
5 - Delete the Data disk
6 - Check the primary storage count for the account with listAccounts API - It is the same as before deleting the data disk (it should not be the same as the value in step 2!)
* formatting and cleanups
* fix imports that were wrongly changed during rebase
This fixes config drive to use VM's user provided host-name instead of
the internal VM instance ID for hostname related config in both
cloudstack and openstack metadata bundled in the ISO.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* CLOUDSTACK-9184: Fixes#2631 VMware dvs portgroup autogrowth
This deprecates the vmware.ports.per.dvportgroup global setting.
The vSphere Auto Expand feature (introduced in vSphere 5.0) will take
care of dynamically increasing/decreasing the dvPorts when running out
of distributed ports . But in case of vSphere 4.1/4.0 (If used), as this
feature is not there, the new default value (=> 8) have an impact in the
existing deployments. Action item for vSphere 4.1/4.0: Admin should
modify the global configuration setting "vmware.ports.per.dvportgroup"
from 8 to any number based on their environment because the proposal
default value of 8 would be very less without auto expand feature in
general. The current default value of 256 may not need immediate
modification after deployment, but 8 would be very less which means
admin need to update immediately after upgrade.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This introduces a rolling restart of VRs when networks are restarted
with cleanup option for isolated and VPC networks. A make redundant option is
shown for isolated networks now in UI.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Supporting ConfigDrive user data on L2 networks.
Add UI checkbox to create L2 network offering with config drive.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This returns the boolean value of the `isuserdefined` key than
converting it to string. Fixes#2529.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* CLOUDSTACK-10147 Disabled Xenserver Cluster can still deploy VM's. Added code to skip disabled clusters when selecting a host (#2442)
(cherry picked from commit c3488a51db)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* CLOUDSTACK-10318: Bug on sorting ACL rules list in chrome (#2478)
(cherry picked from commit 4412563f19)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* CLOUDSTACK-10284:Creating a snapshot from VM Snapshot generates error if hypervisor is not KVM.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* CLOUDSTACK-10221: Allow IPv6 when creating a Basic Network (#2397)
Since CloudStack 4.10 Basic Networking supports IPv6 and thus
should be allowed to be specified when creating a network.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
(cherry picked from commit 9733a10ecd)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* CLOUDSTACK-10214: Unable to remove local primary storage (#2390)
Allow admins to remove primary storage pool.
Cherry-picked from eba2e1d8a1
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* dateutil: constistency of tzdate input and output (#2392)
Signed-off-by: Yoan Blanc <yoan.blanc@exoscale.ch>
Signed-off-by: Daan Hoogland <daan.hoogland@shapeblue.com>
(cherry picked from commit 2ad5202823)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* CLOUDSTACK-10054:Volume download times out in 3600 seconds (#2244)
(cherry picked from commit bb607d07a9)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* When creating a new account (via domain admin) it is possible to select “root admin” as the role for the new user (#2606)
* create account with domain admin showing 'root admin' role
Domain admins should not be able to assign the role of root admin to new users. Therefore, the role ‘root admin’ (or any other of the same type) should not be visible to domain admins.
* License and formatting
* Break long sentence into multiple lines
* Fix wording of method 'getCurrentAccount'
* fix typo in variable name
* [CLOUDSTACK-10259] Missing float part of secondary storage data in listAccounts
* [CLOUDSTACK-9338] ACS not accounting resources of VMs with custom service offering
ACS is accounting the resources properly when deploying VMs with custom service offerings. However, there are other methods (such as updateResourceCount) that do not execute the resource accounting properly, and these methods update the resource count for an account in the database. Therefore, if a user deploys VMs with custom service offerings, and later this user calls the “updateResourceCount” method, it (the method) will only account for VMs with normal service offerings, and update this as the number of resources used by the account. This will result in a smaller number of resources to be accounted for the given account than the real used value. The problem becomes worse because if the user starts to delete these VMs, it is possible to reach negative values of resources allocated (breaking all of the resource limiting for accounts). This is a very serious attack vector for public cloud providers!
* [CLOUDSTACK-10230] User should not be able to use removed “Guest OS type” (#2404)
* [CLOUDSTACK-10230] User is able to change to “Guest OS type” that has been removed
Users are able to change the OS type of VMs to “Guest OS type” that has been removed. This becomes a security issue when we try to force users to use HVM VMs (Meltdown/Spectre thing). A removed “guest os type” should not be usable by any users in the cloud.
* Remove trailing lines that are breaking build due to checkstyle compliance
* Remove unused imports
* fix classes that were in the wrong folder structure
* Updates to capacity management
* CLOUDSTACK-10289: Config Drive Metadata: Use VM UUID instead of VM id
* CLOUDSTACK-10288: Config Drive Userdata: support for binary userdata
* CLOUDSTACK-10358: SSH keys are missing on Config Drive disk in some cases
CloudStack SSO (using security.singlesignon.key) does not work anymore with CloudStack 4.11, since commit 9988c26, which introduced a regression due to a refactoring: every API request that is not "validated" generates the same error (401 - Unauthorized) and invalidates the session.
However, CloudStack UI executes a call to listConfigurations in method bypassLoginCheck. A non-admin user does not have the permissions to execute this request, which causes an error 401:
{"listconfigurationsresponse":{"uuidList":[],"errorcode":401,"errortext":"unable to verify user credentials and/or request signature"}}
The session (already created by SSO) is then invalidated and the user cannot access to CloudStack UI (error "Session Expired").
Before 9988c26 (up to CloudStack 4.10), an error 432 was returned (and ignored):
{"errorresponse":{"uuidList":[],"errorcode":432,"cserrorcode":9999,"errortext":"The user is not allowed to request the API command or the API command does not exist"}}
Even if the call to listConfigurations was removed, another call to listIdps also lead to an error 401 for user accounts if the SAML plugin is not enabled.
This pull request aims to fix the SSO issue, by restoring errors 432 (instead of 401 + invalidate session) for commands not available. However, if an API command is explicitly denied using ACLs or if the session key is incorrect, it still generates an error 401 and invalidates the session.