* Suqash commits to a single commit and rebase against master
Update marvin tests to use white list
* * Fix marvin test failure
* Add new marvin negative tests cases
* Remove hard-coded hypervisor types in marvin tests
* Fix build error after rebase and add hugepagesless
* Fix readability of python code
* Fix failing test
* Adding cleanup of vms for negative tests
* Bug fixes - change config checks properly and block extraconfig in details
* Trim to compare the keys
* CR comments
* Don't skip extraconfig without exception
Co-authored-by: Boris Stoyanov - a.k.a Bobby <bss.stoyanov@gmail.com>
* 4.13:
only update powerstate if sure it is the latest (#3743)
ui: fix migrate host form no host popup (#3682)
client: jetty session timeout set after server is started (#3658)
Increase DHCP lease time to infinite (#3662)
Currently in cloudstack, when we click on "Acquire New Ip", it will
randomly acquire IP from the pool. With this enhancement, it is
possible to select the IP from the drop down IP list of that network.
Same thing applies for a VPC as well.
* Service layer changes for new way of tracking maintanence progress
* Fixes after offline code review
* Fix marvin tests
* Change state name and add documentation
* Fix test
* Fix and add more unit tests for different caseS
* Fix and enhance Marvin Tests
* Fixes for corner cases
* More fixes and logging
* UI fixes
* Some minor changes and reducing VMs on host for more contained tests
* Fixed ssh client auth problem causing test failure
* Code review changes + fixes + some more logging
* Fix flaky tests by adding delays between host states
* Added fetching only enabled hosts for tests
* Make port blocking KVM specific and refactor to handle failure
* Make failing migrations due to tagged host instead of port blocking
* Added additional check for migrating VMs
* Refactor to use single place for methods checking maintenance states
* Add missing HA config keys
* Change time value to seconds
* Change Integer to Long
* Using ConfigKey defaultValue
* Do some code refactoring
* Simplify code
If the volume is in "Expunged" state then it should not be
considered towards total resource count of "primarystoragetotal"
field.
Currently cloudstack takes into resource calculation even if the
volume is expunged. The volume itself doesnt exist in primage
storage and hence it should not be considered towrds resource
caculation.
Steps to reproduce the issue:
1 . Get the resource count of "primarystoragetotal" of a particular domain.
2 . Create a VM with 5GB root disk size and stop it.
3 . Now the value of "primarystoragetotal" should be intitial value plus 5.
4 . Navigate to "volumes" of the VM and select "Download Volume" option.
5 . Once the volume is downloaded, expunge the VM.
6 . Get the resource count of "primarystoragetotal". it will be same value as in step 3
But it should be same as initial value obtained in step 1.
With this fix, the value obtained at step 6 will be same as in step 1.
In case an older SSVM is removed without changing it's state from Up
to Destroyed/Removed etc, the SSVM may be randomly selected for image
store related operations. This fix ensures that endpoints for an image
store are found only from a set of SSVM hosts that are not removed.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Diagnostics are hard when a snapshot fails if a null pointer occurs. This is because no stack trace or location of the error is logged. I.E.
2019-10-21 12:55:00,056 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Work-Job-Executor-131:ctx-80420156 job-10033827/job-10033828 ctx-4864e2f5) (logid:21454564) copy snasphot failed: java.lang.NullPointerException
* server: Do NOT cleanup dhcp and dns when stop a vm
According comment in PR #3608, dhcp and dns entries are cleaned up only when a VM is expunged.
Revert part of commit 8fb388e931.
* server: cleanup dns/dhcp entries in removeNic instead of finalizeExpunge
- Removes CentOS6/el6 packaging (voting thread reference https://markmail.org/message/u3ka4hwn2lzwiero)
- Add upgrade path from 4.13 to 4.14
- Enable live storage migration support for KVM by default as el6 is deprecated
- PRs using live storage migration
#2997 KVM VM live migration with ROOT volume on file storage type
#2983 KVM live storage migration intra cluster from NFS source and destination
#2298 CLOUDSTACK-9620: Enhancements for managed storage
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
When a network IP range is removed, the "vlan" stays mapped on pod_vlan_map; therefore, the method that lists the VLANs by pod id will return null VLANS.
This PR adds proper verifications to avoid null pointer exception when deploying VRs on a pod with removed VLANs. The exception was caused on getPlaceholderNicForRouter.
Problem: In Vmware, appliances that have options that are required to be answered before deployments are configurable through vSphere vCenter user interface but it is not possible from the CloudStack user interface.
Root cause: CloudStack does not handle vApp configuration options during deployments if the appliance contains configurable options. These configurations are mandatory for VM deployment from the appliance on Vmware vSphere vCenter. As shown in the image below, Vmware detects there are mandatory configurations that the administrator must set before deploy the VM from the appliance (in red on the image below):
Solution:
On template registration, after it is downloaded to secondary storage, the OVF file is examined and OVF properties are extracted from the file when available.
OVF properties extracted from templates after being downloaded to secondary storage are stored on the new table 'template_ovf_properties'.
A new optional section is added to the VM deployment wizard in the UI:
If the selected template does not contain OVF properties, then the optional section is not displayed on the wizard.
If the selected template contains OVF properties, then the optional new section is displayed. Each OVF property is displayed and the user must complete every property before proceeding to the next section.
If any configuration property is empty, then a dialog is displayed indicating that there are empty properties which must be set before proceeding
image
The specific OVF properties set on deployment are stored on the 'user_vm_details' table with the prefix: 'ovfproperties-'.
The VM is configured with the vApp configuration section containing the values that the user provided on the wizard.
Fix regression bug that affects KVM local storage migration. Some of the desired execution flows for KVM local storage migration had been altered to allow only managed storage to execute. Fixed allowing managed and non managed storages to execute.
Fixes#3521
Retrieval of an image store using ImageStoreProviderManager has been refactored by introducing three different methods,
DataStore getRandomImageStore(List<DataStore> imageStores);
To get an image store for reading purpose. Threshold capacity check will not be used here.
DataStore getImageStoreWithFreeCapacity(List<DataStore> imageStores);
To get an image store for reading purpose. Threshold capacity check will be used here and the store with max free space will be returned. If no store with filled storage less than the threshold is found, the NULL value will be returned.
List<DataStore> listImageStoresWithFreeCapacity(List<DataStore> imageStores);
To get a list of image stores for writing purpose which fulfills threshold capacity check.
Correspondingly DataStoreManager methods have been refactored to return similar values for a given zone.
Fixes#3287 - NULL value will be returned when secondary storage is needed for writing but there is not store with free space.
Fixes#3041 - Rather than returning random secondary storage for writing, storage with max. free space will be returned.
Fixes#3478 - For migration on VMware, all writable secondary storage will be mounted while preparation.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Currently when refreshing disk usage stats all kvm agents are asked to collect stats for all volumes. In setups with multiple kvm hosts where managed storage is used, not all volumes are attached to all kvm hosts, this results in a large number of warnings in the kvm agent logs. This change introduces a filter step in case managed storage is used so that the management server only requests kvm agents for stats about volumes that are connected to each kvm host.
Fixes checkstyle issue caused by previous commit 6a511fce40
from PR #3466 where a minor review fix did not address this. Merging this
one to unblock few other PRs after running a local build test.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Add CephSnapshotStrategy to handle RBD revert (rollback) snapshot. In order to support RBD revert (rbd_rollback), this PR adds a CephSnapshotStrategy class to handle Ceph/RBD snapshot actions.
* Add revoke certificates API
* Add background task to sync certificates
* Fix marvin test and revoke certificate
* Fix certificate sent to hypervisor was missing headers
* Fix background task for uploading certificates to hosts
If projects have many resource tags, it will take a long time to list projects.
Remove resource tags information from project_view will fix it the issue.
Fixes#3178
Problem: Users don't know what keys/values to enter for template and VM details.
Root Cause: The feature does not exist that can list possible details and options.
Solution: Based on the possible VM and template details handled by the
codebase, those details were refactored and a list API is introduced
that can return users those details along with possible values. When
users add details now, they will be presented with a list of key details
and their possible options if any.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Fixes#2742
UI Supported ordering VPC Offerings but the API did not have that
support implemented. This makes the change in updateVPCOfferings
and listVPCOfferings API calls, along with necessary database
changes for supporting sorting of VPC Offerings.
Problem: The disk offering change is not reflected in cloud_usage database table.
Root Cause: The resizeVolume API does not publish the volume disk offering change event to the
cloud_usage database table.
Solution: This issue has been fixed by refactoring the resizeVolume API to publish this disk offering change for volumes that either in Allocated or Ready state.
Moves the method that published events for volumes in Ready state from
the VolumeStateListener class to the orchestrateResizeVolume method in
the VolumeApiService.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Problem: Not able to configure a sort order for the zones that are listed in various views in the UI.
Root Cause: There is no mechanism to accept sort key for existing zones or UI widget, that would allow to listing zones in the UI in a certain order.
Solution: The order of zones in listed in various views in the UI can now be configured through the newly added “sort_key” field added for the zone. It can be set using updateZone API by providing “sort_key” parameter for a zone, or by reordering the items in the zones list in the UI. UI has been updated to show ordering controls in zones list view. Database changes include updating table “data_center” by adding “sort_key” column (containing integer values and defaults to zero).
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Problem: Admins don’t want to charge for IP address usage on certain (shared) networks.
Root Cause: There is no flag or detail for admins to provide using UI or API when creating networks to specify if they want IP address usage of the network hidden.
Solution: A new boolean hideipaddressusage flag is added to the createNetwork API and a checkbox in the ‘Add guest network’ UI for the root admins to specify if they want the shared network’s IP address usage to be hidden in the listUsageRecords API response. The provided flag is saved as the ‘hideIpAddressUsage’ detail in the cloud.network_details table for the network. For existing (shared) networks, root admins can also specify the same boolean API parameter hideipaddressusage with the updateNetwork API request to configure the behaviour for an existing network. When the detail/flag is true, the IP address usage for the (shared) network is not exported in the listUsageRecords API response. The listNetworks API response will include the details of a network for root admin only. (note usage is still recorded in the usage database but not return by the listUsageRecords API)
The API flag works for any kind of network via the API, but the checkbox is only shown while creating shared networks in the UI.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Fixes tests path from old layout to standard maven in src/test/java/
- Removed duplicate SnapshotManagerImpl at old path `server/src/com...`
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Feature Specification: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95653548
Live storage migration on KVM under these conditions:
From source and destination hosts within the same cluster
From NFS primary storage to NFS cluster-wide primary storage
Source NFS and destination NFS storage mounted on hosts
In order to enable this functionality, database should be updated in order to enable live storage capacibilty for KVM, if previous conditions are met. This is due to existing conflicts between qemu and libvirt versions. This has been tested on CentOS 6 hosts.
Additional notes:
To use this feature set the storage_motion_supported=1 in the hypervisor_capability table for KVM. This is done by default as the feature may not work in some environments, read below.
This feature of online storage+VM migration for KVM will only work with CentOS6 and possible Ubuntu as KVM hosts but not with CentOS7 due to:
https://bugs.centos.org/view.php?id=14026https://bugzilla.redhat.com/show_bug.cgi?id=1219541
On CentOS7 the error we see is: " error: unable to execute QEMU command 'migrate': this feature or command is not currently supported" (reference https://ask.openstack.org/en/question/94186/live-migration-unable-to-execute-qemu-command-migrate/). Reading through various lists looks like the migrate feature with qemu may be available with paid versions of RHEL-EV but not centos7 however this works with CentOS6.
Fix for CentOS 7:
Create repo file on /etc/yum.repos.d/:
[qemu-kvm-rhev]
name=oVirt rebuilds of qemu-kvm-rhev
baseurl=http://resources.ovirt.org/pub/ovirt-3.5/rpm/el7Server/
mirrorlist=http://resources.ovirt.org/pub/yum-repo/mirrorlist-ovirt-3.5-el7Server
enabled=1
skip_if_unavailable=1
gpgcheck=0
yum install qemu-kvm-common-ev-2.3.0-29.1.el7.x86_64 qemu-kvm-ev-2.3.0-29.1.el7.x86_64 qemu-img-ev-2.3.0-29.1.el7.x86_64
Reboot host
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Problem: When a multi-disk OVA template is uploaded, only the root disk is recognized and VMs deployed using such template only get the root disk provisioned.
Root Cause: The template processor for multi-disk OVA was not used in the template upload processor.
Solution: Added support for local multi-disk OVA template upload. After a multi-disk OVA template is
uploaded, the mechanism that worked on multi-disk OVA templates registered using URL is now also used to discovers and creates data-disk templates in cloud.vm_template table and on the secondary storage.
To enable SSL on SSVMs :
• Upload the certificates like you usually do via the API or UI->Infrastructure tab
• Set the global settings secstorage.encrypt.copy, secstorage.ssl.cert.domain to appropriate values
along with the CPVM ones
• Restart management server (no need to destroy/restart SSVM (or the ssvm agent))
Test cases:
- Upload template and check it creates multi-disk folders on secondary
storage and entries in cloud.vm_template table
- Upload template and kill/shutdown management server. Then restart MS
to check if template sync works
- Copy template across zone of an uploaded template
Signed-off-by: Rohit Yadav rohit.yadav@shapeblue.com
This PR is for deactivating Ehcache in CloudStack since it is not usable. The first commit remove the default RMI cache peering configured for multicast which most of the time cannot work. It also requires to have an interface up which is not always the case while developing offline.
The second commits remove the configuration to activate caching on some DAOs.
Problems
The code in CS does not seem to fit any caching mechanism especially due to the homemade DAO code. The main 3 flaws are the following:
Entities are not expected to be shared
There is quite a lot of code with method calls passing entity IDs value as long, which does some object fetching. Without caching, this behavior will create distinct objects each time an entity with the same ID is fetched. With the cache enabled, the same object will be shared among those methods. It has been seen that it does generate some side effects where code still expected unchanged entity attributes after calling different methods thus generating exception/bugs.
DAO update operations are using search queries
Some part of the code are updating entities based on a search query, therefore the whole cache must be invalidated (see GenericDaoBase: public int update(UpdateBuilder ub, final SearchCriteria<?> sc, Integer rows);).
Entities based on views joining multiple tables
There are quite a lot of entities based on SQL views joining multiple entities in a same object. Enabling caching on those would require a mechanism to link and cross-remove related objects whenever one of the sub-entity is changed.
Final word
Based on the previously discussed points, the best approach IMHO would be to move out of the custom DAO framework in CS and use a well known one (out of scope of this change of course). It will handle caching well and the joins made by the views in the code. It's not an easy change, but it will fix along a lot of issues and add a proven / robust framework to an important part of the code.
Since the CloudStack virtual router was redesigned on version 4.6 it has been observed that the DHCP leases file is not persistent across network operations. This causes conflicts on guest VMs static IPs, causing these static IPs to not be renewed by the DHCP server running on isolated and VPC networks' virtual routers (dnsmasq). On stopping or destroying a VM, its dhcp/dns records are not removed from the virtual router causing ghost effects.
Fixes#3272Fixes#3354
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* DPDK vHost User mode selection
* SQL text field and DPDK classes refactor
* Fix NullPointerException after refactor
* Fix unit test
* Refactor details type
This commit simplifies the generateDestPath method and fixes an issue where an extra file, named as 'null', was created on the target storage pool during VM local storage volume migration. Without this fix, the VM is migrated and there is no data loss; however, 193 KB is allocated for the unused file named as 'null' and the file stays on the target storage.
Added changes for creating service offerings for specified domain(s) and zone(s).
Fixed checkAccess for disk offerings.
Fixed list APIs for disk and service offerings.
UI changes for creating disk, service offerings for specified domain(s) and zone(s).
Signed-off-by: Abhishek Kumar <abhishek.kumar@shapeblue.com>
Allows creating storage offerings associated with particular domain(s) and zone(s). In create disk/storage offfering form UI, a mult-select control has been addded to select desired zone(s) and domain select element has been made multi-select.
createDiskOffering API has been modified to allow passing list of domain and zone IDs with keys domainids and zoneids respectively. These lists are stored in DB in cloud.disk_offering_details table with 'domainids' and 'zoneids' key as string of comma separated list of IDs. Response for create, update and list disk offering APIs will return domainids, domainnames, zoneids and zonenames in details object of offering.
listDiskOfferings API has been modified to allow passing zoneid to return only offerings which are associated with the zone.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* Keep connection alive when on maintenance
* Refactor cancel maintenance and unit tests
* Add marvin tests
* Refactor
* Changing the way we get ssh credentials
* Add check on SSH restart and improve marvin tests
Problem: Custom compute offering does not allow setting min and max values for CPU and VRAM for custom VMs.
Root Cause: Custom compute offerings cannot be created with a given range of CPU number and memory instead it allows only fixed values.
Solution: createServiceOffering API has been modified to allow setting a defined range for CPU number and memory. Also, UI form for compute offering creation is provided with a new field named 'compute offering type’ with values - Fixed, Custom Constrained, Custom Constrained. It will allow the creation of compute offerings either with a fixed CPU speed and memory for fixed compute offering, or with a range of CPU number and memory for custom constrained compute offering or without predefined CPU number, CPU speed and memory for custom unconstrained compute offering.
To allow the user to set CPU number, CPU speed and memory during VM deployment, UI form for VM deployment has been modified to provide controls to change these values. These controls are depicted in screenshots below for custom constrained and custom unconstrained compute offering types.
Sample API calls using cmk to create a constrained service offering and deploying a VM using it,
create serviceoffering name=Constrained displaytext=Constrained customized=true mincpunumber=2 maxcpunumber=4 cpuspeed=400 minmemory=256 maxmemory=1024
deploy virtualmachine displayname=ConstrainedVM serviceofferingid=60f3e500-6559-40b2-9a61-2192891c2bd6 templateid=8e0f4a3e-601b-11e9-9df4-a0afbd4a2d60 zoneid=9612a0c6-ed28-4fae-9a48-6eb207af29e3 details[0].cpuNumber=3 details[0].memory=800
Signed-off-by: Abhishek Kumar <abhishek.kumar@shapeblue.com>
- Fixes PR #3146 db cleanup to the correct 4.12->4.13 upgrade path
- Fixes failing unit test due to jdk specific changes after forward
merging
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* skip geting used bytes for volumes that are not in Ready state
* updated log message
* filter snapshots by state backedup
* removed * import
* filter templates by state 'DOWNLOADED'
* refactored getUsedBytes to use O(1) queries
* querying for ready volumes instead filtering in memory
* make listByStoreIdInReadyState more generic ex listByStoreIdAndState
* updated snapshot search criteria for listByStoreIdAndState
* updated template search criteria for listByPoolIdAndState
* fixed typo in search criteria for listByTemplateAndState
* fixed typo in search criteria for templates in listByPoolIdAndState
With this patch b766bf7
we started tracking disks in attaching state so that other attach request can fail gracefully. However this missed the case where disks were in allocated state but attach was requested.
For the use case where users want to attach disk in allocated state but not ready, we need to have allocated-attaching transition as well. We must take care of returning to the original state - allocated or ready - when attach request has completed.
For the use case of unstarted vm's the disk must proceed as follows - "Allocated" -> Attaching -> Allocated. When VM is started, the disk is "created" and pool is assigned. For the use case of started VMs it's more trivial and disk proceeds as follows - Ready -> Attaching -> Ready.
Test this by creating a VM with "startvm=false", create a disk and try attaching it in allocated state. It would give an exception on latest 4.11 but will be fixed on this patch.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>