* vmware: delete snapshot disk after backup to secondary storage
WIP - This ensures that worker VM is destroyed along with any of its own
disks that are backed up to secondary storage.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix for volume backup and confuding vm var name
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* change
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* tag as worker vm
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* vmware: fix copy systemvm.iso for same version
For VMware, systemvm.iso is copied from MS to secondary store. Current server checks if the desired file is present on the secondary store or not. If it is not present ISO is copied.
This change adds a check for checksum for source and destination ISO which would allow copying new ISO if there is a mismatch.
Useful in case when file is corrupted in secondary store or new systemvm.iso is generated for dev environment.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* changes
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* Fix of creating volumes from snapshots without backup
When few snaphots are created onyl on primary storage, and try to create
a volume or a template from the snapshot only the first operation is
successful. Its because the snapshot is backup on secondary storage with
wrong SQL query. The problem appears on Ceph/NFS but may affects other
storage plugins.
Bypassing secondary storage is implemented only for Ceph primary storage
and it didn't cover the functionality to create volume from snapshot
which is kept only on Ceph
* Address review
Worker VM tags are missed for few cloned VMs in VMware, and so these are skipped when tracking / cleaning up of Worker VMs. Adding proper Worker VM tags to these VMs would make them trackable from CloudStack.
* Added support for removing unused port groups on VMWare
* Fixed error handling around unavailable portgroup name
* Review changes, defaulting glovbal var to false, added warning to description, changed if statement.
* Cleanup unused network port groups on all the hosts.
Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
Co-authored-by: nicolas <nicovazquez90@gmail.com>
* CLOUDSTACK-9175: [VMware DRS] Adding new host to DRS cluster does not participate in load balancing.
Summary: When a new host is added to a cluster, Cloudstack doesn't create all the port groups (created by cloudstack earlier in other hosts) present in the cluster. Since the new host doesn't have all the necessary networking port groups of cloudstack, it is not eligible to participate in DRS load balancing or HA.
Solution: When adding a host to the cluster in Cloudstack, use VMware API to find the list of unique port groups on a previously added host (older host in the cluster) if exists and then create them on the new host.
* Added few checks for cluster details
* remove hot enable cpu und memory in case of reservation
ram and cpu reservation have not relation to ram and cpu hot add
* add custom ram_reservation and it to vm details
* system vms haven't this property, for this reason add additional check
* Update plugins/hypervisors/vmware/src/main/java/com/cloud/hypervisor/vmware/resource/VmwareResource.java
Co-authored-by: dahn <daan.hoogland@gmail.com>
* replace 0.0 with NumberUtils
* remove default value and remove return MinRam(seems to be not necessary)
* Update plugins/hypervisors/vmware/src/main/java/com/cloud/hypervisor/guru/VmwareVmImplementer.java
Co-authored-by: davidjumani <dj.davidjumani1994@gmail.com>
* Update plugins/hypervisors/vmware/src/main/java/com/cloud/hypervisor/vmware/resource/VmwareResource.java
Co-authored-by: davidjumani <dj.davidjumani1994@gmail.com>
Co-authored-by: DK101010 <dirk.klahre@itelligence.de>
Co-authored-by: dahn <daan.hoogland@gmail.com>
Co-authored-by: davidjumani <dj.davidjumani1994@gmail.com>
* Create utility to centralize byte convertions
* Add/change toString definitions
* Create Libvirt handler to ScaleVmCommand
* Enable dynamic scalling VM with KVM
* Move config from interface to class and rename it
As every variable declared in interfaces are already final,
this moving will be needed to mock tests in nexts commits
* Configure VM max memory and cpu cores
The values are according to service offering or global configs
* Extract dpdk configuration to a method and test it
* Extract OS desc config to a method and test it
* Extract guest resource def to a method and test it
Improve libvirt def
* Refactor LibvirtVMDef.GuestResourceDef
* Refactor ScaleVmCommand
* Improve VMInstaVO toString()
* Refactor upgradeRunningVirtualMachine method
* Turn int variables into long on utility
* Verify if VM is scalable on KVMGuru
* Rename some KVMGuruTest's methods
* Change vm's xml to work with max memory
* Verify if service offering is dynamic before scale
* Create methods to retrieve data from domain
* Create def to hotplug memory
* Adjust the way command was scaling the VM
* Fix database persistence before executing command
* Send more info to host to improve log
* Fix var name
* Fix missing "}"
* Undo unnecessary changes
* Address review
* Fix scale validation
* Add VM prepared for dynamic scaling validation
* Refactor LibvirtScaleVmCommandWrapper and improve unit tests
* Remove duplicated method
* Add RuntimeException check
* Remove copyright from header
* Remove copyright from header
* Remove copyright from header
* Remove copyright from header
* Remove copyright from header
* Update ByteScaleUtilsTest.java
Co-authored-by: Daniel Augusto Veronezi Salvador <daniel@scclouds.com.br>
* Add SharedMountPoint to KVMs supported storage pool types
* Fix live migration to iSCSI and improve logs
Co-authored-by: Daniel Augusto Veronezi Salvador <daniel@scclouds.com.br>
* disable hot add memory and cpu via vm settings
* add alternative implementation for hot add memory and cpu
* add log entry
* Modify and add log entry for hotadd
* Update plugins/hypervisors/vmware/src/main/java/com/cloud/hypervisor/vmware/resource/VmwareResource.java
Co-authored-by: sureshanaparti <12028987+sureshanaparti@users.noreply.github.com>
* Update plugins/hypervisors/vmware/src/main/java/com/cloud/hypervisor/vmware/resource/VmwareResource.java
Co-authored-by: sureshanaparti <12028987+sureshanaparti@users.noreply.github.com>
Co-authored-by: DK101010 <dirk.klahre@itelligence.de>
Co-authored-by: sureshanaparti <12028987+sureshanaparti@users.noreply.github.com>
* server: skip zone check for PERHOST iso during attachIso
Hypervisor tools ISO - vmware-toools.iso, xs-tools.iso are marked as PERHOST in DB. They are active but not downloaded to the secondary storages and hence no template-zone entry.
Skips the template-zone check for such templates.
Fixes#5265
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* inverted check
* use constants in TemplateManager
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* Externalize KVM Agent storage's timeout configuration
* Address @nvazquez review
* Add empty line at the end of the agent.properties file
Co-authored-by: Daniel Augusto Veronezi Salvador <daniel@scclouds.com.br>
* Refactor method createVMFromSpec
* Add unit tests
* Fix test
* Extract if block to method for add extra configs to VM Domain XML
* Split travis tests trying to isolate which test is causing an error
* Override toString() method
* Update documentation
* Fix checkstyle error (line with trailing spaces)
* Change VirtualMachineTO print of object
* Add try except to find message error. Remove after test
* Fix indent
* Trying to understanding why is happening in this code
* Refactor method createVMFromSpec
* Add unit tests
* Fix test
* Extract if block to method for add extra configs to VM Domain XML
* Split travis tests trying to isolate which test is causing an error
* Override toString() method
* Update documentation
* Fix checkstyle error (line with trailing spaces)
* Remove unnecessary comment
* Revert travis tests
Co-authored-by: SadiJr <17a0db2854@firemailbox.club>
* Externalize KVM Agent storage's timeout configuration
Created a class of constant agent's properties available to configure on "agent.properties".
Created a class to provides a facility to read the agent's properties file and get its properties.
* Refactored KVHAMonitor nested thread and changed some logs
* It has been added the timeout's config in the agent.properties file
* Rename classes
* Rename var and remove comment
* Fix typo with word "heartbeat"
* Extract multiple methods call to variables
* Add unit tests to file handler
* Increase info about the property
* Create inner class Property
* Rename method getProperty to getPropertyValue
* Remove copyright
* Remove copyright
* Extract code to createHeartBeatCommand
* Change method access from protected to private
Co-authored-by: Daniel Augusto Veronezi Salvador <daniel@scclouds.com.br>
* Added disk provisioning type support for VMWare
* Review changes
* Fixed unit test
* Review changes
* Added missing licenses
* Review changes
* Update StoragePoolInfo.java
Removed white space
* Review change - Getting disk provisioning strictness setting using the zone id and not the pool id
* Delete __init__.py
* Merge fix
* Fixed failing test
* Added comment about parameters
* Added error log when update fails
* Added exception when using API
* Ordering storage pool selection to prefer thick disk capable pools if available
* Removed unused parameter
* Reordering changes
* Returning storage pool details after update
* Removed multiple pool update, updated marvin test, removed duplicate enum
* Removed comment
* Removed unused import
* Removed for loop
* Added missing return statements for failed checks
* Class name change
* Null pointer
* Added more info when a deployment fails
* Null pointer
* Update api/src/main/java/org/apache/cloudstack/api/BaseListCmd.java
Co-authored-by: dahn <daan.hoogland@gmail.com>
* Small bug fix on API response and added missing bracket
* Removed datastore cluster code
* Removed unused imports, added missing signature
* Removed duplicate config key
* Revert "Added more info when a deployment fails"
This reverts commit 2486db78dc.
Co-authored-by: dahn <daan.hoogland@gmail.com>
On newer libvirt/qemu it seems PCI hot-plugging could be an issue as
seen in:
https://www.suse.com/support/kb/doc/?id=000019383https://bugs.launchpad.net/nova/+bug/1836065
This was found to be true on ARM64/aarch64 platform (tested on
RaspberryPi4). As per the default machine doc, it advises to
pre-allocate PCI controllers on the machine and pcie-to-pci-bridge based
controller for legacy PCI models:
https://libvirt.org/pci-hotplug.html#x86_64-q35
This patch introduces the concept as a workaround until a proper fix is
done (ideally in the upstream libvirt/qemu projects). Until then client
code can add 32 PCI controllers and a pcie-to-pci-bridge controller for
aarch64 platforms.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Currently there is no disk IO driver configuration for VMs running on KVM. That's OK for most the cases; however, recently there have been added some quite interesting optimizations with the IO driver io_uring.
Note that IO URING requires:
Qemu >= 5.0, and
Libvirt >= 6.3.0.
By using io_uring we can see a massive I/O performance improvement within Virtual Machines running from Local and/or NFS storage.
This implementation enhances the KVM disk configuration by adding workflow for setting the disk IO drivers. Additionally, if the Qemu and Libvirt versions matches with the required for having io_uring we are going to set it on the VM. If there is no support for such driver we keep it as it is nowadays, without any IO driver configured.
Fixes: #4883
* vmware: fix migrate vm with volume
Recent forward merge of 4.15 branch accidentally brought a bug in VM relocation method for VMware while trying to find datastore for the migrated volume.
This PR fixes it by using either of available target or source host.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* Update plugins/hypervisors/vmware/src/main/java/com/cloud/hypervisor/vmware/resource/VmwareResource.java
Co-authored-by: Daniel Augusto Veronezi Salvador <38945620+GutoVeronezi@users.noreply.github.com>
* fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Daniel Augusto Veronezi Salvador <38945620+GutoVeronezi@users.noreply.github.com>
* server: fix failed to apply userdata when enable static nat
* server: fix cannot expunge vm as applyUserdata fails
* configdrive: fix ISO is not recognized when plug a new nic
* configdrive: detach and attach configdrive ISO as it is changed when plug a new nic or migrate vm
* configdrive test: (1) password file does not exists in recreated ISO; (2) vm hostname should be changed after migration
* configdrive: use centos55 template with sshkey and configdrive support
* configdrive: disklabel is 'config-2' for configdrive ISO
* configdrive: use copy for configdrive ISO and move for other template/volume/iso
* configdrive: use public-keys.txt
* configdrive test: fix (1) update_template ; (2) ssh into vm by keypair
This PR intends to improve logging on agent start to facilitate troubleshooting.
Co-authored-by: Daniel Augusto Veronezi Salvador <daniel@scclouds.com.br>
Co-authored-by: sureshanaparti <12028987+sureshanaparti@users.noreply.github.com>
* vxlan: arp does not work between hosts as multicast group is communicated over physical nic instead of linux bridge
when linux bridge is setup (refer to http://docs.cloudstack.apache.org/projects/archived-cloudstack-getting-started/en/latest/networking/vxlan.html#configure-product-to-use-vxlan-plugin) and used as the kvm traffic label of physical networks, the vms on different hosts cannot reach each other.
(1) does not work:
```
/usr/share/cloudstack-common/scripts/vm/network/vnet/modifyvxlan.sh -v 1001 -p eth1 -b brvx-1001 -o add
```
"bridge fdb" shows
```
00:00:00:00:00:00 dev vxlan1001 dst 239.0.3.233 via eth1 self permanent
```
(2) this works:
```
/usr/share/cloudstack-common/scripts/vm/network/vnet/modifyvxlan.sh -v 1001 -p cloudbr1 -b brvx-1001 -o add
```
"bridge fdb" shows
```
00:00:00:00:00:00 dev vxlan1001 dst 239.0.3.233 via cloudbr1 self permanent
```
* vxlan: fix issue if kvm network label is not set
his PR fixes the problem of not updating the chain info or setting chain info to null after volume migrations.
Problem: While fetching the volume chain info, management server assumes datastore name to be a UUID (this is true only for NFS storages added by CloudStack) but datastore name can be with any name.
Solution: To fetch the volume chain info, use datastore name instead of UUID.
The fix is made in the flow of following API operations
migrateVirtualMachine
migrateVirtualMachineWithVolume
migrateVolume
* Fix of some UEFI related issues
1 - fix of attach/detach ISO of VM with UEFI boot type
2 - if OS type of an ISO is categorized as "Other" the bus type of the disk
will be set to "sata"
* Simplify the validation of OS types
Inclusivity changes for CloudStack
- Change default git branch name from 'master' to 'main' (post renaming/changing default git branch to 'main' in git repo)
- Rename some offensive words/terms as appropriate for inclusiveness.
This PR updates the default git branch to 'main', as part of #4887.
Signed-off-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This PR fixes the issue of missing fcd folder in local storage in case of VMware vSphere.
with this fix, a folder with name fcd is created whenever local storage is initiated.
Datastore cluster as a primary storage support is already there. But if any changes at vCenter to datastore cluster like addition/removal of datastore is not synchronised with CloudStack directly. It needs removal of primary storage from CloudStack and add it again to CloudStack.
Here synchronisation of datastore cluster is fixed without need to remove or add the datastore cluster.
1. A new API is introduced syncStoragePool which takes datastore cluster storage pool UUID as the parameter. This API checks if there any changes in the datastore cluster and updates management server accordingly.
2. During synchronisation if a new child datastore is found in datastore cluster, then management server will create a new child storage pool in database under the datastore cluster. If the new child storage pool is already added as an individual storage pool then the existing storage pool entry will be converted to child storage pool (instead of creating a new storage pool entry)
3. During synchronisaton if the existing child datastore in CloudStack is found to be removed on vCenter then management server removes that child datastore from datastore cluster and makes it an individual storage pool.
The above behaviour is on par with the vCenter behaviour when adding and removing child datastore.
Fixes: #4808, #4941
This PR adds a force flag to the attachIso / detachIso commands, especially for VMware where it is noticed that when trying to either detach an iso or attach an iso when there already exists another present it fails to do the necessary operation as from ACS end we either answer the question returned by Esxi for CDRom disconnect operation as No (for detach operation) or do not answer the question at all (for Attach operation).
Co-authored-by: Pearl Dsilva <pearl.dsilva@shapeblue.com>
* prevent other vm disks getting deleted
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* vmware: fix inter-cluster stopped vm migration
Fixes#4838
For inter-cluster migration without shared storage, VMware needs a host to be specified. Fix is to specify an appropriate host in the target cluster.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix detached volume inter-cluster migration
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* cleanup unused method
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* review changes
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* changes
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* vmware: allow attached volume migration using VmwareStorageMotionStrategy
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* find vm clusterid with multiple ROOT volumes
VM can have multiple ROOT volumes and some can be on zone-wide store therefore iterate over all of them till a cluster ID is found.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix successive storage migration
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix intercluster check
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* refactor vm cluster, host method
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* remove inter-pod check
Added by mistake, VMware won't have pods
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* address review comment
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Fixes#4838
For inter-cluster migration without shared storage, VMware needs a host to be specified. Fix is to specify an appropriate host in the target cluster during a stopped VM migration. Also, find target datastore using the host in the target cluster.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Adds new/missing guest os mappings for XCP-ng/Xenserver 8.1
Copy guest OS mappings from XCP-ng/Xenserver 8.1 for XCP-ng/Xenserver 8.2
Adds Ubuntu 20.04 guest os mapping for XCP-ng/Xenserver 8.2
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
This PR fixes a small bug when explicitly setting VM hardware versions lower than version 10.
Vmware expects the hardware version in format: vmx-DD where DD is a two-digit representation of the virtual hardware version. For hardware version lower than 10, CloudStack was not using to digits for the hardware version number, which ended up on an error while creating worker VMs. (vmx-8 for example instead of vmx-08)
This PR fixes#4244
deploying of VMs from ISOs and from templates with UEFI boot type
deploying of VMs from ISOs and from templates with UEFI boot type with
volumes in RAW format
This PR aims at introducing persistence mode in L2 networks and enhancing the behavior in Isolated networks
Doc PR apache/cloudstack-documentation#183
Co-authored-by: Pearl Dsilva <pearl.dsilva@shapeblue.com>
Fixes#4729
As reported in the issue ACP 4.7 used a normal UUID in db for a presetup primary store on Xenserver.
Later the value has been changed to store's path with '/' removed.
Current changes try to retrieve SR's name-lable from store's path if UUID doesn't match path field for a pre-setup store.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
This PR fixes the issue pertaining to volume resize on VMWare for deploy as-is templates. VMware deploy as-is templates are those that are deployed as per the specification in the imported OVF. Hence override root disk size will not be adhered to for such templates. Moreover, when we deploy VMs in stopped state and resize the volume, the root disk doesn't get resized but the volume size is merely updated in the DB.
This PR also includes the following (for deploy as-is templates):
- Disables overriding root disk size during VM deployment on the UI
- Disables selection of compute offerings with root disk size specified, at the time of deployment
- Provided users with the option to deploy VM is stopped state via UI (so as to give an option to users to resize the volumes before starting the VM)
Co-authored-by: Pearl Dsilva <pearl.dsilva@shapeblue.com>
* Updated libvirt's native reboot operation for VM on KVM using ACPI event, and Added 'forced' reboot option to stop and start the VM (using rebootVirtualMachine API)
* Added 'forced' reboot option for System VM and Router
- New parameter 'forced' in rebootSystemVm API, to stop and then start System VM
- New parameter 'forced' in rebootRouter API, to force stop and then start Router
* Added force reboot tests for User VM, System VM and Router
These changes are related to PR #3194, but include suspending/resuming the VM when doing a VM snapshot as well, when deleting a VM snapshot, as it is performing the same operations via Libvirt. Also, there was an issue with the UI/localization changes in the prior PR, as that PR was altering the Volume snapshot behavior, but was altering the VM snapshot wording. Both have been altered in this PR.
Issuing this in response to the work happening in PR #4029.
Added support for PowerFlex/ScaleIO (v3.5 onwards) storage pool as a primary storage in CloudStack (for KVM hypervisor) and enabled VM/Volume operations on that pool (using pool tag).
Please find more details in the FS here:
https://cwiki.apache.org/confluence/x/cDl4CQ
Documentation PR: apache/cloudstack-documentation#169
This enables support for PowerFlex/ScaleIO (v3.5 onwards) storage pool as a primary storage in CloudStack
Other improvements addressed in addition to PowerFlex/ScaleIO support:
- Added support for config drives in host cache for KVM
=> Changed configuration "vm.configdrive.primarypool.enabled" scope from Global to Zone level
=> Introduced new zone level configuration "vm.configdrive.force.host.cache.use" (default: false) to force host cache for config drives
=> Introduced new zone level configuration "vm.configdrive.use.host.cache.on.unsupported.pool" (default: true) to use host cache for config drives when storage pool doesn't support config drive
=> Added new parameter "host.cache.location" (default: /var/cache/cloud) in KVM agent.properties for specifying the host cache path and create config drives on the "/config" directory on the host cache path
=> Maintain the config drive location and use it when required on any config drive operation (migrate, delete)
- Detect virtual size from the template URL while registering direct download qcow2 (of KVM hypervisor) templates
- Updated full deployment destination for preparing the network(s) on VM start
- Propagate the direct download certificates uploaded to the newly added KVM hosts
- Discover the template size for direct download templates using any available host from the zones specified on template registration
=> When zones are not specified while registering template, template size discovery is performed using any available host, which is picked up randomly from one of the available zones
- Release the VM resources when VM is sync-ed to Stopped state on PowerReportMissing (after graceful period)
- Retry VM deployment/start when the host cannot grant access to volume/template
- Mark never-used or downloaded templates as Destroyed on deletion, without sending any DeleteCommand
=> Do not trigger any DeleteCommand for never-used or downloaded templates as these doesn't exist and cannot be deleted from the datastore
- Check the router filesystem is writable or not, before performing health checks
=> Introduce a new test "filesystem.writable.test" to check the filesystem is writable or not
=> The router health checks keeps the config info at "/var/cache/cloud" and updates the monitor results at "/root" for health checks, both are different partitions. So, test at both the locations.
=> Added new script: "filesystem_writable_check.py" at /opt/cloud/bin/ to check the filesystem is writable or not
- Fixed NPE issue, template is null for DATA disks. Copy template to target storage for ROOT disk (with template id), skip DATA disk(s)
* Addressed some issues for few operations on PowerFlex storage pool.
- Updated migration volume operation to sync the status and wait for migration to complete.
- Updated VM Snapshot naming, for uniqueness in ScaleIO volume name when more than one volume exists in the VM.
- Added sync lock while spooling managed storage template before volume creation from the template (non-direct download).
- Updated resize volume error message string.
- Blocked the below operations on PowerFlex storage pool:
-> Extract Volume
-> Create Snapshot for VMSnapshot
* Added the PowerFlex/ScaleIO client connection pool to manage the ScaleIO gateway clients, which uses a single gateway client per Powerflex/ScaleIO storage pool and renews it when the session token expires.
- The token is valid for 8 hours from the time it was created, unless there has been no activity for 10 minutes.
Reference: https://cpsdocs.dellemc.com/bundle/PF_REST_API_RG/page/GUID-92430F19-9F44-42B6-B898-87D5307AE59B.html
Other fixes included:
- Fail the VM deployment when the host specified in the deployVirtualMachine cmd is not in the right state (i.e. either Resource State is not Enabled or Status is not Up)
- Use the physical file size of the template to check the free space availability on the host, while downloading the direct download templates.
- Perform basic tests (for connectivity and file system) on router before updating the health check config data
=> Validate the basic tests (connectivity and file system check) on router
=> Cleanup the health check results when router is destroyed
* Updated PowerFlex/ScaleIO storage plugin version to 4.16.0.0
* UI Changes to support storage plugin for PowerFlex/ScaleIO storage pool.
- PowerFlex pool URL generated from the UI inputs(Gateway, Username, Password, Storage Pool) when adding "PowerFlex" Primary Storage
- Updated protocol to "custom" for PowerFlex provider
- Allow VM Snapshot for stopped VM on KVM hypervisor and PowerFlex/ScaleIO storage pool
and Minor improvements in PowerFlex/ScaleIO storage plugin code
* Added support for PowerFlex/ScaleIO volume migration across different PowerFlex storage instances.
- findStoragePoolsForMigration API returns PowerFlex pool(s) of different instance as suitable pool(s), for volume(s) on PowerFlex storage pool.
- Volume(s) with snapshots are not allowed to migrate to different PowerFlex instance.
- Volume(s) of running VM are not allowed to migrate to other PowerFlex storage pools.
- Volume migration from PowerFlex pool to Non-PowerFlex pool, and vice versa are not supported.
* Fixed change service offering smoke tests in test_service_offerings.py, test_vm_snapshots.py
* Added the PowerFlex/ScaleIO volume/snapshot name to the paths of respective CloudStack resources (Templates, Volumes, Snapshots and VM Snapshots)
* Added new response parameter “supportsStorageSnapshot” (true/false) to volume response, and Updated UI to hide the async backup option while taking snapshot for volume(s) with storage snapshot support.
* Fix to remove the duplicate zone wide pools listed while finding storage pools for migration
* Updated PowerFlex/ScaleIO volume migration checks and rollback migration on failure
* Fixed the PowerFlex/ScaleIO volume name inconsistency issue in the volume path after migration, due to rename failure
In previous cloudstack versions, qcow2 image does not have a backing file format.
however, it is required in newer qemu versions, for example qemu 4.2 on ubuntu 20.04.
steps to reproduce the issue
(1) install cloudstack 4.14 or previous version, and ubuntu 19.04 or 18.04/16.04 LTS.
(2) create vms.
(3) upgrade to 4.15, upgrade os to ubuntu 20.04 , or install a new server with ubuntu 20.04.
(4) migrate vm from old ubuntu version to ubuntu 20.04, failed with exception below
```
2021-02-04 13:43:07,397 DEBUG [resource.wrapper.LibvirtMigrateCommandWrapper] (agentRequest-Handler-1:null) (logid:93da9385) ExecutionException : org.libvirt.LibvirtException: Requested operation is not valid: format of backing image '/mnt/03b6f487-9eaf-38bf-ad2d-d985423b832f/66990fcc-fd98-4932-9649-989bf6583d59' of image '/mnt/03b6f487-9eaf-38bf-ad2d-d985423b832f/a3dd1f0f-2557-4e07-951c-e4eb7b3f38b2' was not specified in the image metadata (See https://libvirt.org/kbase/backing_chains.html for troubleshooting)
```
(5)stop vm, and start it on ubuntu 20.04 server. failed with exception below
```
2021-02-04 13:46:29,766 WARN [resource.wrapper.LibvirtStartCommandWrapper] (agentRequest-Handler-5:null) (logid:b54745a7) LibvirtException
org.libvirt.LibvirtException: Requested operation is not valid: format of backing image '/mnt/03b6f487-9eaf-38bf-ad2d-d985423b832f/66990fcc-fd98-4932-9649-989bf6583d59' of image '/mnt/03b6f487-9eaf-38bf-ad2d-d985423b832f/a3dd1f0f-2557-4e07-951c-e4eb7b3f38b2' was not specified in the image metadata (See https://libvirt.org/kbase/backing_chains.html for troubleshooting)
```
To make testing easier, step 1 and 2 can be replaced by
```
qemu-img create -f qcow2 -b <backing file> <qcow2 image>
```
so qcow2 image does not have a backing file format.
- Fixes inter-cluster migration of VMs
- Allows migration of stopped VM with disks attached to different and suitable pools
- Improves inter-cluster detached volume migration
- Allows inter-cluster migration (clusters of same Pod) for system VMs, VRs on VMware
- Allows storage migration for stopped system VMs, VRs on VMware within same Pod if StoragePool cluster scopetype
Linked Primate PR: https://github.com/apache/cloudstack-primate/pull/789 [Changes merged in this PR after new UI merge]
Documentation PR: https://github.com/apache/cloudstack-documentation/pull/170
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* 4.15:
server: select root disk based on user input during vm import (#4591)
kvm: Use Q35 chipset for UEFI x86_64 (#4576)
server: fix wrong error message when create isolated network without SourceNat (#4624)
server: add possibility to scale vm to current customer offerings (#4622)
server: keep networks order and ips while move a vm with multiple networks (#4602)
server: throw exception when update vm nic on L2 network (#4625)
doc: fix typo in install notes (#4633)
* 4.14:
server: select root disk based on user input during vm import (#4591)
kvm: Use Q35 chipset for UEFI x86_64 (#4576)
server: fix wrong error message when create isolated network without SourceNat (#4624)
server: add possibility to scale vm to current customer offerings (#4622)
server: keep networks order and ips while move a vm with multiple networks (#4602)
server: throw exception when update vm nic on L2 network (#4625)
doc: fix typo in install notes (#4633)
The bus type to `data disk` volumes is hardcoded to `virtio` or `scsi`, when using virtio-scsi (or, based on the template type). Therefore, there is no way to specify the bus type to data disk volumes (as we have for root disks).
This PR intends to replicate the `rootDiskController` behavior to `dataDiskController`, allowing the definition of the controller.
Co-authored-by: Daniel Augusto Veronezi Salvador <daniel@scclouds.com.br>
This merges apache/cloudstack-primate under ui and removes the legacy UI
from ui/legacy in master/4.16 as voted on dev ML.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Add RBD main storage through UI, it will fail when there is no host port parameter;
Because when we created the pool, we did not add the port target in the xml
This fixes issue introduced in c3554ec31d
which enable block of code that will double escape rados host/monitor
port.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This PR fixes a regression issue in #4497
In cloudstack 4.14 or before, the cpu topology is set only when cpucore per socket is set (to 4 or 6).
in other conditions, there is no cpu topology in vm xml definition.
with #4497, vm will have cpu topology in its xml definition, if cpucore per socket is not set.
<topology sockets='<vm cpu cores>' cores='1' threads='1'/>
Not sure if it causes any issue. I think it would be better not to add this part in vm xml definition if cpucore per socket is not set.
XenServer 7.1 has an file descriptor/tapdisk iso-caching issue where new systemvm.iso are not recognised and inside the VR/ssvm/cpvm file IO error is seen. This was only reproducible with XS7.1 (intermittently), the fix was to check and eject the systemvm.iso (old/stale/cached), then insert the new systemvm.iso and then eject it.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Problem:
When migrateVMwithVolumes API is tried on a VM with two volumes to migrate to a different host and tried to migrate only one volume, Cloudstack migrates both the Volumes but then marks only one of them migrated. This makes volume inaccessible due to inconsitency in path of volume in cloudstack and vsphere
Solution:
Set the target datastore in relocate spec properly for each volume
* Fixed vm-templates not being removed from primary storage with storage garbage collection
* Update vmware-base/src/main/java/com/cloud/hypervisor/vmware/mo/VirtualMachineMO.java
Co-authored-by: sureshanaparti <12028987+sureshanaparti@users.noreply.github.com>
* Var name
Co-authored-by: sureshanaparti <12028987+sureshanaparti@users.noreply.github.com>
* Add hardware version to worker VM
* Added worker VM hardware version when creating a template from a volume and migrating a detached volume
* Add null parameter back that was removed by merge
VMware code keeps a cache of existing VMs on a hypervisor host using cloud.vm.internal.name property of the VM. Searching for unmanaged instances/VMs on a host might not return an expected result when this property differs from the actual name of the VM.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
In large environments, with VR having multiple nics when plugging in
nic, it must get existing nics by sorted device ID otherwise it may
cause incorrect nic plugging/order.
Fixes#4246
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Creation of templates from detached data disks results in a Null Pointer Exception on VMWare, as it expects the volume to be attached to a VM.
To fix this behavior and make it consistent with other hypervisors, creation of the template from the volume in case not attached to a VM is facilitated by creating a worker VM, attaching the disk to the worker VM, creating the template from it, and then destroying the VM.
Co-authored-by: Pearl Dsilva <pearl.dsilva@shapeblue.com>
Added property to agent.properties that enables or disables the iscsi session clean up feature. #4210
Added a condition to prevent disk partitions from being cleaned up. #4216
This is an extention of #3732 for kvm.
This is restricted to ovs > 2.9.2
Since Xen uses ovs 2.6, pvlan is unsupported.
This also fixes issues of vms on the same pvlan unable to communicate if they're on the same host
This PR adds minor version support when mounting nfs on the SSVM as requested in #2861
The global setting "secstorage.nfs.version" has been changed to use the String data type which allows any minor version to be specified.
* DB : Add support for MySQL 8
- Splits commands to create user and grant access on database, the old
statement is no longer supported by MySQL 8.x
- `NO_AUTO_CREATE_USER` is no longer supported by MySQL 8.x so remove
that from db.properties conn parameters
For mysql-server 8.x setup the following changes were added/tested to
make it work with CloudStack in /etc/mysql/mysql.conf.d/mysqld.cnf and
then restart the mysql-server process:
server_id = 1
sql-mode="STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION,ERROR_FOR_DIVISION_BY_ZERO,NO_ZERO_DATE,NO_ZERO_IN_DATE,NO_ENGINE_SUBSTITUTION"
innodb_rollback_on_timeout=1
innodb_lock_wait_timeout=600
max_connections=1000
log-bin=mysql-bin
binlog-format = 'ROW'
default-authentication-plugin=mysql_native_password
Notice the last line above, this is to reset the old password based
authentication used by MySQL 5.x.
Developers can set empty password as follows:
> sudo mysql -u root
ALTER USER 'root'@'localhost' IDENTIFIED BY '';
In libvirt repository, there are two related commits
2019-08-23 13:13 Daniel P. Berrangé ● rpm: don't enable socket activation in upgrade if --listen present
2019-08-22 14:52 Daniel P. Berrangé ● remote: forbid the --listen arg when systemd socket activation
In libvirt.spec.in
/bin/systemctl mask libvirtd.socket >/dev/null 2>&1 || :
/bin/systemctl mask libvirtd-ro.socket >/dev/null 2>&1 || :
/bin/systemctl mask libvirtd-admin.socket >/dev/null 2>&1 || :
/bin/systemctl mask libvirtd-tls.socket >/dev/null 2>&1 || :
/bin/systemctl mask libvirtd-tcp.socket >/dev/null 2>&1 || :
Co-authored-by: Wei Zhou <w.zhou@global.leaseweb.com>
Co-authored-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This PR adds outputting human readable byte sizes in the management server logs, agent logs, and usage records. A non-dynamic global variable is added (display.human.readable.sizes) to control switching this feature on and off. This setting is sent to the agent on connection and is only read from the database when the management server is started up. The setting is kept in memory by the use of a static field on the NumbersUtil class and is available throughout the codebase.
Instead of seeing things like:
2020-07-23 15:31:58,593 DEBUG [c.c.a.t.Request] (AgentManager-Handler-12:null) (logid:) Seq 8-1863645820801253428: Processing: { Ans: , MgmtId: 52238089807, via: 8, Ver: v1, Flags: 10, [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-224-VM","bytesSent":"106496","bytesReceived":"0","result":"true","details":"","wait":"0",}}] }
The KB MB and GB values will be printed out:
2020-07-23 15:31:58,593 DEBUG [c.c.a.t.Request] (AgentManager-Handler-12:null) (logid:) Seq 8-1863645820801253428: Processing: { Ans: , MgmtId: 52238089807, via: 8, Ver: v1, Flags: 10, [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-224-VM","bytesSent":"(104.00 KB) 106496","bytesReceived":"(0 bytes) 0","result":"true","details":"","wait":"0",}}] }
FS: https://cwiki.apache.org/confluence/display/CLOUDSTACK/Human+Readable+Byte+sizes
This fixes issues of virtual size to be twice in case the disk is a
linked-clone root disk. The virtual size of root disk (first in chain)
must be used.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Ceph used to use port 6789 (no need to specify it), but with the messenger v2
from Ceph it switched to port 3300 while 6789 still works.
librados/librbd/libvirt will automatically figure out the ports to use if none is
specified.
Therefor there is no need for CloudStack to explicitely define the port in the XML
passed to Libvirt or Qemu.
Leave blank if no port number has been defined by the user.
* Enable unmanaging guest VMs
* Minor fixes
* Fix stop usage event only if VM is not stopped when unmanaging
* Rename unmanaged VMs manager
* Generate netofferingremove usage event if VM is not stopped
* Generate usage event VM snapshot primary off when unmanaging
This PR fixes an issue where an instance fails to deploy due to a null pointer when using an L2 Guest Network with DefaultL2NetworkOfferingConfigDrive on Xenserver. It also fixes migrating an instance to another host.
This has been tested by:
- Creating an L2 Guest network, using DefaultL2NetworkOfferingConfigDrive as the network offering.
- Deploying an instance using the L2 Guest network created.
- Migrating the instance away from the host and back
When you migrate volume between data stores CS keeps the original UUID and changes the path of the volume.
When volume is not found by the given path the agent throws CloudRuntimeException but it's not catched in LibvirtGetVolumeStatsCommandWrapper.java
* 4.13:
Snapshot deletion issues (#3969)
server: Cannot list affinity group if there are hosts dedicated… (#4025)
server: Search zone-wide storage pool when allocation algothrim is firstfitleastconsumed (#4002)
* Fixes snapshot deletion
* Remove legacy '@Component', it is not necessary in this bean/class.
* Fix log message missing %d and remove snapshot on DB
* Remove "dummy" boolean return statement
* Manage snapshot deletion for KVM + NFS (primary storage)
* checkstyle trailing spaces
* rename options strings to *_OPTION
* Fix typo on deleteSnapshotOnSecondaryStorage and enhance log message
* Move the snapshotDao.remove(snapshotId); (#4006)
* Fix deletesnapshot worflow to handle both snapshots created in primary storage and snapshots backed up to secondary storage
* Fix extra space
* refactor out separate handling methods for secondary and primary (reducing returns)
* return false on unexpected error or log when expected
* != instead of ==
* secondary instead of backup storage
* init to null
* Handle snapshot deletion on primary storage. When primary store ref not found for snapshot do not fail the operation.
* Fix debug levels on log messages
Co-authored-by: GabrielBrascher <gabriel@apache.org>
Co-authored-by: Andrija Panic <45762285+andrijapanicsb@users.noreply.github.com>
Co-authored-by: Harikrishna Patnala <harikrishna.patnala@gmail.com>
Co-authored-by: nvazquez <nicovazquez90@gmail.com>
Though VMware does not support security groups, but in a basic zone with VMware and no isolation VMs should be able to deploy.
Root cause:
In case of VMware and basic zone control nic is set to 0.0.0.0 assuming control network will be shared with guest network.
But to have access to VMware instances management/private needs to be assigned to it.
Solution:
Assing a private ip even in case of basic zone VMware.
* Remove constraint for NFS storage
* Add new property on agent.properties
* Add free disk space on the host prior template download
* Add unit tests for the free space check
* Fix free space check - retrieve avaiable size in bytes
* Update default location for direct download
* Improve the method to retrieve hosts to retry on depending on the destination pool type and scope
* Verify location for temporary download exists before checking free space
* In progress - refactor and extension
* Refactor and fix
* Last fixes and marvin tests
* Remove unused test file
* Improve logging
* Change default path for direct download
* Fix upload certificate
* Fix ISO failure after retry
* Fix metalink filename mismatch error
* Fix iso direct download
* Fix for direct download ISOs on local storage and shared mount point
* Last fix iso
* Fix VM migration with ISO
* Refactor volume migration to remove secondary storage intermediate
* Fix simulator issue
This adds support for JDK11 in CloudStack 4.14+:
- Fixes code to build against JDK11
- Bump to Debian 9 systemvmtemplate with openjdk-11
- Fix Travis to run smoketests against openjdk-11
- Use maven provided jdk11 compatible mysql-connector-java
- Remove old agent init.d scripts
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
The VM ingestion feature allows CloudStack to discover, on-board, import existing VMs in an infra. The feature currently works only for VMware, with a hypervisor agnostic framework which may be extended for KVM and XenServer in future.
* [CLOUDSTACK-10408] Fix String.replaceAll() to replace() for better performance
* improve with replace char but string
Co-authored-by: Rohit Yadav <rohit@apache.org>
* * Complete API implementation
* Complete UI integration
* Complete marvin test
* Complete Secondary storage GC background task
* improve UI labels
* slight reword and add another missing description
* improve download message clarity
* Address comments
* multiple fixes and cleanups
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* fix more bugs, let it return ip rule list in another log file
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* fix missing iprule bug
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* add support for ARCHIVE type of object to be linked/setup on secstorage
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Fix retrieving files for Xenserver
* Update get_diagnostics_files.py
* Fix bug where executable scripts weren't handled
* Fixed error on script cmd generation
* Do not filter name for log files as it would override similar prefix script names
* Addressed code review comments
* log error instead of printstacktrace
* Treat script as executable and shell script
* Check missing script name case and write to output instead of catching exception
* Use shell = true instead of shlex to support any executable
* fix xenserver bug
* don't set dir permission for vmware
* Code review comments - refactoring
* Add check for possible NPE
* Remove unused imoprt after rebase
* Add better description for configs
Co-authored-by: Nicolas Vazquez <nicovazquez90@gmail.com>
Co-authored-by: Rohit Yadav <rohit@apache.org>
Co-authored-by: Anurag Awasthi <anurag.awasthi@shapeblue.com>
* Suqash commits to a single commit and rebase against master
Update marvin tests to use white list
* * Fix marvin test failure
* Add new marvin negative tests cases
* Remove hard-coded hypervisor types in marvin tests
* Fix build error after rebase and add hugepagesless
* Fix readability of python code
* Fix failing test
* Adding cleanup of vms for negative tests
* Bug fixes - change config checks properly and block extraconfig in details
* Trim to compare the keys
* CR comments
* Don't skip extraconfig without exception
Co-authored-by: Boris Stoyanov - a.k.a Bobby <bss.stoyanov@gmail.com>
* 4.13:
only update powerstate if sure it is the latest (#3743)
ui: fix migrate host form no host popup (#3682)
client: jetty session timeout set after server is started (#3658)
Increase DHCP lease time to infinite (#3662)
* Avgload (#2)
* Adding avgload for kvm
* Fix coding style issue
* Add getter/setter
* Fix several small errors
* Add override
* Uncomment getAverageLoad
* Override getAverageLoad()
* Checkstyle bug?
* Delete trailing spaces
* Renaming function
* Change interface to match
* Rename method in GetHostStatsAnswer
* Change method call name
* Convert double to long
* Remove trailing whitespace
* Change names around
* Make load visible to return it
* Parse string to double
* Change Long to Double
* Fix getter
* Unify naming to cpuloadaverage
* Change cpuloadaverage String to Double in listHostsMetrics
Remove some unnecessary whitespaces
* Add CPU_LOAD_AVERAGE to ApiConstants
When I add a secondary IP to a nic on shared network in advanced zone with security groups, the network rules for new IP are not applied on KVM hypervisors.
It is because "--action -A" cannot be recognized in security_group.py after commit ac73e7e671. changing to "--action=-A" will fix it.
Fixes issue #3590 by using the last element on the array from the snapshot "path" String for retrieving the snapshot id. Additionally, it uses the volumePath as the volume id which should always be the correct value. The error raised on issue #3590 was related to the wrong use of variable "path" where in some cases had a different set of substrings.
The proposed change has been tested and evaluated. The values used for openning the RBD connection and executing the rollback were stable on the tests. Runned rollback on multiple snapshots and could start the VM with the content matching the ROOT reverted snapshot.
KVM is supported on arm64 Linux (https://www.linux-kvm.org/page/Processor_support#ARM:).
For a small (IoT) platform such as the new Raspberry Pi 4 that uses armv8 processor
(cortex-a72) it's possible to run Linux host with `/dev/kvm`
accleration. This adds support for IoT IaaS in CloudStack.
This PR is from a fun weekend project where:
- I set up a Raspberry Pi 4 - 4GB RAM model with 4 CPU cores @ 1.5Ghz, 128GB SD samsung evo plus card
- Installed Ubuntu 19.10 raspi3 base image: http://cdimage.ubuntu.com/releases/19.10/release/ubuntu-19.10-preinstalled-server-arm64+raspi3.img.xz
- Build a custom Linux 5.3 kernel with KVM enabled, deb here: http://dl.rohityadav.cloud/cloudstack-rpi/kernel-19.10/ and install the linux-image and linux-module
- Then install/setup CloudStack on it (fix some issues around jna, by manually installing newer libjna-java to /usr/share/cloudstack-agent/lib)
- Since the host processor is not x86_64, I had to build a new arm64 (or aarch64) systemvmtemplate: http://dl.rohityadav.cloud/cloudstack-rpi/systemvmtemplate/
I could finally get a 4.13 CloudStack + Adv zone/networking to run on it
and deployed a KVM based Ubuntu 19.10 environment and NFS storage.
Deployed a test vm with isolated network, VR works as expected. Console
proxy works as well, for this tested against arm64 openstack Debian 9/10
templates.
I raised the issue of enabling KVM in upstream Ubuntu arm64 build: https://bugs.launchpad.net/ubuntu/+source/linux-raspi2/+bug/1783961
Ubuntu kernel team has come back and future arm64 releases may have
KVM enabled by default.
Limitation: on my aarch64 env, it did not support IDE, therefore all
default bus type for volumes are SCSI by default. With VIRTIO it fails
sometimes.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* kvm: Use 'ip' instead of 'brctl'
The command 'brctl' is deprecated and should no longer be used.
iproute2 supports all the features we need and therefor we should use
this instead of the old commands.
Feature wise this does not change anything. It just makes the code more
robust towards the future.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
* kvm/modifyvlan: Use 'ip' instead of 'brctl'
brctl is deprecated and by using iproute2 we are future-proof
Signed-off-by: Wido den Hollander <wido@widodh.nl>
Refactor: Cleanup duplicate code
Make use of Java 8 default implementation in interfaces,
to remove code duplication between XxxCmd and XxxCmdAsAdmin.
Refactor checkFormat by pre-calculating the supported
extensions. Also make use of this in ImageStoreUtil.
Makes it easier to add new file and compression formats.
Problem: In Vmware, appliances that have options that are required to be answered before deployments are configurable through vSphere vCenter user interface but it is not possible from the CloudStack user interface.
Root cause: CloudStack does not handle vApp configuration options during deployments if the appliance contains configurable options. These configurations are mandatory for VM deployment from the appliance on Vmware vSphere vCenter. As shown in the image below, Vmware detects there are mandatory configurations that the administrator must set before deploy the VM from the appliance (in red on the image below):
Solution:
On template registration, after it is downloaded to secondary storage, the OVF file is examined and OVF properties are extracted from the file when available.
OVF properties extracted from templates after being downloaded to secondary storage are stored on the new table 'template_ovf_properties'.
A new optional section is added to the VM deployment wizard in the UI:
If the selected template does not contain OVF properties, then the optional section is not displayed on the wizard.
If the selected template contains OVF properties, then the optional new section is displayed. Each OVF property is displayed and the user must complete every property before proceeding to the next section.
If any configuration property is empty, then a dialog is displayed indicating that there are empty properties which must be set before proceeding
image
The specific OVF properties set on deployment are stored on the 'user_vm_details' table with the prefix: 'ovfproperties-'.
The VM is configured with the vApp configuration section containing the values that the user provided on the wizard.
Fix regression bug that affects KVM local storage migration. Some of the desired execution flows for KVM local storage migration had been altered to allow only managed storage to execute. Fixed allowing managed and non managed storages to execute.
Fixes#3521
This reverts commit 7a27e35a61.
We're near 4.13 RC1, we've low confidence if the changes from #3152
would cause other regressions so reverting this. The author may send a
PR again towards 4.14.
Regressions found are all related to template and iso registration and
upload.
Retrieval of an image store using ImageStoreProviderManager has been refactored by introducing three different methods,
DataStore getRandomImageStore(List<DataStore> imageStores);
To get an image store for reading purpose. Threshold capacity check will not be used here.
DataStore getImageStoreWithFreeCapacity(List<DataStore> imageStores);
To get an image store for reading purpose. Threshold capacity check will be used here and the store with max free space will be returned. If no store with filled storage less than the threshold is found, the NULL value will be returned.
List<DataStore> listImageStoresWithFreeCapacity(List<DataStore> imageStores);
To get a list of image stores for writing purpose which fulfills threshold capacity check.
Correspondingly DataStoreManager methods have been refactored to return similar values for a given zone.
Fixes#3287 - NULL value will be returned when secondary storage is needed for writing but there is not store with free space.
Fixes#3041 - Rather than returning random secondary storage for writing, storage with max. free space will be returned.
Fixes#3478 - For migration on VMware, all writable secondary storage will be mounted while preparation.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Make use of Java 8 default implementation in interfaces,
to remove code duplication between XxxCmd and XxxCmdAsAdmin.
Refactor checkFormat by pre-calculating the supported
extensions. Also make use of this in ImageStoreUtil.
Makes it easier to add new file and compression formats.
There are certain scenarios where the 169.254.0.0/16 subnet is used for different
purposes then CloudStack on a hypervisor.
Once of such scenarios is a BGP+EVPN+VXLAN setup using BGP Unnumbered where the
169.254.0.1 address is used by Frr/Zebra BGP routing to send traffic to the
neighboring router.
The following settings can be changed in the agent.properties (default values added):
control.cidr=169.254.0.0/16
Make sure the global setting 'control.cidr' matches the values defined in the agent.propeties!
In the future the mgmt server can send this parameter to a KVM Agent on startup, but at the moment
this framework is not in place and thus these values can't be send to the Agent in a proper manner.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
Currently when refreshing disk usage stats all kvm agents are asked to collect stats for all volumes. In setups with multiple kvm hosts where managed storage is used, not all volumes are attached to all kvm hosts, this results in a large number of warnings in the kvm agent logs. This change introduces a filter step in case managed storage is used so that the management server only requests kvm agents for stats about volumes that are connected to each kvm host.
Add CephSnapshotStrategy to handle RBD revert (rollback) snapshot. In order to support RBD revert (rbd_rollback), this PR adds a CephSnapshotStrategy class to handle Ceph/RBD snapshot actions.
During volume stats calculation, if a volume has more than one disk in
the chain-info it is not used to sum the physical and virtual size
in the loop, instead any previous entry was overwritten by the last disk.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Add revoke certificates API
* Add background task to sync certificates
* Fix marvin test and revoke certificate
* Fix certificate sent to hypervisor was missing headers
* Fix background task for uploading certificates to hosts
This change addresses #3089. There was an issue when disks were being added with bus type IDE when creating windows VMs from ISOs. It is not possible to select bus type when creating a VM with an ISO. The bus type is inferred based on the platform emulator string provided to the KVM agent. Currently when creating a VM with managed storage (ex: Solidfire) and OS type string Windows*, all disks are added as IDE. Qemu currently does not support multiple IDE controllers and this configuration results in VMs that cannot be started. This issue does not occur when using NFS as the storage provider due to logic in that KVM agent that makes all data volumes (non root) use a virtio controller for file based disk. Similar logic was added for raw physical disks so that managed storage has the same behavior as NFS. In addition specific versions were removed from the code that guesses the disk controller to be used based on the platform emulator string since most modern operating systems support virtio.
Fixes#3089
Problem: The VM metrics has aggregated volume bytes read/write and iops metrics but not on per volume basis.
Root Cause: The volume stats sub-system is not used to export the metrics, the support is not available for VMware.
Solution: Use the volume stats sub-system and DB table to export the metrics via the listVolumes and listVolumeMetrics API, and implement support for VMware and fix issue with network and disk metrics in the VM metrics view.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Problem: Users don't know what keys/values to enter for template and VM details.
Root Cause: The feature does not exist that can list possible details and options.
Solution: Based on the possible VM and template details handled by the
codebase, those details were refactored and a list API is introduced
that can return users those details along with possible values. When
users add details now, they will be presented with a list of key details
and their possible options if any.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Fixes tests path from old layout to standard maven in src/test/java/
- Removed duplicate SnapshotManagerImpl at old path `server/src/com...`
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Feature Specification: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95653548
Live storage migration on KVM under these conditions:
From source and destination hosts within the same cluster
From NFS primary storage to NFS cluster-wide primary storage
Source NFS and destination NFS storage mounted on hosts
In order to enable this functionality, database should be updated in order to enable live storage capacibilty for KVM, if previous conditions are met. This is due to existing conflicts between qemu and libvirt versions. This has been tested on CentOS 6 hosts.
Additional notes:
To use this feature set the storage_motion_supported=1 in the hypervisor_capability table for KVM. This is done by default as the feature may not work in some environments, read below.
This feature of online storage+VM migration for KVM will only work with CentOS6 and possible Ubuntu as KVM hosts but not with CentOS7 due to:
https://bugs.centos.org/view.php?id=14026https://bugzilla.redhat.com/show_bug.cgi?id=1219541
On CentOS7 the error we see is: " error: unable to execute QEMU command 'migrate': this feature or command is not currently supported" (reference https://ask.openstack.org/en/question/94186/live-migration-unable-to-execute-qemu-command-migrate/). Reading through various lists looks like the migrate feature with qemu may be available with paid versions of RHEL-EV but not centos7 however this works with CentOS6.
Fix for CentOS 7:
Create repo file on /etc/yum.repos.d/:
[qemu-kvm-rhev]
name=oVirt rebuilds of qemu-kvm-rhev
baseurl=http://resources.ovirt.org/pub/ovirt-3.5/rpm/el7Server/
mirrorlist=http://resources.ovirt.org/pub/yum-repo/mirrorlist-ovirt-3.5-el7Server
enabled=1
skip_if_unavailable=1
gpgcheck=0
yum install qemu-kvm-common-ev-2.3.0-29.1.el7.x86_64 qemu-kvm-ev-2.3.0-29.1.el7.x86_64 qemu-img-ev-2.3.0-29.1.el7.x86_64
Reboot host
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Problem: Users can register ISOs from URL but cannot upload local ISOs.
Root cause: CloudStack provides browser-based upload support for volumes and templates, but ISOs are not supported.
Solution:
The existing browser-based upload from local functionality for templates and volumes (https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=39620237) is extended to support uploading local ISOs.
Extend the UI: A new button is created under the ISOs view: 'Upload from Local'. A new dialog form is displayed in which the user must select the ISO to upload from its local file system.
Extend the API: New 'GetUploadParamsForIso' API command is created to handle the ISO upload.
To make sure that a qemu2-image won't be corrupted by the snapshot deletion procedure which is being performed after copying the snapshot to a secondary store, I'd propose to put a VM in to suspended state.
Additional reference: https://bugzilla.redhat.com/show_bug.cgi?id=920020#c5Fixes#3193
* Improvements on upload direct download certificates
* Move upload direct download certificate logic to KVM plugin
* Extend unit test certificate expiration days
* Add marvin tests and command to revoke certificates
* Review comments
* Do not include revoke certificates API
Since the CloudStack virtual router was redesigned on version 4.6 it has been observed that the DHCP leases file is not persistent across network operations. This causes conflicts on guest VMs static IPs, causing these static IPs to not be renewed by the DHCP server running on isolated and VPC networks' virtual routers (dnsmasq). On stopping or destroying a VM, its dhcp/dns records are not removed from the virtual router causing ghost effects.
Fixes#3272Fixes#3354
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
We do NOT always reserve VMware CPU/RAM resources - only when "vmware.reserve.cpu" or "vmware.reserve.mem" setting is set to TRUE - AND we do so, irrelevant if overprovisioning is active or not. Verified for both system VMs and user VMs.
* DPDK vHost User mode selection
* SQL text field and DPDK classes refactor
* Fix NullPointerException after refactor
* Fix unit test
* Refactor details type
This adds memory used column in the instance metrics view. Also fixes
a bug for VMware, due to which incorrect memory usage was returned.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
When I use SandyBridge as custom cpu in my testing, vm failed to start due to following error:
```
org.libvirt.LibvirtException: unsupported configuration: guest and host CPU are not compatible: Host CPU does not provide required features: avx, xsave, aes, tsc-deadline, x2apic, pclmuldq
```
With this patch, it works with the following setting in agent.properties:
```
guest.cpu.mode=custom
guest.cpu.model=SandyBridge
guest.cpu.features=-avx -xsave -aes -tsc-deadline -x2apic -pclmuldq
```
vm cpu is defined as below:
```
<cpu mode='custom' match='exact'>
<model fallback='allow'>SandyBridge</model>
<feature policy='disable' name='avx'/>
<feature policy='disable' name='xsave'/>
<feature policy='disable' name='aes'/>
<feature policy='disable' name='tsc-deadline'/>
<feature policy='disable' name='x2apic'/>
<feature policy='disable' name='pclmuldq'/>
</cpu>
```
On first startup, the management server creates and saves a random
ssh keypair using ssh-keygen in the database. The command does
not specify keys in PEM format which is not the default as generated
by latest ssh-keygen tool.
The systemvmtemplate always needs re-building whenever there is a change
in the cloud-early-config file. This also tries to fix that by introducing a
stage 2 bootstrap.sh where the changes specific to hypervisor detection
etc are refactored/moved. The initial cloud-early-config only patches
before the other scripts are called.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This fixes the issue that VM with VMsnapshots fails to start after
extract volume is done on a stopped VM, on VMware.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Fixes PR #3146 db cleanup to the correct 4.12->4.13 upgrade path
- Fixes failing unit test due to jdk specific changes after forward
merging
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This introduces a new patching script for patching systemvms on KVM
using qemu-guest-agent that runs inside the systemvm on startup. This
also removes the vport device which was previously used by the legacy
patching script and instead uses the modern and new uniform guest
agent vport for host-guest communication.
Also updates the sytemvmtemplate build config to use the latest Debian
9.9.0 iso.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* ubuntu16: fix unable to add host if cloudbrX is not configured
while add a ubuntu16.04 host with native eth0 (cloudbrX is not configured),
the operation failed and I got the following error in /var/log/cloudstack/agent/setup.log
```
DEBUG:root:execute:ifconfig eth0
DEBUG:root:[Errno 2] No such file or directory
File "/usr/lib/python2.7/dist-packages/cloudutils/serviceConfig.py", line 38, in configration
result = self.config()
File "/usr/lib/python2.7/dist-packages/cloudutils/serviceConfig.py", line 211, in config
super(networkConfigUbuntu, self).cfgNetwork()
File "/usr/lib/python2.7/dist-packages/cloudutils/serviceConfig.py", line 108, in cfgNetwork
device = self.netcfg.getDefaultNetwork()
File "/usr/lib/python2.7/dist-packages/cloudutils/networkConfig.py", line 53, in getDefaultNetwork
pdi = networkConfig.getDevInfo(dev)
File "/usr/lib/python2.7/dist-packages/cloudutils/networkConfig.py", line 157, in getDevInfo
elif networkConfig.isBridge(dev) or networkConfig.isOvsBridge(dev):
```
The issue is caused by commit 9c7cd8c248
2017-09-19 16:45 Sigert Goeminne ● CLOUDSTACK-10081: CloudUtils getDevInfo function will now return "bridge" instead o
* ubuntu16: Stop service libvirt-bin.socket while add a host
service libvirt-bin.socket will be started when add a ubuntu 16.04 host
DEBUG:root:execute:sudo /usr/sbin/service libvirt-bin start
However, libvirt-bin service will be broken by it after restarting
Stopping service libvirt-bin.socket will fix the issue.
An example is given as below.
```
root@node32:~# /etc/init.d/libvirt-bin restart
[ ok ] Restarting libvirt-bin (via systemctl): libvirt-bin.service.
root@node32:~# virsh list
error: failed to connect to the hypervisor
error: no valid connection
error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory
root@node32:~# systemctl stop libvirt-bin.socket
root@node32:~# /etc/init.d/libvirt-bin restart
[ ok ] Restarting libvirt-bin (via systemctl): libvirt-bin.service.
root@node32:~# virsh list
Id Name State
----------------------------------------------------
```
* ubuntu16: Diable libvirt default network
By default, libvirt will create default network virbr0 on kvm hypervisors.
If vm uses the same ip range 192.168.122.0/24, there will be some issues.
In some cases, if we run tcpdump inside vm, we will see the ip of kvm hypervisor as source ip.
* Mock Scanner, instead of scan the computer running the test.
This allows non linux machines to run the tests without scanning for a
non existing /proc/meminfo.
* test fixes on 'other' platforms libvirt wrapper unit tests (#3)
* Fix XenServer Security Groups 'vmops' script
- fix tokens = line.split(':') to tokens = line.split(';')
- fix expected tokens size from 5 to 4
- enhance logs
- remove unused vmops script. The XCP patch points to the vmops script
on the parent folder [1]. Thus, all XenServer versions are considering
the vmops script located at [2].
- fix UI ipv4/ipv6 cidr validator to allow a list of cidirs.
Fixing issue: #3192 Security Group rules not applied at all for
XenServer 6.5 / Advanced Zone
https://github.com/apache/cloudstack/issues/3192
* Update security group rules after VM migration
Add security group rules on target host
Cause: vmops script expected secondary IPs as "0;" but received "0:"
Remove security group network rules on source host.
Cause: destroy_network_rules_for_vm function on vmops script was not
called when migrating VM
* Add unit tests and address reviewers
* Keep iotune section in the VM's XML after live migration
When live migrating a KVM VM among local storages, the VM loses the
<iotune> section on its XML, therefore, having no IO limitations.
This commit removes the piece of code that deletes the <iotune> section
in the XML.
* Add test for replaceStorage in LibvirtMigrateCommandWrapper
Signed-off-by: Wido den Hollander <wido@widodh.nl>
* Fix Javadoc for method replaceIpForVNCInDescFile
* feature: add libvirt / qemu io bursting
Adds the ability to set bursting features from libvirt / qemu
This allows you to utilize the iops and bytes temporary "burst" mode
introduced with libvirt 2.4 and improved upon with libvirt 2.6.
https://blogs.igalia.com/berto/2016/05/24/io-bursts-with-qemu-2-6/
* updates per rafael et al
* api: add command to list management servers
* api: add number of mangement servers in listInfrastructure command
* ui: add block for mangement servers on infra page
* api name resolution method cleanup
* - Offline VM and Volume migration on Vmware hypervisor hosts
- Also add VM disk consolidation call on successful VM migrations
* Fix indentation of marvin test file and reformat against PEP8
* * Fix few comment typos
* Refactor debug messages to use String.format() when debug log level is enabled.
* Send list of commands returned by hypervisor Guru instead of explicitly selecting the first one
* Fix unhandled NPE during VM migration
* Revert back to distinct event descriptions for VM to host or storage pool migration
* Reformat test_primary_storage file against PEP-8 and Remove unused imports
* Revert back the deprecation messages in the custom StringUtils class to favour the use of the ApacheUtils
The KVM Agent had two mechanisms for reporting its capabilities
and memory to the Management Server.
On startup it would ask libvirt the amount of Memory the Host has
and subtract and add the reserved and overcommit memory.
When the HostStats were however reported to the Management Server
these two configured values on the Agent were no longer reported
in the statistics thus showing all the available memory in the
Agent/Host to the Management Server.
This commit unifies this by using the same logic on Agent Startup
and during statistics reporting.
memory=3069636608, reservedMemory=1073741824
This was reported by a 4GB Hypervisor with this setting:
host.reserved.mem.mb=1024
The GUI (thus API) would then show:
Memory Total 2.86 GB
This way the Agent properly 'lies' to the Management Server about its
capabilities in terms of Memory.
This is very helpful if you want to overprovision or undercommit machines
for various reasons.
Overcommitting can be done when KSM or ZSwap or a fast SWAP device is
installed in the machine.
Underprovisioning is done when the Host might run other tasks then a KVM
hypervisor, for example when it runs in a hyperconverged setup with Ceph.
In addition internally many values have been changed from a Double to a Long
and also store the amount of bytes instead of Kilobytes.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
* security group: Replace deprecated optparse by argparse
Starting with Python 2.7 the library optparse has been replaced by
argpase.
This commit replaces the use of optparse by argparse
Signed-off-by: Wido den Hollander <wido@widodh.nl>
* security group: Remove LXC support from security_group.py
LXC does not work and has been partially removed from CloudStack already
Signed-off-by: Wido den Hollander <wido@widodh.nl>
* security group: Refactor libvirt code
Use a single function which properly throws an Exception when the
connection to libvirt fails.
Also simplify some logic, make it PEP-8 compatible and remove a unused
function from the code.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
* security group: Raise Exception on execute() failure
If the executed command exists with a non-zero exit status we should
still return the output to the command, but also raise an Exception.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
* security group: Use a function to determin the physical device of a bridge
We can not safely assume that the first device listed under a bridge is the
physical device.
With VXLAN isolation a vnet device can be attached to a bridge prior to the
vxlanXXXX device being attached.
We need to filter out those devices and then fetch the physical device attached
to the bridge.
In addition use the 'bridge' command instead of 'brctl'. 'bridge' is part of the
iproute2 utils just like 'ip' and should be considered as the new default.
This command is also available on EL6 and does not break any backwards compat.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
* security group: --set is deprecated, use --match-set
These messages are seen in the KVM Agent log:
--set option deprecated, please use --match-set
Functionality does not change
Signed-off-by: Wido den Hollander <wido@widodh.nl>
* security group: PEP-8 and indentation fixes
There were a lot of styling problems in the code:
- Missing whitespace or exess whitespace
- CaMelCaSe function names and variables
- 2-space indentation instead of 4 spaces
This commit addresses those issues.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
The additional queues can enhance the performance of the VirtIO SCSI disk
and it is recommended to set this to the amount of vCPUs a Instance is assigned.
The optional queues attribute specifies the number of queues for the
controller. For best performance, it's recommended to specify a value matching
the number of vCPUs. Since 1.0.5 (QEMU and KVM only)
Source: https://libvirt.org/formatdomain.html#elementsVirtio
Signed-off-by: Wido den Hollander <wido@widodh.nl>
The static method syncVolumeToRootFolder() from VmwareStorageLayoutHelper.java:146 has been incorrectly called and leads to an infinite recursive call that ends up in a StackOverflowError. This PR fixes this.
public static void syncVolumeToRootFolder(DatacenterMO dcMo, DatastoreMO ds, String vmdkName, String vmName) throws Exception { syncVolumeToRootFolder(dcMo, ds, vmdkName, null); } -> public static void syncVolumeToRootFolder(DatacenterMO dcMo, DatastoreMO ds, String vmdkName, String vmName) throws Exception { syncVolumeToRootFolder(dcMo, ds, vmdkName, vmName, null); }
* Allow KVM VM live migration with ROOT volume on file
* Allow KVM VM live migration with ROOT volume on file
- Add JUnit tests
* Address reviewers and change some variable names to ease future
implementation (developers can easily guess the name and use
autocomplete)
Added dummy and lo devices to be treated as a normal bridge slave devs.
Fixes#2998
Added two more device names (lo* and dummy*). Implemented tests. Code was refactored.
Improved paths concatenation code from "+" to Paths.get.
If a host has many routes this can be a magnitude faster then printing
all the routes and grepping for the default.
In some situations the host might have a large amount of routes due to
dynamic routing being used like OSPF or BGP.
In addition fix a couple of loglines which were throwing messages on
DEBUG while WARN and ERROR should be used there.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
When vxlan://untagged is used for public (or guest) network, use the
default public/guest bridge device same as how vlan://untagged works.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
These additional RBD features allow for faster lookups of how much space a RBD
image is using, but with the exclusive locking we prevent two VMs from writing
to the same RBD image at the same time.
These are the default features used by Ceph for any new RBD image.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
This adds a new API updateVmwareDc that allows admins to update the
VMware datacenter details of a zone. It also recursively updates
the cluster_details for any username/password updates
as well as updates the url detail in cluster_details table and guid
detail in the host_details table with any newly provided vcenter
domain/ip. The update API assumes that there is only one vCenter per
zone. And, since the username/password for each VMware host could be different
than what gets configured for vcenter at zone level, it does not update the
username/password in host_details.
Previously, one has to manually update the db with any new vcenter details for the zone.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
On actual testing, I could see that kvmheartbeat.sh script fails on NFS
server failure and stops the agent only. Any HA VMs could be launched
in different hosts, and recovery of NFS server could lead to a state
where a HA enabled VM runs on two hosts and can potentially cause
disk corruptions. In most cases, VM disk corruption will be worse than
VM downtime. I've kept the sleep interval between check/rounds but
reduced it to 10s. The change in behaviour was introduced in #2722.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>