Fixes: #4990
When a VM associated with a backup offering is destroyed/expunged, the backup offering isn't unassigned, and despite the VM having no backups present, backup usage is generated. This PR prevent usage record generation when there are no backups present for a VM with a backup offering associated to it. This is done by ensuring that usage event for backups is generated only when a the backup size > 0
This PR fixes#5047 which can be reproduced on Zones with _(I) Advanced Networks, (II) Security Groups enabled for the Zone, (III) network offering without Security Groups_; for instance, `DefaultSharedNetworkOffering` which does not list Security Group as supported service.
The issue is due to the following code inside the method `VirtualMachineManagerImpl.orchestrateReboot`:
[VirtualMachineManagerImpl.java#L3340](280c13a4bb/engine/orchestration/src/main/java/com/cloud/vm/VirtualMachineManagerImpl.java (L3340)).
```
final Answer rebootAnswer = cmds.getAnswer(RebootAnswer.class);
if (rebootAnswer != null && rebootAnswer.getResult()) {
if (dc.isSecurityGroupEnabled() && vm.getType() == VirtualMachine.Type.User) {
List<Long> affectedVms = new ArrayList<Long>();
affectedVms.add(vm.getId());
_securityGroupManager.scheduleRulesetUpdateToHosts(affectedVms, true, null);
}
return;
}
```
Fixes: #4972
This PR sets systevms' agent state to disconnected when it is stopped. Currently, when a systemVM (Console Proxy VM / Secondary storage VM) is stopped, the agent state still appears to be 'Up'
* server: destroy ssvm, cpvm on last host maintenance
When a single or last UP host enters into maintenance just stopping SSVM and CPVM will leave behind VMs on hypervisor side. As these system vms will be recreated they can be destroyed.
Fixes#3719
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix methods
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* immediately destroy systemvms
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix destroy
Added bypassHostMaintenance flag in Comma.java class to allow command to be handled by host agent even when host is in maintenace.
Flag is set true only for delete commands for ssvm and cpvm.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* unit test fix
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix missing return statement
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix
VM should be stopped with cleanup before calling expunge else it server may through error with host in PrepareForMaintenance state.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* refactor
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* rename
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* refactor
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* forceha: fix vm is not started if it is poweroff from inside
steps to reproduce the issue
(1) make sure force.ha is true in global setting. if not, change it to true, and restart mgt server
(2) create a service offering , ha is not enabled
(3) create a vm
(4) log into the vm, and power off via cli.
expected result: vm is started again by cloudstack
actual result: vm is not started.
* forceha: fix vms are still running if host is force-removed
when host can be force removed, however vms are stopped in cloudstack, but not stopped on host
```
(localcloud) 🐱 > delete host id="a5625393-444d-4d0a-b31d-62baf88a8be1" forced=true
{
"success": true
}```
after some minutes, vms are still runnning on host
```
root@mgt01:~# ssh node63 virsh list
Id Name State
---------------------------
1 i-2-19-VM running
2 i-2-11-VM running
```
error message are
```
Cannot transmit host 2 to Enabled state
com.cloud.utils.fsm.NoTransitionException: No next resource state found for current state = Enabled event = DeleteHost
at com.cloud.resource.ResourceManagerImpl.resourceStateTransitTo(ResourceManagerImpl.java:1216)
at com.cloud.resource.ResourceManagerImpl$1.doInTransactionWithoutResult(ResourceManagerImpl.java:907)
```
* forceha: Make ForceHA dynamic
Datastore cluster as a primary storage support is already there. But if any changes at vCenter to datastore cluster like addition/removal of datastore is not synchronised with CloudStack directly. It needs removal of primary storage from CloudStack and add it again to CloudStack.
Here synchronisation of datastore cluster is fixed without need to remove or add the datastore cluster.
1. A new API is introduced syncStoragePool which takes datastore cluster storage pool UUID as the parameter. This API checks if there any changes in the datastore cluster and updates management server accordingly.
2. During synchronisation if a new child datastore is found in datastore cluster, then management server will create a new child storage pool in database under the datastore cluster. If the new child storage pool is already added as an individual storage pool then the existing storage pool entry will be converted to child storage pool (instead of creating a new storage pool entry)
3. During synchronisaton if the existing child datastore in CloudStack is found to be removed on vCenter then management server removes that child datastore from datastore cluster and makes it an individual storage pool.
The above behaviour is on par with the vCenter behaviour when adding and removing child datastore.
IKE version allows selecting ike (autoselect), ikev1, or ikev2.
Split connections gives an option of separating the first right subnet from the rest, and kicking out individual statements for each right subnet for better cross-compatibility.
Backported from PR: #4137
update per PR suggestion
Fixes#3138
Co-authored-by: Greg Goodrich <ggoodrich@ippathways.com>
Co-authored-by: Daan Hoogland <dahn@onecht.net>
Co-authored-by: Pearl Dsilva <pearl.dsilva@shapeblue.com>
This PR addresses the issue raised at #4545 (Fail to change Service offering from local <> shared storage).
When upgrading a VM service offering it is validated if the new offering has the same storage scope (local or shared) as the current offering. I think that the validation makes sense in a way of preventing running Root disks with an offering that does not match the current storage pool. However, the validation only compares both offerings and does not consider that it is possible to migrate Volumes between local <> shared storage pools.
The idea behind this implementation is that CloudStack should check the scope of the current storage pool which the ROOT volume is allocated; this, it is possible to migrate the volume between storage pools and list/upgrade according to the offerings that are supported for such pool.
This PR also fixes an issue where the API command that lists offerings for a VM should follow the same idea and list based on the storage pool that the volume is allocated and not the previous offering.
Fixes: #4545
The default length is 255, which caused a truncation of data if
the JSON object representing the backup volumes is too big.
It caused errors when backups were made on VMs with 3 volumes
or more.
`vm_instance.backup_volumes` has the type TEXT, which has a
maximal length of 65535 characters.
Fixes#4965
This PR makes sure no orphaned snapshot details are considered in the cleanup at startup job.
a real solution would be to implement some kind of cascading delete, but as the parent record is "only" marked as removed this would be a bit com
Co-authored-by: Daan Hoogland <dahn@onecht.net>
* prevent other vm disks getting deleted
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* vmware: fix inter-cluster stopped vm migration
Fixes#4838
For inter-cluster migration without shared storage, VMware needs a host to be specified. Fix is to specify an appropriate host in the target cluster.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix detached volume inter-cluster migration
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* cleanup unused method
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* review changes
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* changes
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* vmware: allow attached volume migration using VmwareStorageMotionStrategy
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* find vm clusterid with multiple ROOT volumes
VM can have multiple ROOT volumes and some can be on zone-wide store therefore iterate over all of them till a cluster ID is found.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix successive storage migration
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* fix intercluster check
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* refactor vm cluster, host method
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* remove inter-pod check
Added by mistake, VMware won't have pods
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* address review comment
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
When invoking migrateVirtualMachineWithVolume API call and a strategy isn't found the volumes are left in Migrating state
This PR puts back the volumes to Ready state.
This PR fixes: #4462
Problem Statement:
In case of VMware, when a VM having multiple data disk is destroyed (without expunge) and tried to recover the VM then the previous data disks are not attached to the VM like before destroy. Only root disk is attached to the VM.
Root cause:
All data disks were removed as part of VM destroy. Only the volumes which are selected to delete (while destroying VM) are supposed to be detached and destroyed.
Solution:
During VM destroy, detach and destroy only volumes which are selected during VM destroy. Detach the other volumes during expunge of VM.
Fixes#4201
This PR addresses the issue of a vm snapshot being indefinitely stuck is Expunging state in case deletion fails.
Co-authored-by: Pearl Dsilva <pearl.dsilva@shapeblue.com>
Fixes#4838
For inter-cluster migration without shared storage, VMware needs a host to be specified. Fix is to specify an appropriate host in the target cluster during a stopped VM migration. Also, find target datastore using the host in the target cluster.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Adds new/missing guest os mappings for XCP-ng/Xenserver 8.1
Copy guest OS mappings from XCP-ng/Xenserver 8.1 for XCP-ng/Xenserver 8.2
Adds Ubuntu 20.04 guest os mapping for XCP-ng/Xenserver 8.2
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Fixes#4517
Adds capacity checks for RandomAllocator (host allocator)
Factors out host cpu capability and capacity check wrt serviceoffering code into CapacityManager.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
This PR aims at introducing persistence mode in L2 networks and enhancing the behavior in Isolated networks
Doc PR apache/cloudstack-documentation#183
Co-authored-by: Pearl Dsilva <pearl.dsilva@shapeblue.com>
This contains 3 main changes
(1) add NETWORK_STATS_ethX for all nics with public ips in VPC VRs (current: NETWORK_STATS_eth1)
(2) DO NOT create records in user_statistics for each VPC tier (only one record per public nic per VPC VR)
(3) send NetworkUsageCommand before unplugging a NIC with public IPs from VPC VR
Public IP addresses dedicated to one domain should not be accessed
by other domains. Also, root admin should be able to display all
public ip addresses in system.
Currently following issues exist
1. Public IP address assigned to one domain can be accessed by
other sibling domains
If use.system.public.ip is false then child domains should not
see public ip of ROOT domain
Before fix
```
(test1) mgt01 > list publicipaddresses listall=true fordisplay=true allocatedonly=false forvirtualnetwork=true filter=ipaddress,
{
"count": 59,
"publicipaddress": [
```
After fix
```
(test) mgt01 > list publicipaddresses listall=true fordisplay=true allocatedonly=false forvirtualnetwork=true filter=ipaddress,
{
"count": 10,
```
* server: create DB entry for storage pool capacity when create storage pool
* Revert "server: create DB entry for storage pool capacity when create storage pool"
This reverts commit e790167bfe.
* server: create DB entry for storage pool capacity when create zone-wide storage pools
Update new systemvmtemplate for 4.15.1.0; synced:
http://download.cloudstack.org/systemvm/4.15/
A new template is necessary due to many security fixes over the last year, the 4.15.0 systemvmtemplate was created about a year ago.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Co-authored-by: Pearl Dsilva <pearl.dsilva@shapeblue.com>
* Update vm_template table removed field when template is deleted
* Update method name
* address comment
* Extracted code to separate methods
* Address test failure
* refactor test cleanup
Co-authored-by: Pearl Dsilva <pearl.dsilva@shapeblue.com>
* Updated libvirt's native reboot operation for VM on KVM using ACPI event, and Added 'forced' reboot option to stop and start the VM (using rebootVirtualMachine API)
* Added 'forced' reboot option for System VM and Router
- New parameter 'forced' in rebootSystemVm API, to stop and then start System VM
- New parameter 'forced' in rebootRouter API, to force stop and then start Router
* Added force reboot tests for User VM, System VM and Router
Duplicated volumes after failed migration in Allocated state
Fix: Clean up the duplicate volume when the destination managed volume creation failed on migrate volume operation
While finding pools for volume migration list following compatible storages:
- all zone-wide storages of the same hypervisor.
- when the volume is attached to a VM, then all storages from the same cluster as that of VM.
- for detached volume, all storages that belong to clusters of the same hypervisor.
Fixes#4692Fixes#4400
If the template from which VR is created got deleted, the state
is set to inactive and removed to null.
Since the template is already deleted, the VR can't be created
using this template again.
If someone restarts network with cleanup then it will try to
deploy the vr from the old non existing template again.
So search only for active template which are not yet deleted.
Added support for PowerFlex/ScaleIO (v3.5 onwards) storage pool as a primary storage in CloudStack (for KVM hypervisor) and enabled VM/Volume operations on that pool (using pool tag).
Please find more details in the FS here:
https://cwiki.apache.org/confluence/x/cDl4CQ
Documentation PR: apache/cloudstack-documentation#169
This enables support for PowerFlex/ScaleIO (v3.5 onwards) storage pool as a primary storage in CloudStack
Other improvements addressed in addition to PowerFlex/ScaleIO support:
- Added support for config drives in host cache for KVM
=> Changed configuration "vm.configdrive.primarypool.enabled" scope from Global to Zone level
=> Introduced new zone level configuration "vm.configdrive.force.host.cache.use" (default: false) to force host cache for config drives
=> Introduced new zone level configuration "vm.configdrive.use.host.cache.on.unsupported.pool" (default: true) to use host cache for config drives when storage pool doesn't support config drive
=> Added new parameter "host.cache.location" (default: /var/cache/cloud) in KVM agent.properties for specifying the host cache path and create config drives on the "/config" directory on the host cache path
=> Maintain the config drive location and use it when required on any config drive operation (migrate, delete)
- Detect virtual size from the template URL while registering direct download qcow2 (of KVM hypervisor) templates
- Updated full deployment destination for preparing the network(s) on VM start
- Propagate the direct download certificates uploaded to the newly added KVM hosts
- Discover the template size for direct download templates using any available host from the zones specified on template registration
=> When zones are not specified while registering template, template size discovery is performed using any available host, which is picked up randomly from one of the available zones
- Release the VM resources when VM is sync-ed to Stopped state on PowerReportMissing (after graceful period)
- Retry VM deployment/start when the host cannot grant access to volume/template
- Mark never-used or downloaded templates as Destroyed on deletion, without sending any DeleteCommand
=> Do not trigger any DeleteCommand for never-used or downloaded templates as these doesn't exist and cannot be deleted from the datastore
- Check the router filesystem is writable or not, before performing health checks
=> Introduce a new test "filesystem.writable.test" to check the filesystem is writable or not
=> The router health checks keeps the config info at "/var/cache/cloud" and updates the monitor results at "/root" for health checks, both are different partitions. So, test at both the locations.
=> Added new script: "filesystem_writable_check.py" at /opt/cloud/bin/ to check the filesystem is writable or not
- Fixed NPE issue, template is null for DATA disks. Copy template to target storage for ROOT disk (with template id), skip DATA disk(s)
* Addressed some issues for few operations on PowerFlex storage pool.
- Updated migration volume operation to sync the status and wait for migration to complete.
- Updated VM Snapshot naming, for uniqueness in ScaleIO volume name when more than one volume exists in the VM.
- Added sync lock while spooling managed storage template before volume creation from the template (non-direct download).
- Updated resize volume error message string.
- Blocked the below operations on PowerFlex storage pool:
-> Extract Volume
-> Create Snapshot for VMSnapshot
* Added the PowerFlex/ScaleIO client connection pool to manage the ScaleIO gateway clients, which uses a single gateway client per Powerflex/ScaleIO storage pool and renews it when the session token expires.
- The token is valid for 8 hours from the time it was created, unless there has been no activity for 10 minutes.
Reference: https://cpsdocs.dellemc.com/bundle/PF_REST_API_RG/page/GUID-92430F19-9F44-42B6-B898-87D5307AE59B.html
Other fixes included:
- Fail the VM deployment when the host specified in the deployVirtualMachine cmd is not in the right state (i.e. either Resource State is not Enabled or Status is not Up)
- Use the physical file size of the template to check the free space availability on the host, while downloading the direct download templates.
- Perform basic tests (for connectivity and file system) on router before updating the health check config data
=> Validate the basic tests (connectivity and file system check) on router
=> Cleanup the health check results when router is destroyed
* Updated PowerFlex/ScaleIO storage plugin version to 4.16.0.0
* UI Changes to support storage plugin for PowerFlex/ScaleIO storage pool.
- PowerFlex pool URL generated from the UI inputs(Gateway, Username, Password, Storage Pool) when adding "PowerFlex" Primary Storage
- Updated protocol to "custom" for PowerFlex provider
- Allow VM Snapshot for stopped VM on KVM hypervisor and PowerFlex/ScaleIO storage pool
and Minor improvements in PowerFlex/ScaleIO storage plugin code
* Added support for PowerFlex/ScaleIO volume migration across different PowerFlex storage instances.
- findStoragePoolsForMigration API returns PowerFlex pool(s) of different instance as suitable pool(s), for volume(s) on PowerFlex storage pool.
- Volume(s) with snapshots are not allowed to migrate to different PowerFlex instance.
- Volume(s) of running VM are not allowed to migrate to other PowerFlex storage pools.
- Volume migration from PowerFlex pool to Non-PowerFlex pool, and vice versa are not supported.
* Fixed change service offering smoke tests in test_service_offerings.py, test_vm_snapshots.py
* Added the PowerFlex/ScaleIO volume/snapshot name to the paths of respective CloudStack resources (Templates, Volumes, Snapshots and VM Snapshots)
* Added new response parameter “supportsStorageSnapshot” (true/false) to volume response, and Updated UI to hide the async backup option while taking snapshot for volume(s) with storage snapshot support.
* Fix to remove the duplicate zone wide pools listed while finding storage pools for migration
* Updated PowerFlex/ScaleIO volume migration checks and rollback migration on failure
* Fixed the PowerFlex/ScaleIO volume name inconsistency issue in the volume path after migration, due to rename failure
- Fixes inter-cluster migration of VMs
- Allows migration of stopped VM with disks attached to different and suitable pools
- Improves inter-cluster detached volume migration
- Allows inter-cluster migration (clusters of same Pod) for system VMs, VRs on VMware
- Allows storage migration for stopped system VMs, VRs on VMware within same Pod if StoragePool cluster scopetype
Linked Primate PR: https://github.com/apache/cloudstack-primate/pull/789 [Changes merged in this PR after new UI merge]
Documentation PR: https://github.com/apache/cloudstack-documentation/pull/170
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Steps to reproduce the issue:
(1)Create 10000 service offerings (by db changes below or cloudmonkey).
```
DROP PROCEDURE IF EXISTS cloud.insert_service_offering;
DELIMITER $$
CREATE PROCEDURE cloud.insert_service_offering()
BEGIN
DECLARE count INT DEFAULT 10000;
SET @offeringid = (select max(id)+1 from disk_offering);
WHILE count > 0 DO
INSERT INTO disk_offering (id,name,uuid,display_text,disk_size,type,created) values (@offeringid,'test-offering-wei',uuid(), 'test-offering-wei',0,'Service',now());
INSERT INTO service_offering (id,cpu,speed,ram_size) values (@offeringid, 1, 500,256);
SET @offeringid = @offeringid + 1;
SET count = count - 1;
END WHILE;
END $$
DELIMITER ;
CALL cloud.insert_service_offering();
mysql> CALL cloud.insert_service_offering();
Query OK, 0 rows affected (2 min 30.85 sec)
```
(2) Check the total time of periodical capacity check in cloudstack.
Without this patch, it spend 2.5 seconds (2 hosts)
```
2021-01-15 16:10:12,793 DEBUG [c.c.a.AlertManagerImpl] (CapacityChecker:ctx-5d5f3b3b) (logid:f5eb68ba) Running Capacity Checker ...
2021-01-15 16:10:15,287 DEBUG [c.c.a.AlertManagerImpl] (CapacityChecker:ctx-5d5f3b3b) (logid:f5eb68ba) Done running Capacity Checker ...
```
With this patch ,it spend 1.3 seconds (2 hosts)
```
2021-01-15 16:12:43,604 DEBUG [c.c.a.AlertManagerImpl] (CapacityChecker:ctx-a2a7f3f1) (logid:f7e0a4c5) Running Capacity Checker ...
2021-01-15 16:12:44,927 DEBUG [c.c.a.AlertManagerImpl] (CapacityChecker:ctx-a2a7f3f1) (logid:f7e0a4c5) Done running Capacity Checker ...
```
If there are 100 hosts, the total time will be reduced from 100+ seconds to around 10 seconds.
* 4.15:
server: select root disk based on user input during vm import (#4591)
kvm: Use Q35 chipset for UEFI x86_64 (#4576)
server: fix wrong error message when create isolated network without SourceNat (#4624)
server: add possibility to scale vm to current customer offerings (#4622)
server: keep networks order and ips while move a vm with multiple networks (#4602)
server: throw exception when update vm nic on L2 network (#4625)
doc: fix typo in install notes (#4633)
* 4.14:
server: select root disk based on user input during vm import (#4591)
kvm: Use Q35 chipset for UEFI x86_64 (#4576)
server: fix wrong error message when create isolated network without SourceNat (#4624)
server: add possibility to scale vm to current customer offerings (#4622)
server: keep networks order and ips while move a vm with multiple networks (#4602)
server: throw exception when update vm nic on L2 network (#4625)
doc: fix typo in install notes (#4633)
This PR fixes an issue when move a vm from an account to another account.
Steps to reproduce the issue
(1) create a vm with multiple shared networks (in advanced zone, or advanced zone with security groups)
(2) create another account (in same domain who can also access the shared networks)
(3) move vm to new account, with a list of networkid
expected result: the vm has nics on the networks in same order as specified in API request, and nics have the same ips as before actual result: network order is not same as specified, ips are changed.
* server: fix cannot create vm if another vm with same name has been added and removed on the network
steps to reproduce the issue
(1) create vm-1 on network-1
(2) add vm-1 to network-2
(3) remove vm-1 from network-2
(4) create another vm with same name vm-1 on network-2
expected result: operation succeed
actual result: operation failed.
* #4600: add back a removed line
Fix db upgrade path conflict, add 4.15.1.0->4.16.0.0 for master, bump
systemvmtemplate version to 4.16.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Fix for mapping guest OS type read from OVF to existing guest OS in CloudStack database while registering VMware template
* Added unit tests to String Utils methods and updated the code
* Updated the java doc section
* Updated os description logic to keep equals ignore match with guest os display name
Update the guest OS from the OVF file after upload is completed
This PR fixes the template upload from local on VMware
Co-authored-by: dahn <daan.hoogland@gmail.com>
Co-authored-by: dahn <daan.hoogland@gmail.com>
This PR addresses an error that appears when you try to add a new host. I don't even understand why there was a cast to String in the first place. I will assume some classes send HypervisorType and some send a string (empty or otherwise). Shouldn't this be addressed to use the same type everywhere? With this fix adding a new xenserver host works fine.
Co-authored-by: dahn <daan.hoogland@gmail.com>
* Setting snapshot state to error on timeout
* Setting removed field so snapshot record is ignored by garbage collection
* Removed explicitly setting error status, renamed method from markFailed to markRemoved
* Renamed method, moved code a few lines down
* Moved remove logic
* Removed unused service
* Moved removed logic - last time, promise
For Basic network isolation methods are not provided, and exception is
thrown when trying to encode the Vlan id. That's why we have to check
before encoding that the list with isolation methods is not empty