This PR fixes #5047, which can be reproduced on zones with _(I) Advanced Networking, (II) Security Groups enabled for the zone, and (III) a network offering without Security Groups_; for instance, `DefaultSharedNetworkOffering`, which does not list Security Group as a supported service.
The issue is due to the following code inside the method `VirtualMachineManagerImpl.orchestrateReboot`:
[VirtualMachineManagerImpl.java#L3340](280c13a4bb/engine/orchestration/src/main/java/com/cloud/vm/VirtualMachineManagerImpl.java (L3340)).
```
final Answer rebootAnswer = cmds.getAnswer(RebootAnswer.class);
if (rebootAnswer != null && rebootAnswer.getResult()) {
    // dc.isSecurityGroupEnabled() is a zone-level flag: it does not check whether
    // the VM's network offering actually provides the Security Group service.
    if (dc.isSecurityGroupEnabled() && vm.getType() == VirtualMachine.Type.User) {
        List<Long> affectedVms = new ArrayList<Long>();
        affectedVms.add(vm.getId());
        _securityGroupManager.scheduleRulesetUpdateToHosts(affectedVms, true, null);
    }
    return;
}
```
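The zone-level check means the ruleset update is scheduled even for VMs whose network offering does not provide the Security Group service. A minimal sketch of a per-network gate instead, where `networkSupportsSecurityGroups()` is a hypothetical helper standing in for a lookup of the offering's supported services:
```
// Sketch only: gate the ruleset update on the networks the VM actually uses.
boolean vmUsesSecurityGroups = false;
for (final NicVO nic : _nicsDao.listByVmId(vm.getId())) {
    if (networkSupportsSecurityGroups(nic.getNetworkId())) {   // hypothetical helper
        vmUsesSecurityGroups = true;
        break;
    }
}
if (vmUsesSecurityGroups && vm.getType() == VirtualMachine.Type.User) {
    _securityGroupManager.scheduleRulesetUpdateToHosts(
            Collections.singletonList(vm.getId()), true, null);
}
```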
* forceha: fix VM not started if it is powered off from inside
Steps to reproduce the issue:
(1) Make sure force.ha is true in the global settings; if not, change it to true and restart the management server.
(2) Create a service offering with HA not enabled.
(3) Create a VM.
(4) Log into the VM and power it off via the CLI.
Expected result: the VM is started again by CloudStack.
Actual result: the VM is not started (a sketch of the intended behavior follows).
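A rough sketch of the intended behavior, not the actual patch; `isOfferingHaEnabled()`, `forceHaEnabled()`, and `haManager` are hypothetical stand-ins for the real settings and manager:
```
// Sketch only: restart a user VM that the hypervisor reports as powered off
// when force.ha is enabled, even if the service offering has HA disabled.
void onPowerOffReport(final VirtualMachine vm) {
    final boolean offeringHa = isOfferingHaEnabled(vm.getServiceOfferingId()); // hypothetical
    if (vm.getType() == VirtualMachine.Type.User && (offeringHa || forceHaEnabled())) {
        haManager.scheduleRestart(vm, true);   // hypothetical HA manager call
    }
}
```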
* forceha: fix VMs still running after host is force-removed
A host can be force-removed; however, the VMs are only marked as stopped in CloudStack and are not actually stopped on the host:
```
(localcloud) 🐱 > delete host id="a5625393-444d-4d0a-b31d-62baf88a8be1" forced=true
{
  "success": true
}
```
After some minutes, the VMs are still running on the host:
```
root@mgt01:~# ssh node63 virsh list
 Id   Name        State
------------------------
 1    i-2-19-VM   running
 2    i-2-11-VM   running
```
The error message is:
```
Cannot transmit host 2 to Enabled state
com.cloud.utils.fsm.NoTransitionException: No next resource state found for current state = Enabled event = DeleteHost
at com.cloud.resource.ResourceManagerImpl.resourceStateTransitTo(ResourceManagerImpl.java:1216)
at com.cloud.resource.ResourceManagerImpl$1.doInTransactionWithoutResult(ResourceManagerImpl.java:907)
```
* forceha: Make ForceHA dynamic
If the VM details contain `rootdisksize`, the volume entry in the DB should reflect the correct size when a VM reset is performed (see the sketch below).
Fixes #3957
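A minimal sketch of the idea, with `vmDetail()` as a hypothetical detail lookup (the `rootdisksize` detail is in GB):
```
// Sketch only: on VM reset, sync the root volume row with the rootdisksize detail.
void syncRootDiskSize(final VirtualMachine vm, final VolumeVO rootVolume) {
    final String rootDiskSizeGb = vmDetail(vm, "rootdisksize");   // hypothetical lookup
    if (rootDiskSizeGb != null) {
        final long sizeBytes = Long.parseLong(rootDiskSizeGb) * 1024L * 1024L * 1024L;
        rootVolume.setSize(sizeBytes);
        volumeDao.update(rootVolume.getId(), rootVolume);   // persist the corrected size
    }
}
```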
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Co-authored-by: Pearl Dsilva <pearl.dsilva@shapeblue.com>
There is a potential security issue with allowing HTTP access to the VR from anywhere.
This PR restricts HTTP access to the VR to the internal network only.
Fixes #4517
Adds capacity checks to RandomAllocator (the host allocator).
Factors the host CPU capability and capacity checks against the service offering out into CapacityManager.
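Conceptually the allocator change looks like this (a sketch; the CapacityManager call signature is illustrative):
```
// Sketch only: random ordering first, then the shared capability/capacity check.
List<HostVO> listSuitableHosts(final List<HostVO> hosts, final ServiceOffering offering) {
    Collections.shuffle(hosts);
    final List<HostVO> suitable = new ArrayList<>();
    for (final HostVO host : hosts) {
        // illustrative signature for the check factored out into CapacityManager
        if (_capacityMgr.checkIfHostHasCpuCapabilityAndCapacity(host, offering)) {
            suitable.add(host);
        }
    }
    return suitable;
}
```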
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Fixes #4729
As reported in the issue, ACS 4.7 used a normal UUID in the DB for a pre-setup primary store on XenServer.
Later, the value was changed to the store's path with '/' removed.
The current changes try to retrieve the SR's name-label from the store's path when the UUID does not match the path field for a pre-setup store.
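A sketch of that fallback, assuming a hypothetical `srExists()` lookup against XenServer:
```
// Sketch only: if the stored UUID does not identify an SR, fall back to the
// name-label derived from the store's path.
String resolveSrNameLabel(final StoragePool pool) {
    if (srExists(pool.getUuid())) {
        return pool.getUuid();              // normal case: the UUID matches an SR
    }
    final String path = pool.getPath();     // e.g. "/mySR" for a pre-setup store
    return path.replaceFirst("^/", "");     // derive the name-label from the path
}
```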
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
* server: fix failure to remove template/ISO when upload from local fails
When a template/ISO/volume upload from local fails, the install_path is not a full file path, so removing it fails:
```
mysql> select install_path from template_store_ref;
+--------------------------------------------------------------------+
| install_path |
+--------------------------------------------------------------------+
| template/tmpl/1/3/805f4763-248e-40ec-b79a-b868cc480d0a.qcow2 |
| template/tmpl/1/4/c7e32c9e-5e72-3726-85cf-aa5ccd84118d.qcow2 |
| template/tmpl/2/201/bc4f4f08-138a-31b8-af1a-d4450eff7982.qcow2 |
| template/tmpl/2/202 |
| template/tmpl/2/203/203-2-d47f8cde-a2a8-31e7-a826-2628ad98a6c8.iso |
| template/tmpl/2/204 |
| template/tmpl/5/205 |
| template/tmpl/2/206 |
| template/tmpl/2/207 |
| template/tmpl/2/208 |
| template/tmpl/2/209 |
| template/tmpl/2/210 |
+--------------------------------------------------------------------+
12 rows in set (0.00 sec)
mysql> select install_path from volume_store_ref;
+---------------------------------------------------------+
| install_path |
+---------------------------------------------------------+
| volumes/2/22 |
| volumes/2/19/f93face9-6521-4184-b89a-cb07f86bbae8.qcow2 |
| volumes/2/23 |
| volumes/2/24 |
+---------------------------------------------------------+
4 rows in set (0.00 sec)
```
* server: disallow removing a template/ISO in the NotUploaded and UploadInProgress states
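A minimal sketch of that guard; the accessor and state names below are illustrative stand-ins for the real template-store state:
```
// Sketch only: refuse deletion while the upload has not completed or failed.
void checkTemplateRemovable(final TemplateDataStoreVO templateStoreRef) {
    final Status state = templateStoreRef.getDownloadState();   // illustrative accessor
    if (state == Status.NOT_UPLOADED || state == Status.UPLOAD_IN_PROGRESS) {
        throw new InvalidParameterValueException(
                "Cannot remove the template/ISO while its upload has not completed");
    }
}
```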
When running `get_bridge_physdev(brname)` from security_group.py, the bridge device is returned as `brname:` (with a trailing colon) instead of the expected `brname`.
We experienced this issue on CloudStack 4.13.1.0 with Security Groups enabled for Advanced Networking; the KVM nodes run Ubuntu 18.04.
PR #4303 (merged in 4.15) added support for Ubuntu 20.04, which turned out to fix `get_bridge_physdev(brname)`; however, we faced the very same issue with the earlier CloudStack and Ubuntu versions.
Even though we might not get a 4.13.2, and CloudStack 4.14.1.0 has just been released, this PR proposes merging the fix into branch 4.13 and then forwarding it to 4.14, allowing users to have this fix referenced and opening the possibility of addressing it in a potential 4.14.2.0.
* Prevent KVM from performing volume migrations of running instances
KVM cannot modify an instance's definition while it is in the running state, so a volume's backend location cannot easily be changed for a running instance.
There is a problem in the `migrateVolume` API: it ignores that limitation and causes an inconsistency in the database. ACS processes the migrate command, copies the volume to the destination storage, modifies the database, and finishes the process with success. However, the running backend is still using the old volume file.
This PR prevents KVM volume migrations while the instance is in the running state and informs the user of an alternative API command that enables such an operation on running instances.
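A minimal sketch of such a guard (the exact placement and message are illustrative; it would sit in the `migrateVolume` validation path):
```
// Sketch only: reject live volume migration for running KVM instances.
if (vm != null && vm.getState() == VirtualMachine.State.Running
        && vm.getHypervisorType() == HypervisorType.KVM) {
    throw new InvalidParameterValueException(
            "KVM does not support migrating volumes of a running instance; "
            + "migrate the instance together with its volumes instead");
}
```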
* Update VolumeApiServiceImpl.java
Co-authored-by: Daniel Augusto Veronezi Salvador <daniel@scclouds.com.br>
Co-authored-by: Rohit Yadav <rohit@apache.org>
Steps to reproduce the issue:
(1) Create two VMs, wei-001 and wei-002, and start them.
(2) Check /etc/cloudstack/dhcpentry.json and /etc/dhcphosts.txt in the VR: they have entries for both wei-001 and wei-002.
(3) Stop wei-002 and restart the VR (or restart the network with cleanup), then check /etc/cloudstack/dhcpentry.json and /etc/dhcphosts.txt in the VR: they have entries for wei-001 only (as wei-002 is stopped).
(4) Expunge wei-002. When it is done, check /etc/cloudstack/dhcpentry.json and /etc/dhcphosts.txt in the VR: they no longer have an entry for wei-001.
The VR health check then fails at dhcp_check.py and dns_check.py (the expected removal logic is sketched below).
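The expected removal logic, sketched with hypothetical `readEntries()`/`writeEntries()` helpers: expunging one VM should only drop that VM's entry.
```
// Sketch only: on expunge, remove just the expunged VM's entry and rewrite the
// file with the remaining entries intact.
Map<String, String> entries = readEntries("/etc/cloudstack/dhcpentry.json");
entries.remove(expungedVmName);                            // remove wei-002 only
writeEntries("/etc/cloudstack/dhcpentry.json", entries);   // wei-001 must survive
```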
macchinina-vmware.ova has changed at http://dl.openvm.eu/cloudstack/macchinina/x86_64/;
the sha1, sha256, and md5 checksums for the template file have been updated in test_template.py.
Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
Steps to reproduce the issue:
(1) Create 10,000 service offerings (via the DB changes below or via cloudmonkey).
```
DROP PROCEDURE IF EXISTS cloud.insert_service_offering;

DELIMITER $$
CREATE PROCEDURE cloud.insert_service_offering()
BEGIN
    DECLARE count INT DEFAULT 10000;
    SET @offeringid = (select max(id)+1 from disk_offering);
    WHILE count > 0 DO
        INSERT INTO disk_offering (id,name,uuid,display_text,disk_size,type,created) values (@offeringid,'test-offering-wei',uuid(),'test-offering-wei',0,'Service',now());
        INSERT INTO service_offering (id,cpu,speed,ram_size) values (@offeringid, 1, 500, 256);
        SET @offeringid = @offeringid + 1;
        SET count = count - 1;
    END WHILE;
END $$
DELIMITER ;

mysql> CALL cloud.insert_service_offering();
Query OK, 0 rows affected (2 min 30.85 sec)
```
(2) Check the total time of the periodic capacity check in CloudStack.
Without this patch, it takes about 2.5 seconds (2 hosts):
```
2021-01-15 16:10:12,793 DEBUG [c.c.a.AlertManagerImpl] (CapacityChecker:ctx-5d5f3b3b) (logid:f5eb68ba) Running Capacity Checker ...
2021-01-15 16:10:15,287 DEBUG [c.c.a.AlertManagerImpl] (CapacityChecker:ctx-5d5f3b3b) (logid:f5eb68ba) Done running Capacity Checker ...
```
With this patch, it takes about 1.3 seconds (2 hosts):
```
2021-01-15 16:12:43,604 DEBUG [c.c.a.AlertManagerImpl] (CapacityChecker:ctx-a2a7f3f1) (logid:f7e0a4c5) Running Capacity Checker ...
2021-01-15 16:12:44,927 DEBUG [c.c.a.AlertManagerImpl] (CapacityChecker:ctx-a2a7f3f1) (logid:f7e0a4c5) Done running Capacity Checker ...
```
If there are 100 hosts, the total time will be reduced from 100+ seconds to around 10 seconds (one possible shape of the optimization is sketched below).
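One possible shape of the optimization, sketched with illustrative names (the actual change may differ): avoid re-querying the same service offering for every VM row during a single checker run.
```
// Sketch only: resolve each service offering once per capacity-checker run.
final Map<Long, ServiceOfferingVO> offeringCache = new HashMap<>();
for (final VMInstanceVO vm : runningVms) {
    final ServiceOfferingVO offering = offeringCache.computeIfAbsent(
            vm.getServiceOfferingId(), _serviceOfferingDao::findById);  // one DB hit per id
    addVmCapacity(vm, offering);   // hypothetical accounting step
}
```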
We can use cloudmonkey to scale a VM with a dynamic offering to the same offering but with a different cpunumber or memory.
This PR enables the same on the UI to improve the user experience.
This PR fixes an issue when moving a VM from one account to another.
Steps to reproduce the issue:
(1) Create a VM with multiple shared networks (in an advanced zone, or an advanced zone with security groups).
(2) Create another account (in the same domain, with access to the same shared networks).
(3) Move the VM to the new account with a list of network IDs.
Expected result: the VM has NICs on the networks in the same order as specified in the API request, and the NICs keep the same IPs as before.
Actual result: the network order is not the same as specified, and the IPs are changed (see the sketch below).
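A sketch of the intended allocation order, with hypothetical helpers for the previous-IP lookup and NIC allocation:
```
// Sketch only: allocate NICs in the order the networkids parameter listed the
// networks, reusing the IP each NIC had before the move when still available.
for (final Long networkId : requestedNetworkIds) {            // preserves API order
    final String previousIp = oldIpByNetwork.get(networkId);  // hypothetical map
    allocateNic(vm, networkId, previousIp);                   // hypothetical helper
}
```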
* server: fix cannot create VM if another VM with the same name was added to and removed from the network
Steps to reproduce the issue:
(1) Create vm-1 on network-1.
(2) Add vm-1 to network-2.
(3) Remove vm-1 from network-2.
(4) Create another VM with the same name vm-1 on network-2.
Expected result: the operation succeeds.
Actual result: the operation fails (see the duplicate-name check sketched below).
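A sketch of a duplicate-name check that would allow this, with an illustrative DAO call and name lookup:
```
// Sketch only: a duplicate-hostname check that ignores NICs already marked as
// removed, so a name freed in step (3) can be reused in step (4).
boolean hostNameInUse(final String hostName, final long networkId) {
    for (final NicVO nic : _nicsDao.listByNetworkId(networkId)) {  // illustrative DAO call
        if (nic.getRemoved() == null && hostName.equalsIgnoreCase(vmHostName(nic))) {
            return true;   // only active NICs count
        }
    }
    return false;
}
```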
* #4600: add back a removed line
This PR fixes a regression introduced by #4497.
In CloudStack 4.14 and earlier, the CPU topology was set only when CPU cores per socket was set (to 4 or 6); in other conditions there was no CPU topology in the VM XML definition.
With #4497, a VM gets a CPU topology in its XML definition even if CPU cores per socket is not set: `<topology sockets='<vm cpu cores>' cores='1' threads='1'/>`
It is not clear whether this causes any issue, but it seems better not to add this element to the VM XML definition when CPU cores per socket is not set, as sketched below.
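A sketch of the suggested behavior (names are illustrative):
```
// Sketch only: emit <topology> only when cores-per-socket was explicitly set,
// restoring the pre-#4497 behavior otherwise.
String cpuTopologyXml(final int vcpus, final Integer coresPerSocket) {
    if (coresPerSocket == null) {
        return "";   // no explicit setting: let libvirt pick the default topology
    }
    final int sockets = vcpus / coresPerSocket;
    return String.format("<topology sockets='%d' cores='%d' threads='1'/>",
            sockets, coresPerSocket);
}
```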
* Fix failure in validating IP address in case of multiple Management Servers
* refactor code
Co-authored-by: Pearl Dsilva <pearl.dsilva@shapeblue.com>