Commit Graph

2331 Commits

Author SHA1 Message Date
Paul Angus b72aadb65d Updating pom.xml version numbers for release 4.11.4.0-SNAPSHOT
Signed-off-by: Paul Angus <paul.angus@shapeblue.com>
2019-07-03 09:28:09 +01:00
Rohit Yadav 6f1fc18332 Revert "Updating pom.xml version numbers for release 4.11.4.0-SNAPSHOT"
This reverts commit 5bfad44ef4 because
we'll need another RC on latest 4.11 branch towards 4.11.3.0.
2019-06-25 21:31:05 +05:30
Rohit Yadav 64d1316e60 engine/schema: add upgrade path for source 4.11.2.0 version
This adds a missing upgrade path for source version 4.11.2.0. Without
this fix, 4.11.2.0 users won't be able to upgrade to 4.11.3.0.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-06-25 21:31:05 +05:30
Paul Angus 5bfad44ef4 Updating pom.xml version numbers for release 4.11.4.0-SNAPSHOT
Signed-off-by: Paul Angus <paul.angus@shapeblue.com>
2019-06-22 10:21:02 +01:00
Paul Angus 51124b7b35 Updating pom.xml version numbers for release 4.11.3.0
Signed-off-by: Paul Angus <paul.angus@shapeblue.com>
2019-06-10 16:15:05 +01:00
Rohit Yadav a3e5664b53
schema: add 4.11.2 to 4.11.3 systemvmtemplate upgrade path (#3381)
Add 4.11.2 to 4.11.3 systemvmtemplate upgrade path

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-06-07 23:03:05 +05:30
Nicolas Vazquez c9ce3e2344 router: Persistent DHCP leases file on VRs and cleanup /etc/hosts on VM deletion (#3351)
Since the CloudStack virtual router was redesigned on version 4.6 it has been observed that the DHCP leases file is not persistent across network operations. This causes conflicts on guest VMs static IPs, causing these static IPs to not be renewed by the DHCP server running on isolated and VPC networks' virtual routers (dnsmasq). On stopping or destroying a VM, its dhcp/dns records are not removed from the virtual router causing ghost effects.

Fixes #3272
Fixes #3354

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-06-03 17:04:16 +05:30
Nicolas Vazquez e86f671c8e KVM: Fix agents dont reconnect post maintenance (#3239)
* Keep connection alive when on maintenance

* Refactor cancel maintenance and unit tests

* Add marvin tests

* Refactor

* Changing the way we get ssh credentials

* Add check on SSH restart and improve marvin tests
2019-05-23 14:13:17 +02:00
Anurag Awasthi f9b61bc737 orchestration: Allow VM that has never started to have volumes attached (#3276)
With this patch b766bf7
we started tracking disks in attaching state so that other attach request can fail gracefully. However this missed the case where disks were in allocated state but attach was requested.

For the use case where users want to attach disk in allocated state but not ready, we need to have allocated-attaching transition as well. We must take care of returning to the original state - allocated or ready - when attach request has completed.

For the use case of unstarted vm's the disk must proceed as follows - "Allocated" -> Attaching -> Allocated. When VM is started, the disk is "created" and pool is assigned. For the use case of started VMs it's more trivial and disk proceeds as follows - Ready -> Attaching -> Ready.

Test this by creating a VM with "startvm=false", create a disk and try attaching it in allocated state. It would give an exception on latest 4.11 but will be fixed on this patch.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-05-10 23:40:38 +05:30
ustcweizhou b60daf7142 server: Fix exception while update domain resource count (#3204) 2019-04-29 08:51:08 +02:00
Nathan Johnson bf805d1483 Add back ability to disable backup of snapshot to secondary (#3122)
* The snapshot.backup.rightafter configuration variable was removed by:

SHA: 6bb0ca2f85

This adds it back, though named snapshot.backup.to.secondary now instead.

This global parameter, once set, will allow you to prevent automatic backups of
     snapshots to secondary storage, unless they're actually needed.

Fixes #3096

* updates per review
2019-02-04 19:08:42 -02:00
Nicolas Vazquez 13c81a8ee4 server: Prevent corner case for infinite PrepareForMaintenance (#3095)
A corner case was found on 4.11.2 for #2493 leading to an infinite loop in state PrepareForMaintenance

To prevent such cases, in which failed migrations are detected but still running on the host, this feature adds a new cluster setting host.maintenance.retries which is the number of retries before marking the host as ErrorInMaintenance if migration errors persist.

How Has This Been Tested?
- 2 KVM hosts, pick one which has running VMs as H
- Block migrations ports on H to simulate failures on migrations:
iptables -I OUTPUT -j REJECT -m state --state NEW -m tcp -p tcp --dport 49152:49215 -m comment --comment 'test block migrations' iptables -I OUTPUT -j REJECT -m state --state NEW -m tcp -p tcp --dport 16509 -m comment --comment 'test block migrations
- Put host H in Maintenance
- Observe that host is indefinitely in PrepareForMaintenance state (after this fix it goes into ErrorInMaintenance after retrying host.maintenance.retries times)
2018-12-28 15:14:16 +05:30
Paul Angus fb80e51307 Updating pom.xml version numbers for release 4.11.3.0-SNAPSHOT
Signed-off-by: Paul Angus <paul.angus@shapeblue.com>
2018-11-20 13:11:52 +00:00
Bitworks LLC f6e600e4d8 CLOUDSTACK-3009: Fix resource calculation CPU, RAM for accounts. (#3012)
The view "service_offering_view" doesn't include removed SOs, as a result when SO is removed, the bug happens. The PR introduces a change for resource calculation changing "service_offering_view" to "service_offering" table which has all service offerings.

Must be fixed in:

4.12
4.11
Fixes: #3009
2018-11-13 06:29:08 +05:30
Rohit Yadav 9cf57d2568
network: on rolling restart force stop old routers (#2926)
This force stops old VRs when performing rolling restart with
cleanup=true. This will ensure that VRs are powered off quickly than
wait longer for the normal ACPI shutdown. During testing, it was found
on VMware where VM stops are slow compared to XenServer and KVM.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-10-25 09:20:39 +05:30
Rohit Yadav 5ce14df31f
network: Allow ability to disable rolling restart feature (#2900)
This adds a global setting for admins who may not want the rolling
restart of routers or are seeing any issues around it. In future, this
setting may be removed.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-10-17 20:27:08 +05:30
Rohit Yadav ea771cfda4
router: Fixes #2719 program VR nics by device id order for VPC (#2888)
This fixes #2719 where private gateway IP might be incorrectly
programmed on a guest network nic. The VR would now check ipassoc
requests by mac addresses than provided nic/device id in case they are
wrong.

The root cause is that the device id information is lost when aggregated
commands are created upon starting of a new VPC VR, without the correct
device id in ip_associations json it mis-programs the VR.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-10-10 15:20:36 +05:30
Paul Angus fe10e684f9
Merge pull request #2743 from nuagenetworks/bugfix/marvin_config_drive
CLOUDSTACK-10380: Fix startvm giving another pw after pw reset
2018-09-26 10:21:52 -04:00
Rohit Yadav 70dbfa7883
systemvm: export $TYPE before patching ssvm/cpvm (#2855)
This fixes a regression introduced in #2799, by exporting $TYPE
before the `patch` is called to patch/extract archives for ssvm/cpvm.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-09-21 14:19:18 +05:30
René Moser 223a373e53 orchestration: Fixes #2845 PowerReportMissing for new VRs (#2846)
Fixes #2845
2018-09-18 11:34:31 +05:30
Frank Maximus 02e2825d2d CLOUDSTACK-10380: Fix startvm giving another password after password reset. 2018-09-17 16:33:35 +02:00
Nicolas Vazquez 8aff96cfc5 Fixes #2838 exception in Vmware full clones update (#2840)
Fixes #2838
2018-09-14 13:58:28 +05:30
Rohit Yadav 5a046e243a
systemvmtemplate: new 4.11.2 template and fixes (#2799)
VMware router will be rebooted based on #2794, per current config
the VRs on reboot will go through fsck checks slowing down the deployment
process by few seconds. This will ensure that fsck checks are done
on every 3rd boot of the VR. The `4` is used because 1st boot is done
during the build of systemvmtemplate appliance.

Add upgrade path for a new 4.11.2 systemvmtemplate.
Other changes:
- Add support for XS 7.5 Fixes #2834.
- Reboot VR only if mgmt gw is not pingable on vmware.
- Enable passive ftp by enabling nf_conntrack_helper. This is change in behaviour since linux 4.7

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-09-12 14:42:05 +05:30
Rohit Yadav 2ab3976c0d
CLOUDSTACK-9473: storage pool capacity check when volume is resized or migrated (#2829)
* CLOUDSTACK-9473: storage pool capacity check when volume is resized or migrated

Storage pool checker is not being called on resize and migrate volume.
This may lead to allocated percentage of storage above 100%.

Setup:
1 VMware cluster with 2 Hosts.

Executed Steps:

Applied the following global settings:
storage.overprovisioning.factor = 1
pool.storage.allocated.capacity.disablethreshold = 1
pool.storage.capacity.disablethreshold = 1
Restarted management server
Executed Resize and migrate pool and Observed that Storage pool checker is not performed on resizeVolume and migrateVolume.
Result:
Root cause analysis shows storage pool checker is not called when doing migration and resizing.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-09-07 22:01:16 +05:30
dahn eb3953f41a server: expunge if flag is set (#2825)
In integration work for CCS I found that the service call UserVmService.destroyVm(long uuid, boolean expunge) does not honour the expunge flag. I traced it down to the implementation VirtualMachineManagerImpl.destroy(String vmUuid, boolean expunge).
Testing: manual testing so far, testing will pose some crosscutting challanges as the behaviour and implementation are seperated by about five layers of abstraction.
2018-09-04 13:38:26 +05:30
Nicolas Vazquez c68713470d backport: Update DBCP version to 4.11 (#2809)
Backport #2718 to 4.11 branch for 4.11.2.0
2018-08-17 16:01:57 +05:30
Mike Tutkowski ab83c198a5 Changed the implementation of isVolumeOnManagedStorage(VolumeInfo) to check if the data store in question is for primary storage (and added a unit test from Daan Hoogland) 2018-08-10 11:24:18 -06:00
Khosrow Moossavi 67860d9f46 maven: Updating pom.xml version numbers for release 4.11.2.0-SNAPSHOT (#2728)
Fixes the version in pom etc. to be consistent with versioning pattern as X.Y.Z.0-SNAPSHOT after a minor release.

Signed-off-by: Khosrow Moossavi <khos2ow@gmail.com>
2018-07-06 17:27:12 +05:30
Paul Angus 8ba318da19 Updating pom.xml version numbers for release 4.11.2-SNAPSHOT
Signed-off-by: Paul Angus <paul.angus@shapeblue.com>
2018-06-26 17:53:54 +01:00
Paul Angus 2cb2dacbe7 Updating pom.xml version numbers for release 4.11.1.0
Signed-off-by: Paul Angus <paulangus@PA-Ansible-GUI.sblab.local>
2018-06-21 15:52:43 +01:00
Nicolas Vazquez 539d7e10f3
Merge pull request #2493 from shapeblue/fixmaintenance
CLOUDSTACK-10326: Prevent hosts fall into Maintenance when there are running VMs on it
2018-06-20 12:00:58 -03:00
Rohit Yadav 39471c8c00
configdrive: make fewer mountpoints on hosts (#2716)
This ensure that fewer mount points are made on hosts for either
primary storagepools or secondary storagepools.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-06-20 12:25:16 +05:30
nvazquez 08a8330633 CLOUDSTACK-10326: Fix for infinite loop on PrepareForMaintenance 2018-06-11 09:53:21 -03:00
nvazquez cc35f9ddb0 CLOUDSTACK-10326: Prevent hosts fall into Maintenance when there are running VMs on it 2018-06-11 09:53:20 -03:00
Daan Hoogland 84b0cf0809 comment on unencryption 2018-06-11 09:19:26 +00:00
Daan Hoogland e09069cee7 isisnot= 2018-06-08 17:16:36 +02:00
Daan Hoogland 2bf78e92a5 extra message 2018-06-08 16:45:40 +02:00
Daan Hoogland 82a46d1b8d debug message 2018-06-08 16:30:49 +02:00
Daan Hoogland 40f856181c imports 2018-06-08 13:20:48 +02:00
Daan Hoogland 384bce1a97 update without decrypt doesn't work 2018-06-08 12:55:05 +02:00
Daan Hoogland 5fcadbcc62 set unsensitive attributes as not 'Secure' 2018-06-07 22:26:26 +02:00
Daan Hoogland 935ca766dc remove old config artifacts from update path 2018-06-07 07:52:24 +00:00
Nicolas Vazquez 76367db8fb L2: add default L2 network offerings (#2683)
Adds default L2 network offerings. Adds check for existing default L2 networks.
2018-06-07 11:23:35 +05:30
Nicolas Vazquez 99ca81a676 ui: do not send conserve mode on L2 network offering creation from the UI (#2694)
Do not send conserve mode param on L2 network offering creation from the UI. Fix config drive NPE issue on L2 network.
2018-06-07 11:20:37 +05:30
Frank Maximus 8798014ca8 CLOUDSTACK-10377: Fix Network restart for Nuage (#2672)
Changes in PR #2508 have caused network restart to fail in a Nuage setup,
as the new VR takes the same IP as the old one, and the old VR is still running.
Nuage doesn't support multiple VM's having the same IP.
We delay provisioning the interfaces in VSD until the old VR interface is released.
2018-06-06 12:17:10 +05:30
Rafael Weingärtner 9b83337658 Create unit test cases for 'ConfigDriveBuilder' class (#2674)
* Create unit test cases for 'ConfigDriveBuilder' class

* add method 'getProgramToGenerateIso' as suggested by rohit and Daan

* fix encoding for base64 to StandardCharsets.US_ASCII

* fix MockServerTest.testIsMockServerCanUpgradeConnectionToSsl()

This is another method that is causing Jenkins to fail for almost a month
2018-06-04 13:20:09 +02:00
Mike Tutkowski e471a46a05 managed-storage: handle VM start in a new cluster on different host (#2656)
Example: A VM that uses managed storage is stopped. The VM is then started on a different host in the same cluster. The Start operation fails.

To get around this issue, you must either start the VM up on the same host or on a host in a different cluster.

The reason is due to a slightly erroneous check in VolumeOrchestrator.prepare.

To solve this issue, we should be checking if the cluster ID changes, not if the host ID changes.
2018-05-21 16:22:33 +05:30
Rohit Yadav acc5fdcdbd
CLOUDSTACK-10290: allow config drives on primary storage for KVM (#2651)
This introduces a new global setting `vm.configdrive.primarypool.enabled` to toggle creation/hosting of config drive iso files on primary storage, the default will be false causing them to be hosted on secondary storage. The current support is limited from hypervisor resource side and in current implementation limited to `KVM` only. The next big change is that config drive is created at a temporary location by management server and shipped to either KVM or SSVM agent via cmd-answer pattern, the data of which is not logged in logs. This saves us from adding genisoimage dependency on cloudstack-agent pkg.

The APIs to reset ssh public key, password and user-data (via update VM API) requires that VM should be shutdown. Therefore, in the refactoring I removed the case of updation of existing ISO. If there are objections I'll re-put the strategy to detach+attach new config iso as a way of updation. In the refactored implementation, the folder name is changed to lower-cased configdrive. And during VM start, migration or shutdown/removal if primary storage is enable for use, the KVM agent will handle cleanup tasks otherwise SSVM agent will handle them.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-05-21 14:27:23 +05:30
Slair1 f23278a438 CLOUDSTACK-10309: Add option on if to VM HA power-on a OOB-shut-off-VM (#2473)
When a user shuts down their VM from the guest OS (and VM HA is enabled), the VM just powers itself back on. Our environment is on KVM hosts.

CloudStack does not know the difference between a VM failing or being shutdown from within the guest OS.

This is a major pain point for all our users - especially since they don't pay for VMs when they are shutoff. It is not intuitive for end-users to understand why they can't shutdown VMs from within the guest OS. Especially when they all come from (non-cloudstack) VMware and Hyper-V environments where this is not an issue.

However, if a host fails, we need VM HA to still work.

This PR that creates a configuration option "ha.vm.restart.hostup". With this option set to false, if CloudStack sees a VM shutdown out-of-band, but the host it was on is still online, then it won't power the VM back on. The logic is that since the host is online, it was most likely shutdown from the guest OS.

For when a host actually fails, standard VM HA logic takes over and powers on VMs (if they have VM HA enabled) if the host they were on fails.

If that "ha.vm.restart.hostup" option is true (the default to match current functionality), it works like always, and even in-guest shutdowns of VMs causes CloudStack to power back on the VM.
2018-05-21 13:13:38 +05:30
Rafael Weingärtner b9ed42bd29
Fix primary storage count when deleting volumes (#2629)
* Primary Storage count for an account does not decrease when a Data Disk is deleted

When a data disk is created and not attached in a running VM, the "deleteVolume" will not decrement the count for used primary storage in the VMs accounting information. The property that is not being decremented is called "primarystoragetotal"; this information can be retrieved via "listAccounts" API method.

Steps to reproduce this issue:
1 - Create an account, deploy a VM in it
2 - Check the primary storage count for the account with listAccounts API
3 - Create a data disk
4 - Check the primary storage count for the account with listAccounts API
5 - Delete the Data disk
6 - Check the primary storage count for the account with listAccounts API - It is the same as before deleting the data disk (it should not be the same as the value in step 2!)

* formatting and cleanups

* fix imports that were wrongly changed during rebase
2018-05-16 15:28:28 -03:00