Commit Graph

31492 Commits

Author SHA1 Message Date
ustcweizhou b8522c97cb server: allow dedicate ip range to a domain if ips are used by an accout in the domain (#3206)
when we dedicate public ip range to a domain but some ips are used by an account in the domain,
the operation should be allowed but actually fails for now.
It is because cloudstack check if ips are used by same account by account name,
However, accountName is null when dedicate public ip range to a domain.

Modify the code to check account id only when dedicate ip range to account.
2019-05-31 12:24:33 +05:30
ustcweizhou bd78030385 server: update dhcp configurations in vrs while update default nic of running vms (#3205)
In virtual routers, there are different dnsmasq settings for default nic and non-default nic on vm.
We need to update dhcp informations on network vrs when default nic is changed.

For example, if 172.16.1.135 is non-default nic of vm VPC1-001-001, then

root@r-22-VM:~# cat /etc/dhcphosts.txt
02:00:1d:15:00:05,set:172_16_1_135,172.16.1.135,VPC1-001-001,710h
root@r-22-VM:~# cat /etc/dhcpopts.txt
172_16_1_135,3
172_16_1_135,6
172_16_1_135,15

If it is default nic,then

root@r-22-VM:~# cat /etc/dhcpopts.txt
root@r-22-VM:~# cat /etc/dhcphosts.txt
02:00:1d:15:00:05,172.16.1.135,VPC1-001-001,757h

Fixes #3201
2019-05-31 12:23:55 +05:30
Rohit Yadav 8c387f9de6
vmware: fix potential NPE when memory hotplug capability is checked (#3362)
This fixes potential NPE case when memory hotpluggability is checked
based on the guest OS descriptor.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-05-31 10:27:36 +05:30
ustcweizhou f6f381fc68 ui: fix enable static nat only towards first nic and not on any other interface (#3338)
When enable static nat in a vpc on UI, it only lists the primary and secondary ips of first nic of a vm, no matter which vpc tier is selected. The same issue happens when add a vm to load balancer.

Fixes #3334
2019-05-30 11:44:52 +05:30
smlshn f1efcc1af6 ui: reset multiselect actions when refreshing listView in Instance page (#3359)
Enables the toolbar to reset to its initial state after any multiSelectAction completed.

Fixes #3337
2019-05-30 11:38:14 +05:30
ustcweizhou 8e43d258f3 server: Fail to restart VPC with cleanup if there are multiple public IPs in different subnet" (#3342)
If there are multiple IPs in different subnet assigned to a VPC, after restarting VPC with cleanup, the VRs will be FAULT state.

Step to reproduce:
(1) create vpc, source nat IP is 10.11.118.X
(2) assign two public IPs in other subnet to this VPC. 10.11.119.X and 10.11.119.Y
(3) deploy two vms in the vpc, and enable static nat 10.11.119.X and 10.11.119.Y to these two vms
(4) restart vpc with cleanup. There are more than 1 nic allocated for 10.11.119 to new VRs

Logs as below:
2019-05-10 14:12:24,652 DEBUG [o.a.c.e.o.NetworkOrchestrator] (API-Job-Executor-36:ctx-839f6522 job-652 ctx-35fb4667) (logid:1ab7aa37) Allocating nic for vm VM[DomainRouter|r-85-VM] in network Ntwk[200|Public|1] with requested profile NicProfile[0-0-null-10.11.118.157-vlan://untagged
2019-05-10 14:12:24,676 DEBUG [o.a.c.e.o.NetworkOrchestrator] (API-Job-Executor-36:ctx-839f6522 job-652 ctx-35fb4667) (logid:1ab7aa37) Allocating nic for vm VM[DomainRouter|r-85-VM] in network Ntwk[200|Public|1] with requested profile NicProfile[0-0-null-10.11.119.110-vlan://119
2019-05-10 14:12:24,699 DEBUG [o.a.c.e.o.NetworkOrchestrator] (API-Job-Executor-36:ctx-839f6522 job-652 ctx-35fb4667) (logid:1ab7aa37) Allocating nic for vm VM[DomainRouter|r-85-VM] in network Ntwk[200|Public|1] with requested profile NicProfile[0-0-null-10.11.119.110-vlan://119
2019-05-10 14:12:24,723 DEBUG [o.a.c.e.o.NetworkOrchestrator] (API-Job-Executor-36:ctx-839f6522 job-652 ctx-35fb4667) (logid:1ab7aa37) Allocating nic for vm VM[DomainRouter|r-85-VM] in network Ntwk[200|Public|1] with requested profile NicProfile[0-0-null-10.11.119.110-vlan://119

This is a regression issue caused by commit 1d382e0
2019-05-30 11:33:03 +05:30
dahn 910b08f72b server: fix duplicate tag exception as CloudRuntimeException (#3348)
See #3339: a runtime exception is thrown but it should be converted to an error return. Wrapping it in a CloudRuntimeException should do the trick.

Fixes #3339
2019-05-30 11:25:52 +05:30
Rohit Yadav 0929866956
server: ssh-keygen in PEM format and reduce main systemvm patching script (#3333)
On first startup, the management server creates and saves a random
ssh keypair using ssh-keygen in the database. The command does
not specify keys in PEM format which is not the default as generated
by latest ssh-keygen tool.

The systemvmtemplate always needs re-building whenever there is a change
in the cloud-early-config file. This also tries to fix that by introducing a
stage 2 bootstrap.sh where the changes specific to hypervisor detection
etc are refactored/moved. The initial cloud-early-config only patches
before the other scripts are called.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-05-23 18:08:00 +05:30
Nicolas Vazquez e86f671c8e KVM: Fix agents dont reconnect post maintenance (#3239)
* Keep connection alive when on maintenance

* Refactor cancel maintenance and unit tests

* Add marvin tests

* Refactor

* Changing the way we get ssh credentials

* Add check on SSH restart and improve marvin tests
2019-05-23 14:13:17 +02:00
Rohit Yadav 9ff819da2c
systemvm: new qemu-guest-agent based patching for KVM (#3278)
This introduces a new patching script for patching systemvms on KVM
using qemu-guest-agent that runs inside the systemvm on startup. This
also removes the vport device which was previously used by the legacy
patching script and instead uses the modern and new uniform guest
agent vport for host-guest communication.

Also updates the sytemvmtemplate build config to use the latest Debian
9.9.0 iso.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-05-10 23:42:19 +05:30
Anurag Awasthi f9b61bc737 orchestration: Allow VM that has never started to have volumes attached (#3276)
With this patch b766bf7
we started tracking disks in attaching state so that other attach request can fail gracefully. However this missed the case where disks were in allocated state but attach was requested.

For the use case where users want to attach disk in allocated state but not ready, we need to have allocated-attaching transition as well. We must take care of returning to the original state - allocated or ready - when attach request has completed.

For the use case of unstarted vm's the disk must proceed as follows - "Allocated" -> Attaching -> Allocated. When VM is started, the disk is "created" and pool is assigned. For the use case of started VMs it's more trivial and disk proceeds as follows - Ready -> Attaching -> Ready.

Test this by creating a VM with "startvm=false", create a disk and try attaching it in allocated state. It would give an exception on latest 4.11 but will be fixed on this patch.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-05-10 23:40:38 +05:30
ustcweizhou b60daf7142 server: Fix exception while update domain resource count (#3204) 2019-04-29 08:51:08 +02:00
Rohit Yadav 96611fc640
packaging: systemctl daemon-reload after agent install or upgrade (#3269)
This runs systemctl daemon-reload after cloudstack-agent is installed
or upgraded.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-04-09 14:21:09 +05:30
Rohit Yadav 55efaf14d9
packaging: don't skip unit tests while building packages (#3266)
This may slow down CI and release, but ensures that unit tests always
run as part of the packaging build process.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-04-08 13:51:30 +05:30
Boris Stoyanov - a.k.a Bobby c4b06ffe19
Merge pull request #3238 from shapeblue/tls-vmware-issue-fix
client: don't disable TLSv1, TLSv1.1 by default that breaks VMware env
2019-03-27 11:39:54 +02:00
Rohit Yadav 6e51bde228 client: don't disable TLSv1, TLSv1.1 by default that breaks VMware env
This fixes the issue that TLSv1 and TLSv1.1 are still used by CloudStack
management server to communicate with VMware vCenter server. With the
current defaults, the setup/deployment on VMware fails. Users/admins
can however setup the security file according to their env needs to
disable TLSv1 and TLSv1.1 for server sockets (8250/agent service for
example).

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-03-26 18:15:20 +05:30
Paul Angus 8b25fdf520 tests: fix some Marvin smoketests (#2869)
Fixes intermittent failures:
- Add a pause to avoid  tests restarting before VRs recovering from HA have fully booted.
- Add some pauses to allow services to restart and hosts to reconnect before continuing tests.
- Adding a loop around this method because sometimes VRs are overloaded and just respond with exception, so it'll catch it and try 5 times with 30sec cooldown. If that fails as well it'll fail the test.
- wait until host is up using explicit check
2019-03-26 15:33:40 +05:30
Dingane Hlaluku 55fb1c4eb6 server: Allow users to create L2 network types (#3158)
Allow users of all types to create L2 guest networks.

Fixes #3081
2019-03-25 13:12:19 +05:30
Rohit Yadav f7327c7457 systemd: Fix -Dpid arg passing to systemd usage service (#3210)
* systemd: Fix -Dpid arg passing to systemd usage service

This fixes regression introduced by refactoring PR #3163 where `-Dpid`
was incorrectly passed string `$$` instead of parent PID integer.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* fix systemd limitation, exec using /bin/sh instead and wrap in ${} syntax

https://www.freedesktop.org/software/systemd/man/systemd.service.html#Command%20lines

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* usage: don't hide exception from Gabriel's https://github.com/apache/cloudstack/pull/3207/files#diff-062fcf5ae32de59dfd6cd4f780e1d7cd

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-03-14 09:19:12 -03:00
Rohit Yadav cb3fed0e4e systemd: fix services to allow TLS configurations via java.security.ciphers (#3163)
* systemd: fix services to allow TLS configurations via java.security.ciphers

This fixes the management server and systemd services to allow the
java.security.ciphers file to configure disabled TLS protocols and
algorithms. This also cleans up systemd service files for agent and
usage server.

This fixes #3140

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* configure: fix travis failure due pycodestyle error

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-02-04 19:51:30 -02:00
Nathan Johnson bf805d1483 Add back ability to disable backup of snapshot to secondary (#3122)
* The snapshot.backup.rightafter configuration variable was removed by:

SHA: 6bb0ca2f85

This adds it back, though named snapshot.backup.to.secondary now instead.

This global parameter, once set, will allow you to prevent automatic backups of
     snapshots to secondary storage, unless they're actually needed.

Fixes #3096

* updates per review
2019-02-04 19:08:42 -02:00
Rohit Yadav 463372bc7e
packaging: management default file cleanup (#3139)
This cleanups management server default file, the `cloud.jks` is no
longer created by the management server but instead created in-memory
by the root CA plugin on management server startup.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-01-25 22:19:33 +05:30
Rohit Yadav 53ec27cd7a
scripts: don't restart systemvm cloud.service post cert import (#3134)
This ensures that the systemvm agent (cloud.service) is not restarted
on certificate import. The agent has an inbuilt logic to attempt reconnection.
If the old certificates/keystore is invalid agent will attempt reconnection.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-01-16 21:48:57 +05:30
Dingane Hlaluku e56c499fb8 vmware: syncVolumeToRootFolder method to avoid an infite recursive loop (#3105)
The static method syncVolumeToRootFolder() from VmwareStorageLayoutHelper.java:146 has been incorrectly called and leads to an infinite recursive call that ends up in a StackOverflowError. This PR fixes this.
public static void syncVolumeToRootFolder(DatacenterMO dcMo, DatastoreMO ds, String vmdkName, String vmName) throws Exception { syncVolumeToRootFolder(dcMo, ds, vmdkName, null); } -> public static void syncVolumeToRootFolder(DatacenterMO dcMo, DatastoreMO ds, String vmdkName, String vmName) throws Exception { syncVolumeToRootFolder(dcMo, ds, vmdkName, vmName, null); }
2019-01-07 13:59:45 +05:30
Nicolas Vazquez 13c81a8ee4 server: Prevent corner case for infinite PrepareForMaintenance (#3095)
A corner case was found on 4.11.2 for #2493 leading to an infinite loop in state PrepareForMaintenance

To prevent such cases, in which failed migrations are detected but still running on the host, this feature adds a new cluster setting host.maintenance.retries which is the number of retries before marking the host as ErrorInMaintenance if migration errors persist.

How Has This Been Tested?
- 2 KVM hosts, pick one which has running VMs as H
- Block migrations ports on H to simulate failures on migrations:
iptables -I OUTPUT -j REJECT -m state --state NEW -m tcp -p tcp --dport 49152:49215 -m comment --comment 'test block migrations' iptables -I OUTPUT -j REJECT -m state --state NEW -m tcp -p tcp --dport 16509 -m comment --comment 'test block migrations
- Put host H in Maintenance
- Observe that host is indefinitely in PrepareForMaintenance state (after this fix it goes into ErrorInMaintenance after retrying host.maintenance.retries times)
2018-12-28 15:14:16 +05:30
Rohit Yadav 68b4b84101
marvin: add missing test data for test_migration smoketest (#3106)
Add missing test data for test_migration smoketest, use test_template for smoketest.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-12-24 23:43:57 +05:30
Anurag Awasthi a7ccbdc790 api: allow keyword search in listSSHKeyPairs (#2920) (#3098)
Adds support for keyword search that was ignored by listsshkeypairs command.

Fixes: #2920
2018-12-23 00:34:53 +05:30
Craig Squire 8d53557ba7 api: don't throttle api discovery for listApis command (#2894)
Users reported that they weren't getting all apis listed in cloudmonkey when running a sync. After some debugging, I found that the problem is that the ApiDiscoveryService is calling ApiRateLimitServiceImpl.checkAccess(), so the results of the listApis command are being truncated because Cloudstack believes the user has exceeded their API throttling rate.

I enabled throttling with a 25 request per second limit. I then created a test role with only list* permissions and assigned it to a test user. When this user calls listApis, they will typically receive anywhere from 15-18 results. Checking the logs, you see The given user has reached his/her account api limit, please retry after 218 ms..

I raised the limit to 200 requests per second, restarted the management server and tried again. This time I got 143 results and no log messages about the user being throttled.
2018-12-12 23:55:32 +05:30
Rohit Yadav 408cce48a5
travis: fail fast if --with-marvin fails with nose (#3024)
* travis: fail fast if --with-marvin fails with nose

Install missing dependency pycrypto.
This fixes issue with recent Travis runs which gave incorrect results
around smoketests with simulator where each test run failed with an
error like "nosetests: error: no such option: --with-marvin".

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-12-07 23:45:19 +05:30
Craig Squire 290df5f423 api: Discover tags field on superclass of API responses (#3005)
Updated ApiServiceDiscoveryImpl to check superclasses of API responses for fields.

Fixes: #3002
2018-12-04 13:59:48 +05:30
Rohit Yadav 89c567add8
security: increase keystore setup/import timeout (#3076)
This increases and uses a default 15mins timeout for VR scripts and for
KVM agent increases timeout from 60s to 5mins. The timeout can
specifically occur when keystore does not get enough entropy from CPU
and script gets killed due to timeout. This is a very specific corner
case and generally should not happen on baremetal/prod environment, but
sometimes seen in nested/test environments.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-12-04 01:28:24 +05:30
Sven Vogel 17097929b6 packaging: correct permissions in spec file and fix class path specified variable (#3030)
Install CentOS 7 e.g. Build 1804 and Java build 1.8.0_181

if you inspect systemd in debug mode you will see some errors
1.
permission of the cloudstack-managment.service are not corretly set
2.
invalid classpath specified. it seems the string which is used will be divided... we now we use ${..} like the lines above ... confused
2018-12-01 01:38:01 +05:30
Boris Stoyanov - a.k.a Bobby 44bc516609 api: move ostypeid from DB id to DB uuid, backports #2528 (#3066)
This is a backport to 4.11 of #2528
2018-11-29 22:20:51 +05:30
Rohit Yadav 29b8a9da48
kvm: when untagged vxlan is used, use the default guest/public bridge (#3037)
When vxlan://untagged is used for public (or guest) network, use the
default public/guest bridge device same as how vlan://untagged works.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-11-28 22:22:30 +05:30
Rene Diepstraten d425a409fc sg: add secondary ips to the correct ipset based on ip family (#2990)
Currently secondary ipv6 addresses are added to the ipv4 ipset in security_group.py.
This doesn't work, so this patch adds a function to split a set of ips in ipv4 and ipv6 addresses.
Both the default_network_rules and network_rules_vmSecondaryIp functions now utilise this function and add the ips to the appropriate ipsets.
2018-11-28 19:30:13 +05:30
Rohit Yadav a84f7dfde9
marvin: add missing default test data (#3055)
This add missing test data for one of the keys for a recently added
migration test.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-11-28 00:17:13 +05:30
Rohit Yadav 411f368845 Merge branch 'origin/4.11.2.0-RC20181113T0924' into 4.11
Doing a ignored RC5 tree merge to get 4.11.2.0 tag on 4.11 branch. This
should have been done before merging the commit to move to
4.11.3.0-SNAPSHOT version on 4.11 branch.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-11-21 13:09:19 +05:30
Rohit Yadav fbb0d92687
surefire: ignore system classloader to make tests run (#3038)
Due to issue described in Surefix bug:
https://issues.apache.org/jira/browse/SUREFIRE-1588

Debian-based users/developers can no longer build CloudStack 4.11+
branches. The other workaround is to have the following jvm property:
jdk.net.URLClassPath.disableClassPathURLCheck=true

Signed-off-by: Rohit Yadav <rohit@apache.org>
2018-11-20 21:13:44 +05:30
Paul Angus fb80e51307 Updating pom.xml version numbers for release 4.11.3.0-SNAPSHOT
Signed-off-by: Paul Angus <paul.angus@shapeblue.com>
2018-11-20 13:11:52 +00:00
Nicolas Vazquez 4fabbdb1ae test: Skip network migration tests for not supported hypervisors (#3021)
Skip network migration tests for not supported hypervisors instead of failing.
2018-11-14 15:04:08 +05:30
Paul Angus 5aae410dfc Updating pom.xml version numbers for release 4.11.2.0
Signed-off-by: Paul Angus <paul.angus@shapeblue.com>
2018-11-13 09:24:27 +00:00
Rohit Yadav 4d8e75cec8
systemvmtemplate: update debian 9.6 iso url and checksum (#3022)
This fixes the failing systemvmtemplate build due to 404 on old ISO url.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-11-13 06:30:09 +05:30
Bitworks LLC f6e600e4d8 CLOUDSTACK-3009: Fix resource calculation CPU, RAM for accounts. (#3012)
The view "service_offering_view" doesn't include removed SOs, as a result when SO is removed, the bug happens. The PR introduces a change for resource calculation changing "service_offering_view" to "service_offering" table which has all service offerings.

Must be fixed in:

4.12
4.11
Fixes: #3009
2018-11-13 06:29:08 +05:30
Paul Angus f95aec4a84
Merge pull request #3018 from shapeblue/fixrouterfilecreation
Prevent error on GroupAnswers on VR creation
2018-11-12 13:25:08 +00:00
Nicolas Vazquez bb7493ad4b configdrive: Add missing ConfigDrive entries on existing zones after upgrade (#3007)
After upgrade existing environments to 4.11, ConfigDrive cannot be enabled for existing zones due to missing entry on 'physical_network_service_providers' table.
2018-11-12 11:30:00 +05:30
nvazquez dea0b3eb78 Prevent error on GroupAnswers on VR creation 2018-11-09 15:30:57 -03:00
Nicolas Vazquez 7d8eb37924 [4.11] Fix set initial reservation on public IP ranges (#2980)
* Fix initial reservation on public IP ranges

* Do not allow dedicating a system VM IP range
2018-11-07 10:48:07 -02:00
Nicolas Vazquez af0c1e48cf Fix DirectNetworkGuru canHandle checks for lowercase isolation methods (#3010) 2018-11-07 09:53:01 -02:00
Rohit Yadav c6e53f6cc6
kvm: reset KVM host on heartbeat failure (#2984)
On actual testing, I could see that kvmheartbeat.sh script fails on NFS
server failure and stops the agent only. Any HA VMs could be launched
in different hosts, and recovery of NFS server could lead to a state
where a HA enabled VM runs on two hosts and can potentially cause
disk corruptions. In most cases, VM disk corruption will be worse than
VM downtime. I've kept the sleep interval between check/rounds but
reduced it to 10s. The change in behaviour was introduced in #2722.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-10-30 15:13:59 +05:30
Rene Diepstraten f7bc5807a3 vr: defer was broken in VR because of json name change
Committed at f0491d5c72 (#2979).

After upgrade from CS 4.10 to CS 4.11, multiple VRs did not start through.
It did not properly defer the finalize config in update_config.py.
Apparently, the json files are now called differently: where it used to
be vm_dhcp_entry.json it now has a uuid added, for example
vm_metadata.json.4d727b6e-2b48-49df-81c3-b8532f3d6745.
The if statement that checks if the finalize can be safely deferred
therefore no longer matches. This PR contains a fix so finalize is
defered again.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-10-29 16:19:33 +05:30