Commit Graph

2585 Commits

Author SHA1 Message Date
Suresh Kumar Anaparti a61ba8c755 Fixed NPE issue, template is null for DATA disks. Copy template to target storage for ROOT disk (with template id), skip DATA disk(s) 2020-10-12 22:03:53 +05:30
Rohit Yadav 36166046cf
ScaleIO: Storage Plugin (Phase 0+1) (#77)
* scaleio: prototype storage plugin

- plugin skeleton
- add storage pool, create/attach data disk

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* kvm: attach disk example

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* Updated ScaleIO storage plugin to support Volume operations

* ScaleIO storage plugin - Support for VM operations and other updates

* ScaleIO storage pool plugin changes

- Added validation to check existing ScaleIO storage pool and update capacity details
- Updated resize volume for ScaleIO to pick the rounded 8GB boundary size
- Added support for setting ScaleIO storage pool statistics (bandwidthLimitInKbps, iopsLimit)

* Fixed IOPS validation and volume size update when resizing ScaleIO volume

* Removed connect/disconnect disk changes from ScaleIO storage adaptor
- ScaleIO datastore driver does map/unmap ScaleIO volume (from MS) using grant/revoke access
- Not required to map/unmap ScaleIO volume from the storage adaptor

* Updated connect disk, to wait for ScaleIO volume to become available in the KVM host

* Updated ScaleIO storage provider, pool type, url scheme and related paramters to the new "PowerFlex" brand

* Fixed size rounding issue while creating PowerFlex volume and added validations to PowerFlex Gateway API client

* Updated host sdc connection check for ScaleIO/PowerFlex pool on host connect

* Updated volume snapshots support for volumes on ScaleIO/PowerFlex storage pool and Added some validations for ScaleIO disks in host

* Added primary storage level configurable setting "storage.pool.disk.wait" to wait for disk availability

- Confiure the disk availability wait time, mainly introduced for ScaleIO/PowerFlex storage pool (can be used for other managed storages), to wait for the disk to become available in the host before performing any operation on it

* Enabled template spooling to ScaleIO/PowerFlex storage pool and create VM from the spooled template.
Added ScaleIO SDC limits support for volumes using offering parameters: bandwidthLimitInKbps, iopsLimit.

* Added support for VM snapshots on ScaleIO/PowerFlex storage pool
Minor improvements for IOPS (SDC Limits) configuration

* Updated access for ScaleIO/PowerFlex volumes on VM Start and Stop
Added primary storage level configurable setting "storage.pool.client.timeout" for storage API client
Enabled cluster wide storage pool support for ScaleIO/PowerFlex storage
Minor improvements for ScaleIO/PowerFlex disk access in the KVM host

* Added support for direct download of templates (raw, qcow2) on ScaleIO/PowerFlex storage pool

* Added support for config drives in host cache for KVM

- Changed configuration "vm.configdrive.primarypool.enabled" scope from Global to Zone level
- Introduced new zone level configuration "vm.configdrive.force.host.cache.use" (default: false) to force host cache for config drives
- Introduced new zone level configuration "vm.configdrive.use.host.cache.on.unsupported.pool" (default: true) to use host cache for config drives when storage pool doesn't support config drive
- Added new parameter "vm.configdrive.host.cache.location" (default: /var/cache/cloud) in KVM agent.properties for specifying the host cache path for config drives

* Updated disk access while migrating the VM with volumes on ScaleIO/PowerFlex storage pool
Changed the parameter "vm.configdrive.host.cache.location" to "host.cache.location" (default: /var/cache/cloud) in KVM agent.properties to specify the host cache path
Changes to create config drives on the "/config" directory on the host cache path
Changes to suppport migrate VM with config drive on the host cache path

* Additonal changes to support migrate VM with config drive on the host cache

* Detect virtual size from the template URL while registering direct download qcow2 (of KVM hypervisor) templates
Updated full deployment destination for preparing the network(s) on VM start

* Propagate the direct download certificates uploaded to the newly added KVM hosts

* Code improvements for ScaleIO/PowerFlex storage plugin

* Updated storage stats collection and tests for ScaleIO/PowerFlex storage plugin

* Fix for template size of direct download templates on capacity check for ScaleIO/PowerFlex storage pool
Updated data object grant and revoke access for connected SDCs to ScaleIO/PowerFlex storage pool

* Discover the template size for direct download templates using any available host from the zones specified on template registration

When zones are not specified while registering template, template size discovery is performed using any available host, which is picked up randomly from one of the available zones

* Maintain the config drive location and use it when required on any config drive operation (migrate, delete)

* Ensure the volume to be expunged, is expunge ready on storage cleanup

* Do not set the storage migration flag for the volumes on zone wide PowerFlex/ScaleIO pool when listing the hosts available for cross-cluster migration

* Release the VM resources when VM is sync-ed to Stopped state on PowerReportMissing (after graceful period)

* Added alerts for PowerFlex/ScaleIO SDC disconnection on the host(s)

* Retry VM deployment/start when the host cannot access volume/template on the ScaleIO/PowerFlex storage

* Changes to find a potential host that can access the ScaleIO/PowerFlex storage pool

* Updated ScaleIO/PowerFlex storage pool stats for checking the available capacity and usage

* Updated ScaleIO/PowerFlex volumes naming convention to avoid the naming conflicts on sharing

* Mark never-used or downloaded templates as Destroyed on deletion, without sending any DeleteCommand

- Do not trigger any DeleteCommand for never-used or downloaded templates as these doesn't exist and cannot be deleted from the datastore

* Updated ScaleIO/PowerFlex storage pool capacity stats

* Cleanup unused templates and host entries on PowerFlex/ScaleIO storage pool deletion

* Check the router filesystem is writable or not, before performing health checks

- Introduce a new test "filesystem.writable.test" to check the filesystem is writable or not
- The router health checks keeps the config info at "/var/cache/cloud" and updates the monitor results at "/root" for health checks, both are different partitions. So, test at both the locations.

* Updated the router filesystem writable check using script, instead cmd execution

- Added new script: "filesystem_writable_check.py" at /opt/cloud/bin/ to check the filesystem is writable or not

* Update volume stats (physical and virtual size) for the volumes on PowerFlex/ScaleIO storage pool

Co-authored-by: Suresh Kumar Anaparti <suresh.anaparti@shapeblue.com>
2020-10-07 16:02:02 +05:30
Rohit Yadav f7808cfb52 server: fix TransactionLegacy DB connection leaks due to DB switching by B&R thread (#4121)
BackupSync task would switch between databases to update backup usage
metrics in the cloud_usage.usage_backup table. The current framework
and the usage in ManagedContext causes database connection
(LegacyTransaction) leaks. When the thread runs faster, the issue is
easily reproducible and checking via heap dump analysis or using JMX
MBeans. This fixes by moving the task of backup data updation for
usage data to the usage server by publishing usage events instead of
switching between databases in a local thread while in a
ManagedContextRunnable.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
(cherry picked from commit b54d19b3b9)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2020-06-18 14:06:24 +05:30
Spaceman1984 1ca3117fd4 storage: Fixed null pointer (#4130)
Fixes #4090

When trying to migrate a VM across 2 clusters, if a snapshot has been deleted and garbage collection has run to update the removed field, it is not possible to migrate the instance due to a null pointer.

(cherry picked from commit 6a683dcf77)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2020-06-18 14:06:15 +05:30
andrijapanicsb 6f96b3b2b3 Updating pom.xml version numbers for release 4.14.0.0
Signed-off-by: andrijapanicsb <andrija.panic@shapeblue.com>
2020-05-11 15:03:14 +01:00
Daan Hoogland 689e529d7b Merge release branch 4.13 to master
* 4.13:
  Fixed guest vlan range going missing when using zone wizzard (#4042)
  Volume migration (#4043)
2020-04-23 20:19:30 +02:00
dahn c1570b9c91
Volume migration (#4043)
* Update AncientDataMotionStrategy.java

fix When secondary storage usage is> 90%, VOLUME migration across primary storage will cause the migration to fail and lose VOLUME

* Update AncientDataMotionStrategy.java

Volume is migrated across Primary storage. If no secondary storage is available(Or used capacity> 90% ), the migration is canceled.
Before modification, if secondary storage cannot be found, copyVolumeBetweenPools return NUll

copyAsync considers answer = null to be a sign of successful task execution, so it deletes the VOLUME on the old primary storage. This is the root cause of data loss, because VOLUME did not perform the migration at all.

* code in comment removed

Co-authored-by: div8cn <35140268+div8cn@users.noreply.github.com>
Co-authored-by: Daan Hoogland <dahn@onecht.net>
2020-04-23 19:56:27 +02:00
Daan Hoogland b984184b7a Merge release branch 4.13 to master
* 4.13:
  Snapshot deletion issues (#3969)
  server: Cannot list affinity group if there are hosts dedicated… (#4025)
  server: Search zone-wide storage pool when allocation algothrim is firstfitleastconsumed (#4002)
2020-04-11 16:45:00 +02:00
dahn f18fe5e1da
Snapshot deletion issues (#3969)
* Fixes snapshot deletion

* Remove legacy '@Component', it is not necessary in this bean/class.

* Fix log message missing %d and remove snapshot on DB

* Remove "dummy" boolean return statement

* Manage snapshot deletion for KVM + NFS (primary storage)

* checkstyle trailing spaces

* rename options strings to *_OPTION

* Fix typo on deleteSnapshotOnSecondaryStorage and enhance log message

* Move the snapshotDao.remove(snapshotId); (#4006)

* Fix deletesnapshot worflow to handle both snapshots created in primary storage and snapshots backed up to secondary storage

* Fix extra space

* refactor out separate handling methods for secondary and primary (reducing returns)

* return false on unexpected error or log when expected

* != instead of ==

* secondary instead of backup storage

* init to null

* Handle snapshot deletion on primary storage. When primary store ref not found for snapshot do not fail the operation.

* Fix debug levels on log messages

Co-authored-by: GabrielBrascher <gabriel@apache.org>
Co-authored-by: Andrija Panic <45762285+andrijapanicsb@users.noreply.github.com>
Co-authored-by: Harikrishna Patnala <harikrishna.patnala@gmail.com>
Co-authored-by: nvazquez <nicovazquez90@gmail.com>
2020-04-11 16:40:27 +02:00
Wei Zhou 6bf92fb136
server: Search zone-wide storage pool when allocation algothrim is firstfitleastconsumed (#4002) 2020-04-06 22:01:40 +02:00
Andrija Panic d52f3f4a6b
Update schema-41310to41400.sql (#3999)
* Update schema-41310to41400.sql

* update desc

* update the config key as well

* Update schema-41310to41400.sql (#4012)

* Update schema-41310to41400.sql

* update configkey desc
2020-04-04 14:07:14 +02:00
Spaceman1984 84bede2171
Updated upgrade paths (#3972)
* Updated upgrade paths

* Added license info

Co-authored-by: dahn <daan.hoogland@shapeblue.com>
2020-03-24 15:04:42 +01:00
Daan Hoogland 173174c804 Merge branch '4.13' 2020-03-24 13:41:45 +00:00
Spaceman1984 e9b652a5b7
Updated upgrade path (#3971)
* Updated upgrade path

* Added license info
2020-03-24 13:43:59 +01:00
Arthur Halet 3575f5ed52
vrouter in redundant mode acquire guest ips from first ip of th… (#3587) 2020-03-14 09:22:48 +01:00
pavanaravapalli d4b537efa7
UEFI Implementation: Enabled UEFI Support for Guest VM's on Hypervisor KVM,VMware. enabled boot modes [Legacy,Secure] support for UEFI boot with known caveats. (#3638)
Co-authored-by: Pavan Kumar Aravapalli <pavan_aravapalli@accelerite.com>
Co-authored-by: dahn <daan.hoogland@shapeblue.com>
2020-03-13 20:56:26 +01:00
Nicolas Vazquez efe00aa7e0
[KVM] Rolling maintenance (#3610) 2020-03-12 16:59:46 +01:00
Nicolas Vazquez 73122fd0a9
[KVM] Direct download agnostic of the storage provider (#3828)
* Remove constraint for NFS storage

* Add new property on agent.properties

* Add free disk space on the host prior template download

* Add unit tests for the free space check

* Fix free space check - retrieve avaiable size in bytes

* Update default location for direct download

* Improve the method to retrieve hosts to retry on depending on the destination pool type and scope

* Verify location for temporary download exists before checking free space

* In progress - refactor and extension

* Refactor and fix

* Last fixes and marvin tests

* Remove unused test file

* Improve logging

* Change default path for direct download

* Fix upload certificate

* Fix ISO failure after retry

* Fix metalink filename mismatch error

* Fix iso direct download

* Fix for direct download ISOs on local storage and shared mount point

* Last fix iso

* Fix VM migration with ISO

* Refactor volume migration to remove secondary storage intermediate

* Fix simulator issue
2020-03-06 19:56:54 +01:00
Rohit Yadav 58cf300fb6 Merge remote-tracking branch 'origin/4.13' 2020-03-06 14:22:46 +05:30
Nicolas Vazquez bd7d41bf6d
server: fix VM with ISO attached migration issue (#3935)
As previously described by PR #3929:
If vm has attached ISO, the migration fails with error message "org.libvirt.LibvirtException: Cannot access storage file /mnt/b33e5a1d-e4ea-3465-b6ac-c98dc8ff8af0/207-2-cc5fd717-2d57-3bb3-bcf6-2c930268db6c.iso"
2020-03-06 13:32:19 +05:30
Abhishek Kumar 8cc70c7d87
CloudStack Kubernetes Service (#3680) 2020-03-06 08:51:23 +01:00
Rohit Yadav d0e3c577c0 Merge remote-tracking branch 'origin/4.13' 2020-03-05 12:37:51 +05:30
Rohit Yadav b4fdf22397
kvm: fix/optimize propogating configs (#3911)
Make some changes based on @nvazquez 's comments in PR #3491
Fix a bug in #3491
2020-03-05 12:20:51 +05:30
Rohit Yadav 318924d801
CloudStack Backup & Recovery Framework (#3553) 2020-03-03 13:27:58 +01:00
Daan Hoogland a62a10c814 Merge branch '4.13' 2020-02-26 16:18:41 +01:00
Pearl Dsilva 4d8a2da133
api: Fix count and item issues returned by list APIs (#3894) 2020-02-26 15:14:23 +00:00
Abhishek Kumar 0ad2370baf
Enable Direct Download for System VMs (#3731)
* changes for configurable timeouts for direct download

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* server: refactor direct download config value retrieval

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* refactored direc download cmd, downloader classes

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* server, services: allow direct download template for SSVM, CPVM

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* list bypassed system templates

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* ignore direct download template during system tempalte download

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* add direct download entry while adding store

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix previous change, donot add multiple entries for direct download

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* connection request timeout as hidden configuration

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix template zone ref cleanup on zone deletion

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* fix previous commit test error, change implementation

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>

* refactored zone template cleanup

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
2020-02-26 13:38:31 +01:00
Wei Zhou ce894238d9
vpc: add bypassvlanoverlapcheck parameter when create private g… (#3899) 2020-02-23 21:21:08 +00:00
Wei Zhou 458d3b5b47
Multiple networks support for vms in advanced zone with securit… (#3639) 2020-02-19 14:02:12 +00:00
Daan Hoogland b01e011def Merge release branch 4.13 to master
* 4.13:
  KVM: Propagating changes on host parameters to the agents (#3491)
2020-02-19 14:15:52 +01:00
Wei Zhou ac7bcde45b
KVM: Propagating changes on host parameters to the agents (#3491) 2020-02-19 13:13:37 +00:00
Rakesh 4ab6b42250
server: Add new command to update security group name (#3739)
By default, once we create a security group we cant change its name.
In this feature, we introduce a new API command "updateSecurityGroup"
which allows us to rename the security group name. Although we can't
change the name of the "default" security group.
2020-02-19 13:09:52 +05:30
Andrija Panic 77fc1026bb
engine/schema: remove duplicate index region (#3882)
Remove duplicate index region.
2020-02-18 14:10:51 +05:30
Rohit Yadav d90341ebf1
cloudstack: add JDK11 support (#3601)
This adds support for JDK11 in CloudStack 4.14+:

- Fixes code to build against JDK11
- Bump to Debian 9 systemvmtemplate with openjdk-11
- Fix Travis to run smoketests against openjdk-11
- Use maven provided jdk11 compatible mysql-connector-java
- Remove old agent init.d scripts

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2020-02-12 12:58:25 +05:30
Nicolas Vazquez ce896a477d
[Vmware] Enable PVLAN support on L2 networks (#3732)
* Enable PVLAN support on L2 networks

* Fix prevent null pointer on details

* Add marvin tests

* Fixes from comments

* Fix: missing pvlan type on plugniccommand

* Fix checks on network creation for vlans overlap

* Fix remove prefix from secondary vlan id

* Improve checks on physical network for pvlans

* Fix compatibility with previous pvlan creation

* Fix shared networks backwards pvlan compatibility

* Add ui fix for pvlan type not passed to api

* Add check for isolated vlan id overlap

* Include check for dynamic vlan reserved for secondary vlan

* Fix marvin tests errors

* Fix redundant imports

* Skip marvin test for pvlan if dvswitch is not present

* spelling

Co-authored-by: Andrija Panic <45762285+andrijapanicsb@users.noreply.github.com>
2020-02-07 15:43:01 +01:00
Wei Zhou fd5bea838b
New feature: Add support to destroy/recover volumes (#3688)
* server: fix resource count of primary storage if some volumes are Expunged but not removed

Steps to reproduce the issue
(1) create a vm and stop it. check resource count of primary storage
(2) download volume. resource count of primary storage is not changed.
(3) expunge the vm, the volume will be Expunged state as there is a volume snapshot on secondary storage. The resource count of primary storage decreased.
(4) update resource count of the account (or domain), the resource count of primary storage is reset to the value in step (2).

* New feature: Add support to destroy/recover volumes

* Add integration test for volume destroy/recover

* marvin: check resource count of more types

* messages translate to JP

* Update messages for CN

* translate message for NL

* fix two issues per Daan's comments

Co-authored-by: Andrija Panic <45762285+andrijapanicsb@users.noreply.github.com>
2020-02-07 11:25:10 +01:00
Daan Hoogland 10482da136 Merge release branch 4.13 to master
* 4.13:
  vr: add missing rule for port forwarding rule in vpc (#3857)
  vpc: set traffic type of private gateway IP to Public to fix ke… (#3851)
2020-02-06 20:38:07 +01:00
Wei Zhou a9a1737dd9
vpc: set traffic type of private gateway IP to Public to fix ke… (#3851) 2020-02-06 20:22:08 +01:00
Rakesh 70daee9b10 network: set restart_required to 0 after restarting network (#3803)
After restarting the network with or without cleanup option, the restart_required field in networks table should be reset to 0.
2020-02-04 14:42:46 +01:00
Abhishek Kumar 0f5b0e67f8
VM ingestion (#3606)
The VM ingestion feature allows CloudStack to discover, on-board, import existing VMs in an infra. The feature currently works only for VMware, with a hypervisor agnostic framework which may be extended for KVM and XenServer in future.
2020-02-03 15:43:52 +01:00
Rakesh a2a4968f51
server: Allow creating network with duplicate name (#3807)
Add a global setting to disable creating networks with same name in an account

Add a global setting to disable creating network without
mentioning the start and end IPv4 or IPv6 address

By default we can create networks with the same name in the account.
Sometimes we should not create the networks with same name.
This change adds a global setting which prevents creating the network with same name.
The default value is true and set it to false to prevent creating network with same names.

Also its possible to create a shared network without mentioning the
start and the end IPv4 or IPv6 address.
This change adds a global setting which prevents creating a shared
network without specifying the start and the end IPv4 or IPv6 address
2020-01-31 15:52:42 +05:30
Rakesh 1a5b7c362e
engine/orchestration: display numeric value instead of variable name (#3818)
If the disk size of the vm to be created is greater
than the volume size, then the exception message should
display the numeric value instead of variable name
2020-01-31 15:42:06 +05:30
Rohit Yadav 424f10cc77 Merge remote-tracking branch 'origin/4.13' 2020-01-31 14:18:11 +05:30
Abhishek Kumar 9d105b6546
template: copy md5 mismatch (#3383)
Fixes #3191

When a template is registered, code stores md5sum of the downloaded file in the vm_template table. However, this downloaded file could be deleted after template installation if it is not an actual (.qcow2, .ova, etc.) file. When the user copies a template using copyTemplate API, the actual template file will be copied across the image stores. Matching checksum for the copied templated file and the stored value from the vm_template table will result in a mismatch.
Changes will set an empty checksum value for the copied template while passing to download service which allows skipping wrong checksum check for the copied while install.
However, this results in a change in checksum value for concerned template entry in vm_template table post template install.

Co-authored-by: dahn <daan.hoogland@gmail.com>
2020-01-31 14:16:37 +05:30
Anurag Awasthi c0abfce8fa
Health check feature for virtual router (#3575) 2020-01-30 12:39:03 +01:00
Wei Zhou ac581d1546
New feature: Resource count (CPU/RAM) take only running vms into calculation (#3760)
* marvin: check resource count of more types

* New feature: add flag resource.count.running.vms.only to count resource consumption of only running vms

Stopped VMs do not use CPU/RAM actually.
A new global configuration resource.count.running.vms.only is added to determine whether resource (cpu/memory) of only running vms (including Starting/Stopping) will be taken into calculation of resource consumption.

* Add integration test for resource count of only running vms
2020-01-30 10:36:50 +01:00
Rakesh 920531f42d
network: set restart_required to 0 after restarting network (#3803)
After restarting the network with or without cleanup option, the restart_required field in networks table should be reset to 0.
2020-01-30 11:12:38 +05:30
Rohit Yadav 0cb2db6e1d Merge remote-tracking branch 'origin/4.13' 2020-01-28 11:26:40 +05:30
Andrija Panic 0095272a38 upgrade: kvm-local-pool-trailing-slash (#3813)
Stop asking user (in the upgrade documentation) to remove a trailing slash for local KVM pool - do it here in upgrade path - so not needed in DOC for the upgrade to 4.14 and onwards.
2020-01-28 11:18:27 +05:30
Wei Zhou 136505b22c server: double check host capacity when start/migrate a vm (#3728)
When start a vm or migrate a vm (away from a host in host maintenance), cloudstack will check capacity of all hosts and choose one. If there are hundreds of hosts on the platform, it will take some seconds. When cloudstack choose a host and start/migrate vm to it, the resource consumption of the host might have been changed. This normally happens when we start/migrate multiple vms.
It would be better to double check the host capacity when start vm on a host.

This PR includes the fix for cpucore capacity when start/migrate a vm.
2020-01-28 10:55:11 +05:30