CLOUDSTACK-9402 : Marvin tests for Source NAT and Static NAT features verification with NuageVsp (both overlay and underlay infra).
Co-Authored-By: Prashanth Manthena <prashanth.manthena@nuagenetworks.net>, Frank Maximus <frank.maximus@nuagenetworks.net>
CLOUDSTACK-9561 After domain/account deletion, snapshot taken by the
domain/account remains undeleted
While deleting the UserAccount Cleanup for the removed VMs/volumes are not
happening. For the removed VMs, snapshots doesn't get cleaned. Only for running
VMs(volumes in ready state) the cleanup happens.
When the VM is desroyed, the volume is marked as destroyed and later storage
garbage collector perform the cleanup. But if we try delete domain/account
before storage garbage collector runs, then it fails.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Made the changes to improve logging.CLOUSTACK-9465 Several log refactoring/improvement suggestions.
There are two scenarios of logging which needs refactoring/improvement:
Method invocation replaced by variable
This means that in the logging code, the method invocation is pre-defined as a variable. for simplicity, the method invocation should be replaced by the variable.
Delete variable which must be null
The variable in the logging code is null, there is no need to put the variable there.
* pr/1705:
Made the changes to improve logging.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This PR adds an ability to Pass a new parameter, locationType,
to the “createSnapshot” API command. Depending on the locationType,
we decide where the snapshot should go in case of managed storage.
There are two possible values for the locationType param
1) `Standard`: The standard operation for managed storage is to
keep the snapshot on the device. For non-managed storage, this will
be to upload it to secondary storage. This option will be the
default.
2) `Archive`: Applicable only to managed storage. This will
keep the snapshot on the secondary storage. For non-managed
storage, this will result in an error.
The reason for implementing this feature is to avoid a single
point of failure for primary storage. Right now in case of managed
storage, if the primary storage goes down, there is no easy way
to recover data as all snapshots are also stored on the primary.
This features allows us to mitigate that risk.
KVM hosts on shared storage failure was accepted by mgmt server with the
host state as Up, even though there was no primary/shared storage available on
it. This patch offers a quick fix by throwing an exception in the storage monitor
which connects storage pool on host. The failure is trapped by agent manager
that disconnects the agent without any investigation.
Based on Lab tests, KVM agent may take upto 2 minutes to attempt NFS mount when
the storage is inaccessible (firewalled, or shutdown) before returning back with
an error. It is safe to assume that this won't add pressure on mgmt server due to
several reconnection attempts, and KVM agent would retry reconnection every 2
minutes.
For such KVM hosts, where failure happens due to storage issues; they will be
briefly put in Alert state but will be mostly be in Connecting state during which
the KVM host attempts to mount/reconfigure NFS storage pool.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Unhides snapshot.backup.rightafter from global configuration
If snapshot.backup.rightafter is set to false (defaults to true), snapshots are
not backed up to secondary storage
Adding support for cross-cluster storage migration for managed storage when using XenServerThis PR adds support for cross-cluster storage migration of VMs that make use of managed storage with XenServer.
Managed storage is when you have a 1:1 mapping between a virtual disk and a volume on a SAN (in the case of XenServer, an SR is placed on this SAN volume and a single virtual disk placed in the SR).
Managed storage allows features such as storage QoS and SAN-side snapshots to work (sort of analogous to VMware VVols).
This PR focuses on enabling VMs that are using managed storage to be migrated across XenServer clusters.
I have successfully run the following tests on this branch:
TestVolumes.py
TestSnapshots.py
TestVMSnapshots.py
TestAddRemoveHosts.py
TestVMMigrationWithStorage.py (which is a new test that is being added with this PR)
* pr/1671:
Adding support for cross-cluster storage migration for managed storage when using XenServer
Signed-off-by: Rajani Karuturi <rajani.karuturi@accelerite.com>
Added checks to prevent netwrok update when router state is unknown or when
the new offering removes a service that is in use.
Added a new param forced to the updateNetwork API. The network will
undergo a forced update when this param is set to true.
CLOUDSTACK-8751 Clean network config like firewall rules etc, when network services are removed during network update.
Conflicts:
engine/schema/src/com/cloud/upgrade/DatabaseUpgradeChecker.java
engine/schema/test/com/cloud/upgrade/DatabaseUpgradeCheckerTest.java
tools/marvin/setup.py
This fixes class names to make things consistent as per the 4.9 PR on master.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Renames schema-490to491*.sql to schema490to4910*.sql
* Renames the Upgrade490to491 class to Upgrade490to4910
* Removes the unused s_logger contant from Upgrade490to4910
* Updates the version in tools/marvin/setup to 4.9.1.0-SNAPSHOT
Updating pom.xml version numbers for release 4.8.2.0-SNAPSHOTOften, patch and security releases do not require schema migrations or
data migrations. However, if an empty upgrade class and associated
scripts are not defined, the upgrade process will break. With this
change, if a release does not have an upgrade, a noop DbUpgrade is added
to the upgrade path. This approach allows the upgrade to proceed and
for the database to properly reflect the installed version. This change
should make the release process simpler as RMs no longer need to
rememeber to create this boilerplate code when starting a new release.
Beginning with the 4.8.2.0 and 4.9.1.0 releases, the project will
formally adopt a four (4) position release number to properly accomodate
rekeases that contain only CVE fixes. The DatabaseUpgradeChecker and
Version classes made assumptions that they would always parse and
compare three (3) position version numbers. This change adds the
CloudStackVersion value object that supports both three (3) and four (4)
version numbers. It encapsulates version comparsion logic, as well as,
the rules to allow three (3) and four (4) to interoperate.
* Modifies DatabaseUpgradeChecker to handle derive an upgrade path for
a version that was not explicitly specified. It determines the
releases the first release before it with database migrations and uses
that list as the basis for the list for version being calculated. A
noop upgrade is then added to the list which causes no schema changes
or data migrations, but will update the database to the version.
* Adds unit tests for the upgrade path calculation logic in
DatabaseUpgradeChecker
* Removes dummy upgrade logic for the 4.8.2.0 introduced in previous
versions of this patch
* Introduces the CloudStackVersion value object which parses and
compares three (3) and four (4) position version numbers. This class
is intended to replace com.cloud.maint.Version.
* Adds the junit-dataprovider dependency -- allowing test data to be
concisely generated separately from the execution of a test case.
Used extensively in the CloudStackVersionTest.
Signed-off-by: John Burwell <meaux@cockamamy.net>
/cc @rhtyd @karuturi
* pr/1654:
Adds support for four position versions and optional db upgrades
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Often, patch and security releases do not require schema migrations or
data migrations. However, if an empty upgrade class and associated
scripts are not defined, the upgrade process will break. With this
change, if a release does not have an upgrade, a noop DbUpgrade is added
to the upgrade path. This approach allows the upgrade to proceed and
for the database to properly reflect the installed version. This change
should make the release process simpler as RMs no longer need to
rememeber to create this boilerplate code when starting a new release.
Beginning with the 4.8.2.0 and 4.9.1.0 releases, the project will
formally adopt a four (4) position release number to properly accomodate
rekeases that contain only CVE fixes. The DatabaseUpgradeChecker and
Version classes made assumptions that they would always parse and
compare three (3) position version numbers. This change adds the
CloudStackVersion value object that supports both three (3) and four (4)
version numbers. It encapsulates version comparsion logic, as well as,
the rules to allow three (3) and four (4) to interoperate.
* Modifies DatabaseUpgradeChecker to handle derive an upgrade path for
a version that was not explicitly specified. It determines the
releases the first release before it with database migrations and uses
that list as the basis for the list for version being calculated. A
noop upgrade is then added to the list which causes no schema changes
or data migrations, but will update the database to the version.
* Adds unit tests for the upgrade path calculation logic in
DatabaseUpgradeChecker
* Removes dummy upgrade logic for the 4.8.2.0 introduced in previous
versions of this patch
* Introduces the CloudStackVersion value object which parses and
compares three (3) and four (4) position version numbers. This class
is intended to replace com.cloud.maint.Version.
* Adds the junit-dataprovider dependency -- allowing test data to be
concisely generated separately from the execution of a test case.
Used extensively in the CloudStackVersionTest.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
In the 4.1.0-4.2.0 db upgrade path, it creates new tables to store secondary
(nfs) storage in image_store table and volumes in volume_store_ref table. In
the upgrade path, it first tries to migrate NFS storage pool where it excludes
storage pools which have been removed, but it migrates all the volumes without
checking if their storage pools have been removed. This causes fk constraint
failure as the volume/row being inserted refers to a storage pool which does
not exist in the image_store table.
The fix migrates all the nfs storage pools to image_store including removed
storage pools and in doing so migrates with the 'removed' field. This fixes
db upgrade for old pre-4.0 and 4.0/4.1 CloudStack clouds.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
[4.9/LTS] Add upgrade path from 4.9.0 to 4.9.1, change version to 4.9.1.0-SNAPSHOTThis adds db upgrade path from 4.9.0 to 4.9.1 and fixes a typo in default user role description (CLOUDSTACK-9449)
/cc @karuturi @jburwell -- this will cause issues when fwd-merged to master, I can do the fwd-merging if you would like to avoid fixing the conflicts yourself
@blueorangutan package
* pr/1646:
Updating pom.xml version numbers for release 4.9.1.0-SNAPSHOT
cloudstack: upgrade path from 4.9.0 to 4.9.1
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Adds db upgrade path from 4.9.0 to 4.9.1
- CLOUDSTACK-9449: Fix typo in default user role description
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Adds role_id column to cloud_usage.account, fixes UsageDaoImpl to insert
Accounts with role_id from account table.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
CLOUDSTACK-9238: Fix URL length to 2048 for all url fields in VOI will update the PR to add max field length in the API commands too
* pr/1567:
API: update url field max length
not needed on host table
Fix URL length to 2048 for all url fields in VO
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CLOUDSTACK-9368: Fix for Support configurable NFS version for Secondary Storage mounts## Description
JIRA TICKET: https://issues.apache.org/jira/browse/CLOUDSTACK-9368
This pull request address a problem introduced in #1361 in which NFS version couldn't be changed after hosts resources were configured on startup (for hosts using `VmwareResource`), and as host parameters didn't include `nfs.version` key, it was set `null`.
## Proposed solution
In this proposed solution `nfsVersion` would be passed in `NfsTO` through `CopyCommand` to `VmwareResource`, who will check if NFS version is still configured or not. If not, it will use the one sent in the command and will set it to its storage processor and storage handler. After those setups, it will proceed executing command.
* pr/1518:
CLOUDSTACK-9368: Fix for Support configurable NFS version for Secondary Storage mounts
Signed-off-by: Will Stevens <williamstevens@gmail.com>
Dynamically load drivers before creating our DB connectionsSolution to the mailing thread titled "MySQL : No suitable driver found for jdbc:mysql".
It doesn't harm that we explicitely load the MySQL driver, and for those which would use a commons-dbcp version < 1.4 this would fix it as well. Since JDBC 4.0, the JDBC driver can auto-register itself, but for some weird cases (like ours), it's not working. Therefore we need to explicitly load the JDBC driver.
* pr/1553:
Dynamic loading of DB driver + support for other DB providers
Signed-off-by: Will Stevens <williamstevens@gmail.com>
Remodeling of Nuage VSP Plugin + CLOUDSTACK-9294Hi all,
We've remodeled the Nuage VSP plugin to use the same model as VMWare is using (non-OSS). Before, we had a runtime dependency to the Nuage Client, this has been changed to a compile-time dependency instead because of multiple reasons (build management, readability, maintainability, ...)
We've adapted the code so it now uses model objects defined in the Nuage client instead of passing a list of parameters to the Nuage client. This is a lot more readable, and a lot more maintainable.
I've had a chat with @DaanHoogland about this approach, and he told me that ACS is trying to move away from the whole non-OSS approach. We're looking into the Juniper approach, we would set up a custom maven repository which would host the required dependencies for the Nuage VSP plugin.
Any remarks or suggestions are always welcome :)
* pr/1494:
Nuage VSP : Extending Marvin test coverage
Nuage VSP : Fix for NPE while cleaning up account when there are still resources belonging to that account
CLOUDSTACK-9294 : Make sure to remove VR from VSD when removing the VPC
CLOUDSTACK-9242 : Remodel Nuage VSP plugin
Signed-off-by: Will Stevens <williamstevens@gmail.com>
Bug-ID: CLOUDSTACK-8870: Skip external device usage collection if no external devices existexternal network device usage monitor thread that runs every 5mins by default (based on global config external.network.stats.interval) and runs coalesce query to acquire a lock. When there are no external devices exist, there is no need to run usage collection.
Added test case to verify that usage collection task is not run when there are no External LB or External FW
* pr/846:
Bug-ID: CLOUDSTACK-8870: Skip external device usage collection if no external devices exist
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CLOUDSTACK-6928: fix issue disk I/O throttling not appliedDisk IO throttling (for KVM) is not applied in the merge of 4.2.
Tests passed:
(1) start vm
(2) attach volume
(3) start vm with volume
(4) migrate vm (with volume)
* pr/1410:
CLOUDSTACK-6928: fix issue disk I/O throttling not applied
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CLOUDSTACK-9348: NioConnection improvementsReopened PR with squashed changes for a re-review and testing after https://github.com/apache/cloudstack/pull/1493 and sub-sequent PRs got reverted
* pr/1549:
CLOUDSTACK-9348: NioConnection improvements
Signed-off-by: Will Stevens <williamstevens@gmail.com>
Taking fast and efficient volume snapshots with XenServer (and your storage provider)A XenServer storage repository (SR) and virtual disk image (VDI) each have UUIDs that are immutable.
This poses a problem for SAN snapshots, if you intend on mounting the underlying snapshot SR alongside the source SR (duplicate UUIDs).
VMware has a solution for this called re-signaturing (so, in other words, the snapshot UUIDs can be changed).
This PR only deals with the CloudStack side of things, but it works in concert with a new XenServer storage manager created by CloudOps (this storage manager enables re-signaturing of XenServer SR and VDI UUIDs).
I have written Marvin integration tests to go along with this, but cannot yet check those into the CloudStack repo as they rely on SolidFire hardware.
If anyone would like to see these integration tests, please let me know.
JIRA ticket: https://issues.apache.org/jira/browse/CLOUDSTACK-9281
Here's a video I made that shows this feature in action:
https://www.youtube.com/watch?v=YQ3pBeL-WaA&list=PLqOXKM0Bt13DFnQnwUx8ZtJzoyDV0Uuye&index=13
* pr/1403:
Faster logic to see if a cluster supports resigning
Support for backend snapshots with XenServer
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CLOUDSTACK-9366: Capacity of one zone-wide primary storage ignoredDisable and Remove Host operation disables the primary storage capacity.
Steps to replicate:
Base Condition: There exists a host and storage pool with same id
Steps:
1. Find a host and storage pool having same id
2. Disable the host
3. CPU(1) and MEMORY(0) capacity in op_host_capacity for above host is disabled
4. STORAGE(3) capacity in op_host_capacity for storage pool with id same as above host is also disabled
RCA:
'host_id' column in 'op_host_capacity' table used for storing both storage pool id (for STORAGE capacity) and host id (MEMORY and CPU). While disabling a HOST we also disable the capacity associated with host.
Ideally while disabling capacity we should only disable MEMORY and CPU capacity, but we are not doing so.
Code Path:
ResourceManagerImpl.doDeleteHost() -> ResourceManagerImpl.resourceStateTransitTo() -> CapacityDaoImpl.updateCapacityState(null, null, null, host.getId(), capacityState.toString())
updateCapacityState is updating disabling all entries which matches the host_id. This will also disable a entry having storage pool id same as that of host id.
Changes:
introduced new capacityType parameter in updateCapacityState method and necessary changes to add capacity_type clause in sql
also fixed incorrect sql builder logic (unused code path for which it is never surfaced )
Added marvin test to check host and storagepool capacity when host is disabled
Test Result:
```
Before Fix:
mysql> select ohc.host_id, ohc.`capacity_state`, case capacity_type when 0 then 'MEMORY' when 1 then 'CPU' ELSE 'STORAGE' END as 'capacity_type' , total_capacity, case capacity_type when 0 then 'HOST' when 1 then 'HOST' ELSE 'STORAGE POOL' END as 'HOST/STORAGE POOL' from op_host_capacity ohc where host_id=3;
+---------+----------------+---------------+----------------+-------------------+
| host_id | capacity_state | capacity_type | total_capacity | HOST/STORAGE POOL |
+---------+----------------+---------------+----------------+-------------------+
| 3 | Enabled | MEMORY | 8589934592 | HOST |
| 3 | Enabled | CPU | 32000 | HOST |
| 3 | Enabled | STORAGE | 2199023255552 | STORAGE POOL |
+---------+----------------+---------------+----------------+-------------------+
9 rows in set (0.00 sec)
Disable Host 3 from UI.
mysql> select ohc.host_id, ohc.`capacity_state`, case capacity_type when 0 then 'MEMORY' when 1 then 'CPU' ELSE 'STORAGE' END as 'capacity_type' , total_capacity, case capacity_type when 0 then 'HOST' when 1 then 'HOST' ELSE 'STORAGE POOL' END as 'HOST/STORAGE POOL' from op_host_capacity ohc where host_id=3;
+---------+----------------+---------------+----------------+-------------------+
| host_id | capacity_state | capacity_type | total_capacity | HOST/STORAGE POOL |
+---------+----------------+---------------+----------------+-------------------+
| 3 | Disabled | MEMORY | 8589934592 | HOST |
| 3 | Disabled | CPU | 32000 | HOST |
| 3 | Disabled | STORAGE | 2199023255552 | STORAGE POOL |
+---------+----------------+---------------+----------------+-------------------+
After Fix:
mysql> select ohc.host_id, ohc.`capacity_state`, case capacity_type when 0 then 'MEMORY' when 1 then 'CPU' ELSE 'STORAGE' END as 'capacity_type' , total_capacity, case capacity_type when 0 then 'HOST' when 1 then 'HOST' ELSE 'STORAGE POOL' END as 'HOST/STORAGE POOL' from op_host_capacity ohc where host_id=3;
+---------+----------------+---------------+----------------+-------------------+
| host_id | capacity_state | capacity_type | total_capacity | HOST/STORAGE POOL |
+---------+----------------+---------------+----------------+-------------------+
| 3 | Enabled | MEMORY | 8589934592 | HOST |
| 3 | Enabled | CPU | 32000 | HOST |
| 3 | Enabled | STORAGE | 2199023255552 | STORAGE POOL |
+---------+----------------+---------------+----------------+-------------------+
3 rows in set (0.01 sec)
Disable Host 3 from UI.
mysql> select ohc.host_id, ohc.`capacity_state`, case capacity_type when 0 then 'MEMORY' when 1 then 'CPU' ELSE 'STORAGE' END as 'capacity_type' , total_capacity, case capacity_type when 0 then 'HOST' when 1 then 'HOST' ELSE 'STORAGE POOL' END as 'HOST/STORAGE POOL' from op_host_capacity ohc where host_id=3;
+---------+----------------+---------------+----------------+-------------------+
| host_id | capacity_state | capacity_type | total_capacity | HOST/STORAGE POOL |
+---------+----------------+---------------+----------------+-------------------+
| 3 | Disabled | MEMORY | 8589934592 | HOST |
| 3 | Disabled | CPU | 32000 | HOST |
| 3 | Enabled | STORAGE | 2199023255552 | STORAGE POOL |
+---------+----------------+---------------+----------------+-------------------+
3 rows in set (0.00 sec)
Sudhansus-MAC:cloudstack sudhansu$ nosetests-2.7 --with-marvin --marvin-config=setup/dev/advanced.cfg test/integration/component/maint/test_capacity_host_delete.py
==== Marvin Init Started ====
=== Marvin Parse Config Successful ===
=== Marvin Setting TestData Successful===
==== Log Folder Path: /tmp//MarvinLogs//Apr_22_2016_22_42_27_X4VBWD. All logs will be available here ====
=== Marvin Init Logging Successful===
==== Marvin Init Successful ====
===final results are now copied to: /tmp//MarvinLogs/test_capacity_host_delete_9RHSNB===
Sudhansus-MAC:cloudstack sudhansu$ cat /tmp//MarvinLogs/test_capacity_host_delete_9RHSNB/results.txt
test_01_op_host_capacity_disable_host (integration.component.maint.test_capacity_host_delete.TestHosts) ... === TestName: test_01_op_host_capacity_disable_host | Status : SUCCESS ===
ok
----------------------------------------------------------------------
Ran 1 test in 0.168s
OK
```
* pr/1516:
CLOUDSTACK-9366: Capacity of one zone-wide primary storage ignored
Signed-off-by: Will Stevens <williamstevens@gmail.com>
dynamic-roles: packaging improvementsOn some MySQL server envs, this may cause a SQL statement error, though
I was unable to reproduce it. Since it's not needed, an order by 'sort_order'
is enough, we can safely remove it.
/cc @swill @anshulgangwar @DaanHoogland and others
* pr/1551:
migrate-dynamicroles: use mysql.connector due to #1054
packaging: don't bundle systemvm.zip in rpms
packaging: backup commands.properties as it does not exist in new rpms
dynamic-roles: remove unnecessary order by ID
Signed-off-by: Will Stevens <williamstevens@gmail.com>
introduced new capacityType parameter in updateCapacityState method and necessary changes to add capacity_type clause in sql
also fixed incorrect sql builder logic (unused code path for which it is never surfaced )
Added marvin test to check host and storagepool capacity when host is disabled
Added conditions to ensure the capacity_type is added only when capacity_type length is greater than 0.
Added checks in marvin test to ensure the capacity exists for a host before disabling it.
Added checks to avoid index out of range exception
- Unit test to demonstrate denial of service attack
The NioConnection uses blocking handlers for various events such as connect,
accept, read, write. In case a client connects NioServer (used by
agent mgr to service agents on port 8250) but fails to participate in SSL
handshake or just sits idle, this would block the main IO/selector loop in
NioConnection. Such a client could be either malicious or aggresive.
This unit test demonstrates such a malicious client that can perform a
denial-of-service attack on NioServer that blocks it to serve any other client.
- Use non-blocking SSL handshake
- Uses non-blocking socket config in NioClient and NioServer/NioConnection
- Scalable connectivity from agents and peer clustered-management server
- Removes blocking ssl handshake code with a non-blocking code
- Protects from denial-of-service issues that can degrade mgmt server responsiveness
due to an aggressive/malicious client
- Uses separate executor services for handling ssl handshakes
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* 4.7:
Fix Sync of template.properties in Swift
Configure rVPC for router.redundant.vrrp.interval advert_int setting
Have rVPCs use the router.redundant.vrrp.interval setting
Resolve conflict as forceencap is already in master
Split the cidr lists so we won't hit the iptables-resture limits
Check the existence of 'forceencap' parameter before use
Do not load previous firewall rules as we replace everyhing anyway
Wait for dnsmasq to finish restart
Remove duplicate spaces, and thus duplicate rules.
Restore iptables at once using iptables-restore instead of calling iptables numerous times
Add iptables copnversion script.
On some MySQL server envs, this may cause a SQL statement error, though
I was unable to reproduce it. Since it's not needed, an order by 'sort_order'
is enough, we can safely remove it.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This reverts commit 7ce0e10fbc, reversing
changes made to 29ba71f2db.
This was reverted because it seemed to be related to an issue
when doing a DeployDC, causing an `addHost` error.
CLOUDSTACK-9265 cleanup around httpclient versionssome cleanup done
- replaced HttpStatus from org.apache.commons.httpclient with that from org.apache.http
- removed unthrown HttpException
- left auto reformat in place
* pr/1385:
CLOUDSTACK-9265 cleanup around httpclient versions
Signed-off-by: Will Stevens <williamstevens@gmail.com>
Notify listeners when a host has been added to a cluster, is about to be removed from a cluster, or has been removed from a cluster
This PR addresses the following JIRA ticket:
https://issues.apache.org/jira/browse/CLOUDSTACK-8813
The problem is that there needs to be notifications sent when a host is added to, about to be removed from, and removed from a cluster.
Such notifications can be used for many purposes. For example, it can allow storage plug-ins to update ACLs on their storage systems. Also, it can allow us to clean up IQNs from ESXi hosts that are no longer needed.
* pr/816:
CLOUDSTACK-8813: Notify listeners when a host has been added to a cluster, is about to be removed from a cluster, or has been removed from a cluster
Signed-off-by: Will Stevens <williamstevens@gmail.com>
Support access to a host’s out-of-band management interface (e.g. IPMI, iLO,
DRAC, etc.) to manage host power operations (on/off etc.) and querying current
power state in CloudStack.
Given the wide range of out-of-band management interfaces such as iLO and iDRA,
the service implementation allows for development of separate drivers as plugins.
This feature comes with a ipmitool based driver that uses the
ipmitool (http://linux.die.net/man/1/ipmitool) to communicate with any
out-of-band management interface that support IPMI 2.0.
This feature allows following common use-cases:
- Restarting stalled/failed hosts
- Powering off under-utilised hosts
- Powering on hosts for provisioning or to increase capacity
- Allowing system administrators to see the current power state of the host
For testing this feature `ipmisim` can be used:
https://pypi.python.org/pypi/ipmisim
FS:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Out-of-band+Management+for+CloudStack
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This feature allows root administrators to define new roles and associate API
permissions to them.
A limited form of role-based access control for the CloudStack management server
API is provided through a properties file, commands.properties, embedded in the
WAR distribution. Therefore, customizing API permissions requires unpacking the
distribution and modifying this file consistently on all servers. The old system
also does not permit the specification of additional roles.
FS:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Dynamic+Role+Based+API+Access+Checker+for+CloudStack
DB-Backed Dynamic Role Based API Access Checker for CloudStack brings following
changes, features and use-cases:
- Moves the API access definitions from commands.properties to the mgmt server DB
- Allows defining custom roles (such as a read-only ROOT admin) beyond the
current set of four (4) roles
- All roles will resolve to one of the four known roles types (Admin, Resource
Admin, Domain Admin and User) which maintains this association by requiring
all new defined roles to specify a role type.
- Allows changes to roles and API permissions per role at runtime including additions or
removal of roles and/or modifications of permissions, without the need
of restarting management server(s)
Upgrade/installation notes:
- The feature will be enabled by default for new installations, existing
deployments will continue to use the older static role based api access checker
with an option to enable this feature
- During fresh installation or upgrade, the upgrade paths will add four default
roles based on the four default role types
- For ease of migration, at the time of upgrade commands.properties will be used
to add existing set of permissions to the default roles. cloud.account
will have a new role_id column which will be populated based on default roles
as well
Dynamic-roles migration tool: scripts/util/migrate-dynamicroles.py
- Allows admins to migrate to the dynamic role based checker at a future date
- Performs a harder one-way migrate and update
- Migrates rules from existing commands.properties file into db and deprecates it
- Enables an internal hidden switch to enable dynamic role based checker feature
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
CLOUDSTACK-8901: PrepareTemplate job thread hard-coded to max 8 threads The thread pool was hardcoded to use 8 threads,
com.cloud.template.TemplateManagerImpl.configure(String, Map<String, Object>):
_preloadExecutor = Executors.newFixedThreadPool(8, new NamedThreadFactory("Template-Preloader"));
Added the change to pick threadpool size from configuration.
* pr/880:
CLOUDSTACK-8901: PrepareTemplate job thread hard-coded to max 8 threads
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CID-1338387: Deletion of method endPointSelector.selectHypervisorHostFollowing the discussions and analysis presented on PR #1056 create by @DaanHoogland
This PR is intended to push those changes that were discussed there regarding the of endPointSelector.selectHypervisorHost method.
* pr/1124:
Deletion of method endPointSelector.selectHypervisorHost
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CLOUDSTACK-9348: Use non-blocking SSL handshake in NioConnection/Link- Uses non-blocking socket config in NioClient and NioServer/NioConnection
- Scalable connectivity from agents and peer clustered-management server
- Removes blocking ssl handshake code with a non-blocking code
- Protects from denial-of-service issues that can degrade mgmt server responsiveness
due to an aggressive/malicious client
- Uses separate executor services for handling connect/accept events
Changes are covered the NioTest so I did not write a new test, advise how we can improve this. Further, I tried to invest time on writing a benchmark test to reproduce a degraded server but could not write it deterministic-ally (sometimes fails/passes but not always). Review, CI testing and feedback requested /cc @swill @jburwell @DaanHoogland @wido @remibergsma @rafaelweingartner @GabrielBrascher
* pr/1493:
CLOUDSTACK-9348: Use non-blocking SSL handshake
CLOUDSTACK-9348: Unit test to demonstrate denial of service attack
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CLOUDSTACK-8302: Removing snapshots on RBDSnapshot removing implemented if primary datastore is RBD
https://issues.apache.org/jira/browse/CLOUDSTACK-8302
* pr/1230:
CLOUDSTACK-8302 - Cleanup snapshot on KVM with RBD Snapshot removing implemented on RBD. 1. On management side: when created new shanpshot we checking if our primary storage is RBD, then do not remove record from cloud.snapshot_store_ref with link to Ceph image via 'install_path' field. 2. On management side: when removing snapshot, also send command to agent 'DeleteCommand'. 3. On agent side: method implemented 'public Answer deleteSnapshot(final DeleteCommand cmd)'
Signed-off-by: Will Stevens <williamstevens@gmail.com>
- Uses non-blocking socket config in NioClient and NioServer/NioConnection
- Scalable connectivity from agents and peer clustered-management server
- Removes blocking ssl handshake code with a non-blocking code
- Protects from denial-of-service issues that can degrade mgmt server responsiveness
due to an aggressive/malicious client
- Uses separate executor services for handling ssl handshakes
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
CLOUDSTACK-8847: ListServiceOfferings is returning incompatible tagged offerings when called with VM idWhen calling listServiceOfferings with VM id as parameter. It is returning incompatible tagged offerings. It should only list all compatible tagged offerings. Compatible means the new service offering should contain all the tags of the existing service offering(Existing offering SUBSET of new offering). If that is the case It should list in the result and can be upgraded to that offering.
* pr/1321:
CLOUDSTACK-8847: ListServiceOfferings is returning incompatible tagged offerings when called with VM id
Signed-off-by: Will Stevens <williamstevens@gmail.com>
engine/schema: fix upgrade path to work with MySQL 5.7Found this issue when using MySQL 5.7 with Ubuntu 16.04. The upgrade path fix removes an invalid `IGNORE` param that is deprecated now, in the upgrade path we run the alter statement to add an index only if it does not exist so we're good.
For MySQL 5.7, we'll also need to update the docs at some point to include `server-id` along with other parameters. Some of the SQL statements used throughout engine/schema don't adhere to SQL 99 standard which is enforced by default in MySQL 5.7, therefore the following sql-mode (for backward compatibility with mysql 5.6 modes) will be necessary for anyone willing to use MySQL 5.7 (until we fix codebase wide raw and generated sql statements to be SQL99 compliant):
sql-mode="STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION,ERROR_FOR_DIVISION_BY_ZERO,NO_ZERO_DATE,NO_ZERO_IN_DATE,NO_ENGINE_SUBSTITUTION"
server-id = 1
innodb_rollback_on_timeout=1
innodb_lock_wait_timeout=600
max_connections=350
log-bin=mysql-bin
binlog-format = 'ROW'
/cc @swill @jburwell @agneya2001 @wido @DaanHoogland and others
* pr/1517:
engine/schema: fix upgrade path to work with MySQL 5.7
Signed-off-by: Will Stevens <williamstevens@gmail.com>
Snapshot removing implemented on RBD.
1. On management side: when created new shanpshot we checking if our primary storage is RBD,
then do not remove record from cloud.snapshot_store_ref with link to Ceph
image via 'install_path' field.
2. On management side: when removing snapshot, also send command to agent 'DeleteCommand'.
3. On agent side: method implemented 'public Answer deleteSnapshot(final DeleteCommand cmd)'
Found this issue when using MySQL 5.7 with Ubuntu 16.04 with following settings:
sql-mode="STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION,ERROR_FOR_DIVISION_BY_ZERO,NO_ZERO_DATE,NO_ZERO_IN_DATE,NO_ENGINE_SUBSTITUTION"
server-id = 1
innodb_rollback_on_timeout=1
innodb_lock_wait_timeout=600
max_connections=350
log-bin=mysql-bin
binlog-format = 'ROW'
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Updated most dependencies to latest minor releases, EXCEPT:
- Gson 2.x
- Major spring framework version
- Servlet version
- Embedded jetty version
- Mockito version (beta)
- Mysql lib minor version upgrade (breaks mysql-ha plugin)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
CLOUDSTACK-9130: Make RebootCommand similar to start/stop/migrate agent commands w.r.t. "execute in sequence" flag
RebootCommand now behaves in the same way as start/stop/migrate agent commands w.r.t. to sequential/parallel execution.
* pr/1200:
CLOUDSTACK-9130: Make RebootCommand similar to start/stop/migrate agent commands w.r.t. "execute in sequence" flag RebootCommand now behaves in the same way as start/stop/migrate agent commands w.r.t. to sequential/parallel execution.
Signed-off-by: Will Stevens <williamstevens@gmail.com>