CLOUDSTACK-9348: NioConnection improvements
Reopened PR with squashed changes for re-review and testing after https://github.com/apache/cloudstack/pull/1493 and subsequent PRs were reverted.
* pr/1549:
CLOUDSTACK-9348: NioConnection improvements
Signed-off-by: Will Stevens <williamstevens@gmail.com>
Taking fast and efficient volume snapshots with XenServer (and your storage provider)
A XenServer storage repository (SR) and virtual disk image (VDI) each have UUIDs that are immutable.
This poses a problem for SAN snapshots if you intend to mount the underlying snapshot SR alongside the source SR (duplicate UUIDs).
VMware has a solution for this called re-signaturing (so, in other words, the snapshot UUIDs can be changed).
This PR only deals with the CloudStack side of things, but it works in concert with a new XenServer storage manager created by CloudOps (this storage manager enables re-signaturing of XenServer SR and VDI UUIDs).
I have written Marvin integration tests to go along with this, but cannot yet check those into the CloudStack repo as they rely on SolidFire hardware.
If anyone would like to see these integration tests, please let me know.
JIRA ticket: https://issues.apache.org/jira/browse/CLOUDSTACK-9281
Here's a video I made that shows this feature in action:
https://www.youtube.com/watch?v=YQ3pBeL-WaA&list=PLqOXKM0Bt13DFnQnwUx8ZtJzoyDV0Uuye&index=13
* pr/1403:
Faster logic to see if a cluster supports resigning
Support for backend snapshots with XenServer
Signed-off-by: Will Stevens <williamstevens@gmail.com>
- Unit test to demonstrate denial of service attack
The NioConnection uses blocking handlers for various events such as connect,
accept, read, and write. If a client connects to NioServer (used by
agent mgr to service agents on port 8250) but fails to participate in the SSL
handshake or just sits idle, this blocks the main IO/selector loop in
NioConnection. Such a client could be either malicious or aggressive.
This unit test demonstrates such a malicious client, which can perform a
denial-of-service attack on NioServer that blocks it from serving any other client.
- Use non-blocking SSL handshake
- Uses non-blocking socket config in NioClient and NioServer/NioConnection
- Scalable connectivity from agents and peer clustered-management server
- Replaces blocking SSL handshake code with non-blocking code
- Protects from denial-of-service issues that can degrade mgmt server responsiveness
due to an aggressive/malicious client
- Uses separate executor services for handling ssl handshakes
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
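The idea behind using separate executor services for handshakes can be sketched as follows. This is a minimal simulation, not the NioConnection code: the class name, the `serve` method, and the use of latches to model an idle peer are all assumptions made for illustration. It only shows that a loop which submits handshake work to an executor keeps serving well-behaved clients while one client stalls.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class HandshakeOffloadSketch {

    /**
     * Simulates an accept loop that hands each handshake to a separate executor.
     * stalls[i] == true models a client that connects but never completes its handshake.
     * Returns the number of well-behaved clients whose handshake finished in time.
     */
    public static int serve(boolean[] stalls, long timeoutMillis) {
        int wellBehaved = 0;
        for (boolean stall : stalls) {
            if (!stall) {
                wellBehaved++;
            }
        }
        CountDownLatch finished = new CountDownLatch(wellBehaved);
        // Never counted down: stands in for a peer that sits idle (hypothetical model).
        CountDownLatch neverReleased = new CountDownLatch(1);
        ExecutorService handshakePool = Executors.newCachedThreadPool();
        try {
            for (boolean stall : stalls) {
                // The loop only submits work; it never waits on a single client.
                handshakePool.submit(() -> {
                    try {
                        if (stall) {
                            neverReleased.await();
                        } else {
                            finished.countDown();
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            finished.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            handshakePool.shutdownNow(); // interrupts the stalled handshake tasks
        }
        return wellBehaved - (int) finished.getCount();
    }

    public static void main(String[] args) {
        // The first client stalls forever; the other two should still be served.
        System.out.println(serve(new boolean[] {true, false, false}, 2000));
    }
}
```

With a blocking handler in the loop itself, the stalled first client would prevent the remaining two from ever being reached.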
This reverts commit 7ce0e10fbc, reversing
changes made to 29ba71f2db.
This was reverted because it seemed to be related to an issue
when doing a DeployDC, causing an `addHost` error.
Notify listeners when a host has been added to a cluster, is about to be removed from a cluster, or has been removed from a cluster
This PR addresses the following JIRA ticket:
https://issues.apache.org/jira/browse/CLOUDSTACK-8813
Notifications need to be sent when a host is added to, is about to be removed from, or has been removed from a cluster.
Such notifications can be used for many purposes. For example, they can allow storage plug-ins to update ACLs on their storage systems. They can also allow us to clean up IQNs that are no longer needed from ESXi hosts.
* pr/816:
CLOUDSTACK-8813: Notify listeners when a host has been added to a cluster, is about to be removed from a cluster, or has been removed from a cluster
Signed-off-by: Will Stevens <williamstevens@gmail.com>
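The three notifications described above map naturally onto a listener interface. The sketch below is an illustration only: `HostLifecycleListener`, its method names, and the `register`/`addHost`/`removeHost` helpers are hypothetical and are not taken from the CloudStack code base.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class HostLifecycleSketch {

    /** Hypothetical listener interface; not CloudStack's actual API. */
    public interface HostLifecycleListener {
        void hostAdded(long hostId);
        void hostAboutToBeRemoved(long hostId);
        void hostRemoved(long hostId);
    }

    private final List<HostLifecycleListener> listeners = new CopyOnWriteArrayList<>();

    public void register(HostLifecycleListener listener) {
        listeners.add(listener);
    }

    public void addHost(long hostId) {
        // The host would be added to the cluster here.
        for (HostLifecycleListener listener : listeners) {
            listener.hostAdded(hostId);
        }
    }

    public void removeHost(long hostId) {
        // Listeners get a chance to clean up while the host is still present.
        for (HostLifecycleListener listener : listeners) {
            listener.hostAboutToBeRemoved(hostId);
        }
        // The host would be removed from the cluster here.
        for (HostLifecycleListener listener : listeners) {
            listener.hostRemoved(hostId);
        }
    }

    /** Records the order of events; a storage plug-in might update ACLs instead. */
    public static List<String> demo() {
        final List<String> events = new ArrayList<>();
        HostLifecycleSketch cluster = new HostLifecycleSketch();
        cluster.register(new HostLifecycleListener() {
            @Override
            public void hostAdded(long hostId) {
                events.add("added:" + hostId);
            }

            @Override
            public void hostAboutToBeRemoved(long hostId) {
                events.add("aboutToBeRemoved:" + hostId);
            }

            @Override
            public void hostRemoved(long hostId) {
                events.add("removed:" + hostId);
            }
        });
        cluster.addHost(42L);
        cluster.removeHost(42L);
        return events;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```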
Support access to a host’s out-of-band management interface (e.g. IPMI, iLO,
DRAC, etc.) to manage host power operations (on/off etc.) and query the current
power state in CloudStack.
Given the wide range of out-of-band management interfaces such as iLO and iDRAC,
the service implementation allows for development of separate drivers as plugins.
This feature comes with an ipmitool-based driver that uses
ipmitool (http://linux.die.net/man/1/ipmitool) to communicate with any
out-of-band management interface that supports IPMI 2.0.
This feature allows the following common use-cases:
- Restarting stalled/failed hosts
- Powering off under-utilised hosts
- Powering on hosts for provisioning or to increase capacity
- Allowing system administrators to see the current power state of the host
For testing this feature `ipmisim` can be used:
https://pypi.python.org/pypi/ipmisim
FS:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Out-of-band+Management+for+CloudStack
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
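As a rough illustration of what an ipmitool-based driver has to produce, the sketch below builds an ipmitool command line for a power operation. The class, enum, and method names are hypothetical and do not reflect the driver in this commit. The ipmitool options used (`-I lanplus` for IPMI 2.0, `-H`, `-U`, `-E` to read the password from the `IPMI_PASSWORD` environment variable, and `chassis power <op>`) come from the ipmitool man page. The sketch only builds the argument list and does not execute anything.

```java
import java.util.Arrays;
import java.util.List;

public class IpmitoolCommandSketch {

    /** Hypothetical subset of power operations, mapped to ipmitool arguments. */
    public enum PowerOperation {
        ON("on"), OFF("off"), RESET("reset"), STATUS("status");

        private final String argument;

        PowerOperation(String argument) {
            this.argument = argument;
        }

        public String argument() {
            return argument;
        }
    }

    /**
     * Builds the argument list for an ipmitool chassis power command.
     * The password is expected in the IPMI_PASSWORD environment variable (-E),
     * which keeps it off the command line.
     */
    public static List<String> buildCommand(String host, String user, PowerOperation operation) {
        return Arrays.asList(
                "ipmitool",
                "-I", "lanplus", // IPMI 2.0 interface
                "-H", host,
                "-U", user,
                "-E",
                "chassis", "power", operation.argument());
    }

    public static void main(String[] args) {
        // Example values only; the address and user are made up.
        System.out.println(String.join(" ", buildCommand("10.0.0.5", "admin", PowerOperation.STATUS)));
    }
}
```

The resulting list could be handed to `ProcessBuilder`, with the password set in the child process environment.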
CLOUDSTACK-9348: Use non-blocking SSL handshake in NioConnection/Link
- Uses non-blocking socket config in NioClient and NioServer/NioConnection
- Scalable connectivity from agents and peer clustered-management server
- Replaces blocking SSL handshake code with non-blocking code
- Protects from denial-of-service issues that can degrade mgmt server responsiveness
due to an aggressive/malicious client
- Uses separate executor services for handling connect/accept events
Changes are covered by the NioTest, so I did not write a new test; please advise how we can improve this. Further, I tried to invest time in writing a benchmark test to reproduce a degraded server but could not make it deterministic (it sometimes fails and sometimes passes). Review, CI testing and feedback requested /cc @swill @jburwell @DaanHoogland @wido @remibergsma @rafaelweingartner @GabrielBrascher
* pr/1493:
CLOUDSTACK-9348: Use non-blocking SSL handshake
CLOUDSTACK-9348: Unit test to demonstrate denial of service attack
Signed-off-by: Will Stevens <williamstevens@gmail.com>
- Uses non-blocking socket config in NioClient and NioServer/NioConnection
- Scalable connectivity from agents and peer clustered-management server
- Replaces blocking SSL handshake code with non-blocking code
- Protects from denial-of-service issues that can degrade mgmt server responsiveness
due to an aggressive/malicious client
- Uses separate executor services for handling ssl handshakes
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
CLOUDSTACK-8847: ListServiceOfferings is returning incompatible tagged offerings when called with VM id
When listServiceOfferings is called with a VM id as parameter, it returns incompatible tagged offerings. It should list only compatible tagged offerings. Compatible means the new service offering contains all the tags of the existing service offering (the existing offering's tags are a SUBSET of the new offering's tags). If that is the case, the offering should be listed in the result and the VM can be upgraded to it.
* pr/1321:
CLOUDSTACK-8847: ListServiceOfferings is returning incompatible tagged offerings when called with VM id
Signed-off-by: Will Stevens <williamstevens@gmail.com>
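The compatibility rule above is a subset check. A minimal sketch follows, assuming tags are stored as a comma-separated string; the class and method names are hypothetical and this is not the code changed by the PR.

```java
import java.util.HashSet;
import java.util.Set;

public class OfferingTagSketch {

    /** Splits a comma-separated tag string into a set; null or blank yields an empty set. */
    static Set<String> parseTags(String tags) {
        Set<String> result = new HashSet<>();
        if (tags == null) {
            return result;
        }
        for (String tag : tags.split(",")) {
            String trimmed = tag.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    /** True when every tag of the existing offering is also present on the new offering. */
    public static boolean isCompatible(String existingTags, String newTags) {
        return parseTags(newTags).containsAll(parseTags(existingTags));
    }

    public static void main(String[] args) {
        System.out.println(isCompatible("ssd", "ssd,fast"));
        System.out.println(isCompatible("ssd,fast", "ssd"));
    }
}
```

Under this rule an offering with no tags is compatible with every candidate, while a candidate that drops any existing tag is filtered out.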
CLOUDSTACK-9130: Make RebootCommand similar to start/stop/migrate agent commands w.r.t. "execute in sequence" flag
RebootCommand now behaves in the same way as start/stop/migrate agent commands w.r.t. sequential/parallel execution.
* pr/1200:
CLOUDSTACK-9130: Make RebootCommand similar to start/stop/migrate agent commands w.r.t. "execute in sequence" flag
RebootCommand now behaves in the same way as start/stop/migrate agent commands w.r.t. sequential/parallel execution.
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CLOUDSTACK-9196: Fixing null pointer exception when vm meta data is synced on upgraded setup
https://issues.apache.org/jira/browse/CLOUDSTACK-9196
A NullPointerException can occur if XenServer reports a VM that does not exist in the cloud DB.
* pr/1274:
CLOUDSTACK-9196: Fixing null pointer exception when vm meta data is synced on upgraded setup.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Removed unused variables from "NetworkStateListener" class
We removed the following variables from "com.cloud.network.NetworkStateListener":
- UsageEventDao _usageEventDao
- NetworkDao _networkDao
We changed the EventBus s_eventBus variable to private, changed the constructor so that it no longer takes those variables, and applied this change in the classes com.cloud.network.IpAddressManagerImpl and org.apache.cloudstack.engine.orchestration.NetworkOrchestrator.
* pr/1261:
Removed unused variables from class NetworkStateListener
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
CLOUDSTACK-9185: [VMware DRS] VM sync failed with exception due to out-of-band changes
Summary: The target "ClusteredVirtualMachineManagerImpl.HandlePowerStateReport", invoked during the VM power state sync, was not found because HandlePowerStateReport was not implemented in ClusteredVirtualMachineManagerImpl and was private in VirtualMachineManagerImpl, which resulted in an InvocationTargetException. Changed HandlePowerStateReport() in VirtualMachineManagerImpl to protected.
* pr/1256:
CLOUDSTACK-9185: [VMware DRS] VM sync failed with exception due to out-of-band changes
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
CLOUDSTACK-8860: improve error messages in VM deployment code path.
Improved the error messages in the VM deployment code path. Added more data to the error messages and changed some messages that used internal ids to use uuids instead.
* pr/864:
CLOUDSTACK-8860: improve error messages in VM deployment code path.
Signed-off-by: Remi Bergsma <github@remi.nl>
* 4.7:
CLOUDSTACK-9154 - Sets the pub interface down when all guest nets are gone
CLOUDSTACK-9187 - Makes code ready for something like ethXXXX, if we ever get that far
CLOUDSTACK-9188 - Reads network GC interval and wait from configDao
CLOUDSTACK-9187 - Fixes interface allocation to VRRP instances
CLOUDSTACK-9187 - Adds test to cover multiple nics and nic removal
CLOUDSTACK-9154 - Adds test to cover nics state after GC
CLOUDSTACK-9154 - Returns the guest interface that is marked as added
Conflicts:
engine/orchestration/src/org/apache/cloudstack/engine/orchestration/NetworkOrchestrator.java
[4.7] Critical VPCVR issues fixed: CLOUDSTACK-9154; CLOUDSTACK-9187; and CLOUDSTACK-9188
This PR applies the same fixes as PR #1259, but against branch 4.7.
Please refer to PR #1259 for the tests results and all the comments already made there.
Issues fixed are:
* CLOUDSTACK-9154: rVPC doesn't recover from cleaning up of network garbage collector
* CLOUDSTACK-9187: rVPC routers in Master/Master due to concurrency problem when writing the keepalived.conf
* CLOUDSTACK-9188: NetworkGarbageCollector is not using gc.interval and gc.wait from settings
Those changes have been covered by 2 new tests added to ```smoke/test_vpc_redundant.py```:
* test_04_rvpc_network_garbage_collector_nics
* test_05_rvpc_multi_tiers
The test ```test_04_rvpc_network_garbage_collector_nics``` depends on the global settings network.gc.interval and gc.wait. To make the test run quicker, change the settings (default is 600 seconds for each) and restart the Management Server before running the tests. I suggest setting them to 60 seconds.
In addition, the NetworkGarbageCollector was redefining the above-mentioned settings instead of reading their values through ConfigDao. As a result, the settings were not being applied properly and the test was waiting too long to check the VPC routers.
* pr/1277:
CLOUDSTACK-9154 - Sets the pub interface down when all guest nets are gone
CLOUDSTACK-9187 - Makes code ready for something like ethXXXX, if we ever get that far
CLOUDSTACK-9188 - Reads network GC interval and wait from configDao
CLOUDSTACK-9187 - Fixes interface allocation to VRRP instances
CLOUDSTACK-9187 - Adds test to cover multiple nics and nic removal
CLOUDSTACK-9154 - Adds test to cover nics state after GC
CLOUDSTACK-9154 - Returns the guest interface that is marked as added
Signed-off-by: Remi Bergsma <github@remi.nl>
* 4.7:
Implement CheckHealthCommand for NSX controllers
Fix log message that refers to agent, not host
Prevent NullPointerException when host does not belong to a pod
Summary: The target "ClusteredVirtualMachineManagerImpl.HandlePowerStateReport" invoked during the VM power state sync is not found as HandlePowerStateReport was not implemented in ClusteredVirtualMachineManagerImpl and was private in VirtualMachineManagerImpl, which was resulting in InvocationTargetException. Changed HandlePowerStateReport() in VirtualMachineManagerImpl to protected.
When we restart VPC tiers, the old nics are removed and a new nic is created.
However, the device_id was set to the nic count, which may already be in use.
This commit uses the first device_id not in use as the device_id of the new nic.
This issue also happens when we add multiple networks to a VM and then remove them.
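The difference between the two strategies can be sketched as follows. This is an illustration, assuming device ids are non-negative integers starting at 0; the class and method names are hypothetical and this is not the code from the commit.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

public class DeviceIdSketch {

    /** Returns the smallest non-negative device id that is not in use. */
    public static int firstFreeDeviceId(Collection<Integer> usedIds) {
        Set<Integer> used = new HashSet<>(usedIds);
        int candidate = 0;
        while (used.contains(candidate)) {
            candidate++;
        }
        return candidate;
    }

    public static void main(String[] args) {
        // Nics 0, 1, 2 existed and nic 1 was removed. The nic count is now 2,
        // which collides with the existing device id 2; the first free id is 1.
        System.out.println(firstFreeDeviceId(Arrays.asList(0, 2)));
    }
}
```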
CLOUDSTACK-9105: Logging enhancement: Handle/reference to track API calls end to end in the MS logs
Added logid to the logging framework; all API call logs can now be tracked end to end with this id.
* pr/1167:
CLOUDSTACK-9105: Logging enhancement: Handle/reference to track API calls end to end in the MS logs
Added logid to the logging framework; all API call logs can now be tracked end to end with this id.
Signed-off-by: Daan Hoogland <daan@onecht.net>
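One common way to carry such an id through every log line of a call is a per-thread context value that the log formatter prepends. The sketch below shows that general technique only; the class, the method names, and the `(logid:...)` prefix format are assumptions for illustration and are not taken from this PR.

```java
public class LogIdSketch {

    // Holds the id of the API call being handled on the current thread.
    private static final ThreadLocal<String> LOG_ID = new ThreadLocal<>();

    /** Called when an API call starts being processed on this thread. */
    public static void begin(String logId) {
        LOG_ID.set(logId);
    }

    /** Called when the API call finishes, so the id does not leak to the next call. */
    public static void end() {
        LOG_ID.remove();
    }

    /** Prefixes the message with the current log id, if one is set. */
    public static String format(String message) {
        String id = LOG_ID.get();
        return id == null ? message : "(logid:" + id + ") " + message;
    }

    public static void main(String[] args) {
        begin("abc123"); // example id; a real id would be generated per call
        System.out.println(format("deploy started"));
        System.out.println(format("deploy finished"));
        end();
        System.out.println(format("no call in progress"));
    }
}
```

Searching the logs for a single id then returns every line written while that call was being handled on the thread.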
CLOUDSTACK-9047 rename enums
Make enums adhere to best-practice naming conventions
* pr/1049:
CLOUDSTACK-9046 rename enums to adhere to naming conventions
CLOUDSTACK-9046 renamed enums in kvm plugin
CLOUDSTACK-9047 use 'State's only with context
There are more types called 'State' (or that should be called so but are currently 'state'), so remove the imports and prepend their enclosing class/context to them.
Signed-off-by: Daan Hoogland <daan@onecht.net>
* 4.6:
CLOUDSTACK-9075 - Uses the same vlan since it should have been already released
CLOUDSTACK-9075 - Adds VPC static routes test
CLOUDSTACK-9075 - Covers Private GW ACL with Redundant VPCs
CLOUDSTACK-9075 - Add method to get list of Physical Networks per zone
CLOUDSTACK-6276 Removing unused parameter in integration test for projects
CLOUDSTACK-6276 Removing unused parameter in integration test
CLOUDSTACK-6276 Fixing affinity groups for projects
CLOUDSTACK-6276 Fixing affinity groups for projects
With some contributions from @resmo and @ustcweizhou.
This closes https://github.com/apache/cloudstack/pull/508
To test manually (need at least 2 hosts):
Create a project
Create an affinity group in that project
Deploy a vm with that affinity group
Deploy a second vm with that affinity group
They should be on different hosts
Ran old and new tests for affinity groups on the simulator
Test create affinity group as admin in project ... === TestName: test_01_admin_create_aff_grp_for_project | Status : SUCCESS ===
ok
Test create affinity group as domain admin for projects ... === TestName: test_02_doadmin_create_aff_grp_for_project | Status : SUCCESS ===
ok
Test create affinity group as user for projects ... === TestName: test_03_user_create_aff_grp_for_project | Status : SUCCESS ===
ok
Test create affinity group that exists (same name) for projects ... === TestName: test_4_user_create_aff_grp_existing_name_for_project | Status : SUCCESS ===
ok
#Delete Affinity Group by id. ... === TestName: test_01_delete_aff_grp_by_id | Status : SUCCESS ===
ok
#Delete Affinity Group by id should fail for user not in project ... === TestName: test_02_delete_aff_grp_by_id_another_user | Status : SUCCESS ===
ok
test DeployVM in anti-affinity groups ... === TestName: test_01_deploy_vm_anti_affinity_group | Status : SUCCESS ===
ok
test DeployVM in anti-affinity groups with more vms than hosts. ... === TestName: test_02_deploy_vm_anti_affinity_group_fail_on_not_enough_hosts | Status : SUCCESS ===
ok
List affinity group for a vm for projects ... === TestName: test_01_list_aff_grps_for_vm | Status : SUCCESS ===
ok
List multiple affinity groups associated with a vm for projects ... === TestName: test_02_list_multiple_aff_grps_for_vm | Status : SUCCESS ===
ok
List affinity groups by id for projects ... === TestName: test_03_list_aff_grps_by_id | Status : SUCCESS ===
ok
List Affinity Groups by name for projects ... === TestName: test_04_list_aff_grps_by_name | Status : SUCCESS ===
ok
List Affinity Groups by non-existing id for projects ... === TestName: test_05_list_aff_grps_by_non_existing_id | Status : SUCCESS ===
ok
List Affinity Groups by non-existing name for projects ... === TestName: test_06_list_aff_grps_by_non_existing_name | Status : SUCCESS ===
ok
List affinity group should list all for a vms associated with that group for projects ... === TestName: test_07_list_all_vms_in_aff_grp | Status : SUCCESS ===
ok
Update the list of affinityGroups by using affinity groupids ... === TestName: test_01_update_aff_grp_by_ids | Status : SUCCESS ===
ok
----------------------------------------------------------------------
Ran 16 tests in 581.706s
OK
Deploy vm as Admin in Affinity Group belonging to regular user (should fail) ... === TestName: test_01_deploy_vm_another_user | Status : SUCCESS ===
ok
Create Affinity Group as admin for regular user ... === TestName: test_02_create_aff_grp_user | Status : SUCCESS ===
ok
List Affinity Groups as admin for all the users ... === TestName: test_03_list_aff_grp_all_users | Status : SUCCESS ===
ok
List Affinity Groups belonging to admin user ... === TestName: test_04_list_all_admin_aff_grp | Status : SUCCESS ===
ok
List Affinity Groups belonging to regular user passing account id and domain id ... === TestName: test_05_list_all_users_aff_grp | Status : SUCCESS ===
ok
List Affinity Groups belonging to regular user passing group id ... === TestName: test_06_list_all_users_aff_grp_by_id | Status : SUCCESS ===
ok
Delete Affinity Group belonging to regular user ... === TestName: test_07_delete_aff_grp_of_other_user | Status : SUCCESS ===
ok
Test create affinity group as admin ... === TestName: test_01_admin_create_aff_grp | Status : SUCCESS ===
ok
Test create affinity group as domain admin ... === TestName: test_02_doadmin_create_aff_grp | Status : SUCCESS ===
ok
Test create affinity group as user ... === TestName: test_03_user_create_aff_grp | Status : SUCCESS ===
ok
Test create affinity group that exists (same name) ... === TestName: test_04_user_create_aff_grp_existing_name | Status : SUCCESS ===
ok
Test create affinity group with existing name but within different account ... === TestName: test_05_create_aff_grp_same_name_diff_acc | Status : SUCCESS ===
ok
Test create affinity group of non-existing type ... === TestName: test_06_create_aff_grp_nonexisting_type | Status : SUCCESS ===
ok
Delete Affinity Group by name ... === TestName: test_01_delete_aff_grp_by_name | Status : SUCCESS ===
ok
Delete Affinity Group as admin for an account ... === TestName: test_02_delete_aff_grp_for_acc | Status : SUCCESS ===
ok
Delete Affinity Group which has vms in it ... === TestName: test_03_delete_aff_grp_with_vms | Status : SUCCESS ===
ok
Delete Affinity Group with id which does not belong to this user ... === TestName: test_05_delete_aff_grp_id | Status : SUCCESS ===
ok
Delete Affinity Group by name which does not belong to this user ... === TestName: test_06_delete_aff_grp_name | Status : SUCCESS ===
ok
Delete Affinity Group by id. ... === TestName: test_08_delete_aff_grp_by_id | Status : SUCCESS ===
ok
Root admin should be able to delete affinity group of other users ... === TestName: test_09_delete_aff_grp_root_admin | Status : SUCCESS ===
ok
Deploy VM without affinity group ... === TestName: test_01_deploy_vm_without_aff_grp | Status : SUCCESS ===
ok
Deploy VM by aff grp name ... === TestName: test_02_deploy_vm_by_aff_grp_name | Status : SUCCESS ===
ok
Deploy VM by aff grp id ... === TestName: test_03_deploy_vm_by_aff_grp_id | Status : SUCCESS ===
ok
test DeployVM in anti-affinity groups ... === TestName: test_04_deploy_vm_anti_affinity_group | Status : SUCCESS ===
ok
Deploy vms by affinity group id ... === TestName: test_05_deploy_vm_by_id | Status : SUCCESS ===
ok
Deploy vm in affinity group of another user by name ... === TestName: test_06_deploy_vm_aff_grp_of_other_user_by_name | Status : SUCCESS ===
ok
Deploy vm in affinity group of another user by id ... === TestName: test_07_deploy_vm_aff_grp_of_other_user_by_id | Status : SUCCESS ===
ok
Deploy vm in multiple affinity groups ... === TestName: test_08_deploy_vm_multiple_aff_grps | Status : SUCCESS ===
ok
Deploy multiple vms in multiple affinity groups ... === TestName: test_09_deploy_vm_multiple_aff_grps | Status : SUCCESS ===
ok
Deploy VM by aff grp name and id ... === TestName: test_10_deploy_vm_by_aff_grp_name_and_id | Status : SUCCESS ===
ok
List affinity group for a vm ... === TestName: test_01_list_aff_grps_for_vm | Status : SUCCESS ===
ok
List multiple affinity groups associated with a vm ... === TestName: test_02_list_multiple_aff_grps_for_vm | Status : SUCCESS ===
ok
List affinity groups by id ... === TestName: test_03_list_aff_grps_by_id | Status : SUCCESS ===
ok
List Affinity Groups by name ... === TestName: test_04_list_aff_grps_by_name | Status : SUCCESS ===
ok
List Affinity Groups by non-existing id ... === TestName: test_05_list_aff_grps_by_non_existing_id | Status : SUCCESS ===
ok
List Affinity Groups by non-existing name ... === TestName: test_06_list_aff_grps_by_non_existing_name | Status : SUCCESS ===
ok
List affinity group should list all for a vms associated with that group ... === TestName: test_07_list_all_vms_in_aff_grp | Status : SUCCESS ===
ok
Update the list of affinityGroups by using affinity groupids ... === TestName: test_01_update_aff_grp_by_ids | Status : SUCCESS ===
ok
Update the list of affinityGroups by using affinity groupnames ... === TestName: test_02_update_aff_grp_by_names | Status : SUCCESS ===
ok
Update the list of affinityGroups for vm which is not associated ... === TestName: test_03_update_aff_grp_for_vm_with_no_aff_grp | Status : SUCCESS ===
ok
Update the list of Affinity Groups to empty list ... SKIP: Skip - Failing - work in progress
Update the list of Affinity Groups on running vm ... === TestName: test_05_update_aff_grp_on_running_vm | Status : SUCCESS ===
ok
----------------------------------------------------------------------
Ran 42 tests in 976.432s
OK (SKIP=1)
* pr/1134:
CLOUDSTACK-6276 Removing unused parameter in integration test for projects
CLOUDSTACK-6276 Removing unused parameter in integration test
CLOUDSTACK-6276 Fixing affinity groups for projects
Signed-off-by: Remi Bergsma <github@remi.nl>
Removed unused code from the EngineHostDao Implementation
After analysing the code within the EngineHostDaoImpl class, we noticed that the following methods:
countBy;
findByGuid;
findAndUpdateDirectAgentToLoad;
findAndUpdateApplianceToLoad;
markHostsAsDisconnected;
listAllUpAndEnabledNonHAHosts;
findLostHosts;
getRunningHostCounts;
getNextSequence;
countRoutingHostsByDataCenter;
updateResourceState;
findByTypeNameAndZoneId;
findHypervisorHostInCluster;
lockOneRandomRow;
And variables:
status_logger;
state_logger;
have no usages. To clean up the code, we removed them from EngineHostDaoImpl and its interface (EngineHostDao).
All of EngineHostDaoImpl's attributes were set to private, given that they are not accessed by any other class.
* pr/942:
Removed unused code from the EngineHostDao Implementation.
Signed-off-by: Remi Bergsma <github@remi.nl>