CLOUDSTACK-9348: NioConnection improvementsReopened PR with squashed changes for a re-review and testing after https://github.com/apache/cloudstack/pull/1493 and sub-sequent PRs got reverted
* pr/1549:
CLOUDSTACK-9348: NioConnection improvements
Signed-off-by: Will Stevens <williamstevens@gmail.com>
Taking fast and efficient volume snapshots with XenServer (and your storage provider)A XenServer storage repository (SR) and virtual disk image (VDI) each have UUIDs that are immutable.
This poses a problem for SAN snapshots, if you intend on mounting the underlying snapshot SR alongside the source SR (duplicate UUIDs).
VMware has a solution for this called re-signaturing (so, in other words, the snapshot UUIDs can be changed).
This PR only deals with the CloudStack side of things, but it works in concert with a new XenServer storage manager created by CloudOps (this storage manager enables re-signaturing of XenServer SR and VDI UUIDs).
I have written Marvin integration tests to go along with this, but cannot yet check those into the CloudStack repo as they rely on SolidFire hardware.
If anyone would like to see these integration tests, please let me know.
JIRA ticket: https://issues.apache.org/jira/browse/CLOUDSTACK-9281
Here's a video I made that shows this feature in action:
https://www.youtube.com/watch?v=YQ3pBeL-WaA&list=PLqOXKM0Bt13DFnQnwUx8ZtJzoyDV0Uuye&index=13
* pr/1403:
Faster logic to see if a cluster supports resigning
Support for backend snapshots with XenServer
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CLOUDSTACK-9366: Capacity of one zone-wide primary storage ignoredDisable and Remove Host operation disables the primary storage capacity.
Steps to replicate:
Base Condition: There exists a host and storage pool with same id
Steps:
1. Find a host and storage pool having same id
2. Disable the host
3. CPU(1) and MEMORY(0) capacity in op_host_capacity for above host is disabled
4. STORAGE(3) capacity in op_host_capacity for storage pool with id same as above host is also disabled
RCA:
'host_id' column in 'op_host_capacity' table used for storing both storage pool id (for STORAGE capacity) and host id (MEMORY and CPU). While disabling a HOST we also disable the capacity associated with host.
Ideally while disabling capacity we should only disable MEMORY and CPU capacity, but we are not doing so.
Code Path:
ResourceManagerImpl.doDeleteHost() -> ResourceManagerImpl.resourceStateTransitTo() -> CapacityDaoImpl.updateCapacityState(null, null, null, host.getId(), capacityState.toString())
updateCapacityState is updating disabling all entries which matches the host_id. This will also disable a entry having storage pool id same as that of host id.
Changes:
introduced new capacityType parameter in updateCapacityState method and necessary changes to add capacity_type clause in sql
also fixed incorrect sql builder logic (unused code path for which it is never surfaced )
Added marvin test to check host and storagepool capacity when host is disabled
Test Result:
```
Before Fix:
mysql> select ohc.host_id, ohc.`capacity_state`, case capacity_type when 0 then 'MEMORY' when 1 then 'CPU' ELSE 'STORAGE' END as 'capacity_type' , total_capacity, case capacity_type when 0 then 'HOST' when 1 then 'HOST' ELSE 'STORAGE POOL' END as 'HOST/STORAGE POOL' from op_host_capacity ohc where host_id=3;
+---------+----------------+---------------+----------------+-------------------+
| host_id | capacity_state | capacity_type | total_capacity | HOST/STORAGE POOL |
+---------+----------------+---------------+----------------+-------------------+
| 3 | Enabled | MEMORY | 8589934592 | HOST |
| 3 | Enabled | CPU | 32000 | HOST |
| 3 | Enabled | STORAGE | 2199023255552 | STORAGE POOL |
+---------+----------------+---------------+----------------+-------------------+
9 rows in set (0.00 sec)
Disable Host 3 from UI.
mysql> select ohc.host_id, ohc.`capacity_state`, case capacity_type when 0 then 'MEMORY' when 1 then 'CPU' ELSE 'STORAGE' END as 'capacity_type' , total_capacity, case capacity_type when 0 then 'HOST' when 1 then 'HOST' ELSE 'STORAGE POOL' END as 'HOST/STORAGE POOL' from op_host_capacity ohc where host_id=3;
+---------+----------------+---------------+----------------+-------------------+
| host_id | capacity_state | capacity_type | total_capacity | HOST/STORAGE POOL |
+---------+----------------+---------------+----------------+-------------------+
| 3 | Disabled | MEMORY | 8589934592 | HOST |
| 3 | Disabled | CPU | 32000 | HOST |
| 3 | Disabled | STORAGE | 2199023255552 | STORAGE POOL |
+---------+----------------+---------------+----------------+-------------------+
After Fix:
mysql> select ohc.host_id, ohc.`capacity_state`, case capacity_type when 0 then 'MEMORY' when 1 then 'CPU' ELSE 'STORAGE' END as 'capacity_type' , total_capacity, case capacity_type when 0 then 'HOST' when 1 then 'HOST' ELSE 'STORAGE POOL' END as 'HOST/STORAGE POOL' from op_host_capacity ohc where host_id=3;
+---------+----------------+---------------+----------------+-------------------+
| host_id | capacity_state | capacity_type | total_capacity | HOST/STORAGE POOL |
+---------+----------------+---------------+----------------+-------------------+
| 3 | Enabled | MEMORY | 8589934592 | HOST |
| 3 | Enabled | CPU | 32000 | HOST |
| 3 | Enabled | STORAGE | 2199023255552 | STORAGE POOL |
+---------+----------------+---------------+----------------+-------------------+
3 rows in set (0.01 sec)
Disable Host 3 from UI.
mysql> select ohc.host_id, ohc.`capacity_state`, case capacity_type when 0 then 'MEMORY' when 1 then 'CPU' ELSE 'STORAGE' END as 'capacity_type' , total_capacity, case capacity_type when 0 then 'HOST' when 1 then 'HOST' ELSE 'STORAGE POOL' END as 'HOST/STORAGE POOL' from op_host_capacity ohc where host_id=3;
+---------+----------------+---------------+----------------+-------------------+
| host_id | capacity_state | capacity_type | total_capacity | HOST/STORAGE POOL |
+---------+----------------+---------------+----------------+-------------------+
| 3 | Disabled | MEMORY | 8589934592 | HOST |
| 3 | Disabled | CPU | 32000 | HOST |
| 3 | Enabled | STORAGE | 2199023255552 | STORAGE POOL |
+---------+----------------+---------------+----------------+-------------------+
3 rows in set (0.00 sec)
Sudhansus-MAC:cloudstack sudhansu$ nosetests-2.7 --with-marvin --marvin-config=setup/dev/advanced.cfg test/integration/component/maint/test_capacity_host_delete.py
==== Marvin Init Started ====
=== Marvin Parse Config Successful ===
=== Marvin Setting TestData Successful===
==== Log Folder Path: /tmp//MarvinLogs//Apr_22_2016_22_42_27_X4VBWD. All logs will be available here ====
=== Marvin Init Logging Successful===
==== Marvin Init Successful ====
===final results are now copied to: /tmp//MarvinLogs/test_capacity_host_delete_9RHSNB===
Sudhansus-MAC:cloudstack sudhansu$ cat /tmp//MarvinLogs/test_capacity_host_delete_9RHSNB/results.txt
test_01_op_host_capacity_disable_host (integration.component.maint.test_capacity_host_delete.TestHosts) ... === TestName: test_01_op_host_capacity_disable_host | Status : SUCCESS ===
ok
----------------------------------------------------------------------
Ran 1 test in 0.168s
OK
```
* pr/1516:
CLOUDSTACK-9366: Capacity of one zone-wide primary storage ignored
Signed-off-by: Will Stevens <williamstevens@gmail.com>
dynamic-roles: packaging improvementsOn some MySQL server envs, this may cause a SQL statement error, though
I was unable to reproduce it. Since it's not needed, an order by 'sort_order'
is enough, we can safely remove it.
/cc @swill @anshulgangwar @DaanHoogland and others
* pr/1551:
migrate-dynamicroles: use mysql.connector due to #1054
packaging: don't bundle systemvm.zip in rpms
packaging: backup commands.properties as it does not exist in new rpms
dynamic-roles: remove unnecessary order by ID
Signed-off-by: Will Stevens <williamstevens@gmail.com>
introduced new capacityType parameter in updateCapacityState method and necessary changes to add capacity_type clause in sql
also fixed incorrect sql builder logic (unused code path for which it is never surfaced )
Added marvin test to check host and storagepool capacity when host is disabled
Added conditions to ensure the capacity_type is added only when capacity_type length is greater than 0.
Added checks in marvin test to ensure the capacity exists for a host before disabling it.
Added checks to avoid index out of range exception
- Unit test to demonstrate denial of service attack
The NioConnection uses blocking handlers for various events such as connect,
accept, read, write. In case a client connects NioServer (used by
agent mgr to service agents on port 8250) but fails to participate in SSL
handshake or just sits idle, this would block the main IO/selector loop in
NioConnection. Such a client could be either malicious or aggresive.
This unit test demonstrates such a malicious client that can perform a
denial-of-service attack on NioServer that blocks it to serve any other client.
- Use non-blocking SSL handshake
- Uses non-blocking socket config in NioClient and NioServer/NioConnection
- Scalable connectivity from agents and peer clustered-management server
- Removes blocking ssl handshake code with a non-blocking code
- Protects from denial-of-service issues that can degrade mgmt server responsiveness
due to an aggressive/malicious client
- Uses separate executor services for handling ssl handshakes
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* 4.7:
Fix Sync of template.properties in Swift
Configure rVPC for router.redundant.vrrp.interval advert_int setting
Have rVPCs use the router.redundant.vrrp.interval setting
Resolve conflict as forceencap is already in master
Split the cidr lists so we won't hit the iptables-resture limits
Check the existence of 'forceencap' parameter before use
Do not load previous firewall rules as we replace everyhing anyway
Wait for dnsmasq to finish restart
Remove duplicate spaces, and thus duplicate rules.
Restore iptables at once using iptables-restore instead of calling iptables numerous times
Add iptables copnversion script.
On some MySQL server envs, this may cause a SQL statement error, though
I was unable to reproduce it. Since it's not needed, an order by 'sort_order'
is enough, we can safely remove it.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This reverts commit 7ce0e10fbc, reversing
changes made to 29ba71f2db.
This was reverted because it seemed to be related to an issue
when doing a DeployDC, causing an `addHost` error.
CLOUDSTACK-9265 cleanup around httpclient versionssome cleanup done
- replaced HttpStatus from org.apache.commons.httpclient with that from org.apache.http
- removed unthrown HttpException
- left auto reformat in place
* pr/1385:
CLOUDSTACK-9265 cleanup around httpclient versions
Signed-off-by: Will Stevens <williamstevens@gmail.com>
Notify listeners when a host has been added to a cluster, is about to be removed from a cluster, or has been removed from a cluster
This PR addresses the following JIRA ticket:
https://issues.apache.org/jira/browse/CLOUDSTACK-8813
The problem is that there needs to be notifications sent when a host is added to, about to be removed from, and removed from a cluster.
Such notifications can be used for many purposes. For example, it can allow storage plug-ins to update ACLs on their storage systems. Also, it can allow us to clean up IQNs from ESXi hosts that are no longer needed.
* pr/816:
CLOUDSTACK-8813: Notify listeners when a host has been added to a cluster, is about to be removed from a cluster, or has been removed from a cluster
Signed-off-by: Will Stevens <williamstevens@gmail.com>
Support access to a host’s out-of-band management interface (e.g. IPMI, iLO,
DRAC, etc.) to manage host power operations (on/off etc.) and querying current
power state in CloudStack.
Given the wide range of out-of-band management interfaces such as iLO and iDRA,
the service implementation allows for development of separate drivers as plugins.
This feature comes with a ipmitool based driver that uses the
ipmitool (http://linux.die.net/man/1/ipmitool) to communicate with any
out-of-band management interface that support IPMI 2.0.
This feature allows following common use-cases:
- Restarting stalled/failed hosts
- Powering off under-utilised hosts
- Powering on hosts for provisioning or to increase capacity
- Allowing system administrators to see the current power state of the host
For testing this feature `ipmisim` can be used:
https://pypi.python.org/pypi/ipmisim
FS:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Out-of-band+Management+for+CloudStack
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This feature allows root administrators to define new roles and associate API
permissions to them.
A limited form of role-based access control for the CloudStack management server
API is provided through a properties file, commands.properties, embedded in the
WAR distribution. Therefore, customizing API permissions requires unpacking the
distribution and modifying this file consistently on all servers. The old system
also does not permit the specification of additional roles.
FS:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Dynamic+Role+Based+API+Access+Checker+for+CloudStack
DB-Backed Dynamic Role Based API Access Checker for CloudStack brings following
changes, features and use-cases:
- Moves the API access definitions from commands.properties to the mgmt server DB
- Allows defining custom roles (such as a read-only ROOT admin) beyond the
current set of four (4) roles
- All roles will resolve to one of the four known roles types (Admin, Resource
Admin, Domain Admin and User) which maintains this association by requiring
all new defined roles to specify a role type.
- Allows changes to roles and API permissions per role at runtime including additions or
removal of roles and/or modifications of permissions, without the need
of restarting management server(s)
Upgrade/installation notes:
- The feature will be enabled by default for new installations, existing
deployments will continue to use the older static role based api access checker
with an option to enable this feature
- During fresh installation or upgrade, the upgrade paths will add four default
roles based on the four default role types
- For ease of migration, at the time of upgrade commands.properties will be used
to add existing set of permissions to the default roles. cloud.account
will have a new role_id column which will be populated based on default roles
as well
Dynamic-roles migration tool: scripts/util/migrate-dynamicroles.py
- Allows admins to migrate to the dynamic role based checker at a future date
- Performs a harder one-way migrate and update
- Migrates rules from existing commands.properties file into db and deprecates it
- Enables an internal hidden switch to enable dynamic role based checker feature
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
CLOUDSTACK-8901: PrepareTemplate job thread hard-coded to max 8 threads The thread pool was hardcoded to use 8 threads,
com.cloud.template.TemplateManagerImpl.configure(String, Map<String, Object>):
_preloadExecutor = Executors.newFixedThreadPool(8, new NamedThreadFactory("Template-Preloader"));
Added the change to pick threadpool size from configuration.
* pr/880:
CLOUDSTACK-8901: PrepareTemplate job thread hard-coded to max 8 threads
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CID-1338387: Deletion of method endPointSelector.selectHypervisorHostFollowing the discussions and analysis presented on PR #1056 create by @DaanHoogland
This PR is intended to push those changes that were discussed there regarding the of endPointSelector.selectHypervisorHost method.
* pr/1124:
Deletion of method endPointSelector.selectHypervisorHost
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CLOUDSTACK-9348: Use non-blocking SSL handshake in NioConnection/Link- Uses non-blocking socket config in NioClient and NioServer/NioConnection
- Scalable connectivity from agents and peer clustered-management server
- Removes blocking ssl handshake code with a non-blocking code
- Protects from denial-of-service issues that can degrade mgmt server responsiveness
due to an aggressive/malicious client
- Uses separate executor services for handling connect/accept events
Changes are covered the NioTest so I did not write a new test, advise how we can improve this. Further, I tried to invest time on writing a benchmark test to reproduce a degraded server but could not write it deterministic-ally (sometimes fails/passes but not always). Review, CI testing and feedback requested /cc @swill @jburwell @DaanHoogland @wido @remibergsma @rafaelweingartner @GabrielBrascher
* pr/1493:
CLOUDSTACK-9348: Use non-blocking SSL handshake
CLOUDSTACK-9348: Unit test to demonstrate denial of service attack
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CLOUDSTACK-8302: Removing snapshots on RBDSnapshot removing implemented if primary datastore is RBD
https://issues.apache.org/jira/browse/CLOUDSTACK-8302
* pr/1230:
CLOUDSTACK-8302 - Cleanup snapshot on KVM with RBD Snapshot removing implemented on RBD. 1. On management side: when created new shanpshot we checking if our primary storage is RBD, then do not remove record from cloud.snapshot_store_ref with link to Ceph image via 'install_path' field. 2. On management side: when removing snapshot, also send command to agent 'DeleteCommand'. 3. On agent side: method implemented 'public Answer deleteSnapshot(final DeleteCommand cmd)'
Signed-off-by: Will Stevens <williamstevens@gmail.com>
- Uses non-blocking socket config in NioClient and NioServer/NioConnection
- Scalable connectivity from agents and peer clustered-management server
- Removes blocking ssl handshake code with a non-blocking code
- Protects from denial-of-service issues that can degrade mgmt server responsiveness
due to an aggressive/malicious client
- Uses separate executor services for handling ssl handshakes
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
CLOUDSTACK-8847: ListServiceOfferings is returning incompatible tagged offerings when called with VM idWhen calling listServiceOfferings with VM id as parameter. It is returning incompatible tagged offerings. It should only list all compatible tagged offerings. Compatible means the new service offering should contain all the tags of the existing service offering(Existing offering SUBSET of new offering). If that is the case It should list in the result and can be upgraded to that offering.
* pr/1321:
CLOUDSTACK-8847: ListServiceOfferings is returning incompatible tagged offerings when called with VM id
Signed-off-by: Will Stevens <williamstevens@gmail.com>
engine/schema: fix upgrade path to work with MySQL 5.7Found this issue when using MySQL 5.7 with Ubuntu 16.04. The upgrade path fix removes an invalid `IGNORE` param that is deprecated now, in the upgrade path we run the alter statement to add an index only if it does not exist so we're good.
For MySQL 5.7, we'll also need to update the docs at some point to include `server-id` along with other parameters. Some of the SQL statements used throughout engine/schema don't adhere to SQL 99 standard which is enforced by default in MySQL 5.7, therefore the following sql-mode (for backward compatibility with mysql 5.6 modes) will be necessary for anyone willing to use MySQL 5.7 (until we fix codebase wide raw and generated sql statements to be SQL99 compliant):
sql-mode="STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION,ERROR_FOR_DIVISION_BY_ZERO,NO_ZERO_DATE,NO_ZERO_IN_DATE,NO_ENGINE_SUBSTITUTION"
server-id = 1
innodb_rollback_on_timeout=1
innodb_lock_wait_timeout=600
max_connections=350
log-bin=mysql-bin
binlog-format = 'ROW'
/cc @swill @jburwell @agneya2001 @wido @DaanHoogland and others
* pr/1517:
engine/schema: fix upgrade path to work with MySQL 5.7
Signed-off-by: Will Stevens <williamstevens@gmail.com>
Snapshot removing implemented on RBD.
1. On management side: when created new shanpshot we checking if our primary storage is RBD,
then do not remove record from cloud.snapshot_store_ref with link to Ceph
image via 'install_path' field.
2. On management side: when removing snapshot, also send command to agent 'DeleteCommand'.
3. On agent side: method implemented 'public Answer deleteSnapshot(final DeleteCommand cmd)'
Found this issue when using MySQL 5.7 with Ubuntu 16.04 with following settings:
sql-mode="STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION,ERROR_FOR_DIVISION_BY_ZERO,NO_ZERO_DATE,NO_ZERO_IN_DATE,NO_ENGINE_SUBSTITUTION"
server-id = 1
innodb_rollback_on_timeout=1
innodb_lock_wait_timeout=600
max_connections=350
log-bin=mysql-bin
binlog-format = 'ROW'
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Updated most dependencies to latest minor releases, EXCEPT:
- Gson 2.x
- Major spring framework version
- Servlet version
- Embedded jetty version
- Mockito version (beta)
- Mysql lib minor version upgrade (breaks mysql-ha plugin)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
CLOUDSTACK-9130: Make RebootCommand similar to start/stop/migrate agent commands w.r.t. "execute in sequence" flag
RebootCommand now behaves in the same way as start/stop/migrate agent commands w.r.t. to sequential/parallel execution.
* pr/1200:
CLOUDSTACK-9130: Make RebootCommand similar to start/stop/migrate agent commands w.r.t. "execute in sequence" flag RebootCommand now behaves in the same way as start/stop/migrate agent commands w.r.t. to sequential/parallel execution.
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CLOUDSTACK-9196: Fixing null pointer exception when vm meta data is synced on upgraded setuphttps://issues.apache.org/jira/browse/CLOUDSTACK-9196
NullPointerException can occur if XenServer reports non-existing VM in cloud DB.
* pr/1274:
CLOUDSTACK-9196: Fixing null pointer exception when vm meta data is synced on upgraded setup.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Removed unused variables from "NetworkStateListener" classWe removed the following variables from "com.cloud.network.NetworkStateListener"
. UsageEventDao _usageEventDao
. NetworkDao _networkDao
We changed the EventBus s_eventBus variable to private, the constructor not to use those variables and applied this change in classes com.cloud.network.IpAddressManagerImpl and org.apache.cloudstack.engine.orchestration.NetworkOrchestrator
* pr/1261:
Removed unused variables from class NetworkStateListener
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Fixed return type Void to void in DataMotionStrategy.The main changes are:
- Changing methods Void to void.
- Removal of the method Void copyAsync(DataObject srcData, DataObject
destData, AsyncCompletionCallback<CopyCommandResult> callback) that was
never used.
We noticed that methods form that class are using the return type Void
with capital V. This way that method has to return a null value at the
end.
Removed trim lines from XenServerStorageMotionStrategy.
The trim lines were removed from XenServerStorageMotionStrategy.
* pr/969:
Changed return of methods from DataMotionStrategy, Void to void
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
CLOUDSTACK-9185: [VMware DRS] VM sync failed with exception due to out-of-band changesSummary: The target "ClusteredVirtualMachineManagerImpl.HandlePowerStateReport" invoked during the VM power state sync is not found as HandlePowerStateReport was not implemented in ClusteredVirtualMachineManagerImpl and was private in VirtualMachineManagerImpl, which was resulting in InvocationTargetException. Changed HandlePowerStateReport() in VirtualMachineManagerImpl to protected.
* pr/1256:
CLOUDSTACK-9185: [VMware DRS] VM sync failed with exception due to out-of-band changes
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
CLOUDSTACK-8860: improve error messages in VM deployment code path.improved the error messages in vm deployment code path. added some more data to the error messages and also fixed some errors using internal ids to use uuids.
* pr/864:
CLOUDSTACK-8860: improve error messages in VM deployment code path.
Signed-off-by: Remi Bergsma <github@remi.nl>
* 4.7:
CLOUDSTACK-9154 - Sets the pub interface down when all guest nets are gone
CLOUDSTACK-9187 - Makes code ready for more something like ethXXXX, if we ever get that far
CLOUDSTACK-9188 - Reads network GC interval and wait from configDao
CLOUDSTACK-9187 - Fixes interface allocation to VRRP instances
CLOUDSTACK-9187 - Adds test to cover multiple nics and nic removal
CLOUDSTACK-9154 - Adds test to cover nics state after GC
CLOUDSTACK-9154 - Returns the guest iterface that is marked as added
Conflicts:
engine/orchestration/src/org/apache/cloudstack/engine/orchestration/NetworkOrchestrator.java
[4.7] Critical VPCVR issues fixed: CLOUDSTACK-9154; CLOUDSTACK-9187; and CLOUDSTACK-9188This PR applies the same fixes as in the PR #1259, but against branch 4.7.
Please refer to PR #1259 for the tests results and all the comments already made there.
Issues fixed are:
* CLOUDSTACK-9154: rVPC doesn't recover from cleaning up of network garbage collector
* CLOUDSTACK-9187: rVPC routers in Master/Master due to concurrency problem when writing the keepalivd.conf
* CLOUDSTACK-9188: NetworkGarbageCollector is not using gc.interval and gc.wait from settings
Those changes have been covered by 2 new tests added to ```smoke/test_vpc_redundant.py```:
* test_04_rvpc_network_garbage_collector_nics
* test_05_rvpc_multi_tiers
The test ```test_04_rvpc_network_garbage_collector_nics``` depends on the global settings for the network.gc.interval and gc.wait. If one wants the test to run quicker, please change the settings (default is 600 seconds for each) and restart the Management Server before running the tests. I would suggest to set it to 60 seconds.
In addition, the NetworkGarbageCollector was redefining the settings above mentioned and not reading their values through ConfigDao. Due to that, the settings were not being applied properly and the test was waiting to long to check the VPC routers.
* pr/1277:
CLOUDSTACK-9154 - Sets the pub interface down when all guest nets are gone
CLOUDSTACK-9187 - Makes code ready for more something like ethXXXX, if we ever get that far
CLOUDSTACK-9188 - Reads network GC interval and wait from configDao
CLOUDSTACK-9187 - Fixes interface allocation to VRRP instances
CLOUDSTACK-9187 - Adds test to cover multiple nics and nic removal
CLOUDSTACK-9154 - Adds test to cover nics state after GC
CLOUDSTACK-9154 - Returns the guest iterface that is marked as added
Signed-off-by: Remi Bergsma <github@remi.nl>
* 4.7:
Implement CheckHealthCommand for NSX controllers
Fix log message that refers to agent, not host
Prevent NullPointerException when host does not belong to a pod
Add Health Check Command to NSX pluginThe NSX plugin does not support the HeathCheckCommand. Instead it fakes a PingCommand as a call tot he control cluster status API.
However, we have seen in production that the management server will sometimes find the NSX controller to be behind on ping and that will trigger a HealthCheckCommand which will return with an unsupported command answer.
Once this happens the controller is put into Alert state and will not recover until the management sever is restarted.
In addition, during the investigation, there will be a null pointer exception due tot he fact that the NSX controllers do not live in a pod.
This PR tries to address those two issues.
* pr/1293:
Implement CheckHealthCommand for NSX controllers
Fix log message that refers to agent, not host
Prevent NullPointerException when host does not belong to a pod
Signed-off-by: Remi Bergsma <github@remi.nl>
* 4.7:
Fix unable to setup more than one Site2Site VPN Connection
FIX S2S VPN rVPC: Check only redundant routers in state MASTER
PEP8 of integration/smoke/test_vpc_vpn
Add S2S VPN test for Redundant VPC
Make integration/smoke/test_vpc_vpn Hypervisor independant
FIX VPN: non-working ipsec commands
[UI] MADNESS
[DB] Add force_encap field to s2s_customer_gateway table
[ROUTER] Add forceencaps field to python router ipsec config method
[TEST] unittest needs rework
[MARVIN] Add forceencap field to VpnCustomerGateway class in marvin base
[CORE] Add Force UDP Encapsulation option to Site2Site VPN
CLOUDSTACK-9186: Root admin cannot see VPC created by Domain admin user
CLOUDSTACK-9192: UpdateVpnCustomerGateway is failing
CLOUDSTACK-6485 prevent ip asignment of private gw iface
CLOUDSTACK-9204 Do not error when staticroute is already gone
make both check lines consistent
CLOUDSTACK-9181 Prevent syntax error in checkrouter.sh
CLOUDSTACK-9202 Bump ssh timeout
[4.7] ADD Force UDP encapsulation option to Site2Site VPNThis PR adds the option to enable forced UDP encapsulation of ESP packets during a setup of a site2site vpn. This options enforces the 'forceencaps' option in the openswan ipsec config:
https://wiki.strongswan.org/projects/strongswan/wiki/ConnSection
* pr/1317:
[UI] MADNESS
[DB] Add force_encap field to s2s_customer_gateway table
[ROUTER] Add forceencaps field to python router ipsec config method
[TEST] unittest needs rework
[MARVIN] Add forceencap field to VpnCustomerGateway class in marvin base
[CORE] Add Force UDP Encapsulation option to Site2Site VPN
Signed-off-by: Remi Bergsma <github@remi.nl>
* 4.7:
CLOUDSTACK-9220 Sort list of domains on Domain tab in UI
Admin cannot see VMs on port forwarding page
Fix mariadb related listCapacity bug (CLOUDSTACK-8966)
CLOUDSTACK-9213 - Split the ACL rules using comma instead of dash.
CLOUDSTACK-9213 - Formatting the code
Summary: The target "ClusteredVirtualMachineManagerImpl.HandlePowerStateReport" invoked during the VM power state sync is not found as HandlePowerStateReport was not implemented in ClusteredVirtualMachineManagerImpl and was private in VirtualMachineManagerImpl, which was resulting in InvocationTargetException. Changed HandlePowerStateReport() in VirtualMachineManagerImpl to protected.
CLOUDSTACK-9134: set device_id as the first device_id not in use instead of nic count
when we restart vpc tiers, the old nics will be removed, and create a new nic.
however, the device_id was set to the nic count, which may be already used.
this commit get the first device_id not in use as the device_id of new nic.
This issue also happen when we add multiple networks to a vm and remove them.
* pr/1209:
CLOUDSTACK-9134: set device_id as the first device_id not in use instead of nic count
Signed-off-by: Daan Hoogland <daan@onecht.net>
when we restart vpc tiers, the old nics will be removed, and create a new nic.
however, the device_id was set to the nic count, which may be already used.
this commit get the first device_id not in use as the device_id of new nic.
This issue also happen when we add multiple networks to a vm and remove them.
Quota service while allowing for scalability will make sure that the cloud is
not exploited by attacks, careless use and program errors. To address this
problem, we propose to employ a quota-enforcement service that allows resource
usage within certain bounds as defined by policies and available quotas for
various entities. Quota service extends the functionality of usage server to
provide a measurement for the resources used by the accounts and domains using a
common unit referred to as cloud currency in this document. It can be configured
to ensure that your usage won’t exceed the budget allocated to accounts/domain
in cloud currency. It will let user know how much of the cloud resources he is
using. It will help the cloud admins, if they want, to ensure that a user does
not go beyond his allocated quota. Per usage cycle if a account is found to be
exceeding its quota then it is locked. Locking an account means that it will not
be able to initiat e a new resource allocation request, whether it is more
storage or an additional ip. Needless to say quota service as well as any action
on the account is configurable.
Changes from Github code review:
- Added marvin test for quota plugin API
- removed unused commented code
- debug messages in debug enabled check
- checks for nulls, fixed access to member variables and feature
- changes based on PR comments
- unit tests for UsageTypes
- unit tests for all Cmd classes
- unit tests for all service and manager impls
- try-catch-finally or try-with-resource in dao impls for failsafe db switching
- remove dead code
- add missing quota calculation case (regression fixed)
- replace tabs with spaces in pom.xmls
- quota: though default value for quota_calculated is 0, the usage server
makes it null while entering usage entries. Flipping the condition so
as to acocunt for that.
- quotatypes: fix NPE in quota type
- quota framework test fixes
- made statement period configurable
- changed default email templates to reflect the fact that exhausted quota may not result in a locked account
- added quotaUpdateCmd that refreshes quota balances and sends alerts and statements
- report quotaSummary command returns quota balance, quota usage and state for all account
- made UI framework changes to allow for text area input in edit views
- process usage entries that have greater than 0 usage
- orocess quota entries only if tariff is non zero
- if there are credit entries but no balance entry create a dummy balance entry
- remove any credit entries that are before the last balance entry
when displaying balance statement
- on a rerun the last balance is now getting added
FS: https://cwiki.apache.org/confluence/display/CLOUDSTACK/Quota+Service+-+FS
PR: https://github.com/apache/cloudstack/pull/768
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
CLOUDSTACK-9105: Logging enhancement: Handle/reference to track API calls end to end in the MS logs
Added logid to logging framework, now all API call logs can be tracked with this id end to end
* pr/1167:
CLOUDSTACK-9105: Logging enhancement: Handle/reference to track API calls end to end in the MS logs Added logid to logging framework, now all API call logs can be tracked with this id end to end
Signed-off-by: Daan Hoogland <daan@onecht.net>
CLOUDSTACK-9047 rename enumsmake enums adhere to best practice naming conventions
* pr/1049:
CLOUDSTACK-9046 rename enums to adhere to naming conventions
CLOUDSTACK-9046 renamed enums in kvm plugin
CLOUDSTACK-9047 use 'State's only with context there are more types called 'State' (or to be called so but now 'state') So remove imports and prepend their enclosing class/context to them.
Signed-off-by: Daan Hoogland <daan@onecht.net>
CLOUDSTACK-9107: Description of global config agent.load.threshold and the actual behavior are not in sync
Made configuration parameter description and behavior in sync.
* pr/1172:
CLOUDSTACK-9107: Description of global config agent.load.threshold and the actual behavior are not in sync Made configuration parameter description and behavior in sync.
Signed-off-by: Remi Bergsma <github@remi.nl>