* Create database upgrade path from 4.11.0.0 to 4.11.1.0. Add the missing VMware version to OS mapping SQL in schema-41100to41110.sql.
* Add a unit test and add a 4.11.0.0 entry to _upgradeMap
* CLOUDSTACK-10278 - WIP: need to test this script before creating a pull request
* CLOUDSTACK-10278 - added more idempotent stored procedures, and moved every line that ends with a semicolon inside an existing procedure onto one line, because com/cloud/utils/db/ScriptRunner.java executes the SQL as soon as it reads a line ending with the semicolon delimiter.
* CLOUDSTACK-10278 - changed more SQL statements to call idempotent stored procedures (see the sketch below)
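As a minimal sketch of the pattern (modelled on the IDEMPOTENT_ADD_COLUMN style used in CloudStack upgrade scripts; the CALL arguments are illustrative): the CONTINUE HANDLER for MySQL error 1060 ("duplicate column") turns a re-run into a no-op, and the procedure body sits on a single line so ScriptRunner does not split it at an embedded semicolon.

```sql
DROP PROCEDURE IF EXISTS `cloud`.`IDEMPOTENT_ADD_COLUMN`;

-- Body kept on one line: ScriptRunner sends a statement as soon as a line ends with ';'
CREATE PROCEDURE `cloud`.`IDEMPOTENT_ADD_COLUMN` (IN in_table_name VARCHAR(200), IN in_column_name VARCHAR(200), IN in_column_definition VARCHAR(1000))
BEGIN DECLARE CONTINUE HANDLER FOR 1060 BEGIN END; SET @ddl = CONCAT('ALTER TABLE ', in_table_name, ' ADD COLUMN ', in_column_name, ' ', in_column_definition); PREPARE stmt FROM @ddl; EXECUTE stmt; DEALLOCATE PREPARE stmt; END;

-- Illustrative call: safe to run repeatedly
CALL `cloud`.`IDEMPOTENT_ADD_COLUMN`('cloud.host', 'example_flag', 'TINYINT(1) DEFAULT 0');
```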
The new CA framework introduced basic support for a comma-separated
list of management servers for agents, which makes an external LB
unnecessary.
This extends that feature with LB sorting algorithms that sort the
management server list before it is sent to the agents.
This adds central intelligence in the management server, and enhances
the Agent class to be algorithm-aware and to run a background mechanism
that checks and falls back to the preferred management server (assumed
to be the first in the list). This supports any indirect agent such as
the KVM, CPVM and SSVM agents, and provides support for management
server host migration during upgrade (when, instead of in-place
upgrades, new hosts are used to set up new management servers).
This FR introduces two new global settings:
- `indirect.agent.lb.algorithm`: The algorithm for the indirect agent LB.
- `indirect.agent.lb.check.interval`: The preferred host check interval
for the agent's background task that checks and switches to the agent's
preferred host.
The indirect.agent.lb.algorithm setting supports the following algorithm
options (see the sketch after this list):
- static: use the list as provided.
- roundrobin: evenly spreads hosts across management servers, based on
the host's id.
- shuffle: (pseudo-)randomly sorts the list (not recommended for production).
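As a rough sketch (class and method names here are illustrative, not the actual CloudStack implementation), the three algorithms could order a given host's management server list like this:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: one plausible ordering per algorithm,
// assuming a non-empty list of management server addresses.
public class IndirectAgentLBSketch {

    // static: hand the list to every agent exactly as configured.
    static List<String> staticOrder(final List<String> msList) {
        return new ArrayList<>(msList);
    }

    // roundrobin: rotate the list by the host's id so hosts spread
    // evenly across the configured management servers.
    static List<String> roundRobinOrder(final List<String> msList, final long hostId) {
        final List<String> ordered = new ArrayList<>(msList);
        Collections.rotate(ordered, -(int) (hostId % msList.size()));
        return ordered;
    }

    // shuffle: pseudo-random order; not recommended for production.
    static List<String> shuffleOrder(final List<String> msList) {
        final List<String> ordered = new ArrayList<>(msList);
        Collections.shuffle(ordered);
        return ordered;
    }
}
```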
Changes to the `indirect.agent.lb.algorithm` and `host` global settings
do not require a restart of the management server(s) or the agents. A
message-bus-based system dynamically reacts to changes in these global
settings and propagates them to all connected agents.
The comma-separated management server list is propagated to agents in the
following cases:
- Addition of a host (including the SSVM and CPVM system VMs).
- Connection or reconnection by the agents to a management server.
- After the admin changes the 'host' and/or the
'indirect.agent.lb.algorithm' global settings.
On the agent side, the 'host' setting is saved in its properties file as:
`host=<comma separated addresses>@<algorithm name>`.
First the agent connects to the management server and sends its current
management server list; the management server compares it and, in case
of a mismatch, sends a new/updated list for the agent to persist.
From the agent's perspective, the first address in the propagated list
is considered the preferred host. A new background task can be
activated by configuring `indirect.agent.lb.check.interval`, which is
a cluster-level global setting in CloudStack; admins can also
override this per agent by configuring 'host.lb.check.interval' in the
`agent.properties` file.
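For example, the relevant `agent.properties` entries might look like this (addresses, algorithm and interval value are all illustrative):

```properties
# Management server list with the LB algorithm appended after '@'
host=10.1.1.1,10.1.1.2,10.1.1.3@roundrobin
# Per-agent override of the preferred host check interval
host.lb.check.interval=300
```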
Every time the agent receives the management server host list and the
algorithm, the host-specific background check interval is also sent, and
the agent dynamically reconfigures the background task without needing a
restart.
Note: The 'static' and 'roundrobin' algorithms strictly check that the
list order matches what they expect; the 'shuffle' algorithm only checks
the contents, not the order, of the comma-separated management server
addresses.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Refactored nuage tests
Added simulator support for ConfigDrive
Allow all nuage tests to run against simulator
Refactored nuage tests to remove code duplication
* Move test data from test_data.py to nuage_test_data.py
Nuage test data is now contained in nuage_test_data.py instead of
test_data.py
Removed all nuage test data from test_data.py
* CLOUD-1252 fixed cleanup of vpc tier network
* Import libVSD into the codebase
* CLOUDSTACK-1253: Volumes are not expunged in simulator
* Fixed some merge issues in test_nuage_vsp_mngd_subnets test
* Implement GetVolumeStatsCommand in Simulator
* Add vspk as marvin nuagevsp dependency, after removing libVSD dependency
* Corrected libVSD files for license purposes; made them pep8/pyflakes compliant
4.10.0.0 users upgrading to 4.11.0.0 may face DB-related discrepancies
due to some PRs that were merged without moving their SQL changes to the
4.10->4.11 upgrade path. Affected 4.10.0.0 users can run the missing SQL
statements manually and then upgrade to 4.11.0.0; since such a workaround
is possible, this ticket is not marked a blocker. In 4.11.1.0+, we'll
move those changes from the 4.9.3.0->4.10.0.0 upgrade path to the
4.10.0.0->4.11.0.0 upgrade path. Ideally we should not be doing this,
but it will fix the issue for future 4.10.0.0 users who want to upgrade
to 4.11.1.0 or 4.12.0.0+.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Automate dynamic roles migration for a missing props file
- In case the commands.properties file is missing, enables dynamic roles.
- Adds a new -D or --default flag to the migrate-dynamicroles.py script
to simply update the global setting and use the default role-rule
permissions (see the example below).
- Adds a warning message asking admins to move to dynamic roles during upgrade
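A hypothetical invocation (only the --default flag comes from this change; whatever database connection flags the script normally takes are omitted here):

```bash
python migrate-dynamicroles.py --default
```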
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This fixes regression failures seen in Trillian and NPEs that caused Travis-related failures.
This also removes the aria2 dependency from the RPMs, which required users to enable/install epel-release.
Finally, this updates the checksums for the 4.11 systemvmtemplates in the DB upgrade path.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Allowed zone-wide primary storage based on a custom plug-in to be added via the GUI in a KVM-only environment (previously this only worked for XenServer and VMware)
Added support for root disks on managed storage with KVM
Added support for volume snapshots with managed storage on KVM
Enabled creating a template directly from a volume (i.e. without having to go through a volume snapshot) on KVM with managed storage
Only allowed resizing of a volume for managed storage on KVM if the volume in question is either not attached to a VM or is attached to a VM in the Stopped state.
Included support for Reinstall VM on KVM with managed storage
Enabled offline migration on KVM from non-managed storage to managed storage and vice versa
Included support for online storage migration on KVM with managed storage (NFS and Ceph to managed storage)
Added support to download (extract) a managed-storage volume to a QCOW2 file
When uploading a file from outside of CloudStack to CloudStack, set the min and max IOPS, if applicable.
Included support for the KVM auto-convergence feature
The compression flag was actually added in version 1.0.3 (encoded as 1000003), as opposed to version 1.3.0 (1003000); changed the check to reflect the correct version (see the sketch after this change list)
On KVM when using iSCSI-based managed storage, if the user shuts a VM down from the guest OS (as opposed to doing so from CloudStack), we need to pass to the KVM agent a list of applicable iSCSI volumes that need to be disconnected.
Added a new Global Setting: kvm.storage.live.migration.wait
For XenServer, added a check to enforce that only volumes from zone-wide managed storage can be storage motioned from a host in one cluster to a host in another cluster (this cannot currently be done with volumes from cluster-scoped managed storage)
Disallowed Storage XenMotion on a VM that has any managed-storage volume with one or more snapshots.
Enabled for managed storage with VMware: Template caching, create snapshot, delete snapshot, create volume from snapshot, and create template from snapshot
Added an SIOC API plug-in to support VMware SIOC
When starting a VM that uses managed storage in a cluster other than the one it was last running in, we need to remove the reference to the iSCSI volume from the original cluster.
Added the ability to revert a volume to a snapshot
Enabled cluster-scoped managed storage
Added support for VMware dynamic discovery
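As a small illustrative sketch of that compression version check (helper names are hypothetical; the encoding is the usual major*1,000,000 + minor*1,000 + micro scheme the parenthesized numbers above follow):

```java
// Hypothetical helper mirroring the encoded version numbers above:
// 1.0.3 -> 1*1_000_000 + 0*1_000 + 3 = 1000003, while 1.3.0 -> 1003000.
static long encodeVersion(final int major, final int minor, final int micro) {
    return major * 1_000_000L + minor * 1_000L + micro;
}

// The compression flag is available from 1.0.3 (1000003), not 1.3.0 (1003000).
static boolean supportsCompressedMigration(final long encodedVersion) {
    return encodedVersion >= encodeVersion(1, 0, 3);
}
```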
IPv4 and IPv6 are two different protocols and the presence of IPv6
in a network does not mean that IPv4 aliases/multiple subnets should
not be configured or supported by the VR.
This if-statement was written almost 5 years ago in an attempt to
add IPv6 support to CloudStack, but was never fully implemented.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
Extending Config Drive support
* Added support for VMware
* Build configdrive.iso on ssvm
* Added support for VPC and Isolated Networks
* Moved implementation to new Service Provider
* UI fix: add support for urlencoded userdata
* Add support for building systemvm behind a proxy
Co-Authored-By: Raf Smeets <raf.smeets@nuagenetworks.net>
Co-Authored-By: Frank Maximus <frank.maximus@nuagenetworks.net>
Co-Authored-By: Sigert Goeminne <sigert.goeminne@nuagenetworks.net>
During storage expunge, the domain's resource counter for primary storage space is not updated. This leads to a situation where the domain's primary storage statistics are overfilled (the statistics only increase but never decrease).
The global scheduled task resourcecount.check.interval > 0 provides a workaround, but does not truly fix the problem: when accounts inside a domain allocate and deallocate primary storage intensively, operations can still end up blocked.
NB: Unable to implement Marvin tests because Marvin places a weird primary storage volume size of 100 in the database when creating a VM from a template. That might be worth opening a new issue as a separate bug.
CloudStack volumes and templates are one single virtual disk in the case of the XenServer/XCP and KVM hypervisors, since the files used for templates and volumes are virtual disks (VHD, QCOW2). However, VMware volumes and templates are in OVA format, which are archives that can contain a complete VM, including multiple VMDKs and other files such as ISOs. Currently, CloudStack only supports template creation based on OVA files containing a single disk. If a user creates a template from an OVA file containing more than one disk and launches an instance using this template, only the first disk is attached to the new instance and the other disks are ignored.
Similarly with uploaded volumes: attaching an uploaded volume that contains multiple disks to a VM results in only one VMDK being attached to the VM.
FS: https://cwiki.apache.org/confluence/display/CLOUDSTACK/Support+OVA+files+containing+multiple+disks
This behavior needs to be improved in VMware to support OVA files with multiple disks for both uploaded volumes and templates, i.e. if a user creates a template from an OVA file containing more than one disk and launches an instance using this template, the first disk should be attached to the new instance as the ROOT disk, and volumes should be created from the other VMDK disks in the OVA file and attached to the instance.
Signed-off-by: Abhinandan Prateek <abhinandan.prateek@shapeblue.com>
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This feature allows using templates and ISOs while avoiding secondary storage as an intermediate cache on KVM. The virtual machine deployment process is enhanced to support bypassed registered templates and ISOs, delegating the work of downloading them to primary storage to the KVM agent instead of the SSVM agent.
Template and ISO registration:
- When the hypervisor is KVM, a checkbox is displayed with the 'Direct Download' label.
- The API methods registerTemplate and registerISO are both extended with a new parameter: directdownload.
- On template or ISO registration, no download job is sent to the SSVM agent; CloudStack only persists an entry in template_store_ref indicating that the template or ISO has been marked as 'Direct Download' (bypassing secondary storage). These entries are persisted as:
template_id = template or ISO id in the vm_template table
store_id = NULL
download_state = BYPASSED
state = Ready
(Note: these entries allow users to deploy virtual machines from registered templates or ISOs)
- A URL validation command is sent to a random KVM host to check whether the template/ISO location can be reached. Metalinks are also supported by this feature; in the case of a metalink, it is fetched and a URL check is performed on each of the URLs it lists.
- The checksum should be provided as indicated on #2246: {ALGORITHM}CHKSUMHASH
- After the template or ISO is registered, it is displayed in the UI (a registration sketch follows).
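As a hedged sketch, registering a direct-download template via CloudMonkey might look like this (the directdownload parameter and checksum format come from this description; the name, URL, hash and UUIDs are illustrative):

```
register template name=centos7-direct displaytext=centos7-direct \
    url=http://mirror.example.com/centos7.qcow2 \
    checksum={SHA-256}<hash> format=QCOW2 hypervisor=KVM \
    zoneid=<zone-uuid> ostypeid=<os-type-uuid> directdownload=true
```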
Virtual machine deployment:
When a 'Direct Download' template is selected for deployment, CloudStack delegates the template download to the destination storage pool via the destination host, using a new pluggable download manager.
The download manager handles the template download depending on the URL protocol. In the case of HTTP, request headers can be set by the user via vm_template_details. Those details should be persisted as:
Key: HTTP_HEADER
Value: HEADERNAME:HEADERVALUE
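For illustration only (assuming the usual template_id/name/value columns of vm_template_details; the template id and header value are made up), such a detail row could look like:

```sql
-- Illustrative: attach a request header to template id 200 for direct download
INSERT INTO cloud.vm_template_details (template_id, name, value)
VALUES (200, 'HTTP_HEADER', 'Authorization:Basic czNjcjN0');
```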
In the case of HTTPS, a new API method, uploadTemplateDirectDownloadCertificate, is added to allow the user to import a client certificate into every KVM host's keystore before deployment.
After the template or ISO is downloaded to primary storage, the usual entry is persisted in template_spool_ref, indicating the mapping between the template/ISO and the storage pool.
Steps to reproduce the issue:
1. Deploy a VM
2. Take a snapshot of the root volume
3. Delete the snapshot
4. Before the garbage collector has run, shut down the VM and assign the VM to another user.
When the garbage collector executes, an NPE shows up in the logs.
This feature allows admins to dedicate a range of public IP addresses to the SSVM and CPVM, such that they can be subject to specific external firewall rules. The option to dedicate a public IP range to the system VMs (SSVM & CPVM) is added to the createVlanIpRange API method and the UI (a call sketch follows).
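A hedged CloudMonkey sketch of dedicating such a range (the forsystemvms parameter name is an assumption based on this description; addresses and IDs are illustrative):

```
create vlaniprange zoneid=<zone-uuid> forvirtualnetwork=true \
    gateway=192.0.2.1 netmask=255.255.255.0 \
    startip=192.0.2.10 endip=192.0.2.20 forsystemvms=true
```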
Solution:
A global setting, 'system.vm.public.ip.reservation.mode.strictness', is added to determine whether the use of the system VM reservation is strict (when true) or preferred (when false); it is false by default.
When a range has been dedicated to system VMs, CloudStack should apply IPs from that range to
the public interfaces of the CPVM and the SSVM depending on the global setting's value:
If the global setting is set to false: CloudStack will use any unused and unreserved public IP
addresses for system VMs only when the pool of reserved IPs has been exhausted.
If the global setting is set to true: CloudStack will fail to deploy the system VM when the pool
of reserved IPs has been exhausted, citing the lack of available IPs.
UI Changes
Under Infrastructure -> Zone -> Physical Network -> Public -> IP Ranges, the 'Account' button label is renamed to 'Set reservation'.
When that button is clicked, the dialog displayed is also reworked, including a new 'System VMs' checkbox which indicates whether the range should be dedicated to the CPVM and SSVM, and a note indicating its usage.
When clicking the button for any created range, the UI dialog displayed indicates whether the IP range is dedicated to system VMs or not.
The first PR (#1176) intended to solve CLOUDSTACK-9025 only tackled the problem for CloudStack deployments that use a single hypervisor type (restricted to XenServer). Additionally, the lack of information regarding that solution (poor documentation, test cases and description in the PRs and the Jira ticket) led the code to be removed in #1124 after a long discussion and analysis in #1056. That piece of code seemed logicless (and it was!). It would receive a hostId and then change that hostId for another hostId of the zone without doing any checks; it was not even checking the hypervisor and storage into which the host was plugged.
The problem reported in CLOUDSTACK-9025 is caused by partial snapshots that are taken in XenServer. That is, we do not take a complete snapshot, but a partial one that contains only the modified data. This requires rebuilding the VHD hierarchy when creating a template out of the snapshot. The point is that the first hostId received is not a hostId at all, but a system VM id (SSVM). That is why the code in #1176 fixed the problem for some deployment scenarios, but would cause problems for scenarios with multiple hypervisors in the same zone. We need to execute the creation of the VHD that represents the template on the hypervisor, so the VHD chain can be built using the parent links.
This commit changes the method com.cloud.hypervisor.XenServerGuru.getCommandHostDelegation(long, Command). From now on we replace the hostId that is intended to execute the "copy command" that will create the VHD of the template, according to some conditions that were already in place. The idea is that starting with XenServer 6.2.0 hotfix ESP1004 we need to execute the command on the hypervisor host and not from the SSVM. Moreover, the method was made more readable and understandable, and test cases were created assuring that from XenServer 6.2.0 hotfix ESP1004 onward we change the hostId that will be used to execute the "copy command".
Furthermore, we no longer select a random host from the zone. A new method, "findHostConnectedToSnapshotStoragePoolToExecuteCommand", was introduced in the HostDao; using this method we look for a host in the cluster that uses the storage pool holding the volume from which the snapshot was taken. By doing this, we guarantee that the host connected to the primary storage where all of the snapshot's parent VHDs are stored is used to create the template.
Consider using Disabled hosts when no Enabled hosts are found.
This also closes #2317
This extends the work presented in #2048, in which the ability to extend the management range was provided.
Aim
This PR allows separating the management network subnet used by the SSVM and CPVM from the virtual routers' management subnet.
Detailed use case
PCI compliance requires that network elements are defined as 'in scope' or 'out of scope' for compliance purposes. The SSVM and CPVM are both in scope as they allow public HTTP or HTTPS connections. The virtual routers have been defined as out of scope as they have been placed entirely in a firewalled network segment. However, all of the system VM types share the management network. As the SSVM and CPVM are both in scope, this would bring the virtual routers into scope as well, requiring individual audits of every virtual router. As this is not practical, the 'management network' which the SSVM and CPVM are on and the management network which the virtual routers are on must be separated by a firewall.
Description
With this feature it is possible to dedicate a created range to the SSVM and CPVM (system VMs) and provide a VLAN ID for the range.
A new boolean global configuration is added: system.vm.management.ip.reservation.mode.strictness. If enabled, the use of system VM management IP reservation is strict; otherwise it is preferred. The default value is false (preferred).
Strict reservation: system VMs must get a private IP from a range marked for system VMs. If none is available, deployment fails.
Preferred reservation: system VMs will try to get a private IP from a range marked for system VMs. If none is available, an IP from a range not marked for system VMs is taken.
In CLOUDSTACK-9886, we were reading ping.interval and ping.timeout using ConfigDao, which involves reading the DB directly. This replaces that with the ConfigKey-based approach.
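A minimal sketch of the ConfigKey style (the category, default and description shown here are illustrative, not the actual declarations):

```java
import org.apache.cloudstack.framework.config.ConfigKey;

// Hypothetical declaration: the value is managed by the config framework
// instead of being fetched from the DB via ConfigurationDao.
static final ConfigKey<Integer> PingInterval = new ConfigKey<>(
        "Advanced", Integer.class, "ping.interval", "60",
        "Interval (in seconds) between pings to agents", true);

// Reading the current value:
final int pingIntervalSeconds = PingInterval.value();
```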
A snapshot on primary storage is not cleaned up after storage migration. This happens in the following scenario:
Steps To Reproduce
1. Create an instance on local storage on any host
2. Create a scheduled snapshot of the volume
3. Wait until ACS has created the snapshot. ACS creates a snapshot on local storage and transfers it to secondary storage, but the latest snapshot stays on local storage. This is as expected.
4. Migrate the instance to another XenServer host with the ACS UI and storage live migration
The snapshot on the old host's local storage is not cleaned up and stays there, so local storage fills up with unneeded snapshots.
Consider this scenario:
1. User launches a VM from a template and keeps it running
2. Admin logs in and deletes that template [CloudPlatform does not check for existing/running VMs etc. while the deletion is done]
3. User resets the VM
4. CloudPlatform fails to start the VM as it cannot find the corresponding template.
It throws an error such as:
java.lang.RuntimeException: Job failed due to exception Resource [Host:11] is unreachable: Host 11: Unable to start instance due to can't find ready template: 209 for data center 1
at com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:113)
at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:495)
The client is requesting better handling of this scenario. We need to check for existing/running VMs when the template is deleted and warn the admin about the possible issue that may occur.
REPRO STEPS
==================
1. Launch a VM from a template and keep it running
2. Now delete that template
3. Reset the VM
4. CloudPlatform fails to start the VM as it cannot find the corresponding template.
EXPECTED BEHAVIOR
==================
The cloud platform should throw a warning message when a template that is being used by existing/running VMs is deleted
ACTUAL BEHAVIOR
==================
The cloud platform does not throw any warning.
Ran into an issue today where we passed both the "id" and "domainid" parameters into "listAccounts" and received a response despite the account id passed not belonging to the domainid passed.
Allow usage of "domainid" AND "id" in "listAccounts"
- Adding "AccountDao::findActiveAccountById"
- Adding "AccountDaoImpl::findActiveAccountById"
- Removing seemingly pointless "listForDomain" parameter
- Updating "typeNEQ" value from "5" to "Account.ACCOUNT_TYPE_PROJECT" (which is "5")
- Only attempt to load the domain for the "path" query parameter once
"searchForAccountsInternal" input validation logic pseudo-code:
- If "domainid" set, check immediately
- If "id" not set:
- and user is admin and "listall" is true
- if "domainid" not set, use caller domain id
- force "isrecursive" true
- else use caller account id
- Else if "domainid" and "name" set
- verify existence of account and that user has access
- Else:
- if "domainid" not set, locate account by "id"
- else, locate account by "id" and "domainid"
- verify account found and caller has access rights
Python's base64 requires that the string length be a multiple of 4
characters, but the Apache codec does not. The RFC states padding is not
mandatory, so such data should not make the VR script (vmdata.py) fail;
for example, unpadded 'aGk' ('hi') decodes fine with the Apache codec but
raises an error in Python unless padded to 'aGk='.
Signed-off-by: Marc-Aurèle Brothier <m@brothier.org>
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This includes test-related fixes and code review fixes based on
reviews from @rafaelweingartner, @marcaurele, @wido and @DaanHoogland.
This also includes a VMware disk-resize limitation bug fix based on comments
from @sateesh-chodapuneedi and @priyankparihar.
This also includes the final changes to the systemvmtemplate and fixes to
code based on issues found via test failures.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This moves the systemvmtemplate migration logic from the previous upgrade
path to the 4.10.0.0->4.11.0.0 upgrade path.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
In default/fresh installations, the guest OS type for systemvms with id=15,
i.e. Debian 5 (32-bit), can cause memory allocation issues for the guest.
Using 'Other Linux 64-bit' as the guest OS, systemvms get all of the
allocated RAM. This avoids OOM-related kernel panics for certain VRs such
as rVRs, lbvm etc.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This fixes test failures around VMware with the new systemvmtemplate.
In addition:
- Does not skip rVR-related test cases for VMware
- Removes rc.local
- Processes unprocessed cmd_line.json
- Fixes NPEs around VMware tests/code
- On VMware, uses udevadm to reconfigure the nic/mac address rather than rebooting
- Fixes the ACPI shutdown script for faster systemvm shutdowns
- Gives at least 256MB of swap to VRs to avoid OOM on VMware
- Fixes smoke tests for environment-related failures
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Refactors and simplifies the systemvm codebase file structure, keeping
the same resultant systemvm.iso packaging
- Password server systemd script and a new postinit script that runs
before sshd starts
- Fixes to the keepalived and conntrackd configs to make rVRs work again
- New /etc/issue featuring an ascii-based cloudmonkey logo/message and the
systemvmtemplate version
- SystemVM python codebase linted and tested; added pylint/pep8 checks to
Travis
- iptables re-application fixes for non-VR systemvms
- SystemVM template build fixes
- Default secondary storage VM service offering boosted to 2 vCPUs
and RAM equal to the console proxy's
- Fixes to several marvin-based smoke tests, especially rVR-related
tests; rVR tests consider a 3*advert_int+skew timeout before status
is checked
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Fixes a timezone issue where dates show up as 'Invalid' in the UI
- Introduces a new event timeline for listing/filtering events
- Several UI improvements to add columns in list views
- Bulk operations support in the instance list view to shut down and destroy
multiple selected VMs (limitation: after the operation, redundant entries
may show up in the list view; refreshing the VM list view fixes that)
- Aligns table thead/tbody to avoid splitting of tables
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Refactor cloud-early-config and make appliance-specific scripts
- Make patching work without requiring a restart of the appliance, and remove
the postinit script
- Migrate to systemd, speeding up booting/loading
- Takes about 5-15 seconds to boot on KVM, and 10-30 seconds on VMware and XenServer
- Appliance boots and works on KVM, VMware, XenServer and HyperV
- Update the Debian 9 ISO URL with a sha512 checksum
- Speed up console proxy service launch
- Enable additional kernel modules
- Remove an unknown ssh key
- Update the vhd-util URL as the previous URL was down
- Enable sshd by default
- Use hostnamectl to set the hostname
- Disable services by default
- Use the existing log4j xml; patching by cloud-early-config is not necessary
- Several minor fixes and file refactorings; removed dead code/files
- Remove insserv
- Fix dnsmasq config syntax
- Fix haproxy config syntax
- Fix smoke tests and improve performance
- Fix the apache pid file path in cloud.monitoring per the new template
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This feature allows CloudStack administrators to create layer 2 networks in CloudStack. As these networks are purely layer 2, they don't require IP addresses or a Virtual Router; only a VLAN is necessary (provided by the administrator or assigned by CloudStack). Network services such as DNS and DHCP must be handled externally, as they are not provided by L2 networks.
As a consequence, a new Guest Network type is created within CloudStack: L2
Description:
Network offerings and networks support a new guest type: L2.
L2 network offering creation allows the administrator to select Specify VLAN or let CloudStack assign it dynamically.
L2 network creation allows the administrator to specify a VLAN tag (if the network offering allows it) or simply create the network.
VM deployments on L2 networks:
VMs get no IP addresses or any other network service
No Virtual Router is deployed on the network
If Specify VLAN = true for the network offering, the network gets implemented using a dynamically assigned VLAN
UI changes
A new button is added on the Networks tab, available to admins, to allow L2 network creation
At present, the management IP range can only be expanded within the same subnet: relative to the existing range, either the last IP can be extended forward or the first IP extended backward, but an entirely different range from the same subnet cannot be added. The expansion of the range is thus bound to the fixed subnet, and when the range gets exhausted and a user wants to deploy more system VMs, the operation fails. The purpose of this feature is to allow adding further management network IP ranges within the existing subnet; it also allows deleting and listing the IP ranges.
Please refer to the FS here: https://cwiki.apache.org/confluence/display/CLOUDSTACK/Expansion+of+Management+IP+Range
MySQLTransactionRollbackException is seen frequently in the logs
Root Cause
Attempts to lock rows in the core data access layer fail if the database detects a possible deadlock. However, the operations are not retried when this happens, so retries are introduced here (see the sketch below).
Solution
Operations are retried after some wait time when a deadlock exception occurs.
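A minimal sketch of the retry pattern (class name, attempt count and wait times are illustrative, not the actual implementation):

```java
import java.util.concurrent.Callable;

// Illustrative retry wrapper: re-run a row-locking DB operation a few
// times with a growing wait when the database rolls it back due to deadlock.
public final class DeadlockRetry {
    private static final int MAX_ATTEMPTS = 3;
    private static final long BASE_WAIT_MS = 100;

    public static <T> T retryOnDeadlock(final Callable<T> operation) throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return operation.call();
            } catch (final Exception e) {
                // In real code this check would target the deadlock/rollback
                // exception type, e.g. MySQLTransactionRollbackException.
                if (attempt >= MAX_ATTEMPTS) {
                    throw e; // give up after the final attempt
                }
                Thread.sleep(BASE_WAIT_MS * attempt); // back off, then retry
            }
        }
    }
}
```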