Since we would introduce a way to specify each service provider in the network
offering, it's better for redundant virtual router as a separate service
provider.
Also isRedundant() flag in the network offering would be removed. Redundant
virtual router temporality won't work from now. Until we're able to add
different network elements/service providers in network_offering.
In the past, the NetworkElement would cover almost all the functionality that
e.g. virtual router can cover: firewall, source NAT, static NAT, password,
VPN... So anyone want to implement the NetworkElement would have to implement
these service's specific methods, even it wouldn't support it. Also, if we want
to find a e.g. FirewallServiceProvider, we have to proceed all the current
network service providers, to call a method to know if it support such service.
That's neither elegant nor scaling way to do it.
As the first step, this patch separates each ServiceProvider from NetworkElement
(there are some interface already out of NetworkElement, so this patch slightly
modifies them too), and only the class would implement the correlated interface, would
have the ability to do these services.
Changes:
- VirtualMachineMgr puts the constraint that if Root volume is already READY, we provide the clusterId in the plan to the deploymentPlanner. Planner then searches for resources only under that cluster.
- If no deployment could be found, deploying VM fails.
- Fixed this, such that incase the root volume is recreatable, we call the planner again by removing the cluster constraint. Planner will then search for resources in other clusters.
- Works for system VMs(SSVM, consoleproxy, virual routers).
This reverts commit 42ab3c94c210d5a29289a5dfd0e44ae99c427f8b.
The commit may not fit for our new network as service framework, because we
would make single router and redundant router as two different service provider,
so the change of network offering should clean up the old network and then setup
new one. Make single router work as redundant router later make no sense in such
condition.
Then every router would have one guest ip. The gateway ip would be used if the
router is not redundant, otherwise the guest ip would be used for guest network.
Changes:
- We were ordering clusters based on capacity of the first-fit host found in each cluster. Due to this, there were cases where we deployed VMs to one cluster instead of balancing off within clusters.
- Now we order the list of clusters by aggregate capacity and choose the ones that have enough capacity for the required VM in this order.
- This should balance the load between clusters instead of bombarding one.
Conflicts:
server/src/com/cloud/capacity/dao/CapacityDao.java
server/src/com/cloud/capacity/dao/CapacityDaoImpl.java
Changes:
- Added a new API 'migrateSystemVm' backed by MigrateSystemVMCmd.java to migrate system VMs (SSVM, consoleproxy, domain routers(router, LB, DHCP))
- This is Admin only action
- The existing API 'migratevirtualmachine' is only for user VMs
1) Introduce new managers - ProjectManager and DomainManager. Moved all domain related code from AccountManager to DomainManager.
2) Moved some code from ManagementServerImpl to the correct managers.
3) New resource limit for Domain - Project
Reviewed-by: Alex
Changes:
- When management server starts, it goes through all the pending work items from op_it_work table and schedules HA work for each. It used to mark each item as done. Instead we should keep the item as pending and let it get marked as Done after the HA work is done.
- Changes in VirtualMachineMgr::advanceStop() :
a) if we find a VM with null hostId, we stop the VM only if it is forced stopped.
b) if VM state transition to Stopping fails,for state Starting and Migrating we try to find the pending work item and then do cleanup the VM. In case state is Stopping we can cleanup directly.
c) We proceed releasing all resources only if state transitioned to 'Stopping'.
- Changes in HA:
a) Depend on VirtualMachineMgr::advanceStop() in case host is not found to do VM cleanup
- When Vm state between mgmt server and agent syncs from starting -> running, mark any pending work item as done.
Conflicts:
server/src/com/cloud/vm/VirtualMachineManagerImpl.java
reviewed-by: Alex/Kelven
Changes:
1. UserVmManagerImpl :: finalizeStart()
Added null check for the cmds.getAnswers() object. Return ‘true’ if null.
2. VirtualMachineManagerImpl :: advanceStart()
Move the line to set PodId to the vm being started above the state transition where hostId gets set, so that podId is not null in case management server goes down when vm starts on the agent. On restart, podId is not updated during fullsync. So this will prevent podId remaining null.
vm.setPodId(dest.getPod().getId());
Added two New values "all" and "default" to global config "network.loadbalancer.haproxy.stats.visibility" . With this change, it can take six possible value:
global - stats visible from public network.
guest-network - stats visible only to guestnetwork.
link-local - stats visible only to link local network(for xen and kvm).
disabled - stats disabled.
all - stats available on public,guest and link-local. (Newly added)
default - stats availble on the serving http port, this does need any specific http port.(Newly added)
Except "default" and "disabled", all the rest of 4 need to configure the stats port.
Force stop the router would release all the resources it used, but router may
still running. Add a column "stop_pending" in the database, and stop it when the
router come back.
Admin would able to choose to force destroy such router, then recover the
network using restartNetwork command with cleanup=false.
Changes:
A KVM agent always connects to the management server itself, we dont have to do direct connect. This part of code was missing updating the DB host entry with hosttags.
Corrected the code to save the hosttags while adding a KVM host.
remove heartbeat entry for this Primary Storage, when put this Primary Storage into maintenance mode
create heartbeat entry for this Primary Storage, when cancal maintenance for this Primary Storage
status 11275: resolved fixed
status 11036: resolved fixed
1) Use row locks instead of global lock when update resource_count table. When update resource_count for account, make sure that we lock account+all related domains
2) Insert resource_count records for account/domain at the moment when account/domain is created.
3) As a part of DB upgrade, insert missing resource_count records for all non-removed accounts/domains
Conflicts:
core/src/com/cloud/alert/AlertManager.java
server/test/com/cloud/agent/MockAgentManagerImpl.java
clean up tests for security group manager v2
move interval to listener -- allows it to be configurable if needed
fix mocks
Enhanced logging for security group manager (from zucchini)
fix merge issues
merge issues
Changes :
- Fixing API doc +response name + errorMessage
- Adding seperate events to Egress rules
- Egress rules Using the same database table as that of ingress with new column type.
Pending Tasks:
- db upgrade
- database table rename from security_ingress_rule to generic name, renaming some of the jave class from ingress to generic name.
- Retesting on kvm
Changes:
To make sure migration does not attempt to pick a host that has running VMs more than the max guest VM's limit:
- Changed manual migration to call host allocators to return a list of hosts suitable for migration. Host allocators check for the max guest VM limit.
- Earlier we returned hosts with enough capacity but now Host Allocators make other checks along with capacity. So the list of hosts returned are hosts that have enough capacity AND satisfy all other conditions like host tags, max guests limit etc. Or in other words Allocators dont return the hosts that dont satisfy all conditions even if they have capacity.
-Therefore, now we mark the list of hosts returned for manual migration as 'suitable' hosts instead of 'hasenoughCapacity' in the HostResponse.
- HA migration already calls allocators, so no change is needed there.
Changes:
- Adding a new table 'hypervisor_capabilities' that will record capabilities for each hypervisor version. Added db schema changes for this.
- Currently a few capabilities have been added, namely, 'max_guests_limit' and 'security_group_enabled'
- Added a new column 'hypervisor_version' to host table. StartupRouting command now takes in this parameter. It should be set when a host connects.
- If a host's hypervisor version is not present, we find all the capabilities rows for that hypervisor type and use the first record.
- 'max_guests_limit' is the maximum number of running guest Vms that a host can have for the given hypervisor.
- Host Allocators use this limit and skip a host if the number of running VMs on that host exceeds this limit.
status 11326: resolved fixed
Also added more logging to the agent rebalance code.
Conflicts:
server/src/com/cloud/agent/manager/ClusteredAgentManagerImpl.java