Part 2
commit 797839360c65cd348d2eb20630521177ab0919de
bug 9154: redundant virtual router
commit 8ff7f230204d4d3a7a4adee75523a9a84f4276fe
bug 9154: Replace domain_router.is_master with domain_router.redundant_state in DB
commit 230b99e9e0b152648f1dd2a5eab6f22315b8e7b4
bug 9154: Add redundant state to DomainRouterResponse
commit ccefb5ff5e83d713798a347c99bce1a0d04b4317
bug 9154: Add router fault state report
commit 7a3090378f9785caecf741b70554f6ea17c41764
bug 9154: Send alert if found two virtual routers in master state
commit 66831056e4bf27665871bccd24e6159071564847
bug 9154: Code clean up
commit bf3f58a85741fa7118bd848a42d8b21baa4478d4
bug 9154: Add isRedundantRouter to DomainRouterResponse
Part 1
This backport contained:
commit 52317c718c25111c2535657139b541db0c9d1e1f
bug 9154: Initial check in for enabling redundant virtual router
commit 54199112055d754371bfb141168fb5538bf6d6ea
Add host verification for CheckRouterCommand
commit cef978a228c90056ead9be10cbc4de74c2b8de76
Fix CheckRouterAnswer's isMaster report
commit 4072f0a6991ac3b63601a1764fbe14188965f62f
Some build fixes and code refactoring for redundant router
commit 4d3350b7cd8ee2706a9bace4437fc194e36c8dd5
Redundant Router: Fix OVS
commit 6a228830e7c46d819fa0c3317e159e041337e887
Fix findByNetwork()/findByNetworkAndPod()'s return
commit c627777b3d5bdbcd60db4032cebd349a5b1ecd83
Redundant Router: Fix isVmAlive()
commit e1275d2514adc41f8744f5107d4069c38be195f1
Only issue CheckRouterCommand to redundant routers
And all modifications to the scripts up to
commit 4e3942462ed3fde3a3d7011e95839e2128fba514
logging changes
in the master branch.
haproxy tuning:
0. Test case:
httpd running in 5 user VMs, all of them created on one XenServer host (16 cores, 42G memory, 10G network)
domR running on another host with the same hardware configuration.
test application, ab, running on another host behind a separate switch
1. haproxy is not a memory-intensive app: I can get 4625.96 connections/s with 1G of memory. It is, however, CPU intensive: domR always uses around 100% CPU on the host.
2. By default you can't get a better connections/s rate, because ip_conntrack_max and tw_bucket are too small; you will see errors in domR like:
"TCP: time wait bucket table overflow" or "nf_conntrack: table full, dropping packet".
So I increased these numbers from 65536 to 1000000, and then I can steadily get around 4600 connections/s when memory is >= 1G.
Here are the connections per second, tested with "ab -n 1000000 -c 100 http://192.168.170.152:880/test.html"
domR memory conn/s
128M: 3545.55
256M: 4081.38
512M: 4318.18
1G: 4625.96
7G: 4745.53
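The limits raised in item 2 can be sketched as sysctl settings. This is only a sketch, not the script change from this commit: the key names net.ipv4.tcp_max_tw_buckets and net.netfilter.nf_conntrack_max are assumptions (older kernels expose the conntrack limit as net.ipv4.netfilter.ip_conntrack_max), and render_sysctl is a hypothetical helper.

```python
# Assumed sysctl keys; the exact names depend on the kernel running in domR.
TUNED_LIMITS = {
    "net.ipv4.tcp_max_tw_buckets": 1000000,     # time-wait bucket table size
    "net.netfilter.nf_conntrack_max": 1000000,  # connection tracking table size
}


def render_sysctl(limits):
    """Return sysctl.conf-style lines, one 'key = value' per setting."""
    return [f"{key} = {value}" for key, value in sorted(limits.items())]


if __name__ == "__main__":
    for line in render_sysctl(TUNED_LIMITS):
        print(line)
```

The rendered lines could be appended to a sysctl configuration file in domR and loaded with "sysctl -p".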
3. If I enable notrack for the connections both between domR and the user VMs and on the public network, which tells iptables in domR not to track those connections during my test, I get a better number, around
5800 connections/s. But we can't enable notrack, as iptables is used to track throughput in domR.
4. In short, with this commit the connection rate of haproxy increases from 1000-2000/s to about 4700/s when domR has 1G of memory or more.
5. How many CPUs need to be assigned to domR to get this number? I haven't finished testing this yet: the CPU is shared by all the VMs on the host, so if other VMs are busy, haproxy performance will suffer.
Changes specific to the Xen hypervisor, and DB upgrade. Changes for VMware were already checked in, in commits 1c310a0d2ae81108386f0dd5c2e899ff00fee9e9 and e71112e2f587f5d6c9c6d5337cfeb1f239f29633. KVM will not support this feature.
Changes:
- Added a new column `source_template_id` to vm_template table to carry the parent/source template ID from which the template was created
- Added the column in db upgrade 224 to 225
- Changed code to save the source_template_id if there is one associated with the volume, or with the volume from which the snapshot was taken
- API response returns the sourcetemplateid field, if set, in all template use cases.
status 9623: resolved fixed
Also set ram_size to 1024 for console proxy offering during the upgrade
Conflicts:
core/src/com/cloud/vm/SecondaryStorageVmVO.java
server/src/com/cloud/agent/manager/allocator/impl/UserConcentratedAllocator.java
server/src/com/cloud/consoleproxy/ConsoleProxyManagerImpl.java
server/src/com/cloud/storage/allocator/LocalStoragePoolAllocator.java
server/src/com/cloud/storage/secondary/SecondaryStorageManagerImpl.java
- Local fix to not log the content for ModifySshKeyCommand.
- For commands that do not want to log the parameters, added the facility to indicate this.
- For such commands, we remove the parameters from the log.
status 9336: resolved fixed
Following changes were made:
* deleteSecurityGroup/authorizeSecurityGroupIngress - removed account/domainId parameters as SG is uniquely identified by id now
* removed account_name field from securityGroup DB table; removed allowed_security_group/allowed_sec_grp_acct from security_ingress_rule.
These values were used for api response generation only for performance purposes; added caching on API level to improve performance
* Added missing security checks for securityGroups/ingressRules
Since private and public keys are logged, this is a Security concern
Changes: Added capability to 'Command' instances to support excluding certain fields from getting logged using GSON @Expose annotation.
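The idea of excluding marked fields from the log can be sketched as follows. This is not the GSON @Expose code from this change: it swaps in Python dataclass field metadata to show the same technique, and ModifySshKeysCommandSketch, hidden() and loggable() are hypothetical names.

```python
from dataclasses import dataclass, field, fields


def hidden():
    """Mark a dataclass field as excluded from log output (hypothetical helper)."""
    return field(default="", metadata={"log": False})


@dataclass
class ModifySshKeysCommandSketch:
    # Hypothetical stand-in for a command that carries key material.
    host: str = ""
    pub_key: str = hidden()
    prv_key: str = hidden()


def loggable(cmd):
    """Return only the fields that are allowed to appear in logs."""
    return {
        f.name: getattr(cmd, f.name)
        for f in fields(cmd)
        if f.metadata.get("log", True)  # unmarked fields are logged by default
    }
```

A logger would then write loggable(cmd) instead of the whole command, so the key fields never reach the log.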
- Update system vm_instance's template_id if it does not match the system vm template.
- Use _templateDao.findSystemVMTemplate to find the latest system vm template.
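The update step above can be sketched roughly as below. The dict layout and the sync_system_vm_templates helper are assumptions for illustration only; the real code works on vm_instance rows and the template returned by _templateDao.findSystemVMTemplate.

```python
def sync_system_vm_templates(system_vms, system_template_id):
    """Point every system VM at the given system VM template id.

    Returns the ids of the VMs whose template_id had to be changed.
    """
    updated = []
    for vm in system_vms:
        if vm["template_id"] != system_template_id:
            vm["template_id"] = system_template_id
            updated.append(vm["id"])
    return updated
```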
- Added a new flag 'allocation_state' to zone,pod,cluster and host
- The possible values for this flag are 'Enabled' or 'Disabled'
- When a new zone,pod,cluster or host is added, allocation_state is 'Disabled' by default.
- For existing zone,pod,cluster or host, the state is 'Enabled'.
- All Add/Update/List commands for each of zone,pod,cluster or host can now take a new parameter 'allocationstate'
- If 'allocation_state' is 'Disabled', Allocators skip that zone or pod or cluster or host.
- For a root admin, ListZones lists all zones including the 'Disabled' zones. But for any other user, the 'Disabled' zones are not included in the response.
- For any use case that creates/deploys/adds/registers a resource and takes in zone as a parameter, we now check if the Zone is 'Disabled'. If it is, the operation cannot be performed by a user other than root-admin. Adding volumes, snapshots and templates are examples of this use case.
- To enable the root admin to test a particular pod/cluster/host, the deployVM command takes a 'host_id' parameter that can be passed in only by the root admin.
If this parameter is passed in by the admin, allocators do not search for hosts and use that host only. StoragePools are searched in the cluster of that host.
If the VM cannot be deployed to that host, allocators and deployVM fail without retrying.
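The rules above can be sketched as follows. This is a simplified illustration, not the allocator code: Host, visible_zones and candidate_hosts are hypothetical names, zones are plain dicts, and the sketch assumes the host named by the admin is used regardless of its allocation state.

```python
from dataclasses import dataclass

ENABLED = "Enabled"
DISABLED = "Disabled"


@dataclass
class Host:
    id: int
    cluster_id: int
    allocation_state: str = DISABLED  # newly added resources start out Disabled


def visible_zones(zones, is_root_admin):
    """Root admins see every zone; other users only see Enabled zones."""
    if is_root_admin:
        return list(zones)
    return [z for z in zones if z["allocation_state"] == ENABLED]


def candidate_hosts(hosts, requested_host_id=None, is_root_admin=False):
    """Pick the hosts an allocator may consider for a deployment.

    With a requested host id (root admin only), only that host is considered
    and there is no fallback. Otherwise Disabled hosts are skipped.
    """
    if requested_host_id is not None:
        if not is_root_admin:
            raise PermissionError("only the root admin may pass a host id")
        return [h for h in hosts if h.id == requested_host_id]
    return [h for h in hosts if h.allocation_state == ENABLED]
```

An empty result from candidate_hosts corresponds to the deployment failing without a retry on other hosts.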
Bug 7723 - merge or re-write host tagging into master / 2.2
Bug 7627 - Need more logging for Allocators
Bug 8317 - Add better resource allocation failure messages
Changes for Deployment Planner to use host and storagePool allocators to find deployment destination.
Also has the changes for host tag feature.
Improved the logging for allocators.