Bug 13641 - OVM add host to OVM cluster results in host remaining in state: Alert
Bug 13652 - OVM add primary storage to OVM cluster FAIL
making Ovm work on Acton
status 13662: resolved fixed
status 13641: resolved fixed
status 13652: resolved fixed
reviewed-by: edison
API docs. Uses XML/XSLT generation system originally by Abhinandan Prateek
(I think). CSS updates from Jessica Tomechak. gen_toc.py and build script
changes by me.
Changes:
To migrate systems using 'use.user.concentrated.pod.allocation' as true and 'vm.allocation.algorithm' as true, we need to
add following changes:
- There will be 5 values to 'vm.allocation.algorithm': 'random', 'firstfit', 'userdispersing', 'userconcentratedpod_random', 'userconcentratedpod_firstfit'
- 'userconcentratedpod_random' means we apply user concentration to pods and clusters. To hosts and pools we use random ordering.
- 'userconcentratedpod_firstfit' means we apply user concentration to pods and clusters. To hosts and pools we use firstfit ordering.
Changes:
- Throw error is anyone tries to update the resource limits for ROOT domain using updateResourceLimit API
- For ROOT domain always return -1 (infinite) limit
- DB upgrade: remove any limits set for ROOT domain
2) Added elasticIp and elasticLb network capabilities. Provided support to create network offering with these capabilities.
3) Added one more default network offering having elasticip and elasticlb
4) Public network support to Basic zone. You can associate/disassociate IP addresses now
Conserve mode means, we can use same IP for different purposes, in order to
"conserve" ip resources. But in this offering, all the service providers should
be the same, and the network created from this offering may be prohibited from
update to different network offering whose services are provided by different
service providers - because different service providers would need different IPs
for different services.
If user want to update the "conserve mode" network with the network offering
that has different service providers, each public IP should have only one usage,
only them the update is allowed.
* SSVM to act as a direct connect agent
* Storage Resources handle SSVM commands
* create-schema.sql already has simulator_network_label. removing the label from create-schema-simulator.sql
Commit 4f904d5fd9dbe5252b7a6075f712e9254059e2c0 "Changes to
PhysicalNetworkTrafficType to accomodate the simulator hypervisor type" broke
master. This patch fixes it.
- Changes to schema file schema-2214to30.sql: moved out cleanup to separate file, added some NAAS changes
- Added physicalNetwork setup to Upgrade2214to30.java data migration
- Unit test and sample file
bug 12318: NaaS: Dynamic CIDR for virtual router
This patch in fact use ExternalGuestNetworkGuru to replace GuestNetworkGuru. The
problem is the virtual router would normally use 10.1.1.0/8 as CIDR, but when we
want to upgrade to external firewall e.g. Netscaler, the CIDR would need to be
changed to different value e.g. 10.x.x.0/24 based on VLAN, because the external
firewall can not support one CIDR for multiply VLAN right now. So we have to use
the same policy for virtual router.
This patch also add one field "specified_cidr" to the networks table. If this
field is true, then it means user specify the CIDR of this network, thus we can
not granutee the CIDR after upgrade is valid, so we would like to prohibit the
upgrade of network offering.
This should also fix bug 12318. The reason for bug 12318 is the pre-set gateway
address of domR is overrided by ExternalGuestNetworkGuru. After this patch,
ExternalGuestNetworkGuru would respect the existed value in Nic, rather than
simply wiping it out. It would do calcuation to get the relevant address after
VLAN changed.
More clean up can be done in the future, when we proved that this policy change
doesn't break...
status 12234: resolved fixed
status 12318: resolved fixed
As per the new design following would be done.
(a) any ISO-derived disk can be extracted
(b) there will be a global config to disable extraction of ISO based volumes.
That way people concerned about (a) can just use (b) to fix it.
Reviewed by : Kishan.
status 11811: resolved fixed
Changes:
- Added a two new deployment planners 'UserDispersingPlanner' and 'UserConcentratedPodPlanner' to the DeploymentPlanners
- Planner can be chosen by setting the global config variable 'vm.allocation.algorithm' to either of the following values:
('random', 'firstfit', 'userdispersing', 'userconcentratedpod')
- By default, the value is 'random'. When the value is 'random', FirstFitPlanner is invoked as before that shuffles the resource lists.
- Now Admin can choose whether the deployment heuristic should be applied starting at cluster or pod level. This can be done by using the
global config variable 'apply.allocation.algorithm.to.pods' which is false by default. Thus by default as earlier, planner starts at clusters directly.
'UserConcentratedPodPlanner' changes:
- Earlier to 3.0, FirstFitPlanner used to reorder the clusters in case this heuristic was chosen.
- Now this is done by a separate planner and is applied only when 'vm.allocation.algorithm' is set to this planner
- It reorders the capacity based clusters/pods such that those pods having more number of Running Vms for the given account are tried first.
- Note that this userconcentration is applied only to pods and clusters. Not to hosts or storagepools within a cluster.
'UserDispersingPlanner' changes:
- 'UserDispersingPlanner' reorders the capacity ordered pods and clusters based on number of 'Running' VMs for the given account in ascending order. Aim is to choose thodes pods/clusters first which have less number of Running VMs for the given account
- Admin can provide weights to capacity and user dispersion so that both parameters get considered in reordering the pods/clusters. This can be done by setting
the global config parameter 'vm.user.dispersion.weight'. Default value is 1. Thus if this planner is chosen, by default, ordering will be done only by number of Running Vms, unless the weight is changed.
- HostAlllocators and StoragePoolAllocators also reorder the hosts and pools by ascending order of number of Running VMS/ Ready Volumes respectively for the given account. Thus try to choose that host or pool within a cluster with less number of VMs for the account.
-made Netscaler, SRX, F5 network elements as pluggable service
-added abstract load balancer device manager ExternaLoadBalancerDeviceManager
-made both F5 and Netscaler pluggable service to extend ExternaLoadBalancerDeviceManager
-added abstract firewall device manager ExternalFirewallDeviceManager
-made SRX pluugable service to extende ExternalFirewallDeviceManager
-added API's to configure and manage netscaler devices
status 11938: resolved fixed
reviewed-by: Frank Zhang
This fix would cover following scenario:
* the customer is upgrading from 2.2.11 to 2.2.13.
* the incorrect indexes are being dropped as a part of 2.2.12 to 2.2.13 upgarde, but we still insert them as a part of 2.2.11 to 2.2.12, and it might lead to the db upgrade failure. The only one way to handle this case - remove them from 2.2.11 to 2.2.12 upgrade path
only owner of the network can access it; if it's domain - all accounts in the domain and domain children can have an access.
* aclType replaces 2 old fields: isShared and isDomainSpecific.
* All 2.2.x account specific networks will have aclType=Account; 2.2.x Domain specific networks - aclType=domain; 2.2.x Zone level networks - aclType=Domain with domainId = Root domain id
- Create Default physicalnetwork and add traffic types while creating a zone
- DeleteProvider should error out if there are networks using the provider.
- Other validations
- ListSupportedNetworkServiceProvidersCmd will now return Providers along with its element's services and boolean 'canEnableIndividualServices' that indicates if for this Provider services can be enabled/disabled
- add & update NetworkServiceProvider changed to take in the list of services to enable. While adding a provider, if list is null then all services supported by the element are enabled by default.
- ListNetworkServices enhanced to take in a provider name and returns services of that specific provider.
As DhcpElement/VirtualRouterElement/RedundantVirtualRouterElement is decided to
be the service provider of the physical network, this API should be called to
add a new element, with correlated network service provider ID.
Then e.g. ConfigureVirtualRouterElementCmd should be called to configure and
enable the element.
DHCP range, domain name, etc. are the property of network, not virtual router
specific.
The focus of virtual router configuration would on separate enable/disable each
service it provided.
- Make all API commands Async and add events
- Make BroadcatsDomainRange case insensitive
- Process all _networkElements to build the Service -> Provider map during NetworkMgr::configure()
It would cover the configuration of DHCPElement, VirtualRouterElement and
RedundantVirtualRouterElement.
Also add foreign key in domain_router table to reflect the domain_router is
created from which element and use what configuration.
- Create Zone changes and changes to data_center table to remove vlan, securityGroup fields
- Physical Network lifecycle APIs
- Physical Network Service Provider APIs
- DB schema changes
* moved all services to the separate table, map them to the network_offering+provider.
* added state/securityGroupEnabled properties for the networkOffering
* added ability to list by state/securityGroupEnabled in listNetworkOfferings api command
2) New service: SourceNat
Since we would introduce a way to specify each service provider in the network
offering, it's better for redundant virtual router as a separate service
provider.
Also isRedundant() flag in the network offering would be removed. Redundant
virtual router temporality won't work from now. Until we're able to add
different network elements/service providers in network_offering.
1) Introduce new managers - ProjectManager and DomainManager. Moved all domain related code from AccountManager to DomainManager.
2) Moved some code from ManagementServerImpl to the correct managers.
3) New resource limit for Domain - Project
We've added "Strategy.Managed" for source nat ip address, to prevent it from
releasing when we try to execute restartNetwork command. But we didn't update
the existed nics when mgmt upgraded. This would result in restartNetwork command
fail(NPE) when try to restart an existed network.
status 11504: resolved fixed
Reviewed-by: Alena
Added two New values "all" and "default" to global config "network.loadbalancer.haproxy.stats.visibility" . With this change, it can take six possible value:
global - stats visible from public network.
guest-network - stats visible only to guestnetwork.
link-local - stats visible only to link local network(for xen and kvm).
disabled - stats disabled.
all - stats available on public,guest and link-local. (Newly added)
default - stats availble on the serving http port, this does need any specific http port.(Newly added)
Except "default" and "disabled", all the rest of 4 need to configure the stats port.
Force stop the router would release all the resources it used, but router may
still running. Add a column "stop_pending" in the database, and stop it when the
router come back.
Admin would able to choose to force destroy such router, then recover the
network using restartNetwork command with cleanup=false.
status 11036: resolved fixed
1) Use row locks instead of global lock when update resource_count table. When update resource_count for account, make sure that we lock account+all related domains
2) Insert resource_count records for account/domain at the moment when account/domain is created.
3) As a part of DB upgrade, insert missing resource_count records for all non-removed accounts/domains
Conflicts:
core/src/com/cloud/alert/AlertManager.java
server/test/com/cloud/agent/MockAgentManagerImpl.java
Changes :
- Fixing API doc +response name + errorMessage
- Adding seperate events to Egress rules
- Egress rules Using the same database table as that of ingress with new column type.
Pending Tasks:
- db upgrade
- database table rename from security_ingress_rule to generic name, renaming some of the jave class from ingress to generic name.
- Retesting on kvm
Changes:
- Adding a new table 'hypervisor_capabilities' that will record capabilities for each hypervisor version. Added db schema changes for this.
- Currently a few capabilities have been added, namely, 'max_guests_limit' and 'security_group_enabled'
- Added a new column 'hypervisor_version' to host table. StartupRouting command now takes in this parameter. It should be set when a host connects.
- If a host's hypervisor version is not present, we find all the capabilities rows for that hypervisor type and use the first record.
- 'max_guests_limit' is the maximum number of running guest Vms that a host can have for the given hypervisor.
- Host Allocators use this limit and skip a host if the number of running VMs on that host exceeds this limit.
Added New value "link-local" to global config network.loadbalancer.haproxy.stats.visibility . With this change it can take new parameter "link-local" value apart from the existing 3 values global,guest-network,disabled.
global - stats visible from public network
guest-network - stats visible only to guestnetwork.
link-local - stats visible only to link local network.
disabled - stats disabled.
Added New value "link-local" to global config network.loadbalancer.haproxy.stats.visibility . With this change it can take new parameter "link-local" value apart from the existing 3 values global,guest-network,disabled.
global - stats visible from public network
guest-network - stats visible only to guestnetwork.
link-local - stats visible only to link local network
disabled - stats disabled.
Changes:
- CreateTemplate and RegisterTemplate now support adding a template tag. It is a string value. This is root-admin only action - only admin can add template tags.
- ListTemplates will return the template tag in response.
- HostAllocator changed to use template tag along with the existing tag on service offering. If both tags are present, allocator now finds hosts satisfying both tags. If no hosts have both tags, allocation will fail.
- DB changes to add new column to vm_template table.
- DB upgrade changes for upgrade from 2.2.10 to 2.2.11
status 11029: resolved fixed
Commit also includes the following:
* map firewall rule to pf/lb/staticNat/vpn when the firewall rule is created as a part of pf/lb/staticNat/vpn rule creation
* when delete firewall rules, also delete related firewall rule
1) Added new apis: createFirewallRule, deleteFirewallRule, listFirewallRules
2) Modified existing apis - added boolean openFirewall parameter to createPortForwardingRule/createIpForwardingRule/createRemoteAccessVpn. If parameter is set to true, open firewall on the domR before creating an actual PF rule there
Modified backend calls appropriately.
3) Schema changes for firewall_rules table:
* startPort/endPort can be null now
* added icmp_type, icmp_code fields (can be not null only when protocol is icmp)
4) Added new manager - FirewallManagerImpl
Conflicts:
api/src/com/cloud/api/BaseCmd.java
client/tomcatconf/commands.properties.in
server/src/com/cloud/api/ApiResponseHelper.java
server/src/com/cloud/configuration/DefaultComponentLibrary.java
server/src/com/cloud/network/lb/LoadBalancingRulesManagerImpl.java
server/src/com/cloud/network/rules/RulesManagerImpl.java
1) Added new apis: createFirewallRule, deleteFirewallRule, listFirewallRules
2) Modified existing apis - added boolean openFirewall parameter to createPortForwardingRule/createIpForwardingRule/createRemoteAccessVpn. If parameter is set to true, open firewall on the domR before creating an actual PF rule there
Modified backend calls appropriately.
3) Schema changes for firewall_rules table:
* startPort/endPort can be null now
* added icmp_type, icmp_code fields (can be not null only when protocol is icmp)
4) Added new manager - FirewallManagerImpl
* when management server dies and notifies other management servers about this, the running management server has to cleanup host_transfer records belonging to the died management server
* issue agent load balancing task only when agent load (number of connected agents in the system) exceeds "agent.load.threshold" - 70% by default
* when management server dies and notifies other management servers about this, the running management server has to cleanup host_transfer records belonging to the died management server
* issue agent load balancing task only when agent load (number of connected agents in the system) exceeds "agent.load.threshold" - 70% by default
Conflicts:
server/src/com/cloud/configuration/Config.java
setup/db/db/schema-228to229.sql
The step to upgrade xenserver,
1. put cluster in Unmanaged state through UI , then MS will not talk to hosts in the cluster
2. upgrade xenserver according to XenServer upgrade guide.
3. put cluster in Managed state through UI, then MS will reconnect hosts
TODO,
1. UI
2. vm pool sync , leveraged from kelven's work
The step to upgrade xenserver,
1. put cluster in Unmanaged state through UI , then MS will not talk to hosts in the cluster
2. upgrade xenserver according to XenServer upgrade guide.
3. put cluster in Managed state through UI, then MS will reconnect hosts
TODO,
1. UI
2. vm pool sync , leveraged from kelven's work
Part 2
commit 797839360c65cd348d2eb20630521177ab0919de
bug 9154: redundant virtual router
commit 8ff7f230204d4d3a7a4adee75523a9a84f4276fe
bug 9154: Replace domain_router.is_master with domain_router.redundant_state in DB
commit 230b99e9e0b152648f1dd2a5eab6f22315b8e7b4
bug 9154: Add redundant state to DomainRouterResponse
commit ccefb5ff5e83d713798a347c99bce1a0d04b4317
bug 9154: Add router fault state report
commit 7a3090378f9785caecf741b70554f6ea17c41764
bug 9154: Send alert if found two virtual routers in master state
commit 66831056e4bf27665871bccd24e6159071564847
bug 9154: Code clean up
commit bf3f58a85741fa7118bd848a42d8b21baa4478d4
bug 9154: Add isRedundantRouter to DomainRouterResponse
Part 1
This backport contained:
commit 52317c718c25111c2535657139b541db0c9d1e1f
bug 9154: Initial check in for enabling redundant virtual router
commit 54199112055d754371bfb141168fb5538bf6d6ea
Add host verification for CheckRouterCommand
commit cef978a228c90056ead9be10cbc4de74c2b8de76
Fix CheckRouterAnswer's isMaster report
commit 4072f0a6991ac3b63601a1764fbe14188965f62f
Some build fixes and code refactoring for redundant router
commit 4d3350b7cd8ee2706a9bace4437fc194e36c8dd5
Redundant Router: Fix OVS
commit 6a228830e7c46d819fa0c3317e159e041337e887
Fix findByNetwork()/findByNetworkAndPod()'s return
commit c627777b3d5bdbcd60db4032cebd349a5b1ecd83
Redundant Router: Fix isVmAlive()
commit e1275d2514adc41f8744f5107d4069c38be195f1
Only issue CheckRouterCommand to redundant routers
And all modification to the scripts till
commit 4e3942462ed3fde3a3d7011e95839e2128fba514
logging changes
in the master branch.
This reverts commit 5537e65129f669da7ca9a22e53cc3daa9e69ac97.
Reverting the fix because only the unique key was missing; the column was inserted as a part of 225 to 226 upgrade
This reverts commit 5537e65129f669da7ca9a22e53cc3daa9e69ac97.
Reverting the fix because only the unique key was missing; the column was inserted as a part of 225 to 226 upgrade
status 10475: resolved fixed
Also added support for logging. By default the logs go to cloud.log file under current dir; you can specify another log location using -l option
status 10475: resolved fixed
Also added support for logging. By default the logs go to cloud.log file under current dir; you can specify another log location using -l option
* remove lb to vm mappings for Removed vms (result of the bug in 2.1.x when mappings weren't removed during the vm expunge)
* encode.api.response is false by default
* remove lb to vm mappings for Removed vms (result of the bug in 2.1.x when mappings weren't removed during the vm expunge)
* encode.api.response is false by default
status 10305: resolved fixed
While creating a system vm offering specify the type. If no type specified the default to domainrouter.
While requesting a set of system offering specify the paramter systemvmtype.
status 10305: resolved fixed
While creating a system vm offering specify the type. If no type specified the default to domainrouter.
While requesting a set of system offering specify the paramter systemvmtype.
status 9697: resolved fixed
Do encoding for ASCII chars only (done to eliminate problems with multiple language support)
To disable encoding, set "encode.api.response" to false
status 9697: resolved fixed
Do encoding for ASCII chars only (done to eliminate problems with multiple language support)
To disable encoding, set "encode.api.response" to false
This reverts commit 97f2b9936a8b9e3a057116d327b058253458b4ef.
Use the following solution instead:
* add unique_name field to the network_offerings table. Use this filed as a unique offering identifier in the code
* Added db upgrade steps to 225to226 sql script
Conflicts:
server/src/com/cloud/offerings/NetworkOfferingVO.java
This reverts commit 97f2b9936a8b9e3a057116d327b058253458b4ef.
Use the following solution instead:
* add unique_name field to the network_offerings table. Use this filed as a unique offering identifier in the code
* Added db upgrade steps to 225to226 sql script
* remove records from load_balancer_vm_map if the parent record is missing in load_balancer table
* remove user vm records from vm_instance table when corresponding record is missing in user_vm table
* remove records from load_balancer_vm_map if the parent record is missing in load_balancer table
* remove user vm records from vm_instance table when corresponding record is missing in user_vm table
2) Added new config parameter 'allow.subdomain.network.access' - default value is true. If it's set to false, the child domain can't use the network of the parent domain
Conflicts:
server/src/com/cloud/network/NetworkManagerImpl.java
2) Added new config parameter 'allow.subdomain.network.access' - default value is true. If it's set to false, the child domain can't use the network of the parent domain
This patch enable redundant virtual routers.
1. To enable this feature, db need to be updated using follow SQL by now(we
would get a UI way later):
UPDATE network_offerings SET redundant_router=1 WHERE guest_type="Virtual" AND
system_only=0;
2. System would try to start up two routers at different hosts. But if there is
only one host in the zone, system would start up two routers on it.
3. The failover part is using keepalived, and connection tracking part is using
conntrackd. There would be one master router and one backup router. The status
of router(master or backup) can be query from the database table domain_router
now. Management server would update the status every 30s by default.
4. The routers for the same zone would use same external NIC(same ip and mac).
The script used for fail-over would ensure only one external NIC present in the
network at any time.
5. Currently management server don't got the ability to stop one of router is
both of them reported as master. The feature is in the todo list.
After two routers start up, disconnect anyone of them, the guest network
shouldn't be affected, and established connection(http, ssh, etc.) should still
works. The fail-over on gateway part should be 3~4 seconds.
Currently the patch works with KVM. Would deal with vmware and XenServer soon.