Edison Su
5fade1ff43
bug 13416: backport patch from master to 2.2.14; cloud-agent on the KVM host needs to be restarted if a cancel maintenance command is sent
...
status 13416: resolved fixed
Reviewed-by: frank
2012-02-02 15:17:35 -08:00
anthony
2d6c426775
bug 12844: fixed merge
...
reviewed-by : edison
2012-02-01 11:32:35 -08:00
anthony
e1aa9c0ead
bug 12844: fixed a regression
...
reviewed-by : edison
2012-01-31 17:09:41 -08:00
anthony
c530cbad2a
bug 12844, 13394: 1. if connecting to the host fails, there is no need to investigate
...
2. add ha parameter to disconnect host to indicate whether to HA the VMs on this host
status 12844, 13394: resolved fixed
reviewed-by : edison
2012-01-31 15:23:07 -08:00
Alena Prokharchyk
57cc61396d
Schedule HA is a part of handleDisconnect, not removeAgent
...
Reviewed-by: Alex Huang
2012-01-31 10:24:21 -08:00
abhi
606708e0a3
bug 12849: remove agent will kickstart HA if the host status is Down or Alert. The update is therefore moved before it
...
reviewed by: kishan
2012-01-30 18:10:37 +05:30
Edison Su
d9287f0e43
tell agent to reconnect to mgt server if the cancel maintenance cmd is called
2012-01-19 17:14:00 -08:00
Edison Su
d910b7f85d
bug 12622: start ha with vm investigation when host is disconnected
...
status 12622: resolved fixed
2012-01-05 14:48:46 -08:00
Edison Su
6ecb0f2b6b
bug 12616: 40 hosts connecting to mgt server, need to set workers > 40 in mgt server.
...
status 12616: resolved fixed
2012-01-04 21:36:40 -08:00
Edison Su
4a7c684526
bug 12616: advanced startup command for direct connected agent
...
status 12616: resolved fixed
2012-01-03 18:29:40 -08:00
Alena Prokharchyk
4439fd8a51
bug 12790: use processDisconnect() when disconnecting the agent during the agent LB process
...
status 12790: resolved fixed
2011-12-29 16:56:46 -08:00
anthony
1ba2d1c8d5
add more logs
2011-11-02 17:04:18 -07:00
Kelven Yang
3aba30543c
bug 11624: commands sent via AgentManagerImpl.sendTo() need to be redirected to HypervisorGuru for command filtering. The filtering mechanism is required by the VMware hypervisor to redirect storage/snapshot commands to SSVM
2011-10-17 18:03:54 -07:00
anthony
0bdd6ded96
timeout is not set for some commands
2011-09-29 12:17:08 -07:00
prachi
97bdb58b6d
Bug 11404 - VM was in Running state with a null pod_id, which prevented creation of subsequent VMs
...
Reviewed-by: Alex
Changes:
- When the management server starts, it goes through all the pending work items from the op_it_work table and schedules HA work for each. It used to mark each item as done. Instead we should keep the item as pending and let it get marked as Done after the HA work is done.
- Changes in VirtualMachineMgr::advanceStop():
a) If we find a VM with a null hostId, we stop the VM only if it is force stopped.
b) If the VM state transition to Stopping fails, for states Starting and Migrating we try to find the pending work item and then clean up the VM. If the state is Stopping we can clean up directly.
c) We proceed with releasing all resources only if the state transitioned to 'Stopping'.
- Changes in HA:
a) Depend on VirtualMachineMgr::advanceStop() to do VM cleanup in case the host is not found
- When the VM state syncs between mgmt server and agent from Starting -> Running, mark any pending work item as done.
2011-09-15 18:47:05 -07:00
anthony
a308823549
bug 11413: when marking a host as disconnected, set lastping to now - pingtimeout
...
status 11413: resolved fixed
2011-09-12 18:46:58 -07:00
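One plausible reading of the lastping change in a308823549, sketched in Java. The class, method names, the seconds unit, the `>=` comparison, and the 150-second timeout are all assumptions made for illustration; none of them come from the commit itself. The point is only that writing `now - pingtimeout` makes the elapsed time since the last ping equal to the timeout right away.

```java
public class LastPingSketch {
    // Hypothetical check: a host counts as timed out once the time since its
    // last ping reaches the ping timeout (all values in seconds).
    static boolean isTimedOut(long nowSeconds, long lastPingSeconds, long pingTimeoutSeconds) {
        return nowSeconds - lastPingSeconds >= pingTimeoutSeconds;
    }

    // Value stored when a host is marked as disconnected, per the commit message.
    static long lastPingOnDisconnect(long nowSeconds, long pingTimeoutSeconds) {
        return nowSeconds - pingTimeoutSeconds;
    }

    public static void main(String[] args) {
        long now = 1_000_000L;
        long pingTimeout = 150L; // assumed value, for illustration only
        long lastPing = lastPingOnDisconnect(now, pingTimeout);
        System.out.println(lastPing);                               // 999850
        System.out.println(isTimedOut(now, lastPing, pingTimeout)); // true
    }
}
```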
anthony
a369885a0f
1. added a timeout to the Command class so that each command can configure its own timeout; if no timeout is configured, the default timeout of 30 minutes is used
...
2. added the following configurable timeouts
PrimaryStorageDownloadWait("Storage", TemplateManager.class, Integer.class, "primary.storage.download.wait", "10800", "In second, timeout for download template to primary storage", null),
CreateVolumeFromSnapshotWait("Storage", StorageManager.class, Integer.class, "create.volume.from.snapshot.wait", "10800", "In second, timeout for create template from snapshot", null),
CopyVolumeWait("Storage", StorageManager.class, Integer.class, "copy.volume.wait", "10800", "In second, timeout for copy volume command", null),
CreatePrivateTemplateFromVolumeWait("Storage", UserVmManager.class, Integer.class, "create.private.template.from.volume.wait", "10800", "In second, timeout for CreatePrivateTemplateFromVolumeCommand", null),
CreatePrivateTemplateFromSnapshotWait("Storage", UserVmManager.class, Integer.class, "create.private.template.from.snapshot.wait", "10800", "In second, timeout for CreatePrivateTemplateFromSnapshotCommand", null),
BackupSnapshotWait("Storage", StorageManager.class, Integer.class, "backup.snapshot.wait", "10800", "In second, timeout for BackupSnapshotCommand", null),
2011-09-07 19:18:36 -07:00
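A minimal sketch of the per-command timeout with a default fallback described in a369885a0f. The `Command` stand-in, `effectiveWait`, `configuredWait`, and the convention that `0` means "not configured" are hypothetical and are not taken from the actual code; only the 30-minute default, the `backup.snapshot.wait` key, and its `10800` default come from the commit message.

```java
import java.util.HashMap;
import java.util.Map;

public class CommandTimeoutSketch {
    // Default from the commit message: 30 minutes, expressed in seconds.
    static final int DEFAULT_WAIT_SECONDS = 30 * 60;

    // Hypothetical stand-in for a command that may carry its own timeout.
    static class Command {
        private int wait; // assumption: 0 means "not configured"
        int getWait() { return wait; }
        void setWait(int seconds) { this.wait = seconds; }
    }

    // Use the command's own timeout if set, otherwise fall back to the default.
    static int effectiveWait(Command cmd) {
        return cmd.getWait() > 0 ? cmd.getWait() : DEFAULT_WAIT_SECONDS;
    }

    // Hypothetical lookup of a configured value by key, with a fallback.
    static int configuredWait(Map<String, String> config, String key, int fallback) {
        String value = config.get(key);
        return value == null ? fallback : Integer.parseInt(value);
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("backup.snapshot.wait", "10800"); // key and default from the commit

        Command backup = new Command();
        backup.setWait(configuredWait(config, "backup.snapshot.wait", DEFAULT_WAIT_SECONDS));
        Command other = new Command(); // never configured

        System.out.println(effectiveWait(backup)); // 10800
        System.out.println(effectiveWait(other));  // 1800
    }
}
```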
anthony
9842a9aed3
bug 10078:
...
1. introduce migratewait in global configuration, the default value is 1 hour
2. use async xapi VM migration API
status 10078: resolved fixed
2011-09-07 12:40:30 -07:00
Kelven Yang
a7ac75f920
bug 11304: restore host status after initialization failure
2011-09-02 15:17:57 -07:00
anthony
57e731b60e
set timeout for CheckOnHostCommand to 50 s
2011-09-02 15:01:06 -07:00
frank
18f87c2108
Merge branch 'cvm' into 2.2.y
...
Conflicts:
api/src/com/cloud/api/BaseCmd.java
cloud.spec
core/src/com/cloud/storage/template/DownloadManagerImpl.java
server/src/com/cloud/agent/manager/AgentManagerImpl.java
server/src/com/cloud/configuration/DefaultComponentLibrary.java
server/src/com/cloud/deploy/FirstFitPlanner.java
server/src/com/cloud/host/dao/HostDao.java
server/src/com/cloud/network/security/SecurityGroupListener.java
server/src/com/cloud/storage/StorageManagerImpl.java
server/src/com/cloud/storage/listener/StoragePoolMonitor.java
server/src/com/cloud/vm/UserVmManagerImpl.java
server/src/com/cloud/vm/VirtualMachineManagerImpl.java
utils/src/com/cloud/utils/SerialVersionUID.java
2011-08-19 16:08:35 -07:00
Murali Reddy
37512883f1
bug 11148: VMs that got stopped during Host Maintenance have host_id associated with them
...
status 11148: resolved fixed
enabled vm stop, if the host is last valid host in cluster
2011-08-17 18:11:23 +05:30
anthony
5f9884d97a
Bug 10197:
...
1. don't try HA vms if host hypervisor version changes
2. fixed a bug related to VM full sync with hosttrack enabled
2011-08-02 16:48:27 -07:00
Alex Huang
f043f63eaa
Merged changes from 2.2.8.zucchini
2011-08-02 15:33:48 -07:00
anthony
7d02ed344e
Bug 10197: do not check timeout against cluster which is not managed
2011-08-01 17:00:58 -07:00
Sheng Yang
6c493bfb82
Add exception message for AgentManagerImpl.investigate()
2011-07-27 10:53:06 -07:00
Sheng Yang
3a8e13f968
Add exception message for AgentManagerImpl.investigate()
2011-07-27 10:52:48 -07:00
Alex Huang
c610925304
moved agent ping to in memory rather than db based
2011-07-25 15:21:06 -07:00
Alex Huang
10ac7753ed
Switched ping to use the same db connection so that running out of db connections won't affect basic operations
2011-07-25 10:36:00 -07:00
Kelven Yang
3a6f3b71e0
bug 10791: add data integrity check upon management server startup
2011-07-21 17:08:29 -07:00
alena
c21273d23a
bug 10734: removed global lock in "DirectAgentScanTimerTask". This lock used to prevent the task from executing on multiple management servers simultaneously.
...
status 10734: resolved fixed
2011-07-21 16:18:43 -07:00
anthony
3881e13387
bug 10197:
...
Steps to upgrade XenServer:
1. put the cluster in Unmanaged state through the UI; the MS will then not talk to hosts in the cluster
2. upgrade XenServer according to the XenServer upgrade guide.
3. put the cluster in Managed state through the UI; the MS will then reconnect the hosts
TODO,
1. UI
2. vm pool sync , leveraged from kelven's work
2011-07-19 15:26:25 -07:00
alena
c48c3edfbc
bug 10271: don't include removed records when search for local storage pool
...
status 10217: resolved fixed
2011-07-19 11:10:53 -07:00
Alex Huang
d54f6d536a
propagating transaction isolation fix for merovingian2
2011-07-18 16:48:49 -07:00
alena
7a04334b60
bug 10734: removed global lock in "DirectAgentScanTimerTask". This lock used to prevent the task from executing on multiple management servers simultaneously.
...
status 10734: resolved fixed
2011-07-18 15:00:13 -07:00
Alex Huang
e52a97b969
Switched ping to use the same db connection so that running out of db connections won't affect basic operations
2011-07-18 14:22:49 -07:00
anthony
18003deedf
bug 10628: root cause is that CheckHealthCommand returns false and XenServerInvestigator is not called
...
status 10628: resolved fixed
2011-07-14 20:42:26 -07:00
anthony
468136be74
bug 9855: two fixes.
...
1. cannot cancel maintenance mode.
2. maintenance related modes are preserved through MS restart
status 9855: resolved fixed
2011-06-27 13:48:12 -07:00
alena
41f12eb642
Pass isForRebalance parameter to the processConnect method of all the listeners. Some listeners don't have to be notified when a connection happens as part of the Agent Rebalance process (the VirtualMachineManagerImpl listener, for instance)
2011-06-27 10:20:41 -07:00
alena
0bf34f3612
bug 10447: don't notify VirtualMachineManager listener when doing host rebalance - vm sync is not needed in this case.
...
status 10447: resolved fixed
2011-06-27 10:20:40 -07:00
Edison Su
3642aef4c6
bug 10423: agent in ssvm needs to add default keystore, as we are copying templates through https://**realhostip .**
...
status 10423: resolved fixed
2011-06-24 14:45:47 -04:00
Edison Su
28f0068151
add new option to force destroy vm when delete host, if the VMs are created on local storage
2011-06-23 20:36:13 -04:00
anthony
62249f3eae
1. return message to UI if adding primary storage failed
...
2. delete primary storage entry if adding primary storage failed
2011-06-22 18:44:33 -07:00
Edison Su
ad5162ef86
fix ebtables cleanup issue: on Ubuntu, it does not get deleted if the VM is stopped
2011-06-16 19:26:24 -04:00
Edison Su
2e8d1bbd6c
bug 10190: add log if failed to delete host when host is in UP state
2011-06-15 12:02:31 -04:00
Kelven Yang
24c87c306b
merge adding host fix from 2.2.4
2011-06-14 17:16:19 -07:00
Frank
379cbc1d55
Store all parameters of url call to BaseCmd.fullUrlParams so there will be no
...
changes in future API because all parameters can be retrieved from the API command itself
2011-06-08 10:25:15 -07:00
alena
14cdc7de14
bug 9127: covered failure scenarios for agent LB.
...
status 9127: resolved fixed
The feature is completed; please file separate bugs if any issue arises during the testing.
Wiki link describing how agentLB works: http://intranet.lab.vmops.com/engineering/release-2.2-features/agent-load-balancing
2011-06-05 17:35:30 -07:00
Alex Huang
019cc78976
Fixes problems in routing between management servers
2011-06-05 16:06:54 -07:00
Alex Huang
d9e0bcfa1e
bug 10126: Renamed getPodId() to getPodIdToDeployIn()
2011-06-03 22:17:08 -07:00