Commit Graph

518 Commits

Author SHA1 Message Date
Rohit Yadav 876fc7434d APPLE-165: Host HA management and HA provider for KVM
Host-HA offers investigation, fencing and recovery mechanisms for hosts that are
malfunctioning for any reason. It uses Activity and Health checks to determine
the current host state, based on which it may degrade a host or try to recover it.
If recovery fails, it may try to fence the host.

The core feature is implemented in a hypervisor-agnostic way, with two separate
implementations of the driver/provider for the Simulator and KVM hypervisors. The
framework also allows other hypervisor-specific providers to be implemented
in the future.

The Host-HA provider implementation for KVM hypervisor uses the out-of-band
management sub-system to issue IPMI calls to reset (recover) or poweroff (fence)
a host.

The Host-HA provider implementation for Simulator provides a means of testing
and validating the core framework implementation.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2017-01-18 18:18:53 +05:30
Wido den Hollander 2b8fd2469f CLOUDSTACK-8443: Support CentOS 7 for 4.5
This is based on two PRs:
- 731
- 757

This commit is based on the 4.5 branch for a future 4.5 release.
2015-09-13 15:30:20 +02:00
Remi Bergsma 43dabb611d RHEL 7 and CentOS 7 need the same fix
(cherry picked from commit d1cb4c7d50)
Signed-off-by: Remi Bergsma <github@remi.nl>
2015-08-19 19:35:38 +02:00
Remi Bergsma 91cfb6068a fixing white space and formatting
(cherry picked from commit 14013d5d1b)
Signed-off-by: Remi Bergsma <github@remi.nl>
2015-08-19 19:35:24 +02:00
Jayapal 259b2639f5 Fixed issue in adding vm SG rules on vm reboot for xenserver 6.5
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>

This closes #479

(cherry picked from commit 59e6596fef)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-06-18 13:33:01 +03:00
Wido den Hollander 9ff3fe371e CLOUDSTACK-8559: IP Source spoofing should not be allowed
We did not verify if the packets leaving an Instance had the correct
source address.

Any IP packet not matching the Instance IP(s) will be dropped

(cherry picked from commit 3e3c11ffca)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-06-15 21:51:01 +03:00
Rohit Yadav aee35c96a8 CLOUDSTACK-8252: Ignore VLAN 4095 which is n/a on linux
VLAN id 4095 is commonly used as a 'tag passthrough' in virtualization environments
(VMware, specifically). This vlan id is incompatible with Linux, but we can
allow the admin to manually configure the bridge if the same passthrough is
desired.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-05-22 13:22:40 +01:00
Rohit Yadav 3925512115 CLOUDSTACK-8401: Fix KVM's SG script to properly cleanup old network rules
- Router VMs don't have a chain rule with -def suffix, this fixes name and
  properly removes VR vms not running on a host
- Before trying to remove dnats, filter empty/None elements from list
- destroy_ebtables_rules should check what kind of action is requested to be
  performed (-A for add or -D for delete) and execute based on that
- Before executing any command, log it for debugging purposes
- Method to cleanup bridge, may be used in future

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-04-25 03:02:10 +02:00
Rohit Yadav d66677101c CLOUDSTACK-4611: cleanup_rules using ebtables rules from /proc/modules
The SG python script depends on ebtables-save which is not available on Debian
based distros (Ubuntu and Debian for example). The commit uses /proc/modules
to find available bridge tables (one of nat, filter or broute) and then
find VMs that need to be removed. Further it uses set() to remove duplicate VMs
so we don't try to remove a VM's rules more than once leading to unwanted errors
in the log.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-04-25 02:58:12 +02:00
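The approach described in the CLOUDSTACK-4611 message above can be sketched roughly as follows. This is an illustrative sketch only, not the code from the commit: the function names, the `ebtable_` module-name prefix check, and the `-in`/`-out` chain suffix convention are assumptions made for the example.

```python
# Hypothetical sketch: find loaded ebtables tables from /proc/modules text
# and de-duplicate VM names with set(). Names here are illustrative only.

KNOWN_TABLES = ("nat", "filter", "broute")


def loaded_ebtables_tables(proc_modules_text):
    """Return the ebtables tables whose kernel modules are listed as loaded.

    The first whitespace-separated field of each /proc/modules line is the
    module name; ebtables table modules are assumed to be named ebtable_<table>.
    """
    found = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        module = fields[0]
        if module.startswith("ebtable_"):
            table = module[len("ebtable_"):]
            if table in KNOWN_TABLES:
                found.add(table)
    return sorted(found)


def unique_vm_names(chain_names):
    """Strip an assumed -in/-out suffix from chain names and drop duplicates."""
    vms = set()
    for chain in chain_names:
        for suffix in ("-in", "-out"):
            if chain.endswith(suffix):
                chain = chain[:-len(suffix)]
                break
        if chain:
            vms.add(chain)
    return vms
```

Using a set means each VM is visited once, so its rules are not removed a second time, which is the source of the unwanted log errors the message mentions.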
Rohit Yadav d91d161107 CLOUDSTACK-8395: vmops plugin should work on both XS 6.5 and 6.2
This fixes the issue of Security Groups not working in case of XenServer 6.5;
- Uses nethash ipset data-structure to store CIDRs (more efficient than iphash and
  avoids overflow errors in case users add /8 /4 ingress/egress cidrs)
- Support for ipset versions both on 6.2 and 6.5, both have different outputs. This
  fixes the issue of destroy_network_rules_for_vm failing
- Implements defensive filtering of list, instead of popping last item without
  checking if it's None or empty
- Greps using names that are 'quoted' to avoid bash errors
- Before setting up new network rule, tries to clean and remove old ipset entry
- Indentation, whitespace and naming fixes

PS. This is my 1000th commit to the 🐵 project :)

This closes #186

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-04-23 14:08:17 +02:00
Rohit Yadav 64ab3554a1 scripts: filter output instead of popping string from list
This is a defensive enhancement for KVM SG script that filters out empty string
instead of popping last item which may or may not be an empty string.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-04-21 17:33:18 +02:00
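The difference between popping the last item and filtering, as described in the message above, can be illustrated with a minimal sketch. The function names are hypothetical and the code is not taken from the KVM SG script.

```python
# Hypothetical illustration of "filter instead of pop" when splitting
# command output into lines.

def split_lines_pop(output):
    """Old style: assumes the output always ends with a newline."""
    lines = output.split("\n")
    lines.pop()  # silently drops real data if there is no trailing newline
    return lines


def split_lines_filtered(output):
    """Defensive style: keep every non-empty line, wherever the blanks are."""
    return [line for line in output.split("\n") if line]
```

With output `"a\nb"` (no trailing newline) the pop variant loses `"b"`, while the filtered variant returns both lines.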
Rohit Yadav f4cbc4c010 xenserver: remove unwanted vmops.orig file (created during a past merge)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-04-21 12:58:39 +02:00
Remi Bergsma 43a9eb40b8 make sure sync cannot block reboot
The recently discussed improvement carries the risk that if 'sync' hangs, the reboot may be delayed in the same way as with the 'reboot' command. To work around this, we're adding a 5 second timeout. If it cannot sync in 5 seconds, it will not succeed anyway and we should proceed with the reset.

@snuf: Could we use your OVM3 heartbeat script for other hypervisors as well? One way to do it seems like a nice idea :-)
2015-04-10 15:14:08 -05:00
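The timeout idea in the message above can be sketched in Python as below. The commit's own change is not shown in this log, so this is only an illustration of bounding a possibly hanging command; `run_with_timeout` is a hypothetical helper name.

```python
import subprocess


def run_with_timeout(command, timeout_seconds=5):
    """Run command; return True if it finished in time, False on timeout.

    subprocess.run kills the child process when the timeout expires and
    raises TimeoutExpired, so the caller is never blocked for longer than
    roughly timeout_seconds.
    """
    try:
        subprocess.run(command, timeout=timeout_seconds, check=False)
        return True
    except subprocess.TimeoutExpired:
        return False
```

A caller would invoke something like `run_with_timeout(["sync"], 5)` and continue with the reset regardless of the result, since a sync that cannot finish in that window is assumed not to succeed at all.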
Remi Bergsma b23661931a write logfile just before rebooting the host
As discussed with @wido, @pyr and @nuxro, added an extra log line.

Tested it and it logs fine (tested to local disk) when syncing first:
Apr  3 15:31:23 mcctest2 heartbeat: kvmheartbeat.sh system because it was unable to write the heartbeat to the storage

By the way, it did also log to the agent.log but this extra log has the benefit of ending up in the system log so you'll probably find it easier there. Existing logs:
2015-04-03 15:27:23,943 WARN  [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 0
2015-04-03 15:28:23,944 WARN  [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 1
2015-04-03 15:29:23,946 WARN  [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 2
2015-04-03 15:30:23,948 WARN  [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 3
2015-04-03 15:31:23,950 WARN  [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 4
2015-04-03 15:31:23,950 WARN  [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout; reboot the host

This closes #145

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-04-10 15:13:37 -05:00
Remi Bergsma a92315f28c reboot much faster in case of storage failure
When storage cannot be reached, a normal reboot does not make sense, as it will try to flush buffers, umount NFS mounts, etc. This will not work and will thus cause a long delay. With this change, the box will reboot immediately (like pressing the reset button).
2015-04-10 15:13:27 -05:00
Star Guo 290938b08e scripts: add ip set interface up because in CentOS7 the interface will not auto up
This closes #97

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-03-10 10:18:10 +05:30
Jayapal dd6bcde65b CLOUDSTACK-8298: Update copying large size VR config file in xenserver
When the VR configuration (aggregate commands) is large, copying the data to the VR using the vmops plugin failed
because of the ARG_MAX size limitation. The configuration data size is around 300KB.

This is updated to first create a file on the host, with the file contents, using scp.
The file is then copied from the host to the VR using the vmops createFileInDomr method.

On the host the file is created in /tmp/ with the name VR-<UUID>.cfg; once it is copied to the VR, this file is removed.

(cherry picked from commit 619f014255)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-03-04 16:56:07 +05:30
Rohit Yadav cf7a8cc052 CLOUDSTACK-8220: Let's have a separate XenServer 6.5 resource
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
(cherry picked from commit 06437dadf5)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-02-06 14:45:06 +05:30
Remi Bergsma 66b77380d0 use directIO flags when dd'ing template
This makes sure dom0 in xenserver doesn't get hammered
when copying templates. It doesn't make sense to use
the cache of dom0 as the template does not fit in
memory. The directIO flags prevent it from trying.

(cherry picked from commit 4e1527e87a)
2014-12-16 10:49:57 +01:00
Pierre-Luc Dion 6c955a3a47 CLOUDSTACK-7887: change int to str into swiftxen 2014-11-12 19:22:13 -05:00
Sanjay Tripathi e6907ed8df CLOUDSTACK-7868: Failed storage.PrimaryStorageDownloadCommand leaves corrupt VDIs in primary storage. 2014-11-08 13:46:45 +05:30
Daan Hoogland 6e1e56d399 CLOUDSTACK-7527 reboot faster by writing to /proc/sysrq-trigger
(cherry picked from commit d04f59a30d)
2014-09-18 12:51:42 +02:00
Daan Hoogland dec9133dcd CLOUDSTACK-7184: xenheartbeat gets passed timeout and interval
(cherry picked from commit 4d065b9a3a)

Conflicts:
	plugins/hypervisors/xenserver/src/com/cloud/hypervisor/xenserver/discoverer/XcpServerDiscoverer.java
	plugins/hypervisors/xenserver/src/com/cloud/hypervisor/xenserver/resource/CitrixResourceBase.java
	server/src/com/cloud/configuration/Config.java
	server/src/com/cloud/configuration/ConfigurationManagerImpl.java
	server/src/com/cloud/resource/DiscovererBase.java
2014-09-18 12:51:10 +02:00
Kishan Kavala 4f3de024de Add script to ensure cgroups are not co-mounted in rhel7/lxc. If required, script will unmount co-mounted cgroups and remount them separately 2014-09-11 14:34:40 +05:30
Kishan Kavala 08dc5c6f91 CLOUDSTACK-7428: Allow LXC cluster in SG enabled zones. Use lxc driver in security_group.py script for lxc host 2014-08-27 11:52:59 +05:30
Anthony Xu bd6f03aa95 iptreemap is not supported in new ipset, use iphash instead 2014-08-25 11:22:30 -07:00
Kishan Kavala b37ee25359 replace vconfig with ip link 2014-08-22 15:39:04 +05:30
Vincent Bernat 53650ed7bf CLOUDSTACK-7193: handle domain ID being an int
Recent versions of libvirt (at least 0.9.8) will return an int when
queried for the ID of a domain, not a string. This breaks some parts of
the `security_group.py` script which expects a string containing an
int. Notably, this breaks the part handling VM reboots which is
therefore not executed.

Signed-off-by: Vincent Bernat <Vincent.Bernat@exoscale.ch>
Signed-off-by: Sebastien Goasguen <runseb@gmail.com>
2014-08-18 10:36:21 -04:00
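The failure mode described in the CLOUDSTACK-7193 message above comes down to comparing an int with a string. A minimal sketch of normalising the ID before use follows; the helper names are hypothetical and this is not the patch itself.

```python
# Hypothetical sketch: always treat a domain ID as a string, whether the
# library returned an int or a string.

def domain_id_as_str(raw_id):
    """Normalise a domain ID (int or str) to a stripped string."""
    return str(raw_id).strip()


def same_domain_id(left, right):
    """Compare two domain IDs regardless of whether each is int or str."""
    return domain_id_as_str(left) == domain_id_as_str(right)
```

In Python, `7 == "7"` evaluates to `False` without raising an error, so code that expects a string can silently take the wrong branch when it receives an int.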
Brenn Oosterbaan 7c92bac4a3 CLOUDSTACK-7345 changed dd blocksize to 128k when using NFS.
Signed-off-by: Daan Hoogland <daan@onecht.net>
(cherry picked from commit 8b7130fa65)
2014-08-14 10:10:13 +02:00
Joris van Lieshout 37baddd721 dd with direct io is less impacting on Dom0 kernel resources
Signed-off-by: Daan Hoogland <daan@onecht.net>
(cherry picked from commit c4b78c3aaa)
2014-08-12 13:17:02 +02:00
Frank.Zhang 88f866645b fix iptables chain name too long (must be under 30 chars) 2014-07-18 17:31:06 -07:00
Anthony Xu 733102c742 change XS log file name from vmops.log to cloud.log 2014-07-15 11:07:15 -07:00
Murali Reddy cdb3dc97b5 CLOUDSTACK-6749: [OVS] xe network-param-get with
param-key=is-ovs-vpc-distributed-vr-network always returns error

fixing unnecessary errors in the logs
2014-06-13 16:02:31 +05:30
Tim Mackey a8212d9ef4 Cleanup of Xen and XenServer terms. Cloned xen plugin creating a xenserver plugin, then removed xen plugin
Signed-off-by: Tim Mackey <tmackey@gmail.com>
Signed-off-by: Sebastien Goasguen <runseb@gmail.com>
2014-06-07 04:50:23 -04:00
Murali Reddy 9105c779e9 CLOUDSTACK-6685: OVS distributed firewall: source CIDR mismatch while
populating ingress & egress network ACL

fix ensures proper values for nw_src and nw_dst are populated
depending on the ingress or egress acl
2014-05-15 16:44:30 +05:30
Murali Reddy 63f6888588 CLOUDSTACK-6668: OVS distributed routing: ensure bridge is deleted when
last VM from the VPC is deleted on a host

OVS distributed routing: ensure bridge is deleted when last VM from the
VPC is deleted on a host. This fix ensures that bridge is
destroyed.
2014-05-14 16:41:56 +05:30
Harikrishna Patnala 807b6d2c4c CLOUDSTACK-6544: [Automation] Failed to create template for ROOT volume in Xen, with Exception: callHostPlugin failed 2014-05-08 15:59:39 +05:30
Murali Reddy 55111e2284 CLOUDSTACK-6609: OVS distributed routing: ensure tunnels are created if
not created already when OvsVpcPhysicalTopologyConfigCommand update is
received

Currently if the tunnel creation fails, there is no retry logic. Fix
uses OvsVpcPhysicalTopologyConfigCommand updates as an opportunity to ensure
proper tunnels are established between the hosts.
2014-05-08 15:58:16 +05:30
Murali Reddy 2df5df1b68 CLOUDSTACK-6592: OVS distributed routing: make populate flooding rules
transactional

creates a file with all OpenFlow rule updates and uses the ovs-ofctl file
option to update the bridge in one go
2014-05-07 20:05:32 +05:30
Murali Reddy 9e98cbf1c1 CLOUDSTACK-6564: OVS distributed routing: use file based OF rule updates
use the ovs-ofctl replace-flows by file name option to update the OF rules
instead of sequentially configuring the rules.
2014-05-02 18:54:30 +05:30
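The file-based update described in the two messages above can be sketched as follows. This is a hedged illustration rather than the plugin's code: the helper names and the example flow strings are assumptions, and the sketch only builds the `ovs-ofctl replace-flows <bridge> <file>` command line instead of executing it.

```python
import os
import tempfile


def write_flow_file(flows, directory=None):
    """Write one flow per line to a new temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".flows", dir=directory, text=True)
    with os.fdopen(fd, "w") as handle:
        for flow in flows:
            handle.write(flow + "\n")
    return path


def replace_flows_command(bridge, flow_file):
    """Build the argument list that swaps the bridge's flows for the file's.

    The command is returned, not run, so the caller decides how to execute
    it and how to handle errors.
    """
    return ["ovs-ofctl", "replace-flows", bridge, flow_file]
```

Writing all rules to a single file and applying them with one command avoids leaving the bridge in a half-configured state between many individual rule updates.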
Murali Reddy 213a68dc39 CLOUDSTACK-6507: ensure sequence numbers are honoured while processing
OvsVpcPhysicalTopologyConfigCommand and OvsVpcRoutingPolicyConfigCommand

fix ensures only latest updates are applied (new openflow rules) to the
bridge enabled for distributed routing.
2014-04-25 15:02:19 +05:30
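The "only latest updates are applied" behaviour in the message above can be illustrated with a small sketch. The class and method names are hypothetical, and keeping one sequence number per bridge is an assumption made for the example, not something the commit message states.

```python
# Hypothetical sketch: drop updates whose sequence number is not newer
# than the last one applied for the same bridge.

class SequencedUpdateFilter:
    def __init__(self):
        # bridge name -> highest sequence number applied so far
        self._latest = {}

    def should_apply(self, bridge, sequence_no):
        """Return True and record sequence_no if it is newer than the last."""
        last = self._latest.get(bridge)
        if last is not None and sequence_no <= last:
            return False  # stale or duplicate update, ignore it
        self._latest[bridge] = sequence_no
        return True
```

An update that arrives late (for example number 3 after number 5 has been applied) is rejected, so older rules cannot overwrite newer ones.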
Murali Reddy 771abe4286 fix KVM plug-in for OVS tunnel network. Fix addresses two issues.
fix mismatch of ovs-host-setup, ovs_host_setup used in Libvirt resource and
scripts

plug the nic to OVS bridges created for the tunnel network.

Conflicts:
	plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/OvsVifDriver.java
2014-04-25 15:02:17 +05:30
Murali Reddy 095151c98a add support for sequence number in the VPC topology updates and VPC
routing policy updates

Conflicts:
	setup/db/db/schema-430to440.sql
2014-04-25 15:02:17 +05:30
Damodar Reddy b814783a63 CLOUDSTACK-6298: [Windows] if tmp directory already exists, inject keys is failing on Windows while starting the management server
Signed-off-by: Abhinandan Prateek <aprateek@apache.org>
2014-04-16 11:07:55 +05:30
Anthony Xu cbb31675a0 remove unused code 2014-04-09 14:01:03 -07:00
Sheng Yang e8227c88d8 CLOUDSTACK-6314: Use SSH commands for Xen VR execution
Instead of XAPI, which would make XenServer unnecessarily busy.
2014-04-07 13:38:14 -07:00
Murali Reddy cc2892c782 fix typos in xenserver scripts to setup OVS tunnel network 2014-04-07 17:31:25 +05:30
Anthony Xu 58b2b6b9e1 Add support for XS6.2 Fox hotfix 2014-03-28 16:45:16 -07:00
Edison Su 2276a399ac KVM security bug: no forwarding rule applied
(cherry picked from commit e5c391fcf3)

Signed-off-by: Animesh Chaturvedi <animesh@apache.org>
2014-03-28 16:21:36 -07:00
Edison Su 731ccb8219 fix devcloud router start
Conflicts:

	plugins/hypervisors/xen/src/com/cloud/hypervisor/xen/resource/XcpOssResource.java
2014-03-28 16:16:51 -07:00