Commit Graph

609 Commits

Author SHA1 Message Date
Wido den Hollander fdf2509060 CLOUDSTACK-10160: Fix typo in Libvirt XML definition for Virtio-SCSI (#2341)
* CLOUDSTACK-10160: Fix typo in Libvirt XML definition for Virtio-SCSI

The attribute for the XML element 'controller' should be 'model' and
not 'mode'.

Source: https://libvirt.org/formatdomain.html#elementsControllers

  A scsi controller has an optional attribute model, which is one of
  'auto', 'buslogic', 'ibmvscsi', 'lsilogic', 'lsisas1068', 'lsisas1078',
  'virtio-scsi' or 'vmpvscsi'.

In the current state a regular SCSI device is attached and not a Virtio-SCSI
device.

Signed-off-by: Wido den Hollander <wido@widodh.nl>

* CLOUDSTACK-10160: Add UnitTest for LibvirtVMDef.SCSIDef

To make sure the XML output string is correct

Signed-off-by: Wido den Hollander <wido@widodh.nl>
2017-11-28 09:42:15 +05:30
Wido den Hollander 632479d8f8 CLOUDSTACK-9853: Add support for Secondary IPv6 Addresses and Subnets (#2028)
This commit adds support for passing IPv6 Addresses and/or Subnets as
Secondary IPs.

This is groundwork for CLOUDSTACK-9853 where IPv6 Subnets have to be
allowed in the Security Groups of Instances to we can add DHCPv6
Prefix Delegation.

Use ; instead of : for separating addresses, otherwise it would cause
problems with IPv6 Addresses.

Signed-off-by: Wido den Hollander <wido@widodh.nl>
2017-11-22 17:30:33 +05:30
Abhinandan Prateek 4627fb2cd7 CLOUDSTACK-9972: Enhance listVolume API to include physical size and … (#2158)
* CLOUDSTACK-9972: Enhance listVolume API to include physical size and utilization.
Also fixed pool, cluster and pod info

* CLOUDSTACK-9972: Fix volume_view and duplicate API constant

* CLOUDSTACK-9972: Backport Do not allow vms to be deployed on hosts that are in disabled pod

* CLOUDSTACK-9972: Fix localization missing keys

* CLOUDSTACK-9972: Fix sql path
2017-11-05 21:44:43 +05:30
Bitworks Software, Ltd 3381c38cc7 CLOUDSTACK-10073: KVM host RAM overprovisioning (#2266)
Commit enables a new feature for KVM hypervisor which purpose is to increase virtually amount of RAM available beyond the actual limit.
There is a new parameter in agent.properties: host.overcommit.mem.mb which enables adding specified amount of RAM to actually available. It is necessary to utilize KSM and ZSwap features which extend RAM with deduplication and compression.
2017-09-29 11:46:09 +05:30
Wido den Hollander b130e55088 CLOUDSTACK-9397: Add Watchdog timer to KVM Instance (#1707)
The watchdog timer adds functionality where the Hypervisor can detect if an
instance has crashed or stopped functioning.
The watchdog timer adds functionality where the Hypervisor can detect if an
instance has crashed or stopped functioning.

When the Instance has the 'watchdog' daemon running it will send heartbeats
to the /dev/watchdog device.

If these heartbeats are no longer received by the HV it will reset the Instance.

If the Instance never sends the heartbeats the HV does not take action. It only
takes action if it stops sending heartbeats.

This is supported since Libvirt 0.7.3 and can be defined in the XML format as
described in the docs: https://libvirt.org/formatdomain.html#elementsWatchdog

To the 'devices' section this will be added:

In the agent.properties the action to be taken can be defined:

vm.watchdog.action=reset

The same goes for the model. The Intel i6300esb is however the most commonly used.

vm.watchdog.model=i6300esb

When the Instance has the 'watchdog' daemon running it will send heartbeats
to the /dev/watchdog device.

If these heartbeats are no longer received by the HV it will reset the Instance.

If the Instance never sends the heartbeats the HV does not take action. It only
takes action if it stops sending heartbeats.

This is supported since Libvirt 0.7.3 and can be defined in the XML format as
described in the docs: https://libvirt.org/formatdomain.html#elementsWatchdog

To the 'devices' section this will be added:

  <watchdog model='i6300esb' action='reset'/>

In the agent.properties the action to be taken can be defined:

  vm.watchdog.action=reset

The same goes for the model. The Intel i6300esb is however the most commonly used.

  vm.watchdog.model=i6300esb

Signed-off-by: Wido den Hollander <wido@widodh.nl>
2017-09-28 13:56:15 +05:30
Rohit Yadav d2c3408da7 CLOUDSTACK-9782: Improve scheduling of jobs
- Removed three bg thread tasks, uses FSM event-trigger based scheduling
- On successful recovery, kicks VM HA
- Improves overall HA scheduling and task submission, lower DB access

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2017-08-30 18:06:48 +02:00
Rohit Yadav 212e5ccfa7 CLOUDSTACK-9782: Host HA and KVM HA provider
Host-HA offers investigation, fencing and recovery mechanisms for host that for
any reason are malfunctioning. It uses Activity and Health checks to determine
current host state based on which it may degrade a host or try to recover it. On
failing to recover it, it may try to fence the host.

The core feature is implemented in a hypervisor agnostic way, with two separate
implementations of the driver/provider for Simulator and KVM hypervisors. The
framework also allows for implementation of other hypervisor specific provider
implementation in future.

The Host-HA provider implementation for KVM hypervisor uses the out-of-band
management sub-system to issue IPMI calls to reset (recover) or poweroff (fence)
a host.

The Host-HA provider implementation for Simulator provides a means of testing
and validating the core framework implementation.

Signed-off-by: Abhinandan Prateek <abhinandan.prateek@shapeblue.com>
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2017-08-30 18:06:48 +02:00
Wido den Hollander 2867080979 CLOUDSTACK-10034: Use libvirt to create new volumes and not rados-java (#2039)
Since libvirt 1.2.2 libvirt will properly create volumes
using RBD format 2.

We can use libvirt to creates the volumes which strips a bit of
code from the CloudStack Agent's responsbility.

RBD format 2 is already used by all volumes created by CloudStack.

This format is the most recent format of RBD and is still actively
being developed.

This removes the support for Ubuntu 12.04 as that does not have the
proper libvirt version available.

Signed-off-by: Wido den Hollander wido@widodh.nl

We can use libvirt to creates the volumes which strips a bit of
code from the CloudStack Agent's responsbility.

RBD format 2 is already used by all volumes created by CloudStack.

This format is the most recent format of RBD and is still actively
being developed.

This removes the support for Ubuntu 12.04 as that does not have the
proper libvirt version available.

Signed-off-by: Wido den Hollander <wido@widodh.nl>
2017-08-06 00:04:21 +02:00
Wei Zhou c03923c7e7 CLOUDSTACK-9113: skip vm with inconsistent state when getVmNetworkStats 2017-07-23 17:15:23 +02:00
Wei Zhou 960cb84083 CLOUDSTACK-7984: Collect network statistics for VMs on shared network (KVM implementation) 2017-07-23 17:15:23 +02:00
Rohit Yadav 32e96abea9 Merge remote-tracking branch 'origin/4.9' into 4.10 2017-07-14 14:59:17 +05:30
Wido den Hollander ca415e7436 CLOUDSTACK-9929: Do not gather statistics for CDROM or FLOPPY devices
Libvirt / Qemu (KVM) does not collect statistics about these either.

On some systems it might even yield a 'internal error' from libvirt
when attempting to gather block statistics from such devices.

For example Ubuntu 16.04 (Xenial) has a issue with this.

Skip them when looping through all devices.

Signed-off-by: Wido den Hollander <wido@widodh.nl>
2017-07-14 14:57:11 +05:30
Rohit Yadav ed376fcad6 Merge remote-tracking branch 'origin/4.9'
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2017-06-07 11:21:27 +05:30
Rohit Yadav e197652a28 CLOUDSTACK-9860: Fix stackoverflow issue
Fixes issue caused to a PR-refactoring from #2108, reported by
@borisstoyanov

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2017-06-06 16:14:03 +05:30
Rohit Yadav 8323a175f1 CLOUDSTACK-9860: Power off VMs when stopVM is called with forced=true
The 'force' option provided with the stopVirtualMachine API command is
often assumed to be a hard shutdown sent to the hypervisor, when in fact
it is for CloudStacks' internal use. CloudStack should be able to send
the 'hard' power-off request to the hosts.

When forced parameter on the stopVM API is true, power off (hard shutdown)
a VM. This uses initial changes from #1635 to pass the forced parameter
to hypervisor plugin via the StopCommand, and fixes force stop (poweroff)
handling for KVM, VMware and XenServer.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2017-05-25 17:25:22 +05:30
Rajani Karuturi e25a444a0c Merge pull request #2121 from bvbharatk/CLOUDSTACK-9641
CLOUDSTACK-9641 In KVM SSVM and CPVM may use the old cmdline data, if…
2017-05-23 10:07:55 +05:30
Bharat Kumar 3c80f00550 CLOUDSTACK-9641 In KVM SSVM and CPVM may use the old cmdline data, if we fail to fetch the new cmdline in the first pass. 2017-05-19 16:50:19 +05:30
Jayapal d04a3e842c CLOUDSTACK-9317: Update review comments and rule state column 2017-05-17 11:08:13 +05:30
Jayapal c20e0ef88f CLOUDSTACK-9317: Fixed disable static nat on leaving ips on interface 2017-05-17 11:03:50 +05:30
Rajani Karuturi 7434d91614 Merge pull request #1873 from Accelerite/dhcpOffloadFix
CLOUDSTACK-9709: Updated the vm ip fetch task to use the correct the …
2017-05-17 10:43:51 +05:30
Rajani Karuturi a4dd6bdeeb Merge pull request #1955 from myENA/virtio-scsi
CLOUDSTACK-8239 Add VirtIO SCSI support for KVM hosts
2017-04-20 15:36:34 +05:30
Rajani Karuturi ec2d4dd422 Merge release branch 4.9 to master
* 4.9:
  CLOUDSTACK-9811: fix duplicated nics on VR caused by nic name p<slot_number>p<port_number>
2017-03-23 15:19:31 +05:30
Wei Zhou bf93b6313e CLOUDSTACK-9811: fix duplicated nics on VR caused by nic name p<slot_number>p<port_number> 2017-03-20 07:32:31 +01:00
Rajani Karuturi ad7ed7a178 Merge pull request #847 from kishankavala/CLOUDSTACK-8880
Bug-ID: CLOUDSTACK-8880: calculate free memory on host before deploying Vm.  free memory = total memory - (all vm memory)With memory over-provisioning set to 1, when mgmt server starts VMs in parallel on one host, then the memory allocated on that kvm can be larger than the actual physcial memory of the kvm host.

Fixed by checking free memory on host before starting Vm.
Added test case to check memory usage on Host.
Verified Vm deploy on Host with enough capacity and also without capacity

* pr/847:
  Bug-ID: CLOUDSTACK-8880: calculate free memory on host before deploying Vm.  free memory = total memory - (all vm memory)

Signed-off-by: Rajani Karuturi <rajani.karuturi@accelerite.com>
2017-03-13 22:20:58 +05:30
Rajani Karuturi 3f0fbf251c Merge pull request #1953 from Accelerite/CLOUDSTACK-9794
CLOUDSTACK-9794: Unable to attach more than 14 devices to a VMUpdated hardcoded value with max data volumes limit from hypervisor capabilities.

* pr/1953:
  CLOUDSTACK-9794: Unable to attach more than 14 devices to a VM

Signed-off-by: Rajani Karuturi <rajani.karuturi@accelerite.com>
2017-03-13 22:19:04 +05:30
Rajani Karuturi 56e851ca46 Merge release branch 4.9 to master
* 4.9:
  moved logrotate from cron.daily to cron.hourly for vpcrouter in cloud-early-config
  CLOUDSTACK-9569: propagate global configuration router.aggregation.command.each.timeout to KVM agent
2017-03-13 22:09:27 +05:30
Suresh Kumar Anaparti 93f5b6e8a3 CLOUDSTACK-9794: Unable to attach more than 14 devices to a VM
Updated hardcoded value with max data volumes limit from hypervisor capabilities.
2017-03-13 16:14:12 +05:30
Nathan Johnson 5c476492b1 CLOUDSTACK-8239 - Adding support for virtio-scsi on KVM hosts
This adds support for virtio-scsi on KVM hosts, either
for guests that are associated with a new os_type of 'Other PV Virtio-SCSI (64-bit)',
or when a VM or template is regstered with a detail parameter rootDiskController=scsi.

Update cloudstack add template dialog to allow for selecting rootDiskController with KVM

Update cloudstack kvm virtio-scsi to enable discard=unmap
2017-03-12 10:54:43 -05:00
Jayapal e3ae08b3ee CLOUDSTACK-9709: Updated the vm ip fetch task to use the correct the thread 2017-03-07 09:50:18 +05:30
Kishan Kavala 9a021904af Bug-ID: CLOUDSTACK-8880: calculate free memory on host before deploying Vm. free memory = total memory - (all vm memory) 2017-02-20 11:32:48 +05:30
Rajani Karuturi 7233ac37cd Merge pull request #977 from ustcweizhou/vm-snapshot
[4.10] CLOUDSTACK-8746: VM Snapshotting implementation for KVM

* pr/977:
  Fixes for testing VM Snapshots on KVM. Related to PR 977
  CLOUDSTACK-8746: vm snapshot implementation for KVM

Signed-off-by: Rajani Karuturi <rajani.karuturi@accelerite.com>
2017-01-31 05:58:56 +05:30
Wido den Hollander 84e496b4f9
CLOUDSTACK-676: IPv6 Basic Security Grouping for KVM
This commit implements basic Security Grouping for KVM in
Basic Networking.

It does not implement full Security Grouping yet, but it does:
- Prevent IP-Address source spoofing
- Allow DHCPv6 clients, but disallow DHCPv6 servers
- Disallow Instances to send out Router Advertisements

The Security Grouping allows ICMPv6 packets as described by RFC4890
as they are essential for IPv6 connectivity.

Following RFC4890 it allows:
- Router Solicitations
- Router Advertisements (incoming only)
- Neighbor Advertisements
- Neighbor Solicitations
- Packet Too Big
- Time Exceeded
- Destination Unreachable
- Parameter Problem
- Echo Request

ICMPv6 is a essential part of IPv6, without it connectivity will break or be very
unreliable.

For now it allows any UDP and TCP packet to be send in to the Instance which
effectively opens up the firewall completely.

Future commits will implement Security Grouping further which allows controlling UDP and TCP
ports for IPv6 like can be done with IPv4.

Regardless of the egress filtering (which can't be done yet) it will always allow outbound DNS
to port 53 over UDP or TCP.

Signed-off-by: Wido den Hollander <wido@widodh.nl>
2017-01-26 15:36:08 +01:00
Wei Zhou a2428508e2 CLOUDSTACK-8746: vm snapshot implementation for KVM
(1) add support to create/delete/revert vm snapshots on running vms with QCOW2 format
(2) add new API to create volume snapshot from vm snapshot
(3) delete metadata of vm snapshots before stopping/migrating and recover vm snapshots after starting/migrating
(4) enable deleting of VM snapshot on stopped vm or vm snapshot is not listed in qcow2 image.
(5) enable smoke tests for vmsnaphsots on KVM
2017-01-24 21:47:30 +01:00
Wei Zhou 714221234d CLOUDSTACK-9569: propagate global configuration router.aggregation.command.each.timeout to KVM agent 2016-12-22 12:00:10 +01:00
Wido den Hollander 2a5f37c1b1
CLOUDSTACK-8715: Add channel to Instances for Qemu Guest Agent
This commit adds a additional VirtIO channel with the name
'org.qemu.guest_agent.0' to all Instances.

With the Qemu Guest Agent the Hypervisor gains more control over the Instance if
these tools are present inside the Instance, for example:

* Power control
* Flushing filesystems
* Fetching Network information

In the future this should allow safer snapshots on KVM since we can instruct the
Instance to flush the filesystems prior to snapshotting the disk.

More information: http://wiki.qemu.org/Features/QAPI/GuestAgent

Keep in mind that on Ubuntu AppArmor still needs to be disabled since the default
AppArmor profile doesn't allow libvirt to write into /var/lib/libvirt/qemu

This commit does not add any communication methods through API-calls, it merely
adds the channel to the Instances and installs the Guest Agent in the SSVMs.

With the addition of the Qemu Guest Agent channel a second channel appears in /dev
on a SSVM as a VirtIO port.

The order in which the ports are defined in the XML matters for the naming inside
the SSVM VM and by not relying on /dev/vportXX but looking for a static name the
SSVM still boots properly if the order in the XML definition is changed.

A SSVM with both ports attached will have something like this:

  root@v-215-VM:~# ls -l /dev/virtio-ports
  total 0
  lrwxrwxrwx 1 root root 11 May 13 21:41 org.qemu.guest_agent.0 -> ../vport0p2
  lrwxrwxrwx 1 root root 11 May 13 21:41 v-215-VM.vport -> ../vport0p1
  root@v-215-VM:~# ls -l /dev/vport*
  crw------- 1 root root 251, 1 May 13 21:41 /dev/vport0p1
  crw------- 1 root root 251, 2 May 13 21:41 /dev/vport0p2
  root@v-215-VM:~#

In this case the SSVM port points to /dev/vport0p1, but if the order in the XML
is different it might point to /dev/vport0p2

By looking for a portname with a pre-defined pattern in /dev/virtio-ports we
do not rely on the order in the XML definition.

Signed-off-by: Wido den Hollander <wido@widodh.nl>
2016-11-23 16:01:08 +01:00
Rohit Yadav 0642a6982f
Merge branch '4.9' 2016-11-23 14:22:15 +05:30
Rohit Yadav 55b918076f
Merge branch '4.8' into 4.9 2016-11-23 13:50:15 +05:30
Rohit Yadav ff616e700b Merge pull request #1745 from shapeblue/CLOUDSTACK-9503
CLOUDSTACK-9503: Increased the VR script timeout. Most of the changes are about converting int/long time values to joda Duration.

* pr/1745:
  CLOUDSTACK-9503: Increased the VR script timeout. Most of the changes are about converting int/long time values to joda Duration.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2016-11-23 13:41:52 +05:30
Rohit Yadav b59db0dc06 Merge pull request #1705 from nemo9cby/CLOUDSTACK-9465
Made the changes to improve logging.CLOUSTACK-9465 Several log refactoring/improvement suggestions.

There are two scenarios of logging which needs refactoring/improvement:

Method invocation replaced by variable

This means that in the logging code, the method invocation is pre-defined as a variable. for simplicity,          the method invocation should be replaced by the variable.

Delete variable which must be null

The variable in the logging code is null, there is no need to put the variable there.

* pr/1705:
  Made the changes to improve logging.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2016-11-03 16:48:21 +05:30
Abhinandan Prateek 83b5a8b2b2 CLOUDSTACK-9503: Increased the VR script timeout. Most of the changes are about converting int/long time values to joda Duration. 2016-11-01 16:14:23 +05:30
Nemo cd9c7737d1 Made the changes to improve logging. 2016-10-11 12:58:02 -04:00
Wido den Hollander 0beb41b6e7 CLOUDSTACK-9395: Add Virtio RNG device to Instances when configured
By adding a Random Number Generator device to Instances we can prevent
entropy starvation inside guest.

The default source is /dev/random on the host, but this can be configured
to another source when present, for example a hardware RNG.

When enabled it will add the following to the Instance's XML definition:

  <rng model='virtio'>
    <rate period='1000' bytes='2048' />
    <backend model='random'>/dev/random</backend>
  </rng>

If the Instance has the proper support, which most modern distributions have,
it will have a /dev/hwrng device which it can use for gathering entropy.

More information: https://libvirt.org/formatdomain.html#elementsRng
2016-10-04 12:44:55 +02:00
Nathan Johnson 46df85c5bf CLOUDSTACK-9461
This converts the rbd raw format on disk to qcow2 for compression.
2016-08-26 09:52:24 -05:00
Rohit Yadav 7b8ba24c64 CLOUDSTACK-9452: add python-argparse dependency on el6,7 rpms
The patchviasocket script was rewritten in Python from PR #1533 and made
assumptions that Python 2.7 would be available. In case of CentOS, python 2.7
may not be available or installed. This change ensures that python-argparse
is installed which is used by this script.

Expose cmd error in the logs when patch command fails.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2016-08-10 16:25:17 +05:30
Aaron Hurt c8fce3ff31 improve logging readability 2016-07-11 12:05:06 -05:00
Aaron Hurt 44491448e3 Cleanup rbd contexts and improve exception logging
We noticed that when an exception occurs within the cleanup loop inside
the deletePhysicalDisk routine that the previously allocated contexts
are not cleaned up.  This seemed to cause an eventual crash of the host
agent after multiple exceptions within the loop.

In addition to ensuring the contexts are always freed we also improved
the logging when exceptions do occur to include the actual return code
from the underlying library in deletePhysicalDisk and deleteSnapshot.
2016-07-08 23:13:33 -05:00
Will Stevens b03a629c6a Merge pull request #1533 from greenqloud/pr-patchviasocket-convert-to-python
Convert patchviasocket to python (removes perl dependency for KVM agent)As requested here: https://github.com/apache/cloudstack/pull/1495

No scripts are using perl so that install requirement can be removed.
The new scripts are using standard python packages only.
Includes extensive unit test.
Note: perl-modules requirement is missing (fixed in mentioned PR) so do not merge that onto master.

* pr/1533:
  Revert "Add perl-modules as install dependency for cloudstack-agent"
  patchviasocket improve error handling
  Convert patchviasocket to python (removes perl dependency for KVM agent)

Signed-off-by: Will Stevens <williamstevens@gmail.com>
2016-05-25 22:57:08 -04:00
Will Stevens 678b28f273 Merge release branch 4.8 to master
* 4.8:
  CLOUDSTACK-6928: fix issue disk I/O throttling not applied
  CLOUDSTACK-6975: Prevent dnsmasq from starting on backup redundant RvR.
2016-05-25 22:54:23 -04:00
Sverrir A. Berg 0acd3c12a2 Convert patchviasocket to python (removes perl dependency for KVM agent)
As requested here: https://github.com/apache/cloudstack/pull/1495

No scripts are using perl so that install requirement can be removed.
The new scripts are using standard python packages only.
Includes extensive unit test.
2016-05-20 15:42:34 +00:00
Will Stevens 82b702dc9a Merge pull request #1403 from mike-tutkowski/xs-snapshots
Taking fast and efficient volume snapshots with XenServer (and your storage provider)A XenServer storage repository (SR) and virtual disk image (VDI) each have UUIDs that are immutable.

This poses a problem for SAN snapshots, if you intend on mounting the underlying snapshot SR alongside the source SR (duplicate UUIDs).

VMware has a solution for this called re-signaturing (so, in other words, the snapshot UUIDs can be changed).

This PR only deals with the CloudStack side of things, but it works in concert with a new XenServer storage manager created by CloudOps (this storage manager enables re-signaturing of XenServer SR and VDI UUIDs).

I have written Marvin integration tests to go along with this, but cannot yet check those into the CloudStack repo as they rely on SolidFire hardware.

If anyone would like to see these integration tests, please let me know.

JIRA ticket: https://issues.apache.org/jira/browse/CLOUDSTACK-9281

Here's a video I made that shows this feature in action:

https://www.youtube.com/watch?v=YQ3pBeL-WaA&list=PLqOXKM0Bt13DFnQnwUx8ZtJzoyDV0Uuye&index=13

* pr/1403:
  Faster logic to see if a cluster supports resigning
  Support for backend snapshots with XenServer

Signed-off-by: Will Stevens <williamstevens@gmail.com>
2016-05-20 08:33:07 -04:00