KVM hosts which are actuall up, but if their agents are shutdown should be put
in disconnected state. This would avoid getting the VMs HA'd and other commands
such as deploying a VM will exclude that host and save us from errors.
The improvement is that, we first try to contact the KVM host itself. If it fails
we assume that it's disconnected, and then ask its KVM neighbours if they can
check its status. If all of the KVM neighbours tell us that it's Down and we're
unable to reach the KVM host, then the host is possibly down. In case any of the
KVM neighbours tell us that it's Up but we're unable to reach the KVM host then
we can be sure that the agent is offline but the host is running.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Signed-off-by: wilderrodrigues <wrodrigues@schubergphilis.com>
This closes#340
When executing the tests in an environment where Libvirt is also installed, it
caused errors.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This closes#342
Pull average Cpu util report between polling intervals instead of since boot
instead of using values since uptime
(cherry picked from commit 04176eaf17)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Conflicts:
plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/LibvirtComputingResource.java
This closes#297
Passing the file argument to the xml break for EL 7.1, the fix removes
the argument as just passing rombar='off' with its file arg to be empty string.
This closes#290
(cherry picked from commit aafa0c80b3)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
EL7 has a different output to 'free', use /proc/meminfo instead of a tool to be
more consistent across distros
(cherry picked from commit 212a05a345)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Conflicts:
plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/LibvirtComputingResource.java
Removing real IPs from the tests because they cause a long running time for LibvirtComputingResourceTest
- In a local machine it takes 1.977s, but in a KVM test environment it's taking 257.879 sec
Fixing typo on LibvirtRequestWrapper
- Replace linbvirtCommands by libvirtCommands on LibvirtRequestWrapper
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This closes#255
- The test was okay, but when running in an environment where a /root/.ssh/id_rsa existed, it would return true then fail
- We now mock the calls to methods that return the key paths, instead of relying in the static variables
- Adding LibvirtNetworkElementCommandWrapper and LibvirtStorageSubSystemCommandWrapper
- 2 unit tests added
- KVM hypervisor plugin with 22.2% coverage
I also refactored the StorageSubSystemCommand interface into an abstract class
- Remove the pseudo-multiple-inheritance implementation
- The StorageSubSystemCommand was an interface, not related to the Command class
and its implementation were extending the Command class anyway. The whole structure is better now.
- Addin LibvirtPvlanSetupCommandWrapper
- 6 unit tests added
- KVM hypervisor plugin with 21% coverage
From the 6 tests added, 2 were extra tests to increase the coverage of the LibvirtStopCommandWrapper
- Increased from 35% to 78.7%
- Adding LibvirtCopyVolumeCommandWrapper
Refactoring the LibvirtUtilitiesHelper
- Changing method name
Did not add any test to this commit due to the refactor mentioned abot.
Will proceed and add the tests
i# Please enter the commit message for your changes. Lines starting
- Gave it a better, more suggestive, name since I now added other methods to the class.
- It makes easier to mock objects and get a better coverage of the classes
- Adding LibvirtBackupSnapshotCommandWrapper, LibvirtCreatePrivateTemplateFromVolumeCommandWrapper and LibvirtManageSnapshotCommandWrapper
- 3 unit tests added
- KVM hypervisor plugin with 18.3% coverage
Less tests added to those classes because the code is quite complex and way too long.
The tests added are just covering the new flow, to make sure it works fine. I will come back to those classes later.
- Adding LibvirtOvsDestroyBridgeCommandWrapper, LibvirtOvsSetupBridgeCommandWrapper
- 4 unit tests added
- KVM hypervisor plugin with 13.9% coverage
More tests added to cover LibvirtPrepareForMigrationCommandWrapper
- Coverage of this wrapper broght from 37% to 90.6%
- 4 new tests added
- Adding LibvirtCheckConsoleProxyLoadCommandWrapper, LibvirtConsoleProxyLoadCommandWrapper, LibvirtWatchConsoleProxyLoadCommandWrapperand CitrixConsoleProxyLoadCommandWrapper
- 2 unit tests added
- KVM hypervisor plugin with 12% coverage
Refactored the CommandWrapper interface in order to remove the esecuteProxyLoadScan, which is now
implemented bu subclasses.
- Adding LibvirtGetHosStatsCommandWrapper
- 1 unit test added
- KVM hypervisor with 10.5% coverage
Tests are a bit limited on this one becuause of the current implementation. Would clean it up later in a separate branch
- Adding LibvirtGetVmStatsCommandWrapper
- 3 unit tests
Refactored the LibvirtConnectiobn by surrounding it with an wrapper.
- Make it easier to cover the static/native calls
- Added better coverage to StopCommand tests
- Adding LibvirtStopCommandWrapper
- LibvirtRequestWrapper
- 1 unit tests
Refactored the RequestWrapper to make it better.
- Changes also applied to the CitrixRequestWrapper
Linux kernel supports vmxnet3, allowing it in KVM plugin would allow us to
run ESX hosts on KVM hosts using CloudStack with vmxnet3 nic which can be
passed as VM's nicAdapter detail
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
(cherry picked from commit e02d787f30)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This improvements checks for "guest.cpu.features" property which is a space
separated list of cpu features that is specific for a host. When added, it
will add <feature policy='require' name='{{feature-you-listed}}'/> in the
<cpu> section of the generated vm spec xml.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
(cherry picked from commit ea7fd37783)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
The only artifact resolved from libvirt.org was org.libvirt:libvirt:0.5.1
this artifact is now available from maven's default central repository
This closes#180
Signed-off-by: Laszlo Hornyak <laszlo.hornyak@gmail.com>
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
As suggested by Wido on the dev ML changing the repo to eu.ceph.com to avoid
build failures. Will revert if ceph.com is up again.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
(cherry picked from commit c9fd57fff3)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
CentOS 7 does not ship with ifconfig anymore. We should use ip commands instead.
This also works on older versions, like CentOS 6 and Ubuntu 12.x/14.x, that we
support.
This closes#165
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
1. provide compatibility with the Big Cloud Fabric (BCF) controller
L2 Connectivity Service in both VPC and non-VPC modes
2. virtual network terminology updates: VNS --> BCF_SEGMENT
3. uses HTTPS with trust-always certificate handling
4. topology sync support with BCF controller
5. support multiple (two) BCF controllers with HA
6. support VM migration
7. support Firewall, Static NAT, and Source NAT with NAT enabled option
8. add VifDriver for Indigo Virtual Switch (IVS)
This closes#151
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Refactored to use the XPatch expressions to check the generated domain xml rathern than string comparison.
Signed-off-by: Laszlo Hornyak <laszlo.hornyak@gmail.com>
Earlier host addition of multiple hosts with local storage failed due to
same local storage UUID being used where the storage path is same.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
(cherry picked from commit bf17f640c6)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
KVMStoragePoolManager is a singleton in practice, any plugin
or extension of LibvirtComputingResource will need to act on
the specific instance of KVMStoragePoolManager that LibvirtComputingResource
has initialized. Therefore, expose this variable for those who
wish to call storage commands from plugins or extensions.
Conflicts:
plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/LibvirtComputingResource.java
Clearly show if a volume is found and if not, that the pool is being refreshed
and the fetch is tried again.
Due to my commit b53a9dcc9f the chance of a volume
not being found is slightly bigger, but the performance gain is enormous on larger
deployments.
This is why we clearly have to log that we are refreshing the pool information
when a volume is not found.
It could be that a volume is created on host A and a few seconds later host B tries
to access the volume. In that case host B's libvirt doesn't know about the volume
yet and has to refresh the pool before it does.
On larger (especially RBD) storage pools this can take a lot of
time slowing operations like creating volumes down.
The getStorageStats command will still ask a pool to be refreshed so
that the management server has accurate information about the storage pools.
On larger deployments, with thousands of volumes in one pool, this should
significantly improve storage related operations
For ResizeVolume API command -
1. If hypervisor resource throws an exception, handle the NPE thrown by the job framework.
2. Improve user error message in case of RuntimeException by throwing the exception instead of 'Unexpected Exception'.
We don't need an external script to investigate the format of the RBD volume,
we only have to ask Libvirt to resize the volume and that will ask librbd to
do so.
In situations where libvirt lost the storage pool the KVM Agent will re-create the
storage pool in libvirt.
This could be then libvirt is restarted for example.
The object returned internally was missing essential information like the sourceDir
aka the Ceph pool, the monitor IPs, cephx information and such.
In this case the first operation on this newly created pool would fail. All operations
afterwards would succeed.
We used to create the snapshot after the copy from Secondary Storage,
but it could be that we never use the snapshot.
Now we check if the snapshot exists prior to performing the cloning operation
Since we use qemu-img to copy from RBD to Secondary Storage we no
longer have to force to RAW images, but can stick with QCOW2
When the snapshot backups are QCOW2 format they can easily be deployed
again when restoring from a backup
The KVMStorageProcessor no longer has a hardcoded if-statement which sets
RBD volumes to RAW, this is now handled in the LibvirtStorageAdapter
The Management Server still sends QCOW2 as format. That's a fix for later.
fix mismatch of ovs-host-setup, ovs_host_setup used Libvirt resource and
scripts
plug the nic to OVS bridges created for the tunnel network.
Conflicts:
plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/OvsVifDriver.java
Added a new flag 'checkBeforeCleanup' to StopCommand based on which check is done to see if VM is running in HV host.
If VM is running then in this case it is not stopped and the operation bails out.
Also modified the MS code to call the StopCommand with appropriate value for the flag based on the context.
Currently it is only set to 'true' when called from the new vmsync logic based on powerstate of VM. For rest it
is set to 'false' meaning no change in behaviour.
This reduces the amount of time and storage it takes dramatically. We no longer
do a full copy, but a sparse copy. The destination image is still in RAW
format, but we only copy over used blocks.
Qemu is also better in doing this then us doing it in Java code.
Otherwise a RBDException will be thrown with the message that the snapshot
isn't protected.
modified: plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/storage/LibvirtStorageAdaptor.java
This saves the step of writing to a temporary image in /tmp first before
writing to RBD.
This is possible due to a new version in librbd. With the rbd_default_format
setting we can now force qemu-img to create format 2 RBD images.
This is available since Ceph version 0.67.5 (Dumpling).
Add executeInVR() with timeout interface to VirtualRouterDeployer
AggregationControlCommand with Action.Finish may take longer than normal command
since it would execute all the commands in one execution, and it may result in
SSH timeout for SshHelper or other mechanism communicate with VR.
Introduce an new executeInVR() interface with added timeout period for waiting
FinishAggregationCommand to complete execution.
- get the hosts on which VPC spans given vpc id
- get the VM's in the VPC
- get the hosts on which a network spans
- get the VPC's to which a hosts is part of
- get VM's of a VPC on a hosts
introduces capability to build a physical toplogy representation of a
VPC. This json file is encapsulated in
OvsVpcPhysicalTopologyConfigCommand, and is used to send full topology
to hypervisor hosts. On hypervisor this json config can be used to setup
tunnels, configure bridge, add flow rules etc
Ovs GURU, to use different broasdcast scheme VS://vpcid.gerkey for the
networks in VPC that use distributed routing
each VIF and tunnel interface to carry the network UUID in other/options
config
2) Corrected some logging in MidoNetPublicNetworkGuru - removed .toString method call on the objects in the log body as toString is called on the object by default when use log4j
With VirtIO enabled on KVM. FreeBSD 10 supports VirtIO for both the
network and the disks. This frees us from IDE and E1000 which should
also improve performance.
By default all network disks are in RAW format. Gluster works fine with
QCOW2 which has some advantages.
Disks are by default in QCOW2 format. It is possible to run into
a mismatch, where the disk is in QCOW2 format, but QEMU gets started
with format=raw. This causes the virtual machines to lockup on boot.
Failures to start a virtual machine can be verified by checking the log
of the virtual machine, and compare the output of 'qemu-img info'.
In /var/log/libvirt/qemu/<VM>.log find the URL for the drive:
-drive file=gluster+tcp://...,format=raw,..
Compare this with the 'qemu-img info' output of the same file, mounted
under /mnt/<pool-uuid>/<img-uuid>:
# qemu-img info /mnt/<pool-uuid>/<img-uuid>
...
file format: qcow2
...
This change makes passes the format when creating a disk located on RBD
(RAW only) and Gluster (QCOW2).
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The support for Gluster as Primary Storage is mostly based on the
implementation for NFS. Like NFS, libvirt can address a Gluster environment
through the 'netfs' pool-type.
PrepareForMigrationCommand, so that destination hypervisor can
mount pool. This further exposed an issue for KVM where iso
was not getting cleaned up upon successful migration, fixed as well.
By default only the Integers between -128..127 are cached (unless overridden by java.lang.Integer.IntegerCache.high system property)
If the inbound or outbound values are higher, the reference comparison won't work.
Signed-off-by: Laszlo Hornyak <laszlo.hornyak@gmail.com>
- minor resource leak cleaned up
- cpu-speed reading method extracted
- test added
- logging added in case of exception
Signed-off-by: Laszlo Hornyak <laszlo.hornyak@gmail.com>
This saves us a lot of code and libvirt is probably a better
place to do this.
libvirt-java now has the support we want, so we can now resize volumes
with libvirt.
(C)LVM volumes can't be resized using libvirt, so we have to
invoke a resize script for that.
By default the client_mount_timeout setting in librados is 300 seconds,
but that causes the connect to the Ceph cluster to block for 5 minutes
if the Ceph cluster is not available.
This patch is not ideal, but it mitigates the problem for now.
At a later point all this librados/librbd code should go back to libvirt
again, but the current versions of libvirt in the distributions are
to old for all the features we require.
For now this should prevent the CloudStack agent blocking for 5 minutes
when the Ceph cluster isn't available.
This is also tracked at the Ceph tracker: http://tracker.ceph.com/issues/6507
brought back up after being down for few hours,snapshot jobs do not get
triggered with reason "there is other active snapshot tasks on the
instance to which the volume is attached".
- the result of dividing long with long resulted in loss of precision both for network and IO
- unit tests included
Signed-off-by: Laszlo Hornyak <laszlo.hornyak@gmail.com>