KVM hosts that are actually up, but whose agents are shut down, should be put
in the Disconnected state. This avoids having their VMs HA'd, and commands
such as deploying a VM will exclude that host, saving us from errors.
The improvement is that we first try to contact the KVM host itself. If that
fails we assume it is disconnected, and then ask its KVM neighbours to check
its status. If all of the KVM neighbours tell us it's Down and we're unable
to reach the KVM host, then the host is possibly down. If any of the KVM
neighbours tell us it's Up while we're unable to reach the KVM host, then
we can be sure that the agent is offline but the host is running.
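For illustration, the decision flow could be sketched like this (hypothetical names, not the actual CloudStack investigator API):

    import java.util.List;

    // Sketch: decide a host's state by probing it directly and, on failure,
    // asking its KVM neighbours. All names here are illustrative.
    enum HostState { UP, DOWN, DISCONNECTED }

    class KvmHostStateChecker {
        // Assumed helpers: a direct agent ping and a "check this host"
        // command sent through a neighbouring KVM host.
        boolean reachDirectly(String host) { return false; /* stub */ }
        boolean neighbourSeesUp(String neighbour, String host) { return false; /* stub */ }

        HostState investigate(String host, List<String> neighbours) {
            if (reachDirectly(host)) {
                return HostState.UP;
            }
            for (String neighbour : neighbours) {
                if (neighbourSeesUp(neighbour, host)) {
                    // A neighbour sees the host as Up while the agent is
                    // unreachable: the agent is offline but the host runs,
                    // so report Disconnected instead of Down (no HA).
                    return HostState.DISCONNECTED;
                }
            }
            // Unreachable, and no neighbour reports it Up: possibly down.
            return HostState.DOWN;
        }
    }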
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Signed-off-by: wilderrodrigues <wrodrigues@schubergphilis.com>
This closes #340
Executing the tests in an environment where Libvirt is also installed caused
errors.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This closes #342
Pull the average CPU utilization between polling intervals instead of using
the values accumulated since boot.
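The gist: remember the previous counters and divide the delta by the elapsed wall time. A hedged sketch, not the actual LibvirtComputingResource code:

    // Sketch: per-VM CPU utilization between two polls instead of since boot.
    class CpuUtilSampler {
        private long prevCpuTimeNanos; // cumulative guest CPU time at last poll
        private long prevSampleNanos;  // wall-clock time of last poll

        /** Returns % CPU used since the previous call. */
        double sample(long cpuTimeNanos, int vcpus) {
            long now = System.nanoTime();
            double elapsed = now - prevSampleNanos;
            double used = cpuTimeNanos - prevCpuTimeNanos;
            prevSampleNanos = now;
            prevCpuTimeNanos = cpuTimeNanos;
            if (elapsed <= 0 || vcpus <= 0) {
                return 0.0;
            }
            // Normalize by vCPU count so the result stays within 0..100.
            return Math.min(100.0, 100.0 * used / (elapsed * vcpus));
        }
    }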
(cherry picked from commit 04176eaf17)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Conflicts:
plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/LibvirtComputingResource.java
This closes #297
Passing the file argument in the XML breaks on EL 7.1; the fix removes the
argument, since rombar='off' was being passed with an empty string for its
file argument.
This closes #290
(cherry picked from commit aafa0c80b3)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
EL7 has different output for 'free'; use /proc/meminfo instead of an external
tool to be more consistent across distros.
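For illustration, reading the total memory from /proc/meminfo in Java could look like this (a sketch, not the agent's exact code):

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    class MemInfo {
        /** Returns MemTotal from /proc/meminfo in kB, or -1 if not found. */
        static long memTotalKb() throws IOException {
            try (BufferedReader r = new BufferedReader(new FileReader("/proc/meminfo"))) {
                String line;
                while ((line = r.readLine()) != null) {
                    // Lines look like: "MemTotal:       16330856 kB"
                    if (line.startsWith("MemTotal:")) {
                        return Long.parseLong(line.split("\\s+")[1]);
                    }
                }
            }
            return -1;
        }
    }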
(cherry picked from commit 212a05a345)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Conflicts:
plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/LibvirtComputingResource.java
Fixing typo in LibvirtRequestWrapper
- Replace linbvirtCommands with libvirtCommands in LibvirtRequestWrapper
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This closes #255
- The test was okay, but when running in an environment where /root/.ssh/id_rsa existed, it would return true and then fail
- We now mock the calls to methods that return the key paths, instead of relying on the static variables
- Adding LibvirtNetworkElementCommandWrapper and LibvirtStorageSubSystemCommandWrapper
- 2 unit tests added
- KVM hypervisor plugin with 22.2% coverage
I also refactored the StorageSubSystemCommand interface into an abstract class
- Remove the pseudo-multiple-inheritance implementation
- The StorageSubSystemCommand was an interface, not related to the Command class,
and its implementations were extending the Command class anyway. The whole structure is better now.
- Adding LibvirtPvlanSetupCommandWrapper
- 6 unit tests added
- KVM hypervisor plugin with 21% coverage
From the 6 tests added, 2 were extra tests to increase the coverage of the LibvirtStopCommandWrapper
- Increased from 35% to 78.7%
- Adding LibvirtCopyVolumeCommandWrapper
Refactoring the LibvirtUtilitiesHelper
- Changing method name
Did not add any tests in this commit due to the refactor mentioned above.
Will proceed and add the tests.
- Gave it a better, more suggestive name, since I have now added other methods to the class.
- It makes it easier to mock objects and get better coverage of the classes
- Adding LibvirtBackupSnapshotCommandWrapper, LibvirtCreatePrivateTemplateFromVolumeCommandWrapper and LibvirtManageSnapshotCommandWrapper
- 3 unit tests added
- KVM hypervisor plugin with 18.3% coverage
Fewer tests were added to those classes because the code is quite complex and far too long.
The tests added just cover the new flow, to make sure it works fine. I will come back to those classes later.
- Adding LibvirtOvsDestroyBridgeCommandWrapper, LibvirtOvsSetupBridgeCommandWrapper
- 4 unit tests added
- KVM hypervisor plugin with 13.9% coverage
More tests added to cover LibvirtPrepareForMigrationCommandWrapper
- Coverage of this wrapper brought from 37% to 90.6%
- 4 new tests added
- Adding LibvirtCheckConsoleProxyLoadCommandWrapper, LibvirtConsoleProxyLoadCommandWrapper, LibvirtWatchConsoleProxyLoadCommandWrapper and CitrixConsoleProxyLoadCommandWrapper
- 2 unit tests added
- KVM hypervisor plugin with 12% coverage
Refactored the CommandWrapper interface in order to remove executeProxyLoadScan, which is now
implemented by subclasses.
- Adding LibvirtGetHostStatsCommandWrapper
- 1 unit test added
- KVM hypervisor with 10.5% coverage
Tests are a bit limited on this one because of the current implementation. Will clean it up later in a separate branch.
- Adding LibvirtGetVmStatsCommandWrapper
- 3 unit tests
Refactored the LibvirtConnection by surrounding it with a wrapper.
- Makes it easier to cover the static/native calls
- Added better coverage to StopCommand tests
- Adding LibvirtStopCommandWrapper
- LibvirtRequestWrapper
- 1 unit test
Refactored the RequestWrapper to improve its structure.
- Changes also applied to the CitrixRequestWrapper
The Linux kernel supports vmxnet3; allowing it in the KVM plugin lets us
run ESX hosts on KVM hosts using CloudStack, with a vmxnet3 NIC that can be
passed as the VM's nicAdapter detail.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
(cherry picked from commit e02d787f30)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This improvement checks for the "guest.cpu.features" property, which is a
space-separated list of CPU features specific to a host. When set, it
adds <feature policy='require' name='{{feature-you-listed}}'/> to the
<cpu> section of the generated VM spec XML.
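Generating those elements from the property value is straightforward; a sketch (illustrative, not the actual definition-builder code):

    class CpuFeaturesXml {
        /** Turns e.g. "vmx aes" into <feature/> lines for the <cpu> section. */
        static String toFeatureXml(String guestCpuFeatures) {
            StringBuilder sb = new StringBuilder();
            if (guestCpuFeatures != null && !guestCpuFeatures.trim().isEmpty()) {
                for (String feature : guestCpuFeatures.trim().split("\\s+")) {
                    sb.append("<feature policy='require' name='")
                      .append(feature).append("'/>\n");
                }
            }
            return sb.toString();
        }
    }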
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
(cherry picked from commit ea7fd37783)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
CentOS 7 does not ship with ifconfig anymore. We should use ip commands instead.
This also works on older versions, like CentOS 6 and Ubuntu 12.x/14.x, that we
support.
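Where the agent shelled out to ifconfig, an ip-based call can be used instead; a minimal sketch (not the agent's actual Script wrapper):

    import java.io.BufferedReader;
    import java.io.InputStreamReader;

    class IpCmd {
        /** Runs "ip -o -4 addr show dev <nic>" and returns its output line. */
        static String ipv4AddrLine(String nic) throws Exception {
            Process p = new ProcessBuilder(
                    "ip", "-o", "-4", "addr", "show", "dev", nic).start();
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(p.getInputStream()))) {
                String line = r.readLine(); // e.g. "2: eth0 inet 10.0.0.5/24 ..."
                p.waitFor();
                return line;
            }
        }
    }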
This closes #165
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
1. provide compatibility with the Big Cloud Fabric (BCF) controller
L2 Connectivity Service in both VPC and non-VPC modes
2. virtual network terminology updates: VNS --> BCF_SEGMENT
3. uses HTTPS with trust-always certificate handling
4. topology sync support with BCF controller
5. support multiple (two) BCF controllers with HA
6. support VM migration
7. support Firewall, Static NAT, and Source NAT with NAT enabled option
8. add VifDriver for Indigo Virtual Switch (IVS)
This closes #151
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Earlier, addition of multiple hosts with local storage failed because the
same local storage UUID was used when the storage path was the same.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
(cherry picked from commit bf17f640c6)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
KVMStoragePoolManager is a singleton in practice; any plugin
or extension of LibvirtComputingResource will need to act on
the specific instance of KVMStoragePoolManager that LibvirtComputingResource
has initialized. Therefore, expose this variable for those who
wish to call storage commands from plugins or extensions.
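The exposure amounts to a simple accessor on LibvirtComputingResource, along these lines (sketch; the field name is assumed):

    // Inside LibvirtComputingResource; _storagePoolMgr is the instance the
    // resource initialized at configure time.
    public KVMStoragePoolManager getStoragePoolMgr() {
        return _storagePoolMgr;
    }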
Conflicts:
plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/LibvirtComputingResource.java
Clearly show whether a volume is found and, if not, that the pool is being refreshed
and the fetch is tried again.
Due to my commit b53a9dcc9f the chance of a volume
not being found is slightly higher, but the performance gain is enormous on larger
deployments.
This is why we have to clearly log that we are refreshing the pool information
when a volume is not found.
It could be that a volume is created on host A and a few seconds later host B tries
to access it. In that case host B's libvirt doesn't know about the volume
yet and has to refresh the pool before it does.
On larger (especially RBD) storage pools this can take a lot of
time, slowing down operations like creating volumes.
The getStorageStats command will still ask a pool to be refreshed so
that the management server has accurate information about the storage pools.
On larger deployments, with thousands of volumes in one pool, this should
significantly improve storage-related operations.
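The lookup flow described above, in sketch form (the real code logs through log4j and lives in the storage adaptor):

    import org.libvirt.LibvirtException;
    import org.libvirt.StoragePool;
    import org.libvirt.StorageVol;

    class VolumeLookup {
        // Sketch: fetch a volume; on a miss, log it, refresh the pool, retry.
        static StorageVol getVolume(StoragePool pool, String volumeName)
                throws LibvirtException {
            try {
                return pool.storageVolLookupByName(volumeName);
            } catch (LibvirtException e) {
                System.out.println("Volume " + volumeName
                        + " not found; refreshing pool and retrying");
                pool.refresh(0); // can be slow on large (especially RBD) pools
                return pool.storageVolLookupByName(volumeName);
            }
        }
    }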
For ResizeVolume API command -
1. If hypervisor resource throws an exception, handle the NPE thrown by the job framework.
2. Improve user error message in case of RuntimeException by throwing the exception instead of 'Unexpected Exception'.
We don't need an external script to investigate the format of the RBD volume,
we only have to ask Libvirt to resize the volume and that will ask librbd to
do so.
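With libvirt-java this reduces to a single call on the volume (sketch; assumes a libvirt-java release that exposes StorageVol.resize):

    import org.libvirt.Connect;
    import org.libvirt.LibvirtException;
    import org.libvirt.StoragePool;
    import org.libvirt.StorageVol;

    class RbdResize {
        // Sketch: let libvirt (and, through it, librbd) grow the volume.
        static void resize(Connect conn, String poolName, String volName,
                long newSizeBytes) throws LibvirtException {
            StoragePool pool = conn.storagePoolLookupByName(poolName);
            StorageVol vol = pool.storageVolLookupByName(volName);
            vol.resize(newSizeBytes, 0); // flags = 0: plain grow
        }
    }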
In situations where libvirt has lost the storage pool, the KVM Agent will re-create the
storage pool in libvirt.
This can happen when libvirt is restarted, for example.
The object returned internally was missing essential information like the sourceDir
aka the Ceph pool, the monitor IPs, cephx information and such.
In this case the first operation on this newly created pool would fail. All operations
afterwards would succeed.
We used to create the snapshot after the copy from Secondary Storage,
but it could be that we never use the snapshot.
Now we check whether the snapshot exists prior to performing the cloning operation.
Since we use qemu-img to copy from RBD to Secondary Storage we no
longer have to force RAW images, but can stick with QCOW2.
When the snapshot backups are in QCOW2 format they can easily be deployed
again when restoring from a backup.
The KVMStorageProcessor no longer has a hardcoded if-statement which sets
RBD volumes to RAW; this is now handled in the LibvirtStorageAdaptor.
The Management Server still sends QCOW2 as the format. That's a fix for later.
Fix the mismatch between 'ovs-host-setup' and 'ovs_host_setup' as used by the
Libvirt resource and the scripts.
Plug the NIC into the OVS bridges created for the tunnel network.
Conflicts:
plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/OvsVifDriver.java
Added a new flag 'checkBeforeCleanup' to StopCommand, based on which a check is done to see if the VM is running on the HV host.
If the VM is running, it is not stopped and the operation bails out.
Also modified the MS code to call the StopCommand with the appropriate value for the flag based on the context.
Currently it is only set to 'true' when called from the new vmsync logic based on the power state of the VM. For the rest it
is set to 'false', meaning no change in behaviour.
This dramatically reduces the amount of time and storage it takes. We no longer
do a full copy, but a sparse copy. The destination image is still in RAW
format, but we only copy over used blocks.
Qemu is also better at doing this than we are in Java code.
Otherwise an RBDException will be thrown with the message that the snapshot
isn't protected.
This saves the step of writing to a temporary image in /tmp first before
writing to RBD.
This is possible due to a new version of librbd. With the rbd_default_format
setting we can now force qemu-img to create format 2 RBD images.
This is available since Ceph version 0.67.5 (Dumpling).
Add an executeInVR() interface with a timeout to VirtualRouterDeployer.
AggregationControlCommand with Action.Finish may take longer than a normal command,
since it executes all the commands in one go, and it may result in an
SSH timeout for SshHelper or other mechanisms communicating with the VR.
Introduce a new executeInVR() interface with an added timeout period to wait for
the FinishAggregationCommand to complete execution.
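The interface change amounts to an overload carrying an explicit timeout, roughly (sketch; parameter names and the timeout unit are assumed):

    // Sketch of the VirtualRouterDeployer additions; ExecutionResult is the
    // existing CloudStack result holder.
    public interface VirtualRouterDeployer {
        ExecutionResult executeInVR(String routerIp, String script, String args);
        // New overload: same call, with an explicit wait period for
        // long-running work such as FinishAggregationCommand.
        ExecutionResult executeInVR(String routerIp, String script, String args,
                int timeoutSeconds);
    }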
- get the hosts across which a VPC spans, given a VPC id
- get the VMs in the VPC
- get the hosts across which a network spans
- get the VPCs of which a host is part
- get the VMs of a VPC on a host
introduces the capability to build a physical topology representation of a
VPC. This JSON file is encapsulated in
OvsVpcPhysicalTopologyConfigCommand, and is used to send the full topology
to hypervisor hosts. On the hypervisor this JSON config can be used to set up
tunnels, configure bridges, add flow rules, etc.
Ovs GURU: use the different broadcast scheme VS://vpcid.gerkey for the
networks in a VPC that use distributed routing.
Each VIF and tunnel interface carries the network UUID in the other/options
config.
2) Corrected some logging in MidoNetPublicNetworkGuru - removed the .toString method call on the objects in the log body, as toString is called on the object by default when using log4j
With VirtIO enabled on KVM, FreeBSD 10 supports VirtIO for both the
network and the disks. This frees us from IDE and E1000, which should
also improve performance.
By default all network disks are in RAW format. Gluster works fine with
QCOW2 which has some advantages.
Disks are by default in QCOW2 format. It is possible to run into
a mismatch, where the disk is in QCOW2 format, but QEMU gets started
with format=raw. This causes the virtual machines to lockup on boot.
Failures to start a virtual machine can be verified by checking the log
of the virtual machine, and compare the output of 'qemu-img info'.
In /var/log/libvirt/qemu/<VM>.log find the URL for the drive:
-drive file=gluster+tcp://...,format=raw,..
Compare this with the 'qemu-img info' output of the same file, mounted
under /mnt/<pool-uuid>/<img-uuid>:
# qemu-img info /mnt/<pool-uuid>/<img-uuid>
...
file format: qcow2
...
This change passes the format when creating a disk located on RBD
(RAW only) or Gluster (QCOW2).
Signed-off-by: Niels de Vos <ndevos@redhat.com>
The support for Gluster as Primary Storage is mostly based on the
implementation for NFS. Like NFS, libvirt can address a Gluster environment
through the 'netfs' pool-type.
PrepareForMigrationCommand, so that the destination hypervisor can
mount the pool. This further exposed an issue for KVM where the ISO
was not getting cleaned up upon successful migration; fixed as well.
By default only the Integers between -128..127 are cached (unless overridden by the java.lang.Integer.IntegerCache.high system property).
If the inbound or outbound values are higher, the reference comparison won't work.
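The pitfall, concretely (standard Java behaviour):

    public class IntegerCacheDemo {
        public static void main(String[] args) {
            Integer a = 100, b = 100;
            Integer c = 1000, d = 1000;
            System.out.println(a == b);      // true: both point at the cached 100
            System.out.println(c == d);      // false: 1000 is outside the cache
            System.out.println(c.equals(d)); // true: compare values, not references
        }
    }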
Signed-off-by: Laszlo Hornyak <laszlo.hornyak@gmail.com>
- minor resource leak cleaned up
- cpu-speed reading method extracted
- test added
- logging added in case of exception
Signed-off-by: Laszlo Hornyak <laszlo.hornyak@gmail.com>
This saves us a lot of code and libvirt is probably a better
place to do this.
libvirt-java now has the support we want, so we can now resize volumes
with libvirt.
(C)LVM volumes can't be resized using libvirt, so we have to
invoke a resize script for that.
By default the client_mount_timeout setting in librados is 300 seconds,
but that causes connecting to the Ceph cluster to block for 5 minutes
if the Ceph cluster is not available.
This patch is not ideal, but it mitigates the problem for now.
At a later point all this librados/librbd code should go back to libvirt
again, but the current versions of libvirt in the distributions are
too old for all the features we require.
For now this should prevent the CloudStack agent blocking for 5 minutes
when the Ceph cluster isn't available.
This is also tracked at the Ceph tracker: http://tracker.ceph.com/issues/6507
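The mitigation, roughly, as it would look through the rados-java bindings (sketch; the timeout value shown is illustrative):

    import com.ceph.rados.Rados;
    import com.ceph.rados.RadosException;

    class CephConnect {
        static Rados connect(String user, String monHost, String secret)
                throws RadosException {
            Rados r = new Rados(user);
            r.confSet("mon_host", monHost);
            r.confSet("key", secret);
            // Don't block for librados' default 300s when the cluster is away.
            r.confSet("client_mount_timeout", "30");
            r.connect();
            return r;
        }
    }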
brought back up after being down for a few hours, snapshot jobs do not get
triggered, with the reason "there is other active snapshot tasks on the
instance to which the volume is attached".
- the result of dividing a long by a long caused a loss of precision for both network and IO stats
- unit tests included
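The classic trap this fixes (plain Java):

    public class PrecisionDemo {
        public static void main(String[] args) {
            long bytes = 999;
            long millis = 1000;
            System.out.println(bytes / millis);          // 0: long/long truncates
            System.out.println((double) bytes / millis); // 0.999: cast, then divide
        }
    }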
Signed-off-by: Laszlo Hornyak <laszlo.hornyak@gmail.com>
Replace vlanid with broadcast URI to support VXLAN, and to identify whether the id is a VLAN ID or a VNI
Signed-off-by: ynojima <mail@ynojima.net>
Signed-off-by: Hugo Trippaers <htrippaers@schubergphilis.com>
Detail: getPhysicalDisk() was not matching volumes with a .raw extension, and
was instead setting the disk format to QCOW2.
BUG-ID: CLOUDSTACK-5018
Bugfix-for:
Reviewed-by:
Reported-by:
Signed-off-by: John Kinsella <jlk@stratosec.co> 1383287538 -0700
1) vxlan will use bridge scheme 'brvx-<vni>'. Multiple physical networks can host guest
traffic type with vxlan isolation, so long as they don't use the same VNI range.
2) Guest traffic labels can be a physical interface if a bridge by the given name is not found.
Normally we take traffic label name, find the matching bridge, then resolve that to a
physical interface. Then we create guest bridges on that interface. Now we can just
specify the interface.
XS 6.1/6.2 introduces a new virtual platform, so there are two virtual platforms. The Windows PV driver version must match the virtual platform;
this patch tracks PV driver versions in VM details and template details.
Anthony
Detail: Checks for other Ethernet interface names use startsWith(),
whereas the p1p1-style interface uses a regex that doesn't allow for
trailing characters, and so blocks VLAN IDs. Fixed.
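To illustrate the class of bug (these patterns are illustrative, not the literal ones in the patch):

    public class IfaceRegexDemo {
        public static void main(String[] args) {
            String strict = "^p\\d+p\\d+$";            // blocks VLAN-tagged names
            String relaxed = "^p\\d+p\\d+(\\.\\d+)?$"; // allows e.g. p1p1.100
            System.out.println("p1p1.100".matches(strict));  // false
            System.out.println("p1p1.100".matches(relaxed)); // true
        }
    }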
BUG-ID: CLOUDSTACK-4884
Bugfix-for: 4.2.1
Reviewed-by:
Reported-by:
Signed-off-by: John Kinsella <jlk@stratosec.co> 1381965250 -0700
I don't think the host kernel version has any bearing on it. The original code
was tested with CentOS 6.3 and 6.4, but it seems to succeed or fail per host,
e.g. a fast host might work and a slow host might not. I was getting intermittent
failures with Ubuntu 12.04.3 prior to this patch.
These changes are a joint effort between Edison and me to refactor some
of the code around snapshotting VM volumes and creating
templates/volumes from VM volume snapshots. In general, we were working
towards allowing PrimaryDataStoreDrivers to create snapshots on primary
storage and not requiring the snapshots to be transferred to secondary
storage.
High level changes:
-Added uuid to NfsTO, SwiftTO & S3TO to cut down on the requirement of
PrimaryDataStoreTO and ImageStoreTO which don't really serve much of a
purpose
-Initial work towards enabling reverting VM volumes from snapshots
-Added hypervisor commands for introducing and forgetting new hypervisor
objects (snapshots, templates & volumes)
Signed-off-by: Edison Su <sudison@gmail.com>
The managed context framework provides a simple way to add logic
to ACS at the various entry points of the system. As threads are
launched and run, listeners can be registered for onEntry or onLeave
of the managed context. This framework will be used specifically
to handle DB transaction checking and setting up the CallContext.
This framework is needed to transition away from ACS custom AOP to
Spring AOP.
Initial patch for VXLAN support.
Fully functional, hopefully, for GuestNetwork - AdvancedZone.
Patch Note:
in cloudstack-server
- Add isolation method VXLAN
- Add VxlanGuestNetworkGuru as plugin for VXLAN isolation
- Modify NetworkServiceImpl to handle extended vNet range for VXLAN isolation
- Add VXLAN isolation option in zoneWizard UI
in cloudstack-agent (kvm)
- Add modifyvxlan.sh script that handles bridge/VXLAN interface manipulation
-- Usage is exactly the same as modifyvlan.sh
- BridgeVifDriver will call modifyvxlan.sh instead of modifyvlan.sh when VXLAN is used for isolation
Database changes:
- No change in database structure.
- VXLAN isolation uses same tables that VLAN uses to store vNet allocation status.
Known Issue and/or TODO:
- Some resource still says 'VLAN' in log even if VXLAN is used
- in the UI, "Network - GuestNetworks" doesn't display the VNI
-- the VLAN ID field displays "N/A"
- Documentation!
Signed-off-by: Toshiaki Hatano <haeena@haeena.net>
Libvirt reports:
org.libvirt.LibvirtException: Storage volume not found: no storage vol
with matching name
in some cases, if the volume is created on one KVM host while being accessed
from another host.
This is possibly due to concurrent (read/write) access to the storage.
The current fix is to try several times, waiting 30 seconds between
retries.
If the issue is still there, we need to sync the storage pool access.
CLOUDSTACK-4457:
CLOUDSTACK-4459:
Harden KVM getVolume. It's possible that a volume created on one KVM host won't show up on another host; try multiple times, refreshing the storage pool, if the volume doesn't show up.
Conflicts:
engine/storage/integration-test/test/org/apache/cloudstack/storage/test/FakeDriverTestConfiguration.java
plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/storage/KVMStorageProcessor.java
There still exist two issues after Edison's commits.
(1) Migration from new hosts to old hosts failed.
The bridge name on an old host is set to cloudVirBr* if network.bridge.name.schema is set to 3.0 in /etc/cloudstack/agent/agent.properties, but the actual bridge name is breth*-* after running cloudstack-agent-upgrade.
(2) All ports of VMs (Basic zone, or Advanced zone with security groups) on old hosts are open, because the iptables rules are bound to the device (bridge) name, which is changed by cloudstack-agent-upgrade.
After this, the KVM upgrade steps are:
a. Install 4.2 cloudstack agent on each kvm host
b. Run "cloudstack-agent-upgrade". This script will upgrade all the existing bridge name to new bridge name, and update related firewall rules.
c. install a libvirt hook:
c1. mkdir /etc/libvirt/hooks
c2. cp /usr/share/cloudstack-agent/lib/libvirtqemuhook /etc/libvirt/hooks/qemu
c3. chmod +x /etc/libvirt/hooks/qemu
c4. service libvirtd restart
c5. service cloudstack-agent restart
Signed-off-by: Wei Zhou <w.zhou@leaseweb.com>
The migrate method from libvirt supports passing down a different XML for running
the instance on the target hypervisor.
This enables VNC to bind to the private IP address of the hypervisor; during
migration this is changed to the private IP address of the target host.
This way VNC doesn't listen world-wide and is much safer.
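A sketch with libvirt-java (assuming a binding version exposing the migrate overload that takes a destination XML):

    import org.libvirt.Connect;
    import org.libvirt.Domain;
    import org.libvirt.LibvirtException;

    class VncSafeMigrate {
        static final long VIR_MIGRATE_LIVE = 1; // libvirt live-migration flag

        // Sketch: rewrite the VNC listen address in the domain XML for the
        // destination host, then migrate using the modified XML.
        static Domain migrate(Domain dom, Connect destConn, String srcPrivateIp,
                String destPrivateIp) throws LibvirtException {
            String xml = dom.getXMLDesc(0).replace(srcPrivateIp, destPrivateIp);
            return dom.migrate(destConn, VIR_MIGRATE_LIVE, xml, null, null, 0);
        }
    }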
Initial patch for VXLAN support.
Fully functional, hopefully, for GuestNetwork - AdvancedZone.
Patch Note:
in cloudstack-server
- Add isolation method VXLAN
- Add VxlanGuestNetworkGuru as plugin for VXLAN isolation
- Modify NetworkServiceImpl to handle extended vNet range for VXLAN isolation
- Add VXLAN isolation option in zoneWizard UI
in cloudstack-agent (kvm)
- Add modifyvxlan.sh script that handles bridge/VXLAN interface manipulation
-- Usage is exactly the same as modifyvlan.sh
- BridgeVifDriver will call modifyvxlan.sh instead of modifyvlan.sh when VXLAN is used for isolation
Database changes:
- No change in database structure.
- VXLAN isolation uses same tables that VLAN uses to store vNet allocation status.
Known Issue:
- Some resource still says 'VLAN' in log even if VXLAN is used
- in the UI, "Network - GuestNetworks" doesn't display the VNI
-- the VLAN ID field displays "N/A"
This failed due to a RAW -> QCOW2 conversion (again).
The current code still makes too many assumptions about everything always
being QCOW2, while that is not always true.
KVM - Create template from volume
Vmware - Create template from volume / Create template from snapshot
Send the physical size in the CopyCommand, which accordingly will populate the template store ref and the usage_event tables with the right physical size.
Signed-off-by: Nitin Mehta <nitin.mehta@citrix.com>
Although libvirt supports resizing RBD volumes (and other formats), the
Java bindings (libvirt-java) don't.
Right now we use the Java bindings for librbd to handle the resizing for us,
but in the future this should be done by libvirt rather than these
Java bindings.
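The interim approach through the rados-java bindings looks roughly like this (sketch; pool and connection setup abbreviated):

    import com.ceph.rados.IoCTX;
    import com.ceph.rbd.Rbd;
    import com.ceph.rbd.RbdImage;

    class RbdResizeViaBindings {
        // Sketch: resize an RBD image directly through librbd's Java bindings.
        static void resize(IoCTX ioctx, String imageName, long newSizeBytes)
                throws Exception {
            Rbd rbd = new Rbd(ioctx);
            RbdImage image = rbd.open(imageName);
            try {
                image.resize(newSizeBytes);
            } finally {
                rbd.close(image);
            }
        }
    }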
Detail: Cloudstack tries to stop VMs all the time, for all sorts of reasons,
but usually just to get into a known state. Libvirt throws a
'Domain not found' exception when attempting to stop a VM that doesn't exist.
This causes problems when troubleshooting real issues. 'Domain not found'
should equate to success when trying to stop.
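In effect (a sketch, not the literal patch; real code would inspect the libvirt error code rather than the message):

    import org.libvirt.Connect;
    import org.libvirt.Domain;
    import org.libvirt.LibvirtException;

    class StopVm {
        // Sketch: stopping a domain libvirt doesn't know about is a success.
        static boolean stop(Connect conn, String vmName) {
            try {
                Domain dom = conn.domainLookupByName(vmName);
                dom.destroy();
                return true;
            } catch (LibvirtException e) {
                // Already gone => nothing to stop => success.
                return e.getMessage() != null
                        && e.getMessage().contains("Domain not found");
            }
        }
    }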
BUG-ID: CLOUDSTACK-4011
Bugfix-for: 4.2
Signed-off-by: Marcus Sorensen <marcus@betterservers.com> 1375825281 -0600
- Move the vnet bridge cleanup function from LibvirtComputingResource to BridgeVifDriver
-- since only BridgeVifDriver has to handle this event
- LibvirtComputingResource now properly calls VifDriver.unplug() when it receives an UnPlugCommand
- Remove the broken and no longer used method getVnet(String) from VirtualMachineName
- Remove the broken and no longer used method getVnet() from StopCommand
- Remove the unused constructor StopCommand(VirtualMachine, String, boolean) from StopCommand
- Remove the unused member vnet from StopCommand
- Remove the unused member _modifyVlanPath from OvsVifDriver
Tested with 2 KVM hosts and confirmed it correctly manipulates the vnet bridge on start, stop, migrate, plug, and unplug events
Signed-off-by: Hugo Trippaers <htrippaers@schubergphilis.com>
Description:
a) Fixing NPE when wrong path is provided for primary datastore.
b) No error dialog shows up in GUI when wrong path is provided,
after NPE fix - propagating exception upward.
c) If the KVM agent is down, an invalid datastore gets logged in
storage_pool table and doesn't get removed, so it shows up
in the GUI in the list of datastores - fixing this as well.
- writeToFile removed since no references to it
- readFileAsString replaced with FileUtils.readFileToString
- minor code duplication removed in dependent method getNicStats
- unit test added
Signed-off-by: Laszlo Hornyak <laszlo.hornyak@gmail.com>
Renaming the method in the command objects to be uniform with
PlugNicCommand/UnplugNicCommand.getVmName
Signed-off-by: Prasanna Santhanam <tsp@apache.org>
inactive VM definitions block a new VM from starting. Definitions aren't supposed to
be persistent, but sometimes a crash or a failed migration can leave behind a
definition.
Signed-off-by: Marcus Sorensen <marcus@betterservers.com> 1370299734 -0600
RBD format 2 supports cloning (aka layering), where one base image can serve
as a parent image for multiple child images.
This enables fast deployment of a large number of virtual machines, and it also
saves space on the Ceph cluster and improves performance due to better caching.
Qemu-img doesn't support RBD format 2 (yet), so to enable these functions the
RADOS/RBD Java bindings are required.
This patch also enables deployment of System VMs on RBD storage pools. Since we
no longer require a patchdisk for passing the boot arguments, we are able to deploy
these VMs on RBD.
Detail: We do two strange things. #1, when a VM is created, we create the UUID
for the domain by converting the name into a UUID; then subsequently, any time
we look for a domain, we convert its name to a UUID and do a domainLookupByUUID. This
is an unnecessary obfuscation, so instead we just do a domainLookupByName. As
a bonus, we are now setting the domain's UUID to be the cloudstack UUID. It is
no longer used anywhere, but will be less confusing to admins. This shouldn't
affect upgrades, since we can always look up existing VMs by name.
Signed-off-by: Marcus Sorensen <marcus@betterservers.com> 1369287049 -0600
Detail: Undefine VM after migration. Lingering domain definitions cause
migrations back to the original host to fail, since domain already exists.
BUG-ID: CLOUDSTACK-2640
Bugfix-for: 4.1.0,4.2.0
Signed-off-by: Marcus Sorensen <marcus@betterservers.com> 1369285950 -0600
so that callers of startVM in LibvirtComputingResource know that a vm failed
to start
Signed-off-by: Marcus Sorensen <marcus@betterservers.com> 1366224775 -0600
pools the proper way won't cause problems for the KVM HA Monitor; this patch closes
holes. Call the KVMStoragePool deleteStoragePool that properly removes it from
the KVM HA hashmap, instead of the pool's direct delete() call.
Signed-off-by: Marcus Sorensen <marcus@betterservers.com> 1366172318 -0600
- Supports DHCP, Source NAT, Static NAT, Firewall rules, Port Forwarding
- Renamed MidokuraMidonet to MidoNet
- Related Jira ticket is CLOUDSTACK-996
Signed-off-by: Dave Cahill <dcahill@midokura.com>
Signed-off-by: Hugo Trippaers <htrippaers@schubergphilis.com>
deleted NFS pools, causing failures when defining new storage pools. Sometimes
a storage pool has never been used on a host, and getStoragePool fails when
copying templates or in storage migration. deleteStoragePool(pool) often fails
silently, leaving no pool defined in libvirt, but a mountpoint left behind.
This patch handles some of these exceptions and brings forward any issues via
logging.
Signed-off-by: Marcus Sorensen <marcus@betterservers.com> 1364603486 -0600
KVM to manager. This adds collection of available storage to KVM, not
just used.
Bugfix-for: 4.0.2, 4.1, master
Submitted-by: Ted Smith <darnoth@gmail.com>
Signed-off-by: Marcus Sorensen <marcus@betterservers.com> 1363966235 -0600
The collection of network usage from VPC virtual router on KVM does not work,
because there is no corresponding procedure to deal with VPC virtual router
(cmd.isForVpc() = true).
Reviewed-by: Kishan Kavala <kishan@apache.org>
Reported-by: Wei Zhou <w.zhou@leaseweb.com>
Signed-off-by: Prasanna Santhanam <tsp@apache.org>
cloud-defined resources on the host have caused various problems. As a backward-
compatible fix, if an existing pool with a different name collides with a pool
being created (by path), the pool will be redefined with the name cloudstack
knows about. This is actually what brought up the bug: a persisted storage pool
cloudstack wasn't managing.
Signed-off-by: Marcus Sorensen <marcus@betterservers.com> 1363210149 -0600
Detail: When we stop a VM, its definition is no longer valid. Therefore, we
need to catch the exception thrown from libvirt when trying to look up a
non-existent domain by UUID while checking whether it's shut down.
BUG-ID: CLOUDSTACK-600
Signed-off-by: Marcus Sorensen <marcus@betterservers.com> 1363201066 -0600
Detail: A previous patch fixed an issue where we are defining VMs to persist
locally on KVM hosts, which can cause issues if the agent isn't running and
libvirt decides to start the VM unbeknownst to cloudstack. The previous patch
stopped defining VMs as persistent. This patch adds compatibility for existing
cloudstack environments, removing the persistent definition on stop if needed.
BUG-ID: CLOUDSTACK-600
Signed-off-by: Marcus Sorensen <marcus@betterservers.com> 1363194656 -0600
Detail: This gets rid of the patchdisk method of passing cmdline and
authorized_keys to KVM system VMs. It instead passes them to a virtio socket,
which the KVM guest reads from the character device /dev/vport0p1 during
cloud-early-config. Tested to work on CentOS 6.3 and Ubuntu 12.04. Should
work with even older versions of libvirt.
Signed-off-by: Marcus Sorensen <marcus@betterservers.com> 1362691685 -0700
The current KVM agent silently ignores many exceptions, and there's no
way to see what really happened. This patch logs, at trace level, what
was silently ignored. It also fixes a huge bare Exception
catch, which is very harmful because it also catches RuntimeException.
Detail: This device can be used for remotely controlling the system VMs through
a local socket on the host. We will attempt to replace the KVM patchdisk with
it. Tested: it successfully deploys a VM, and if the system VM has the proper driver it
will create a /dev/vport0p1 device within the VM. We will be updating the
system VM in 4.2/5.0 and will support this.
Signed-off-by: Marcus Sorensen <marcus@betterservers.com> 1362527352 -0700
- Building a map of {trafficType, vifDriver} at configure time
- Use the relevant VIF driver for the given traffic type when plug() is called (see the sketch below)
- Inform all VIF drivers when unplug() is called, as we no longer know the traffic type
- Refactor the VIF driver selection code and add unit tests
- Basic unit tests, just testing the default case
- Also a slight refactor of the unit test code, using jUnit 4 instead of 3 to match the rest of the codebase
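A sketch of the selection logic (types simplified; TrafficType and VifDriver stand in for the real CloudStack classes):

    import java.util.HashMap;
    import java.util.Map;

    class VifDriverSelector {
        interface VifDriver { /* plug/unplug elided */ }
        enum TrafficType { Guest, Public, Management }

        // Built once at configure time from the agent's properties.
        private final Map<TrafficType, VifDriver> driversByTraffic = new HashMap<>();
        private VifDriver defaultDriver;

        VifDriver getVifDriver(TrafficType type) {
            VifDriver driver = driversByTraffic.get(type);
            return driver != null ? driver : defaultDriver;
        }
    }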
Signed-off-by: Hugo Trippaers <htrippaers@schubergphilis.com>
The recently added overcommit feature breaks compatibility between older management servers
and 4.2 agents.
This patch fixes that by falling back if needed.
Detail: Removing references to /usr/lib/cloud and /usr/lib64/cloud so that old
systemvm.iso files aren't found by accident. systemvm.iso should exist in
/usr/share/cloudstack-common/vms now.
Signed-off-by: Marcus Sorensen <marcus@betterservers.com> 1360860243 -0700
Detail: If your traffic label points to a bridge that is on a tagged interface
rather than a real physical interface, cloudstack may not parse the physical
interface correctly, bringing up tagged interfaces on the tagged interface.
Signed-off-by: Marcus Sorensen <marcus@betterservers.com> 1360798665 -0700
We used to define domains as persistent in libvirt, which caused XML definitions
to stay there after a reboot of the hypervisor.
We however don't do anything with those already-defined domains; actually, we wipe
all defined domains when starting the agent.
Some users however reported that libvirt started these domains after a reboot,
before the CloudStack agent was started.
By starting domains from the XML description and not defining them, we prevent
them from ever being stored in libvirt.
- Fixed new join dao impls as spring components
- Fixed component context xml to load api rate limit checker
- Fixed root pom.xml for duplicate plugin
- Fixed list data centers method
- Fixed following conflicts:
api/src/org/apache/cloudstack/api/command/admin/network/CreateNetworkOfferingCmd.java
api/src/org/apache/cloudstack/api/command/user/offering/ListServiceOfferingsCmd.java
api/src/org/apache/cloudstack/api/command/user/template/DeleteTemplateCmd.java
api/src/org/apache/cloudstack/api/command/user/template/ExtractTemplateCmd.java
plugins/api/discovery/src/org/apache/cloudstack/discovery/ApiDiscoveryServiceImpl.java
server/src/com/cloud/api/ApiDBUtils.java
server/src/com/cloud/api/ApiServer.java
server/src/com/cloud/api/query/QueryManagerImpl.java
server/src/com/cloud/configuration/DefaultComponentLibrary.java
server/src/com/cloud/server/ManagementServerImpl.java
server/src/com/cloud/storage/swift/SwiftManagerImpl.java
Signed-off-by: Rohit Yadav <bhaisaab@apache.org>
Detail: There are several places in the code that do a
"brctl show | grep bridgeName" or similar, which causes all sorts
of problems when you have, for example, a cloudVirBr50 and a
cloudVirBr5000. This patch attempts to stop relying on the output
of brctl, instead favoring sysfs and /sys/devices/virtual/net.
It cuts a lot of bash out altogether by using java.io.File. It was
tested in my devcloud-kvm against current 4.0, as well as by the
customer reporting the initial bug.
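For example, bridge membership can be answered from sysfs with java.io.File instead of parsing brctl output (illustrative):

    import java.io.File;

    class BridgeSysfs {
        /** True if 'iface' is currently a port of bridge 'bridge'. */
        static boolean isInterfaceInBridge(String bridge, String iface) {
            return new File("/sys/class/net/" + bridge + "/brif/" + iface).exists();
        }

        /** True if a bridge with this exact name exists (no grep prefix matches). */
        static boolean bridgeExists(String bridge) {
            return new File("/sys/devices/virtual/net/" + bridge + "/bridge").exists();
        }
    }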
BUG-ID: CLOUDSTACK-938
Fix-For: 4.0.1
Signed-off-by: Marcus Sorensen <marcus@betterservers.com>
Description: When selecting a storage adaptor, cleanupDisk assumes that
libvirt is being used. Therefore, we pass a StoragePoolType that maps to
libvirt. This is the only place in LibvirtComputingResource where the
StoragePoolType can't be pulled from somewhere else.
BUG-ID: CLOUDSTACK-1011
Signed-off-by: Marcus Sorensen <marcus@betterservers.com>
Detail: This merges the resizevolume feature branch, which provides the
ability to migrate a disk between disk offerings, thereby changing its
size, or specifying a new size if current disk offering is custom.
BUG-ID: CLOUDSTACK-644
Signed-off-by: Marcus Sorensen <marcus@betterservers.com> 1358358209 -0700
Although I still think the templates aren't well maintained, I just
added 12.04 since this is an LTS and people probably want it in the
list of templates.
This system should be more generic I think though.
Create OvsVifDriver to deal with openvswitch specifics for plugging
interfaces.
Create a parameter to set the bridge type to use in
LibvirtComputingResource.
Create several functions to get bridge information from openvswitch.
Add a check to detect the libvirt version and throw an exception when
the version is too low (< 0.9.11).
Fix classpath loading in Script.findScript to deal with missing path
separators at the end.
Add a notification to the BridgeVifDriver that the lswitch broadcast type is
not supported.
Detail: Instead of using LibvirtStorageAdaptor for everything, you can create
your own storage adaptor and use it. We select storage adaptor based on storage
pool type, thus we needed to adjust LibvirtComputingResource to pass pool type
to everything in KVMStoragePoolManager. This in turn required that we pass the
info necessary to LibvirtComputingResource as well, so a few agent Commands were
modified.
Note this patch in and of itself shouldn't change any existing behavior, just
allow for new storage adaptors to be selected based on storage pool type.
Reviewed-by: Edison Su
Signed-off-by: Marcus Sorensen <marcus@betterservers.com> 1355769696 -0700
Detail: If the source image is qcow2 and we want a qcow2 image, then doing a
convert strips off compression and any snapshots the user had in that image. If
a backing file exists, we stick with convert so we can pull in both the backing
file and the COW image; otherwise we just cp the qcow2 file, which is also faster.
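Roughly (a sketch; the real code shells out through the agent's script utilities):

    // Sketch: choose between qemu-img convert and a plain copy for qcow2.
    class Qcow2Copy {
        static String[] copyCommand(String src, String dst, boolean hasBackingFile) {
            if (hasBackingFile) {
                // convert collapses the backing chain into a standalone image
                return new String[] {"qemu-img", "convert", "-f", "qcow2",
                                     "-O", "qcow2", src, dst};
            }
            // plain cp keeps compression and internal snapshots, and is faster
            return new String[] {"cp", src, dst};
        }
    }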
Signed-off-by: Marcus Sorensen <marcus@betterservers.com> 1354755241 -0700
Detail: This patch deletes any patchdisk found when deleting root volume for
system VM.
BUG-ID: CLOUDSTACK-566
Bugfix-for: 4.0.1
Signed-off-by: Marcus Sorensen <marcus@betterservers.com> 1354222335 -0700
stopped
Detail: This patch fixed an issue with hosts trying to stop system vms that were
already not running and deleting a patch disk for the system vm running on
another host. It got applied to 4.0 but not master.
Signed-off-by: Marcus Sorensen <marcus@betterservers.com> 1354222160 -0700
Detail: Because of the way most other primary storage types work with cloudstack
(i.e. backing stores), CLVM actually copies the template to a local logical
volume on primary storage, then uses that. This causes all of your primary
storage to be littered with a copy of every template used. Since we're not
using these, dump the template directly to the newly created logical volume.
This is faster as well, since the template is sparse; we're not creating a fat
template on primary storage and then copying that to a logical volume when we
deploy from template.
BUG-ID: CLOUDSTACK-508
Bugfix-for: 4.1
Signed-off-by: Marcus Sorensen <marcus@betterservers.com> 1353221260 -0700
Detail: In com.cloud.hypervisor.kvm.resource.BridgeVifDriver.java, in 2 places
an if block should have evaluated to true if trafficLabel was null, however it
was causing a NullPointerException instead.
BUG-ID : NONE
Bugfix-for: 4.0
Reviewed-by: Marcus Sorensen
Reported-by: Dave Cahill
Signed-off-by: Marcus Sorensen <marcus@betterservers.com> 1352307750 -0700
Detail: There was a regression in functionality introduced by
915babd970 where the public
bridge could not also be the private bridge. This had several
additional consequences, this patch should revert the behavior
back while keeping the functionality enhancements introduced by that
commit.
BUG-ID : NONE
Reviewed-by: Dave Cahill
Reported-by: Dave Cahill via cloudstack-dev
Signed-off-by: Marcus Sorensen <shadowsor@gmail.com> 1351574006 -0600
called.
VifDriver.unplug must be called in MigrateCommand, which hooks VM
migration on the source host, because plug will be called in
PrepareForMigration on the destination host. But that operation is missing
in the current LibvirtComputingResource.
Signed-off-by: Edison Su <sudison@gmail.com>
On a KVM computing host, VifDriver.unplug always fails (throws a
LibvirtException) and network cleanup is not called. This was
because the code first undefines the computing domain, and then tries to
query the destroyed machine definition to fetch NIC information. IMHO, the
KVM plugin code swallows LibvirtException too much.
Signed-off-by: Edison Su <sudison@gmail.com>
Since only the cephx user like 'admin' was passed we couldn't define two RBD storage pools
using the cephx user admin, even if they were running on different Ceph clusters.
By adding the monitor hostname and poolname to the secret's usage (which we don't even use) it becomes
unique.
work)
Cloudstack seems to let you create guest traffic types on multiple
physical networks. However, when I try this with KVM I end up always
bridging to whatever device is used for guest.network.device. This pulls
the traffic label (NicTO.getName()) and uses that bridge to ensure that
we get on the correct physical network, rather than just always using
the guest.network.device.
This also changes the bridge naming scheme from cloudVirBr + vlanid to
br + physicalinterface + "-" + vlanid. This is because we should be able
to support the same VLAN numbers per physical network, and the previous
bridge naming would not support this and would collide.
Signed-off-by: Edison Su <sudison@gmail.com>
create
The code is unable to detect an existing pool because we use a random
UUID each time. Newer libvirt doesn't allow multiple pools to be defined
for the same storage. This patch generates a UUID based on the storage
path, so that an existing pool can be detected and reused. It also cleans
up no-op code and adjusts the naming of a few things to clear up any
confusion.
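A deterministic UUID from the path is a one-liner:

    import java.util.UUID;

    public class PoolUuidDemo {
        public static void main(String[] args) {
            String path = "/export/primary";
            // The same path always yields the same UUID, so a pool already
            // defined in libvirt for this storage is recognized and reused.
            System.out.println(UUID.nameUUIDFromBytes(path.getBytes()));
        }
    }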
Signed-off-by: Edison Su <sudison@gmail.com>
Since /root has r-x permissions, Java fails to mkdir /root/.ssh (even
though the agent is running as root) because it checks for the writable
permission. This patch changes the 'chmod 700 /root/.ssh' shell command
that we already use into 'mkdir -m 700 /root/.ssh', to be able to create
the directory as root even though write permissions are not set on
/root. This seemed cleaner/safer than adding writable to /root.
Signed-off-by: Edison Su <sudison@gmail.com>
The default value for local.storage.path does not exist by
default on CentOS 6. By default, this results in a silent NullPointerException.
Without this log message, the administrator can't figure out
the reason at all.
Signed-off-by: Edison Su <sudison@gmail.com>
/root/.ssh is created with perms '600' if it doesn't already exist. This causes
a problem in that it can't write out id_rsa.cloud:
2012-08-27 16:35:40,227 DEBUG [cloud.agent.Agent] (agentRequest-Handler-4:null)
Processing command: com.cloud.agent.api.ModifySshKeysCommand
2012-08-27 16:35:40,228 DEBUG [kvm.resource.LibvirtComputingResource]
(agentRequest-Handler-4:null) Failed to create file: java.io.IOException:
Permission denied
Doing 'chmod u+x /root/.ssh' fixed the above, so it seems that even though the
agent is running as root it cares about being able to chdir into /root/.ssh.
Signed-off-by: Sheng Yang <sheng.yang@citrix.com>
Implements
SetupGuestNetworkCommand,SetNetworkACLCommand,SetSourceNatCommand,IpAssocVpcCommand,SetPortForwardingRulesVpcCommand.
Passes basic functionality, though I'm sure there may be some honing to
do.
Also fixes a few minor things found along the way:
- vpc_guestnw.sh wasn't successfully setting up apache due to the default
listen IP of 10.1.1.1
- vpc_guestnw.sh was referencing a 'logger_it' function, replaced with
'logger -t cloud'
- system vms were running with OS type "Debian GNU/Linux 5.0(32-bit)",
which was not found in the KVMGuestOsMapper
- the Xen implementation of SetupGuestNetworkCommand had apparently
copied its catch message from UnPlug Nic, fixed string
Send-by: Marcus Sorensen
RB: https://reviews.apache.org/r/6883
This is part 1 in enabling VPC for KVM. The various commands needing
implementation will be submitted individually unless I'm told to do
otherwise, in case I don't complete all of the commands, such that
someone else can take over or build on my work.
RB: https://reviews.apache.org/r/6859
Send-by: shadowsor@gmail.com
Add BridgeVifDriver and move current vif implementation to it.
- remove dependency on VirtualRoutingResource.
- factor out some of the networking code in LibvirtComputingResource
to BridgeVifDriver.
Add base class for KVM VifDriver.
Add VifDriver Interface for KVM.
RB: https://reviews.apache.org/r/6285
Send-by: Tomoe Sugihara <tomoe@midokura.com>
We used to generate a UUID when this wasn't set, but since we aren't writing to
agent.properties anymore we have to make sure the UUID is persistent across restarts.
Libvirt can also return a bunch of emulators for e.g. ARM and S390.
We filter those out since we do not support these architectures.
This way we don't try to start an x86_64 instance with an S390 emulator.
Since we are using libvirt for handling our storage pools we should rely on that information as well.
Before fetching the capacity we refresh the pool so libvirt has the most up-to-date information.
This is not needed with newly created pools since libvirt does a refresh on creation.