Improved performance on creating VM for KVM virtualization.
On a huge hosts every "ifconfig | grep" takes a lot of time (about 2.5-3 minutes on hosts with 500 machines). For example: ip link show dev $vlanDev > /dev/null is faster than ifconfig |grep -w $vlanDev > /dev/null. But using ip command is much better. Using this patch you can create 500s machine in 10 seconds. You don't need slow ifconfig prints anymore.
In 6233a77d15 as a part of PR #2432 the
bash() function was replaced by the execute() function.
Somehow this last calling of the bash() function was not caught by testing
and is still in there.
This causes Exceptions to be thrown by the Security Group script.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
On RHEL/CentOS/Fedora the bridge related sysctl rules are enabled
in kernel by default but can only be disabled. Enabling those keys
will fail, causing iptables/ebtables tables to not be created
and fails SG on CentOS.
This also fixes an integration test case, which assumes first few
tests complete within 3 minutes. In nested env the value may be large,
this increases the value to 20 minutes.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- We should return a boolean and not a String 'true' or 'false'. Although this output is never checked by the calling function(s).
- Do not use == False or == None as that is not according to the Python specs.
- Calling just print 'hello' is deprecated and won't work in newer Python versions. We should use the print() function.
- Remove unused and commented function.
- Use logging.warning() instead of logging.warn()
- Use subprocess.check_output() for execution. This is the Python way of executing commands.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
With merge of PR #2028 the separator for lines to the Security Group
Python script changed from : to ; to support IPv6 addresses.
This broke certain situations where rules were parsed improperly. This
commit fixes the issue.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This script is used to upload snapshots to swift and is executed on dom0 of XenServer. The PR make logging from /var/log/cloud/swiftxenserver.log more meaningful as the below example;
2017-06-15 10:26:32 DEBUG [root] #### CLOUD enter swift ####
2017-06-15 10:26:32 DEBUG [root] #### CLOUD upload begin S-12522/d841b62a-7f83-4d5d-9e9d-2940115f7fa9.vhd to swift ####
2017-06-15 10:27:13 DEBUG [root] #### CLOUD upload complete S-12522/d841b62a-7f83-4d5d-9e9d-2940115f7fa9.vhd to swift: 0:00:40 @ 45 MB/s ####
2017-06-15 10:27:13 DEBUG [root] #### CLOUD exit swift ####
This fixes regression introduced in PR #2295:
- Pass assign=true to fetch new public IP
- Use wait_until instead of sleep+wait in tests
- Loop through list of public IP ranges to match the systemvm gateway
- Fix potential NPE seen when adding simulator host(s)
- Removes aria2 installation from setup_agent.sh using yum, it's already
dependency for cloudstack-agent package
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This feature allows using templates and ISOs avoiding secondary storage as intermediate cache on KVM. The virtual machine deployment process is enhanced to supported bypassed registered templates and ISOs, delegating the work of downloading them to primary storage to the KVM agent instead of the SSVM agent.
Template and ISO registration:
- When hypervisor is KVM, a checkbox is displayed with 'Direct Download' label.
- API methods registerTemplate and registerISO are both extended with this new parameter directdownload.
- On template or ISO registration, no download job is sent to SSVM agent, CloudStack would only persist an entry on template_store_ref indicating that template or ISO has been marked as 'Direct Download' (bypassing Secondary Storage). These entries are persisted as:
template_id = Template or ISO id on vm_template table
store_id NULL
download_state = BYPASSED
state = Ready
(Note: these entries allow users to deploy virtual machine from registered templates or ISOs)
- An URL validation command is sent to a random KVM host to check if template/ISO location can be reached. Metalink are also supported by this feature. In case of a metalink, it is fetched and URL check is performed on each of its URLs.
- Checksum should be provided as indicated on #2246: {ALGORITHM}CHKSUMHASH
- After template or ISO is registered, it would be displayed in the UI
Virtual machine deployment:
When a 'Direct Download' template is selected for deployment, CloudStack would delegate template downloading to destination storage pool via destination host by a new pluggable download manager.
Download manager would handle template downloading depending on URL protocol. In case of HTTP, request headers can be set by the user via vm_template_details. Those details should be persisted as:
Key: HTTP_HEADER
Value: HEADERNAME:HEADERVALUE
In case of HTTPS, a new API method is added uploadTemplateDirectDownloadCertificate to allow user importing a client certificate into all KVM hosts' keystore before deployment.
After template or ISO is downloaded to primary storage, usual entry would be persisted on template_spool_ref indicating the mapping between template/ISO and storage pool.
This commit adds support for passing IPv6 Addresses and/or Subnets as
Secondary IPs.
This is groundwork for CLOUDSTACK-9853 where IPv6 Subnets have to be
allowed in the Security Groups of Instances to we can add DHCPv6
Prefix Delegation.
Use ; instead of : for separating addresses, otherwise it would cause
problems with IPv6 Addresses.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
- All tests should pass on KVM, Simulator
- Add test cases covering FSM state transitions and actions
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Host-HA offers investigation, fencing and recovery mechanisms for host that for
any reason are malfunctioning. It uses Activity and Health checks to determine
current host state based on which it may degrade a host or try to recover it. On
failing to recover it, it may try to fence the host.
The core feature is implemented in a hypervisor agnostic way, with two separate
implementations of the driver/provider for Simulator and KVM hypervisors. The
framework also allows for implementation of other hypervisor specific provider
implementation in future.
The Host-HA provider implementation for KVM hypervisor uses the out-of-band
management sub-system to issue IPMI calls to reset (recover) or poweroff (fence)
a host.
The Host-HA provider implementation for Simulator provides a means of testing
and validating the core framework implementation.
Signed-off-by: Abhinandan Prateek <abhinandan.prateek@shapeblue.com>
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
After integrating XS with CCP the following folder gets created: /var/log/cloud/ however the logs in that are not rotated resulting in root file system fill up. It was a known issue and link http://support.citrix.com/article/CTX138064 describes the issue and solution. Used the article and added corresponding changes to Cloudstack.
ip6tables no longer takes '--icmpv6-type any' as a argument.
To allow all ICMPv6 traffic with ip6tables it has to be invoked this way:
$ ip6tables -I i-2-14-VM -p icmpv6 -s ::/0 -j ACCEPT
All ICMPv6 traffic is now allow into the Instance.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
This fixes the agreed upon url on download.cloudstack.org in various
sql files and misc scripts.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This commit implements Ingress and Egress filtering for IPv6 in
Basic Networking.
It allows for opening and closing ports just as can be done with IPv4.
Rules have to be specified twice, once for IPv4 and once for IPv6, for
example:
- 22 until 22: 0.0.0.0/0
- 22 until 22: ::/0
Egress filtering works the same as with IPv4. When no rule is applied all
traffic is allowed. Otherwise only the specified traffic (with DNS being
the exception) is allowed.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
This commit implements basic Security Grouping for KVM in
Basic Networking.
It does not implement full Security Grouping yet, but it does:
- Prevent IP-Address source spoofing
- Allow DHCPv6 clients, but disallow DHCPv6 servers
- Disallow Instances to send out Router Advertisements
The Security Grouping allows ICMPv6 packets as described by RFC4890
as they are essential for IPv6 connectivity.
Following RFC4890 it allows:
- Router Solicitations
- Router Advertisements (incoming only)
- Neighbor Advertisements
- Neighbor Solicitations
- Packet Too Big
- Time Exceeded
- Destination Unreachable
- Parameter Problem
- Echo Request
ICMPv6 is a essential part of IPv6, without it connectivity will break or be very
unreliable.
For now it allows any UDP and TCP packet to be send in to the Instance which
effectively opens up the firewall completely.
Future commits will implement Security Grouping further which allows controlling UDP and TCP
ports for IPv6 like can be done with IPv4.
Regardless of the egress filtering (which can't be done yet) it will always allow outbound DNS
to port 53 over UDP or TCP.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
Use separate lvcreate command on XenServer7 hosts, that checks and passes
different parameters based on the xenserver release version.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Allow DNS queries over TCP when egress filtering is configured.
When using DNSSEC more and more queries are done over TCP and this
requires 53/TCP to be allowed.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
more detailed error if host file not found or cannot be opened
using mkstemp and mkdtemp for improved security
improve resource cleanup in error conditions in unit test
As requested here: https://github.com/apache/cloudstack/pull/1495
No scripts are using perl so that install requirement can be removed.
The new scripts are using standard python packages only.
Includes extensive unit test.
kvm: Aqcuire lock when running security group Python scriptIt could happen that when multiple instances are starting at the same
time on a KVM host the Agent spawns multiple instances of security_group.py
which both try to modify iptables/ebtables rules.
This fails with on of the two processes failing.
The instance is still started, but it doesn't have any IP connectivity due
to the failed programming of the security groups.
This modification lets the script aqcuire a exclusive lock on a file so that
only one instance of the scripts talks to iptables/ebtables at once.
Other instances of the script which start will poll every 500ms if they can
obtain the lock and otherwise execute anyway after 15 seconds.
* pr/1408:
kvm: Aqcuire lock when running security group Python script
Signed-off-by: Will Stevens <williamstevens@gmail.com>
It could happen that when multiple instances are starting at the same
time on a KVM host the Agent spawns multiple instances of security_group.py
which both try to modify iptables/ebtables rules.
This fails with on of the two processes failing.
The instance is still started, but it doesn't have any IP connectivity due
to the failed programming of the security groups.
This modification lets the script aqcuire a exclusive lock on a file so that
only one instance of the scripts talks to iptables/ebtables at once.
Other instances of the script which start will poll every 500ms if they can
obtain the lock and otherwise execute anyway after 15 seconds.
The lock will be released as soon as the script exists, which is usually within
a few hundred ms.
* 4.7:
CLOUDSTACK-9172 Added cross zones check to delete template and iso
Check the existence of 'forceencap' parameter before use
systemvm: set default umask 022 in injectkeys.sh
The default umask of 0022 is set in Ubuntu and other packages. Set the same
in case of CentOS startup scripts. Use umask 022 in the injectkeys.sh script
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This is a mandatory argument but it was NOT passed which caused the
re-programming of security groups to fail.
Simple fix to just add the argument since the variable is available
there.
* 4.6:
Revert "Change references of people.apache.org to home.apache.org in the test code"
Change references of people.apache.org to home.apache.org in the test code This closes#1123 Signed-off-by: SrikanteswaraRao Talluri <talluri@apache.org>
CLOUDSTACK-9077 Fix injectkeys.sh to work on CentOS7
CLOUDSTACK-9065: fix bug when creating packaging with noredist flag
The S3 implementation is far from finished, this commit focusses on the bases.
- Upgrade AWS SDK to latest version.
- Rewrite S3 Template downloader.
- Rewrite S3Utils utility class.
- Improve addImageStoreS3 API command.
- Split various classes for convenience.
- Various minor improvements and code optimalisations.
A side effect of the new AWS SDK is that it, by default, uses the V4 signature. Therefore I added an option to specify the Signer, so it stays compatible with previous versions.
CLOUDSTACK-9029: Proper support to identify CentOS 7 version numberhttps://issues.apache.org/jira/browse/CLOUDSTACK-9029
* pr/1033:
CLOUDSTACK-9029: Proper support to identify CentOS 7 version number
Signed-off-by: Remi Bergsma <github@remi.nl>
To configure firewall rules, CloudStack modifies `/etc/sysctl.conf` and
execute those modifications. This may be harmful for several reasons:
1. `/etc/sysctl.conf` may be managed by some configuration management
system. Such a system will constantly restore the previous version.
2. `/etc/sysctl.conf` may contain additional properties that have been
changed later by some system administrator (for example, once a
firewall has been configured, forwarding may have been activated
while it is disabled in `/etc/sysctl.conf`). Executing the file
again at a later time may disrupt the system.
3. Entries are added again and again. `/etc/sysctl.conf` will contain
the same directives repeated several times.
Using a configuration file is not needed as `sysctl` is able to directly
modify sysctl values with `-w` flag.
Signed-off-by: Vincent Bernat <Vincent.Bernat@exoscale.ch>
This setting works on CentOS 6 / RHEL 6 but does nothing, as
"cpu" cgroup is not mounted. On CentOS 7 / RHEL 7 systemd does
mount cgroups and "cpu" is co-mounted with "cpuacc". Hence, if
we specify "cpu" then this results in an error because it can
only use them both, or none.
By removing the setting, we rely on the default of qemu, which
is:
cgroup_controllers = ["cpu", "devices", "memory", "blkio", "cpuacct", "net_cls"]
Only if they are really mounted, they will be used. So, this will
work on both version 6 and 7.
The 'fix script' didn't work well, as after a reboot you'd still have qemu
throwing errors. Now we can handle the co-mountedcgroups.
- Changed location of the update_host_passwd script
- Updated the patch files for XenServer
- Updated the script path on LibvirtComputing class
- Removed the hostIP from the LibvirtUpdateHostPasswordCommandWrapper execute() method
- Adding update_host_passwd to VRScripts
- Add accessor method to host password on CitrixResourceBase
- Add implementation to CitrixUpdateHostPasswordCommandWrapper
- Improve testUpdateHostPasswordCommand() unit test on CitrixRequestWrapperTest
- Add line to patch files on xenserver directory
Concerning the LibVirt change:
- I forgot to assing the return of the getDefaultHypervisorScriptsDir() method to the hypervisorScriptsDir variable
- Modifying the LibvirtUpdateHostPasswordCommandWrapper in order to execute the script on the host
- Adding the script path to LibvirtComputingResource
- Adding the host IP address as an instance variable on UpdateHostPasswordCommand
- Improving the Unit Test (LibvirtComputingResourceTest) to get it covering the new code
VLAN id 4095 is commonly used as a 'tag passthrough' in virtualization environments
(VMware, specifically). This vlan id is incompatible with Linux, but we can
allow the admin to manually configure the bridge if the same passthrough is
desired.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
(cherry picked from commit aee35c96a8)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Router VMs don't have a chain rule with -def suffix, this fixes name and
properly removes VR vms not running on a host
- Before trying to remove dnats, filter empty/None elements from list
- destroy_ebtables_rules should check what kind of action is request to be
performed (-A for add or -D for removed) and execute based on that
- Before executing any command, log it for debugging purposes
- Method to cleanup bridge, may be used in future
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
(cherry picked from commit 3925512115)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
The SG python script depends on ebtables-save which is not available on Debian
based distros (Ubuntu and Debian for example). The commit uses /proc/modules
to find available bridge tables (one of nat, filter or broute) and then
find VMs that need to be removed. Further it uses set() to remove duplicate VMs
so we don't try to remove a VM's rules more than once leading to unwanted errors
in the log.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
(cherry picked from commit d66677101c)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This fixes the issue of Security Groups not working in case of XenServer 6.5;
- Uses nethash ipset data-structure to store CIDRs (efficient than iphash and
avoids overflow errors in case users add /8 /4 ingress/egress cidrs)
- Support for ipset versions both on 6.2 and 6.5, both have different outputs. This
fixes the issue of destroy_network_rules_for_vm failing
- Implements defensive filtering of list, instead of popping last item without
checking if it's None or empty
- Greps using names that are 'quoted' to avoid bash errors
- Before setting up new network rule, tries to clean and remove old ipset entry
- Idents, whitespace and naming fixes
PS. This is my 1000th commit to the 🐵 project :)
This closes#186
(cherry picked from commit d91d161107)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Conflicts:
scripts/vm/hypervisor/xenserver/vmops
scripts: filter output instead of popping string from list
This is a defensive enhancement for KVM SG script that filters out empty string
instead of popping last item which may or may not be an empty string.
Squashed commits:
(cherry picked from commit f4cbc4c010)
(cherry picked from commit 64ab3554a1)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
The recent discussed improvement has the risk that if 'sync' hangs, the reboot may be delayed in the same way as the 'reboot' command would do. To work around, we're adding a 5 second timeout. If it cannot sync in 5 seconds, it will not succeed anyway and we should proceed the reset.
@snuf: Could we use your OVM3 heartbeat script for other hypervisors as well? One way to do it seems like a nice idea :-)
As discussed with @wido @pyr and @nuxro added an extra log line.
Tested it and it logs fine (tested to local disk) when syncing first:
Apr 3 15:31:23 mcctest2 heartbeat: kvmheartbeat.sh system because it was unable to write the heartbeat to the storage
By the way, it did also log to the agent.log but this extra log has the benefit of ending up in the system log so you'll probably find it easier there. Existing logs:
2015-04-03 15:27:23,943 WARN [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 0
2015-04-03 15:28:23,944 WARN [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 1
2015-04-03 15:29:23,946 WARN [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 2
2015-04-03 15:30:23,948 WARN [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 3
2015-04-03 15:31:23,950 WARN [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 4
2015-04-03 15:31:23,950 WARN [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout; reboot the host
This closes#145
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
When storage cannot be reached, it does not make sense to reboot as it will try to flush buffers, umount NFS mounts, etc. This will not work and thus cause a long delay. With this change, the box will reboot immediately (like pressing the reset button).
This is a plugin that puts in ovm3 support ranging from 3.3.1 to 3.3.2. Basic
functionality is in here, advanced networking etc..
Snapshots only work when a VM is stopped now due to the semantics of OVM's raw
image implementation (so snapshots should work on a storage level underneath the
hypervisor shrug)
This closes#113
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
When there is large size VR configuration (aggregate commands) copying data to VR using vmops plugin was failed
because of the ARG_MAX size limitation. The configuration data size is around 300KB.
Updated this to create file in host by scp with file contents. This will create file in host.
Then copy the file from the host to VR using hte vmops createFileInDomr method.
In host file get created in /tmp/ with name VR-<UUID>.cfg, once it copied to VR this file will be removed.
This makes sure dom0 in xenserver doesn't get hammered
when copying templates. It doesn't make sense to use
the cache of dom0 as the template does not fit in
memory. The directIO flags prevent it from trying.
Some try/except in security_group.py catch a lot of exceptions. There
was already one fixed in CLOUDSTACK-1052. Here is another one. We use
logging.exception() to log those exceptions.
Signed-off-by: Vincent Bernat <Vincent.Bernat@exoscale.ch>
Signed-off-by: Pierre-Luc Dion <pdion891@apache.org>
Recent versions of libvirt (at least 0.9.8) will return an int when
queried for the ID of a domain, not a string. This breaks some parts of
the `security_group.py` script which expects a string containing an
int. Notably, this breaks the part handling VM reboots which is
therefore not executed.
Signed-off-by: Vincent Bernat <Vincent.Bernat@exoscale.ch>
Signed-off-by: Sebastien Goasguen <runseb@gmail.com>
last VM from the VPC is deleted on a host
OVS distributed routing: ensure bridge is deleted when last VM from the
VPC is deleted on a host. This fix ensures that bridge is
destroyed.
not created already when OvsVpcPhysicalTopologyConfigCommand update is
recived
Currently if the tunnel creation fails, there is no retry logic. Fix
ensures OvsVpcPhysicalTopologyConfigCommand updates as an opputiunity to ensure
proper tunnels are established between the hosts.
OvsVpcPhysicalTopologyConfigCommand and OvsVpcRoutingPolicyConfigCommand
fix ensures only latest updates are applied (new openflow rules) to the
bidge enabled for distributed routing.
fix mismatch of ovs-host-setup, ovs_host_setup used Libvirt resource and
scripts
plug the nic to OVS bridges created for the tunnel network.
Conflicts:
plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/OvsVifDriver.java
distributed routing. Fix ensures there is approproate flag in other
config of the network to indicate the bridge type.
Conflicts:
server/src/com/cloud/network/vpc/VpcManagerImpl.java
to flow rules and applies them on the bridge
add event subscriber in OvsTunnelManager, that listens to
replaceNetworkAcl events. On event sends the updated policy info to all
the hosts in the VPC
- get the hosts on which VPC spans given vpc id
- get the VM's in the VPC
- get the hosts on which a network spans
- get the VPC's to which a hosts is part of
- get VM's of a VPC on a hosts
introduces capability to build a physical toplogy representation of a
VPC. This json file is encapsulated in
OvsVpcPhysicalTopologyConfigCommand, and is used to send full topology
to hypervisor hosts. On hypervisor this json config can be used to setup
tunnels, configure bridge, add flow rules etc
Ovs GURU, to use different broasdcast scheme VS://vpcid.gerkey for the
networks in VPC that use distributed routing
each VIF and tunnel interface to carry the network UUID in other/options
config
distributed routing mode, and setup flows appropriatley
script to handle the VPC topology sent from management server in JSOn
format. From the JSON file, reqired configuration (tunnel setup and flow
rules setup) is setup on the bridge
use interface wildcard "+" in iptables to cover potential used VLAN interface to allow output on physical interface.
you will see
0 0 ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 PHYSDEV match --physdev-out bond2+ --physdev-is-bridged
instead of
0 0 ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 PHYSDEV match --physdev-out bond2.1234 --physdev-is-bridged
Anthony
1) vxlan will use bridge scheme 'brvx-<vni>'. Multiple physical networks can host guest
traffic type with vxlan isolation, so long as they don't use the same VNI range.
2) Guest traffic labels can be physical interface if bridge by given name is not found.
Normally we take traffic label name, find the matching bridge, then resolve that to a
physical interface. Then we create guest bridges on that interface. Now we can just
specify the interface.
xs 6.1/6.2 introduce the new virtual platform, so there are two virtual platforms, windows PV driver version must match virtual platforms,
this patch tracks PV driver versions in vm details and template details.
Anthony