CLOUDSTACK-10341: VR minor fixes to systemvmtemplate (#2468)
CLOUDSTACK-10340: Add setter to hypervisorType in VMInstanceVO (#2504)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Fixes rsyslog: fix config error in rsylslog.conf
Feb 26 08:09:54 r-413-VM liblogging-stdlog[19754]: action '*' treated as ':omusrmsg:*' - please use ':omusrmsg:*' syntax instead, '*' will not be supported in the future [v8.24.0 try http://www.rsyslog.com/e/2184 ]
Feb 26 08:09:54 r-413-VM liblogging-stdlog[19754]: error during parsing file /etc/rsyslog.conf, on or before line 95: warnings occured in file '/etc/rsyslog.conf' around line 95 [v8.24.0 try http://www.rsyslog.com/e/2207 ]
- Run apache2 only after cloud-postinit
- Increase /run size for VR with 256M RAM
root@r-395-VM:~# systemctl daemon-reload
Failed to reload daemon: Refusing to reload, not enough space available on /run/systemd. Currently, 15.8M are free, but a safety buffer of 16.0M is enforced.
tmpfs 23M 6.5M 16M 29% /run
This fixes a difference issue in rVR heartbeat check script raised
recently on dev@.
Reduce logging to avoid logging to fill ramdisk
Make checkrouter return fault state when keepalived is not running
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This deprecates and remove TLS 1.0 and 1.1 from preferred list of
protocols and keeps only TLSv1.2.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* CLOUDSTACK-10317: Fix SNAT rules for additional public nics
This allows networks with additional public nics to have correct
SNAT iptables rules applied on configuration.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* update based on Wei's suggested change
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This might (and does block) in certain situations on the VR as
also explained in the Python documentation:
https://docs.python.org/2/library/subprocess.html#subprocess.Popen.wait
Warning This will deadlock when using stdout=PIPE and/or stderr=PIPE
and the child process generates enough output to a pipe such that
it blocks waiting for the OS pipe buffer to accept more data.
Use communicate() to avoid that.
Using the check_output function handles most of this for us and
also provides better error handling.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
Delete dnsmasq's leases file when dnsmasq is restarted to avoid it
use old ip-mac-address-vm mapping leases.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
In case of isolated, both self.config.is_vpc() and self.config.is_router() are false,
but self.config.is_dhcp() is true.
Moved the password server logic to the `if has_metadata` block,
as this is valid for all 3 systemvm types.
When the IPv4 address of a Instance changes we need to make sure the
old entry is removed from the DHCP lease file on the Virtual Router
otherwise the Instance will still get the old lease.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
This includes test related fixes and code review fixes based on
reviews from @rafaelweingartner, @marcaurele, @wido and @DaanHoogland.
This also includes VMware disk-resize limitation bug fix based on comments
from @sateesh-chodapuneedi and @priyankparihar.
This also includes the final changes to systemvmtemplate and fixes to
code based on issues found via test failures.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This fixes test failures around VMware with the new systemvmtemplate.
In addition:
- Does not skip rVR related test cases for VMware
- Removes rc.local
- Processes unprocessed cmd_line.json
- Fixed NPEs around VMware tests/code
- On VMware, use udevadm to reconfigure nic/mac address than rebooting
- Fix proper acpi shutdown script for faster systemvm shutdowns
- Give at least 256MB of swap for VRs to avoid OOM on VMware
- Fixes smoke tests for environment related failures
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Several systemvmtemplate optimizations
- Uses new macchinina template for running smoke tests
- Switch to latest Debian 9.3.0 release for systemvmtemplate
- Introduce a new `get_test_template` that uses tiny test template
such as macchinina as defined test_data.py
- rVR related fixes and improvements
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Refactors and simplifies systemvm codebase file structures keeping
the same resultant systemvm.iso packaging
- Password server systemd script and new postinit script that runs
before sshd starts
- Fixes to keepalived and conntrackd config to make rVRs work again
- New /etc/issue featuring ascii based cloudmonkey logo/message and
systemvmtemplate version
- SystemVM python codebase linted and tested. Added pylint/pep to
Travis.
- iptables re-application fixes for non-VR systemvms.
- SystemVM template build fixes.
- Default secondary storage vm service offering boosted to have 2vCPUs
and RAM equal to console proxy.
- Fixes to several marvin based smoke tests, especially rVR related
tests. rVR tests to consider 3*advert_int+skew timeout before status
is checked.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This ports PR #1470 by @remibergsma.
Make the generated json files unique to prevent concurrency issues:
The json files now have UUIDs to prevent them from getting overwritten
before they've been executed. Prevents config to be pushed to the wrong
router.
2016-02-25 18:32:23,797 DEBUG [c.c.a.t.Request] (AgentManager-Handler-1:null) (logid:) Seq 2-4684025087442026584: Processing: { Ans: , MgmtId: 90520732674657, via: 2, Ver: v1, Flags: 10, [{"com.cloud.agent.api.routing.GroupA
nswer":{"results":["null - success: null","null - success: [INFO] update_config.py :: Processing incoming file => vm_dhcp_entry.json.4ea45061-2efb-4467-8eaa-db3d77fb0a7b\n[INFO] Processing JSON file vm_dhcp_entry.json.4ea4506
1-2efb-4467-8eaa-db3d77fb0a7b\n"],"result":true,"wait":0}}] }
On the router:
2016-02-25 18:32:23,416 merge.py __moveFile:298 Processed file written to /var/cache/cloud/processed/vm_dhcp_entry.json.4ea45061-2efb-4467-8eaa-db3d77fb0a7b.gz
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Refactor cloud-early-config and make appliance specific scripts
- Make patching work without requiring restart of appliance and remove
postinit script
- Migrate to systemd, speedup booting/loading
- Takes about 5-15s to boot on KVM, and 10-30seconds for VMware and XenServer
- Appliance boots and works on KVM, VMware, XenServer and HyperV
- Update Debian9 ISO url with sha512 checksum
- Speedup console proxy service launch
- Enable additional kernel modules
- Remove unknown ssh key
- Update vhd-util URL as previous URL was down
- Enable sshd by default
- Use hostnamectl to add hostname
- Disable services by default
- Use existing log4j xml, patching not necessary by cloud-early-config
- Several minor fixes and file refactorings, removed dead code/files
- Removes inserv
- Fix dnsmasq config syntax
- Fix haproxy config syntax
- Fix smoke tests and improve performance
- Fix apache pid file path in cloud.monitoring per the new template
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Load the nf_conntrack_ipv6 module for IPv6 connection tracking on SSVM
- Move systemd services to /etc and enable services after they have been
installed
- Disable most services by default and enable in cloud-early-config
- Start services after enabling them using systemd
- In addition remove /etc/init.d/cloud as this is no longer needed and
done by systemd
- Accept DOS/MBR as file format for ISO images as well
Under Debian 7 the 'file' command would return:
debian-9.1.0-amd64-netinst.iso: ISO 9660 CD-ROM filesystem data UDF filesystem data
Under Debian 9 however it will return
debian-9.1.0-amd64-netinst.iso: DOS/MBR boot sector
This would make the HTTPTemplateDownloader in the Secondary Storage VM refuse the ISO as
a valid template because it's not a correct format.
Changes this behavior so that it accepts both.
This allows us to use Debian 9 as a System VM template.
Not sure though if enabling them is enough for systemd to still start them
on first boot
Signed-off-by: Wido den Hollander <wido@widodh.nl>
SystemVM changes to work on Debian 9
- Migrate away from chkconfig to systemctl
- Remove xenstore-utils override deb pkg
- Fix runlevel in sysv scripts for systemd
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
We found bug in ACL rules for PrivateGateway for VPC
At a glance - rules not applied - switching Allow All or Deny All (default ACL) - showed as completed - but rules missed.
Result - traffic via PrivateGateway blocked by next DROP rule in next chains
How to reproduce:
Enable PrivateGateway for Cloudstack
Create VPC
Provision new PrivateGateway inside VPC with some VLAN
Change ACL (optional step to show that problem not in initial configuration but in config itself)
Expected:
ACL rules applied (inserted) into correspondig ACL_INBOUND/OUTBOUND chanins for PrivateGateway interface (ethX) based on ACL which user choose
Current:
No rules inserted. ACL_INBOUND/OUTBOUND_ethX - empty. Traffic blocked by next DROP rule in FORWARD chain
Affect - all our corporate customers blocked with access to their own nets via PG and vice-versa.
Root cause:
Issue happened because of CsNetFilter.py logic for inserting rules for ACL_INBOUND/OUTBOUND chains.
We choose rule numebr to isnert right before last DROP rule - but forget about fact - that if chain empty - we also return 0 as insert position. Which not true for iptables - numeration started from 0.
So we need very small patch to handle this special case - if number of rules inside chain equal to zero - return 1, else - return count of rules inside chain.
It's found only one - just because be default for PrivateGateway - we didn't insert any "service rules" (if SourceNat for PrivateGteway not ticked) - and we have by default empty ACL_INBOUND/OUTBOUND chains. Because same insert happened for all VPC networks (but when we call this insert - we already have at least 1 rule inside chains - and we successfully can process)
When auth strictness is set to true, terminate SSH handshake for clients
that do not present valid certificates.
This uses the `setNeedClientAuth`, where if the option is set and the
client chooses not to provide authentication information about itself,
the negotiations will stop and the engine will begin its closure
procedure:
https://docs.oracle.com/javase/7/docs/api/javax/net/ssl/SSLEngine.html#setNeedClientAuth(boolean)
During systemvm reboot, the conf folder is removed and certificate
re-setup is not done. This may cause the agent to not connect, this
fixes the case by backing up and restoring keystore and other config
files when re-patching is done after rebooting of a systemvm (cpvm, ssvm).
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This introduces a new certificate authority framework that allows
pluggable CA provider implementations to handle certificate operations
around issuance, revocation and propagation. The framework injects
itself to `NioServer` to handle agent connections securely. The
framework adds assumptions in `NioClient` that a keystore if available
with known name `cloud.jks` will be used for SSL negotiations and
handshake.
This includes a default 'root' CA provider plugin which creates its own
self-signed root certificate authority on first run and uses it for
issuance and provisioning of certificate to CloudStack agents such as
the KVM, CPVM and SSVM agents and also for the management server for
peer clustering.
Additional changes and notes:
- Comma separate list of management server IPs can be set to the 'host'
global setting. Newly provisioned agents (KVM/CPVM/SSVM etc) will get
radomized comma separated list to which they will attempt connection
or reconnection in provided order. This removes need of a TCP LB on
port 8250 (default) of the management server(s).
- All fresh deployment will enforce two-way SSL authentication where
connecting agents will be required to present certificates issued
by the 'root' CA plugin.
- Existing environment on upgrade will continue to use one-way SSL
authentication and connecting agents will not be required to present
certificates.
- A script `keystore-setup` is responsible for initial keystore setup
and CSR generation on the agent/hosts.
- A script `keystore-cert-import` is responsible for import provided
certificate payload to the java keystore file.
- Agent security (keystore, certificates etc) are setup initially using
SSH, and later provisioning is handled via an existing agent connection
using command-answers. The supported clients and agents are limited to
CPVM, SSVM, and KVM agents, and clustered management server (peering).
- Certificate revocation does not revoke an existing agent-mgmt server
connection, however rejects a revoked certificate used during SSL
handshake.
- Older `cloudstackmanagement.keystore` is deprecated and will no longer
be used by mgmt server(s) for SSL negotiations and handshake. New
keystores will be named `cloud.jks`, any additional SSL certificates
should not be imported in it for use with tomcat etc. The `cloud.jks`
keystore is stricly used for agent-server communications.
- Management server keystore are validated and renewed on start up only,
the validity of them are same as the CA certificates.
New APIs:
- listCaProviders: lists all available CA provider plugins
- listCaCertificate: lists the CA certificate(s)
- issueCertificate: issues X509 client certificate with/without a CSR
- provisionCertificate: provisions certificate to a host
- revokeCertificate: revokes a client certificate using its serial
Global settings for the CA framework:
- ca.framework.provider.plugin: The configured CA provider plugin
- ca.framework.cert.keysize: The key size for certificate generation
- ca.framework.cert.signature.algorithm: The certificate signature algorithm
- ca.framework.cert.validity.period: Certificate validity in days
- ca.framework.cert.automatic.renewal: Certificate auto-renewal setting
- ca.framework.background.task.delay: CA background task delay/interval
- ca.framework.cert.expiry.alert.period: Days to check and alert expiring certificates
Global settings for the default 'root' CA provider:
- ca.plugin.root.private.key: (hidden/encrypted) CA private key
- ca.plugin.root.public.key: (hidden/encrypted) CA public key
- ca.plugin.root.ca.certificate: (hidden/encrypted) CA certificate
- ca.plugin.root.issuer.dn: The CA issue distinguished name
- ca.plugin.root.auth.strictness: Are clients required to present certificates
- ca.plugin.root.allow.expired.cert: Are clients with expired certificates allowed
UI changes:
- Button to download/save the CA certificates.
Misc changes:
- Upgrades bountycastle version and uses newer classes
- Refactors SAMLUtil to use new CertUtils
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
If a public IP is assigned to a VPC, a VM running inside that VPC cannot ping that public IP. This is due to the IPtables Nat rules set in such a way that drop any requests to the public IP from internal interfaces. I am fixing this so that internal hosts can also reach the public IP.
Reproduction:
Create a VPC
Create a network inside the VPC
Allocate a public IP
Create a VM in the network
Create a port forwarding rule enabling ICMP
ping the public IP inside the VM (this will fail)
This makes sure IP address is active.
After a vRouter is recreated (e.g. reboot via CloudStack UI) and Remote Access VPN enabled, VPN won't work anymore. Here is the abbreviated output of "ipsec auto -status" while we were having the issue:
root@r-10-VM:~# ipsec auto --status
000 using kernel interface: netkey
000 interface lo/lo 127.0.0.1
000 interface lo/lo 127.0.0.1
000 interface eth0/eth0 169.254.1.45
000 interface eth0/eth0 169.254.1.45
000 %myid = (none)
After this commit, the following occurs and VPNs work:
root@r-10-VM:~# ipsec auto --status
000 using kernel interface: netkey
000 interface lo/lo 127.0.0.1
000 interface lo/lo 127.0.0.1
000 interface eth0/eth0 169.254.1.45
000 interface eth0/eth0 169.254.1.45
000 interface eth1/eth1 xxx.xxx.xxx.172
000 interface eth1/eth1 xxx.xxx.xxx.172
000 interface eth2/eth2 192.168.1.1
000 interface eth2/eth2 192.168.1.1
000 %myid = (none)
eth1 interface IP is masked, but now ipsec sees all the interfaces and VPN works.
Looks like this bug was introduced by Pull Request #1423
It added code to start ipsec (cloudstack/systemvm/patches/debian/config/opt/cloud/bin/configure.py)
if vpnconfig['create']:
logging.debug("Enabling remote access vpn on "+ public_ip)
CsHelper.start_if_stopped("ipsec")
This change removes an unnecessary conversion from IPNetwork
to list in one of the router scripts. This makes the router
faster at processing static NAT rules, which can prevent
timeouts when attaching or detaching IPs.
(cherry picked from commit d5c5eb10f8)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
When enabling remote access VPN, a new interface is created upon client connecting via VPN. The DNS service (dnsmasq) is set only to listen on interfaces that are active when it starts. Thus VPN users are provided the VR's IP address for DNS resolution, but it is not actually listening for DNS requests.
Mozilla Firefox displays white tile in place of cursor. The reason - function isImageLoaded() always returns true after first load and function checkUpdate() reloads too fast.
Suggested fix - in refresh() method state imageLoaded should be reverted to false. This ensures that function checkUpdate() processes only when tile image is loaded.
Current SSL protocol and ciphers used in SystemVMs are not the
recommended. To analyze it is possible to use tests such as from SSL
Labs (https://www.ssllabs.com/ssltest/). This commit changes the grade
from C to -A
This enables the firewall/mangle tables rules to ACCEPT instead of RETURN, which
is the same behaviour as observed in ACS 4.5. By accepting the traffic, guest
VMs will be able to communicate tcp traffic between each other over snat public
IPs.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
CLOUDSTACK-9669:egress destination cidr VR python script changes
CLOUDSTACK-9669:egress destination API and orchestration changes
CLOUDSTACK-9669: Added the ipset package in systemvm template
CLOUDSTACK-9669:Added licence header for new files
CLOUDSTACK-9669: replacing 0.0.0.0/0 with the network cidr
ipset member add with 0.0.0.0/0 fails. So 0.0.0.0/0 replaced with the network cidr.
In source cidr 0.0.0.0/0 is nothing but network cidr.
updated the default egress all cidr with network cidr
* 4.9:
Do not set gateway to 0.0.0.0 for windows clients
CLOUDSTACK-9904: Fix log4j to have @AGENTLOG@ replaced
ignore bogus default gateway when a shared network is secondary the default gateway gets overwritten by a bogus one dnsmasq does the right thing and replaces it with its own default which is not good for us so check for '0.0.0.0'
Activate NioTest following changes in CLOUDSTACK-9348 PR #1549
CLOUDSTACK-9828: GetDomRVersionCommand fails to get the correct version as output Fix tries to return the output as a single command, instead of appending output from two commands
CLOUDSTACK-3223 Exception observed while creating CPVM in VMware Setup with DVS
CLOUDSTACK-9787: Fix wrong return value in NetUtils.isNetworkAWithinNetworkB
- do not keep passwords in databag (/etc/cloudstack/vmpasswd.json)
- process only the password we get in (vm_password.json) from mgt server
- lookup the correct passwd server instead of adding passwd to all of them
Example:
- 4 tiers and 199 VMs running
- Start vm 200 would cause new passwd from vm_password.json (1) to be merged with /etc/cloudstack/vmpasswd.json (199)
- A curl command was exected foreach password (200) foreach tier (4) resulting in 800 calls
- In fact, since passwds are never cleaned it could very well be even more as the ip address was the key in the json file so until the ip address was reused the original password would remain and be sent to passwd server every time another vm starts.
- This took ~40 seconds
Now we just figure out the right tier and only process the new password resulting in a single curl call.
- takes 0,03 seconds!
when a shared network is secondary the default gateway gets overwritten by a bogus one
dnsmasq does the right thing and replaces it with its own default which is not good for us
so check for '0.0.0.0'
- commented some occurences of cloud.com as being harmless
* examples
* identifiers (internal)
- changed the URL for vhd-util download
- changed comments from 'cloud.com' to 'Apache CloudStack'
Fix public IPs not being removed from the VR when deprovisionedThis PR replaces #1706. It does not remove the IP from the database, but it does deprovision the IP correctly from the VR when the public IP is removed.
* pr/1907:
Fix public IPs not being removed from the VR when deprovisioned
Signed-off-by: Rajani Karuturi <rajani.karuturi@accelerite.com>
CLOUDSTACK-9757: Fixed issue in traffic from additional public subnetAcquire ip from additional public subnet and configure nat on that ip.
After this pick any from that network and access additional public subnet from this vm. Traffic is supposed to go via additional public subnet interface in the VR.
* pr/1922:
CLOUDSTACK-9757: Fixed issue in traffic from additional public subnet
Signed-off-by: Rajani Karuturi <rajani.karuturi@accelerite.com>
* 4.9:
CLOUDSTACK-9746 system-vm: logrotate config causes critical failures
CLOUDSTACK-9788: Fix exception listNetworks with pagesize=0
CLOUDSTACK-8663: Fixed various issues to allow VM snapshots and volume snapshots to exist together
Fix HVM VM restart bug in XenServer
[CLOUDSTACK-9793] Faster IP in subnet checkThis change removes the conversion from IPNetwork to list in one of the router scripts. This makes the router faster at processing static NAT rules, which can prevent timeouts when attaching or detaching IPs.
With the `list` conversion, it has to potentially check a list of 65536 IP strings multiple times. We assume that the comparison implemented in the IPNetwork is far more efficient. We have seen speed-up from 218 seconds to enable static NAT with 18 IPs on the router to 2 or 3 seconds by removing this cast. This also fixes a potential bug where adding IPs to a router time out because the scripts are taking too long. 218 seconds, for example, is beyond the timeout on the KVM agent for script execution, and then all enableStaticNat operations will fail.
* pr/1948:
CLOUDSTACK-9793: Faster ip in subnet check
Signed-off-by: Rajani Karuturi <rajani.karuturi@accelerite.com>
* rotate both daily and by size by using maxsize in stead of size
* decrease the max size to 10M for rsyslog files
* remove delaycompress for rsyslog files
* increase rotate to 10 for cloud.log
This change removes an unnecessary conversion from IPNetwork
to list in one of the router scripts. This makes the router
faster at processing static NAT rules, which can prevent
timeouts when attaching or detaching IPs.
Updated StrongSwan VPN ImplementationThis PR is a merge of @jayapalu changes in #872 and the changes I had to make to get the functionality working.
I have done pretty extensive testing of this code so far and we are looking to be in pretty good shape. One thing to note is that a `Diffie-Hellman` group **is required** in order for this feature to work correctly. It is not highlighted in the tests below, but I have shown that the `PFS` is not required for this feature to work. In #872 I have shown a more exhaustive set of tests of this code, but I have limited this set of tests to a recommended `IKE` and `ESP` configuration in order to reduce the noise and test the other areas of functionality.
**Test Results**
I am testing this functionality by creating two VPCs with VMs in each and creating a S2S VPN connection between the two VPCs. Then I SSH into a VM in one VPC and I ping the private IP of a VM in the other VPC. Then I tear it down and try a different configuration.
_Setup_
```
VPC 1 VPC 2
===== =====
VPN Gateway VPN Gateway
VPN Customer Gateway VPN Customer Gateway
VPN Connection <---> VPN Connection
- Passive = True - Passive = False
```
_Legend_
`SKIP` => At least one of the VPN Connections did not come up, so no test was run.
`OK` => The ping test was successful over the S2S VPN connection.
`FAIL` => The ping test failed over the S2S VPN connection.
`Passive` => Specifies if either the `<vpc_1> : <vpc_2>` sides of the VPN Connection is set to passive.
`Conn State` => Specifies the connection status of the `<vpc_1> : <vpc_2>` VPN Connection in the UI.
`Requires Reset` => If the ping test does not result in an `OK`, then a VPN Connection Reset is performed on either `<vpc_1> : <vpc_2>` sides of the VPN Connection based on which side is not showing `Connected`. The results in the `Status` column is the final result after the reset is performed.
_Results_
```
+--------+----------------------+-------+-------+----------+----------+---------------+-----------------------------+----------------+
| Status | IKE & ESP | DPD | Encap | IKE Life | ESP Life | Passive | Conn State | Requires Reset |
+========+======================+=======+=======+==========+==========+===============+=============================+================+
| OK | aes128-sha1;modp1536 | True | False | 86400 | 3600 | True : False | Disconnected : Connected | False : False |
+--------+----------------------+-------+-------+----------+----------+---------------+-----------------------------+----------------+
| OK | aes128-sha1;modp1536 | True | True | 86400 | 3600 | True : False | Disconnected : Connected | False : False |
+--------+----------------------+-------+-------+----------+----------+---------------+-----------------------------+----------------+
| OK | aes128-sha1;modp1536 | True | False | | 3600 | True : False | Disconnected : Connected | False : False |
+--------+----------------------+-------+-------+----------+----------+---------------+-----------------------------+----------------+
| OK | aes128-sha1;modp1536 | True | False | 86400 | | True : False | Disconnected : Connected | False : False |
+--------+----------------------+-------+-------+----------+----------+---------------+-----------------------------+----------------+
| OK | aes128-sha1;modp1536 | True | False | | | True : False | Disconnected : Connected | False : False |
+--------+----------------------+-------+-------+----------+----------+---------------+-----------------------------+----------------+
| OK | aes128-sha1;modp1536 | True | False | 86400 | 3600 | False : False | Connected : Connected | False : False |
+--------+----------------------+-------+-------+----------+----------+---------------+-----------------------------+----------------+
| OK | aes128-sha1;modp1536 | True | False | 86400 | 3600 | True : True | Disconnected : Disconnected | False : False |
+--------+----------------------+-------+-------+----------+----------+---------------+-----------------------------+----------------+
| OK | aes128-sha1;modp1536 | True | False | 86400 | 3600 | False : True | Connected : Disconnected | False : False |
+--------+----------------------+-------+-------+----------+----------+---------------+-----------------------------+----------------+
| OK | aes128-sha1;modp1536 | False | False | 86400 | 3600 | False : False | Connected : Connected | False : False |
+--------+----------------------+-------+-------+----------+----------+---------------+-----------------------------+----------------+
| OK | aes128-sha1;modp1536 | False | False | 86400 | 3600 | True : False | Disconnected : Connected | False : False |
+--------+----------------------+-------+-------+----------+----------+---------------+-----------------------------+----------------+
| OK | aes128-sha1;modp1536 | False | False | 86400 | 3600 | True : True | Disconnected : Disconnected | False : False |
+--------+----------------------+-------+-------+----------+----------+---------------+-----------------------------+----------------+
| OK | aes128-sha1;modp1536 | False | False | 86400 | 3600 | False : True | Connected : Disconnected | False : False |
+--------+----------------------+-------+-------+----------+----------+---------------+-----------------------------+----------------+
| SKIP | aes128-sha1 | True | False | 86400 | 3600 | True : False | Disconnected : Error | True : False |
+--------+----------------------+-------+-------+----------+----------+---------------+-----------------------------+----------------+
| SKIP | aes128-sha1 | False | False | 86400 | 3600 | True : False | Disconnected : Error | True : False |
+--------+----------------------+-------+-------+----------+----------+---------------+-----------------------------+----------------+
| FAIL | aes128-sha1 | True | False | 86400 | 3600 | True : True | Disconnected : Disconnected | True : True |
+--------+----------------------+-------+-------+----------+----------+---------------+-----------------------------+----------------+
| SKIP | aes128-sha1 | True | False | 86400 | 3600 | False : False | Connected : Error | False : False |
+--------+----------------------+-------+-------+----------+----------+---------------+-----------------------------+----------------+
```
* pr/1741:
complete implementation of the StrongSwan VPN feature
Signed-off-by: Rajani Karuturi <rajani.karuturi@accelerite.com>
[4.9] CLOUDSTACK-9692: Fix password server issue in redundant VRsThe password server in RVRs has wrong parameters as the gateway of guest nics is None.
In this case, we should get the gateway from /var/cache/cloud/cmdline.
This issue is caused by commit 45642b8382
* pr/1871:
CLOUDSTACK-9692: Fix password server issue in redundant VRs
Signed-off-by: Rajani Karuturi <rajani.karuturi@accelerite.com>
The password server in RVRs has wrong parameters as the gateway of guest nics is None.
In this case, we should get the gateway from /var/cache/cloud/cmdline.
CLOUDSTACK-9615: Fixd applying ingress rules without portsWhen ingress rule is applied without ports (port start and port end params are not passed) then API/UI is showing rule got applied but in the VR, iptables rule not got applied.
Fixed this issue in the VR script.
* pr/1783:
CLOUDSTACK-9615: Fixed applying ingress rules without ports
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
As part of the bug 'CLOUDSTACK-9339 Virtual Routers don't handle Multiple Public Interfaces correctly'
issue of mismatch of traffic type represented by 'nw_type' in config sent by management server in
ip_associations.json and how it is persisted in the ips.json data bag are differnet,
is addressed, however missed the change in final merge.
this bug is to add the functionality in cs_ip.py, to lower the traffic type sent by management server before persisting in the ips.json databag
-when processing static nat rule, add a mangle table rule, to mark the traffic
from the guest vm when it has associated static nat rule so that traffic gets
routed using the route tabe of the device which has public ip associated
-fix the case where nic_device_id is empty when ip is getting disassociated
resulting in empty deviceid in ips.json
-add utility methods in CsRule, and CsRoute to add 'ip rule' and 'ip route' rules respectivley
-ensure traffic from all public interfaces are connection marked with device number, and restored
for the reverse traffic. use the connection marked number to do device specific routing table lookup
fill the device specific routing table with default route
-component tests for testing multiple public interfaces of VR
CLOUDSTACK-9598: wrong defaut gateway for the nic in non-default network when guest VM has nic's in more than one guest network set the tag for each host in /etc/dhcphosts.txt, and use the tag to add exception in /etc/dhcpopts.txt to prevent sending default route, dns server in case if the nic is in non-default network
this was the behaviour with edithosts.sh prior to 4.6
* pr/1766:
CLOUDSTACK-9598: wrong defaut gateway for the nic in non-default network
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This commit adds a additional VirtIO channel with the name
'org.qemu.guest_agent.0' to all Instances.
With the Qemu Guest Agent the Hypervisor gains more control over the Instance if
these tools are present inside the Instance, for example:
* Power control
* Flushing filesystems
* Fetching Network information
In the future this should allow safer snapshots on KVM since we can instruct the
Instance to flush the filesystems prior to snapshotting the disk.
More information: http://wiki.qemu.org/Features/QAPI/GuestAgent
Keep in mind that on Ubuntu AppArmor still needs to be disabled since the default
AppArmor profile doesn't allow libvirt to write into /var/lib/libvirt/qemu
This commit does not add any communication methods through API-calls, it merely
adds the channel to the Instances and installs the Guest Agent in the SSVMs.
With the addition of the Qemu Guest Agent channel a second channel appears in /dev
on a SSVM as a VirtIO port.
The order in which the ports are defined in the XML matters for the naming inside
the SSVM VM and by not relying on /dev/vportXX but looking for a static name the
SSVM still boots properly if the order in the XML definition is changed.
A SSVM with both ports attached will have something like this:
root@v-215-VM:~# ls -l /dev/virtio-ports
total 0
lrwxrwxrwx 1 root root 11 May 13 21:41 org.qemu.guest_agent.0 -> ../vport0p2
lrwxrwxrwx 1 root root 11 May 13 21:41 v-215-VM.vport -> ../vport0p1
root@v-215-VM:~# ls -l /dev/vport*
crw------- 1 root root 251, 1 May 13 21:41 /dev/vport0p1
crw------- 1 root root 251, 2 May 13 21:41 /dev/vport0p2
root@v-215-VM:~#
In this case the SSVM port points to /dev/vport0p1, but if the order in the XML
is different it might point to /dev/vport0p2
By looking for a portname with a pre-defined pattern in /dev/virtio-ports we
do not rely on the order in the XML definition.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
CLOUDSTACK-9498: VR CsFile search utility methods fail when search stThere is no real use of python 're' module in CsFile.py utility methods searchString, deleteLine. Regular string search is sufficient. These methods are used only for VPN user add/delete. Since VPN user password can have python 're' module meta characters, it interfere with search functionality.
Replacing re.search() with regular string search instead.
Change is confined to VPN add/delete users. Have run the test/integration/component/test_vpn_users.py
VPN remote access user limit tests ... === TestName: test_01_VPN_user_limit | Status : SUCCESS ===
ok
Test create VPN when L2TP port in use ... === TestName: test_02_use_vpn_port | Status : SUCCESS ===
ok
Test create NAT rule when VPN when L2TP enabled ... === TestName: test_03_enable_vpn_use_port | Status : SUCCESS ===
ok
Test add new users to existing VPN ... === TestName: test_04_add_new_users | Status : SUCCESS ===
ok
Test add duplicate user to existing VPN ... === TestName: test_05_add_duplicate_user | Status : SUCCESS ===
ok
Test as global admin, add a new VPN user to an existing VPN entry ... === TestName: test_06_add_VPN_user_global_admin | Status : SUCCESS ===
ok
Test as domain admin, add a new VPN user to an existing VPN entry ... === TestName: test_07_add_VPN_user_domain_admin | Status : SUCCESS ===
ok
* pr/1680:
CLOUDSTACK-9498: VR CsFile search utility methods fail when search string has 're' meta chars, and causing VPN user add/deelte to fail
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
when guest VM has nic's in more than one guest network set the tag for
each host in /etc/dhcphosts.txt, and use the tag to add exception in
/etc/dhcpopts.txt to prevent sending default route, dns server in case if the nic is in non-default network
this was the behaviour with edithosts.sh prior to 4.6
added new test case test_router_dhcp_opts to test DHCP option file use of cloudstack
The VR executes a ip route flush command as part of configurations. This command performs a
DNS lookup on the VR hostname. Since the VR does not have a DNS entry, the ip command would
wait 5 seconds before timing out and executing the flush operation. This fix adds the VR
hostname to /etc/hosts mapped to 127.0.0.1 to answer the DNS lookup – reducing the
execution time.
CLOUDSTACK-8326: Always fill UDP checksums in DHCP replies in VRIn some cases the UDP checksums in packets from DHCP servers are
incorrect. This is a problem for some DHCP clients that ignore
packets with bad checksums. This patch inserts an iptables rule
to ensure DHCP servers always send packets with correct checksums.
Due to this bug DHCP offers are sometimes not accepted by Instances.
The end-result without this fix is no connectivity for the Instance
due to the lack of a IPv4 address.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
* pr/1743:
CLOUDSTACK-8326: Always fill UDP checksums in DHCP replies in VR
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
In some cases the UDP checksums in packets from DHCP servers are
incorrect. This is a problem for some DHCP clients that ignore
packets with bad checksums. This patch inserts an iptables rule
to ensure DHCP servers always send packets with correct checksums.
Due to this bug DHCP offers are sometimes not accepted by Instances.
The end-result without this fix is no connectivity for the Instance
due to the lack of a IPv4 address.
This is also commited in OpenStack:
- https://github.com/projectcalico/felix/issues/40
- https://review.openstack.org/148718
- https://bugzilla.redhat.com/show_bug.cgi?id=910619
Signed-off-by: Wido den Hollander <wido@widodh.nl>
CLOUDSTACK-9183: bash: /opt/cloud/bin/getRouterAlerts.sh: No such file or directory
* pr/1744:
CLOUDSTACK-9183: bash: /opt/cloud/bin/getRouterAlerts.sh: No such file or directory
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
're' meta chars, and causing VPN user add/deelte to fail
-there is no real use of python 're' in CsFile.py utility methods searchString, deleteLine
Replacing with regular string search instead.
-modifying the smoke test for VPN user add/delete to have all permissable chars
resulting in internal LB vm not come up
parsing cmd_line to create 'ips' data bag, never handled internal lb vm, but still
worked due to another bug. support for internal lb vm is added with this fix
CLOUDSTACK-9480, CLOUDSTACK-9495 fix egress rule incorrect behaviorWhen 'default egress policy' is set to 'allow' in the network offering, any egress rule that is added will 'deny' the traffic overriding the default behaviour.
Conversely, when 'default egress policy' is set to 'deny' in the network offering, any egress rule that is added will 'allow' the traffic overriding the default behaviour.
While this works for 'tcp', 'udp' as expected, for 'icmp' protocol its always set to ALLOW. This patch keeps all protocols behaviour consistent.
Results of running test/integration/component/test_egress_fw_rules.py. With out the patch test_02_egress_fr2 test was failing. This patch fixes the test_02_egress_fr2 scenario.
-----------------------------------------------------------------------------------------------------
Test By-default the communication from guest n/w to public n/w is NOT allowed. ... === TestName: test_01_1_egress_fr1 | Status : SUCCESS ===
ok
Test By-default the communication from guest n/w to public n/w is allowed. ... === TestName: test_01_egress_fr1 | Status : SUCCESS ===
ok
Test Allow Communication using Egress rule with CIDR + Port Range + Protocol. ... === TestName: test_02_1_egress_fr2 | Status : SUCCESS ===
ok
Test Allow Communication using Egress rule with CIDR + Port Range + Protocol. ... === TestName: test_02_egress_fr2 | Status : SUCCESS ===
ok
Test Communication blocked with network that is other than specified ... === TestName: test_03_1_egress_fr3 | Status : SUCCESS ===
ok
Test Communication blocked with network that is other than specified ... === TestName: test_03_egress_fr3 | Status : SUCCESS ===
ok
Test Create Egress rule and check the Firewall_Rules DB table ... === TestName: test_04_1_egress_fr4 | Status : SUCCESS ===
ok
Test Create Egress rule and check the Firewall_Rules DB table ... === TestName: test_04_egress_fr4 | Status : SUCCESS ===
ok
Test Create Egress rule and check the IP tables ... SKIP: Skip
Test Create Egress rule and check the IP tables ... SKIP: Skip
Test Create Egress rule without CIDR ... === TestName: test_06_1_egress_fr6 | Status : SUCCESS ===
ok
Test Create Egress rule without CIDR ... === TestName: test_06_egress_fr6 | Status : SUCCESS ===
ok
Test Create Egress rule without End Port ... === TestName: test_07_1_egress_fr7 | Status : EXCEPTION ===
ERROR
Test Create Egress rule without End Port ... === TestName: test_07_egress_fr7 | Status : SUCCESS ===
ok
Test Port Forwarding and Egress Conflict ... SKIP: Skip
Test Port Forwarding and Egress Conflict ... SKIP: Skip
Test Delete Egress rule ... === TestName: test_09_1_egress_fr9 | Status : SUCCESS ===
ok
Test Delete Egress rule ... === TestName: test_09_egress_fr9 | Status : SUCCESS ===
ok
Test Invalid CIDR and Invalid Port ranges ... === TestName: test_10_1_egress_fr10 | Status : SUCCESS ===
ok
Test Invalid CIDR and Invalid Port ranges ... === TestName: test_10_egress_fr10 | Status : SUCCESS ===
ok
Test Regression on Firewall + PF + LB + SNAT ... === TestName: test_11_1_egress_fr11 | Status : SUCCESS ===
ok
Test Regression on Firewall + PF + LB + SNAT ... === TestName: test_11_egress_fr11 | Status : SUCCESS ===
ok
Test Reboot Router ... === TestName: test_12_1_egress_fr12 | Status : SUCCESS ===
ok
Test Reboot Router ... === TestName: test_12_egress_fr12 | Status : EXCEPTION ===
ERROR
Test Redundant Router : Master failover ... === TestName: test_13_1_egress_fr13 | Status : SUCCESS ===
ok
Test Redundant Router : Master failover ... === TestName: test_13_egress_fr13 | Status : SUCCESS ===
ok
-----------------------------------------------------------------------------------------------------
* pr/1666:
fix egress rule incorrect behavior
Signed-off-by: Rajani Karuturi <rajani.karuturi@accelerite.com>
CLOUDSTACK-9480: Egress Firewall: Incorrect use of Allow/Deny for ICMP
fix ensures, ICMP, TCP, UDP are handled similalry w.r.t egress rule action
CLOUDSTACK-9495: Egress rules functionalty broken when protocol=all specified
when protocol=all specified, CIDR was ignored. Fix ensures if CIDR is specified
its always used in configuring iptable rules
2 new test cased to test /32 CIDR
DNS on VR should not be publically accessible as it may be prone to DNS
amplification/reflection attacks. This fixes the issue by only allowing VR
DNS (port 53) to be accessible from guest network cidr, as per the fix in:
https://issues.apache.org/jira/browse/CLOUDSTACK-6432
- Only allows guest network cidrs to query VR DNS on port 53.
- Includes marvin smoke test that checks the VR DNS accessibility checks from
guest and non-guest network.
- Fixes Marvin sshClient to avoid using ssh agent when password is provided,
previous some environments may have seen 'No existing session' exception without
this fix.
- Adds a new dnspython dependency that is used to perform dns resolutions in the
tests.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Often, patch and security releases do not require schema migrations or
data migrations. However, if an empty upgrade class and associated
scripts are not defined, the upgrade process will break. With this
change, if a release does not have an upgrade, a noop DbUpgrade is added
to the upgrade path. This approach allows the upgrade to proceed and
for the database to properly reflect the installed version. This change
should make the release process simpler as RMs no longer need to
rememeber to create this boilerplate code when starting a new release.
Beginning with the 4.8.2.0 and 4.9.1.0 releases, the project will
formally adopt a four (4) position release number to properly accomodate
rekeases that contain only CVE fixes. The DatabaseUpgradeChecker and
Version classes made assumptions that they would always parse and
compare three (3) position version numbers. This change adds the
CloudStackVersion value object that supports both three (3) and four (4)
version numbers. It encapsulates version comparsion logic, as well as,
the rules to allow three (3) and four (4) to interoperate.
* Modifies DatabaseUpgradeChecker to handle derive an upgrade path for
a version that was not explicitly specified. It determines the
releases the first release before it with database migrations and uses
that list as the basis for the list for version being calculated. A
noop upgrade is then added to the list which causes no schema changes
or data migrations, but will update the database to the version.
* Adds unit tests for the upgrade path calculation logic in
DatabaseUpgradeChecker
* Removes dummy upgrade logic for the 4.8.2.0 introduced in previous
versions of this patch
* Introduces the CloudStackVersion value object which parses and
compares three (3) and four (4) position version numbers. This class
is intended to replace com.cloud.maint.Version.
* Adds the junit-dataprovider dependency -- allowing test data to be
concisely generated separately from the execution of a test case.
Used extensively in the CloudStackVersionTest.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Ensure that FW_EGRESS_RULE chain exists after upgrading the router
- Flush allow all egress rule on 0.0.0.0/0, if such a rule exists in the config
it will be added later (CLOUDSTACK-9437)
[CLOUDSTACK-9430] Added fix for adding/editing Network ACL rule orderingBUG: https://issues.apache.org/jira/browse/CLOUDSTACK-9430
The issue occurred because all of the ACL rules get inserted before the old ones. Then, the cleanup deletes the duplicate rows, and leaves any new rule in front of the old ones.
Here is an example with a simplified iptables view for ACL
Ex: adding a rule 4
before add:
1,2,3
during add:
1',2',3',4',1,2,3
after add:
4',1,2,3
After fix:
before add:
1,2,3
during add:
1,2,3,1',2',3',4'
after add:
1',2',3',4'
* pr/1609:
Added fix for adding/editing Network ACL rule ordering
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CLOUDSTACK-9342: Site to Site VPN PFS not being set correctlyBug in code set PFS to the same value (yes/no) as DPD.
file.addeq(" pfs=%s" % CsHelper.bool_to_yn(obj['dpd']))
* pr/1480:
CLOUDSTACK-9342: Site to Site VPN PFS not being set correctly
Signed-off-by: Will Stevens <williamstevens@gmail.com>
* Take setup from vhost.template rather than default(-ssl)
* should move into Python CS code as well
* Move CORS setup to separate conf
* Modify vhost template to Optionally include the cors file
* Add NameVirtualHost to vhost template for feature parity with ports.conf
* Take setup from vhost.template rather than default(-ssl)
[CLOUDSTACK-9296] Start ipsec for client VPNThis fix starts the IPSEC daemon when enabling client side vpn
* pr/1423:
[CLOUDSTACK-9296] Start ipsec for client VPN
Signed-off-by: Will Stevens <williamstevens@gmail.com>
Lower the time we wait for interfaces to appearWaiting for interfaces is tricky. They might never appear.. for example when we have entries in `/etc/cloudstack/ips.json` that haven't been plugged yet. Waiting this long makes everything horribly slow (every vm, interface, static route, etc, etc, will hit this wait, for every device). We've seen CloudStack send an `ip_assoc.json` command for `eth1` public nic only and then the router goes crazy waiting for all other interfaces that were there before reboot and aren't there. If only the router would return to the mgt server a success of `eth1`, it would get the command for `eth2` etc etc. Obviously, a destroy works much faster because no state services, so no knowledge of previous devices so no waits :-)
After a stop/start the router has state in `/etc/cloudstack/ips.json` and every commands waits. Eventually hitting the hardcoded 120 sec timeout.
* pr/1471:
lower the time we wait for interfaces to appear
Signed-off-by: Will Stevens <williamstevens@gmail.com>
Add Java Default Certificat Authorities into the keystore if using a custom cert SSL
Related to CLOUDSTACK-1475
* pr/1555:
Add Java Default Certificat Authorities into the keystore if using a custom cert SSL Related to CLOUDSTACK-1475 Fix some english message
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CLOUDSTACK-6975: Prevent dnsmasq from starting on backup redundant RvRRebase of PR #1509 against the 4.7 branch as requested by @swill
One LGTM from @ustcweizhou carried from previous PR. Previous PR will be closed.
Description from PR #1509:
CLOUDSTACK-6975 refers to service monitoring bringing up dnsmasq but this is no-longer accurate, as service monitoring is not active on the post-4.6 routers. These routers still suffer an essentially identical issue, however, because "dnsmasq needs to be restarted each time configure.py is called in order to avoid lease problems." As such, dnsmasq is still running on backup RvRs, causing the issues described in CLOUDSTACK-6975.
This PR is based on a patch submitted by @ustcweizhou. The code now checks the redundant state of the router before restarting dnsmasq.
RvR networks without this patch have dnsmasq running on both master and backup routers. RvR networks with this patch have dnsmasq running on only the master router.
* pr/1514:
CLOUDSTACK-6975: Prevent dnsmasq from starting on backup redundant RvR.
Signed-off-by: Will Stevens <williamstevens@gmail.com>
SystemVM cleanupsfrom the logrotate docs
> size - With this, the log file is rotated when the specified size is reached. Size may be specified in bytes (default), kilobytes (sizek), or megabytes (sizem).
> Note: If size and time interval options are specified at same time, only size option take effect. it causes log files to be rotated without regard for the last rotation time. If both log size and timestamp of a log file need to be considered by logrotate, the minsize option should be used. logrotate will rotate log file when they grow bigger than minsize, but not before the additionally specified time interval.
* pr/1414:
systemvm, logrotate: remove daily explicitly as it is ignored
Signed-off-by: Will Stevens <williamstevens@gmail.com>
* 4.7:
Fix Sync of template.properties in Swift
Configure rVPC for router.redundant.vrrp.interval advert_int setting
Have rVPCs use the router.redundant.vrrp.interval setting
Resolve conflict as forceencap is already in master
Split the cidr lists so we won't hit the iptables-resture limits
Check the existence of 'forceencap' parameter before use
Do not load previous firewall rules as we replace everyhing anyway
Wait for dnsmasq to finish restart
Remove duplicate spaces, and thus duplicate rules.
Restore iptables at once using iptables-restore instead of calling iptables numerous times
Add iptables copnversion script.
Reimplement router.redundant.vrrp.interval settingGlobal setting `router.redundant.vrrp.interval` is not used any more and it is now set to a hardcoded 1.
This results in a failover from master->backup when the backup doesn't hear from the master in ~3.6sec. This is a bit too tight, as we've seen failovers during live migrations. We could reproduce it in about half of the cases. Setting this to setting to 2 (tested it by hardcoding it in the systemvms) gives twice as much time and we didn't see issues any more. Instead of updating the hardcoded setting from 1 to 2, I reimplemented the global setting by sending it to the router with the cmd_line, as the non-VPC router also does.
Background:
Why is the maximum failover time in the example 3.6 seconds? This comes from the advertisement interval and the skew time. The default advertisement interval is 1 second (configurable in keepalived.conf). The skew time helps to keep everyone from trying to transition at once. It is a number between 0 and 1, based on the formula (256 - priority) / 256
As defined in the RFC, the backup must receive an advertisement from the master every (3 * advert_int) + skew_time seconds. If it doesn't hear anything from the master, it takes over. With a backup router priority of 100 (as in the example), the failover will happen at most 3.6 seconds after the master goes down.
Source: http://www.hollenback.net/KeepalivedForNetworkReliability
* pr/1486:
Configure rVPC for router.redundant.vrrp.interval advert_int setting
Have rVPCs use the router.redundant.vrrp.interval setting
Signed-off-by: Will Stevens <williamstevens@gmail.com>
Restore iptables at once using iptables-restore instead of calling iptables numerous timesThis makes handling the firewall rules about 50-60 times faster because it is generated in memory and then loaded once. It's work by @borisroman see PR #1400. Reopened it here because I think this is a great improvement.
* pr/1482:
Resolve conflict as forceencap is already in master
Split the cidr lists so we won't hit the iptables-resture limits
Check the existence of 'forceencap' parameter before use
Do not load previous firewall rules as we replace everyhing anyway
Wait for dnsmasq to finish restart
Remove duplicate spaces, and thus duplicate rules.
Restore iptables at once using iptables-restore instead of calling iptables numerous times
Add iptables copnversion script.
Signed-off-by: Will Stevens <williamstevens@gmail.com>
Honour GS use_ext_dns and redundant VR VIPThis patch addresses two issues:
On redundant VR setups, the primary resolver being handed out to instances is the guest_ip (primary IP for the VR). This might lead to problems upon failover, at least while the DHCP lease doesn't update (because the primary resolver will be checked first until times out, however it'll be gone upon failover).
If Global Setting use_ext_dns is true, we don't want the VR to be the primary resolver at all.
* pr/1536:
This patch addresses two issues:
Signed-off-by: Will Stevens <williamstevens@gmail.com>
* 4.8:
CLOUDSTACK-9287 - Improve test by checking if pvt gw is removed and fix typos
Handle private gateways more reliably
CLOUDSTACK-9287 - Fix RVR public interface
CLOUDSTACK-9287 - Add integration test to cover the private gateway related changes
CLOUDSTACK-9287 - Refactor the interface state configuration
CLOUDSTACK-9287 - Check if the nic profile has already been removed from a certain router
CLOUDSTACK-9287 - Bring up the private gw interface on state change to master
CLOUDSTACK-9287 - Make sure private gw interface is not used for default gw
CLOUDSTACK-9287 - Add integration test to cover the private gw interface/mac address issues
CLOUDSTACK-9287 - Put private gateway interface down on backup router
CLOUDSTACK-9287 - Generate new mac address if router is redundant and nic profile exists
Add private gateway IP to router initialization config
apply static routes on change to master state
* 4.7:
CLOUDSTACK-9287 - Improve test by checking if pvt gw is removed and fix typos
Handle private gateways more reliably
CLOUDSTACK-9287 - Fix RVR public interface
CLOUDSTACK-9287 - Add integration test to cover the private gateway related changes
CLOUDSTACK-9287 - Refactor the interface state configuration
CLOUDSTACK-9287 - Check if the nic profile has already been removed from a certain router
CLOUDSTACK-9287 - Bring up the private gw interface on state change to master
CLOUDSTACK-9287 - Make sure private gw interface is not used for default gw
CLOUDSTACK-9287 - Add integration test to cover the private gw interface/mac address issues
CLOUDSTACK-9287 - Put private gateway interface down on backup router
CLOUDSTACK-9287 - Generate new mac address if router is redundant and nic profile exists
Add private gateway IP to router initialization config
apply static routes on change to master state
Handle private gateways more reliablyWhen initialising a VPC router we need to know which IP/device corresponds to a private gateway. This is to solve a problem when stop/starting a VPC router (which gets the private gateway config as a guest network and as a result breaks the functionality). You read it right, the private gateway is sent as type=guest after reboot and type=public initially.
Before this change, you could add a private gw to a running router but you couldn't restart it (it would mix up the tiers). Now the private gateway is detected properly and it works just fine.
Booting without private gateway:
```
root@r-167-VM:~# cat /etc/cloudstack/cmdline.json
{
"config": {
"baremetalnotificationapikey": "V2l1u3wKJVan01h8kq63-5Y5Ia3VLEW1v_Z6i-31QIRJXlt5vkqaqf6DVcdK0jP3u79SW6X9pqJSLSwQP2c2Rw",
"baremetalnotificationsecuritykey": "OXI16srCrxFBi-xOtEwcYqwLlMfSFTlTg66YHtXBBqR7HNN1us3HP5zWOKxfVmz4a3C1kUNLPrUH13gNmZlu4w",
"disable_rp_filter": "true",
"dns1": "8.8.8.8",
"domain": "cs2cloud",
"eth0ip": "169.254.0.42",
"eth0mask": "255.255.0.0",
"host": "192.168.22.61",
"name": "r-167-VM",
"port": "8080",
"privategateway": "None",
"redundant_router": "false",
"template": "domP",
"type": "vpcrouter",
"vpccidr": "10.0.0.0/24"
},
"id": "cmdline"
```
Booting with private gateway:
```
root@r-167-VM:~# cat /etc/cloudstack/cmdline.json
{
"config": {
"baremetalnotificationapikey": "V2l1u3wKJVan01h8kq63-5Y5Ia3VLEW1v_Z6i-31QIRJXlt5vkqaqf6DVcdK0jP3u79SW6X9pqJSLSwQP2c2Rw",
"baremetalnotificationsecuritykey": "OXI16srCrxFBi-xOtEwcYqwLlMfSFTlTg66YHtXBBqR7HNN1us3HP5zWOKxfVmz4a3C1kUNLPrUH13gNmZlu4w",
"disable_rp_filter": "true",
"dns1": "8.8.8.8",
"domain": "cs2cloud",
"eth0ip": "169.254.2.227",
"eth0mask": "255.255.0.0",
"host": "192.168.22.61",
"name": "r-167-VM",
"port": "8080",
"privategateway": "10.201.10.1",
"redundant_router": "false",
"template": "domP",
"type": "vpcrouter",
"vpccidr": "10.0.0.0/24"
},
"id": "cmdline"
```
And:
```
cat cmdline
vpccidr=10.0.0.0/24 domain=cs2cloud dns1=8.8.8.8 privategateway=10.201.10.1 template=domP name=r-167-VM eth0ip=169.254.2.227 eth0mask=255.255.0.0 type=vpcrouter disable_rp_filter=true baremetalnotificationsecuritykey=OXI16srCrxFBi-xOtEwcYqwLlMfSFTlTg66YHtXBBqR7HNN1us3HP5zWOKxfVmz4a3C1kUNLPrUH13gNmZlu4w baremetalnotificationapikey=V2l1u3wKJVan01h8kq63-5Y5Ia3VLEW1v_Z6i-31QIRJXlt5vkqaqf6DVcdK0jP3u79SW6X9pqJSLSwQP2c2Rw host=192.168.22.61 port=8080
```
Logs:
```
2016-02-24 20:08:45,723 DEBUG [c.c.n.r.VpcVirtualNetworkApplianceManagerImpl] (Work-Job-Executor-4:ctx-458d4c52 job-1402/job-1403 ctx-d5355fca) (logid:5772906c) Set privategateway field in cmd_line.json to 10.201.10.1
```
* pr/1474:
Handle private gateways more reliably
Add private gateway IP to router initialization config
Signed-off-by: Will Stevens <williamstevens@gmail.com>
Apply static routes on change to master stateRefactored static routes for private gateways so they also get loaded when the router switches to master state. Otherwise they're lost and connections drop after fail over.
* pr/1472:
apply static routes on change to master state
Signed-off-by: Will Stevens <williamstevens@gmail.com>
CLOUDSTACK-9287 - Fix unique mac address per rVPC routerThis is work by @wilderrodrigues, see PR #1413 It contains important fixes and I think it needs to be included so I send the PR again.
* pr/1483:
CLOUDSTACK-9287 - Improve test by checking if pvt gw is removed and fix typos
CLOUDSTACK-9287 - Fix RVR public interface
CLOUDSTACK-9287 - Add integration test to cover the private gateway related changes
CLOUDSTACK-9287 - Refactor the interface state configuration
CLOUDSTACK-9287 - Check if the nic profile has already been removed from a certain router
CLOUDSTACK-9287 - Bring up the private gw interface on state change to master
CLOUDSTACK-9287 - Make sure private gw interface is not used for default gw
CLOUDSTACK-9287 - Add integration test to cover the private gw interface/mac address issues
CLOUDSTACK-9287 - Put private gateway interface down on backup router
CLOUDSTACK-9287 - Generate new mac address if router is redundant and nic profile exists
Signed-off-by: Will Stevens <williamstevens@gmail.com>
On redundant VR setups, the primary resolver being handed out to instances is the guest_ip (primary IP for the VR). This might lead to problems upon failover, at least while the DHCP lease doesn't update (because the primary resolver will be checked first until times out, however it'll be gone upon failover).
If Global Setting use_ext_dns is true, we don't want the VR to be the primary resolver at all.
CLOUDSTACK-9336 surround the execution of baremetal-vr.py with condition
* pr/1463:
CLOUDSTACK-9336 surround the execution of baremetal-vr.py with condition
Signed-off-by: Will Stevens <williamstevens@gmail.com>
If the size directive is used, logrotate will ignore the daily, weekly, monthly,
and yearly directives.
remove cloud-cleanup
This script does not do anything because it fails due missing /var/log/cloud directory. Logrotate is used for this functionality.
* 4.8:
CLOUDSTACK-9172 Added cross zones check to delete template and iso
Check the existence of 'forceencap' parameter before use
systemvm: set default umask 022 in injectkeys.sh
* 4.7:
CLOUDSTACK-9172 Added cross zones check to delete template and iso
Check the existence of 'forceencap' parameter before use
systemvm: set default umask 022 in injectkeys.sh
Check the existence of 'forceencap' parameter before useCheck the existence of 'forceencap' parameter before use.
Error seen:
```
Traceback (most recent call last):
File "/opt/cloud/bin/update_config.py", line 140, in <module>
process_file()
File "/opt/cloud/bin/update_config.py", line 54, in process_file
finish_config()
File "/opt/cloud/bin/update_config.py", line 44, in finish_config
returncode = configure.main(sys.argv)
File "/opt/cloud/bin/configure.py", line 1003, in main
vpns.process()
File "/opt/cloud/bin/configure.py", line 488, in process
self.configure_ipsec(self.dbag[vpn])
File "/opt/cloud/bin/configure.py", line 544, in configure_ipsec
file.addeq(" forceencaps=%s" % CsHelper.bool_to_yn(obj['encap']))
KeyError: 'encap'
```
* pr/1402:
Check the existence of 'forceencap' parameter before use
Signed-off-by: Will Stevens <williamstevens@gmail.com>
They might never appear.. for example when we have entries in
/etc/cloudstack/ips.json that haven't been plugged yet. Waiting
this long makes everything horribly slow (every vm, interface,
static route, etc, etc, will hit this wait, for every device).
* 4.8:
Display hostname the VPC router runs on
CLOUDSTACK-9266: Make deleting static routes in private gw work
CLOUDSTACK-9264: Make /32 static routes for private gw work
* 4.7:
Display hostname the VPC router runs on
CLOUDSTACK-9266: Make deleting static routes in private gw work
CLOUDSTACK-9264: Make /32 static routes for private gw work
CLOUDSTACK-9256 add unique key for static routes in jsonStatic routes that are being set do not show up in the static_routes.json file. The reason for this is that the index that is used, is the gateway address, which is not unique. Hence stuff is overwritten and lost.
Ping @borisroman @wilderrodrigues @DaanHoogland
* pr/1364:
CLOUDSTACK-9256 add unique key for static routes in json
Signed-off-by: Remi Bergsma <github@remi.nl>
* 4.7:
CLOUDSTACK-9254: Make longer names display pretty
CLOUDSTACK-9245 - Deletes ACL items when destroying the VPC or deleting the ACL itself
CLOUDSTACK-9245 - Formatting NetworkACLServiceImpl class
CLOUDSTACK-9245 - Formatting VpcManagerImpl class
CLOUDSTACK-9245 - Formatting NetworkACLManagerImpl class
More VR performance!
* 4.7:
Refactor public ip retrieval into method
CLOUDSTACK-9244 Fix setting up RFC1918 routes
CLOUDSTACK-9239 throw exception on deprecated command
Enhance VR performance by selectively executing tasks instead of brute-forcing
CLOUDSTACK-9236: Load Balancing Health Check button displayed when non-NetScaler offering is used
* 4.7:
CLOUDSTACK-9154 - Sets the pub interface down when all guest nets are gone
CLOUDSTACK-9187 - Makes code ready for more something like ethXXXX, if we ever get that far
CLOUDSTACK-9188 - Reads network GC interval and wait from configDao
CLOUDSTACK-9187 - Fixes interface allocation to VRRP instances
CLOUDSTACK-9187 - Adds test to cover multiple nics and nic removal
CLOUDSTACK-9154 - Adds test to cover nics state after GC
CLOUDSTACK-9154 - Returns the guest iterface that is marked as added
Conflicts:
engine/orchestration/src/org/apache/cloudstack/engine/orchestration/NetworkOrchestrator.java
[4.7] Critical VPCVR issues fixed: CLOUDSTACK-9154; CLOUDSTACK-9187; and CLOUDSTACK-9188This PR applies the same fixes as in the PR #1259, but against branch 4.7.
Please refer to PR #1259 for the tests results and all the comments already made there.
Issues fixed are:
* CLOUDSTACK-9154: rVPC doesn't recover from cleaning up of network garbage collector
* CLOUDSTACK-9187: rVPC routers in Master/Master due to concurrency problem when writing the keepalivd.conf
* CLOUDSTACK-9188: NetworkGarbageCollector is not using gc.interval and gc.wait from settings
Those changes have been covered by 2 new tests added to ```smoke/test_vpc_redundant.py```:
* test_04_rvpc_network_garbage_collector_nics
* test_05_rvpc_multi_tiers
The test ```test_04_rvpc_network_garbage_collector_nics``` depends on the global settings for the network.gc.interval and gc.wait. If one wants the test to run quicker, please change the settings (default is 600 seconds for each) and restart the Management Server before running the tests. I would suggest to set it to 60 seconds.
In addition, the NetworkGarbageCollector was redefining the settings above mentioned and not reading their values through ConfigDao. Due to that, the settings were not being applied properly and the test was waiting to long to check the VPC routers.
* pr/1277:
CLOUDSTACK-9154 - Sets the pub interface down when all guest nets are gone
CLOUDSTACK-9187 - Makes code ready for more something like ethXXXX, if we ever get that far
CLOUDSTACK-9188 - Reads network GC interval and wait from configDao
CLOUDSTACK-9187 - Fixes interface allocation to VRRP instances
CLOUDSTACK-9187 - Adds test to cover multiple nics and nic removal
CLOUDSTACK-9154 - Adds test to cover nics state after GC
CLOUDSTACK-9154 - Returns the guest iterface that is marked as added
Signed-off-by: Remi Bergsma <github@remi.nl>
* 4.7:
CLOUDSTACK-9222 Prevent cloud.log.1 filling up the disk
Add integration test for restartVPC with cleanup, and Private Gateway enabled.
Nullpointer Exception in NicProfileHelperImpl
CLOUDSTACK-9222 Prevent cloud.log.1 filling up the diskDelay Compress results in more space usage than needed. Since we have copy truncate we don't need it.
* pr/1329:
CLOUDSTACK-9222 Prevent cloud.log.1 filling up the disk
Signed-off-by: Remi Bergsma <github@remi.nl>
* 4.7:
Fix unable to setup more than one Site2Site VPN Connection
FIX S2S VPN rVPC: Check only redundant routers in state MASTER
PEP8 of integration/smoke/test_vpc_vpn
Add S2S VPN test for Redundant VPC
Make integration/smoke/test_vpc_vpn Hypervisor independant
FIX VPN: non-working ipsec commands
[UI] MADNESS
[DB] Add force_encap field to s2s_customer_gateway table
[ROUTER] Add forceencaps field to python router ipsec config method
[TEST] unittest needs rework
[MARVIN] Add forceencap field to VpnCustomerGateway class in marvin base
[CORE] Add Force UDP Encapsulation option to Site2Site VPN
CLOUDSTACK-9186: Root admin cannot see VPC created by Domain admin user
CLOUDSTACK-9192: UpdateVpnCustomerGateway is failing
CLOUDSTACK-6485 prevent ip asignment of private gw iface
CLOUDSTACK-9204 Do not error when staticroute is already gone
make both check lines consistent
CLOUDSTACK-9181 Prevent syntax error in checkrouter.sh
CLOUDSTACK-9202 Bump ssh timeout
[4.7] FIX Site2SiteVPN on redundant VPCThis PR:
- fixes the inability to setup more than one Site2Site VPN connection from a VPC
- fixes starting of Site2Site VPN on redundant VPC
- fixes Site2Site VPN state checking on redundant VPC
- improves the vpc_vpn test to allow multple hypervisors
- adds an integration test for Site2Site VPN on redundant VPC
Tested it on 4.7 single Xen server zone:
command:
```
nosetests --with-marvin --marvin-config=/data/shared/marvin/mct-zone1-xen1.cfg -a tags=advanced,required_hardware=true /tmp/test_vpc_vpn.py
```
results:
```
Test Site 2 Site VPN Across redundant VPCs ... === TestName: test_01_redundant_vpc_site2site_vpn | Status : SUCCESS ===
ok
Test Remote Access VPN in VPC ... === TestName: test_01_vpc_remote_access_vpn | Status : SUCCESS ===
ok
Test Site 2 Site VPN Across VPCs ... === TestName: test_01_vpc_site2site_vpn | Status : SUCCESS ===
ok
----------------------------------------------------------------------
Ran 3 tests in 1490.076s
OK
```
also performed numerous manual inspections of state of VPN connections and connectivity between VPC's
* pr/1276:
Fix unable to setup more than one Site2Site VPN Connection
FIX S2S VPN rVPC: Check only redundant routers in state MASTER
PEP8 of integration/smoke/test_vpc_vpn
Add S2S VPN test for Redundant VPC
Make integration/smoke/test_vpc_vpn Hypervisor independant
FIX VPN: non-working ipsec commands
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-9181 Prevent syntax error in checkrouter.shAdded quotes to prevent syntax errors in weird situations.
Error seen in mgt server:
```
2015-12-15 14:30:32,371 DEBUG [c.c.a.m.AgentManagerImpl] (RedundantRouterStatusMonitor-7:ctx-0dd8ef3e) Details from executing class com.cloud.agent.api.CheckRouterCommand: Status: UNKNOWN
/opt/cloud/bin/checkrouter.sh: line 28: [: =: unary operator expected
/opt/cloud/bin/checkrouter.sh: line 31: [: =: unary operator expected
```
Cause:
```
root@r-1191-VM:/opt/cloud/bin# ./checkrouter.sh
./checkrouter.sh: line 28: [: =: unary operator expected
./checkrouter.sh: line 31: [: =: unary operator expected
Status: UNKNOWN
```
Somehow a nic was missing.
After fix the script can handle this:
```
root@r-1191-VM:/opt/cloud/bin# ./checkrouter.sh
Status: UNKNOWN
```
The other states are also reported fine:
```
root@r-1191-VM:/opt/cloud/bin# ./checkrouter.sh
Status: MASTER
```
```
root@r-1192-VM:/opt/cloud/bin# ./checkrouter.sh
Status: BACKUP
```
While at it, I also removed the INTERFACES variable/constant as it was only used once and hardcoded the second time. Now both are hardcoded and easier to read.
* pr/1296:
make both check lines consistent
CLOUDSTACK-9181 Prevent syntax error in checkrouter.sh
Signed-off-by: Remi Bergsma <github@remi.nl>
CLOUDSTACK-9204 Do not error when staticroute is already goneWhen deleting a static route fails because it isn't there any more (KeyError), it should succeed instead.
Error seen:
```
[INFO] Processing JSON file static_routes.json.1451560145
Traceback (most recent call last):
File "/opt/cloud/bin/update_config.py", line 140, in <module>
process_file()
File "/opt/cloud/bin/update_config.py", line 52, in process_file
qf.load(None)
File "/opt/cloud/bin/merge.py", line 258, in load
proc = updateDataBag(self)
File "/opt/cloud/bin/merge.py", line 91, in _init_
self.process()
File "/opt/cloud/bin/merge.py", line 131, in process
dbag = self.process_staticroutes(self.db.getDataBag())
File "/opt/cloud/bin/merge.py", line 179, in process_staticroutes
return cs_staticroutes.merge(dbag, self.qFile.data)
File "/opt/cloud/bin/cs_staticroutes.py", line 26, in merge
del dbag[key]
KeyError: u'192.168.0.3'
```
* pr/1298:
CLOUDSTACK-9204 Do not error when staticroute is already gone
Signed-off-by: Remi Bergsma <github@remi.nl>
- Refactors the set_backup, set_master and set_fault methods to have better names for the variable
- Increase the sleep on the test in order to wait for the routers to be ready. It's now 3 times the GC settings
CLOUDSTACK-9155 make sure logrotate is effective for cloud.logMany processes on the VRs log to cloud.log. When log rotate kicks in, the file is rotated but the scripts still write to the old inode (cloud.log.1 after rotate). Tis quickly fills up the tiny log partition.
Using 'copytruncate' is a small tradeoff, there is a slight change of missing a log entry, but in the old situation nothing ended up in cloud.log after rotate (except for stuff that was (re)started) so I think this is the best solution until we properly rewrite the script to either use their own script or syslog.
More details: https://issues.apache.org/jira/browse/CLOUDSTACK-9155
* pr/1235:
CLOUDSTACK-9155 make sure logrotate is effective
Signed-off-by: Remi Bergsma <github@remi.nl>
Many processes on the VRs log to cloud.log. When logrotate
kicks in, the file is rotated but the scripts still write
to the old inode (cloud.log.1 after rotate). Tis quickly
fills up the tiny log partition.
Using 'copytruncate' is a tradeoff, there is a slight
change of missing a log entry, but in the old situation
we were missing all of them after logrotate.
CLOUDSTACK-9151 - As a Developer I want the VRID to be set within the limits of KeepaliveDThis PR fixes a blocker issue!
- Just like with RVRs, use the VRID 51 instead of making it dependent on the VPCID
- Reason: arbitary unique number 0..255 used to differentiate multiple instances of vrrpd running on the same NIC (and hence same socket). virtual_router_id 51
* pr/1231:
CLOUDSTACK-9151 - Removes the replacement of the VRID in the CsRedundant file
Signed-off-by: Remi Bergsma <github@remi.nl>
Updating pom.xml version numbers for release 4.6.2-SNAPSHOTSet next version in 4.6 release branch to version 4.6.2-SNAPSHOT.
Using ` ./tools/build/setnextversion.sh`.
Ping @bhaisaab @DaanHoogland before we merge this, how will we be creating the upgrade paths from 4.6.2 to 4.7? After this PR is merged, we need to manually do a fwd-merge and make sure we keep the pom versions in master/4.7. Much like in #1071.
* pr/1186:
Fixed typo in iam/pom.xml
Updating pom.xml version numbers for release 4.6.2-SNAPSHOT
Signed-off-by: Daan Hoogland <daan@onecht.net>
- Just like with RVRs, use the VRID 51 instead of making it dependent on the VPCID
- Reason: arbitary unique number 0..255 used to differentiate multiple instances of vrrpd running on the same NIC (and hence same socket). virtual_router_id 51
Setup general route for RFC 1918 space, as otherwise it will be sent to
the public gateway and not work. More specific routes that may be set
have preference over this generic routes.
When public network is RFC1918, we do not setup the routes to avoid
problems with internal-only deployments.
* 4.6:
CLOUDSTACK-9106 - Makes Enum name compliant with Java code conventions.
CLOUDSTACK-9106 - Adds a test to cover the changes in the applyVpnUsers() method
CLOUDSTACK-9106 - Makes the router commands call more consistent.
CLOUDSTACK-9106 - Enables private gateway tests on Redundant VPCs
CLOUDSTACK-9106 - Refactor the createPrivateNicProfileForGateway() method
CLOUDSTACK-9106 - Reduces the amount of iterations through the routers of a VPC
Add support for not (re)starting server after cloud-setup-management.
Closed PRs that will not be considered for merge:
This closes#1158
This closes#1097
- Use the router to retrieve the instance ID
- Check if the VPC is redundant in order to reuse the private gateway address.
- Brings the private gateways interfaces up.
CLOUDSTACK-9105: Logging enhancement: Handle/reference to track API calls end to end in the MS logs
Added logid to logging framework, now all API call logs can be tracked with this id end to end
* pr/1167:
CLOUDSTACK-9105: Logging enhancement: Handle/reference to track API calls end to end in the MS logs Added logid to logging framework, now all API call logs can be tracked with this id end to end
Signed-off-by: Daan Hoogland <daan@onecht.net>
Send arping to the gateway instead of our own addressWe need to send an Unsolicited ARP to the gateway, instead of our own address. We now encounter problems when people deploy/destroy/deploy and get the same public ip.
Packets arrive, but with incorrect / cached mac and are ignored by the routervm kernel.
Run arping manually to update the arp-cache on the gateway and things start to work.
Then we discovered the `arping` is actually done, but sent to its own address. Therefore the gateway doesn't pick it up. We only saw this happening when rapid deploy tools are used, like Terraform that do deploy/destroy/deploy and might get the same ip but on a new router having a new mac.
```
2015-12-03 18:07:25,589 CsHelper.py execute:160 Executing: arping -c 1 -I eth1 -A -U -s 192.168.23.8 192.168.23.1
```
The integration tests seem happy, although the full run is still ongoing:
```
=== TestName: test_01_create_redundant_VPC_2tiers_4VMs_4IPs_4PF_ACL | Status : SUCCESS ===
```
Thanks @sspans for helping trouble shoot this. Ping @wilderrodrigues can you review please?
* pr/1163:
CLOUDSTACK-9097 Make public ip work immediately
Signed-off-by: Remi Bergsma <github@remi.nl>
* 4.6:
CLOUDSTACK-9075 - Uses the same vlan since it should have been already released
CLOUDSTACK-9075 - Adds VPC static routes test
CLOUDSTACK-9075 - Covers Private GW ACL with Redundant VPCs
CLOUDSTACK-9075 - Add method to get list of Physical Networks per zone
CLOUDSTACK-6276 Removing unused parameter in integration test for projects
CLOUDSTACK-6276 Removing unused parameter in integration test
CLOUDSTACK-6276 Fixing affinity groups for projects
We need to send an Unsolicited ARP to the gateway, instead of our own address. We now encounter problems when people deploy/destroy/deploy and get the same public ip.
CLOUDSTACK-9062: Improve S3 implementation.The S3 implementation is far from finished, this commit focuses on the bases.
- Upgrade AWS SDK to latest version.
- Rewrite S3 Template downloader.
- Rewrite S3Utils utility class.
- Improve addImageStoreS3 API command.
- Split various classes for convenience.
- Various minor improvements and code optimizations.
A side effect of the new AWS SDK is that it, by default, uses the V4 signature. Therefore I added an option to specify the Signer, so it stays compatible with previous versions.
Please review thoroughly, both code inspection and (automated) integration tests. Currently no integration tests are available specifically for S3. Therefore the implementation is needed to be tested manually, for now...
What I tested:
- Greenfield install -> will download latest systemvm template automatically to S3.
- Upload a template/iso
- Download a template/iso
- Restart of management server -> list available templates -> doesn't download them again if available.
* pr/1083:
CLOUDSTACK-9062: Improve S3 implementation.
Signed-off-by: Remi Bergsma <github@remi.nl>
* 4.6:
CLOUDSTACK-9015 - Delete public IP in order to get both IP and NAT rule removed.
CLOUDSTACK-9015 - Add test to cover the rVPC routers stop/start/reboot scenario
CLOUDSTACK-9015 - Make sure the Backup router can talk to the Master router after a stop/start/reboot
CLOUDSTACK-9067 - As I developer I want to remove all the unused router-shell scripts from ACSThis PR removes the unused shell scripts that were present in the ACS project. Those script were replaced by the.
Some of the scripts are used by the HyperV Resource, which were hardcoded. I took the opportunity to use the Java constants over there as well, so the next one touching the code will know they exist and won't hardcode anything.
The following task were applied:
* Remove the shell files and the Java constants that were mapping them;
* Apply the use of the Java constants to the HyperV Resource class;
* Wrap the String.format() method in the StringUtils so we can test the changes in the HyperV Resource class.
The last point was added because I do not have a HyperV test environment. Hence, I wanted to make sure the tiny code I changed is covered at least by unit tests.
* pr/1084:
CLOUDSTACK-9067 - Replaces hardcoded paths with the VRScripts constants.
CLOUDSTACK-9067 - Fomatting the code of HypervDirectConnectResource class
CLOUDSTACK-9067 - Remove old script file from the project
Signed-off-by: Remi Bergsma <github@remi.nl>
[4.6.1] CLOUDSTACK-9015 - Redundant VPC Virtual Router's state is BACKUP & BACKUP or MASTER & MASTERThis PR closes#1064
All the details can be found in the original PR, which won't be merged because it was created agains master. Once this PR is closed, the original one will be also closed.
* pr/1070:
CLOUDSTACK-9015 - Delete public IP in order to get both IP and NAT rule removed.
CLOUDSTACK-9015 - Add test to cover the rVPC routers stop/start/reboot scenario
CLOUDSTACK-9015 - Make sure the Backup router can talk to the Master router after a stop/start/reboot
Signed-off-by: Remi Bergsma <github@remi.nl>
The S3 implementation is far from finished, this commit focusses on the bases.
- Upgrade AWS SDK to latest version.
- Rewrite S3 Template downloader.
- Rewrite S3Utils utility class.
- Improve addImageStoreS3 API command.
- Split various classes for convenience.
- Various minor improvements and code optimalisations.
A side effect of the new AWS SDK is that it, by default, uses the V4 signature. Therefore I added an option to specify the Signer, so it stays compatible with previous versions.
CLOUDSTACK-9058 - Respond with "saved_password" if no password is to be issued.The password server on the virtual router should respond with "saved_password" if no password is to be issued. This allows for backwards compatibility with Windows Guest VMs which require the "saved_password" response.
* pr/1079:
CLOUDSTACK-9058
Signed-off-by: Remi Bergsma <github@remi.nl>
- Stop KeepaliveD/ConntrackD if the eth2 (guest) interface is not configured and UP
- Only setup the redundancy after all the router configuration is done
- Open the FW for the VRRP communitation
- 224.0.0.18 and 225.0.0.50
- Set keepalived.conf.templ by default to use interface eth2 (guest)
- It will be reconfigured anyway, but having eth2 there is more clear
CLOUDSTACK-8993: DHCP fails with "no address available" when an IP is reused
Repopulate /etc/dhcphosts.txt to remove old entries with the same IP address.
* pr/981:
CLOUDSTACK-8993: DHCP fails with "no address available" when an IP is reused
Signed-off-by: Remi Bergsma <github@remi.nl>
- If we stop/start a router, the state in the file will still say MASTER, when it is actually not
- Checking the state based on the interface (eth1) state
- Once master.py is called by keepalived, save the state in the json file to BACKUP just to make sure it's also written there
- Do not use the API call because it will read what is in the database, that might not have been updated yet
* Check the status in the router directly instead
- Remove all the sleeps
- It was working before because the Routers were restarting about 10 times for each operation
e.g. adding a VM to a network ot acquiring a new IP.
- Adding stat_rules of internal LB to iptables
We needed one extra rule in the INPUT chain
- With the keepalived fixed they should not be needed anymore. So first reducing them drasticaly
- I am now making a backup of the template file, write to the template file and compare it with the existing configuration
- The template file is recovered afer the process
- I also check if the process is running
- I fixed a bug in the compare method
- I am now updating the configuration variable once the file content is flushed to disk