Commit Graph

306 Commits

Author SHA1 Message Date
Rohit Yadav 3783fd5cec Merge remote-tracking branch 'origin/4.15' 2021-04-05 13:00:18 +05:30
Rohit Yadav 43257f8300 Merge remote-tracking branch 'origin/4.14' into 4.15 2021-04-05 12:59:37 +05:30
aleskxyz ca4669c4d4
systemvm: Add localized "data-server" records in /etc/hosts for VPC routers (#4873) 2021-04-05 12:34:10 +05:30
Rohit Yadav d4635e3442 Merge remote-tracking branch 'origin/4.15'
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2021-04-01 14:35:01 +05:30
Wei Zhou d4ba00434c
VR: fix rsyslog compresses log files but not release disk space in VR (#4869)
We had critical issue with VR recently. The VRs of shared network or vpc stops working after some days.
After investigation, I found that the disk space is full

```
root@r-10-VM:~# df
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/vda5        2086316 2069932         0 100% /
```

logrotate/ryslog has compresses the log files, but space is not released. see `lsof |grep deleted`

```
root@r-10-VM:~# lsof |grep deleted
rsyslogd    960                      root   12w      REG              254,5 493060096        137 /var/log/daemon.log.1 (deleted)
rsyslogd    960                      root   13w      REG              254,5  17715200        110 /var/log/messages.1 (deleted)
rsyslogd    960                      root   16w      REG              254,5 545968128        342 /var/log/auth.log.1 (deleted)
rsyslogd    960                      root   18w      REG              254,5  38313984        341 /var/log/cron.log.1 (deleted)
rsyslogd    960  962 in:imuxso       root   12w      REG              254,5 493060096        137 /var/log/daemon.log.1 (deleted)
rsyslogd    960  962 in:imuxso       root   13w      REG              254,5  17715200        110 /var/log/messages.1 (deleted)
rsyslogd    960  962 in:imuxso       root   16w      REG              254,5 545968128        342 /var/log/auth.log.1 (deleted)
rsyslogd    960  962 in:imuxso       root   18w      REG              254,5  38313984        341 /var/log/cron.log.1 (deleted)
rsyslogd    960  963 in:imklog       root   12w      REG              254,5 493060096        137 /var/log/daemon.log.1 (deleted)
rsyslogd    960  963 in:imklog       root   13w      REG              254,5  17715200        110 /var/log/messages.1 (deleted)
rsyslogd    960  963 in:imklog       root   16w      REG              254,5 545968128        342 /var/log/auth.log.1 (deleted)
rsyslogd    960  963 in:imklog       root   18w      REG              254,5  38313984        341 /var/log/cron.log.1 (deleted)
rsyslogd    960  964 in:imfile       root   12w      REG              254,5 493060096        137 /var/log/daemon.log.1 (deleted)
rsyslogd    960  964 in:imfile       root   13w      REG              254,5  17715200        110 /var/log/messages.1 (deleted)
rsyslogd    960  964 in:imfile       root   16w      REG              254,5 545968128        342 /var/log/auth.log.1 (deleted)
rsyslogd    960  964 in:imfile       root   18w      REG              254,5  38313984        341 /var/log/cron.log.1 (deleted)
rsyslogd    960  965 in:imudp        root   12w      REG              254,5 493060096        137 /var/log/daemon.log.1 (deleted)
rsyslogd    960  965 in:imudp        root   13w      REG              254,5  17715200        110 /var/log/messages.1 (deleted)
rsyslogd    960  965 in:imudp        root   16w      REG              254,5 545968128        342 /var/log/auth.log.1 (deleted)
rsyslogd    960  965 in:imudp        root   18w      REG              254,5  38313984        341 /var/log/cron.log.1 (deleted)
rsyslogd    960  966 rs:main         root   12w      REG              254,5 493060096        137 /var/log/daemon.log.1 (deleted)
rsyslogd    960  966 rs:main         root   13w      REG              254,5  17715200        110 /var/log/messages.1 (deleted)
rsyslogd    960  966 rs:main         root   16w      REG              254,5 545968128        342 /var/log/auth.log.1 (deleted)
rsyslogd    960  966 rs:main         root   18w      REG              254,5  38313984        341 /var/log/cron.log.1 (deleted)
```

workaround: restarting rsyslog to release the space.
```
systemctl restart rsyslog
```

The root cause is, the following command does not work in 4.15 template
```
root@r-10-VM:~# invoke-rc.d rsyslog rotate
[FAIL] Closing open files: rsyslogd failed!
```

Fix: use `/usr/lib/rsyslog/rsyslog-rotate` instead
```
root@r-10-VM:~# /usr/lib/rsyslog/rsyslog-rotate
root@r-10-VM:~# cat /usr/lib/rsyslog/rsyslog-rotate

if [ -d /run/systemd/system ]; then
    systemctl kill -s HUP rsyslog.service
else
    invoke-rc.d rsyslog rotate > /dev/null
fi

```
2021-04-01 14:30:58 +05:30
Wei Zhou dc5b9ec7c8
systemvm: remove logrotate config for wtmp and btmp (#4872)
logrotate in systemvms run every day. it exits with failure.
```
root@r-100-VM:~# systemctl status logrotate
● logrotate.service - Rotate log files
   Loaded: loaded (/lib/systemd/system/logrotate.service; static; vendor preset: enabled)
   Active: failed (Result: exit-code) since Thu 2021-03-23 00:00:01 UTC; 2 days ago
     Docs: man:logrotate(8)
           man:logrotate.conf(5)
  Process: 25001 ExecStart=/usr/sbin/logrotate /etc/logrotate.conf (code=exited, status=1/FAILURE)
 Main PID: 25001 (code=exited, status=1/FAILURE)

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
```

it is because the logrotate for wtmp and btmp already exist in 4.15 systemvm template.
```
root@r-100-VM:~# cat /etc/logrotate.d/wtmp
/var/log/wtmp {
    missingok
    monthly
    create 0664 root utmp
    minsize 1M
    rotate 1
}
root@r-100-VM:~# cat /etc/logrotate.d/btmp
/var/log/btmp {
    missingok
    monthly
    create 0660 root utmp
    rotate 1
}
```

remove them from /etc/logrotate.conf fixes the issue.
```
root@r-100-VM:~# systemctl status logrotate
● logrotate.service - Rotate log files
   Loaded: loaded (/lib/systemd/system/logrotate.service; static; vendor preset: enabled)
   Active: inactive (dead) since Thu 2021-03-25 00:00:01 UTC; 9h ago
     Docs: man:logrotate(8)
           man:logrotate.conf(5)
  Process: 28211 ExecStart=/usr/sbin/logrotate /etc/logrotate.conf (code=exited, status=0/SUCCESS)
 Main PID: 28211 (code=exited, status=0/SUCCESS)

Mar 25 00:00:01 r-100-VM systemd[1]: Starting Rotate log files...
Mar 25 00:00:01 r-100-VM systemd[1]: logrotate.service: Succeeded.
Mar 25 00:00:01 r-100-VM systemd[1]: Started Rotate log files.
```
2021-04-01 12:51:17 +05:30
Wei Zhou 63c91c1458
server: Fix network statistics for vpc (#3944)
This contains 3 main changes
(1) add NETWORK_STATS_ethX for all nics with public ips in VPC VRs (current: NETWORK_STATS_eth1)
(2) DO NOT create records in user_statistics for each VPC tier (only one record per public nic per VPC VR)
(3) send NetworkUsageCommand before unplugging a NIC with public IPs from VPC VR
2021-04-01 12:43:06 +05:30
Rohit Yadav 9f730eabfa Merge remote-tracking branch 'origin/4.15' 2021-03-24 12:46:24 +05:30
Rakesh dab7d29bb2
systemvm: Load modules to support NAT traversal in VR (#4777)
Load necessary modules so that VPN connection works properly
2021-03-24 12:13:31 +05:30
sureshanaparti eba186aa40
storage: New Dell EMC PowerFlex Plugin (formerly ScaleIO, VxFlexOS) (#4304)
Added support for PowerFlex/ScaleIO (v3.5 onwards) storage pool as a primary storage in CloudStack (for KVM hypervisor) and enabled VM/Volume operations on that pool (using pool tag).
Please find more details in the FS here:
https://cwiki.apache.org/confluence/x/cDl4CQ

Documentation PR: apache/cloudstack-documentation#169

This enables support for PowerFlex/ScaleIO (v3.5 onwards) storage pool as a primary storage in CloudStack

Other improvements addressed in addition to PowerFlex/ScaleIO support:

- Added support for config drives in host cache for KVM
	=> Changed configuration "vm.configdrive.primarypool.enabled" scope from Global to Zone level
	=> Introduced new zone level configuration "vm.configdrive.force.host.cache.use" (default: false) to force host cache for config drives
	=> Introduced new zone level configuration "vm.configdrive.use.host.cache.on.unsupported.pool" (default: true) to use host cache for config drives when storage pool doesn't support config drive
	=> Added new parameter "host.cache.location" (default: /var/cache/cloud) in KVM agent.properties for specifying the host cache path and create config drives on the "/config" directory on the host cache path
	=> Maintain the config drive location and use it when required on any config drive operation (migrate, delete)

- Detect virtual size from the template URL while registering direct download qcow2 (of KVM hypervisor) templates

- Updated full deployment destination for preparing the network(s) on VM start

- Propagate the direct download certificates uploaded to the newly added KVM hosts

- Discover the template size for direct download templates using any available host from the zones specified on template registration
	=> When zones are not specified while registering template, template size discovery is performed using any available host, which is picked up randomly from one of the available zones

- Release the VM resources when VM is sync-ed to Stopped state on PowerReportMissing (after graceful period)

- Retry VM deployment/start when the host cannot grant access to volume/template

- Mark never-used or downloaded templates as Destroyed on deletion, without sending any DeleteCommand
	=> Do not trigger any DeleteCommand for never-used or downloaded templates as these doesn't exist and cannot be deleted from the datastore

- Check the router filesystem is writable or not, before performing health checks
	=> Introduce a new test "filesystem.writable.test" to check the filesystem is writable or not
	=> The router health checks keeps the config info at "/var/cache/cloud" and updates the monitor results at "/root" for health checks, both are different partitions. So, test at both the locations.
	=> Added new script: "filesystem_writable_check.py" at /opt/cloud/bin/ to check the filesystem is writable or not

- Fixed NPE issue, template is null for DATA disks. Copy template to target storage for ROOT disk (with template id), skip DATA disk(s)

* Addressed some issues for few operations on PowerFlex storage pool.

- Updated migration volume operation to sync the status and wait for migration to complete.

- Updated VM Snapshot naming, for uniqueness in ScaleIO volume name when more than one volume exists in the VM.

- Added sync lock while spooling managed storage template before volume creation from the template (non-direct download).

- Updated resize volume error message string.

- Blocked the below operations on PowerFlex storage pool:
  -> Extract Volume
  -> Create Snapshot for VMSnapshot

* Added the PowerFlex/ScaleIO client connection pool to manage the ScaleIO gateway clients, which uses a single gateway client per Powerflex/ScaleIO storage pool and renews it when the session token expires.

- The token is valid for 8 hours from the time it was created, unless there has been no activity for 10 minutes.
  Reference: https://cpsdocs.dellemc.com/bundle/PF_REST_API_RG/page/GUID-92430F19-9F44-42B6-B898-87D5307AE59B.html

Other fixes included:

- Fail the VM deployment when the host specified in the deployVirtualMachine cmd is not in the right state (i.e. either Resource State is not Enabled or Status is not Up)

- Use the physical file size of the template to check the free space availability on the host, while downloading the direct download templates.

- Perform basic tests (for connectivity and file system) on router before updating the health check config data
	=> Validate the basic tests (connectivity and file system check) on router
	=> Cleanup the health check results when router is destroyed

* Updated PowerFlex/ScaleIO storage plugin version to 4.16.0.0

* UI Changes to support storage plugin for PowerFlex/ScaleIO storage pool.
- PowerFlex pool URL generated from the UI inputs(Gateway, Username, Password, Storage Pool) when adding "PowerFlex" Primary Storage
- Updated protocol to "custom" for PowerFlex provider
- Allow VM Snapshot for stopped VM on KVM hypervisor and PowerFlex/ScaleIO storage pool

and Minor improvements in PowerFlex/ScaleIO storage plugin code

* Added support for PowerFlex/ScaleIO volume migration across different PowerFlex storage instances.

- findStoragePoolsForMigration API returns PowerFlex pool(s) of different instance as suitable pool(s), for volume(s) on PowerFlex storage pool.
- Volume(s) with snapshots are not allowed to migrate to different PowerFlex instance.
- Volume(s) of running VM are not allowed to migrate to other PowerFlex storage pools.
- Volume migration from PowerFlex pool to Non-PowerFlex pool, and vice versa are not supported.

* Fixed change service offering smoke tests in test_service_offerings.py, test_vm_snapshots.py

* Added the PowerFlex/ScaleIO volume/snapshot name to the paths of respective CloudStack resources (Templates, Volumes, Snapshots and VM Snapshots)

* Added new response parameter “supportsStorageSnapshot” (true/false) to volume response, and Updated UI to hide the async backup option while taking snapshot for volume(s) with storage snapshot support.

* Fix to remove the duplicate zone wide pools listed while finding storage pools for migration

* Updated PowerFlex/ScaleIO volume migration checks and rollback migration on failure

* Fixed the PowerFlex/ScaleIO volume name inconsistency issue in the volume path after migration, due to rename failure
2021-02-24 14:58:33 +05:30
dahn aab2447656
systemvm: loop optimisation in bash (#4451)
Co-authored-by: Daan Hoogland <dahn@onecht.net>
2021-02-18 18:18:16 +05:30
Rohit Yadav f5a44b3502 Merge remote-tracking branch 'origin/4.14' into 4.15 2021-02-05 18:28:02 +05:30
Wei Zhou d62d5c6cd2
VR: fix expunging vm will remove dhcp entries of another vm in VR (#4627)
Steps to reproduce the issue

(1) create two vm wei-001 and wei-002, start them

(2) check /etc/cloudstack/dhcpentry.json and /etc/dhcphosts.txt in VR
They have entries for both of wei-001 and wei-002

(3) stop wei-002, and restart VR (or restart network with cleanup).
check /etc/cloudstack/dhcpentry.json and /etc/dhcphosts.txt in VR
They have entries for wei-001 only (as wei-002 is stopped)

(4) expunge wei-002. when it is done,
check /etc/cloudstack/dhcpentry.json and /etc/dhcphosts.txt in VR
They do not have entries for wei-001.
VR health check fails at dhcp_check.py and dns_check.py
2021-02-05 18:10:53 +05:30
Rohit Yadav 58a0a7b1a3 Merge remote-tracking branch 'origin/4.14' 2020-12-14 14:41:06 +05:30
davidjumani 4d33e159f7
vr: Ensuring dnsmasq.leases file is populated (#4529) 2020-12-14 09:06:24 +00:00
Daan Hoogland e9ce381c56 Merge branch '4.14' 2020-11-25 09:04:53 +01:00
Wei Zhou 8a68617eee bugfix #9 vpc vr: Add PREROUTING rule for vm with static nat to multiple private gateways 2020-11-25 08:40:16 +01:00
Wei Zhou 69c0f71cf7 bugfix #8 vpc: add rule for traffic between vm and private gateway 2020-11-25 08:40:16 +01:00
Wei Zhou a8c9b4531b bugfix #7 vpc vr: allow servers in private gateway to reach internet via the VPC VR if it is gateway 2020-11-25 08:40:16 +01:00
Wei Zhou 8fb2efee1c bugfix #6 vpc vr: Add iptables rules for ACL of private gateway 2020-11-25 08:40:16 +01:00
Wei Zhou 7e6f484332 Revert "Fix Policy Based Routing for private gateway static routes (#3604)"
This reverts commit 82d94a87c5.
2020-11-25 08:40:16 +01:00
Wei Zhou 5cc6fedb1f Revert "Handle private gateways more reliably"
This reverts commit f4f9b3ab4e.
2020-11-25 08:40:16 +01:00
Rohit Yadav 8e03374c29 Merge remote-tracking branch 'origin/4.14' 2020-11-23 16:00:41 +05:30
Wei Zhou 81ac9f90ab
vr: fix python exception when configure VRs (#4489)
before
```
root@r-27-VM:/var/cache/cloud# /opt/cloud/bin/configure.py monitor_service.json
ERROR:root:Command 'ip link show eth0 | grep 'state DOWN'' returned non-zero exit status 1
```

with this change
```
root@r-27-VM:/var/cache/cloud# /opt/cloud/bin/configure.py monitor_service.json
root@r-27-VM:/var/cache/cloud#
```
2020-11-23 14:09:40 +05:30
Rohit Yadav d3f18ef71c Merge remote-tracking branch 'origin/4.14' 2020-11-20 21:12:20 +05:30
Wei Zhou 75fdb07387
vpc: fix ips on wrong interfaces after rebooting vpc vrs (#4467)
* vpc: fix ips on wrong interfaces after rebooting vpc vrs

* #4467: Rename to updateNicWithDeviceId

* CLSTACK-8923 vr: Force a restart of keepalived if conntrackd is not running or configuration has changed
2020-11-20 21:02:53 +05:30
Daan Hoogland 492962238e Merge branch '4.14' 2020-11-20 11:43:20 +00:00
Wei Zhou a368ba9def
VR: fix logging is not working and logs are not appended to /var/log/cloud.log (#4466) 2020-11-20 10:40:02 +00:00
Spaceman1984 88762c101c
Added compress option to dnsmasq log files (#4439) 2020-11-06 09:33:52 +00:00
Daan Hoogland ffc42b9d92 Merge branch '4.14' 2020-11-04 09:33:46 +01:00
Rakesh 34146569d9
FIX issue in VR if remote access vpn is enabled (#4430)
Co-authored-by: Rakesh Venkatesh <r.venkatesh@global.leaseweb.com>
2020-11-04 09:27:48 +01:00
Daan Hoogland ee5094b77f Merge branch '4.14' 2020-10-24 12:55:25 +02:00
Wei Zhou ff8a84ee77
systemvm: fix proc.find in CsProcess.py (#4413)
Co-authored-by: Wei Zhou <w.zhou@global.leaseweb.com>
2020-10-21 19:21:54 +02:00
Rohit Yadav 766eab8cab Merge remote-tracking branch 'origin/4.13' into 4.14 2020-09-23 10:49:19 +05:30
Lucas Granet ab02cf7078 router: adding "data-server" dns entry in /etc/hosts (#4319)
The DNS entry "data-server" was not added in /etc/hosts.

Since the VR is now considered as a "dhcpsrvr" (?), we need to apply this commit to add this DNS entry.
/etc/hosts is fully rewritten by this script.

Fixes: #4308
(cherry picked from commit dc65f31f9f)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2020-09-23 10:48:44 +05:30
Lucas Granet dc65f31f9f
router: adding "data-server" dns entry in /etc/hosts (#4319)
The DNS entry "data-server" was not added in /etc/hosts.

Since the VR is now considered as a "dhcpsrvr" (?), we need to apply this commit to add this DNS entry.
/etc/hosts is fully rewritten by this script.

Fixes: #4308
2020-09-22 13:07:56 +05:30
Rohit Yadav 9ae1170b29 Merge remote-tracking branch 'origin/4.14' 2020-08-04 11:28:43 +05:30
Wei Zhou 407e34d4e7
vrouter: remove a POSTROUTING rule for port forwarding in VPC router (#3952)
As discussed in #3937 (comment)
a rule for port forwarding in VPC router might not be needed.

This fixes the failed result of health check for network VRs.
2020-08-04 11:25:28 +05:30
Rohit Yadav 3de5ca9871 Merge remote-tracking branch 'origin/4.13' into 4.14
Fixes forward-merge lint issue

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2020-06-15 08:59:08 +05:30
Rohit Yadav 1e19ea5bdd
systemvmtemplate: move to using Debian10 (#4104)
This upgrades the systemvmtemplate base to Debian 10 with openjdk-11 and a newer strongswan package.

Fixes #3654

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2020-06-09 08:20:51 +05:30
davidjumani 1756b0f64a
noVNC console integration (#3967)
* Adding noVNC repo

* Adding support for noVNC

* Adding Ctl+Esc

* Removing device name from novnc header
2020-05-19 14:14:04 +02:00
dahn 8f3ad0fd8d
python format (#4087) 2020-05-18 15:15:01 +00:00
havengit 60d7215a06
fix dhcp lease entry wrong hostname (#4064)
When Guest VM add secondary nic,  will get wrong hostname "infiniteh" from dhcp server
infiniteh -->infinite
cat /etc/dhcphosts.txt
02:00:0b:ef:00:04,set:192_168_4_18,192.168.4.18,gumd-tes3,infiniteh
2020-05-11 10:56:14 +02:00
Daan Hoogland 8e4be6dc60 Merge branch '4.13' 2020-04-16 15:27:52 +02:00
dahn 22e0fc8752 mac-check 2020-04-16 15:10:50 +02:00
dahn 6a72e6e9f8 do not put in default accept rules for DNS and BOOTPS 2020-04-16 15:09:51 +02:00
Pearl Dsilva 32b509a83e
Handle port forward rule check for vpc and non vpc Isolated net… (#3963)
Co-authored-by: Pearl Dsilva <pearl.dsilva@shapeblue.com>
2020-03-13 09:20:42 +01:00
Daan Hoogland 6f9890694d Merge release branch 4.13 to master
* 4.13:
  vr: fix password server run with empty gateway in isolated netw… (#3943)
  Fix simulator docker db deploy issue (apache#3397) (#3651)
2020-03-09 11:26:21 +01:00
Wei Zhou 7d0fd9fa3f
vr: fix password server run with empty gateway in isolated netw… (#3943) 2020-03-09 10:35:56 +01:00
Daan Hoogland 06a8ff04b1 Merge release branch 4.13 to master
* 4.13:
  VR: Fix Redundant VRouter guest network on wrong interface (#3847)
2020-02-29 19:56:07 +01:00
Wei Zhou 313e21a0da
VR: Fix Redundant VRouter guest network on wrong interface (#3847) 2020-02-29 19:52:40 +01:00
Daan Hoogland 8c078b8849 Merge release branch 4.13 to master
* 4.13:
  vrouter: reload keepalived instead of restart and fix password… (#3898)
  Allow port 80/8080 accessible only from guest network (#3907)
2020-02-28 17:20:48 +01:00
Wei Zhou 3f8b2c369d
vrouter: reload keepalived instead of restart and fix password… (#3898) 2020-02-28 17:15:51 +01:00
Rakesh faccec4142
Allow port 80/8080 accessible only from guest network (#3907) 2020-02-28 17:05:44 +01:00
Rohit Yadav 3ca5be40d4 Merge remote-tracking branch 'origin/4.13' 2020-02-28 15:03:12 +05:30
Andrija Panic e8d418c091
router: Fix dhcp infinite lease time (#3913)
The previous setup of many hours would not work, due to some internal dnsmasq issues - lease was set correclty, but dnsmasq was setting the dhcp-renew-time (and rebind time) to less than 2 years from the date the lease was issued.

Using "infinite" as the value (instead of the number) works as expected - and (atm) the renew date is set to year 2088, etc.

Co-authored-by: dahn <daan.hoogland@gmail.com>
2020-02-28 14:27:09 +05:30
Rohit Yadav d90341ebf1
cloudstack: add JDK11 support (#3601)
This adds support for JDK11 in CloudStack 4.14+:

- Fixes code to build against JDK11
- Bump to Debian 9 systemvmtemplate with openjdk-11
- Fix Travis to run smoketests against openjdk-11
- Use maven provided jdk11 compatible mysql-connector-java
- Remove old agent init.d scripts

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2020-02-12 12:58:25 +05:30
Daan Hoogland 10482da136 Merge release branch 4.13 to master
* 4.13:
  vr: add missing rule for port forwarding rule in vpc (#3857)
  vpc: set traffic type of private gateway IP to Public to fix ke… (#3851)
2020-02-06 20:38:07 +01:00
Wei Zhou d88c614a35
vr: add missing rule for port forwarding rule in vpc (#3857) 2020-02-06 20:25:56 +01:00
Anurag Awasthi c0abfce8fa
Health check feature for virtual router (#3575) 2020-01-30 12:39:03 +01:00
Daan Hoogland 99ec8a825a Merge release branch 4.13 to master
* 4.13:
  Fix Policy Based Routing for private gateway static routes (#3604)
2020-01-30 11:39:36 +01:00
Dennis Konrad 82d94a87c5
Fix Policy Based Routing for private gateway static routes (#3604)
* Fix for routing table issue with NAT interfaces

* Mark only packets with the public ip as destination
2020-01-30 11:31:30 +01:00
Rohit Yadav 518ed5379c Merge remote-tracking branch 'origin/4.13' 2020-01-30 11:13:14 +05:30
Wei Zhou 521217c852
vr: fix vr in unknown state (more) (#3848)
This fixes similar issue with #3465.

Meanwhile change log level of CsHelper.execute2 from DEBUG to INFO and fix some typo.
2020-01-30 08:43:46 +05:30
Rohit Yadav a54afa820e Merge remote-tracking branch 'origin/4.13' 2020-01-29 20:51:27 +05:30
Wei Zhou be112a0220
vrouter: reload haproxy when cfg file is updated (#3726)
since 4.11.3, haproxy is always restarted when add/delete a lb rule.
When haproxy is started, the processes are
```
root@r-854-VM:~# ps aux |grep haproxy
root     22272  0.0  0.2   4036   668 ?        Ss   07:52   0:00 /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid
haproxy  22274  0.0  2.3  38444  5856 ?        S    07:52   0:00 /usr/sbin/haproxy-master
haproxy  22275  0.0  0.3  38444   880 ?        Ss   07:52   0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
```
When haproxy is reload, the processes are
```
root@r-854-VM:~# ps aux |grep haproxy
root     22272  0.0  0.2   4168   632 ?        Ss   07:52   0:00 /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid
haproxy  22283  0.0  2.3  38444  5884 ?        S    07:53   0:00 /usr/sbin/haproxy-master
haproxy  22286  0.0  0.3  38444   880 ?        Ss   07:53   0:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds -sf 22275
```

We need to change the pid file from /var/run/haproxy.pid to /run/haproxy.pid, so the haproxy will be reloaded instead of restarted.
2020-01-29 16:01:19 +05:30
Rohit Yadav 0cb2db6e1d Merge remote-tracking branch 'origin/4.13' 2020-01-28 11:26:40 +05:30
Wei Zhou ff1c6e78f4 router: Set up metadata/password/dhcp server on gateway IP instead of guest IP in RVR (#3477)
When we create a vm in the network with redundant VRs, the lease file in the vm (for example /var/lib/dhcp/dhclient.eth0.leases) shows the dhcp-server-identifier is the guest ip (not vip/gateway) of master VR. That's the ip ipaddress where the vm fetch password and metadata from.
if we stop the master VR (then backup will be master) or restart the network with cleanup (VRs will be created), the guest ip of master VR changes so vm are not able to get metadata/ssh-key using the ips in dhcp lease file.

Setting up metadata/password/dhcp server on gateway instead of guest IP in redundant VRs will fix the issues.

FIxes #3409
2020-01-28 10:35:59 +05:30
Paul Angus be97470d83 Get Diagnostics: Download logs and diagnostics data from SSVM, CPVM, Router (#3350)
* * Complete API implementation
* Complete UI integration
* Complete marvin test
* Complete Secondary storage GC background task

* improve UI labels

* slight reword and add another missing description

* improve download message clarity

* Address comments

* multiple fixes and cleanups

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* fix more bugs, let it return ip rule list in another log file

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* fix missing iprule bug

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* add support for ARCHIVE type of object to be linked/setup on secstorage

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* Fix retrieving files for Xenserver

* Update get_diagnostics_files.py

* Fix bug where executable scripts weren't handled

* Fixed error on script cmd generation

* Do not filter name for log files as it would override similar prefix script names

* Addressed code review comments

* log error instead of printstacktrace

* Treat script as executable and shell script

* Check missing script name case and write to output instead of catching exception

* Use shell = true instead of shlex to support any executable

* fix xenserver bug

* don't set dir permission for vmware

* Code review comments - refactoring

* Add check for possible NPE

* Remove unused imoprt after rebase

* Add better description for configs

Co-authored-by: Nicolas Vazquez <nicovazquez90@gmail.com>
Co-authored-by: Rohit Yadav <rohit@apache.org>
Co-authored-by: Anurag Awasthi <anurag.awasthi@shapeblue.com>
2020-01-15 11:38:33 +01:00
Andrija Panic 2ffc0c5073 Increase DHCP lease time to infinite (#3662)
* Increase lease time to infinite

Lease time set to effectively infinite (36000+ days) since we fully control VM lifecycle via CloudStack
Infinite time helps avoid some edge cases which could cause DHCPNAK being sent to VMs since
(RHEL) system lose routes when they receive DHCPNAK
When VM is expunged, it's active lease and DHCP/DNS config is properly removed from related files in VR.

* desc fix
2020-01-03 15:18:40 +01:00
Rohit Yadav 96d98de85c Merge remote-tracking branch 'origin/4.13' 2019-11-12 15:06:50 +05:30
Rohit Yadav ae61bfee76
systemvm: for ip route show command don't use the throw command (#3612)
While searching for existing route, don't use the throw keyword when
using the cmd with `ip route show`.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-11-11 23:47:21 +05:30
Andrija Panic d3f199f1c1 Increase DHCP lease time to infinite (#3662)
* Increase lease time to infinite

Lease time set to effectively infinite (36000+ days) since we fully control VM lifecycle via CloudStack
Infinite time helps avoid some edge cases which could cause DHCPNAK being sent to VMs since
(RHEL) system lose routes when they receive DHCPNAK
When VM is expunged, it's active lease and DHCP/DNS config is properly removed from related files in VR.

* desc fix
2019-11-05 10:46:43 +01:00
Rohit Yadav b576972f71
test: stabilize 4.13/master (#3547)
Fix failing smoketests, fix NPEs. 

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-08-13 11:51:10 +05:30
Sven Vogel bf7e59587d systemvm: Fix VR bootstrapping/connection state in KVM (#3524)
Enable qemu-guest-agent / add start qemu-guest-agent back.
Improve hotplug kernel module loading and verbosity.
2019-07-29 11:51:39 +05:30
Wei Zhou b7988a3e5f vr: Fix vpc router in UNKNOWN state (#3465)
If there are more than 10 vpc tiers or public ip subnets in a VPC, eth1X will be added in vpc router.
The redundant state is UNKNOWN in this case.
2019-07-11 01:48:37 +05:30
Rohit Yadav c93630f125
travis: use explicit change directory and use -pl to build rat check (#3472)
This tries to fix build failures seen in job 1 of Travis. Also fixes a pylint issue.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-07-05 15:47:44 +05:30
Rohit Yadav 2ecd5ec804
systemvm: don't fork to curl to save password (#3437)
This fixes to avoid forking curl to save password but instead call
a HTTP POST url directly within Python code. This may reduce bottleneck
during high VM launches that require passwords.

Fixes #3182

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-07-01 23:53:44 +05:30
Rohit Yadav 6784cc516b Merge remote-tracking branch 'origin/4.12' 2019-06-26 16:29:49 +05:30
Rohit Yadav a2323e1425 Merge remote-tracking branch 'origin/4.11' into 4.12 2019-06-26 16:27:29 +05:30
ustcweizhou e76266e39b systemvm: Fix hostname is localhost in some VRs (#3422)
In some virtual routers, 'hostname -f' returns 'localhost'. The hostname is also 'localhost' in `/var/log/messages`. This change can fix the issue in new VRs.
2019-06-26 16:26:05 +05:30
Paul Angus 033199972e systemvm: improve SystemVM startup and memory usage (#3126)
In order to reduce memory footprint and improve boot speed/predictability.
The following changes have been made:

- add vm.min_free_kbytes to sysctl
- periodically clear disk cache (depending on memory size)
- only start guest services specific to hypervisor
- use systemvm code to determine hypervisor type (not systemd)
- start cloud service at end of post init rather than through systemd
- reduce initial threads started for httpd
- fix vmtools config file

Fixes #3039

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-06-26 14:40:59 +05:30
Rohit Yadav ff23131701 Merge remote-tracking branch 'origin/4.11' into 4.12
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-06-05 10:00:45 +05:30
Rohit Yadav 8fb388e931
router: support multi-homed VMs in VPC (#3373)
This does not remove VM entries in dbags when hostnames match. The
current codebase already removes entry when a VM is stopped/removed so
we don't need to handle lazy removal. This will allow a VM on
multiple-tiers in a VPC to get dns/dhcp rules as expected.

This also fixes the issue of dhcp_release based on a specific interface and
removes dhcp/dns entry when a nic is removed on a guest VM.

Fixes #3273

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-06-05 08:47:05 +05:30
Richard Lawley 41f569e8a8 router: Fix rule duplication with non-VPC static NAT rules (#3366)
The VR code has provision for inserting rules at the top or bottom by specifying "front" as the second parameter to self.fw.append. However, there are a number of cases where someone has been unaware of this and added a rule with the pattern self.fw.append(["mangle", "", "-I PREROUTING".... This causes the code to check for the rule already being present to fail, and duplicate rules end up being added.

This PR fixes two of these cases which apply to adding static NAT rules. I am aware of more of these cases, but I don't have the ability to easily test the outcome of fixing them. I'm happy to add these in if you're confident that the automated tests will be sufficient. Searching for "-I (case sensitive) finds these.

The code for dealing with "front" is included below to show that this shouldn't have any ill effects:

if fw[1] == "front":
    cpy = cpy.replace('-A', '-I')

Fixes #3177
2019-06-05 02:21:03 +05:30
Rohit Yadav b2b99ca63e Merge remote-tracking branch 'origin/4.11' into 4.12
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-06-03 17:15:41 +05:30
Nicolas Vazquez c9ce3e2344 router: Persistent DHCP leases file on VRs and cleanup /etc/hosts on VM deletion (#3351)
Since the CloudStack virtual router was redesigned on version 4.6 it has been observed that the DHCP leases file is not persistent across network operations. This causes conflicts on guest VMs static IPs, causing these static IPs to not be renewed by the DHCP server running on isolated and VPC networks' virtual routers (dnsmasq). On stopping or destroying a VM, its dhcp/dns records are not removed from the virtual router causing ghost effects.

Fixes #3272
Fixes #3354

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-06-03 17:04:16 +05:30
Rohit Yadav fb555b11ae Merge remote-tracking branch 'origin/4.11' into 4.12 2019-05-31 12:36:45 +05:30
Richard Lawley 2f268fbb52 systemvm: fix VR issues with Multiple Public Subnets (#3361)
This PR resolves 2 issues related to Virtual Routers with multiple public interfaces, and works around a third.

- Fixes #3353 - Adds missing throw routes for eth0/eth1 to eth3+ when there are >1 public IPs
- Fixes #3168 - Incorrect marks set on some static NAT rules (some code references were changed from hex(int(interfacenum)) to hex(100 + int(interfacenum)) - this change just adds the remaining ones
- Fixes #3352 - Work around that sends Gratuitous ARP messages when a HA VR becomes master to work around the problem of the MAC address being different between HA VRs. If that issue is fixed properly (i.e. a database entry for the subsequent interfaces so they can be static) then this is unnecessary, though should not cause any problems.
2019-05-31 12:35:42 +05:30
Rohit Yadav d77e69a2f2 Merge remote-tracking branch 'origin/4.11' into 4.12
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-05-23 18:16:45 +05:30
Rohit Yadav 0929866956
server: ssh-keygen in PEM format and reduce main systemvm patching script (#3333)
On first startup, the management server creates and saves a random
ssh keypair using ssh-keygen in the database. The command does
not specify keys in PEM format which is not the default as generated
by latest ssh-keygen tool.

The systemvmtemplate always needs re-building whenever there is a change
in the cloud-early-config file. This also tries to fix that by introducing a
stage 2 bootstrap.sh where the changes specific to hypervisor detection
etc are refactored/moved. The initial cloud-early-config only patches
before the other scripts are called.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-05-23 18:08:00 +05:30
Rohit Yadav 00ff536f81 Merge remote-tracking branch 'origin/4.11' into 4.12
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-05-14 14:26:11 +05:30
Rohit Yadav 9ff819da2c
systemvm: new qemu-guest-agent based patching for KVM (#3278)
This introduces a new patching script for patching systemvms on KVM
using qemu-guest-agent that runs inside the systemvm on startup. This
also removes the vport device which was previously used by the legacy
patching script and instead uses the modern and new uniform guest
agent vport for host-guest communication.

Also updates the sytemvmtemplate build config to use the latest Debian
9.9.0 iso.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-05-10 23:42:19 +05:30
GabrielBrascher 3f17671449 Fix conflict and merge forward PR #3163 from 4.11 to master (4.12)
# Conflicts:
#	packaging/debian/init/cloud-management
#	packaging/systemd/cloudstack-agent.default
#	packaging/systemd/cloudstack-agent.service
#	packaging/systemd/cloudstack-management.service
2019-02-04 23:53:19 -02:00
Rohit Yadav cb3fed0e4e systemd: fix services to allow TLS configurations via java.security.ciphers (#3163)
* systemd: fix services to allow TLS configurations via java.security.ciphers

This fixes the management server and systemd services to allow the
java.security.ciphers file to configure disabled TLS protocols and
algorithms. This also cleans up systemd service files for agent and
usage server.

This fixes #3140

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* configure: fix travis failure due pycodestyle error

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2019-02-04 19:51:30 -02:00
Rohit Yadav a75cfd4d06 Merge remote-tracking branch 'origin/4.11'
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-11-13 16:13:52 +05:30
nvazquez dea0b3eb78 Prevent error on GroupAnswers on VR creation 2018-11-09 15:30:57 -03:00
Rohit Yadav 323d381767 Merge remote-tracking branch 'origin/4.11'
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-10-29 16:27:08 +05:30
Rohit Yadav f0491d5c72
vr: defer was broken in VR because of json name change (#2979)
After upgrade from CS 4.10 to CS 4.11, multiple VRs did not start through.
It did not properly defer the finalize config in update_config.py.
Apparently, the json files are now called differently: where it used to
be vm_dhcp_entry.json it now has a uuid added, for example
vm_metadata.json.4d727b6e-2b48-49df-81c3-b8532f3d6745.
The if statement that checks if the finalize can be safely deferred
therefore no longer matches. This PR contains a fix so finalize is
defered again.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-10-29 16:11:43 +05:30
Rohit Yadav 8d31024a60 Merge remote-tracking branch 'origin/4.11' 2018-10-24 11:08:00 +05:30
Rohit Yadav e092529c98
systemvm: Ensure cloud service reboots after failure (#2916)
This fixes an issue for systemvms (CPVM and SSVM) on VMware, as eth0
is not programmed (link-local) the networking.service fails to start
which is a dependency for cloud-postinit service. When cloud-postinit
service fails to start/run, it fails to start the agent (cloud) process.
This fixes the smoketest failures we saw in case of VMware 6.5 with
4.11.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-10-23 23:33:08 +05:30
Rohit Yadav 84994c841f Merge remote-tracking branch 'origin/4.11' 2018-10-16 10:54:39 +05:30
Rohit Yadav 933ee23104
vr: memory and swap optimizations (#2892)
This tries to provide a threshold based fix for #2873 where swappinness of VR is not used until last resort. By limiting swappiness unless actually needed, the VR system degradation can be avoided for most cases. The other change is around not starting baremetal-vr by default on all VRs, according to the spec https://cwiki.apache.org/confluence/display/CLOUDSTACK/Baremetal+Advanced+Networking+Support only vmware VRs need to run it and that too only as the last step of the setup/completion, so we don't need to run it all the time.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-10-16 10:29:48 +05:30
Rohit Yadav bd9880003f Merge remote-tracking branch 'origin/4.11'
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-10-10 16:10:26 +05:30
Rohit Yadav ea771cfda4
router: Fixes #2719 program VR nics by device id order for VPC (#2888)
This fixes #2719 where private gateway IP might be incorrectly
programmed on a guest network nic. The VR would now check ipassoc
requests by mac addresses than provided nic/device id in case they are
wrong.

The root cause is that the device id information is lost when aggregated
commands are created upon starting of a new VPC VR, without the correct
device id in ip_associations json it mis-programs the VR.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-10-10 15:20:36 +05:30
Rohit Yadav b6302d4e90 Merge remote-tracking branch 'origin/4.11'
Conflicts resolved for:
	engine/orchestration/src/org/apache/cloudstack/engine/orchestration/NetworkOrchestrator.java
	engine/schema/src/com/cloud/vm/dao/UserVmDaoImpl.java
server/src/com/cloud/network/element/VirtualRouterElement.java
server/src/com/cloud/vm/UserVmManagerImpl.java
tools/marvin/setup.py

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-10-05 05:18:42 +05:30
René Moser 8c0b9d6202 systemvm: baremetal-vr: reduce memory usage (#2866)
We see a suspicious continuous increase in memory usage. Kind of looks like a memory leak.

One thing noted during debugging is that flask is started in debug mode. This is not best practice for a production system.
2018-10-03 16:38:32 +05:30
Rohit Yadav 9c14059d9e Merge remote-tracking branch 'origin/4.11' 2018-09-21 14:21:28 +05:30
Rohit Yadav 70dbfa7883
systemvm: export $TYPE before patching ssvm/cpvm (#2855)
This fixes a regression introduced in #2799, by exporting $TYPE
before the `patch` is called to patch/extract archives for ssvm/cpvm.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-09-21 14:19:18 +05:30
Rohit Yadav e559154a41 Merge remote-tracking branch 'origin/4.11'
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-09-12 15:29:00 +05:30
Rohit Yadav 5a046e243a
systemvmtemplate: new 4.11.2 template and fixes (#2799)
VMware router will be rebooted based on #2794, per current config
the VRs on reboot will go through fsck checks slowing down the deployment
process by few seconds. This will ensure that fsck checks are done
on every 3rd boot of the VR. The `4` is used because 1st boot is done
during the build of systemvmtemplate appliance.

Add upgrade path for a new 4.11.2 systemvmtemplate.
Other changes:
- Add support for XS 7.5 Fixes #2834.
- Reboot VR only if mgmt gw is not pingable on vmware.
- Enable passive ftp by enabling nf_conntrack_helper. This is change in behaviour since linux 4.7

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-09-12 14:42:05 +05:30
Rohit Yadav 82fc9f3016 Merge remote-tracking branch 'origin/4.11' 2018-09-07 16:15:52 +05:30
David Passante 4b4555bff7 systemvmtemplate: Fixes: #2760 Fix SystemVMs running in Xen HVM mode are not configured (#2824)
Set hypervisor to xen-hvm when virt-what detects both HyperV cpuid and xen-domU.
2018-09-07 16:11:23 +05:30
Rohit Yadav 7a0f7ab6d2 Merge remote-tracking branch 'origin/4.11' 2018-08-28 15:57:59 +05:30
Luiz Henrique 3212ce51e7 systemvm: Fixes #2805 set gateway to empty string than None to avoid arping on 'None' (#2806)
Arping command in virtual-router was called anyway on python code.

on file: merge.py
line 239, in this code : "dp['gateway'] ='None' ''

later on CsAddress.py line 303

if 'gateway' in self.address:
self.arpPing()

This string 'None' makes if steatement always be true
the solution on #2806 makes dp['gateway'] =''

Cannot be None type because there is a string operation later on code.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-08-28 15:57:10 +05:30
Mike Tutkowski 46c56eaaf9 Merge release branch 4.11 to master
* 4.11:
  Changed the implementation of isVolumeOnManagedStorage(VolumeInfo) to check if the data store in question is for primary storage (and added a unit test from Daan Hoogland)
  vmware: reboot VR after mac updates (#2794)
2018-08-12 00:03:37 -06:00
Rohit Yadav 461c4ad027
vmware: reboot VR after mac updates (#2794)
This re-introduces the rebooting of VR after setup of nics/macs in
case of VMware. It also adds a minor enhancement to show the console
esp. for root admins when VRs and systemvms are in starting state.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-08-10 13:07:11 +05:30
Rohit Yadav 5e48c0b4c9 Merge remote-tracking branch 'origin/4.11'
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-08-08 12:20:56 +05:30
Rohit Yadav f60f3cec34
router: Fixes #2789 fix proper mark based packet routing across interfaces (#2791)
Previously, the ethernet device index was used as rt_table index and
packet marking id/integer. With eth0 that is sometimes used as link-local
interface, the rt_table index `0` would fail as `0` is already defined
as a catchall (unspecified). The fwmarking on packets on eth0 with 0x0
would also fail. This fixes the routing issues, by adding 100 to the
ethernet device index so the value is a non-zero, for example then the
relationship between rt_table index and ethernet would be like:

100 -> Table_eth0 -> eth0 -> fwmark 100 or 0x64
101 -> Table_eth1 -> eth1 -> fwmark 101 or 0x65
102 -> Table_eth2 -> eth2 -> fwmark 102 or 0x66

This would maintain the legacy design of routing based on packet mark
and appropriate routing table rules per table/ids. This also fixes a
minor NPE issue around listing of snapshots.

This also backports fixes to smoketests from master.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-08-08 12:05:42 +05:30
Rene Diepstraten 33a6ea0c87 router: Use network based netmask for dnsmasq (#2792)
Without this patch, the VR uses the netmask of the primary network for all assigned cidrs.
This patch correctly applies the corresponding netmask.
2018-08-07 15:29:38 +05:30
Dingane Hlaluku 40af32b1b9 diagnostics: new diagnostics admin API for system VMs (#2721)
This is a new feature for CS that allows Admin users improved
troubleshooting of network issues in CloudStack hosted networks.

Description: For troubleshooting purposes, CloudStack administrators may wish to execute network utility commands remotely on system VMs, or request system VMs to ping/traceroute/arping to specific addresses over specific interfaces. An API command to provide such functionalities is being developed without altering any existing APIs. The targeted system VMs for this feature are the Virtual Router (VR), Secondary Storage VM (SSVM) and the Console Proxy VM (CPVM).

FS:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/CloudStack+Remote+Diagnostics+API
ML discussion:
https://markmail.org/message/xt7owmb2c6iw7tva
2018-07-13 16:58:45 +05:30
Rohit Yadav 85750f918b Merge branch '4.11'
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-06-20 12:31:52 +05:30
Slair1 08a59e89c3 Source NAT option on Private Gateway (#2681)
Using Source NAT option on Private Gateway does not work
This fixes #2680 

## Description
<!--- Describe your changes in detail -->
When you use the Source NAT feature of Private Gateways on a VPC.  This should Source NAT all traffic from CloudStack VMs going towards IPs reachable through Private Gateways.

This change in this PR, stops adding the Source CIDR to SNAT rules.  This should be discussed/reviewed, but i can see no reason why the Source CIDR is needed.  There can only be one SNAT IP per interface, except for Static (one-to-one) NATs, which still work with this change in place.  The outbound interface is what matters in the rule.

<!-- For new features, provide link to FS, dev ML discussion etc. -->
<!-- In case of bug fix, the expected and actual behaviours, steps to reproduce. -->
##### SUMMARY
<!-- Explain the problem/feature briefly -->
There is a bug in the Private Gateway functionality, when Source NAT is enabled for the Private Gateway.  When the SNAT is added to iptables, it has the source CIDR of the private gateway subnet.  Since no VMs live in that private gateway subnet, the SNAT doesn’t work.  

##### STEPS TO REPRODUCE
<!--
For bugs, show exactly how to reproduce the problem, using a minimal test-case. Use Screenshots if accurate.

For new features, show how the feature would be used.
-->

<!-- Paste example playbooks or commands between quotes below -->
Below is an example:

- VMs have IP addresses in the 10.0.0.0/24 subnet.
- The Private Gateway address is 10.101.141.2/30
 
In the outputs below, the SOURCE field for the new SNAT (eth3) only matches if the source is 10.101.141.0/30.  Since the VM has an IP address in 10.0.0.0/24, the VMs don’t get SNAT’d as they should when talking across the private gateway.  The SOURCE should be set to ANYWHERE.
##### BEFORE ADDING PRIVATE GATEWAY
~~~
Chain POSTROUTING (policy ACCEPT 1 packets, 52 bytes)
pkts bytes target     prot opt in     out     source               destination
    2   736 SNAT       all  --  any    eth2    10.0.0.0/24          anywhere             to:10.0.0.1
   16  1039 SNAT       all  --  any    eth1    anywhere             anywhere             to:46.99.52.18
~~~

<!-- You can also paste gist.github.com links for larger files -->

##### EXPECTED RESULTS
<!-- What did you expect to happen when running the steps above? -->

~~~
Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target     prot opt in     out     source               destination
    0     0 SNAT       all  --  any    eth3    anywhere             anywhere             to:10.101.141.2
    2   736 SNAT       all  --  any    eth2    anywhere             anywhere             to:10.0.0.1
   23  1515 SNAT       all  --  any    eth1    anywhere             anywhere             to:46.99.52.18
~~~

##### ACTUAL RESULTS
<!-- What actually happened? -->

<!-- Paste verbatim command output between quotes below -->
~~~
Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target     prot opt in     out     source               destination
    0     0 SNAT       all  --  any    eth3    10.101.141.0/30      anywhere             to:10.101.141.2
    2   736 SNAT       all  --  any    eth2    10.0.0.0/24          anywhere             to:10.0.0.1
   23  1515 SNAT       all  --  any    eth1    anywhere             anywhere             to:46.99.52.18
~~~
## Types of changes
<!--- What types of changes does your code introduce? Put an `x` in all the boxes that apply: -->
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] New feature (non-breaking change which adds functionality)
- [X] Bug fix (non-breaking change which fixes an issue)
- [ ] Enhancement (improves an existing feature and functionality)
- [ ] Cleanup (Code refactoring and cleanup, that may add test cases)

## GitHub Issue/PRs
<!-- If this PR is to fix an issue or another PR on GH, uncomment the section and provide the id of issue/PR -->
<!-- When "Fixes: #<id>" is specified, the issue/PR will automatically be closed when this PR gets merged -->
<!-- For addressing multiple issues/PRs, use multiple "Fixes: #<id>" -->

Fixes: #2680 

## Screenshots (if appropriate):

## How Has This Been Tested?

<!-- Please describe in detail how you tested your changes. -->
<!-- Include details of your testing environment, and the tests you ran to -->
<!-- see how your change affects other areas of the code, etc. -->

## Checklist:
<!--- Go over all the following points, and put an `x` in all the boxes that apply. -->
<!--- If you're unsure about any of these, don't hesitate to ask. We're here to help! -->
- [x] I have read the [CONTRIBUTING](https://github.com/apache/cloudstack/blob/master/CONTRIBUTING.md) document.
- [x] My code follows the code style of this project.
- [ ] My change requires a change to the documentation.
- [ ] I have updated the documentation accordingly.
Testing
- [ ] I have added tests to cover my changes.
- [ ] All relevant new and existing integration tests have passed.
- [ ] A full integration testsuite with all test that can run on my environment has passed.
2018-06-19 21:19:26 +02:00
Rohit Yadav 56030153cb Merge branch '4.11': Fixes #2544 run passwd server on dhcpserver IP on rVR (#2635)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-05-14 16:27:41 +05:30
Rohit Yadav ece79e6913
router: Fixes #2544 run passwd server on dhcpserver IP on rVR (#2635)
This ensures that password server runs on the dhcpserver identifier
IP which is the not the VRRP virtual (10.1.1.1) IP by default but
the actual ip of the interface. When dhcp client discovery is made,
the `dhcp-server-identifier` contains the non VIP address that is
used by password reset script to query guest VM password.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-05-14 16:21:57 +05:30
Rohit Yadav 65511c4335 Merge branch '4.11': Reduce VR downtime during network restart (#2508)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-05-11 13:09:03 +05:30
Rohit Yadav a77ed56b86
CLOUDSTACK-9114: Reduce VR downtime during network restart (#2508)
This introduces a rolling restart of VRs when networks are restarted
with cleanup option for isolated and VPC networks. A make redundant option is
shown for isolated networks now in UI.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-05-11 12:48:07 +05:30
Rohit Yadav e7bd73e72b Merge branch '4.11' 2018-05-04 12:39:53 +05:30
Rohit Yadav 77172b9f03 vr: create tables before applying egress iptables rules
This fixes the issue that post-upgrade egress rules are not applied
on VR, restarting the network with cleanup used to be the workaround.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-05-04 12:37:23 +05:30
Rohit Yadav 8533def696 systemvm: Fixes #2561 patching on XenServer
This fixes incorrect xenstore-read binary path, this failed systemvm
to be patched/started correctly on xenserver. The other fix is to keep
the xen-domU flag that may be returned by virt-what. This effect
won't change the cmdline being consumed as the mgmt server side (java)
code sets the boot args in both xenstore and as pv args. The systemvm's
/boot is ext2 that can be booted by PyGrub on both old and recent
XenServer versions.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-05-04 12:37:23 +05:30
Rohit Yadav ddc8d131c0 systemvmtemplate: Fixes #2541 adds Letsencrypt CA cert
On patching, the global cacerts keystore is imported in 'cloud' service
specific local keystore. This fixes #2541.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-05-04 12:37:23 +05:30
Rohit Yadav 4277b92abe Merge branch '4.11'
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-04-30 08:22:16 +02:00
Rohit Yadav 464551208c
xenserver: Add support for XS 7.3, 7.4 and XCP-ng 7.4 (#2605)
This adds support for XenServer 7.3 and 7.4, and XCP-ng 7.4 version as hypervisor hosts. Fixes #2523.

This also fixes the issue of 4.11 VRs stuck in starting for up-to 10mins, before they come up online.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-04-30 08:19:10 +02:00
Rohit Yadav 71ab3aff9a Merge branch '4.11' 2018-04-20 15:29:44 +05:30
Rohit Yadav 561630e449
router: Fix routing tables for public IP NAT based access (#2579)
This fixes routing table rule setup regression to correctly router
marked packets based on interface related ip route tables. This thereby
fixes the access of VMs in the same VPC using NAT/SNAT public IPs.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-04-20 15:29:04 +05:30
Rohit Yadav 644b0910cd Merge branch '4.11'
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-04-20 00:46:43 +05:30
Rafael Weingärtner 9288c64e5f systemvm: Use double quotes with 'RROUTER' variable in "common.sh" script (#2586)
While debugging the VR for #2579, I noticed that one of the scripts were breaking. The variable RROUTER was not set and this broke a conditional.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-04-20 00:27:45 +05:30
Rafael Weingärtner 15afc35ff9 Forward merge branch '4.11' (PR: #2576) to master 2018-04-18 13:11:44 -03:00
Rafael Weingärtner bfe4cb0c41
Fix Python code checkstyle execute by "systemvm\test\runtests.sh" (#2576)
* dependencies update

* Add extra blank line required by ...!?

* fix W605 invalid escape sequence and more blank lines

* print all installed python packages versions
2018-04-18 13:07:37 -03:00
Rafael Weingärtner 20b93eaa06 Log command output in CsHelper.execute command (#2568) 2018-04-13 11:59:01 +02:00
Rohit Yadav e71d4d4371 CLOUDSTACK-10304: turn off apache2 server tokens and signature in systemvms (#2563)
* systemvm: turn off apache2 server tokens and signature

This turns off apache2 server version signature/token in headers.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* systemvm: remove invalid code as conf.d is not available now

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-04-13 10:44:25 +02:00
Khosrow Moossavi 535e6153cc CLOUDSTACK-10232: SystemVMs and VR to run as HVM on XenServer (#2465)
Publishing boot args both to grub and xenstore-data and let
cloud-early-config decides if the VM is in PV or HVM mode
to read from correct source.
2018-03-27 15:48:37 +05:30
René Moser c8dcc64b65 CLOUDSTACK-10341: VR minor fixes to systemvmtemplate (#2468)
- Fixes rsyslog: fix config error in rsylslog.conf

Feb 26 08:09:54 r-413-VM liblogging-stdlog[19754]: action '*' treated as ':omusrmsg:*' - please use ':omusrmsg:*' syntax instead, '*' will not be supported in the future [v8.24.0 try http://www.rsyslog.com/e/2184 ]
Feb 26 08:09:54 r-413-VM liblogging-stdlog[19754]: error during parsing file /etc/rsyslog.conf, on or before line 95: warnings occured in file '/etc/rsyslog.conf' around line 95 [v8.24.0 try http://www.rsyslog.com/e/2207 ]

- Run apache2 only after cloud-postinit

- Increase /run size for VR with 256M RAM

root@r-395-VM:~# systemctl daemon-reload
Failed to reload daemon: Refusing to reload, not enough space available on /run/systemd. Currently, 15.8M are free, but a safety buffer of 16.0M is enforced.

tmpfs            23M  6.5M   16M  29% /run
2018-03-23 11:52:29 +05:30
Rohit Yadav ab0bce2a1b
CLOUDSTACK-10296: Find time different from last timestamp (#2458)
This fixes a difference issue in rVR heartbeat check script raised
recently on dev@.
Reduce logging to avoid logging to fill ramdisk
Make checkrouter return fault state when keepalived is not running

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-03-15 16:32:18 +05:30
Rohit Yadav da8cf8c370
CLOUDSTACK-10319: Prefer TLSv1.2, deprecate TLSv1.0,1.1 (#2480)
This deprecates and remove TLS 1.0 and 1.1 from preferred list of
protocols and keeps only TLSv1.2.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-03-12 11:43:59 +01:00
Rohit Yadav c0440e8124 CLOUDSTACK-10317: Fix SNAT rules for additional public nics (#2476)
* CLOUDSTACK-10317: Fix SNAT rules for additional public nics

This allows networks with additional public nics to have correct
SNAT iptables rules applied on configuration.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>

* update based on Wei's suggested change

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-03-08 10:01:36 +01:00
Marc-Aurèle Brothier 97441a82f9 CLOUDSTACK-10282: ipv6 firewall rules operation should be done with ip6tables (#2450)
For ipv6 firewall rules operation should be done with ip6tables.
2018-02-15 10:09:23 +01:00
Wido den Hollander ce67726c6d CLOUDSTACK-10243: Do not use wait() on Python subprocess (#2421)
This might (and does block) in certain situations on the VR as
also explained in the Python documentation:

https://docs.python.org/2/library/subprocess.html#subprocess.Popen.wait

  Warning This will deadlock when using stdout=PIPE and/or stderr=PIPE
  and the child process generates enough output to a pipe such that
  it blocks waiting for the OS pipe buffer to accept more data.
  Use communicate() to avoid that.

Using the check_output function handles most of this for us and
also provides better error handling.

Signed-off-by: Wido den Hollander <wido@widodh.nl>
2018-02-10 18:27:00 +01:00
Rohit Yadav 61a5a29705
CLOUDSTACK-10252: Delete dnsmasq leases file on restart (#2427)
Delete dnsmasq's leases file when dnsmasq is restarted to avoid it
use old ip-mac-address-vm mapping leases.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2018-01-24 11:09:45 +01:00
Frank Maximus 3b23d5af74 CLOUDSTACK-10245: Fix password server regression (#2419)
In case of isolated, both self.config.is_vpc() and self.config.is_router() are false,
but self.config.is_dhcp() is true.
Moved the password server logic to the `if has_metadata` block,
as this is valid for all 3 systemvm types.
2018-01-23 17:20:03 +01:00
Frank Maximus a9fdb31585 CLOUDSTACK-9749: Fix Password server running on internal LB VM (#2409)
Fixes code to start password server only on routers.
2018-01-19 13:41:57 +05:30
Wido den Hollander e01dd89c93 CLOUDSTACK-10217: Clean up old MAC addresses from DHCP lease file (#2393)
When the IPv4 address of a Instance changes we need to make sure the
old entry is removed from the DHCP lease file on the Virtual Router
otherwise the Instance will still get the old lease.

Signed-off-by: Wido den Hollander <wido@widodh.nl>
2018-01-10 00:41:55 +05:30
Rohit Yadav d19629a115 CLOUDSTACK-10013: Fixes based on code review and test failures
This includes test related fixes and code review fixes based on
reviews from @rafaelweingartner, @marcaurele, @wido and @DaanHoogland.

This also includes VMware disk-resize limitation bug fix based on comments
from @sateesh-chodapuneedi and @priyankparihar.

This also includes the final changes to systemvmtemplate and fixes to
code based on issues found via test failures.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2017-12-23 17:51:42 +05:30
Rohit Yadav 15b11a3b27 CLOUDSTACK-10013: Fix VMware related issues and fix misc tests
This fixes test failures around VMware with the new systemvmtemplate.
In addition:

- Does not skip rVR related test cases for VMware
- Removes rc.local
- Processes unprocessed cmd_line.json
- Fixed NPEs around VMware tests/code
- On VMware, use udevadm to reconfigure nic/mac address than rebooting
- Fix proper acpi shutdown script for faster systemvm shutdowns
- Give at least 256MB of swap for VRs to avoid OOM on VMware
- Fixes smoke tests for environment related failures

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2017-12-23 09:22:44 +05:30
Rohit Yadav facc5945f0 CLOUDSTACK-10193: Fix smoke tests failures with new systemvmtemplate
- Several systemvmtemplate optimizations
- Uses new macchinina template for running smoke tests
- Switch to latest Debian 9.3.0 release for systemvmtemplate
- Introduce a new `get_test_template` that uses tiny test template
  such as macchinina as defined test_data.py
- rVR related fixes and improvements

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2017-12-23 09:22:44 +05:30
Rohit Yadav 85aee8d18d CLOUDSTACK-10013: SystemVM codebase refactorings and improvements
- Refactors and simplifies systemvm codebase file structures keeping
  the same resultant systemvm.iso packaging
- Password server systemd script and new postinit script that runs
  before sshd starts
- Fixes to keepalived and conntrackd config to make rVRs work again
- New /etc/issue featuring ascii based cloudmonkey logo/message and
  systemvmtemplate version
- SystemVM python codebase linted and tested. Added pylint/pep to
  Travis.
- iptables re-application fixes for non-VR systemvms.
- SystemVM template build fixes.
- Default secondary storage vm service offering boosted to have 2vCPUs
  and RAM equal to console proxy.
- Fixes to several marvin based smoke tests, especially rVR related
  tests. rVR tests to consider 3*advert_int+skew timeout before status
  is checked.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2017-12-23 09:22:44 +05:30