Commit Graph

719 Commits

Author SHA1 Message Date
Rohit Yadav 7df52405b0 FR3: Host-HA backported changes from master (#50)
- Improves job scheduling using state/event-driven logic
- Reduced database and cpu load, by reducing all background threads to one
- Improves Simulator and KVM host-ha integration tests
- Triggers VM HA on successful host (ipmi reboot) recovery
- Improves internal datastructures and checks around HA counter
- New FSM events to retry fencing and recovery
- Fixes KVM activity script to aggresively check against last update time

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2017-11-07 15:51:56 +05:30
Rohit Yadav 366d82e292 FR12 (CLOUDSTACK-9993): Secure Agent Communications (#38)
This introduces a new certificate authority framework that allows
pluggable CA provider implementations to handle certificate operations
around issuance, revocation and propagation. The framework injects
itself to `NioServer` to handle agent connections securely. The
framework adds assumptions in `NioClient` that a keystore if available
with known name `cloud.jks` will be used for SSL negotiations and
handshake.

This includes a default 'root' CA provider plugin which creates its own
self-signed root certificate authority on first run and uses it for
issuance and provisioning of certificate to CloudStack agents such as
the KVM, CPVM and SSVM agents and also for the management server for
peer clustering.

Additional changes and notes:
- Comma separate list of management server IPs can be set to the 'host'
  global setting. Newly provisioned agents (KVM/CPVM/SSVM etc) will get
  radomized comma separated list to which they will attempt connection
  or reconnection in provided order. This removes need of a TCP LB on
  port 8250 (default) of the management server(s).
- All fresh deployment will enforce two-way SSL authentication where
  connecting agents will be required to present certificates issued
  by the 'root' CA plugin.
- Existing environment on upgrade will continue to use one-way SSL
  authentication and connecting agents will not be required to present
  certificates.
- A script `keystore-setup` is responsible for initial keystore setup
  and CSR generation on the agent/hosts.
- A script `keystore-cert-import` is responsible for import provided
  certificate payload to the java keystore file.
- Agent security (keystore, certificates etc) are setup initially using
  SSH, and later provisioning is handled via an existing agent connection
  using command-answers. The supported clients and agents are limited to
  CPVM, SSVM, and KVM agents, and clustered management server (peering).
- Certificate revocation does not revoke an existing agent-mgmt server
  connection, however rejects a revoked certificate used during SSL
  handshake.
- Older `cloudstackmanagement.keystore` is deprecated and will no longer
  be used by mgmt server(s) for SSL negotiations and handshake. New
  keystores will be named `cloud.jks`, any additional SSL certificates
  should not be imported in it for use with tomcat etc. The `cloud.jks`
  keystore is stricly used for agent-server communications.
- Management server keystore are validated and renewed on start up only,
  the validity of them are same as the CA certificates.

New APIs:
- listCaProviders: lists all available CA provider plugins
- listCaCertificate: lists the CA certificate(s)
- issueCertificate: issues X509 client certificate with/without a CSR
- provisionCertificate: provisions certificate to a host
- revokeCertificate: revokes a client certificate using its serial

Global settings for the CA framework:
- ca.framework.provider.plugin: The configured CA provider plugin
- ca.framework.cert.keysize: The key size for certificate generation
- ca.framework.cert.signature.algorithm: The certificate signature algorithm
- ca.framework.cert.validity.period: Certificate validity in days
- ca.framework.cert.automatic.renewal: Certificate auto-renewal setting
- ca.framework.background.task.delay: CA background task delay/interval
- ca.framework.cert.expiry.alert.period: Days to check and alert expiring certificates

Global settings for the default 'root' CA provider:
- ca.plugin.root.private.key: (hidden/encrypted) CA private key
- ca.plugin.root.public.key: (hidden/encrypted) CA public key
- ca.plugin.root.ca.certificate: (hidden/encrypted) CA certificate
- ca.plugin.root.issuer.dn: The CA issue distinguished name
- ca.plugin.root.auth.strictness: Are clients required to present certificates
- ca.plugin.root.allow.expired.cert: Are clients with expired certificates allowed

UI changes:
- Button to download/save the CA certificates.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2017-09-26 09:19:31 +05:30
Rohit Yadav 876fc7434d APPLE-165: Host HA management and HA provider for KVM
Host-HA offers investigation, fencing and recovery mechanisms for host that for
any reason are malfunctioning. It uses Activity and Health checks to determine
current host state based on which it may degrade a host or try to recover it. On
failing to recover it, it may try to fence the host.

The core feature is implemented in a hypervisor agnostic way, with two separate
implementations of the driver/provider for Simulator and KVM hypervisors. The
framework also allows for implementation of other hypervisor specific provider
implementation in future.

The Host-HA provider implementation for KVM hypervisor uses the out-of-band
management sub-system to issue IPMI calls to reset (recover) or poweroff (fence)
a host.

The Host-HA provider implementation for Simulator provides a means of testing
and validating the core framework implementation.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2017-01-18 18:18:53 +05:30
Rohit Yadav 9d8b1fd7e5 CLOUDSTACK-8562: Make role permissions orderable
- Makes role permissions orderable in UI/backend
- Role permissions evaluated by fixed order
- Rules draggable in UI
- Migration script adds a default order

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2016-05-03 23:00:46 +05:30
Rohit Yadav f30c52a16c CLOUDSTACK-8562: DB-Backed Dynamic Role Based API Access Checker
This feature allows root administrators to define new roles and associate API
permissions to them.

A limited form of role-based access control for the CloudStack management server
API is provided through a properties file, commands.properties, embedded in the
WAR distribution. Therefore, customizing API permissions requires unpacking the
distribution and modifying this file consistently on all servers. The old system
also does not permit the specification of additional roles.

FS:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Dynamic+Role+Based+API+Access+Checker+for+CloudStack

DB-Backed Dynamic Role Based API Access Checker for CloudStack brings following
changes, features and use-cases:
- Moves the API access definitions from commands.properties to the mgmt server DB
- Allows defining custom roles (such as a read-only ROOT admin) beyond the
  current set of four (4) roles
- All roles will resolve to one of the four known roles types (Admin, Resource
  Admin, Domain Admin and User) which maintains this association by requiring
  all new defined roles to specify a role type.
- Allows changes to roles and API permissions per role at runtime including additions or
  removal of roles and/or modifications of permissions, without the need
  of restarting management server(s)

Upgrade/installation notes:
- The feature will be enabled by default for new installations, existing
  deployments will continue to use the older static role based api access checker
  with an option to enable this feature
- During fresh installation or upgrade, the upgrade paths will add four default
  roles based on the four default role types
- For ease of migration, at the time of upgrade commands.properties will be used
  to add existing set of permissions to the default roles. cloud.account
  will have a new role_id column which will be populated based on default roles
  as well

Dynamic-roles migration tool: scripts/util/migrate-dynamicroles.py
- Allows admins to migrate to the dynamic role based checker at a future date
- Performs a harder one-way migrate and update
- Migrates rules from existing commands.properties file into db and deprecates it
- Enables an internal hidden switch to enable dynamic role based checker feature

Deprecate commands.properties

- Fixes apidocs and marvin to be independent of commands.properties usage
- Removes bundling of commands.properties in deb/rpm packaging
- Removes file references across codebase

Reviewed-by: John Burwell <john.burwell@shapeblue.com>
QA-by: Boris Stoyanov <boris.stoyanov@shapeblue.com>

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2016-04-25 14:52:02 +05:30
Wido den Hollander 2b8fd2469f CLOUDSTACK-8443: Support CentOS 7 for 4.5
This is based on two PRs:
- 731
- 757

This commit is based on the 4.5 branch for a future 4.5 release.
2015-09-13 15:30:20 +02:00
Remi Bergsma 363cc7bc1f Improve cloud-install-sys-tmplt to work in dev environment again
Backported PR #678 to 4.5

This I changed:
 ``jasypt='/usr/share/cloudstack-common/lib/jasypt-1.9.0.jar'``
in master it is 1.9.2. I changed it to 1.9.0 here, to match the original script.

Original commit IDs:
  2f858a7d08
  ee9b644e28
  8a1e79f518

Signed-off-by: Remi Bergsma <github@remi.nl>
2015-08-19 19:59:24 +02:00
Remi Bergsma 43dabb611d RHEL 7 and CentOS 7 need the same fix
(cherry picked from commit d1cb4c7d50)
Signed-off-by: Remi Bergsma <github@remi.nl>
2015-08-19 19:35:38 +02:00
Remi Bergsma 91cfb6068a fixing white space and formatting
(cherry picked from commit 14013d5d1b)
Signed-off-by: Remi Bergsma <github@remi.nl>
2015-08-19 19:35:24 +02:00
Ilya Musayev a2ddf2773e CLOUDSTACK-8624: Added the support for mysql db port and lowered the requiremnts for available disk capacity to 2.1GB VS original 5GB as it was too excessive. 2015-07-10 06:59:56 +05:30
Jayapal 259b2639f5 Fixed issue in adding vm SG rules on vm reboot for xenserver 6.5
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>

This closes #479

(cherry picked from commit 59e6596fef)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-06-18 13:33:01 +03:00
Wido den Hollander 9ff3fe371e CLOUDSTACK-8559: IP Source spoofing should not be allowed
We did not verify if the packets leaving an Instance had the correct
source address.

Any IP packet not matching the Instance IP(s) will be dropped

(cherry picked from commit 3e3c11ffca)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-06-15 21:51:01 +03:00
Rohit Yadav aee35c96a8 CLOUDSTACK-8252: Ignore VLAN 4095 which is n/a on linux
VLAN id 4095 is commonly used as a 'tag passthrough' in virtualization environments
(VMware, specifically). This vlan id is incompatible with Linux, but we can
allow the admin to manually configure the bridge if the same passthrough is
desired.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-05-22 13:22:40 +01:00
Rohit Yadav 3925512115 CLOUDSTACK-8401: Fix KVM's SG script to properly cleanup old network rules
- Router VMs don't have a chain rule with -def suffix, this fixes name and
  properly removes VR vms not running on a host
- Before trying to remove dnats, filter empty/None elements from list
- destroy_ebtables_rules should check what kind of action is request to be
  performed (-A for add or -D for removed) and execute based on that
- Before executing any command, log it for debugging purposes
- Method to cleanup bridge, may be used in future

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-04-25 03:02:10 +02:00
Rohit Yadav d66677101c CLOUDSTACK-4611: cleanup_rules using ebtables rules from /proc/modules
The SG python script depends on ebtables-save which is not available on Debian
based distros (Ubuntu and Debian for example). The commit uses /proc/modules
to find available bridge tables (one of nat, filter or broute) and then
find VMs that need to be removed. Further it uses set() to remove duplicate VMs
so we don't try to remove a VM's rules more than once leading to unwanted errors
in the log.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-04-25 02:58:12 +02:00
Rohit Yadav d91d161107 CLOUDSTACK-8395: vmops plugin should work on both XS 6.5 and 6.2
This fixes the issue of Security Groups not working in case of XenServer 6.5;
- Uses nethash ipset data-structure to store CIDRs (efficient than iphash and
  avoids overflow errors in case users add /8 /4 ingress/egress cidrs)
- Support for ipset versions both on 6.2 and 6.5, both have different outputs. This
  fixes the issue of destroy_network_rules_for_vm failing
- Implements defensive filtering of list, instead of popping last item without
  checking if it's None or empty
- Greps using names that are 'quoted' to avoid bash errors
- Before setting up new network rule, tries to clean and remove old ipset entry
- Idents, whitespace and naming fixes

PS. This is my 1000th commit to the 🐵 project :)

This closes #186

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-04-23 14:08:17 +02:00
Rohit Yadav 64ab3554a1 scripts: filter output instead of popping string from list
This is a defensive enhancement for KVM SG script that filters out empty string
instead of popping last item which may or may not be an empty string.

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-04-21 17:33:18 +02:00
Rohit Yadav f4cbc4c010 xenserver: remove unwanted vmops.orig file (created during a past merge)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-04-21 12:58:39 +02:00
Remi Bergsma 43a9eb40b8 make sure sync cannot block reboot
The recent discussed improvement has the risk that if 'sync' hangs, the reboot may be delayed in the same way as the 'reboot' command would do. To work around, we're adding a 5 second timeout. If it cannot sync in 5 seconds, it will not succeed anyway and we should proceed the reset.

@snuf: Could we use your OVM3 heartbeat script for other hypervisors as well? One way to do it seems like a nice idea :-)
2015-04-10 15:14:08 -05:00
Remi Bergsma b23661931a write logfile just before rebooting the host
As discussed with @wido @pyr and @nuxro added an extra log line.

Tested it and it logs fine (tested to local disk) when syncing first:
Apr  3 15:31:23 mcctest2 heartbeat: kvmheartbeat.sh system because it was unable to write the heartbeat to the storage

By the way, it did also log to the agent.log but this extra log has the benefit of ending up in the system log so you'll probably find it easier there. Existing logs:
2015-04-03 15:27:23,943 WARN  [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 0
2015-04-03 15:28:23,944 WARN  [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 1
2015-04-03 15:29:23,946 WARN  [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 2
2015-04-03 15:30:23,948 WARN  [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 3
2015-04-03 15:31:23,950 WARN  [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout, retry: 4
2015-04-03 15:31:23,950 WARN  [kvm.resource.KVMHAMonitor] (Thread-24:null) write heartbeat failed: timeout; reboot the host

This closes #145

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-04-10 15:13:37 -05:00
Remi Bergsma a92315f28c reboot much faster in case of storage failure
When storage cannot be reached, it does not make sense to reboot as it will try to flush buffers, umount NFS mounts, etc. This will not work and thus cause a long delay. With this change, the box will reboot immediately (like pressing the reset button).
2015-04-10 15:13:27 -05:00
Star Guo 290938b08e scripts: add ip set interface up because in CentOS7 the interface will not auto up
This closes #97

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-03-10 10:18:10 +05:30
Jayapal dd6bcde65b CLOUDSTACK-8298: Update copying large size VR config file in xenserver
When there is large size VR configuration (aggregate commands) copying data to VR using vmops plugin was failed
 because of the ARG_MAX size limitation. The configuration data size is around 300KB.

 Updated this to create file in host by scp with file contents. This will create file in host.
 Then copy the file from the host to VR using hte vmops createFileInDomr method.

  In host file get created in /tmp/ with name VR-<UUID>.cfg, once it copied to VR this file will be removed.

(cherry picked from commit 619f014255)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-03-04 16:56:07 +05:30
Rohit Yadav f70afa1375 scripts: use cloudmanagementserver.keystore instead of cloud.keystore
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-02-28 17:05:29 +05:30
Marcus Sorensen e77b4ae8e8 CLOUDSTACK-8263: KVM - use virsh instead of libvirt for resizing qcow2, as libvirt bindings are insufficient
Change-Id: I08246219cb1469a46dc6a9ec76a8c3a67b0b8bf6
2015-02-17 18:10:08 -08:00
Marcus Sorensen 61a50ce5e4 CLOUDSTACK-8263: KVM - notify qemu process of resized volume for libvirt-resized storage
Change-Id: Iddd8bb068855d3565075d3ecf7c6c0f074d00e1a
2015-02-17 14:27:13 -08:00
Rohit Yadav cf7a8cc052 CLOUDSTACK-8220: Let's have a separate XenServer 6.5 resource
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
(cherry picked from commit 06437dadf5)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2015-02-06 14:45:06 +05:30
Remi Bergsma 66b77380d0 use directIO flags when dd'ing template
This makes sure dom0 in xenserver doesn't get hammered
when copying templates. It doesn't make sense to use
the cache of dom0 as the template does not fit in
memory. The directIO flags prevent it from trying.

(cherry picked from commit 4e1527e87a)
2014-12-16 10:49:57 +01:00
Rohit Yadav 752980f370 Revert "packaging: updated hardcoded jasypt version to 1.9.2"
This reverts commit 43f39a1ec3.
2014-12-04 19:47:10 +05:30
Rohit Yadav 43f39a1ec3 packaging: updated hardcoded jasypt version to 1.9.2
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
2014-12-04 04:02:46 +05:30
Edison Su 2667855ccb CLOUDSTACK-5446:
delete all the leftover snapshots on primary storage in case of snapshot
errors, after a new backup snapshot is finished
2014-11-19 15:54:09 -08:00
Pierre-Luc Dion 6c955a3a47 CLOUDSTACK-7887: change int to str into swiftxen 2014-11-12 19:22:13 -05:00
Sanjay Tripathi e6907ed8df CLOUDSTACK-7868: Failed storage.PrimaryStorageDownloadCommand leaves corrupt VDIs in primary storage. 2014-11-08 13:46:45 +05:30
Rajesh Battala 42fd2d91da CLOUDSTACK-7654 fixed issues with zip format templates.
(cherry picked from commit 67ff7dac82)
2014-10-13 00:30:23 -04:00
David Nalley ac48aa8e0c cleaning up some from a revert 2014-10-12 23:30:04 -04:00
Hugo Trippaers 6687727b76 Frank forgot the license header 2014-09-19 10:14:38 +02:00
Frank Zhang 8b89494a35 CLOUDSTACK-6278
Baremetal Advanced Networking support
2014-09-18 16:54:37 -07:00
Daan Hoogland 6e1e56d399 CLOUDSTACK-7527 reboot faster by writing to /proc/sysrq-trigger
(cherry picked from commit d04f59a30d)
2014-09-18 12:51:42 +02:00
Daan Hoogland dec9133dcd CLOUDSTACK-7184: xenheartbeat gets passed timeout and interval
(cherry picked from commit 4d065b9a3a)

Conflicts:
	plugins/hypervisors/xenserver/src/com/cloud/hypervisor/xenserver/discoverer/XcpServerDiscoverer.java
	plugins/hypervisors/xenserver/src/com/cloud/hypervisor/xenserver/resource/CitrixResourceBase.java
	server/src/com/cloud/configuration/Config.java
	server/src/com/cloud/configuration/ConfigurationManagerImpl.java
	server/src/com/cloud/resource/DiscovererBase.java
2014-09-18 12:51:10 +02:00
Kishan Kavala 4f3de024de Add script to ensure cgroups are not co-mounted in rhel7/lxc. If required, script will unmount co-mounted cgroups and remount them seperately 2014-09-11 14:34:40 +05:30
Kishan Kavala 08dc5c6f91 CLOUDSTACK-7428: Allow LXC cluster in SG enabled zones. Use lxc driver in security_group.py script for lxc host 2014-08-27 11:52:59 +05:30
Anthony Xu bd6f03aa95 iptreemap is not supported in new ipset, use iphash instead 2014-08-25 11:22:30 -07:00
Kishan Kavala b37ee25359 replace vconfig with ip link 2014-08-22 15:39:04 +05:30
Vincent Bernat 53650ed7bf CLOUDSTACK-7193: handle domain ID being an int
Recent versions of libvirt (at least 0.9.8) will return an int when
queried for the ID of a domain, not a string. This breaks some parts of
the `security_group.py` script which expects a string containing an
int. Notably, this breaks the part handling VM reboots which is
therefore not executed.

Signed-off-by: Vincent Bernat <Vincent.Bernat@exoscale.ch>
Signed-off-by: Sebastien Goasguen <runseb@gmail.com>
2014-08-18 10:36:21 -04:00
Brenn Oosterbaan 7c92bac4a3 CLOUDSTACK-7345 changed dd blocksize to 128k when using NFS.
Signed-off-by: Daan Hoogland <daan@onecht.net>
(cherry picked from commit 8b7130fa65)
2014-08-14 10:10:13 +02:00
Edison Su f30fc6b673 need to check ccp-qemu-img 2014-08-12 15:13:42 -07:00
Joris van Lieshout 37baddd721 dd with direct io is less impacting on Dom0 kernel resources
Signed-off-by: Daan Hoogland <daan@onecht.net>
(cherry picked from commit c4b78c3aaa)
2014-08-12 13:17:02 +02:00
Frank.Zhang 12ad254069 CLOUDSTACK-6278
Baremetal Advanced Networking support

    add missing license header
2014-08-05 11:11:02 -07:00
Frank.Zhang 1ee7e0c77e CLOUDSTACK-6278
Baremetal Advanced Networking support
2014-08-04 15:00:44 -07:00
Frank.Zhang 88f866645b fix iptables chain name too long (must be under 30 chars) 2014-07-18 17:31:06 -07:00