This implements an out-of-band management plugin for nested-cloudstack
environments where the hypervisor host is a VM in a parent CloudStack environment
that is used as a host in the (testing) CloudStack environment. This plugin
allows power operations to translate into start/stop/reboot of the VM (host).
The out-of-band management configuration accepted are:
- Address: The API URL of the parent CloudStack enviroment
- Port: The uuid of the (host) VM in the parent CloudStack environment
- Username: The apikey of the user account who has ownership on the (host) VM
- Password: The secretkey of the user account who has ownership on the (host) VM
Note: change password of the oobm interface is not support by this plugin
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Host-HA offers investigation, fencing and recovery mechanisms for host that for
any reason are malfunctioning. It uses Activity and Health checks to determine
current host state based on which it may degrade a host or try to recover it. On
failing to recover it, it may try to fence the host.
The core feature is implemented in a hypervisor agnostic way, with two separate
implementations of the driver/provider for Simulator and KVM hypervisors. The
framework also allows for implementation of other hypervisor specific provider
implementation in future.
The Host-HA provider implementation for KVM hypervisor uses the out-of-band
management sub-system to issue IPMI calls to reset (recover) or poweroff (fence)
a host.
The Host-HA provider implementation for Simulator provides a means of testing
and validating the core framework implementation.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Fixes usage server blocker which fails to work due to missing role_id
column in cloud_usage.account table.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
KVM hosts on shared storage failure was accepted by mgmt server with the
host state as Up, even though there was no primary/shared storage available on
it. This patch offers a quick fix by throwing an exception in the storage monitor
which connects storage pool on host. The failure is trapped by agent manager
that disconnects the agent without any investigation.
Based on Lab tests, KVM agent may take upto 2 minutes to attempt NFS mount when
the storage is inaccessible (firewalled, or shutdown) before returning back with
an error. It is safe to assume that this won't add pressure on mgmt server due to
several reconnection attempts, and KVM agent would retry reconnection every 2
minutes.
For such KVM hosts, where failure happens due to storage issues; they will be
briefly put in Alert state but will be mostly be in Connecting state during which
the KVM host attempts to mount/reconfigure NFS storage pool.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Removes order by id which is not necessary as we already do order by
sort_order. An additional order by seems to have caused mysql errors
in some environment, though it was not reproducible with MySQL 5.5/5.6/5.7
but this can be safely removed as it's not necessary.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Support access to a host’s out-of-band management interface (e.g. IPMI, iLO,
DRAC, etc.) to manage host power operations (on/off etc.) and querying current
power state in CloudStack.
Given the wide range of out-of-band management interfaces such as iLO and iDRA,
the service implementation allows for development of separate drivers as plugins.
This feature comes with a ipmitool based driver that uses the
ipmitool (http://linux.die.net/man/1/ipmitool) to communicate with any
out-of-band management interface that support IPMI 2.0.
This feature allows following common use-cases:
- Restarting stalled/failed hosts
- Powering off under-utilised hosts
- Powering on hosts for provisioning or to increase capacity
- Allowing system administrators to see the current power state of the host
For testing this feature `ipmisim` can be used:
https://pypi.python.org/pypi/ipmisim
FS:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Out-of-band+Management+for+CloudStack
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Makes role permissions orderable in UI/backend
- Role permissions evaluated by fixed order
- Rules draggable in UI
- Migration script adds a default order
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Uses non-blocking SSL handshake and non-blocking connections
- Uses 60s as timeout for both client/server to guard against indefinitely
blocking clients
- Unit test to prove fix, client and malicious clients trying to connect to server
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Makes role permissions orderable in UI/backend
- Role permissions evaluated by fixed order
- Rules draggable in UI
- Migration script adds a default order
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This feature allows root administrators to define new roles and associate API
permissions to them.
A limited form of role-based access control for the CloudStack management server
API is provided through a properties file, commands.properties, embedded in the
WAR distribution. Therefore, customizing API permissions requires unpacking the
distribution and modifying this file consistently on all servers. The old system
also does not permit the specification of additional roles.
FS:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Dynamic+Role+Based+API+Access+Checker+for+CloudStack
DB-Backed Dynamic Role Based API Access Checker for CloudStack brings following
changes, features and use-cases:
- Moves the API access definitions from commands.properties to the mgmt server DB
- Allows defining custom roles (such as a read-only ROOT admin) beyond the
current set of four (4) roles
- All roles will resolve to one of the four known roles types (Admin, Resource
Admin, Domain Admin and User) which maintains this association by requiring
all new defined roles to specify a role type.
- Allows changes to roles and API permissions per role at runtime including additions or
removal of roles and/or modifications of permissions, without the need
of restarting management server(s)
Upgrade/installation notes:
- The feature will be enabled by default for new installations, existing
deployments will continue to use the older static role based api access checker
with an option to enable this feature
- During fresh installation or upgrade, the upgrade paths will add four default
roles based on the four default role types
- For ease of migration, at the time of upgrade commands.properties will be used
to add existing set of permissions to the default roles. cloud.account
will have a new role_id column which will be populated based on default roles
as well
Dynamic-roles migration tool: scripts/util/migrate-dynamicroles.py
- Allows admins to migrate to the dynamic role based checker at a future date
- Performs a harder one-way migrate and update
- Migrates rules from existing commands.properties file into db and deprecates it
- Enables an internal hidden switch to enable dynamic role based checker feature
Deprecate commands.properties
- Fixes apidocs and marvin to be independent of commands.properties usage
- Removes bundling of commands.properties in deb/rpm packaging
- Removes file references across codebase
Reviewed-by: John Burwell <john.burwell@shapeblue.com>
QA-by: Boris Stoyanov <boris.stoyanov@shapeblue.com>
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
There 2 things which has been changed.
* We look on power_state_update_time instead of update_time. Didn't make sense to me at all to look at update_time.
* Due DB update optimisation, powerState will only be updated if < MAX_CONSECUTIVE_SAME_STATE_UPDATE_COUNT. That is why we can not rely on these information unless we make sure these are up to date.
This patch makes it possible to expose VM UUID to subsystems, this can be
useful for implementing VM Snapshots for KVM in future.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
In backupSnapshot, it checks for snapshot in primary but does not check in advance if
it is null.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
* Move config options to SAML plugin
This moves all configuration options from Config.java to SAML auth manager. This
allows us to use the config framework.
* Make SAML2UserAuthenticator validate SAML token in httprequest
* Make logout API use ConfigKeys defined in saml auth manager
* Before doing SAML auth, cleanup local states and cookies
* Fix configurations in 4.5.1 to 4.5.2 upgrade path
* Fail if idp has no sso URL defined
* Add a default set of SAML SP cert for testing purposes
Now to enable and use saml, one needs to do a deploydb-saml after doing a deploydb
* UI remembers login selections, IDP server
- CLOUDSTACK-8458:
* On UI show dropdown list of discovered IdPs
* Support SAML Federation, where there may be more than one IdP
- New datastructure to hold metadata of SP or IdP
- Recursive processing of IdP metadata
- Fix login/logout APIs to get new interface and metadata data structure
- Add org/contact information to metadata
- Add new API: listIdps that returns list of all discovered IdPs
- Refactor and cleanup code and tests
- CLOUDSTACK-8459:
* Add HTTP-POST binding to SP metadata
* Authn requests must use either HTTP POST/Artifact binding
- CLOUDSTACK-8461:
* Use unspecified x509 cert as a fallback encryption/signing key
In case a IDP's metadata does not clearly say if their certificates need to be
used as signing or encryption and we don't find that, fallback to use the
unspecified key itself.
- CLOUDSTACK-8462:
* SAML Auth plugin should not do authorization
This removes logic to create user if they don't exist. This strictly now
assumes that users have been already created/imported/authorized by admins.
As per SAML v2.0 spec section 4.1.2, the SP provider should create authn requests using
either HTTP POST or HTTP Artifact binding to transfer the message through a
user agent (browser in our case). The use of HTTP Redirect was one of the reasons
why this plugin failed to work for some IdP servers that enforce this.
* Add new User Source
By reusing the source field, we can find if a user has been SAML enabled or not.
The limitation is that, once say a user is imported by LDAP and then SAML
enabled - they won't be able to use LDAP for authentication
* UI should allow users to pass in domain they want to log into, though it is
optional and needed only when a user has accounts across domains with same
username and authorized IDP server
* SAML users need to be authorized before they can authenticate
- New column entity to track saml entity id for a user
- Reusing source column to check if user is saml enabled or not
- Add new source types, saml2 and saml2disabled
- New table saml_token to solve the issue of multiple users across domains and
to enforce security by tracking authn token and checking the samlresponse for
the tokens
- Implement API: authorizeSamlSso to enable/disable saml authentication for a
user
- Stubs to implement saml token flushing/expiry
- CLOUDSTACK-8463:
* Use username attribute specified in global setting
Use username attribute defined by admin from a global setting
In case of encrypted assertion/attributes:
- Decrypt them
- Check signature if provided to check authenticity of message using IdP's
public key and SP's private key
- Loop through attributes to find the username
- CLOUDSTACK-8538:
* Add new global config for SAML request sig algorithm
- CLOUDSTACK-8539:
* Add metadata refresh timer task and token expiring
- Fix domain path and save it to saml_tokens
- Expire hour old saml tokens
- Refresh metadata based on timer task
- Fix unit tests
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This closes#489
/+= may break on some environments, url safe encoded passwords will have -_,
characters which are more acceptable
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
VMs in transition states - Starting, Stopping, Migrating - are also taken into account for enforcing "max. guest limit"
(cherry picked from commit 3100fc1554)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Correctly update the status of an uploaded volume upon failure to attach it to a VM.
(cherry picked from commit 10a106f5d8)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
During ping task, while scanning and updating status of all VMs on the host that are stuck in a transitional state
and are missing from the power report, do so only for VMs that are not removed.
(cherry picked from commit de7173a0ed)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Set destination volume path as NULL while duplicating volume during migration.
If migration fails, destination volume will be marked as removed. And if migration succeeds, volume path will be rightly updated.
(cherry picked from commit d30d5644bb)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>