17 KiB
Marvin Refactor
The Marvin test framework will undergo some key improvements as part of this refactor:
- All CloudStack resources modelled as entities which are more object-oriented
- Data modelled as factories that form basic building blocks
- DSL support for assertions
Introduction
Marvin which has been used thus far for testing has undergone several significant changes in this refactor. Many of these changes were driven by the need for succinctly describing a test scenario in a few lines of code. This document describes the changes and the reasons behind this refactor. While this makes the framework simple to use the internals of marvin have become a bit complex. For this reason we will cover some of the internal workings as part of this document.
Rationale
Two main rationale were responsible for this refactor
- Brittle nature of the integration library
- Separating data from the test
Integration library
Typically to write a test case previously the test case author was expected to
know (in advance) all the APIs he was going to call to complete his scenario.
With the growing list of APIs, their parameters and optional arguments it
becomes tedious often to compose a single API call. To overcome this the
integration libraries were written. These libraries (integration.lib.base, integration.lib.common etc) present a list of resources or entities - eg:
VirtualMachine, VPC, VLAN to the library user. Each entity can perform a set of
operations that in turn transform into an API call.
class VirtualMachine(object):
def deploy(self, apiclient, service, template, zone):
cmd = deployVirtualMachine.deployVirtualMachineCmd()
cmd.serviceofferingid = service
cmd.templateid = template
...
...
def list(self,apiclient)
cmd = listVirtualMachines.listVirtualMachinesCmd()
return apiclient.listVirtualMachines(cmd)
This makes the library usage more object-oriented. So in the testcase the author only has to make a call to the VirtualMachine class when creating/destroying/starting/stopping virtualmachine instances.
The disadvantage of this approach is that the integration library is hand-written and brittle. When changes are made several tests are affected in the process. There are also inconsistencies caused by mixing the data required for the API call with the arguments of the operation being performed. eg:
class VirtualMachine(object):
....
@classmethod
def create(cls, apiclient, services, templateid=None, accountid=None,
domainid=None, zoneid=None, networkids=None, serviceofferingid=None,
securitygroupids=None, projectid=None, startvm=None,
diskofferingid=None, affinitygroupnames=None, group=None,
hostid=None, keypair=None, mode='basic', method='GET'):
....
....
In this call, every argument is optionally lookedup in the services dictionary or as part of the argument thereby complicating the body of the create(..) call. Also the naming and the size of the API call is daunting for anyone using the library.
Data vs Test
Another major disadvantage of the previous approach was data required for the test was mixed with the test itself. This made it difficult to generate new data from existing data objects. Data being highly coupled with the test reduces readability.
Additionaly due to the strict structure of this data it would impose itself onto the implementation of a resource's methods in the integration library.
However all of the data is reusable by other tests if presented as factories. The refactor will address this using factories that act as building blocks for creating reusable data. The document also describes how these blocks are extended.
CloudStack API Generation
The process of API module generation remains the same as before. CloudStack expresses its API in XML and JSON via the ApiDiscovery plugin. For instance the createFirewallRule API looks as follows (some fields removed for brevity)
"api": [
{
"name": "createFirewallRule",
"description": "Creates a firewall rule for a given ip address",
"isasync": true,
"params": [
{
"name": "cidrlist",
"description": "the cidr list to forward traffic from",
"type": "list",
"length": 255,
"required": false
},
{
"name": "icmpcode",
},
{
"name": "icmptype",
},
{
"name": "type",
},
],
"response": [
{
"name": "state",
"description": "the state of the rule",
"type": "string"
},
{
"name": "endport",
},
{
"name": "protocol",
},
],
"entity": "Firewall"
}
]
This JSON/XML can be used to create a binding in your favorite language and for Marvin's purpose this will be python. An API module named createFirewallRule.py with two classes (request and response) - createFirewallRuleCmd and createFirewallRuleResponse represents the creation of firewall rules.
Changes to API Discovery
Generated API modules now include the entity attribute from the listApi
response. The API discovery plugin has been enhanced to include the type of
entity that an API is acting upon. For instance when doing createFirewallRule
the entity that the user is dealing with is the Firewall. We do not
intuitively guess what entity an API acts upon but depend on the CloudStack
endpoint to tell us this information. Mostly because we cannot always predict
the entity an API acts upon using the name of the API
eg: dedicatePublicIpRange
listapisresponse: {
count: 1,
api: [
{
name: "dedicatePublicIpRange",
description: "Dedicates a Public IP range to an account",
isasync: false,
related: "listVlanIpRanges",
params: [],
response: [],
entity: "VlanIpRange"
}
]
}
}
This transforms into the following Marvin entity class through auto-generation:
class VlanIpRange(CloudStackEntity):
def dedicate(self, apiclient, account, domainid, **kwargs):
cmd = dedicatePublicIpRange.dedicatePublicIpRangeCmd()
cmd.id = self.id
cmd.account = account
cmd.domainid = domainid
[setattr(cmd, key, value) for key,value in kwargs.iteritems()]
publiciprange = apiclient.dedicatePublicIpRange(cmd)
return publiciprange if publiciprange else None
kwargs represents all the optional arguments for dedicatePublicIpRange
The use of the entity in generating a higher level model for the CloudStack API is described in the next section.
Entity and Factory Generation
Marvin now includes a new module named generate that contains all the code
generators.
xmltoapi.py- this module is responsible for converting the JSON/XML response to a python binding. Previously this was thecodegenerator.pyapitoentity.py- this module is responsible for grouping actions on a given entity into a single module and define all its actions as methods on the entity object.entity.py- is the base entity creator that transforms an API into a cloudstackEntityfactory.py- is the base factory creator that transforms an API into a factory
For eg: in the method createFirewallRule the entity is the Firewall and the
action being performed on the entity is create
So our entity becomes
class Firewall:
def create(...):
createFirewallRule()
Almost all APIs are transformed naturally into this model but there are a few
exceptions. These exceptions are dealt with by the linguist.py module in
which APIs that don't split this way are broken down using special
transformers.
Required and Optional Arguments
All required arguments to an API will be available in the API operation
Entity.verb(reqd1=None, reqd2=None, ..., **kwargs)
Here the Entity (eg:Firewall) can perform an operation verb() (eg:create)
using the arguments [reqd1, reqd2]. The optional arguments (if any) will be
passed as key, value pairs to the keyword args **kwargs.
All entity classes are autogenerated and placed in the marvin.entity module.
You may want to look at some sample entities like virtualmachine.py or
network.py. To anyone who has used the previous version of marvin, these will
look familiar. If you are looking at them for the first time, it will be
obvious to you that each entity is a simple class defined with CRUD operations
that map to the cloudStack API.
-
Creators A creator of an entity is the API operation that brings the entity into existence on the cloud. For instance a firewall rule is created using the createFirewallRule API. Or a virtualmachine comes into existence with the deployVirtualMachine command. These are our creators for entities firewall and virtualmachines respectively. Every entity class's
__init__method is basically a call to its creator -
Enumerators Often it is not necessary to bring an entity into existence since it is already present on the cloud infrastructure. We simply list* these entities and should still be able to treat them and use them like entities created using their corresponding creator methods. The list* APIs become our enumerators for each entity.
Factories
Factories in cloudstack are implemented using the factory_boy framework. The factory_boy framework helps cloudstack define complex relationships in its model. For eg. In order to create a virtualmachine typically one needs a service offering, a template and a zone present to be able to launch the VM. Factory boy enables traversing these object relationships effectively (top-down or bottom-up) to create those objects.
Every entity in the new framework is created using its corresponding factory
EntityFactory. Factories can be thought of as objects that carry necessary
and sufficient data to satisfy the API call that brings the entity into
existence. For example in order to create an account the AccountFactory will
carry the firstname, lastname, email, username of the Account since these
are the required arguments to the createAccount API.
So the account factory looks as follows:
import factory
class AccountFactory(factory):
FACTORY_FOR = Account
accounttype = None
firstname = None
lastname = None
email = None
username = None
password = None
Here the AccountFactory is a bare representation with all None fields. These
are the default factories. The default factories are simply base classes for
defining hierarchical data using inheritance. For instance we have three
types of accounts in cloudstack - DomainAdmin, Admin and User
Each of these accounttypes represents an inheritance from the AccountFactory.
And for each factory we have a specific value for the accounttype. In fact we
don't have to repeat ourselves when defining a factory for each type of account:
UserAccount(AccountFactory)
AdminAccount(UserAccount) with (accounttype=1)
DomainAdminAccount(UserAccount) with (accounttype=2)
By simply altering the accounttype and having Admin and DomainAdmin inherit from User we have defined factories for all types of accounts in cloudstack
In order to create accounts in our tests all we have to do is the following:
class TestAccounts(cloudstackTestCase):
def setUp(...):
apiclient = getApiClient()
def test_AccountForUser(...):
user = UserAccount(apiclient)
assert user is valid
def test_AccountForAdmin(...):
admin = AdminAccount(apiclient)
assert admin is valid
def test_AccountForDomainAdmin(...):
domadmin = DomainAdminAccount(apiclient)
assert domadmin is active
def tearDown(...):
user.delete()
admin.delete()
domadmin.delete()
Basic tools for extending factories
Sequences
Sequences are provided by factory boy to randomize the object generated by each call to the factory. Typically these are incremented integers but for the CloudStack objects each distinguishing attribute is randomized to prevent collisions and duplicate objects.
To define an attribute as a sequence we simply call the factory.Sequence(..) method with a lambda function defining said sequence.
eg:
class SharedNetworkOffering(NetworkOfferingFactory):
name = factory.Sequence(lambda n: 'SharedOffering' + my_random_generator_function(n))
...
SubFactory
SubFactories are an important factory_boy building block for creating factories that depend on other factories.
For eg: in order to create a SharedNetwork a networkofferingid of a SharedNetworkOffering is required. So we first call on the factory of SharedNetworkOffering using the factory.SubFactory(..) and use the id to create the SharedNetwork using the SharedNetwork's factory
class SharedNetwork(NetworkFactory):
name = factory.Sequence(...)
networkoffering = \
factory.SubFactory(
SharedNetworkOffering,
attr1=val1
)
networkofferingid = networkoffering.id
RelatedFactory is a special case of SubFactory in that RelatedFactories are created after the existing factory is created.
SubFactories are very powerful to chain many factories together to compose complex objects in cloudstack.
PostGeneration Hooks
In many cases additional hooks are done to simplify working with cloud
resources. For instance, when creating a virtual machine in an advanced zone it
is useful to associate a NAT rule to be able to SSH into the virtual machine
for post processing the effects on the virtualmachine like testing connectivity
to the internet for instance. PostGeneration hooks work after factories have
been created to perform such special functions. For examples, check the
marvin.factory.data.vm module for the VirtualMachineWithStaticNat factory
where we create a static nat rule allowing SSH access to the created VM.
Guidelines for defining new factories
All factories are auto-generated and there is no need to define the default
factories. Test case authors will mostly be creating data factories inherited
from the default factories. All the data factories are defined in
marvin.factory.data. Currently implementations are provided for often used
data objects.
- networkoffering
- networks
- service and disk offerings
- security groups
- virtualmachine
- vpcoffering
- vpcvirtualmachine
- firewallrules
- ingress and egress rules
and many more implementations should serve as examples to extend new data objects.
Factory naming convention is simple. Any data inheriting from default factory
EntityFactory should be named without the suffix Factory. The data should
take the name of the purpose of the factory. Use simple prepositions
(Of,And,With etc) to combine words. For instance: VirtualMachineWithStaticNat
or VirtualMachineInIsolatedNetwork. Naming the data clearly aids its widespread
use. A badly named factory will likely not be used in more than one test.
Should DSL assertions
The typical assertion capabilites of unittest are enough to express all validation but it does not read naturally. Should_dsl is a library that makes the assertions read like natural language. This is installed by default with marvin now enabling all test cases to write assertions using simple dsl statements
eg:
vm = VirtualMachineIsolatedNetwork(apiclient)
vm.state | should | equal_to('Running')
vm.nic | should_not | be(None)
Utilities
All the pre-existing utilities from the previous util.py are still available
with enhancements in the util.py module. The legacy util.py module is
deprecated but retained since older tests refer to this module. All new changes
should go to the util.py under marvin/
unittest2 and nose2
Marvin earlier was coupled with Python2.7 since python's unittest did not have the same capabilites in versions <2.7. With unittest2 all features are now backported to older python implementations. Marvin has also switched to unittest2 so that we don't have to depend on the specific version of python to be able to install and use marvin for testing. This change is internal and should not be felt by the test case writer.
There are plans to move to nose2 as well but this is separated from factory work at the moment.
Legacy Libraries and Tests
In order to not disrupt the running of existing tests all the older libraries
in base.py, common.py and util.py are moved to the legacy module. Any new
tests should be written using factories. Older libraries are retained to be
able to run our existing tests whose imports will be switched as part of this
refactor.