.. _reproducibility:

Reproducibility guide
===============================

.. attention::

   Everything needed for this lab (including MAD and EnOS) is available at
   https://gitlab.inria.fr/Madeus/mad-openstack

Topology deployed in this lab
-------------------------------

The lab makes use of *EnOS* to book the resources on Grid'5000, and *MAD* to
deploy OpenStack on those resources. In particular, we will need four G5K
machines for our deployment:

1. **mad-node**: a machine we will deploy ourselves to run *EnOS* and *MAD*;
2. **control node**: hosts the control modules, projects' APIs and databases;
3. **network node**: hosts network agents;
4. **compute node**: manages the compute modules where guest VMs live.

Note that while we will deploy *mad-node* ourselves on G5K, the three other
nodes will be deployed automatically by EnOS. The following figure depicts the
status of the different components in play during the lab:

.. literalinclude:: topology.txt

EnOS will be in charge of provisioning the compute, control and network nodes.
Afterwards, *MAD* will deploy inside each node the Docker containers that
correspond to OpenStack services. For instance, the *control-node* will host
the *nova-api* and *nova-scheduler* containers, while the *compute-node* will
host the *nova-compute* and *nova-libvirt* containers to provide VM hypervisor
mechanisms.

Note that to deploy on G5K, we need a dedicated node to run *EnOS* and *MAD*,
because running experiments on the frontend is discouraged. This restriction
is meant to avoid disturbing other users who are logged in to the frontend
node, since it has limited resources. On a regular deployment, EnOS could be
run directly from your laptop.

Provisioning of the mad-node
------------------------------

The first step is to determine on which cluster you will deploy OpenStack. To
that end, you can run *funk* (Find yoUr Nodes on g5K) from any frontend to see
the availability on G5K:

.. code-block:: console

   user@laptop:~$ ssh rennes.g5k
   frennes:~$ funk -w 4:00:00

In this example, we check the availability of G5K's clusters for the next four
hours (adapt the duration to your situation). Note that you can change the
duration of your reservation afterward, using the *oarwalltime* command. Find
a cluster with at least four nodes available before going further.

Once this is done, first reach the cluster's site, and then get a new machine
which we will use as our Madeus node (mad-node). In this document, we target
the *nova* cluster, located at the *Lyon* site:

.. code-block:: console

   frennes:~$ ssh lyon
   flyon:~$ tmux

Note that we created a ``tmux`` session in order to be resilient to any
failure during your ssh session. Whenever you want to restore this session,
you can connect to the frontend and attach to your *tmux session*, as follows:

.. code-block:: console

   flyon:~$ tmux attach

Then, there are two strategies to set up the Madeus node:

1. The first one (the easiest and fastest one) can be sufficient if you are
   not interested in collecting metrics from the mad-node. It sets up the
   mad-node without *kadeploy*.
2. Otherwise, you should deploy this node with *kadeploy*.

Deploy the mad-node without Kadeploy
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: console

   flyon:~$ oarsub -I -l "nodes=1,walltime=4:00:00" -p "cluster='nova'"

In this example, we get a new machine in interactive mode (i.e. ``-I``) for
the next four hours from the *nova* cluster. If it succeeds, you should be
directly connected to this node (check your prompt).

Deploy the mad-node with Kadeploy
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: console

   flyon:~$ oarsub -I -l "nodes=1,walltime=4:00:00" -p "cluster='nova'" -t deploy
   flyon:~$ MAD_NODE=$(cat $OAR_NODE_FILE | uniq)
   flyon:~$ kadeploy3 -f ${OAR_NODE_FILE} -e debian9-x64-big -k
   ...
   flyon:~$ # The previous command can take some time...
   ...
   flyon:~$ ssh root@${MAD_NODE} apt-get install dirmngr
   flyon:~$ ssh ${MAD_NODE}

Compared to the previous *oarsub* command, the difference lies in the
``-t deploy`` option, which sets your job in "deploy" mode. Then, we ask
*kadeploy* to provision the node with a G5K ``debian9-x64-big`` environment.
Finally, we need to install ``dirmngr`` on the node.

Install MAD and EnOS on the mad-node
-------------------------------------

Download MAD in your working directory, and install it:

.. code-block:: console

   user@mad-node:~$ git clone https://gitlab.inria.fr/dpertin/mad_model -b expe/openstack mad
   user@mad-node:~$ cd mad
   user@mad-node:~/mad$ make install_deps
   user@mad-node:~/mad$ source venv/bin/activate

Note that MAD is a Python project. We installed it inside a virtual
environment, with *virtualenv*, to avoid any conflict regarding the versions
of its dependencies. Furthermore, it does not install anything outside the
virtual environment, which keeps your OS clean. Remember that you have to be
in the virtual environment to use MAD. It means that if you open a new
terminal on the *mad-node*, you need to re-enter the *venv*. For instance, now
that MAD is installed, you can come back as follows:

.. code-block:: console

   user@laptop:~$ ssh lyon.g5k
   user@flyon:~$ cd ~/mad
   user@flyon:~/mad$ source venv/bin/activate

Please check that everything is fine by running:

.. code-block:: console

   (venv) user@mad-node:~/mad$ ./bench.py -h

Prepare the Benchmark
-------------------------------

Set the benchmark parameters
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The benchmark parameters are set in the ``bench_params.yaml`` file. In this
file, you can define many benchmark scenarios that can be selected when you
call the MAD benchmark tool. Here is a scenario example:

.. literalinclude:: bench_params.yaml
   :linenos:
   :language: yaml

In a scenario, we first set the *Execo* parameters under the key
``parameter``:

* *tool*: the list of tools to consider (``mad`` and/or ``shell``);
* *test_type*: the list of tests to benchmark;
* *registry*: the configuration of the registry (``cached``, ``local`` and/or
  ``remote``);
* *repeat* (optional): the list of iteration indices to run.

Furthermore, we need global parameters:

* *iterations* (optional): the number of iterations;
* *monitoring*: when set to True, it deploys the monitoring stack on nodes;
* *reservation_file*: path to the ``reservation.yaml`` file required by EnOS.

For instance, the ``perf`` scenario is designed to compare multiple Madeus
component definitions. Here it compares the deployment time of Ansible-like
(i.e. ``seq_1t``), Aeolus-like (i.e. ``dag_2t``) and Mad-like (i.e.
``dag_nt4``) definitions, across three registry configurations.

Set the EnOS parameters
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To book resources, EnOS reads a ``configuration`` file. This file states the
OpenStack resources you want to measure together with their topology. A
configuration could say, "Deploy a basic OpenStack on a single node", or "Put
OpenStack control services on ClusterA and compute services on ClusterB", but
also "Deploy each OpenStack service on a dedicated node and add WAN network
latency between them". The description of the configuration is done in a
``reservation.yaml`` file:

.. literalinclude:: reservation_g5k_perf.yaml
   :linenos:
   :language: yaml

Use your favorite text editor to open/create the ``reservation.yaml`` file,
for instance: ``vim enos/reservation_g5k_perf.yaml``, and edit the file to fit
your situation. In the following, we will study two parts of the document in
particular:

1. ``provider`` key: defines on which testbed to deploy OpenStack;
2. ``resources`` key: defines how many and what machines to deploy on the
   testbed.

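Before looking at these two keys in detail, it can help to confirm that an
edited file still contains both of them. The following is a minimal sketch,
not part of EnOS or MAD: ``check_reservation`` is a hypothetical helper, and it
assumes the YAML file has already been parsed into a Python dictionary (for
instance with ``yaml.safe_load`` from PyYAML). The sample values are
placeholders, not the content of ``reservation_g5k_perf.yaml``.

```python
def check_reservation(config):
    """Return a list of problems found in a parsed reservation mapping.

    Hypothetical helper: it only checks that the two top-level keys
    discussed in this guide are present and non-empty.
    """
    problems = []
    for key in ("provider", "resources"):
        if key not in config:
            problems.append("missing top-level key: " + key)
        elif not config[key]:
            problems.append("empty top-level key: " + key)
    return problems


# Placeholder content standing in for a parsed reservation file.
sample = {
    "provider": {"type": "g5k", "walltime": "04:00:00"},
    "resources": {"nova": {"control": 1, "network": 1, "compute": 1}},
}

# An empty list means that both keys were found.
print(check_reservation(sample))
# Dropping the "resources" key is reported as a problem.
print(check_reservation({"provider": sample["provider"]}))
```

Such a check only looks at the overall shape of the file; EnOS itself remains
the reference for what each provider accepts.
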
First, this file provides a description of the desired testbed, under the
``provider`` key. The way you describe your topology may vary a little
depending on the testbed you target. The current EnOS implementation supports
Vagrant (VBox), Grid'5000, Chameleon and OpenStack itself. To that end, EnOS
defines different ``providers``, which are in charge of provisioning resources
on a specific testbed. Please refer to the EnOS provider documentation to find
examples of resource descriptions for each testbed. For the sake of this lab
we are going to use the Grid'5000 provider. In particular, pay attention to
two elements you might need to adapt here:

1. *walltime*: defines the duration of your reservation;
2. *vlans*: to interconnect your resources on G5K, EnOS relies on KaVLAN. A
   VLAN must be reserved, prior to the experiment, on the site where you plan
   to deploy OpenStack. In our example, we reserve a VLAN from the ``rennes``
   site. Adapt this value to the G5K site you plan to use.

Second, the description of the desired resources is declared in the
``reservation_g5k_perf.yaml`` file, under the ``resources`` key. Here we
declare the G5K cluster we target, as well as the resources we want to deploy.
In the provided file, we declare three machines on the ``nova`` cluster (which
is located in Lyon): a *control* node, a *compute* node and a *network* node,
on which all the required OpenStack services will be deployed.

.. note::

   In a general case, you can generate this file by typing
   ``enos new > generated_reservation.yaml``, which you can explore out of
   curiosity. It contains definitions for different deployment testbeds: not
   only G5K but also Vagrant, Chameleon, etc.

Benchmark
-------------------------------

Launch the benchmark scenario with:

.. code-block:: console

   (venv) user@mad-node:~/mad$ ./bench.py -L -M -f bench_params.yaml -t perf

Note that at the end of your session, you can release your reservation by
typing:

.. code-block:: console

   (venv) user@mad-node:~/mad$ cd enos
   (venv) user@mad-node:~/mad/enos$ source venv/bin/activate
   (venv) user@mad-node:~/mad/enos$ enos destroy --hard

This will destroy your whole deployment and delete your reservation.

Since *Execo* keeps records of the benchmark in a directory called
``MadBench_``, you can stop the benchmark at some point, then resume it where
it stopped by using the ``-c`` parameter as follows:

.. code-block:: console

   (venv) user@mad-node:~/mad$ ./bench.py -L -M -f bench_params.yaml -t perf -c MadBench_

Please be sure to adapt the record directory ``MadBench_`` to your situation.
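
When several record directories exist, picking the one to resume from can be
automated. The following is a minimal sketch, assuming that the record
directories sit in the current working directory and that their names start
with ``MadBench_``; ``latest_record_dir`` is a hypothetical helper, not part of
MAD or Execo.

```python
from pathlib import Path


def latest_record_dir(base=".", prefix="MadBench_"):
    """Return the most recently modified record directory, or None."""
    candidates = [
        path for path in Path(base).iterdir()
        if path.is_dir() and path.name.startswith(prefix)
    ]
    # max() with default=None avoids an error when nothing matches.
    return max(candidates, key=lambda path: path.stat().st_mtime, default=None)


if __name__ == "__main__":
    record = latest_record_dir()
    if record is None:
        print("no record directory found")
    else:
        # Print the resume command instead of running it.
        print("./bench.py -L -M -f bench_params.yaml -t perf -c " + record.name)
```

The script only prints the command, so you can review the selected directory
before resuming the benchmark.
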