\chapter{Evaluation} % English: Evaluation
\label{chap:evaluation}

This chapter summarizes the results of the experiments. At the center of the evaluation stands the comparison of the energy consumption of the default \emph{Contiki}-\ac{RPL} and the version with added persistence of routing information. Another important point is how the network topology behaves in the different configurations and how the network reacts to resetting nodes. For this, the power consumption of the individual nodes and the complete network, the stability of the \ac{DAG} and the performance of the network are evaluated.

\section{Firmware Configurations}
\label{sec:confs}

For the firmware, three different modes have been evaluated. The first mode uses the default \emph{Contiki}-\ac{RPL} (N), the second uses the hardened implementation with only the storing of persistent state enabled (H), and in the third mode the sanity of the routing information is additionally checked using \acp{UID} and the clock values of the neighboring nodes (HS). Each mode is tested for a network where a reset occurs (R) and for a network where no resets occur. This results in six different variants of the experiment, which are shown in \autoref{tab:variants}.

\begin{table}[h]
  \centering
  \caption{Experiment configurations}
  \begin{tabular}{r c c c}
    \toprule
    Test run & Hardened & Sanity & Resets \\
    \midrule
    N   &   &   &   \\
    R   &   &   & X \\
    H   & X &   &   \\
    HR  & X &   & X \\
    HS  & X & X &   \\
    HSR & X & X & X \\
    \bottomrule
  \end{tabular}
  \label{tab:variants}
\end{table}

\section{Influence of Environmental Factors}

When evaluating the data obtained from \fitlab, some environmental factors that have an influence on wireless transmissions have to be taken into account.

\subsection{Radio Interference}

The radio of the \emph{M3} node uses the 2.4 GHz \ac{ISM} band. One problem with this is that this shared spectrum suffers from interference with other users, since it is widely used for other network applications (e.g. \emph{WiFi}).
This means that interference with surrounding devices in the building and with other experiments in the testbed is a legitimate concern. As long as other devices use some form of \ac{MAC} that is compatible with the one used in IEEE 802.15.4, this is less of a problem, since different senders are able to coordinate their transmissions to some degree. Because the complete networking stack the \ac{ON} uses can be user-defined, as is the case with all software running on the \ac{ON}, this may not always be given. In the case of previous experiments by Müller et al. \cite{mueller2017}, this problem was addressed by only running the experiments at night, when other senders are less active. In the case of \fitlab, where experiments may run unsupervised, this is likely not a valid strategy. Another mitigation that has been performed is to select channel 16 in the 2.4 GHz spectrum in use for IEEE 802.15.4 and then capture the traffic that can be seen on this channel. Also, the \ac{API} allows monitoring whether other experiments are scheduled during an experiment, so that a more appropriate time can be selected instead. Since we use almost all nodes that are available for experiments and not down for maintenance, other experiments are effectively prevented from interfering with ours. Selecting channel 16, in this case, has also proven to be effective in limiting interference from other networks.

\subsection{Signal Propagation}

As can be seen in \autoref{fig:testbed}, the test network is located on a floor of a building. This building has multiple floors. The larger part of the building on the right is separated from the smaller part of the building on the left in some places by a wall. There are a few pillars between some of the test nodes. All of them present obstacles for the propagation of the signal. As for the influence they have on the experiment, it can be noted that the positions of all obstacles remain constant during and between the experiments.
However, the physical topology (i.e. which nodes are neighboring each other) differs from the strictly linear topology suggested by the map.

\section{Topology of the RPL Network}

The topology of the network has a large influence on how much the network is affected by a resetting node, as was previously shown by Kulau et al. \cite{kulau2017energy}. For the different runs of the experiment to remain comparable, it is necessary that the network obtains a similar topology under the same conditions.

\subsection{Ideal Network Topologies}

As previously established, the effect resetting nodes have on the network depends on the position and role of the node inside the network. In a tree-like topology as depicted in \autoref{fig:treetop}, nodes have at most one alternative parent, and the entire sub-tree underneath the resetting node will be affected, as its nodes lose their default route to the root node. For a mesh-like topology, such as in \autoref{fig:meshstar}, the network may be able to recover quickly after the nodes have selected one of their alternative parents. While these topologies serve the purpose of illustrating which factors play into the behavior of a recovering network, it is not easily possible to recreate such topologies in a real-world scenario, since the topology of the network is limited by the different configurations \fitlab offers.
\begin{figure}
  \centering
  \begin{tikzpicture}[<-,>=stealth', level/.style={sibling distance = 5cm/#1, level distance = 1.5cm}, font = \small, every node/.style={circle,draw}]
    \node {1}
      child { node {2}
        child { node {5} child { node {11} } child { node {12} } }
        child { node {6} child { node {12} } child { node {13} } } }
      child { node {3}
        child { node {7} }
        child { node {8} child { node {14} } child { node {15} } }
        child { node {9} } }
      child { node {4}
        child { node {10} child { node {16} } child { node {17} } child { node {18} } }
        child { node {11} child { node {19} } } };
  \end{tikzpicture}
  \caption{Tree topology}
  \label{fig:treetop}
\end{figure}

% TODO build in \latex \tikz
\begin{figure}
  \centering
  \includegraphics[width=.5\textwidth]{../images/sim_star_new.pdf}
  \caption{Star mesh topology \cite{mueller2017}}
  \label{fig:meshstar}
\end{figure}

\subsection{Measured Topology}

Every experiment in \fitlab converges upon a very similar \acs{DAG}. An example of such a \ac{DAG} is shown in \autoref{fig:dagexample}. One thing that can be observed is that the resulting tree has up to six layers. It should be noted that most nodes are at an equal distance from one another, as displayed in \autoref{fig:testbed}, and the links can be assumed to be of similar quality.

\begin{figure}
  \centering
  \includegraphics[width=\textwidth]{../images/dag.pdf}
  \caption{An example of a DAG generated for the configuration used in the evaluation}
  \label{fig:dagexample}
\end{figure}

\subsubsection{Relationship to the Physical Topology}

One common property of the \acp{DAG} is that node 159 is the root of the largest sub-tree. When comparing the nodes of this sub-tree to their positions on the map, one noticeable property is that the sub-tree even contains nodes that are physically closer to nodes from other sub-trees. In the depicted tree, an example is node 196, which joins the sub-tree of 159 instead of acquiring the directly neighboring node 155 as a parent.
A comparison with the shape of the surrounding building shows that these two nodes are divided by two outside-facing walls, while the path across 159, 200 and 224 is only obstructed by a drywall, which presents less of an obstacle to the radio signal.

\subsubsection{Selecting a Node to Reset}

Initial evaluations of the resulting topologies give an indication of which node needs to be reset to have a measurable effect on the network. Node 200 has been selected to be reset at a single random time during the phases with resets (R, HR and HSR). The reasons for this are that node 200 has a large enough sub-tree in most runs of the experiment to affect enough nodes, and that it is the node most frequently selected as a preferred parent when connecting the two halves of the building.

\subsection{Route Stability}

The stability of the network is determined by how stable the conditions are on which the routing protocol bases its decisions. For a network where resets occur, these conditions change upon the reset of a node, and the routing protocol reacts to this situation. Therefore, the number of changes of the routing decisions is a measure of the stability of the network created by the routing protocol.

\begin{figure}
  \centering
  \includegraphics[height=.3\textheight]{../images/stability.pdf}
  \caption{Number of changes of the default route of any node during a phase}
  \label{fig:stability}
\end{figure}

\autoref{fig:stability} shows the number of changes of the default route of any node during each phase of an experiment. The default implementation in phase N causes the fewest changes if no single node resets occur. The hardened implementations of H and HS, on the other hand, lead to more changes. This may be due to the processing of, or the restoring and subsequent invalidating of, a previously stored invalid state from persistent memory.
The number of changes for the hardened implementations during a phase with single node resets is smaller than for the default implementation. This means that the hardened implementations recover more easily from single node resets than the default implementation does.

\subsubsection{Relation to Location}

\autoref{fig:hmroutes} shows a heat-map of the distribution of default routes during each phase. Each cell results from the number of times the route has been selected at the end of an interval of 10 seconds during the phase of an experiment. This number was then normalized by the length of the phase during the experiment, since the length of a phase may vary due to variations in how fast the test-lab reacts to instructions given by the orchestration component. Thus, routes that are selected more often are shown in a darker hue, while routes that are rarely selected are shown in a lighter hue.

\begin{figure}
  \centering
  \includegraphics[width=\textwidth]{../images/routes.pdf}
  \caption{Heat-map of the normalized choice of default routes for the different phases}
  \label{fig:hmroutes}
\end{figure}

The most noticeable thing about these maps is that the distribution of routes does not vary much between the different phases, in that the maps are almost identical. One thing to conclude from this is that the choice of the \ac{DAG} is not altered in a relevant way by adding the persistence mode in phase H or by additionally validating \acp{UID} in phase HS. This means that there is no drawback to enabling the hardened implementation with regard to the choice of an optimal \ac{DAG}. Another thing to be noted is that, when viewing each row of the map, some nodes show a distribution of routes which is very dense for a small number of different neighbors, while others do not have such routes. When compared to the topology of the \ac{DAG}, the nodes that offer more stable routes are generally inner nodes of the tree (i.e. not leaves).
Nodes that tend to change routes more often join the tree as leaf nodes. Comparing the phases that have resets with their counterparts shows that there are more changes between the N and the R phase than between the H and the HR or the HS and the HSR phases. This can be interpreted to mean that more route changes may have occurred during the reset of the node in phase R. This would mean that the network is in a more unstable state during R than in HR and HSR.

%\subsubsection{Relation to Rank and Number of Neighbors}
%
%\autoref{fig:rankvsneighvschanges} shows the pairwise relationship of the rank
%of a node, its number of neighbors and the number of changes of its preferred
%parent selection.
%
%% TODO create new figure
%
%%\begin{figure}
%%  \centering
%%  \includegraphics[width=\textwidth]{../images/changes.pdf}
%%  \caption{Rank, default parent changes, number of neighbors}
%%  \label{fig:rankvsneighvschanges}
%%\end{figure}
%
%With increased rank, the number of changes of the preferred parent increases.
%This may cause an increase in energy consumption. One possible explanation may
%be that the possible number of nodes that may fail along the path to the sink is
%increased when a node has a higher rank. This is supported by the fact that the
%increase in changes is higher for the phases that include resetting a node. As
%such it can be expected to see a noticeable increase in power consumption for
%these phases (see \autoref{fig:consum-rank.pdf}).

\subsection{Convergence Time}

When considering the resulting routing topology, the time it takes the network to converge upon one topology is also of interest. The longer any single node in the network takes to acquire a preferred parent, the more routing messages have to be transmitted. \autoref{fig:convtime} shows the convergence time of the network for each phase. The phases without resets are grouped to the left and the phases with resets are displayed on the right.
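
One way to make this notion precise, given here purely as an assumed formalization and not as the definition used by the measurement scripts, is to take the convergence time of a phase as the time until the last node has acquired a preferred parent. Let $t_0$ denote the start of the phase, $V$ the set of nodes and $t_i$ the time at which node $i$ first selects a preferred parent (all three symbols are introduced here for illustration only):
\begin{equation}
  T_\mathrm{conv} = \max_{i \in V} \left( t_i - t_0 \right)
\end{equation}
Under this reading, a single slow node determines $T_\mathrm{conv}$ for the whole network, which matches the observation above that the slowest node drives the amount of routing traffic.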
\begin{figure}
  \centering
  \includegraphics[height=0.3\textheight]{../images/convergence.pdf}
  \caption{Network convergence time for each phase}
  \label{fig:convtime}
\end{figure}

It is noticeable that both the H and HS phases, which use the hardened implementation, generally have longer convergence times than the default implementation of \ac{RPL}. In the case of the restoration of the routing state from persistent memory, as in phases H and HS, all previously recorded \ac{DIO} messages are replayed to the \ac{RPL} module of \emph{Contiki}. This implies that the time it takes to process these messages and to make changes to the saved \ac{DAG} and routing table adds to the time it takes to choose a preferred parent and therefore lengthens the network formation time. It can be assumed that much of this time is spent writing this data to and reading it from the persistent memory.

In the case of the HS and HSR phases, additional \ac{DIO} messages are exchanged to verify the stored routing information and messages before restoring them. These messages contain \acp{UID} to identify the information and measure its freshness using the clock the implementation keeps for the local routing information. Presumably, the time it takes to exchange these messages further adds to the delay until a suitable preferred parent is selected for each node.

Another remarkable thing is how much the convergence time varies for the H and HS phases compared to the phases without the hardened implementation, regardless of whether there are resets or not. As a consequence, the network forms in a more reliable manner with the default implementation.

An interesting observation about the difference in the convergence time between the phases with resets and those without is that the convergence time is generally shorter for a network with resets. The inverse would be expected. The exact reason for this can only be speculated upon.
If a reset occurs during the initial formation of the \ac{DAG} during a phase, the resetting node might not partake in the formation of the network. This would mean that fewer alternative paths exist inside the network from which to choose, which might lead to the network converging faster.

% TODO network convergence time directly after reset

\section{Energy Consumption}

% TODO error from measuring 2nd phases --> indirect comparison by # messages ...

This section discusses the energy consumption of the test network and how it changes based on the implementation in use and on whether a single node reset occurs. \autoref{fig:consphases} shows the total energy consumption of the network during the different phases. Nodes 200 and 157 have been excluded, since they act as the resetting node and the root node.

\begin{figure}
  \centering
  \includegraphics[height=0.3\textheight]{../images/consumption-phases.pdf}
  \caption{Total consumption except nodes 200 and 157}
  \label{fig:consphases}
\end{figure}

For a network in which no resets occur, the consumption of the default implementation (N) is significantly lower than for both versions of the hardened implementation (H, HS). One possible factor in this may be the energy spent on writing the persistent state to the flash memory. Another may be the additional computing time spent on processing the restored state and, in the case of the HS phase, the exchange of \ac{DIO} messages. This effect is amortized when recovering from a reset, where the default implementation (R) uses more energy than the hardened version (HR). In some instances, the extended hardened implementation (HSR) uses less power than the default implementation, but the mean of the consumption is higher for HSR. This means that the additional exchange of messages to verify the state stored in persistent memory consumes more energy on average than the restoring of the persistent state saves.
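
As a sketch of how the total shown in \autoref{fig:consphases} can be thought of, assume that each node $i$ reports power samples $P_i(t_k)$ at a fixed sampling interval $\Delta t$. The symbols and the sampling scheme are assumptions made for illustration and do not describe the exact processing used for the figure. With $V$ the set of nodes and $K$ the number of samples taken during a phase, the energy of the network during that phase is then approximately
\begin{equation}
  E_\mathrm{phase} \approx \sum_{i \in V \setminus \{157,\, 200\}} \; \sum_{k=1}^{K} P_i(t_k)\, \Delta t
\end{equation}
where the exclusion of nodes 157 and 200 mirrors the selection described above. Since $K$ grows with the length of a phase, totals computed this way are only directly comparable between phases of similar duration.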
\subsubsection{Constant Error between Consecutive Phases}

The comparison of the phases in which no resets occur (N, H, HS) with the phases with resets (R, HR, HSR) shows that a smaller energy consumption is measured for the phases with resets. For each firmware, the phases with and without resets run consecutively. In each series of measurements and for each individual node, the power consumption of the second phase is smaller over the complete duration of the phase. This leads to the conclusion that this behavior is not triggered by the single node reset, but rather caused by an external factor. For this reason, the measured values of the energy consumption of the individual phases are only valid for comparison between phases that either all have resets or all do not. For the comparison of phases that use the same firmware (e.g. N and R), other variables can be used, such as the number of protocol messages and the number of changes of the preferred parent.

%TODO measurement sequence

\subsection{Consumption of the DAG Root and Resetting Node}

\autoref{fig:cmpsinkreset} shows the energy consumption of the sink node and the resetting node, which were excluded from the total consumption of the network. Node 123 is shown for comparison, as it is close to the average of the consumption of all other nodes.

\begin{figure}
  \centering
  \includegraphics[height=0.3\textheight]{../images/consumption-hosts.pdf}
  \caption{Energy consumption for the sink node and the resetting node}
  \label{fig:cmpsinkreset}
\end{figure}

When viewed separately, the energy consumption of these nodes differs widely from that of the other nodes and from each other, as seen in \autoref{fig:cmpsinkreset}. In the case of the sink node, this is because, to minimize packet loss, its radio is forced to the \ac{RX} state, and because it acts as the \ac{UDP} sink and thus has to process many packets. The resetting node can be expected to consume much less power while it is resetting, since a restart involves power-cycling the node.
Thus, for the phases with resets (R, HR, HSR), the consumption of node 200 is lower, while node 157 consumes significantly more energy than the average node. Such a sink node would typically be powered from the power grid.

%\subsection{Relation to Rank}
%
%% TODO
%
%As can be viewed from \autoref{fig:relconsum}, the energy consumption increases
%with the rank for HS and HSR. For the N and H, this relationship is inverted.
%
%\begin{figure}
%  \centering
%  \includegraphics[height=0.4\textheight]{../images/consumption-regress.pdf}
%  \caption{Relation of rank and consumption of a node}
%  \label{fig:relconsum}
%\end{figure}

\subsection{Relation to Position inside the Testbed}

\autoref{fig:posenergy} displays the positions of the nodes inside the testbed. Every cell shows a color representing the relative energy consumption associated with that node. Lighter colors represent a lower energy consumption, whereas darker colors represent a higher energy consumption.

\begin{figure}
  \centering
  \includegraphics[width=0.4\textheight]{../images/consumption-nodes.pdf}
  \caption{Energy consumption of nodes arranged by their position}
  \label{fig:posenergy}
\end{figure}

One noticeable thing about this distribution is that node 87 has the highest energy consumption in all phases. When looking at the \ac{DAG}, this node is mostly located at a lower rank. Nodes that are physically located closer to the root node 157 tend to have a lower energy consumption than surrounding nodes. A wall separates the nodes 192, 194 and 196 from 157. This coincides with these nodes having a higher energy consumption.

\section{Network Performance}

In this section, the network performance and the control overhead for the different phases are evaluated.

\subsection{End-to-End}

\autoref{fig:perf} shows the average end-to-end delay for all nodes during each phase of the experiment. In phase H, shorter delays are possible. At the same time, the distribution varies more.
With the added sanity checks in phase HS, the distribution is more concentrated around 2 ms. The default implementation in N lies somewhere in between the two.

%\begin{figure}
%  \centering
%  \subfloat[delay]{{\includegraphics[width=.5\textwidth]{../images/performance-delay.pdf}}}%
%  \qquad
%  \subfloat[jitter]{{\includegraphics[width=0.5\textwidth]{../images/performance-jitter.pdf}}}%
%  \subfloat[loss]{{\includegraphics[width=0.5\textwidth]{../images/performance-loss.pdf}}}%
%  \caption{Delay, jitter, loss for each phase}
%  \label{fig:perf}
%\end{figure}

\begin{figure}
  \centering
  \subfloat[delay]{{\includegraphics[width=.5\textwidth]{../images/performance-delay.pdf}}}%
  \subfloat[loss]{{\includegraphics[width=.5\textwidth]{../images/performance-loss.pdf}}}%
  \caption{End-to-end delay and packet reception rate}
  \label{fig:perf}
\end{figure}

The packet loss during each phase is displayed in \autoref{fig:perf}. For a scenario without single node resets, the default implementation fares best, while in a scenario with single node resets the hardened version without the use of \acp{UID} loses the fewest packets. If the sanity checking of the persistent state is additionally enabled, the most packets are lost. One possible explanation for this is that if the persistent state is directly restored, this state is sufficient most of the time for the forwarding of newly arriving packets. If the node must first validate the saved state, it loses time during which arriving packets may be dropped. This would suggest that the validation of the saved state is actually slower than the default method of recovery.

\subsection{Control Overhead}

The number of messages that need to be emitted during the repair operations of the \ac{DAG} determines the utilization of the radio of the node. It is to be expected that a large part of the energy consumption of each node is determined by the number of messages it emits.
Thus, when evaluating the efficiency of the different implementations and the impact of the single node resets, the overhead of messages that are transmitted by the implementation serves as an important measure.

\autoref{fig:overhead} shows the overhead created by control messages during each phase, broken down by message type. For each type, the default implementation creates the fewest additional messages, while the number of messages is the highest for the implementation used in HS. This may be attributable to the higher number of messages exchanged during the validation process. The larger overhead created during phase H is likely due to an old state being restored from previous runs and then invalidated.

\begin{figure}
  \centering
  \includegraphics[height=0.3\textheight]{../images/performance-overhead.pdf}
  \caption{Overhead created by \ac{RPL} messages}
  \label{fig:overhead}
\end{figure}

For the phase R, the overhead is higher than for the phase N, where no resets occur. The inverse is true for the phases with hardened implementations. Here, the effect of an old state being restored and then invalidated is later canceled out when the state is restored after the node reset, and less overhead is actually created than for the default implementation. It should also be noted that there is no significant difference in the overhead between the HR and HSR phases, which means that the implementation used in HS and HSR does not offer a clear benefit over the implementation used in H and HR in terms of message overhead.

\subsubsection{Consumption}

\begin{figure}
  \centering
  \includegraphics[width=\textwidth]{../images/performance-consumption.pdf}
  \caption{Relationship between overhead and energy consumption}
  \label{fig:overconsum}
\end{figure}

As can be seen from \autoref{fig:overconsum}, the number of control messages correlates with the observed consumption. For a larger overhead, the total consumption increases proportionally.
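
Read as a simple linear model, and assuming that part of the consumption is independent of the control traffic, this relationship could be written as
\begin{equation}
  E \approx E_0 + c \cdot M
\end{equation}
where $M$ is the number of control messages in a phase, $E_0$ is a baseline consumption and $c$ is an average energy cost per control message. These symbols are introduced here for illustration only; neither $E_0$ nor $c$ has been fitted to the measurements, and the model is a sketch of the observed trend rather than a result of the evaluation.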