research-article

Open access

Towards the Legibility of Multi-robot Systems

Authors:

Beatrice Capelli,

María Santos,

Lorenzo SabattiniAuthors Info & Claims

ACM Transactions on Human-Robot Interaction, Volume 13, Issue 2

Article No.: 21, Pages 1 - 32

https://doi.org/10.1145/3647984

Published: 14 June 2024 Publication History

PDF eReader

Abstract

Communication is crucial for human-robot collaborative tasks. In this context, legibility studies movement as the means of implicit communication between robotic systems and a human observer. This concept has been explored mostly for manipulators and humanoid robots. In contrast, little information is available in the literature about legibility of multi-robot systems or swarms, where simplicity and non-anthropomorphism of robots, along with the complexity of their interactions and aggregated behavior impose different challenges that are not encountered in single-robot scenarios. This article investigates legibility of multi-robot systems. Hence, we extend the definition of legibility, incorporating information about high-level goals in terms of the coordination objective of the group of robots, to previous results that focused solely on the legibility of spatial goals. A set of standard multi-robot algorithms corresponding to different coordination objectives are implemented and their legibility is evaluated in a user study, where participants observe the behavior of the multi-robot system in a virtual reality setup and are asked to identify the system’s spatial goal and coordination objective. The results of the study confirmed that coordination objectives are discernible by the users, hence multi-robot systems can be controlled to be legible, in terms of spatial goal and coordination objective.

1 Introduction

As robotic platforms become increasingly affordable, reliable, and efficient, multi-robot systems are expected to slowly infiltrate many activities in our everyday lives, blurring the divide between human and robot spaces. The inherent redundancy of multi-robot systems and robotic swarms, afforded by their multiple, and often interchangeable, individual components and their spatial flexibility will likely increase their deployment in a variety of applications, including precision agriculture, logistics, environmental monitoring, or surveillance, among others (see, e.g., Reference [60] for a survey).

While an efficient and safe coordination of the robots is imperative for the fulfillment of each specific task objective, the intersection of robot and human work-spaces demands effective communication among them. The need for such communication, which has become a prominent subject for the interaction between humans and individual robots (see Reference [21] for an example with manipulators), poses further challenges for the multi-robot scenario due do the complexity associated with the coordination among numerous individual robots. This hurdle becomes apparent in both communication directions: how can a human user exert control over the multi-robot system and how can the robots give some information to the human? While some authors have addressed the former question (e.g., References [28, 51]), literature addressing the latter is lacking.

Intention communication is the key to a fruitful interaction between robots and humans [27]. Therefore, in this article, we focus on legibility of multi-robot systems. In particular, we consider legibility as the ability of a robot, or in this case of a group of robots, to manifest its intention without the use of explicit communication. To this end, this work studies a series of standard coordinated control strategies and their adequacy in conveying spatial goals of the multi-robot system to a human observer. Furthermore, the different algorithms are analyzed to determine if a human observer is capable of distinguishing the coordination objective of the multi-robot system, defined by each specific decentralized coordinated control algorithm, and how different parameters that characterize the control algorithm may affect the interpretability of the multi-robot team motions.

The article is organized as follows. The remainder of Section 1 provides a literature review on different aspects related to legibility of robotic systems and outlines the main contributions of this article. The definition of legibility for multi-robot systems is presented in Section 2. Section 3 proceeds to describe the coordination objectives whose legibility is investigated in the article. Such investigation is conducted via a user study, whose procedure is outlined in detail in Section 4. Then, in Section 5, we report the results of the user study and their analysis. Main limitations of the achieved results are discussed in Section 6, while discussion and conclusions are provided in Section 7.

1.1 Literature Review

1.1.1 Proximal Interaction.

When considering Human-Robot Interaction (HRI), a key factor is the differentiation between proximal and remote interaction [36], i.e., the users share the same environment with the robot or they do not, respectively. In applications with remote interaction, the user and the robotic system are in different physical locations [52], e.g., search and rescue operations [50], aerospace repairs [30], bomb disposals [59], or surgical procedures [47]. This type of interaction is characterized by a user-robot communication by means of a physical instrument, such as a teleoperation tool or a joystick, that allows the user to provide the robot with input commands and, at the same time, allows the user to receive information from the robot, in terms of visual, audio, or force feedback.

Instead, in proximal interaction, the environment is shared by the human user and the robotic platform, and the communication is carried out directly through, e.g., gesture recognition, speech, and gaze engagement. This condition of proximal interaction will be the focus of this article. As the robots share the same environment with the user, in addition to communication, we need to take into account the safety of the user. For example, in physical-HRI, the robot, usually a manipulator, has to safely interact with the user, which is achieved by imposing velocity limits as a function of the distance to the user [25]. Even though safety is an issue in proximal interaction applications, in this work, we will not address this aspect due to the fact that the user study was carried out in virtual reality, and that the focus was mainly on legibility.

1.1.2 Implicit Communication.

From the point of view of the communication, the bi-lateral question still arises: how should the user communicate to the robot and how should the robot itself refer important information back to the user? This work focuses on the latter question. Along the lines of Reference [39], the approaches can be divided into explicit and implicit cues, based on the use of ad hoc apparatus for communication purposes. For example, explicit cues that have been used in the literature include LEDs [66], light indicators [45], voice [62], gaze movement of a humanoid robot [45], or Augmented Reality (AR) [12, 29, 56]. Although similar to a user interface, AR has the advantage of merging the information directly into the real sight of the user. However, the presence of a tool (e.g., smart glasses or a tablet), places an additional entity in the communication channel: there is no longer direct communication between the robot and the user. Furthermore, using an additional tool may not always be possible, due to task-related constraints (e.g., a tablet cannot be used, because the user is wearing gloves or is using some tool) or may require some specific training (e.g., the user needs to learn how to use smart glasses). The need for task-specific training also emerges with the use of explicit cues linked to colors or shapes, as some sort of alphabet is needed to match the cues to their meanings. The aforementioned considerations become even more impactful in the case of multi-robot systems: the cognitive complexity, defined in Reference [36] as the amount of cognitive resources necessary for a user to understand and provide input to a multi-robot system, significantly increases with the number of robots. Furthermore, considering the simple mobile robots typically adopted in multi-robot applications, the lack of anthropomorphism impedes the use of cues mimicking human communication. For instance, using voice cues would imply choosing which robot communicates, what information needs to be delivered, and how the communication channel is established.

Implicit cues, conversely, use directly the movement of the robots to convey information, thus leveraging human understanding of movement and the tendency of people to discover intentionality in everything, even in robots [16]. Along these lines, the authors of Reference [65] proposed the use of animation principles to communicate intent in assistive free-flyers. By manipulating the velocities and trajectories of the UAVs, they discovered how to effectively improve communication and that the users preferred motions with ad-hoc modification to infer some significance. The work presented in References [19, 20, 21] focused on manipulators and studied their legibility, i.e., their ability to communicate intent through movement, to improve collaborative tasks. In Reference [21] the authors investigated the difference between legibility and predictability, in Reference [19] they studied how to generate legible motions, and in Reference [20] they verified the utility of these legible motions into a collaborative task. In their studies the motion of the robot is generated by means of a planner that optimizes legibility itself. We took inspiration from their definition of legibility to proceed in our study toward multi-robot applications. Hence, preliminary works were presented in References [5, 9, 10], where the characteristics of the motion of a multi-robot system (see Section 1.1.3 for more details) were directly studied. However, no planner to generate legible motions was defined. Furthermore, high-level coordination objectives were not considered in these previous works, and are explicitly addressed in this article. Such high-level objectives may be linked to the studies on the communication of robots’ plan, which is a topic studied in the Artificial Intelligence (AI) field [12, 38, 75]. In this field, in particular in the context of Human-Machine Interface (HMI) and with mechanical interfaces, such as in teleoperation tasks, legibility is often referred to as transparency. In Reference [11] a detailed description of the differences and of the implementations of legibility and transparency in AI can be retrieved.

1.1.3 Legibility in Multi-robot Systems.

In the context of social applications, robots must effectively interact with people in an engaging manner [4], a requirement that oftentimes demands expressing emotion through their actions. In robotic swarms and multi-robot systems composed by simple, non-anthropomorphic robots, conveying emotions to an observer can pose an increased difficulty. In these cases, the emotional expressivity and readability of the movement of the robotic swarm has been linked to different parameters related to the individual control of the robots in the swarm and the inter-robot interactions [18, 42], under the context of standard collective behaviors such as cyclic pursuit or coverage [58] or relating them to dancing [63]. Regularity of the motion pattern and different levels of cohesion in the group of robots have been identified as important aspects in human understanding, as discussed in References [32, 61]. Building upon these ideas, in this article, we investigate the influence of the choice of control parameters on the capability, for the user, to infer spatial and coordination objectives of the multi-robot systems.

Legibility of a multi-robot system was introduced in our previous works [9, 10] as the ability of the multi-robot system to communicate to the user its spatial goal. Such ability was characterized based on two performance indices: (1) the correctness of the communication and (2) the time needed for this communication. The first question to answer was whether the motion of a multi-robot system was able to communicate some information to the user. We investigated this issue in our previous work [10], where we found out that some motion-variables (namely, the trajectory, the dispersion and the stiffness of the multi-robot system) were statistically relevant for the communication of the spatial goal of the group. The motion-variables were chosen to describe some important characteristics of the multi-robot system. The user study was conducted in a virtual reality setup, where the user was able to share the same environment with the robots. The kind of controller used to define the movement of the robots was based on artificial potential fields.

The following step was to understand if the same motion-variables could be used to communicate the same information, i.e., the spatial goal, of multiple groups of robots at the same time [9]. The results of this study were similar to those of the previous user study and the motion-variables confirmed to be statistically relevant with respect to the legibility of a multi-robot system.

In Reference [34] the legibility of multi-robot systems was further investigated, together with glanceability, defined as the ability to communicate a spatial goal throughout the whole movement of the robots. The authors focused on different types of control laws, similar to those implemented in this study, but they investigated also the effect of target difficulty, namely, whether the spatial goals had the same number of neighboring goals or not, and initial radius, namely, how much the robots were spread out at the beginning of the experiment. Only the legibility with respect to the spatial goal was analyzed, with rendezvous (i.e., consensus) being the most legible controller and the trajectory-based motion the best for glanceability. Along similar lines, a user study was performed in Reference [69] to understand if humans are able to identify aggregation/dispersion behaviors in swarms of robots. In both cases, the user studies were carried out on a standard screen and with a top-down view that is not suitable with our idea of proximal interaction. In fact, the user study reported in this article, and in the previous works [9, 10], were conducted in a virtual reality setup with the objective of imitating as closely as possible a proximal interaction scenario.

1.2 Statement of Contributions

This article studies the legibility of multi-robot systems. In particular, the contributions of this article are:

—

Extended definition of legibility for multi-robot systems;

—

Extended characterization of the legibility of spatial goals considering standard control laws through a user study;

—

Characterization of the legibility of coordination objectives, defined in terms of standard decentralized coordinated control strategies, in the user study.

Taking into account the results of previous works [9, 10], we extend the study on the legibility with respect to spatial goals. In particular, we consider the effect of different coordination objectives and of different numbers of robots on the legibility of the spatial goal. To this end, we identify a set of standard decentralized coordinated control laws, used extensively in the multi-robot systems literature, for realizing desired coordination objectives. We then investigate the legibility characteristics of each of these control laws. To the best of the authors’ knowledge, this is the first work to investigate the legibility of coordination objectives.

Our previous advances on the legibility of multi-robot systems [5, 9, 10] focused on a microscopic approach, adding to the trajectory parameters typically analyzed in legibility studies [21] those parameters affecting inter-robot relationships, as in Reference [41]. On the contrary, this article adopts a holistic view of multi-robot legibility, thus relating to the observation of the aggregate movement of the team.

2 Definition of Legibility in Multi-robot Systems

What does legibility mean for a multi-robot system? Can we define it in terms of spatial goal (where are the robots going) and coordination objective (what are the robots doing)? These can be useful when sharing the space with the user. Legible motion endows robots with the ability to communicate information directly, without the need of a specific user interface. While user interfaces continue to be the standard in many human-robot interaction scenarios, here we want to understand if it is possible to communicate implicitly this information. The prototype coordinated control algorithms that we choose to investigate for the definition of the coordination objectives are: consensus, formation, and coverage. We chose these controllers due to their ubiquitous use in multi-robot systems: as pointed out in a recent survey paper [13], these control strategies represent the fundamental building blocks that are generally exploited in the literature to address most of the multi-robot control problems.

Given that the legibility of multi-robot systems is still in its early stages, we need to understand what legibility means in this context. We consider legibility as the ability of a group of robots to communicate to the user its spatial goal and/or its coordination objective. Hence, we propose the following definition:

Definition 1.

[Legible multi-robot system] A multi-robot is legible if its spatial goal and coordination objective are understandable by a user by solely observing the motion pattern (i.e., without the use of explicit communication methods), where

—

The spatial goal represents the overall direction of motion of the multi-robot system;

—

The coordination objective represents how robots are moving with respect to each other and how they are aggregating.

3 Multi-robot Coordination Objectives

A large body of work on multi-robot systems and robotic swarms deals with how to coordinate large teams of robots to produce collective behaviors that go beyond the capabilities of the robots as individuals [3]. As such, the aim is to perform collective, emerging behaviors that arise from rules executed by individual robots acting on local information. The fulfillment of objectives is thus typically analyzed from the team-level behavior, rather than from the individual actions of the robots. For instance, robot team performance can be evaluated in terms of overall geometrical deployment of the robots, as discussed in Reference [68].

Taking the taxonomy outlined in Reference [13] as a reference, we analyze the legibility of standard coordinated control algorithms commonly used for achieving multi-robot coordination objectives. Specifically, we examine consensus, formation, and coverage algorithms. These algorithms are posed such that the group’s coordination objective is encoded through a cost that represents the performance of the team as a whole, and the individuals cooperate toward that objective by following a direction of the gradient of the cost.

In this section, we outline the control laws executed for each of the coordination objectives utilized in the user study. To this end, let us consider a team of N planar robots,¹ with the position of robot i being denoted as \(x_i\in \mathbb {R}^2\), \(i\in \mathcal {N}=\lbrace 1,\dots ,N\rbrace\). While many kinematic configurations can be considered for the individual platforms in robotic swarms (e.g., omnidirectional [35] or differential drive [48, 71, 72]), we here opt for omnidirectional robots to minimally distort the ideal collective behavior considered in the control laws. Consequently, the movement of the robots can be described through single integrator dynamics, i.e.,

\begin{equation} \dot{x} = u, \end{equation}

with \(x={[x_1^T,\dots ,x_N^T]}^T\in \mathbb {R}^{2N}\) being the aggregate state of the system, and \(u={[u_1^T,\dots ,u_N^T]}^T\in \mathbb {R}^{2N}\), its control. If a coordination objective is to be fulfilled, then the control input of each particular robot, \(u_i\), \(i\in \mathcal {N}\), will be influenced by the state and the actions of its neighbors. This flow of information among adjacent robots is typically represented through a graph \(\mathcal {G} =(\mathcal {V},\mathcal {E})\), where the vertex set, \(\mathcal {V} = \lbrace 1, \dots , N\rbrace\), represents the robots and the edge set, \(\mathcal {E}\in \mathcal {V}\times \mathcal {V}\), encodes the adjacency relationships, i.e., \((i,j)\in \mathcal {E}\) if robot i communicates with or can sense/be sensed by robot j. For the control laws considered in this article, the relationships between robots are bidirectional, namely, \((i,j)\in \mathcal {E}\) if \((j,i)\in \mathcal {E}\), that is, \(\mathcal {G}\) is undirected. In addition, they are subjected to change over time: the edge set \(\mathcal {E}\) need not be static. For notational convenience, the neighboring set of robot i is denoted as \(\mathcal {N}_i=\lbrace j\in \mathcal {N}~|~(i,j)\in \mathcal {E}\rbrace\).

The remainder of this section provides an outline of the different multi-robot coordination objectives analyzed in the user study in Section 4, i.e., consensus, formation and coverage. For each of them, we include a general description of the expected behavior as well as the laws to be executed by the robots. Additional details about implementation, including parameter choices, are reported in Appendix A for the purpose of reproducibility of results.

3.1 Consensus

The consensus protocol deals with the spatial aggregation of agents in a common point through local coordination of neighbors [1, 14, 43]. Assuming the graph describing inter-robot sensing, \(\mathcal {G}\), is connected², the team converges to a common location if each individual executes the control law

\begin{equation} \dot{x}_i = \kappa \sum _{j\in \mathcal {N}_i}(x_j-x_i), \quad \forall i \in \mathcal {N}, \end{equation}

(1)

where \(\kappa \in \mathbb {R_+}\) is a proportional gain.³ Executing the control law in Equation (1) implies that robot i will be moving in the direction that averages the relative positions of its neighbors, \(x_j-x_i\), \(\forall j\) adjacent to robot i. Asymptotically, this will make them converge to a point \(\bar{x} = x_1(\infty) = \dots = x_N(\infty)\), that is, the robots will reach consensus about where to meet, as long as the graph stays connected.

An instance of the consensus protocol in Equation (1) is shown in Figure 1, where a group of ideal robots, modeled as points, starts scattered over the domain and agree on a location where to meet. Note that, in the experiments included in Section 4, the robots have a physical footprint and, therefore, are unable to finish stacked on top of each other. As opposed to the ideal case presented in Figure 1, when executing the consensus protocol on a team of real robots, they will aggregate as much as possible without incurring inter-robot collisions, thanks to the implemented collision avoidance (see Appendix A.2).

Fig. 1.

3.2 Formation Control

A standard problem when controlling a multi-robot team is that of moving the individual robots so that they display a particular shape. This is typically achieved through formation control strategies, e.g., References [31, 54, 73], that specify how the individual robots should move to collectively display a shape.

The specification of the geometry to be realized by the team is key to the operation of a formation controller. To this end, we can specify a 2D formation \(\Delta\) through a set of relative, desired inter-agent distances,

\begin{equation} \Delta = \lbrace \delta _{ij} \in \mathbb {R}~|~\delta _{ij}\gt 0, i, j = 1, \dots , N, i\ne j \rbrace . \end{equation}

We assume here that \(\Delta\) is a feasible formation, that is, that the distances are not conflicting and, thus, the geometry can can be realized (see Reference [46] for details on feasible formations).

Having specified the formation, we can recover the structure of the consensus protocol in Equation (1) and make each robot move according to a weighted average of the relative positions of its neighbors,

\begin{equation} \dot{x}_i = \sum _{j\in \mathcal {N}_i}w_{ij}(\Vert x_i-x_j\Vert ,\delta _{ij})(x_j- x_i), \quad \forall i \in \mathcal {N}, \end{equation}

(2)

where the weight \(w_{ij}\) is positive if the neighboring robot is further than the desired distance, i.e., if \(\Vert x_i-x_j\Vert \gt \delta _{ij}\), and negative otherwise. For the experiments in Section 4, the weights are calculated as

\begin{equation} w_{ij}(\Vert x_i-x_j\Vert ,\delta _{ij}) = \frac{\Vert x_i-x_j\Vert -\delta _{ij}}{\Vert x_i-x_j\Vert }, \quad (i,j)\in \mathcal {E}. \end{equation}

Note that, to execute the control law in Equation (2), each robot needs to be cognizant of which robots conform its neighborhood, \(\mathcal {N}_i\), their relative positions, \(x_j-x_i\), and which distance is to be maintained with respect to each of them, \(\delta _{ij}\), as reflected in \(\Delta\). Therefore, under this protocol the robots are no longer anonymous: each robot needs to know the identities of other robots to discriminate which ones it needs to coordinate with.

Figure 2 shows the evolution of a team of 10 robots achieving a formation composed by two concentric pentagons, as specified by a formation \(\Delta\). The formation control protocol in Equation (2) ensures that the inter-robot distances are maintained but it does not influence the rotation of the formation with respect to the global frame.

Fig. 2.

3.3 Coverage Control

Coverage control deals with the problem of distributing robots over a domain \(D\subset \mathbb {R}^2\) such that the events of interest are optimally monitored by the team [15]. When covering an area, a standard strategy is to partition it so that each robot only takes responsibility over a portion. A typical way of dividing responsibilities is to make robot i be in charge of those points in the domain that are closer to it than to any other robot. This partitioning is known as a Voronoi tessellation [53],

\begin{equation} V_i(x) = \lbrace q\in D~|~\Vert q-x_i\Vert \le \Vert q-x_j\Vert , \quad \forall j\in \mathcal {N}\rbrace , \end{equation}

where \(V_i\) denotes the Voronoi cell assigned to robot i.

In general, this partition is not uniform across the domain in terms of area assigned to each robot. Indeed, the objective of the coverage problem is to monitor closely areas of higher importance by making the robots concentrate around them, which, in turn, results in differently sized Voronoi cells. The relative importance of a point in the domain typically reflects the probability of occurrence of an event or the concentration of a resource and is described through a spatial field, hereinafter referred to as density function, \(\phi\). An example of this density function is depicted in Figure 3, with yellow shades corresponding to higher values of \(\phi\).

Fig. 3.

When the density function is solely a function of the points in the domain and does not evolve over time, \(\phi :D\mapsto \mathbb {R_+}\), the overall coverage performance of the multi-robot team can be quantified by the locational cost defined in References [15, 22],

\begin{equation} \mathcal {H}(x) = \sum _{i=1}^N\int _{V_i(x)}\Vert x_i-q\Vert ^2\phi (q)dq, \end{equation}

(3)

with a lower cost corresponding to a better coverage of \(\phi\). The multi-robot system can minimize this cost following a direction of descent [15], which lends a continuous version of Lloyd’s algorithm [44],

\begin{equation} \dot{x}_i = \kappa (c_i(x) - x_i), \end{equation}

(4)

with \(\kappa \in \mathbb {R}_+\) a proportional gain, and \(c_i(x)\) the center of mass of the Voronoi cell of robot i,

\begin{equation} c_i(x) = \frac{\int _{V_i(x)}q\phi (q) dq}{\int _{V_i(x)}\phi (q) dq}. \end{equation}

(5)

3.3.1 Time-varying Coverage.

In some situations, the importance of the points in the domain may evolve over time. As a result, the coverage strategy outlined in Equations (3)–(5) needs to adapt to reflect possible dynamic changes to the density function. Considering a dynamic density, \(\phi :(q,t)\in D\times \mathbb {R}_+\mapsto \phi (q,t)\in \mathbb {R}_+\), the locational cost can be reformulated as

\begin{equation} \mathcal {H}(x,t) = \sum _{i=1}^N\int _{V_i(x)}\Vert x_i-q\Vert ^2\phi (q,t)dq. \end{equation}

(6)

The introduction of a dynamic density in the locational cost calls for a modification of the coverage controller. As investigated in Reference [40], one can minimize the locational cost in Equation (6) by setting

\begin{equation} \dot{x} = \left(I-\frac{\partial c}{\partial x}\right)^{-1}\left(\kappa (c(x,t)-x)+\frac{\partial c}{\partial t}\right), \end{equation}

where \(c = {[c_1^T, \dots , c_N^T]}^T \in \mathbb {R}^{2N}\) is the vector containing the centers of mass calculated as in Equation (5) but with respect to the time-varying density function, \(\kappa \in \mathbb {R}_+\) is a proportional gain, and \(I\in \mathbb {R}^{2N\times 2N}\) is the identity matrix.

4 User Study

In this section, we describe the user study that was conducted to investigate the legibility of multi-robot systems. Following the preceding studies in References [9, 10], this user study investigates how the different controllers described in Section 3, as well as how the number of robots executing them, affected the legibility of the system, both in terms of spatial goal and coordination objective. Hence, this study takes a step further in understanding the legibility of multi-robot systems by analyzing how distinguishable the different coordination objectives are for the human observer. To this end, users were asked to recognize the spatial goal of the multi-robot system and the different kinds of controllers, namely, we wanted to discover if the users can understand the coordination objective of the multi-robot system, in addition to the spatial goal. A representative trial of the experiment is shown in the video available at Reference [8].

4.1 Experimental Scenario

We carried out a user study in a virtual reality setup motivated by the possibility of having a high number of robots sharing their environment with the human user, thus replicating a scenario with proximal interaction. About the use of virtual reality as a means to retrieve data for a user-study, we think that the level of realism carried by modern virtual reality headsets is sufficient to acquire plausible data [67]. In addition, virtual reality allows for arbitrarily large environments and number of robots without the disturbances that could arise in real world experiments (e.g., delays, failures, or noisy communication), therefore mitigating exogenous factors that may influence the perception of the different multi-robot coordination objectives. An alternative choice to this setup would have been to show to the users the movement of the robots with a video [34, 65] or a sequence of photos [55], but we believe that the advantages introduced by virtual reality, such as the perception of proximal interaction, exceed the drawback of using an additional means for the experiments.

In particular, the virtual reality setup was made using Unity, a cross-platform game engine, and it was rendered with the Oculus Rift headset.⁴ The setting was a plain environment with ample space, approximately the size of a tennis court (\(22\;\mathrm{m} \times 11\;\mathrm{m}\)). Within the environment, we placed the spatial goals, which were cubes emitting light of different colors (Figure 4(a)). The simplicity of the environment was purposefully designed to avoid creating distractions for the users, allowing them to focus solely on the movement of the robots. We used 20 omnidirectional robots to make the multi-robot system. The robots, depicted in Figure 6, were taken from the standard Unity assets, and are characterized by a 0.1 \(\mathrm{m}\) radius.

Fig. 4.

Fig. 5.

Fig. 6.

4.2 Experimental Procedure

The experiment was divided in two parts: one related to the spatial goal of the multi-robot team and one to its coordination objective. The overall procedure is summarized in Figure 7, and described in details hereafter.

Fig. 7.

4.2.1 First Part.

Users were asked to recognize as fast as possible the goal toward which the robot team was moving. The answer was given by means of the Oculus Remote (see Figure 5) combined with the direction of the head of the user. Once the users had identified with a sufficient degree of confidence the probable goal, they had to point toward the chosen goal (in the field of view of the user, a point was shown that corresponded to the direction tracked by the system) and press the central button of the Oculus Remote.

4.2.2 Second Part.

Users had to recognize both the spatial goal and the coordination objective of the multi-robot system. The coordination objectives were introduced to the users via three representative examples shown in the video available online at Reference [6]. No information about the kind of controllers or other features of the coordination objectives were disclosed to the users, with the objective of not to insert any bias in the experiments. Furthermore, the examples were significantly different from the movements of the second part of the experiment. In fact, the number of robots did not correspond to neither used in the experiments nor the directions of the movements correspond to one of the goals. The examples consisted of a team of eight robots moving right to left in front of the user. After this introduction, the second part of the experiment started and the users were asked to recognize the spatial goal and the coordination objective of the group. They were informed that the amount of time was measured separately, namely, they had to give the answers (spatial goal and coordination objective) as soon as they understood one of the two. Each trial ended when the user gave both answers. The answer for the spatial goal was provided in the same way as in the first part of the experiment, instead the chosen coordination objective was specified through the interface in Figure 4(b). No information about the kind of controllers or other features of the coordination objectives were disclosed to the users, with the objective of not to insert any bias in the experiments. Furthermore, the examples were significantly different from the movements of the second part of the experiment. In fact, the number of robots did not correspond to neither used in the experiments nor the directions of the movements correspond to one of the goals. The examples consisted of a team of 8 robots moving right to left in front of the user. After this introduction, the second part of the experiment started and the users were asked to recognize the spatial goal and the coordination objective of the group. They were informed that the amount of time was measured separately, namely, they had to give the answers (spatial goal and coordination objective) as soon as they understood one of the two. Each trial ended when the user gave both answers. The answer for the spatial goal was provided in the same way as in the first part of the experiment, instead the chosen coordination objective was specified through the interface in Figure 4(b). No information about the kind of controllers or other features of the coordination objectives were disclosed to the users, with the objective of not to insert any bias in the experiments. Furthermore, the examples were significantly different from the movements of the second part of the experiment. In fact, the number of robots did not correspond to neither used in the experiments nor the directions of the movements correspond to one of the goals. The examples consisted of a team of eight robots moving right to left in front of the user. After this introduction, the second part of the experiment started and the users were asked to recognize the spatial goal and the coordination objective of the group. They were informed that the amount of time was measured separately, namely, they had to give the answers (spatial goal and coordination objective) as soon as they understood one of the two. Each trial ended when the user gave both answers. The answer for the spatial goal was provided in the same way as in the first part of the experiment, instead the chosen coordination objective was specified through the interface in Figure 4(b). No information about the type of controllers or other features related to the coordination objectives was disclosed to the users, with the objective of avoiding the introduction of any bias in the experiments. Furthermore, the examples were significantly different from the movements of the second part of the experiment. In fact, the number of robots did not correspond to neither used in the experiments nor the directions of the movements correspond to one of the goals. The examples consisted of a team of 8 robots moving right to left in front of the user. After this introduction, the second part of the experiment started and the users were asked to recognize the spatial goal and the coordination objective of the group. They were informed that the amount of time was measured separately, namely, they had to give the answers (spatial goal and coordination objective) as soon as they understood one of the two. Each trial ended when the user gave both answers. The answer for the spatial goal was provided in the same way as in the first part of the experiment, instead the chosen coordination objective was specified through the interface in Figure 4(b).

The experiment ended with a survey with general questions for the user, including demographic information and previous experiences with virtual reality, and some specific questions regarding the experiment itself. While these data do not influence the statistical analysis of the experiment, these questions clarify the perception of the experiments from the users. The questionnaire was based on a 5-point Likert scale.

4.3 Experimental Variables

The independent variables of this study are the coordination objectives of the multi-robot system and the number of robots in the group. In the first part, the coordination objectives were: consensus, formation, and coverage. In the case of coverage, both static and time-varying density functions were evaluated with the objective of analyzing whether the two corresponding controllers affect or not the legibility of the multi-robot system. However, in the second part, time-varying coverage was not utilized given that, visually, the robots behaved too similarly to the consensus protocol for this scenario, as seen in preliminary tests. In fact, in both cases the robots agglomerate at the beginning and then they move toward the goal. The only difference was in terms of dispersion in the environment: the time-varying coverage had a higher dispersion with respect to the consensus. Representative examples of trajectories utilized in the experiments are reported in the Appendix A.5.

For the coordination objectives in Section 3, each control law has certain parameters that define the behavior of the multi-robot system. These parameters are distinctive of the controllers and they should not be considered as independent variables. The choice of the parameters is detailed in Appendix A.4. Note that, differently from the previous works [9, 10], trajectory, which is the path followed by the group of robots during the movement, is not an independent variable in this study. In fact, based on the results of References [9, 10], the best trajectory to communicate the spatial goal turned out to be a straight line that connects the starting point of the robots to their goal. The implementation of these movements is described in Appendix A.1.

However, the number of robots we tested for were 5 and 15, and we chose these numbers so that they were distinguishable between themselves and also different from the number used in the previous works [9, 10], where the experiments were ran on teams of 20 robots, thus allowing us to evaluate the effect of the number of robots on the correct recognition of the coordination objectives.

In contrast, the dependent variables of the experiments are the correctness of the answers and the time to answer taken in both parts of the experiments. The term dependent variables refers to the quantities that we measure during the experiments and upon which we will carry out the statistical analysis. We consider these dependent variables, because they represent the effectiveness of a legible system, namely, we define the ability to communicate in Definition 1 in terms of providing the correct information and in a short time.

4.4 Factorial Design

To investigate the statistical correlation between the independent and the dependent variables, we built a factorial plan [49]. Given the independent variables of the two parts of the experiment, as reported in Section 4.3, we had \(4\times 2\) and \(3\times 2\) combinations in the two parts, respectively. In addition, following the design of experiment methodology [49], we took some precautions to reduce the disturbing factors that might influence the answer of the experiments. It is worth noting that both the environment and some precautions were inspired by our previous work [9, 10].

4.4.1 Point of View.

The position of the user into the environment was fixed throughout the experiments to avoid additional disturbing factors. Figure 8 reports the height of the camera view into the Unity scene, chosen to be approximately the standard height of a human being, simulating the condition of a proximal interaction in which the user is standing close to the robots. Furthermore, the motion of the user’s head was mapped into a coherent motion of the point of view, in such a way to reduce the motion sickness [33] that could incur when the movement of the head does not match with a concordant movement of what the user sees.

Fig. 8.

4.4.2 Position of the Goals.

In the experimental scenario, described in Section 4.1, five goals were equally distributed in front of the user, as shown in Figure 9. Users were explicitly informed of the fact that all the goals had equal probability of being visited. The two extreme goals, located at the far left and far right of the setup, were never used in the experiment to ensure that all the analyzed goals had two neighbors.

Fig. 9.

The goals were randomized and the same combination of goal and trial could not happen twice for a given user. Thus, in the second part of the experiment the goal used in the first part for a given combination of coordination objective and number of robots was excluded.

4.4.3 Learning Effect.

To reduce the effects of the learning effect that takes place as participants advance through the user study, we introduced some additional parts to the experimental procedure described in Section 4.2. In particular, the experiments started with a small tutorial to familiarize the user with the input method. Then a practice scenario allowed the users to test the input method and to get used to the surrounding environment. In the second part of the experiment, the users were able to see again the demonstrative examples as many times as they wanted. In addition, the order of the trials of the factorial plan were randomized.

4.4.4 Movement of the Robots.

The controllers of the robots, described in Section 3, were implemented in MATLAB, and the trajectories of all the robots were saved offline to be replicated exactly in the Unity environment during the experiments. This affords high repeatability during the experiments, which results in a reduction of the disturbing factor that may be introduced from random modification of the trajectories in a real-time implementation.

All the trajectories were normalized in terms of total duration, which was fixed to 60 s: Namely, the group of robots reached the spatial goal in the given time. We experimentally found out that this is a reasonable time (given also the previous experiences [9, 10]). The initial positions of the robots were equal for all the trials and were chosen to reach the specified formation as quickly as possible. This was done to balance out the formation objective with the consensus and coverage objectives in terms of time required for the multi-robot team to achieve a distinguishable behavior, since we discovered that, for the set of parameters chosen in the experiments (see Appendix A.4), the formation was the slowest to converge to a recognizable pattern.

4.5 Users

We performed the experiments on a group of 21 users (age 25.76 \(\pm\) 5.37 y.o., 2 females and 19 males). All the participants were students or researchers within the engineering department, but none of them were involved in the project nor they partook in the building of the experiment. Regarding their robotics background, six users had a multi-robot background and five did not have any robotic knowledge, namely, they were completely new to the subject. All the users signed a consent form and they were instructed on the experimental procedure as described in Section 4.2. We conducted a within-subjects study (see Reference [57] for details) and, according to this methodology, each user tests all the possible combinations of the independent variables in both parts of the experiment. This choice allowed us to obtain a sufficient amount of data from a limited number of participants. The main drawback of this method may be the learning effect or the fatigue, which we addressed with the precautions described in Section 4.4.

Last, the possible disorientation that may arise through the use of a virtual reality headset was counterbalanced by the fact that the majority of the users claimed to have tried at least once virtual reality (10 users had used it once or twice, 7 more than twice, and 4 users had never used it before).

4.6 Statistical Tools

To process the data obtained from the experiments, we utilized standard statistical tools [49], which will be detailed hereafter.

Usually, the first step in a statistical analysis is to check the significance of the independent variables over the dependent variables. The null hypothesis is that there is no statistical relevance, which can be rejected or supported based on the calculation of the p-value. Considering a significance level of 0.05 implies that if the p-value is smaller than 0.05, then we can reject the null hypothesis with a \(5\%\) of risk of asserting a wrong assumption. To calculate the p-value, two different approaches are used depending on the kind of dependent variable: categorical, for variables like blood type or binary answers, or continuous, for variables such as time or temperature. In particular, for the analysis of our user study, we have one categorical variable (correctness of the answer given by the users) and one continuous variable (time taken to give the answer).

For categorical variables, the significance tests build upon a contingency table, which is a table where the rows are the levels of the independent variable and the columns are the categories of the dependent variable. The cells are filled based on the result of each trial (i.e., based on the value of the independent variable and of the value of the dependent variable, we add an entry to the corresponding cell). Consequently, each cell has a number that corresponds to the frequency of that given combination of independent and dependent variable. Once the contingency table is built, a significance test can be carried out, with the Pearson’s Chi-Square test being a typical choice to determine the significance. However, Fisher’s exact test is used instead when one frequency of the contingency table is less than five. Both tests measure the independence between the column and row variables of the given contingency table.

For continuous variables the standard tool is ANOVA (Analysis of variance), which explains whether the independent variables are statistically relevant with respect to the dependent variable. Additionally, a post-hoc test, such as the Tukey-Kramer multiple comparison procedure, can be carried out. This test brings out the differences among the groups (i.e., the combination of the levels of the independent variables) and it can highlight if the statistical relevance is caused by a particular group or not.

5 Results

This section outlines the statistical analysis and discussion of the data acquired during the experiments. The objective is to understand whether the multi-robot system investigated in the experiments is legible, in terms of Definition 1, by analyzing the relationship between the independent variables and legibility.

5.1 Spatial Goal

Along the lines of Definition 1, the spatial goal of the multi-robot system is the first information that the system should be able to communicate to be legible. To study the statistical relevance of the independent variables over the ability of the multi-robot system to communicate its spatial goal, we identify the following null hypotheses:

H1—Null hypothesis: There is no statistical relationship between the independent variables and the legibility of the system, in terms of correctness for the understanding of the spatial goal.

H2—Null hypothesis: There is no statistical relationship between the independent variables and the legibility of the system, in terms of response time for the understanding of the spatial goal.

For analyzing the correctness of the goal, we built the corresponding contingency tables, Tables 1 and 2. Since correctness is a categorical variable, we need two separate tables, one for each independent variable. Given that the results are not suitable for Pearson’s Chi-Square test (see Section 4.6 for details), we instead carried out a Fisher’s exact test. The p-value for both results was high (see Table 3), as expected from the contingency tables. In fact, most of the answers were correct, independently from the coordination objective or the number of robots (no evident differences among the rows of the contingency tables considering the same column). Actually, the percentage of success 98.81\(\%\) (166 right answers over 168 total) would bring to a lack of relevance of any independent variable. This result is consistent with the outcome of the previous works [9, 10], because it seems that the main parameter that affects the communication of the spatial goal is the trajectory of the multi-robot system, and in this experiment all the trials had a straight trajectory toward their goal, independently from the coordination objectives or the number of robots. As the Fisher’s exact tests have not highlighted any statistical relevance, we did not further investigate this relationship, and we cannot reject H1. This result is also confirmed by the survey, in which the answer to the question Do you think you have correctly understood the spatial goal of the robot? was 4.238 \(\pm\) 0.625.

Table 1.

Table 2.

Table 3.

Independent variable	p-value
Coordination objective	0.6856
Number of robots	0.4970

Table 3. p-Values from the Fisher’s Exact Test on Correctness

As regards the response time, given its continuous nature, we used ANOVA to discover if the variables chosen are statistically relevant. We discarded the trials with a wrong answer, because a penalty function would have to be considered, which can interfere with the data. Table 4 reports the p-values for the independent variables considered in the ANOVA, from which we can deduce that coordination objective and its interaction with the number of robots are relevant (in the tables the relevant variables are highlighted in grey). Instead, the number of robots is not relevant. However it is worth noting that the Tukey-Kramer post-hoc test highlighted that these results were mostly influenced by the trial with formation and five robots, which turned out to be the hardest to understand. This is reported in Figure 10. As a result, we reject H2 for the coordination objective and its interaction with the number of robots.

Table 4.

Independent variable	p-value
Coordination objective	0.0309
Number of robots	0.0811
Coordination objective-Number of robots	\(\lt\)0.0001

Table 4. p-Values from the ANOVA on Time Response

Fig. 10.

5.1.1 Influence of Coordination Objective on Spatial Goal Identification.

The results of Section 5.1 about H1 were verified also with the data of the second part of the experiment, even if one of the coordination objectives was not considered (time-varying coverage). The percentage of success was 97.6\(\%\) (123 correct answers in understanding the spatial goal over 126 total), and all the tests for the response time achieved the same results. To verify that the users were not influenced by the fact of being asked to answer to two queries simultaneously, we carried out an additional ANOVA adding the part of the experiment as independent variable. In this analysis, we neglected the trials with the time-varying coverage of the first part to avoid an unbalanced study. The null hypothesis for this study is:

H3—Null hypothesis: There is no statistical relationship between the two parts of the experiment and the legibility of the system, in terms of response time for the understanding of the spatial goal, namely, the addition of the second answer did not affect the legibility.

This additional analysis confirms H3 (p-value 0.468 in the ANOVA). Hence, in the second part, the users were not influenced by the additional question on the coordination objective. The survey confirms also this aspect: the users answered 1.714 \(\pm\) 0.956 to the question In the second part, do you think that choosing both the goal and the coordination objective distracted you?

5.1.2 Coverage Control.

In the first part of the experiment, we implemented both the standard and the time-varying version of coverage control. To understand if there is some difference between the two versions, we evaluate the following hypothesis:

H4—Null hypothesis: There is no statistical relationship between the type of coverage control and the legibility of the system, in terms of response time for the understanding of the spatial goal.

To investigate H4, we carried out ANOVA only between the two types of coverage to understand whether the time-varying aspect can affect the legibility. From the ANOVA, no statistical relevance emerged (p-value 0.0874), but considering the mean (standard coverage 13.98 \(\pm\) 6.65 \(\mathrm{s}\), time-varying coverage 16.69 \(\pm\) 5.9 \(\mathrm{s}\)), we can assert that using the standard coverage for the second part is legitimate, since standard coverage is slightly more legible. In fact, the time-varying coverage looks similar to the consensus, because the robots converge to a point and then they move toward a goal, i.e., the density function moves toward the goal and, as a consequence, the robots. Therefore, we cannot reject H4.

5.2 Coordination Objective

For the second ability of Definition 1, we study the ability of the multi-robot system to communicate its coordination objective. The null hypotheses are:

H5—Null hypothesis: There is no statistical relationship between the independent variables and the legibility of the system, in terms of correctness for the understanding of the coordination objective.

H6—Null hypothesis: There is no statistical relationship between the independent variables and the legibility of the system, in terms of response time for the understanding of the coordination objective.

For the analysis of the correctness of the recognition of the coordination objective in the second part of the experiment, we proceeded as for the correctness of the goal. Tables 5 and 6 report the contingency tables for the coordination objective and for the number of robots, respectively. Given that the entries are sufficiently numerous, we carried out the Pearson’s Chi-Square test, and we discovered that both independent variables are significant (Table 7 reports the p-values). Hence, we can reject H5: specifically, this result shows that both the coordination objective and the different number of robots influence the ability of the user in understanding the coordination objective. It is worth noting that even the users perceived a good confidence in recognizing the coordination objective: the users answered 3.667 \(\pm\) 0.796 to the question Were you able to distinguish among the coordination objectives?. In addition, the real percentage of correctness of this part was \(68\%\) (86 correct answers over 126 total), which means that the users were able to recognize the coordination objectives.

Table 5.

Table 6.

Table 7.

Independent variable	p-value
Coordination objective	\(\lt\)0.0001
Number of robots	0.0074

Table 7. p-Value From the Pearson’s Chi-square on Correctness for the Coordination Objective Recognition

We removed the wrong answers to proceed with the analysis of the response time, as we did for testing H2 in Section 5.1. We checked the balance of the remaining data, namely, if some combination had too many or too few entries. Table 8 reports the p-values calculated with ANOVA, which allows us to reject H6. However, as already pointed out for the response time for the goal recognition in Section 5.1, the post-hoc test reported that the trial with formation and five robots is the one that differs most from the others, namely, it was the most difficult to recognize.

Table 8.

Independent variable	p-value
Coordination objective	\(\lt\)0.0001
Number of robots	0.0154
Coordination objective-Number of robots	0.0045

Table 8. p-Value from ANOVA on Response Time for the Coordination Objective Recognition

About the survey of this last part, we report that consensus was considered as the easiest coordination objective to recognize (\(100\%\) of the users chose consensus in the question Which coordination objective was the easiest to recognize?). After the experiment, many users asserted that consensus was the simplest, because they associated the small occupied area to that coordination objective, and, on the other side, formation and coverage occupied a similar area. Instead, coverage had been identified as the hardest to recognize (\(71.4\%\) of the users answered coverage to the question Which coordination objective was the hardest to recognize? and the remaining answered formation, confirming even more the preference on consensus).

Formation might seem as the easiest coordination objective to recognize, but it turned out to be the most difficult in this user study. Based on comments received by the users after performing the experiments, we inferred that this is due probably to the time needed to create the formation and the high dispersion of the robots when they perform this coordination objective. In addition the point of view is very different from the point of view with which formation control is usually presented, namely, top-view (see Figure 11). In fact, with the position of the camera inside the virtual environment (as reported in Figure 8) and sharing the environment it is difficult to recognize any shape. Differently, consensus was easy to recognize, because the spatial occupancy is easy to identify.

Fig. 11.

6 Implications and Limitations

The results collected from the user study offer significant insights into the legibility of multi-robot systems, as will be discussed in detail in Section 7. However, the main assumptions upon which the user study was design lead to a number of limitations, detailed hereafter.

6.1 Legibility Definition

One of the key contributions of this article is in the formal definition of legibility for a multi-robot system. Such definition is then exploited to derive the metric for the evaluation of the user study results. While this definition is given in a generic manner, suitable for diversified case studies, some application scenarios may not be well represented. For instance, the spatial goal is defined as a single constant point in space, while some application scenarios may require, for instance, periodic trajectories. Regarding the coordination objectives, we considered their definition in terms of coordinated control strategies, while other approaches (e.g., planning methodologies [64, 74]) might be also be interesting to investigate.

6.2 Virtual Reality Simulations and Lack of Realism

The experiments were performed in virtual reality only. This allowed us to precisely control the experimental conditions, reducing (as much as possible) the presence of external, uncontrolled, confounding factors. At the same time, the lack of realism of the virtual reality simulation environment reduces the generality of the results, which could be enriched considering the disturbances and variability that would arise from real-world experiments.

As a design choice, the user was not allowed to move within the virtual reality environment. Considering the user’s motion would be another possible confounding factor that could lead to interesting insights. In fact, variation in the viewpoint of the user is known to be a relevant factor in the context of communication between humans and robots [2, 26].

6.3 Design Choices

The collected results are dependent on the design of the spatial targets, as well as on the design of the coordination objectives. In particular, while the selected coordination objectives are recognized, in the literature, as prototypical examples, different coordinated control strategies might be considered as well: in this case, one could expect to obtain additional insights, from a user study. At the same time, the specific parameter design of each coordination objective (e.g., the shape of the formation, or the number of robots) might also influence the results. Investigating the effect of the choice of parameters for each control action on the legibility would represent an interesting line for future research.

6.4 Reduced Diversity in the User Study Participants

The user study was conducted recruiting students and researchers within the engineering department: as such, diversity in terms of gender and background knowledge was not guaranteed. Considering these additional dimensions for future user studies would increase the generality of the results.

7 Discussion and Conclusions

In this article, we investigated the legibility of multi-robot systems. We extended the definition of legibility for this kind of systems, incorporating both the ability of communicating a spatial goal and a coordination objective. Then, we carried out a user study to answer to the question “Can a multi-robot system be legible, considering Definition 1?” In the experiment, we investigated both the legibility with respect to the spatial goals and the coordination objectives. The user study was conducted in a virtual reality setup, and the acquired data were analyzed with standard statistical tools.

About the ability of communicating the spatial goal, i.e., the first part of Definition 1, we can merge the information gathered in this article and in the previous works [9, 10]. From the results of the performed user studies, we can infer that, at least for simple tasks (as the ones considered in this article, where spatial goals are sufficiently diversified), an overall straight trajectory is preferable, together with a limited inter-robot distance.

Large inter-robot distances could be due by high dispersion caused by strong artificial potential fields, or by a big variance of the Gaussian probability density function of the coverage control, or again by the initial procedure to create the shape in formation control. The performed user studies allow us to infer the conclusion that (at least for the conditions considered in this article) a high level of cohesiveness of the group along the path toward the goal helps in making the overall motion more legible for the human observer.

Regarding the communication of a coordination objective, i.e., the second part of Definition 1, it is important to highlight the characteristics of the desired behavior. In fact, for consensus its inherent characteristic of agglomeration was easily identifiable. Thus, consensus was the easiest coordination objective to recognize. Instead, in the case of formation control, the major feature is the geometry of the formation. If this geometry does not emerge quickly enough, then the identification of the coordination objective is significantly hindered. To improve the legibility of this coordination objective, we would probably need to create an understandable formation even with the different point of view typical of proximal interaction. Further investigations are needed to understand the relationship between legibility and the shape of the formation. In addition, working toward the legibility of multi-robot systems executing complex behaviors will imply balancing the legibility aspect with the performance of the overall system.

From the results of the user study, we can establish that multi-robot systems are legible according to Definition 1. In fact, multi-robot systems can communicate effectively their spatial goal and their coordination objective. While these results are satisfactory, we are still witnessing the dawn of the study of legibility of multi-robot systems and a lot of work has to be done to improve the methods to communicate and to merge legible motion with other tasks of multi-robot systems.

Regarding future directions, we plan to perform a comparative study to assess, in a quantitative manner, what coordination objectives are more easily confused with what other ones, to provide better guidelines in the choice of parameters for the control laws. Furthermore, we plan to explore if explicit communication outperforms implicit communication, and by how much. At the moment, we cannot state that legible motion is better nor worst than explicit communication both in terms of efficiency or perceived integration by the user. However, we think that implicit communication should be integrated with standard communication means, as it represents an essential part of everyday communication among human beings. In this context, this article advances the understanding of legibility for multi-robot systems, pushing toward the integration of these systems with humans.

We remark that the environment where the robot motion took place was intentionally left as simple as possible, to focus the study on the motion characteristics only. As a future line of research, we plan to study the interplay between the motion patterns and the environment design, in terms of legibility: As suggested in Reference [37], in fact, the design of the environment plays a crucial role in this regard.

Another interesting line of future work is represented by multi-robot systems interacting with multiple users. While works can be found in the literature that consider these aspects for the single robot case (see, e.g., Reference [23]), we plan to extend this context to a multi-robot scenario.

Acknowledgment

Authors thank Noemi Canovi for her help in the user study.

Footnotes

Note that, while the control laws outlined in this section are generalizable to an n-dimensional space, i.e., \(x_i\in \mathbb {R}^n\), \(i\in \lbrace 1, \dots , N\rbrace\); we present them here with \(x_i\in \mathbb {R}^2\) to relate to the planar robots considered in this study.

A graph \(\mathcal {G}\) is connected if, for every pair of vertices in \(\mathcal {V}(\mathcal {G})\), there is a path that has them as its end vertices.

With \(\mathbb {R_+}\), we denote the set of strictly positive real numbers.

⁴

The developed virtual reality setup is freely available online (executable file and source code) [7].

A Experimental Implementation

This article evaluates the legibility of multi-robot systems via a user study where participants identify key aspects of the system’s behavior. To this end, as described in Section 4, a series of virtual reality scenarios are shown to the participants for them to identify where the multi-robot team is headed (spatial objective) and how the team is self-organized (coordination objective). This appendix expands on the control laws outlined in Section 3, providing a detailed explanation of the controller implementation displayed in the user study.

A.1 Spatial Drive of the Multi-robot Team

A core legibility aspect studied in this article relates to the inference of the spatial goal of the multi-robot system, i.e., where is the team headed toward within the domain. In the consensus or the formation control protocols, one way of displacing the multi-robot team throughout the domain is to independently command one of the robots in the team, a leader, to move toward a certain location. As a consequence of the inter-agent interactions encoded in the control law Equations (1) and (2), the movement of this leader robot influences the positioning of the whole team in the domain. If we denote as \(x_L \in \mathbb {R}^2\) the position of the leader robot, then we can make the consensus and formation coordination objectives move toward a spatial goal by executing the following go-to-goal controller,

\begin{equation} \dot{x}_L = \kappa _{goal} (x_{goal} - x_L), \end{equation}

where \(x_{goal} \in \mathbb {R}^2\) denotes the desired location of the team, and \(\kappa _{goal} \in \mathbb {R}_+\) is a proportional gain, respectively. With the purpose of minimally disturbing the motions corresponding to the multi-robot coordination objectives, we discard selecting the leader among of the N robots displayed in the experiment and, rather, pick the leader robot to be virtual.

Coverage, however, naturally affords the spatial allocation of the multi-robot team within the domain. Under the protocol presented in Equations (3)–(5), the robots are displaced toward the spatial goals by specifying the density function, \(\phi\), in Equation (3) to be maximal around the goal and decrease as the distance to the goal increments. To this end, we choose the density functions to be bivariate unimodal Gaussians of the form

\begin{equation} \phi (q) = \frac{k}{\sigma ^2} \exp \left(-\frac{\Vert q-\mu \Vert ^2}{2\sigma ^2}\right), \end{equation}

(7)

where \(q \in D\) is a point in the domain, \(k \in \mathbb {R}_+\) is a proportional constant, \(\sigma \in \mathbb {R}_+\) is the variance, and \(\mu \in \mathbb {R}^2\) is the mean of the function. By making the mean, \(\mu\), coincide with the spatial goal, \(x_{goal}\), the team moves toward the objective location when executing the control law in Equation (4). The parameters chosen for the density function are specified in Appendix A.4.

In the case of the time-varying coverage in Section 3.3.1, we consider a density function \(\phi (x,t)\) of the form in Equation (7), where the dependency on t stems from considering a time-varying mean, \(\mu (t)\). The evolution of the mean serves the purpose of guiding the multi-robot team from its initial location toward the spatial goal. Consequently, \(\mu\) coincides with the mean position of the robots at the start and moves on a straight line toward the spatial goal as the experiment progresses according to

\begin{equation} \mu (t) = \bar{x}(t_0) + \frac{x_{goal}-\bar{x}(t_0)}{t_F-t_0}(t-t_0), \end{equation}

where \(\bar{x} \in \mathbb {R}^2\) denotes the mean position of the robots, \(\bar{x}(t) = \sum _{i\in \mathcal {N}}x_i(t)/N\), and \(t_0\) and \(t_F\) the initial and final time, respectively. For the user study, \(t_0 = 0\)s and \(t_F=60\text{ s}\).

A.2 Collision Avoidance

The controllers presented in Section 3 model robots as points that move in the space according to single integrator dynamics. However, in reality, robots have a finite footprint and mass, and collisions must be avoided in general. We used Control Barrier Functions (CBFs) as in Reference [70] to prevent collisions among the robots. This choice was motivated by the minimal perturbation that the CBFs introduce on the coordinated behavior of the robots as opposed to other methods such as repulsive potential fields, which can introduce inter-robots connections that modify the overall behavior of the system (as seen in References [9, 10]) and potentially hinder the legibility of the coordinated behavior.

Let \(d_{min}\in \mathbb {R}_+\) be the minimum distance between any pair of robots with positions \(x_i\) and \(x_j\), we can define the following CBF to ensure pairwise collision-avoidance,

\begin{equation} h_{ij}(x_i,x_j) = \Vert x_i - x_j \Vert ^2 - d_{min}^2. \end{equation}

A safe velocity \(u^* \in \mathbb {R}^{2N}\) for the multi-robot team can be found solving the following quadratic program (see Reference [72] for further details),

\begin{align*} u^*(x) &= \mathop{\arg\min}_{u\in \mathbb {R}^{2N}}\Vert u-u^{nom}\Vert ^2\\ &\text{s.t. } \frac{\partial h_{ij}(x_i,x_j)}{\partial x_i} u \ge -\gamma h_{ij}(x_i,x_j)^3, \\ \forall i &\, \in \lbrace 1,\dots ,N-1\rbrace , j\in \lbrace i+1,\dots ,N\rbrace , \end{align*}

where \(u^{nom} \in \mathbb {R}^2\) is the nominal input obtained from the controllers in Section 3 and \(\gamma \in \mathbb {R}_+\). The controller \(u^*\) avoids collisions while remaining as close as possible to the nominal controller.

A.3 Formation Control

To execute the formation controller in Equation (2), we need a formation \(\Delta\) specifying the desired pairwise distances between robots. As is well known, regular polygonal shapes are easy to recognize, for users [17, 24]. Hence, to make the result of the formation control clearly distinguishable from the other coordination objectives, for the user study in Section 4, we choose formations where the robots are arranged in concentric regular polygons. In particular, for \(N=5\), we choose a pentagon and for \(N=15\), two concentric polygons (one heptagon and one octagon). For the achieved formation see Figures 12(a) and 12(b), respectively.

Fig. 12.

The formations are generated as follows: let \(d_{g}\) be the distance between goals (they are set equally spaced out), then the formation is circumscribed by a circle of radius \(R_{out} = d_{g}/3\). For the case of \(N=15\), the inner polygon is concentric to the outer octagon and circumscribed by a polygon of radius \(R_{in} = R_{out}/4\). With the geometries defined, the formation \(\Delta\) is obtained by placing the robots on the vertices of the polygons. The pairwise distances \(\delta _{ij}\) for a robot i will correspond to the Euclidean distance between its corresponding vertex and the rest of the robots in the polygon. An example of how these distances are generated is shown in Figure 12(a).

Fig. 13.

Fig. 14.

Fig. 15.

Fig. 16.

A.4 Choice of Parameters

The parameters used for each coordination objective are reported in Table 9. The saturation on the velocity, \(\dot{x}\) saturation, was chosen to produce a realistic behavior of the robots, which have a maximum speed at which they can move. The coverage parameter \(\sigma\) was chosen to produce a similar spatial occupancy, in terms of area occupied by the convex hull of the positions of the robots, as the formation control (see Section A.3 for further details). However, consensus has a smaller dispersion given its inherent operation and the agglomeration of the robots was only affected by the safety distance, \(d_{min}\). The parameters corresponding to the proportional gain \(\kappa\), which properly scales the input, and \(\kappa _{goal}\) were set to make the team achieve the spatial goal within the specified time frame of 60 s.

Table 9.

Parameter	Consensus	Formation	Coverage
\(\kappa\)	1	1.5	1
\(\kappa _{goal}\)	\(1.5\kappa N\)	\(0.5*\kappa\)	–
\(\dot{x}\) saturation \([ \mathrm{m}/ \mathrm{s}]\)	0.4	0.4	0.4
\(d_{min}\) \([\mathrm{m}]\)	0.1	0.1	0.1
\(\sigma\)	–	–	1

Table 9. Parameters for the Coordination Strategies

A.5 Examples of trajectories

The following figures depict examples of trajectories, represented from top-down, followed by the robots during the experiments. In the figures, red asterisks represent the initial position of the robots, green circles represent intermediate positions, and blue triangles represent the final positions.

A.6 Quantitative Results

We report hereafter the quantitative results recorded in the user studies: for both parts of the experiments, mean and standard deviation of the response time are reported.

The three spatial goals are identified, respectively, as G0, G1, G2.

Each coordination behavior was executed with 5 and 15 robots, and the respective combinations are identified as follows:

—

COVtv5: Time-varying coverage control, 5 robots (executed in the first part only);

—

COVtv15: Time-varying coverage control, 15 robots (executed in the first part only);

—

COV5: Coverage control, 5 robots;

—

COV15: Coverage control, 15 robots;

—

CON5: Consensus, 5 robots;

—

CON15: Consensus, 15 robots;

—

F5: Formation control, 5 robots;

—

F15: Formation control, 15 robots.

We remind the readers that, in the second part of the experiment, the time-varying coverage was not utilized.

Tables 10, 11, and 12 collect the values (mean and standard deviation) of the time taken to provide the answer in the first part of the experiments, considering, respectively, only the spatial goal, only the coordination objective and the number of robots, and both aspects together.

Table 10.

Goal	Mean and std
G0	16.5211 \(\pm\) 6.8270 \(\mathrm{s}\)
G1	13.5277 \(\pm\) 5.4885 \(\mathrm{s}\)
G2	17.9285 \(\pm\) 7.0081 \(\mathrm{s}\)

Table 10. First Part of the Experiment: Time to Provide the Answer, Based on the Spatial Goal

Table 11.

Combination	Mean and std
COVtv5	15.7137 \(\pm\) 5.5406 \(\mathrm{s}\)
COVtv15	16.8091 \(\pm\) 6.3161 \(\mathrm{s}\)
COV5	11.7760 \(\pm\) 6.1830 \(\mathrm{s}\)
COV15	15.8666 \(\pm\) 6.6326 \(\mathrm{s}\)
CON5	13.5878 \(\pm\) 4.9804 \(\mathrm{s}\)
CON15	14.9078 \(\pm\) 3.9684 \(\mathrm{s}\)
F5	25.9041 \(\pm\) 5.3961 \(\mathrm{s}\)
F15	13.0891 \(\pm\) 3.4690 \(\mathrm{s}\)

Table 11. First Part of the Experiment: Time to Provide the Answer, Based on the Coordination Objective and Number or Robots

Table 12.

Goal	Coordination objective and number of robots
	COVtv5	COVtv15	COV5	COV15	CON5	CON15	F5	F15
G0	15.4466 \(\pm\) 5.5441	17.6065 \(\pm\) 6.2949	12.3671 \(\pm\) 7.3302	17.2483 \(\pm\) 10.3072	16.2340 \(\pm\) 4.6033	18.9170 \(\pm\) 0.9037	31.4565 \(\pm\) 1.0373	—
G1	—	11.3840 \(\pm\) 7.7075	11.7355 \(\pm\) 5.7812	13.8306 \(\pm\) 5.7178	11.5966 \(\pm\) 4.0080	13.9054 \(\pm\) 3.9488	22.4144 \(\pm\) 0.8037	—
G2	21.0560 \(\pm\) 0	15.4565 \(\pm\) 5.5784	7.5330 \(\pm\) 0	21.9565 \(\pm\) 3.1741	18.1046 \(\pm\) 4.8619	15.2839 \(\pm\) 4.0334	26.3573 \(\pm\) 5.9049	13.0891 \(\pm\) 3.4690

Table 12. First Part of the Experiment: Time to Provide the Answer, Considering the Spatial Goal, Combined with Coordination Objective and Number of Robots (Mean and Standard Deviation)

Tables 13, 14, and 15 collect the values (mean and standard deviation) of the time taken to provide the answer on the spatial goal, in the second part of the experiments, considering, respectively, only the spatial goal, only the coordination objective and the number of robots, and both aspects together.

Table 13.

Goal	Mean and std
G0	16.1587 \(\pm\) 6.7637 \(\mathrm{s}\)
G1	13.2096 \(\pm\) 5.6655 \(\mathrm{s}\)
G2	16.8527 \(\pm\) 6.7756 \(\mathrm{s}\)

Table 13. Second Part of the Experiment: Time to Provide the Answer on the Spatial Goal, Based on the Spatial Goal

Table 14.

Combination	Mean and std
COV5	12.5859 \(\pm\) 5.8023 \(\mathrm{s}\)
COV15	15.9054 \(\pm\) 6.2474 \(\mathrm{s}\)
CON5	12.1532 \(\pm\) 3.6131 \(\mathrm{s}\)
CON15	12.7797 \(\pm\) 4.0241 \(\mathrm{s}\)
F5	24.7832 \(\pm\) 5.1234 \(\mathrm{s}\)
F15	13.2513 \(\pm\) 3.7694 \(\mathrm{s}\)

Table 14. Second Part of the Experiment: Time to Provide the Answer on the Spatial Goal, Based on the Coordination Objective and Number or Robots

Table 15.

Goal	Coordination objective and number of robots
	COV5	COV15	CON5	CON15	F5	F15
G0	14.3789 \(\pm\) 5.1508	16.2796 \(\pm\) 6.8830	12.5622 \(\pm\) 3.3390	13.5449 \(\pm\) 3.6765	29.2770 \(\pm\) 2.9017	17.8680 \(\pm\) 4.5600
G1	8.0634 \(\pm\) 3.1209	12.4337 \(\pm\) 2.1464	9.3547 \(\pm\) 2.1777	9.8978 \(\pm\) 1.5452	22.9968 \(\pm\) 1.6696	12.1651 \(\pm\) 2.6980
G2	16.5637 \(\pm\) 7.8605	17.1992 \(\pm\) 6.0746	13.6225 \(\pm\) 4.8630	13.8156 \(\pm\) 4.8536	22.9692 \(\pm\) 7.6283	–

Table 15. Second Part of the Experiment: Time to Provide the Answer on the Spatial Goal, Considering the Spatial Goal, Combined with Coordination Objective and Number of Robots (Mean and Standard Deviation)

Tables 16, 17, and 18 collect the values (mean and standard deviation) of the time taken to provide the answer on the coordination objective, in the second part of the experiments, considering, respectively, only the spatial goal, only the coordination objective and the number of robots, and both aspects together.

Table 16.

Goal	Mean and std
G0	12.3250 \(\pm\) 8.1746 \(\mathrm{s}\)
G1	11.5157 \(\pm\) 5.8187 \(\mathrm{s}\)
G2	12.8764 \(\pm\) 7.7876 \(\mathrm{s}\)

Table 16. Second Part of the Experiment: Time to Provide the Answer on the Coordination Objective, Based on the Spatial Goal

Table 17.

Combination	Mean and std
COV5	11.0853 \(\pm\) 5.2705 \(\mathrm{s}\)
COV15	13.9896 \(\pm\) 5.7466 \(\mathrm{s}\)
CON5	8.2434 \(\pm\) 5.8996 \(\mathrm{s}\)
CON15	7.3398 \(\pm\) 4.9104 \(\mathrm{s}\)
F5	19.8253 \(\pm\) 9.6131 \(\mathrm{s}\)
F15	12.3888 \(\pm\) 3.6836 \(\mathrm{s}\)

Table 17. Second Part of the Experiment: Time to Provide the Answer on the Coordination Objective, Based on the Coordination Objective and Number or Robots

Table 18.

Goal	Coordination objective and number of robots
	COV5	COV15	CON5	CON15	F5	F15
G0	12.7619 \(\pm\) 6.3283	13.4790 \(\pm\) 6.5305	6.6565 \(\pm\) 4.9344	8.2044 \(\pm\) 6.4868	26.3880 \(\pm\) 7.6682	12.6540 \(\pm\) 5.1092
G1	8.4301 \(\pm\) 2.7616	14.6710 \(\pm\) 1.7589	7.4522 \(\pm\) 3.5358	6.6712 \(\pm\) 5.0501	15.8297 \(\pm\) 8.9930	12.3264 \(\pm\) 3.4705
G2	11.1333 \(\pm\) 3.8120	15.2655 \(\pm\) 5.4251	14.1920 \(\pm\) 7.9505	6.8931 \(\pm\) 3.2921	19.2560 \(\pm\) 10.0318	–

Table 18. Second Part of the Experiment: Time to Provide the Answer on the Coordination Objective, Considering the Spatial Goal, Combined with Coordination Objective and Number of Robots (Mean and Standard Deviation)

References

[1]

H. Ando, Y. Oasa, I. Suzuki, and M. Yamashita. 1999. Distributed memoryless point convergence algorithm for mobile robots with limited visibility. IEEE Trans. Robot. Autom. 15, 5 (1999), 818–828. DOI:

Abstract

1 Introduction

1.1 Literature Review

1.1.1 Proximal Interaction.

1.1.2 Implicit Communication.

1.1.3 Legibility in Multi-robot Systems.

1.2 Statement of Contributions

2 Definition of Legibility in Multi-robot Systems

3 Multi-robot Coordination Objectives

3.1 Consensus

3.2 Formation Control

3.3 Coverage Control

3.3.1 Time-varying Coverage.

4 User Study

4.1 Experimental Scenario

4.2 Experimental Procedure

4.2.1 First Part.

4.2.2 Second Part.

4.3 Experimental Variables

4.4 Factorial Design

4.4.1 Point of View.

4.4.2 Position of the Goals.

4.4.3 Learning Effect.

4.4.4 Movement of the Robots.

4.5 Users

4.6 Statistical Tools

5 Results

5.1 Spatial Goal

5.1.1 Influence of Coordination Objective on Spatial Goal Identification.

5.1.2 Coverage Control.

5.2 Coordination Objective

6 Implications and Limitations

6.1 Legibility Definition

6.2 Virtual Reality Simulations and Lack of Realism

6.3 Design Choices

6.4 Reduced Diversity in the User Study Participants

7 Discussion and Conclusions

Acknowledgment

Footnotes

A Experimental Implementation

A.1 Spatial Drive of the Multi-robot Team

A.2 Collision Avoidance

A.3 Formation Control

A.4 Choice of Parameters

A.5 Examples of trajectories

A.6 Quantitative Results

References

Index Terms

Recommendations

Lifelong Adaptation in Heterogeneous Multi-Robot Teams: Response to Continual Variation in Individual Robot Performance

Multi Robot Collision Avoidance with Continuous Curvature Manoeuvres

A switched-system approach to formation control and heading consensus for multi-robot systems

Comments

Information

Published In

Publisher

Publication History

Check for updates

Author Tags

Qualifiers

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

View options

PDF

eReader

Get Access

Login options

Full Access

Figures

Other

Share

Share this Publication link

Share on social media

Affiliations