
Citation 
 Permanent Link:
 http://digital.auraria.edu/AA00005917/00001
Material Information
 Title:
 Distributed algorithms for naming anonymous processes
 Creator:
 Talo, Muhammed ( author )
 Place of Publication:
 Denver, Colo.
 Publisher:
 University of Colorado Denver
 Publication Date:
 2016
 Language:
 English
 Physical Description:
 1 electronic file (118 pages)
Thesis/Dissertation Information
 Degree:
 Doctorate (Doctor of Philosophy)
 Degree Grantor:
 University of Colorado Denver
 Degree Divisions:
 Department of Computer Science and Engineering, CU Denver
 Degree Disciplines:
 Computer science
Information systems
Subjects
 Subjects / Keywords:
 Computer networks ( lcsh )
Parallel algorithms ( lcsh ) Operating systems (Computers) ( lcsh ) Computer networks ( fast ) Operating systems (Computers) ( fast ) Parallel algorithms ( fast )
 Genre:
 bibliography ( marcgt )
theses ( marcgt ) nonfiction ( marcgt )
Notes
 Review:
 We investigate anonymous processors computing in a synchronous manner and communicating via read-write shared memory. This system is known as a parallel random access machine (PRAM). It is parameterized by a number of processors n and a number of shared memory cells. We consider the problem of assigning unique integer names from the interval [1, n] to all n processors of a PRAM. We develop algorithms for each of the eight specific cases determined by which of the following independent properties hold: (1) concurrently attempting to write distinct values into the same memory cell either is allowed or not, (2) the number of shared variables either is unlimited or it is a constant independent of n, and (3) the number of processors n either is known or it is unknown. Our algorithms terminate almost surely; they are Las Vegas when n is known and Monte Carlo when n is not known. We show lower bounds on time, depending on whether the amounts of shared memory are constant or unlimited. In view of these lower bounds, all the Las Vegas algorithms we develop are asymptotically optimal with respect to their expected time, as determined by the available shared memory. We also consider a communication channel in which the only possible communication mode is transmitting beeps, which reach all the nodes instantaneously.
 Review:
 The algorithmic goal is to randomly assign names to the anonymous nodes in such a manner that the names make a contiguous segment of positive integers starting from 1. The algorithms are provably optimal with respect to the expected time O(n log n), the number of used random bits O(n log n), and the probability of error.
 Bibliography:
 Includes bibliographical references.
 System Details:
 System requirements: Adobe Reader.
 Statement of Responsibility:
 by Muhammed Talo.
Record Information
 Source Institution:
 University of Colorado Denver Collections
 Holding Location:
 Auraria Library
 Rights Management:
 All applicable rights reserved by the source institution and holding location.
 Resource Identifier:
 986529339 ( OCLC )
ocn986529339
 Classification:
 LD1193.E52 2016d T35 ( lcc )

Full Text 
DISTRIBUTED ALGORITHMS FOR NAMING ANONYMOUS PROCESSES
by
MUHAMMED TALO
B.S., Inonu University, Turkey, 2005
M.S., Firat University, Turkey, 2007
A thesis submitted to the Faculty of the Graduate School of the University of Colorado in partial fulfillment of the requirements for the degree of Doctor of Philosophy
Computer Science and Information Systems
2016
This thesis for the Doctor of Philosophy degree by Muhammed Talo has been approved for the
Computer Science and Information Systems Program
by
Bogdan S. Chlebus, Advisor
Ellen Gethner, Chair
Michael Mannino
Burt Simon
Tam Yu
December 17, 2016
Talo, Muhammed (Ph.D., Computer Science and Information Systems) Distributed Algorithms for Naming Anonymous Processes Thesis directed by Associate Professor Bogdan S. Chlebus
ABSTRACT
We investigate anonymous processors computing in a synchronous manner and communicating via read-write shared memory. This system is known as a parallel random access machine (PRAM). It is parameterized by a number of processors n and a number of shared memory cells. We consider the problem of assigning unique integer names from the interval [1, n] to all n processors of a PRAM. We develop algorithms for each of the eight specific cases determined by which of the following independent properties hold: (1) concurrently attempting to write distinct values into the same memory cell either is allowed or not, (2) the number of shared variables either is unlimited or it is a constant independent of n, and (3) the number of processors n either is known or it is unknown. Our algorithms terminate almost surely; they are Las Vegas when n is known and Monte Carlo when n is not known. We show lower bounds on time, depending on whether the amounts of shared memory are constant or unlimited. In view of these lower bounds, all the Las Vegas algorithms we develop are asymptotically optimal with respect to their expected time, as determined by the available shared memory. Our Monte Carlo algorithms are correct with probabilities that are 1 − n^(−Ω(1)).
The form and content of this abstract are approved. I recommend its publication.
Approved: Bogdan S. Chlebus
ACKNOWLEDGMENT
First of all, I would like to express my gratitude to my advisor Bogdan S. Chlebus. Without his guidance and mentorship this thesis would have been impossible. Not only did he guide me over the years of my graduate studies, also he always tried to understand me even when I clearly was not making any sense. You have set an example of excellence as a researcher, mentor, and role model.
I want to thank Gianluca De Marco for collaboration on several research projects.
I would like to thank Ellen Gethner and Sarah Mandos for all the support over the years.
My research and work on the thesis have also been sponsored by Turkish Government and supported by Bogdan S. Chlebus through the National Science Foundation. This would not have been possible without their support.
I would also like to thank the faculty and my fellow students at the University of Colorado Denver for creating excellent working conditions.
Finally, I thank my daughters Betul and Aise, my wife Dilek, and my parents for their patience and continuing support.
TABLE OF CONTENTS
1. INTRODUCTION....................................................... 1
2. THE SUMMARY OF THE RESULTS...................................... 5
3. PREVIOUS AND RELATED WORK.......................................... 7
4. TECHNICAL PRELIMINARIES........................................... 15
5. LOWER BOUNDS AND IMPOSSIBILITIES ............................... 23
5.1 Preliminaries..................................................... 23
5.2 Lower Bounds for a PRAM .......................................... 24
5.3 Lower Bounds for a Channel with Beeping........................... 35
6. PRAM: LAS VEGAS ALGORITHMS ....................................... 40
6.1 Arbitrary with Constant Memory.................................... 40
6.2 Arbitrary with Unbounded Memory................................... 42
6.3 Common with Constant Memory....................................... 46
6.4 Common with Unbounded Memory...................................... 52
6.5 Conclusion........................................................ 61
7. PRAM: MONTE CARLO ALGORITHMS...................................... 62
7.1 Arbitrary with Constant Memory.................................... 62
7.2 Arbitrary with Unbounded Memory................................... 68
7.3 Common with Bounded Memory ....................................... 74
7.4 Common with Unbounded Memory...................................... 84
7.5 Conclusion........................................................ 90
8. NAMING A CHANNEL WITH BEEPS .................................... 91
8.1 A Las Vegas Algorithm............................................. 91
8.2 A Monte Carlo Algorithm........................................... 95
8.3 Conclusion....................................................... 101
9. OPEN PROBLEMS AND FUTURE WORK.................................... 102
References........................................................... 103
TABLES
Table
2.1 The list of the naming problems specifications and the corresponding Las Vegas
algorithms........................................................................ 6
2.2 The list of the naming problems specifications and the corresponding Monte
Carlo algorithms.................................................................. 6
FIGURES
Figure
4.1 Procedure VerifyCollision............................................ 19
4.2 Procedure DetectCollision............................................ 20
6.1 Algorithm ArbitraryConstantLV....................................... 41
6.2 Algorithm ArbitraryUnboundedLV...................................... 43
6.3 Algorithm CommonConstantLV.......................................... 47
6.4 Algorithm CommonUnboundedLV......................................... 53
7.1 Algorithm ArbitraryBoundedMC........................................ 63
7.2 Algorithm ArbitraryUnboundedMC...................................... 69
7.3 Procedure EstimateSize............................................... 76
7.4 Procedure ExtendNames................................................ 77
7.5 Algorithm CommonBoundedMC........................................... 82
7.6 Procedure GaugeSizeMC............................................... 85
7.7 Algorithm CommonUnboundedMC......................................... 87
8.1 Algorithm BeepNamingLV.............................................. 92
8.2 Procedure NextString................................................. 97
8.3 Algorithm BeepNamingMC ............................................. 98
1. INTRODUCTION
We consider a distributed system in which some n processors communicate using read-write shared memory. It is assumed that operations performed on shared memory occur synchronously, in that executions of algorithms are structured as sequences of globally synchronized rounds. Each processor is an independent random access machine with its own private memory. Such a system is known as a (synchronous) Parallel Random Access Machine (PRAM). We consider the problem of assigning distinct integer names from the interval [1, n] to the processors of a PRAM, when originally the processors do not have distinct identifiers.
The problem of assigning unique names to anonymous processes in distributed systems can be considered a stage in either building such systems or making them fully operational. Correspondingly, this may be categorized as either an architectural challenge or an algorithmic one. For example, tightly synchronized message-passing systems are typically considered under the assumption that processors are equipped with unique identifiers from a contiguous segment of integers. This is because such systems impose strong demands on the architecture, and the task of assigning identifiers to processors is modest when compared to providing synchrony. Similarly, when synchronous parallel machines are designed, processors may be identified by how they are attached to the underlying communication network. In contrast, a PRAM is a virtual model in which processors communicate via shared memory; see an exposition of PRAM as a programming environment given by Keller et al. [62]. This model does not assume any relation between the shared memory and processors that identifies individual processors.
Distributed systems with shared read-write registers are usually considered to be asynchronous. Synchrony in such environments can be added by simulations rather than by a supportive architecture or an underlying communication network. Processes do not need to be hardware nodes; instead, they can be virtual computing agents. When a synchronous PRAM is considered, as obtained by a simulation, then the underlying system architecture does not facilitate identifying processors, and so we do not necessarily expect that processors
are equipped with distinct identifiers at the beginning of a simulation.
We view PRAM as an abstract construct which provides a distributed environment to develop algorithms with multiple agents/processors working concurrently; see Vishkin [89] for a comprehensive exposition of PRAM as a vehicle facilitating parallel programming and harnessing the power of multicore computer architectures. Assigning names to processors by themselves in a distributed manner is a plausible stage in an algorithmic development of such environments, as it cannot be delegated to the stage of building the hardware of a parallel machine.
When processors of a distributed/parallel system are anonymous, the task of assigning a unique identifier to each processor is a key step to make the system fully operational, because names are needed for executing deterministic algorithms. We consider naming to be the task of assigning unique integers in the range [1, n] to a given set of n processors as their names. Distributed algorithms assigning names to anonymous processors are called naming in this thesis. We assume that anonymous processors do not have any features that facilitate identifying or distinguishing them.
We deal with two kinds of randomized (naming) algorithms, called Monte Carlo and Las Vegas, which are defined as follows. A randomized algorithm is Las Vegas when it terminates almost surely and the algorithm returns a correct output upon termination. A randomized algorithm is Monte Carlo when it terminates almost surely and an incorrect output may be produced upon termination, but the probability of error converges to zero with the size of input growing unbounded. The naming algorithms we develop have qualities that depend on whether n is known or not, according to the following simple rule: each algorithm for a known n is Las Vegas while each algorithm for an unknown n is Monte Carlo. Our Monte Carlo algorithms have the probability of error converging to zero with a rate that is polynomial in n. Moreover, when incorrect (duplicate) names are assigned, the set of integers used as names makes a contiguous segment starting at the smallest name 1.
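The Las Vegas property defined above can be illustrated with a toy centralized simulation. This is a sketch of my own for illustration only, not one of the thesis's algorithms; the function name and the retry structure are invented. Anonymous processes repeatedly draw candidate names from [1, n], a draw sticks only when it is fresh and uncontended, so the loop terminates almost surely, and the output, once produced, is always a correct assignment.

```python
import random

def las_vegas_naming(n, seed=0):
    """Toy Las Vegas naming: retry until every process holds a unique
    name from [1, n].  Correct on termination; terminates almost surely."""
    rng = random.Random(seed)
    names = {}                       # process index -> assigned name
    unnamed = set(range(n))
    while unnamed:
        round_picks = {}
        for p in unnamed:            # each unnamed process draws a candidate
            round_picks.setdefault(rng.randint(1, n), []).append(p)
        taken = set(names.values())
        for name, picks in round_picks.items():
            # a draw succeeds only if the name is fresh and uncontended
            if name not in taken and len(picks) == 1:
                names[picks[0]] = name
                unnamed.discard(picks[0])
    return [names[p] for p in range(n)]
```

A Monte Carlo variant would instead stop after a fixed number of rounds and accept a small probability of duplicate names.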
We say that a parameter of an algorithmic problem is known when it can be used in a
code of an algorithm. We consider two groups of naming problems for a PRAM, depending on whether the number of processors n is known or not.
Additionally, we consider two categories of naming problems depending on how much shared memory is available. In one case, there is a constant number of memory cells, which means that the amount of memory is independent of n but as large as needed for algorithm design. In the other case, the number of shared memory cells is unbounded, but how much is used depends on an algorithm and n. When there is an unbounded amount of memory, O(n) memory cells actually suffice for the algorithms we develop. We also categorize naming problems depending on whether it is an Arbitrary PRAM (distinct values may be concurrently attempted to be written into a register, and an arbitrary one of them gets written) or a Common PRAM (only equal values may be concurrently attempted to be written into a register).
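The difference between the two write-conflict semantics can be made concrete with a small round-resolution sketch (my own illustration; the function name and the "first writer wins" tie-break for the Arbitrary variant are assumptions, since the model only promises that *some* attempted value is written):

```python
def write_round(cell, writes, variant):
    """Resolve one synchronous round of concurrent writes to one cell.
    `writes` lists the values the processors attempt to write."""
    if not writes:
        return cell                  # nobody wrote: the cell is unchanged
    if variant == "Common":
        # Common PRAM: concurrent writers must all carry the same value
        assert len(set(writes)) == 1, "Common PRAM forbids distinct values"
        return writes[0]
    if variant == "Arbitrary":
        # Arbitrary PRAM: some attempted value wins; which one is not
        # specified, so we model the adversarial choice as "first"
        return writes[0]
    raise ValueError(variant)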
Next, we investigate an anonymous channel with beeping. There are some n stations attached to the channel that are devoid of any identifiers. Communication proceeds in synchronous rounds. All the stations start together in the same round. The channel provides binary feedback to all the attached stations: when no station transmits, nothing is sensed on the communication medium, and when some station does transmit, every station detects a beep.
A beeping channel resembles multiple-access channels, in that it can be interpreted as a single-hop radio network. The difference between the two models is in the feedback provided by each kind of channel. The traditional multiple-access channel with collision detection provides the following ternary feedback: silence occurs when no station transmits, a message is heard when exactly one station transmits, and a collision is produced by multiple stations transmitting simultaneously, which results in no message heard and can be detected by carrier sensing as distinct from silence. Multiple-access channels also come in a variant without collision detection. In such channels the binary feedback is as follows: when exactly one station transmits then the transmitted message is heard by every station, and otherwise,
when either no station or multiple stations transmit, this results in silence. A channel with beeping has its communication capabilities restricted to carrier sensing only, without even the functionality of transmitting specific bits as messages. The only apparent mode of exchanging information on such a synchronous channel with beeping is to suitably encode it by sequences of beeps and silences.
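The beeping feedback rule, and the way data must be encoded as beep/silence sequences, can be sketched in a few lines (an illustrative model of mine, not code from the thesis; the bit-per-round encoding is one simple choice among many):

```python
def beep_round(transmitting):
    """One synchronous round on a beeping channel: every station hears a
    beep iff at least one station transmits; no payload bits exist."""
    return any(transmitting)

def encode_number(k):
    """Encode an integer as beeps/silences over successive rounds, one
    round per bit (beep = 1, silence = 0) -- the only way to convey data
    on such a channel is by patterns of beeps over time."""
    bits = bin(k)[2:]
    return [beep_round([b == "1"]) for b in bits]
```

Note that when two stations beep in the same round the channel still reports just one beep, which is exactly why collisions are undetectable and naming requires randomization here.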
Modeling communication by a mechanism as limited as beeping has been motivated by diverse aspects of communication and distributed computing. Beeping provides a detection of collision on a transmitting medium by sensing it. Communication by carrier sensing alone can be placed in the general context of investigating wireless communication at the physical level and modeling interference of concurrent transmissions, of which the signal-to-interference-plus-noise ratio (SINR) model is among the most popular and well studied; see [54, 61, 85]. Beeping is then a very limited mode of wireless communication, with feedback in the form of either interference or the lack thereof. Another motivation comes from biological systems, in which agents exchange information in a distributed manner, while the environment severely restricts how such agents communicate; see [2, 78, 86]. Finally, communication with beeps belongs to the area of distributed computing by weak devices, where the involved agents have restricted computational and communication capabilities. In this context, the devices are modeled as finite-state machines that communicate asynchronously by exchanging states or messages from a finite alphabet. Examples of this approach include the population-protocols model introduced by Angluin et al. [7] (see also [9, 11, 73]), and the stone-age distributed computing model proposed by Emek and Wattenhofer [41].
2. THE SUMMARY OF THE RESULTS
We consider randomized algorithms executed by anonymous processors that operate in a synchronous manner using read-write shared memory with the goal of assigning unique names to the processors. This problem is investigated in eight specific cases, depending on additional assumptions, and we give an algorithm for each case. The three independent assumptions regard the following: (1) the knowledge of n, (2) the amount of shared memory, and (3) the PRAM variant.
The Las Vegas algorithms have been submitted as a journal paper and are taken from [26]. The naming algorithms we give terminate with probability 1. These algorithms are Las Vegas for a known number of processors n, and otherwise they are Monte Carlo. All our algorithms use the optimum expected number O(n log n) of random bits. We show that naming algorithms with n processors and C > 0 shared memory cells need to operate in Ω(n/C) expected time on an Arbitrary PRAM, and in Ω(n log n / C) expected time on a Common PRAM. We show that any naming algorithm needs to work in the expected time Ω(log n); this bound matters when there is an unbounded supply of shared memory. Based on these facts, all our Las Vegas algorithms for the case of known n operate in the asymptotically optimum time, and when the amount of memory is unlimited, they use only an expected amount of space that is provably necessary. The list of the naming problem specifications and the corresponding Las Vegas algorithms with their performance bounds is summarized in Table 2.1.
The Monte Carlo algorithms have been submitted as a journal paper and are taken from [27]. We show that a Monte Carlo naming algorithm that uses O(n log n) random bits has to have the property that it fails to assign unique names with probability that is n^(−Ω(1)).
The Las Vegas and Monte Carlo naming algorithms for a beeping channel have been submitted as a journal paper and are taken from [28]. We considered assigning names to anonymous
Table 2.1: Four naming problems, as determined by the PRAM model and the available amount of shared memory, with the respective performance bounds of their solutions. When the number of shared memory cells is not a constant then the given usage is the expected number of shared memory cells that are actually used.

PRAM Model   Memory       Time        Algorithm
Arbitrary    O(1)         O(n)        ArbitraryConstantLV in Section 6.1
Arbitrary    O(n/log n)   O(log n)    ArbitraryUnboundedLV in Section 6.2
Common       O(1)         O(n log n)  CommonConstantLV in Section 6.3
Common       O(n)         O(log n)    CommonUnboundedLV in Section 6.4
Table 2.2: Four naming problems, as determined by the PRAM model and the available amount of shared memory, with the respective performance bounds of their solutions as functions of the number of processors n. When time is marked as polylog, the algorithm comes in two variants, such that in one the expected time is O(log n) and the amount of used shared memory is suboptimal.

PRAM Model   Memory      Time        Algorithm
Arbitrary    O(1)        O(n)        ArbitraryBoundedMC in Section 7.1
Arbitrary    unbounded   polylog     ArbitraryUnboundedMC in Section 7.2
Common       O(1)        O(n log n)  CommonBoundedMC in Section 7.3
Common       unbounded   polylog     CommonUnboundedMC in Section 7.4
stations attached to a channel that allows only beeps to be heard. We present a Las Vegas naming algorithm and a Monte Carlo naming algorithm, and show that these algorithms are provably optimal with respect to the number of used random bits O(n log n), the expected time O(n log n), and the probability of error.
3. PREVIOUS AND RELATED WORK
Here we survey the previous work on anonymous naming.
Lipton and Park [70] were the first to consider the naming problem in asynchronous shared memory systems. They studied naming in asynchronous distributed systems with read-write shared memory controlled by adaptive schedulers; they proposed a solution that terminates with positive probability, which can be made arbitrarily close to 1 assuming a known n. They developed a randomized algorithm that solves the naming problem. Their algorithm is not guaranteed to terminate; however, if it terminates, no two processors obtain the same name. Once the processors terminate, the assigned names comprise exactly the set {1, 2, ..., n}.
Their algorithm operates as follows. At the beginning of an execution, all processors initialize the contents of the shared registers to zeros. Every processor randomly selects an integer i from {1, 2, ..., n²}, where n is the number of processors. Then each processor writes 1 to the selected cell i. All processors repeat this procedure until there is at least one row which contains n bits with the value 1. Finally, the integer chosen by each processor is used as a name. The algorithm presented in [70] uses O(Ln²) bits and terminates in O(Ln²) time with probability 1 − c^(−L) for some constant c > 1.
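The retry loop just described can be sketched as a small centralized Python simulation. This is my reading of the description above rather than the authors' code: rounds are fully synchronous, the repetition structure over L attempts is collapsed into a single loop, and the chosen index is used directly as the name.

```python
import random

def lipton_park_naming(n, seed=0):
    """Sketch of the Lipton-Park retry loop: processors pick cells in a
    shared array of n*n bits and succeed when no two picks collide."""
    rng = random.Random(seed)
    while True:
        memory = [0] * (n * n + 1)   # shared cells, indexed 1..n^2
        picks = [rng.randint(1, n * n) for _ in range(n)]
        for i in picks:
            memory[i] = 1            # each processor writes 1 to its cell
        if sum(memory) == n:         # n ones <=> all picks were distinct
            return picks             # each chosen integer serves as a name
```

Since n processors spread over n² cells, a round succeeds with constant probability by a birthday-paradox argument, which is what keeps the expected number of retries small.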
Teng [87] provided a randomized two-layer solution for the naming problem considering the same setting as Lipton and Park (asynchronous processors, with the algorithm working regardless of the initial contents of shared memory), but his solution improved the failure probability and decreased the space to O(n log² n) shared bits, with probability at least 1 − n^(−c) for a constant c. The algorithm terminates in O(n log² n) time.
The author developed a simple algorithm for asynchronous systems with known n that is a straightforward modification of Lipton and Park's algorithm [70]. Teng assumed that the n processors are divided into K groups such that, with high probability, each group has about n/K processors. Hence, he reduced the problem size from n to n/K. Then he used a technique similar to that of Lipton and Park for the smaller problem. The number of processors is unknown in
each group. Therefore, every processor checks whether the number of 1s equals n in the maximum-size rows of each group. The author also observed that if n is unknown then no algorithm is guaranteed to terminate with correct names.
Lim and Park [69] showed that the naming problem can be solved in O(n) space; however, they used word operations instead of bit operations. The authors used a shared memory array indexed 1 through n, where n is the number of processors. The basic idea of their algorithm is as follows: at the beginning of the execution, all processors initialize the contents of the shared registers to zeros. Then every processor randomly chooses a key and an ID, which corresponds to the index of a cell, and tries to store its key in the claimed cell. When more than one processor chooses the same cell, the processor with the maximum key keeps the claimed ID and the remaining processors with smaller keys claim new IDs. If there are no zero entries in the array, i.e., each processor has confirmed its own ID, then the algorithm terminates. Note that their protocol can fail when more than one processor chooses the same ID and the same key. Additionally, the authors answered the open questions that Teng posed at the conclusion of [87].
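The key-contention idea can be illustrated with a simplified centralized sketch. This is my own abstraction of the description above, not the authors' algorithm: real keys are drawn from a bounded word-size range (so ties, and hence failures, are possible), whereas this sketch uses real-valued keys, which makes ties vanishingly unlikely, and it lets losers reclaim only still-empty cells.

```python
import random

def lim_park_naming(n, seed=0):
    """Sketch of Lim-Park-style naming: each processor claims a cell (its
    prospective ID) with a random key; the largest key wins a contested
    cell, losers retry, until every cell of the n-cell array is owned."""
    rng = random.Random(seed)
    owner = [None] * (n + 1)         # owner[cell] = processor holding it
    unnamed = set(range(n))
    while unnamed:
        claims = {}
        for p in unnamed:
            cell = rng.randint(1, n)
            if owner[cell] is None:  # only unowned cells can be claimed
                claims.setdefault(cell, []).append((rng.random(), p))
        for cell, contenders in claims.items():
            _, winner = max(contenders)   # the maximum key keeps the cell
            owner[cell] = winner
            unnamed.discard(winner)
    return {owner[c]: c for c in range(1, n + 1)}  # processor -> ID
```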
Egecioglu and Singh [39] proposed a synchronous algorithm in which each processor repeatedly chooses a new random index value, with each selection made independently and uniformly at random, and sets the corresponding shared register to 1. Then it counts the number of ones. If the total number of ones equals n, the processor exits the loop and assigns itself the index value as its ID. The expected termination time for the synchronous algorithm is O(n²). They also proposed a polynomial-time Las Vegas naming algorithm for asynchronous systems with oblivious scheduling of events for a known n under a weak shared read-write memory system. Intuitively, the idea of their algorithm is as follows. It uses K copies of an array. Each processor chooses a random index value for each of the K copies of the array, rather than a single array, and sets the selected register to 1. Then every processor reads all K arrays. Processors perform write operations on the K copies of the arrays in ascending order, but afterwards read all arrays in descending order. A processor keeps executing the algorithm until the total number
of ones in a row equals the number of processors. Processors may exit from the execution after detecting a successful read because of asynchrony. A processor that exits from the execution after a successful scan records the IDs of the succeeded row in its private memory. Because of asynchrony, the rest of the processors may not be able to execute a successful scan at the same time. If there is a successful read and the number of detected processors is less than K, the rest of the processors repeat the same sequence of steps on a different array, with argument n − 1, and so on. When a processor exits from the execution after a successful read, it waits for the rest of the processors to exit, and then they assign names to themselves in the range [1, n]. The authors also showed that symmetry cannot be broken if the exact number of processes is unknown. Moreover, they observed that the participation of every processor is necessary in order to terminate.
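The synchronous loop of Egecioglu and Singh described at the start of the paragraph above can be sketched directly (a centralized toy of my own; the asynchronous K-array layer is omitted, and an n-cell array is assumed since the write-then-count test only needs one register per prospective name):

```python
import random

def egecioglu_singh_sync(n, seed=0):
    """Sketch of the synchronous variant: all processors redraw indices
    each round, set the chosen registers to 1, and count the ones; the
    round succeeds when the count equals n (so no two picks collided)."""
    rng = random.Random(seed)
    while True:
        registers = [0] * n
        picks = [rng.randrange(n) for _ in range(n)]
        for i in picks:
            registers[i] = 1
        if sum(registers) == n:      # n ones => all picks were distinct
            return [i + 1 for i in picks]  # the index value becomes the ID
```

With only n cells for n processors, a round succeeds with probability n!/nⁿ, which is why this variant needs O(n²) expected time where the n²-cell scheme of Lipton and Park succeeds in a constant expected number of rounds.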
Kutten et al. [68] considered naming in asynchronous systems with shared read-write memory. They gave a Las Vegas algorithm for an oblivious scheduler for the case of known n, which works in the expected time O(log n) while using O(n) shared registers, and also showed that logarithmic time is required to assign names to anonymous processes. The authors provided a non-terminating dynamic algorithm, where processes may stop and start taking steps during the execution, and then added a static termination detection mechanism which works when the number of processors n is known.
Their dynamic algorithm operates as follows. When a process is active, it randomly selects an ID and continually checks whether the same ID is claimed by any other process. A process repeatedly either reads the claimed register or writes a random bit, chosen independently and uniformly at random, and records the chosen value. When a process reads a register, it checks whether the value of the register has changed since it last wrote to it. If the process observes that its claimed ID has also been selected by another process, it randomly selects a new ID to claim. Note that in their algorithm each process detects a collision in constant time. If a process observes no change after reading the contents of the shared register, it moves on to the next iteration. Next, they provided an algorithm to make the
dynamic algorithm terminate. They assume that all shared registers are initialized. The termination detection algorithm employs a binary tree, whose leaves correspond to claimed IDs. Each process traverses the binary tree from the leaves to the root, updating each node with the sum of its children. When a process sees that the root of the tree has the value n, it exits the loop. In other words, the set of claimed IDs is fixed when the root of the binary tree has the value n.
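The counting role of that binary tree can be shown with a minimal sketch (my own illustration of the data structure's invariant, not the concurrent traversal protocol of [68], which interleaves updates by many processes):

```python
def tree_count(claimed_leaves):
    """Termination-detection sketch: leaves mark claimed IDs; each
    internal node holds the sum of its children, so the root ends up
    holding the number of claimed IDs.  Processes exit once the root
    reaches n."""
    level = [1 if claimed else 0 for claimed in claimed_leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)          # pad so every node has a sibling
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
    return level[0]                  # the root: total number of claimed IDs
```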
They used registers of size O(log n) bits, whereas the algorithms of [39, 70, 87] used single-bit registers. Additionally, they showed that if n is unknown then a Las Vegas naming algorithm does not exist, and that a finite-state Las Vegas naming algorithm can work only for an oblivious scheduler; that is to say, there is no terminating algorithm if n is not known or the scheduler is adaptive.
The authors also gave a Las Vegas algorithm which works with unbounded space under any fair scheduler. Finally, they provided a deterministic solution for the naming problem in the read-modify-write model by using just one register. This is a more powerful computational model, where a process can read and update a shared variable in just one step.
Panconesi et al. [79] gave a randomized wait-free naming algorithm for anonymous systems with processes prone to crashes that communicate by single-writer registers. They assume different processes may address a register with different index numbers and can read all other shared variables. They gave an algorithm based on a wait-free implementation of Test&SetOnce objects for an adaptive scheduler for the case of known n, which works in the expected running time of O(n log n log log n) bit operations with probability at least 1 − o(1) while using a namespace of size (1 + ε)n, where ε > 0. The model considered in that work assigns unique registers to nameless processes and so has the potential to defy the impossibility of wait-free naming for general multi-writer registers as observed by Kutten et al. [68].
Buhrman et al. [23] considered the relative complexity of naming and consensus problems in asynchronous systems with shared memory that are prone to crash failures, demonstrating
that naming is harder than consensus.
Now we review work on problems in anonymous distributed systems other than naming. Aspnes et al. [10] gave a comparative study of anonymous distributed systems with different communication mechanisms, including broadcast and shared-memory objects of various functionalities, like read-write registers and counters. Alistarh et al. [5] gave randomized renaming algorithms that act like naming ones, in that process identifiers are not referred to; for more on renaming see [4, 13, 30]. Aspnes et al. [12] considered solving consensus in anonymous systems with infinitely many processes. Attiya et al. [15] and Jayanti and Toueg [60] studied the impact of initialization of shared registers on the solvability of tasks like consensus and wakeup in fault-free anonymous systems. Bonnet et al. [21] considered the solvability of consensus in anonymous systems with processes prone to crashes but augmented with failure detectors. Guerraoui and Ruppert [55] showed that certain tasks like timestamping, snapshots, and consensus have deterministic solutions in anonymous systems with shared read-write registers prone to process crashes. Ruppert [82] studied the impact of anonymity of processes on wait-free computing and the mutual implementability of types of shared objects.
Lower bounds on PRAMs were given by Fich et al. [43], Cook et al. [31], and Beame [19], among others. A review of lower bounds based on the information-theoretic approach is given by Attiya and Ellen [14]. Yao's minimax principle was given by Yao [91]; the book by Motwani and Raghavan [77] gives examples of its applications.
The problem of concurrent communication in anonymous networks was first considered by Angluin [6]. That work showed, in particular, that randomization is needed in naming algorithms when they are executed in environments that are perfectly symmetric; other related impossibility results are surveyed by Fich and Ruppert [44].
The work on anonymous networks that followed was either on specific network topologies or on problems in general message-passing systems. The most popular specific topologies included the ring and the hypercube. In particular, the ring topology was investigated by Attiya et al. [16, 17], Flocchini et al. [45], Diks et al. [38], Itai and Rodeh [58], and Kranakis et al. [65], and the hypercube topology was studied by Kranakis and Krizanc [64] and Kranakis and Santoro [67].
Work on algorithmic problems in anonymous networks of general topologies, or on anonymous/named agents in anonymous/named networks, included the following specific contributions. Afek and Matias [3] and Schieber and Snir [84] considered leader election, finding spanning trees, and naming in general anonymous networks. Angluin et al. [8] studied adversarial communication by anonymous agents, and Angluin et al. [9] considered self-stabilizing protocols for anonymous asynchronous agents deployed in a network of unknown size. Chalopin et al. [24] studied naming and leader election in asynchronous networks when a node knows the map of the network but its position on the map is unknown. Chlebus et al. [29] investigated anonymous complete networks whose links and nodes are subject to random independent failures, in which a single fault-free node has to wake up all nodes by propagating a wakeup message through the network. Dereniowski and Pelc [36] considered leader election among anonymous agents in anonymous networks. Dieudonné and Pelc [37] studied teams of anonymous mobile agents in networks that execute a deterministic algorithm with the goal of convening at one node. Fraigniaud et al. [48] considered naming in anonymous networks with one node distinguished as a leader. Gąsieniec et al. [52] investigated anonymous agents pursuing the goal of meeting at a node or edge of a ring. Glacet et al. [53] considered leader election in anonymous trees. Kowalski and Malinowski [63] studied named agents meeting in anonymous networks. Kranakis et al. [66] investigated computing boolean functions on anonymous networks. Métivier et al. [72] considered naming anonymous unknown graphs. Michail et al. [74] studied the problems of naming and counting nodes in dynamic anonymous networks. Pelc [80] considered activating an anonymous ad hoc radio network from a single source by a deterministic algorithm.
Yamashita and Kameda [90] investigated topological properties of anonymous networks that allow for deterministic solutions for representative algorithmic problems.
General questions of computability in anonymous messagepassing systems implemented in networks were studied by Boldi and Vigna [20], Emek et al. [40], and Sakamoto [83].
Next, we review work on problems for beeping networks. The model of communication by discrete beeping was introduced by Cornejo and Kuhn [32], who considered a general-topology wireless network in which nodes use only carrier sensing to communicate, and who developed algorithms for node coloring. They were inspired by continuous beeping, studied by Degesys et al. [35] and Motskin et al. [76], and by the implementation of coordination by carrier sensing given by Flury and Wattenhofer [46].
Afek et al. [1] considered the problem of finding a maximal independent set of nodes in a distributed manner when the nodes can only beep, under additional assumptions regarding knowledge of the size of the network, waking up the network by beeps, collision detection among concurrent beeps, and synchrony. Brandes et al. [22] studied the problem of randomly estimating the number of nodes attached to a single-hop beeping network. Czumaj and Davies [34] systematically approached the tasks of deterministic broadcasting, gossiping, and multi-broadcasting at the bit level in general-topology symmetric beeping networks. In related work, Hounkanli and Pelc [56] studied deterministic broadcasting in asynchronous beeping networks of general topology with various levels of knowledge about the network. Förster et al. [47] considered leader election by deterministic algorithms in general multi-hop networks with beeping. Gilbert and Newport [51] studied the efficiency of leader election in a beeping single-hop channel when nodes are state machines of constant size with a specific precision of randomized state transitions. Huang and Moscibroda [57] considered the problems of identifying subsets of stations connected to a beeping channel and compared their complexity to those on multiple-access channels. Yu et al. [92] considered the problem of constructing a minimum dominating set in networks with beeping.
Networks of nodes communicating by beeping share common features with radio networks with collision detection. Ghaffari and Haeupler [49] gave an efficient leader-election algorithm by treating collision detection as beeping and transmitting messages as bit strings. Their approach by way of beep waves was adapted to broadcasting in networks with beeping by Czumaj and Davies [34]. In related work, Ghaffari et al. [50] developed randomized broadcasting and multi-broadcasting in radio networks with collision detection.
4. TECHNICAL PRELIMINARIES
A synchronous shared-memory system in which some n processors operate concurrently is the assumed model of computation. The essential properties of such systems are as follows: (1) shared memory cells have only reading/writing capabilities, and (2) operations accessing the shared registers are globally synchronized, so that processors work in lockstep.
An execution of an algorithm is structured as a sequence of rounds so that each processor performs either a read from or a write to a shared memory cell, along with local computation. We assume that a processor carries out its private computation in a round in a negligible portion of the round. Processors can generate as many private random bits per round as needed; all these random bits generated in an execution are assumed to be independent.
Each shared memory cell is assumed to be initialized to 0 as a default value. This assumption simplifies the exposition, but it can be removed, as any algorithm assuming such an initialization can be modified in a relatively straightforward manner to work with dirty memory. A shared memory cell can store any value as needed in algorithms, in particular, integers of magnitude that may depend on n; all our algorithms require a memory cell to store O(log n) bits. An invocation of either reading from or writing to a memory location is completed in the round of invocation. This model of computation is referred to in the literature as the Parallel Random Access Machine (PRAM) [59, 81]. A PRAM is usually defined as a model with an unlimited number of shared-memory cells, by analogy with the random-access machine (RAM) model. We consider the following two instantiations of the model, determined by the amount of shared memory. In one situation, there is a constant number of shared memory cells, which is independent of the number of processors n but as large as needed in the specific algorithm. In the other case, the number of shared memory cells is unlimited in principle, but the expected number of shared registers accessed in an execution depends on n and is sought to be minimized.
A concurrent read occurs when a group of processors read from the same memory cell in the same round; this results in each of these processors obtaining the value stored in the memory cell at the end of the preceding round. A concurrent write occurs when a group of processors invoke a write to the same memory cell in the same round. Without loss of generality, we may assume that a concurrent read of a memory cell and a concurrent write to the same memory cell do not occur simultaneously: this is because we could designate rounds for reading only and for writing only, depending on their parity, thereby slowing the algorithm down by a factor of two. A clarification is needed regarding which value gets written to a memory cell in a concurrent write when multiple distinct values are attempted to be written; such stipulations determine suitable variants of the model. We will consider algorithms for the following two PRAM variants, determined by their respective concurrent-write semantics. A Common PRAM is defined by the property that when a group of processors want to write to the same shared memory cell in a round, then all the values that any of the processors want to write must be identical, otherwise the operation is illegal. Concurrent attempts to write the same value to a memory cell result in this value getting written in this round.
An Arbitrary PRAM allows attempts to write any legitimate values to the same memory cell in the same round. When this occurs, one of these values gets written, while the selection of this value is arbitrary. All possible selections of values that get written need to be taken into account when arguing about the correctness of an algorithm.
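To make the two write semantics concrete, the following Python sketch (an illustration of ours; the function and variant names are not from the literature) resolves one round of concurrent writes to a single memory cell:

```python
import random

def concurrent_write(attempts, variant):
    """Resolve one round of concurrent writes to one shared memory cell.

    attempts: list of values the processors attempt to write this round.
    variant: "common" or "arbitrary", matching the two PRAM variants.
    Returns the value that gets written, or None when nobody writes.
    """
    if not attempts:
        return None  # no write this round; the cell keeps its old value
    if variant == "common":
        # Common PRAM: distinct values in one round are illegal.
        if len(set(attempts)) > 1:
            raise ValueError("illegal Common write: distinct values")
        return attempts[0]
    if variant == "arbitrary":
        # Arbitrary PRAM: some attempted value wins; the choice is
        # adversarial in the analysis, random here for illustration.
        return random.choice(attempts)
    raise ValueError("unknown variant")
```

For example, `concurrent_write([7, 7, 7], "common")` returns 7, while `concurrent_write([1, 2], "arbitrary")` returns either 1 or 2; a correctness argument must hold for every such choice.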
We will rely on certain standard algorithms developed for PRAMs, as explained in [59, 81]. One of them is for prefix-type computations. A typical situation in which it is applied occurs when there is an array of m shared memory cells, each memory cell storing either 0 or 1. This may represent an array of bins, where 1 stands for a nonempty bin and 0 for an empty bin. Let the rank of a nonempty bin at address x be the number of nonempty bins with addresses smaller than or equal to x. Ranks can be computed in O(log m) time using an auxiliary memory of O(m) cells, assuming there is at least one processor assigned to each nonempty bin, while the other processors do not participate. The bins are associated with the leaves of a binary tree. The processors traverse the binary tree from the leaves to the root and back to the leaves. When updating information at a node, only the information stored at the parent, the sibling, and the children is used. We may observe that the same memory can be used repeatedly when such a computation needs to be performed multiple times. A possible approach is to verify whether the information at a needed memory cell, representing either a parent, a sibling, or a child of a visited node, is fresh or rather stale from previous executions. This can be accomplished by a processor in the following three steps. First, the processor erases the memory cell it needs to read, overwriting its present value with a blank value. Second, the processor writes again the value at the node it visits, which may have been erased in the previous step by other processors that need that value. Finally, the processor reads again the memory cell it just erased, to see whether it stays erased, which means its contents were stale, or not, which means its contents got rewritten and so are fresh.
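The rank computation described above amounts to a prefix sum over the 0/1 array of bins. The following Python sketch (our own illustration; the function name and padding scheme are not from the literature) mirrors the leaves-to-root and root-to-leaves traversal sequentially; on a PRAM, each level of the tree is processed in constant time, giving O(log m) time overall:

```python
def bin_ranks(bins):
    """Compute the rank of each nonempty bin: the number of nonempty
    bins with addresses <= its own. Returns a list with the rank at
    nonempty positions and None at empty ones.

    Sequential stand-in for the parallel tree computation: an up-sweep
    sums subtrees, a down-sweep gives each leaf the sum to its left."""
    m = len(bins)
    size = 1
    while size < m:          # pad to a power of two: a complete tree
        size *= 2
    tree = [0] * (2 * size)
    tree[size:size + m] = bins           # leaves hold the 0/1 values
    for i in range(size - 1, 0, -1):     # up-sweep: subtree sums
        tree[i] = tree[2 * i] + tree[2 * i + 1]
    prefix = [0] * (2 * size)            # sum of leaves left of subtree
    for i in range(2, 2 * size):         # down-sweep, top to bottom
        parent = i // 2
        left_sibling = tree[2 * parent] if i % 2 == 1 else 0
        prefix[i] = prefix[parent] + left_sibling
    return [prefix[size + j] + 1 if bins[j] else None for j in range(m)]
```

For instance, `bin_ranks([1, 0, 1, 1])` yields `[1, None, 2, 3]`: the nonempty bins receive consecutive ranks, which may then serve as names.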
Balls into bins. Assigning names to processors can be visualized as throwing balls into bins. Imagine that balls are handled by processors and bins are represented by either memory addresses or rounds in a segment of rounds. Throwing a ball means either writing into some memory address a value that represents a ball, or choosing a round from a segment of rounds. A collision occurs when two balls end up in the same bin; this means that two processors wrote to the same memory address, though not necessarily in the same round, or that they selected the same round. The rank of a bin containing a ball is the number of bins with smaller or equal addresses that contain balls. When each processor in a group throws a ball and there is no collision, then this in principle breaks symmetry in a manner that allows unique names to be assigned in the group: the ranks of the selected bins may serve as names.
The following terms refer to the status of a bin in a given round. A bin is called empty when there are no balls in it. A bin is singleton when it contains a single ball. A bin is multiple when there are at least two balls in it. Finally, a bin with at least one ball is occupied.
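Once the throws are fixed, these statuses are straightforward to compute; the following small Python sketch (illustrative only, with names of our choosing) classifies every bin:

```python
from collections import Counter

def bin_statuses(throws, num_bins):
    """Classify bins after each processor throws one ball.

    throws: list of bin choices, one entry per processor.
    Returns a dict mapping each bin to "empty", "singleton", or
    "multiple"; a bin is occupied exactly when it is not empty."""
    counts = Counter(throws)
    return {b: ("empty" if counts[b] == 0
                else "singleton" if counts[b] == 1
                else "multiple")
            for b in range(num_bins)}
```

For example, throws `[0, 2, 2]` into four bins give bin 0 status singleton, bin 2 status multiple, and bins 1 and 3 status empty.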
The idea of representing attempts to assign names as throwing balls into bins is quite generic. In particular, it was applied by Egecioglu and Singh [39], who proposed a synchronous algorithm that repeatedly throws all balls together into all available bins, with the selections of bins for balls made independently and uniformly at random. In their algorithm for n processors, γn memory cells are used, where γ > 1. Let us choose γ = 3 to make the following calculations specific. This algorithm has an exponential expected-time performance. To see this, we estimate the probability that each bin is either singleton or empty. Let the balls be thrown one by one. After the first n/2 balls are in singleton bins, at least n/2 of the 3n bins are occupied, so the probability of hitting an empty bin is at most (5n/2)/(3n) = 5/6; we treat this as a success in a Bernoulli trial. The probability of n/2 such successes is at most (5/6)^(n/2), so the expected time to wait for the algorithm to terminate is at least (6/5)^(n/2), which is exponential in n.
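The exponential behavior is easy to observe experimentally for small n; the following Python sketch (our illustration of the repeated-throwing scheme, feasible only for small n since the expected number of repetitions grows exponentially) counts repetitions until a collision-free throw:

```python
import random

def repetitions_until_no_collision(n, gamma=3, rng=random.Random(1)):
    """Repeatedly throw all n balls into gamma*n bins, independently
    and uniformly at random, until every bin holds at most one ball;
    return the number of repetitions performed."""
    rounds = 0
    while True:
        rounds += 1
        throws = [rng.randrange(gamma * n) for _ in range(n)]
        if len(set(throws)) == n:  # all balls in distinct bins
            return rounds
```

Averaging the returned count over many runs for increasing n illustrates the exponential growth of the expected waiting time.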
We consider related processes that could be as fast as O(log n) in expected time, while still using only O(n) shared memory cells; see Section 6.4. The idea is to let balls in singleton bins stay put and to move only those that collided with other balls by landing in bins that thereby became multiple. To implement this on a Common PRAM, we need a way to detect collisions, which we explain next.
Collisions among balls. We will use a randomized procedure for a Common PRAM to verify whether a collision occurs in a bin, say, a bin x; the procedure is executed by each processor that selected bin x. This procedure, VerifyCollision, is presented in Figure 4.1. There are two arrays, Tails and Heads, of shared memory cells. Bin x is verified by using the memory cells Tails[x] and Heads[x]. First, the memory cells Tails[x] and Heads[x] are each set to false, and next one of these memory cells is selected at random and set to true.
Lemma 1 For an integer x, procedure VerifyCollision(x) executed by one processor never detects a collision, and when multiple processors execute this procedure then a collision is detected with probability at least 1/2.
Proof: When only one processor executes the procedure, then the processor first sets both Heads[x] and Tails[x] to false and next sets only one of them to true. This guarantees that Heads[x] and Tails[x] store different values, and so a collision is not detected. When some
Procedure VerifyCollision(x)

initialize Heads[x] ← Tails[x] ← false
toss_v ← outcome of tossing a fair coin
if toss_v = tails then Tails[x] ← true else Heads[x] ← true
if Tails[x] = Heads[x] then return true else return false
Figure 4.1: A pseudocode for a processor v of a Common PRAM, where x is a positive integer. Heads and Tails are arrays of shared memory cells. When the parameter x is dropped in a call, this means that x = 1. The procedure returns true when a collision has been detected.
m > 1 processors execute the procedure, then a collision is not detected only when either all processors set Heads[x] to true or all processors set Tails[x] to true. This means that the processors generate the same outcome in their coin tosses, which occurs with probability 2^(−m+1), which is at most 1/2 for m ≥ 2.
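A quick simulation supports Lemma 1: with one processor a collision is never detected, and with m ≥ 2 processors it is detected with probability 1 − 2^(−m+1) ≥ 1/2. The following Python sketch (our illustration) models the two shared cells for one bin:

```python
import random

def verify_collision(num_processors, rng=random.Random()):
    """Simulate procedure VerifyCollision for one bin on a Common PRAM,
    for num_processors >= 1 processors that selected this bin.

    Both cells start as False; each processor tosses a fair coin and
    sets Heads or Tails to True. Concurrent writes of True are legal on
    a Common PRAM, since all written values are identical. A collision
    is reported when the two cells hold equal values (both True)."""
    heads = tails = False
    for _ in range(num_processors):
        if rng.random() < 0.5:
            tails = True
        else:
            heads = True
    return heads == tails
```

With a single processor exactly one cell becomes True, so the cells always differ and the call returns False; with mixed coin outcomes both cells become True and the collision is reported.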
A beeping channel is related to multiple-access channels [25]. It is a network consisting of some n stations connected to a communication medium. We consider synchronous beeping channels, in the sense that an execution of a communication algorithm is partitioned into consecutive rounds. All the stations start an execution together. In each round, a station may either beep or pause. When some station beeps in a round, then each station hears the beep; otherwise, all the stations receive silence as feedback. When multiple stations beep together in a round, we call this a collision.
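The feedback rule of such a channel can be captured in a few lines. This Python sketch (our own illustration, not part of any algorithm in this thesis) makes explicit that a collision of beeps sounds the same as a single beep:

```python
def beeping_round(actions):
    """One round on a synchronous beeping channel.

    actions: list of booleans, one per station; True means the station
    beeps this round, False means it pauses.
    Every station receives the same feedback: "beep" when at least one
    station beeps, "silence" otherwise. Note that a collision (two or
    more beepers) is indistinguishable from a single beep."""
    return "beep" if any(actions) else "silence"
```

For example, `beeping_round([True, False])` and `beeping_round([True, True])` both return `"beep"`, which is why detecting collisions requires a dedicated procedure.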
We say that a parameter of a communication network is known when it can be used in codes of algorithms. The relevant parameter used in this thesis is the number of stations n. We consider two cases, in which either n is known or it is not.
Randomized algorithms use random bits, understood as outcomes of tosses of a fair coin. All different random bits used by our algorithms are considered stochastically independent from each other.
Our naming algorithms have as their goal to assign unique identifiers to the stations,
Procedure DetectCollision

toss_v ← outcome of a random coin toss
if toss_v = heads then beep else pause    /* first round */
if toss_v = tails then beep else pause    /* second round */
return (a beep was heard in each of the two rounds)
Figure 4.2: A pseudocode for a station v. The procedure takes two rounds to execute. It detects a collision and returns true when a beep is heard in each of the rounds; otherwise it does not detect a collision and returns false.
moreover, we want the names to be integers in the contiguous range {1, 2, ..., n}, which we denote as [n]. The Monte Carlo naming algorithm that we develop has the property that the names it assigns make an interval of integers of the form [k] for k ≤ n; when k < n, there are duplicate identifiers assigned as names, which is the only form of error that can occur.
We will use a procedure to detect collisions, called DetectCollision, whose pseudocode is in Figure 4.2. The procedure is executed by a group of stations, and they all start their executions simultaneously. The procedure takes two rounds. Each of the participating stations simulates the toss of a fair coin, with the outcomes independent among the participating stations. Depending on the outcome of a toss, a station beeps either in the first or the second of the allocated rounds. A collision is detected only when two consecutive beeps are heard.
Lemma 2 If k stations perform m time-disjoint calls of procedure DetectCollision, each station participating in exactly one call, then a collision is not detected in any of these calls with probability 2^(m−k).
Proof: Consider a call of DetectCollision performed concurrently by i stations, for i ≥ 1. We argue by deferred decisions. One of these stations tosses a coin and determines its outcome X. The other i − 1 stations participating concurrently in this call also toss their coins; here we have i − 1 ≥ 0, so there may be no such station. The only possibility of not detecting a collision is for all of these i − 1 stations to also produce X. This happens with probability 2^(−i+1) in this one call. The probability of producing only false during the m calls is the product of these probabilities. When we multiply them out over the m instances of the procedure, the outcome is 2^(m−k), because the numbers i sum up to k and the number of factors is m.
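The bound of Lemma 2 can be checked by simulation. The following Python sketch (our illustration) models one call of DetectCollision, using the fact that the stations tossing heads beep in the first round and those tossing tails beep in the second:

```python
import random

def detect_collision(num_stations, rng=random.Random()):
    """Simulate one call of procedure DetectCollision by a group of
    num_stations >= 1 stations on a beeping channel.

    Each station tosses a fair coin; heads-stations beep in round one,
    tails-stations beep in round two. A collision is detected exactly
    when a beep is heard in both rounds, i.e. when the tosses are not
    all equal."""
    tosses = [rng.random() < 0.5 for _ in range(num_stations)]
    beep_round1 = any(tosses)                  # some station tossed heads
    beep_round2 = any(not t for t in tosses)   # some station tossed tails
    return beep_round1 and beep_round2
```

A single station never detects a collision, and i stations fail to detect one with probability 2^(−i+1), matching the deferred-decisions argument above.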
Pseudocode conventions and notations. We give pseudocode representations of algorithms, as in Figure 4.1. The conventions of pseudocode are summarized next.
We want, at any round of an execution, all the processors that have not yet terminated to be at the same line of the pseudocode. In particular, when an instruction is conditional on a statement, a processor that does not meet the condition pauses for as long as would be needed for all the processors that meet the condition to complete their instructions, even when there are no such processors.
A pseudocode for a processor refers to a number of variables, both shared and private. We use the following notational conventions to emphasize their relevant properties. Shared variables have names starting with a capital letter, while private variables have names all in small letters. When a variable x is a private variable that may have different values at different processors at the same time, then we denote this variable used by a processor v by x_v. Private variables that have the same value at the same time in all the processors are usually used without subscripts, like variables controlling for-loops.
Each station has its private copy of any of the variables used in the pseudocode. When the values of these copies may vary across the stations, then we add the station's name as a subscript of the variable's name to emphasize that; otherwise, when all the copies of a variable are kept equal across all the stations, no subscript is used.
An assignment instruction of the form x ← y ← ... ← z ← a, where x, y, ..., z are variables and a is a value, means that a is assigned as the value stored in all the listed variables x, y, ..., z.
We use three notations for logarithms. The notation lg x stands for the logarithm of x to the base 2. The notation ln x denotes the natural logarithm of x. When the base of logarithms does not matter, we use log x, as in the asymptotic notation O(log x).

Properties of naming algorithms. Naming algorithms in distributed environments involving multi-writer read-write shared memory have to be randomized to break symmetry [6, 18]. An eventual assignment of proper names cannot be a sure event, because, in principle, two processors can generate the same strings of random bits in the course of an execution. We say that an event is almost sure, or occurs almost surely, when it occurs with probability 1. When n processors generate their private strings of random bits, it is an almost sure event that all these strings are eventually pairwise distinct. Therefore, the most advantageous scenario that we could expect, when a set of n processors is to execute a randomized naming algorithm, is that the algorithm eventually terminates almost surely and that at the moment of termination the output is correct, in that the assigned names are without duplicates and fill the whole interval [1, n].
5. LOWER BOUNDS AND IMPOSSIBILITIES
In this section, we show impossibility results to justify the methodological approach to naming algorithms that we apply, and we use lower bounds on performance metrics for such algorithms to argue about the optimality of the algorithms developed in subsequent sections.
5.1 Preliminaries
We start with basic definitions, terminology, and theorems that are used throughout this section.
Lower bounds prove that certain problems cannot be solved efficiently without sufficient resources, such as time or space. They also give us an idea of when to stop looking for better solutions. Impossibility results show that certain problems cannot be solved at all under certain assumptions. To understand the nature of the naming problem, it is necessary to understand lower bounds and impossibility results [14, 44].
Entropy [33] is the number of bits required on average to describe a random variable; it is a lower bound on the average number of bits required to represent the random variable. The entropy of a random variable X with probability mass function p(x) is defined by

H(X) = − Σ_x p(x) lg p(x).
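For a uniform distribution over N outcomes, the entropy simplifies to lg N. In particular, the entropy of a uniform random assignment of n distinct names to n processors, which appears repeatedly in the lower-bound arguments below, is lg(n!) = Θ(n log n). A small Python sketch (our illustration; lgamma avoids computing huge factorials):

```python
import math

def entropy_uniform(num_outcomes):
    """Entropy in bits of a uniform distribution over num_outcomes
    equally likely outcomes: H = lg(num_outcomes)."""
    return math.log2(num_outcomes)

def naming_entropy_bits(n):
    """Entropy of a uniform random assignment of n distinct names to
    n processors: lg(n!) bits, computed via lgamma for numerical
    stability since n! overflows quickly."""
    return math.lgamma(n + 1) / math.log(2)
```

For instance, `entropy_uniform(8)` is 3 bits, and `naming_entropy_bits(n)` grows like n lg n by Stirling's approximation.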
Yao's Minimax Principle [91, 77] allows us to prove lower bounds on the performance of Las Vegas and Monte Carlo algorithms. It says that, for an arbitrarily chosen input distribution, the expected running time of the optimal deterministic algorithm is a lower bound on the expected running time of the optimal randomized algorithm. Yao's Minimax Principle for Las Vegas randomized algorithms is as follows. Let P be a problem with a finite set X of inputs, and let A be the finite set of all possible deterministic algorithms that correctly solve the problem P. Let cost(X, A) be the running time of algorithm A for A ∈ A and input X ∈ X. Let p be a probability distribution over X and q a probability distribution over A.
Let X_p be a random input chosen according to p, and let A_q be a random algorithm chosen according to q. Then, for all distributions p over X and q over A,

min_{A ∈ A} E[cost(X_p, A)] ≤ max_{X ∈ X} E[cost(X, A_q)].
Yao's Minimax Principle for Monte Carlo randomized algorithms states that, for an error probability λ ∈ [0, 1/2], the expected running time of the optimal Monte Carlo algorithm that errs with probability at most λ is bounded from below by half of the minimum expected running time, under any chosen input distribution, of a deterministic algorithm that errs with probability at most 2λ.
5.2 Lower Bounds for a PRAM
We give algorithms that use an expected number of O(n log n) random bits with large probability. This amount of random information is necessary if an algorithm is to terminate almost surely. The following fact is essentially folklore, but since we do not know whether it was proved anywhere in the literature, we give a proof for the sake of completeness. Our arguments resort to notions of information theory [33].
Proposition 1 If a randomized naming algorithm is correct with probability p_n when executed by n anonymous processors, then it requires Ω(n log n) random bits with probability at least p_n. In particular, a Las Vegas naming algorithm for n processors uses Ω(n log n) random bits almost surely.
Proof: Let us assign conceptual identifiers to the processors, for the sake of argument. These unknown identifiers are known only to an external observer and not to algorithms. The purpose of executing the algorithm is to assign explicit identifiers, which we call given identifiers.
Let the processor with unknown name u_i generate a string of bits b_i, for i = 1, ..., n. The distribution of given identifiers among the n anonymous processors, which results from executing the algorithm, is a random variable X_n with a uniform distribution on the set of all permutations of the unknown identifiers. This is because of symmetry: all processors execute the same code, without explicit private identifiers, and if we rearrange the generated strings of bits b_i among the processors, then this results in the corresponding rearrangement of the given names.
The underlying probability space consists of n! elementary events, each determined by an assignment of the given identifiers to the processors identified by the unknown identifiers. It follows that each of these events occurs with probability 1/n!. The Shannon entropy of the random variable X_n is thus lg(n!) = Θ(n log n). The decision about which assignment of given names is produced is determined by the random bits, as they are the only source of entropy, so the expected number of random bits used by the algorithm needs to be at least as large as the entropy of the random variable X_n.
The property that all assigned names are distinct and in the interval [1, n] holds with probability p_n. An execution needs to generate a total of Ω(n log n) random bits with probability at least p_n, because of the bound on entropy. A Las Vegas algorithm terminates almost surely, and returns correct names upon termination. This means that p_n = 1, and so Ω(n log n) random bits are used almost surely.
We consider two kinds of algorithmic naming problems, as determined by the amount of shared memory. One case is that of a constant number of shared memory cells, for which we give an optimal lower bound on time for O(1) shared memory. The other case is when the number of shared memory cells and their capacity are unbounded, for which we give an absolute lower bound on time. We begin with lower bounds that reflect the amount of shared memory.
Intuitively, as processors generate random bits, these bits need to be made common knowledge through some implicit process that assigns explicit names. There is an underlying flow of information spreading knowledge among the processors through the available shared memory. Time is bounded from below by the rate of flow of information and the total amount of bits that need to be shared.
On the technical level, in order to bound the expected time of a randomized algorithm, we apply Yao's minimax principle [91] to relate this expected time to the distributional expected time complexity. A randomized algorithm whose actions are determined by random bits can be considered as a probability distribution on deterministic algorithms. A deterministic algorithm has strings of bits given to processors as their inputs, with some probability distribution on such inputs. The expected time of such a deterministic algorithm, given any specific probability distribution on the inputs, is a lower bound on the expected time of the randomized algorithm.
To make such an interpretation of randomized algorithms possible, we consider strings of bits of equal length. With such a restriction on inputs, a deterministic algorithm may not be able to assign proper names for some assignments of inputs, for example, when all the inputs are equal. We augment such deterministic algorithms by adding an option for the algorithm to withhold a decision on the assignment of names and output no name for some processors. This is interpreted as the deterministic algorithm needing longer inputs, for which the given inputs are prefixes, and which for the randomized algorithm means that some processors need to generate more random bits.
Regarding probability distributions for inputs of a given length, we always use the uniform distribution, because we will rely on an assessment of the entropy of such a distribution.
Theorem 1 A randomized naming algorithm for a Common PRAM with n processors and C > 0 shared memory cells operates in Ω(n log n / C) expected time when it is either a Las Vegas algorithm or a Monte Carlo algorithm with probability of error smaller than 1/2.
Proof: We consider Las Vegas algorithms in this argument; the Monte Carlo case is similar, the difference being in applying Yao's principle for Monte Carlo algorithms. We interpret a randomized algorithm as a deterministic one working with all possible assignments of random bits as inputs, with a uniform mass function on the inputs. The expected time of the deterministic algorithm is a lower bound on the expected time of the randomized algorithm.
There are n! possible assignments of given names to the processors. Each of them occurs with the same probability 1/n! when the input bit strings are assigned uniformly at random. Therefore the entropy of name assignments, interpreted as a random variable, is lg(n!) = Ω(n log n).
Next we consider executions of such a deterministic algorithm on the inputs with a uniform distribution. We may assume without loss of generality that an execution is structured into the following phases, each consisting of C + 1 rounds. In the first round of a phase, each processor either writes into a shared memory cell or pauses. In the following rounds of a phase, every processor learns the current values of each among the C memory cells. This may take C rounds for every processor to scan the whole shared memory, but we do not include this reading overhead as contributing to the lower bound. Instead, since this is a simulation anyway, we conservatively assume that the process of learning all the contents of shared memory cells at the end of a phase is instantaneous and complete.
The Common variant of PRAM requires that if a memory cell is written into concurrently then there is a common value that gets written by all the writers. Such a value needs to be determined by the code and the address of a memory cell. This means that, for each phase and any memory cell, a processor choosing to write into this memory cell knows the common value to be written. By the structure of execution, in which all processors read all the registers after a round of writing, any processor knows what value gets written into each available memory cell in a phase, if any is written into a particular cell. This implies that the contents written into shared memory cells may not convey any new information but are already implicit in the states of the processors represented by their private memories after reading the whole shared memory.
When a processor reads all the shared memory cells in a phase, then the only new information it may learn is the addresses of memory cells into which writes were performed and those into which there were no writes. This makes it possible to obtain at most C bits of information per phase, because each register was either written into or not.
There are Ω(n log n) bits of information that need to be settled and one phase changes the entropy by at most C bits. It follows that the expected number of phases of the deterministic algorithm is Ω(n log n/C). By Yao's principle, Ω(n log n/C) is a lower bound on the expected time of a randomized algorithm.
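The entropy count lg n! = Ω(n log n) that drives these arguments can be checked numerically. The sketch below (our own illustration, not part of the thesis's formal development) computes lg n! via the log-gamma function and compares it against n lg n:

```python
import math

def name_entropy_bits(n):
    """Entropy, in bits, of a uniformly random assignment of n distinct
    names to n processors: lg(n!), computed via the log-gamma function."""
    return math.lgamma(n + 1) / math.log(2)

# By Stirling's approximation, lg(n!) = n lg n - n lg e + O(log n),
# so the ratio of lg(n!) to n lg n tends to 1 as n grows.
for n in [10, 100, 1000, 10000]:
    ratio = name_entropy_bits(n) / (n * math.log2(n))
    print(n, round(name_entropy_bits(n)), round(ratio, 3))
```

The printed ratios approach 1, confirming that lg n! and n lg n agree up to lower-order terms.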
For Arbitrary PRAM, writing can spread information through the written values, because different processes can attempt to write distinct strings of bits. The rate of flow of information is constrained by the fact that when multiple writers attempt to write to the same memory cell then only one of them succeeds, if the values written are distinct. This intuitively means that the size of a group of processors writing to the same register determines how much information the writers learn by subsequent reading. These intuitions are made formal in the proof of the following Theorem 2.
Theorem 2 A randomized naming algorithm for an Arbitrary PRAM with n processors and C > 0 shared memory cells operates in Ω(n/C) expected time when it is either a Las Vegas algorithm or a Monte Carlo algorithm with the probability of error smaller than 1/2.
Proof: We consider Las Vegas algorithms in this argument; the Monte Carlo case is similar, the difference being in applying Yao's principle for Monte Carlo algorithms. We again replace a given randomized algorithm by its deterministic version that works on assignments of strings of bits of the same length as inputs, with such inputs assigned uniformly at random to the processors. The goal is to use the property that the expected time of this deterministic algorithm, for a given probability distribution of inputs, is a lower bound on the expected time of the randomized algorithm. Next, we consider executions of this deterministic algorithm.
Similarly as in the proof of Theorem 1, we observe that there are n! assignments of given names to the processors and each of them occurs with the same probability 1/n!, when the input bit strings are assigned uniformly at random. The entropy of name assignments is again lg n! = Ω(n log n). The algorithm needs to make the processors learn Ω(n log n) bits using the available C > 0 shared memory cells.
We may interpret an execution as structured into phases, such that each processor performs at most one write in a phase and then reads all the registers. The time of a phase is assumed conservatively to be O(1). Consider a register and a group of processors that attempt to write their values into this register in a phase. The values attempted to be written are represented as strings of bits. If some of these values have 0 and some have 1 at some bit position among the strings, then this bit position may convey one bit of information. The maximum amount of information is provided by a write when the written string of bits facilitates identifying the writer by comparing its written value to the other values attempted to be written concurrently to the same memory cell. It follows that this amount is at most the binary logarithm of the size of this group of processors, so that each memory cell written to in a round contributes at most lg n bits of information, because there may be at most n writers to it. So the maximum number of bits of information learnt by the processors in a phase is C lg n.
Since the entropy of the assignment of names is lg n! = Ω(n log n), the expected number of phases of the deterministic algorithm is Ω(n lg n/(C lg n)) = Ω(n/C). By Yao's principle, this is also a lower bound on the expected time of a randomized algorithm.
Next, we consider absolute requirements on time for a PRAM to assign unique names to the n available processors. The generality of the lower bound we give stems from the weakness of its assumptions. First, nothing is assumed about the knowledge of n. Second, concurrent writing is not constrained in any way. Third, shared memory cells are unbounded in their number and size. Kutten et al. [68] showed that any Las Vegas naming algorithm for asynchronous read-write shared memory systems has expected time Ω(log n) against a certain oblivious schedule.
We show next in Theorem 3 that any Las Vegas naming algorithm has Ω(log n) expected time for the synchronous schedule of events. The argument we give is in the spirit of similar arguments applied by Cook et al. [31] and Beame [19]. What these arguments share is a formalization of the notion of flow of information during an execution of an algorithm, combined with a recursive estimate of the rate of this flow.
The relation "processor v knows processor w in round t" is defined recursively as follows.
First, for any processor v, we have that v knows v in any round t ≥ 0. Second, if a processor v writes to a shared memory cell R in a round t1 and a processor w reads from R in a round t2 > t1 such that there was no other write into this memory cell after t1 and prior to t2, then processor w knows in round t2 each processor that v knows in round t1. Finally, the relation is the smallest transitive relation that satisfies the two postulates formulated above. This means that it is the smallest relation such that if processor v knows processor w in round t1 and z knows v in round t2 such that t2 ≥ t1, then processor z knows w in round t2. In particular, knowledge accumulates with time, in that if a processor v knows processor z in round t1 and round t2 is such that t2 > t1, then v knows z in round t2 as well.
Lemma 3 Let A be a deterministic algorithm that assigns distinct names to the processors, with the possibility that some processors output no name for some inputs, when each processor has an input string of bits of the same length. When algorithm A terminates with proper names assigned to all the processors, then each processor knows all the other processors.
Proof: We may assume that n > 1, as otherwise the sole processor knows itself. Let us consider an assignment X of inputs that results in a proper assignment of distinct names to all the processors when algorithm A terminates. This implies that all the inputs in the assignment X are distinct strings of bits, as otherwise some two processors, say, v and w, that obtain the same input string of bits would either assign themselves the same name or declare no name as output. Suppose that processor v does not know w when v halts for inputs from X. Consider an assignment of inputs Y which is the same as X for processors different from w, and such that the input of w is the same as the input of v in X. Then the actions of processor v would be the same with Y as with X, because v is not affected by the input of w, so that v would assign itself the same name with Y as with X. But the actions of processor w under Y would be the same as those of v, because their input strings of bits are identical under Y. It follows that w would assign itself the name of v, resulting in duplicate names.
We will use Lemma 3 to assess running times by estimating the number of interleaved reads and writes needed for processors to get to know all the processors. The rate of learning such information may depend on time, because we do not restrict the amount of shared memory, unlike in Theorems 1 and 2. Indeed, the rate may increase exponentially, under the most conservative estimates.
The following Theorem 3 holds for both Common and Arbitrary PRAMs. The argument used in the proof is general enough not to depend on any specific semantics of writing.
Theorem 3 A randomized naming algorithm for a PRAM with n processors operates in Ω(log n) expected time when it is either a Las Vegas algorithm or a Monte Carlo algorithm with the probability of error smaller than 1/2.
Proof: The argument is for a Las Vegas algorithm; the Monte Carlo case is similar. A randomized algorithm can be interpreted as a probability distribution on a finite set of deterministic algorithms. Such an interpretation works when input strings for a deterministic algorithm are of the same length. We consider all such possible lengths for deterministic algorithms, similarly as in the previous proofs of lower bounds.
Let us consider a deterministic algorithm A, and let inputs be strings of bits of the same length. We may structure an execution of this algorithm A into phases as follows. A phase consists of two rounds. In the first round of a phase, each processor either writes to a shared memory cell or pauses. In the second round of a phase, each processor either reads from a shared memory cell or pauses. Such structuring can be done without loss of generality at the expense of slowing down an execution by a factor of at most 2. Observe that the knowledge in the first round of a phase is the same as in the last round of the preceding phase.
Phases are numbered by consecutively increasing integers, starting from 1. A phase i comprises the pair of rounds {2i − 1, 2i}, for integers i ≥ 1. In particular, the first phase consists of rounds 1 and 2. We also add phase 0 that represents the knowledge before any reads or writes were performed.
We show the following invariant, for i ≥ 0: a processor knows at most 2^i processors at the end of phase i. The proof of this invariant is by induction on i.
The base case is for i = 0. The invariant follows from the fact that a processor knows only one processor in phase 0, namely itself, and 2^0 = 1.
To show the inductive step, suppose the invariant holds for a phase i ≥ 0 and consider the next phase i + 1. A processor v may increase its knowledge by reading in the second round of phase i + 1. Suppose the read is from a shared memory cell R. The latest write into this memory cell occurred by the first round of phase i + 1. This means that the processor w that wrote to R by phase i + 1, as the last one that did write, knew at most 2^i processors in the round of writing, by the inductive assumption and the fact that what is written in phase i + 1 was learnt by the immediately preceding phase i. Moreover, by the semantics of writing, the value written to R by w in that round removed any previous information stored in R. Processor v starts phase i + 1 knowing at most 2^i processors, and also learns of at most 2^i other processors by reading in phase i + 1, namely, those known by the latest writer of the read contents. It follows that processor v knows at most 2^i + 2^i = 2^(i+1) processors by the end of phase i + 1.
When proper names are assigned by such a deterministic algorithm, then each processor knows every other processor, by Lemma 3. A processor can know every other processor only in a phase j such that 2^j ≥ n, by the invariant just proved. Such a phase number j satisfies j ≥ lg n, and it takes 2 lg n rounds to complete lg n phases.
Let us consider input strings of bits assigned to processors uniformly at random. We need to estimate the expected running time of an algorithm A on such inputs. Let us observe that, in the context of interpreting deterministic executions in order to apply Yao's principle, terminating executions of A that do not result in names assigned to all the processors can be pruned from a bound on their expected running time, because such executions are determined by bounded input strings of bits that a randomized algorithm would extend to make them sufficiently long to assign proper names. In other words, from the perspective of randomized algorithms, such prematurely ending executions do not represent real terminating ones.
The expected time of A, conditional on terminating with proper names assigned, is therefore at least 2 lg n. We conclude, by Yao's principle, that any randomized naming algorithm has Ω(log n) expected running time.
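The doubling invariant at the heart of this proof can be illustrated with a toy simulation. In the hypothetical model below (our own sketch, not the thesis's formal model), in every phase each processor reads one cell and thereby learns the pre-phase knowledge of a single writer; an assertion checks that knowledge at most doubles per phase, so at least lg n phases are needed before everyone knows everyone:

```python
import math
import random

def phases_to_full_knowledge(n, rng):
    """Toy model of the invariant: in each phase, every processor merges
    its knowledge with the pre-phase knowledge of one (random) writer."""
    known = [{v} for v in range(n)]            # a processor knows itself
    phases = 0
    while any(len(k) < n for k in known):
        snapshot = [set(k) for k in known]     # knowledge before the phase
        for v in range(n):
            w = rng.randrange(n)               # v reads a cell written by w
            known[v] |= snapshot[w]            # v learns what w knew
        phases += 1
        # the invariant: at most 2^i processors are known after phase i
        assert all(len(k) <= 2 ** phases for k in known)
    return phases

rng = random.Random(1)
n = 64
p = phases_to_full_knowledge(n, rng)
print(p, ">=", math.ceil(math.log2(n)))
```

Because each new knowledge set is the union of two sets of size at most 2^(i-1), the asserted invariant never fails, and the run always reports at least lg n phases.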
The three lower bounds on time given in this section may be applied in two ways. One is to infer optimality of time for a given amount of shared memory used. Another is to infer optimality of shared memory use for a given time performance. This is summarized in the following Corollary 1.
Corollary 1 If the expected time of a naming Las Vegas algorithm is O(n) on an Arbitrary PRAM with O(1) shared memory, then this time performance is asymptotically optimal. If the expected time of a naming Las Vegas algorithm is O(n log n) on a Common PRAM with O(1) shared memory, then this time performance is asymptotically optimal. If a Las Vegas naming algorithm operates in time O(log n) on an Arbitrary PRAM using O(n/log n) shared memory cells, then this amount of shared memory is asymptotically optimal. If a Las Vegas naming algorithm operates in time O(log n) on a Common PRAM using O(n) shared memory cells, then this amount of shared memory is optimal.
Proof: We verify that the lower bounds match the assumed upper bounds. By Theorem 2, a Las Vegas algorithm operates almost surely in Ω(n) time on an Arbitrary PRAM when space is O(1). By Theorem 1, a Las Vegas algorithm operates almost surely in Ω(n log n) time on a Common PRAM when space is O(1). By Theorem 2, a Las Vegas algorithm operates almost surely in Ω(log n) time on an Arbitrary PRAM when space is O(n/log n). By Theorem 1, a Las Vegas algorithm operates almost surely in Ω(log n) time on a Common PRAM when space is O(n).
A naming algorithm cannot be Las Vegas when n is unknown, as was observed by Kutten et al. [68] in a more general case of asynchronous computations against an oblivious adversary. We show an analogous fact for synchronous computations.
Proposition 2 There is no Las Vegas naming algorithm for a PRAM with at least two processors that does not refer to the total number of processors.
Proof: Let us suppose, to arrive at a contradiction, that such a naming Las Vegas algorithm exists. Consider a system of n > 1 processors, where n is an arbitrary positive integer, and an execution E on these n processors that uses specific strings of random bits such that the algorithm terminates in E with these random bits. Such strings of random bits exist because the algorithm terminates almost surely.
Let v1 be a processor that halts latest in E among the n processors. Let σ be the string of random bits generated by processor v1 by the time it halts in E. Consider an execution E′ on n + 1 > 2 processors such that n processors obtain the same strings of random bits as in E and an extra processor v2 obtains σ as its random bits. The executions E and E′ are indistinguishable for the n processors participating in E, so they assign themselves the same names and halt. Processor v2 performs the same reads and writes as processor v1, assigns itself the same name as processor v1 does, and halts in the same round as processor v1. This is the termination round, because by that time all the other processors have halted as well.
It follows that execution E′ results in a name being duplicated. The probability of duplication for n + 1 processors is at least as large as the probability to generate the finite random strings for n processors as in E, and additionally to generate σ for the extra processor v2, so this probability is positive.
If n is unknown, then the restriction O(n log n) on the number of random bits makes it inevitable that the probability of error is at least polynomially bounded from below, as we show next.
Proposition 3 For unknown n, if a randomized naming algorithm is executed by n anonymous processors, then an execution is incorrect, in that duplicate names are assigned to distinct processors, with probability that is at least n^(−O(1)).
Proof: Suppose the algorithm uses at most cn lg n random bits with a probability p_n when executed by a system of n processors, for some constant c > 0. Then one of these processors uses at most c lg n bits with a probability p_n, by the pigeonhole principle.
Consider an execution for n + 1 processors. Let us distinguish a processor v. Consider the actions of the remaining n processors: one of them, say w, uses at most c lg n bits with the probability p_n. Processor v generates the same string of bits as w with probability at least 2^(−c lg n) = n^(−c). The random bits generated by w and v are independent. Therefore duplicate names occur with probability at least n^(−c) p_n. When we have a bound p_n = 1 − n^(−Ω(1)), then n^(−c) p_n = n^(−O(1)).
5.3 Lower Bounds for a Channel with Beeping
We begin with an observation, formulated as Proposition 4, that if the system is sufficiently symmetric then randomness is necessary to break symmetry. The argument is standard and is given for completeness' sake; see [6, 14, 44].
Proposition 4 There is no deterministic naming algorithm for a synchronous channel with beeping with at least two stations, in which all stations are anonymous, such that it eventually terminates and assigns proper names.
Proof: We argue by contradiction. Suppose that there exists a deterministic algorithm that eventually terminates with proper names assigned to the anonymous stations. Let all the stations start initialized to the same initial state. The following invariant is maintained in each round: the internal states of the stations are all equal. We proceed by induction on the round number. The base of induction is satisfied by the assumption about the initialization. For the inductive step, we assume that the stations are in the same state, by the inductive assumption. Then either all of them pause or all of them beep in the next round, so that either all of them hear their own beep or all of them pause and hear silence. This results in the same internal state transition, which shows the inductive step. When the algorithm eventually terminates, then each station assigns to itself the identifier determined by its state. The identifier is the same in all stations because their states are the same, by the invariant. This violates the desired property of names being distinct, because there are at least two stations with the same name.
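A minimal sketch of this symmetry argument: the decision and transition functions below stand in for an arbitrary hypothetical deterministic algorithm (the concrete rules are placeholders of our own), and the simulation verifies that identically initialized stations remain in identical states round after round, so they necessarily end with identical names:

```python
def decide(state):
    """Deterministic decision to beep, based only on the current state
    (the concrete rule is an arbitrary placeholder)."""
    return state % 3 == 0

def transition(state, heard_beep):
    """Deterministic state update driven by the channel feedback."""
    return (2 * state + (1 if heard_beep else 0)) % 97

def run(n, rounds):
    states = [0] * n                             # identical initial states
    for _ in range(rounds):
        heard = any(decide(s) for s in states)   # a beep is heard iff some station beeps
        states = [transition(s, heard) for s in states]
        assert len(set(states)) == 1             # the invariant: states stay equal
    return states

states = run(5, 50)
print(states)    # every station ends in the same state, hence the same name
```

Swapping in any other deterministic `decide`/`transition` pair preserves the outcome, which is exactly the point of the proof.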
Proposition 4 justifies developing randomized naming algorithms. We continue with entropy arguments; see the book by Cover and Thomas [33] for a systematic exposition of information theory. An execution of a naming algorithm coordinates and translates random bits into names. The same amount of entropy needs to be processed and communicated on the channel, by Shannon's noiseless coding theorem. An analogue of the following Proposition 5 was stated in Proposition 1 for the model of synchronized processors communicating by reading and writing to shared memory.
Proposition 5 If a randomized naming algorithm for a channel with beeping is executed by n anonymous stations and is correct with probability p_n, then it requires Ω(n log n) random bits in total to be generated with probability at least p_n. In particular, a Las Vegas naming algorithm uses Ω(n log n) random bits almost surely.
One round of an execution of a naming algorithm allows the stations that do not transmit to learn at most one bit, because, from the perspective of these stations, a round is either silent or there is a beep. Intuitively, the running time is proportional to the amount of entropy that is needed to assign names. This intuition leads to Proposition 6. In its proof, we combine Shannon's entropy [33] with Yao's principle [91].
Proposition 6 A randomized naming algorithm for a beeping channel with n stations operates in Ω(n log n) expected time, when it is either a Las Vegas algorithm or a Monte Carlo algorithm with the probability of error smaller than 1/2.
Proof: We apply Yao's minimax principle to bound the expected time of a randomized algorithm by the distributional complexity of naming. We consider Las Vegas algorithms first.
A randomized algorithm using strings of random bits generated by stations can be considered as a deterministic algorithm D on all possible assignments of such (sufficiently long) strings of bits to stations as their inputs. We consider assignments of strings of bits of an equal length, with the uniform distribution among all such assignments of strings of the same length. On a given assignment of input strings of bits to stations, the deterministic algorithm either assigns proper names or fails to do so. A failure to assign proper names with some input is interpreted as the randomized algorithm continuing to work with additional random bits, which comes at an extra time cost. This is justified by a combination of two factors. One is that the algorithm is Las Vegas, and so it halts almost surely, and with a correct output. Another is that the probability to assign a specific finite sequence as a prefix of a used sequence of random bits is positive. So if starting with a specific string of bits, as a prefix of a possibly longer needed string, would mean inability to terminate with a positive probability, then the naming algorithm would not be Las Vegas.
The common length of these input strings is a parameter, and we consider all sufficiently large positive integer values for this parameter such that there exist strings of random bits of this length resulting in assignments of proper names. For a given length of input strings, we remove input assignments that do not result in an assignment of proper names and consider a uniform distribution on the remaining inputs. This is the same as the uniform distribution conditional on the algorithm terminating with input strings of bits of a given length.
Let us consider such a deterministic algorithm D assigning names, and using strings of bits at stations as inputs, these strings being of a fixed length, assigned under a uniform distribution for this length, and such that they result in termination. An execution of this algorithm produces a finite binary sequence of bits, where we translate the feedback from the channel round by round, say, with symbol 1 representing a beep and symbol 0 representing silence. Each such sequence is a binary codeword representing a specific assignment of names. These codewords also have a uniform distribution, by the same symmetry argument as used in the proof of Proposition 1. The expected length of a word in this code is the expected time of algorithm D. The expected time of algorithm D is therefore at least lg n! = Ω(n log n), by Shannon's noiseless coding theorem. We conclude that, by Yao's principle, the original randomized Las Vegas algorithm has expected time that is Ω(n log n).
A similar argument, by Yao's principle, applies to a Monte Carlo algorithm that is incorrect with a constant probability smaller than 1/2. The only difference in the argument is that when a given assignment of input bit strings does not result in a proper assignment of names, then either the algorithm continues to work with more bits for an extra time, or it terminates with an error.
Next, we consider facts that hold when the number of stations n is unknown. The following Proposition 7 is about the inevitability of error. Intuitively, when two computing and communicating agents generate the same string of bits, then their actions are the same, and so they get the same name assigned. In other words, we cannot distinguish the case when there is only one such agent present from cases when at least two of them work in unison.
Proposition 7 For an unknown number of stations n, if a randomized naming algorithm is executed by n anonymous stations, then an execution is incorrect, in that duplicate names are assigned to different stations, with probability that is at least n^(−O(1)).
Proposition 7 was proved, as Proposition 3, for the model of synchronous distributed computing in which processors communicate among themselves by reading from and writing to shared registers. The same argument applies to a synchronous beeping channel, when we understand the actions of stations as either beeping or pausing in a round.
We conclude this section with a fact about the impossibility of a Las Vegas naming algorithm when the number of stations n is unknown.
Proposition 8 There is no Las Vegas naming algorithm for a channel with beeping with at least two stations such that it does not refer to the number of stations.
Proposition 8 was proved, as Proposition 2, for the model of synchronous distributed computing in which processors communicate among themselves by reading from and writing to shared registers. The proof given for Proposition 2 is general enough to be directly applicable here as well, as both models are synchronous. Proposition 8 justifies developing Monte Carlo algorithms for unknown n, which we do in Section 8.2.
6. PRAM: LAS VEGAS ALGORITHMS
We consider naming of anonymous processors of a PRAM when the number of processors n is known. This problem is investigated in four specific cases, depending on the additional assumptions pertaining to the model, and we give an algorithm for each case. The two independent assumptions regard the amount of shared memory (constant versus unbounded) and the PRAM variant (Arbitrary versus Common).
6.1 Arbitrary with Constant Memory
We present a naming algorithm for an Arbitrary PRAM in the case when there is a constant number of shared memory cells. It is called ArbitraryConstantLV.
During an execution of this algorithm, processors repeatedly write random strings of bits representing integers to a shared memory cell called Pad, and next read Pad to verify the outcome of writing. A processor v that reads the same value as it attempted to write increments the integer stored in a shared register Counter and uses the obtained number as a tentative name, which it stores in a private variable name_v. The value of Counter could get incremented a total of fewer than n times, which occurs when some two processors chose the same random integer to write to the register Pad. The correctness of the assigned names is verified by the condition Counter = n, because Counter was initialized to zero. When such a verification fails, this results in another iteration of a series of writes to register Pad; otherwise the execution terminates and the value stored at name_v becomes the final name of processor v. Pseudocode for algorithm ArbitraryConstantLV is given in Figure 6.1. It refers to a constant β > 0 which determines the bounded range [1, n^β] from which processors select integers to write to the shared register Pad.
Balls into bins. The selection of random integers in the range [1, n^β] by n processors can be interpreted as throwing n balls into n^β bins, which we call the β-process. A collision represents two processors assigning themselves the same name. Therefore an execution of the algorithm can be interpreted as performing such ball placements repeatedly until there is no collision.

Lemma 4 For each α > 0 there exists β > 0 such that when n balls are thrown into n^β bins during the β-process, then the probability of a collision is at most n^(−α).
Algorithm ArbitraryConstantLV

repeat
    initialize Counter ← 0 ; name_v ← 0
    bit_v ← random integer in [1, n^β]
    for i ← 1 to n do
        if name_v = 0 then
            Pad ← bit_v
            if Pad = bit_v then
                Counter ← Counter + 1
                name_v ← Counter
until Counter = n
Figure 6.1: A pseudocode for a processor v of an Arbitrary PRAM, where the number of shared memory cells is a constant independent of n. The variables Counter and Pad are shared. The private variable name_v stores the acquired name. The constant β > 0 is a parameter to be determined by analysis.
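The pseudocode can be exercised with a small simulation. The sketch below is our own construction, not part of the thesis: the Arbitrary write semantics is modeled by letting one randomly chosen concurrent writer succeed at Pad, and concurrent increments of Counter collapse into a single increment, which is what leaves Counter < n and forces a retry after a collision:

```python
import random

def arbitrary_constant_lv(n, beta, rng):
    """Simulation of ArbitraryConstantLV; returns (names, iterations)."""
    iterations = 0
    while True:
        iterations += 1
        counter = 0
        names = [0] * n
        bits = [rng.randrange(1, n ** beta + 1) for _ in range(n)]
        for _ in range(n):
            writers = [v for v in range(n) if names[v] == 0]
            if not writers:
                continue
            pad = bits[rng.choice(writers)]   # Arbitrary semantics: one writer wins
            counter += 1                      # concurrent increments collapse to one
            for v in writers:
                if bits[v] == pad:            # v reads back its own value
                    names[v] = counter        # duplicated when values coincide
        if counter == n:                      # verification from the pseudocode
            return names, iterations

rng = random.Random(2)
names, iterations = arbitrary_constant_lv(8, 3, rng)
print(sorted(names), iterations)
```

When Counter reaches n, every inner iteration named exactly one processor, so the returned names are a permutation of [1, n], matching the correctness argument of Theorem 4.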
Proof: Consider the balls thrown one by one. When a ball is thrown, at most n bins are already occupied, so the probability of the ball ending in an occupied bin is at most n/n^β = n^(−β+1). No collisions occur with probability that is at least

    (1 − n^(−β+1))^n ≥ 1 − n · n^(−β+1) = 1 − n^(−β+2),    (6.1)

by Bernoulli's inequality. If we take β ≥ α + 2, then just one iteration of the repeat loop is sufficient with probability that is at least 1 − n^(−α).
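Lemma 4 can be probed with a quick Monte Carlo experiment (a sketch of our own; the choices n = 32, β = 3, and 20000 trials are arbitrary): the empirical collision rate should stay below the bound n^(−β+2) from the proof.

```python
import random

def beta_process_has_collision(n, beta, rng):
    """One run of the beta-process: n balls into n**beta bins;
    reports whether some bin receives at least two balls."""
    draws = [rng.randrange(n ** beta) for _ in range(n)]
    return len(set(draws)) < n

rng = random.Random(3)
n, beta, trials = 32, 3, 20000
rate = sum(beta_process_has_collision(n, beta, rng) for _ in range(trials)) / trials
print(rate, "<=", n ** (-(beta - 2)))
```

With these parameters the exact collision probability is roughly C(32, 2)/32^3, comfortably below the bound 32^(−1).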
Next we summarize the performance of algorithm ArbitraryConstantLV as a Las Vegas algorithm.
Theorem 4 Algorithm ArbitraryConstantLV terminates almost surely, and there is no error when it terminates. For any α > 0, there exist β > 0 and c > 0 such that the algorithm terminates within time cn using at most cn ln n random bits with probability at least 1 − n^(−α).
Proof: The algorithm assigns consecutive names from a contiguous interval starting from 1, by the pseudocode in Figure 6.1. It terminates after n different tentative names have been assigned, by the condition controlling the repeat loop in the pseudocode of Figure 6.1. This means that proper names have been assigned when the algorithm terminates.
We map an execution of the β-process onto an execution of algorithm ArbitraryConstantLV in a natural manner. Under such an interpretation, Lemma 4 estimates the probability of the event that the n processors select different numbers in the interval [1, n^β] as their values to write to Pad in one iteration of the repeat loop. This implies that just one iteration of the repeat loop is sufficient with probability at least 1 − n^(−α). The probability of the event that i iterations are not sufficient to terminate is at most n^(−iα), which converges to 0 as i increases, so the algorithm terminates almost surely. One iteration of the repeat loop takes O(n) rounds and it requires O(n log n) random bits.
Algorithm ArbitraryConstantLV is optimal among Las Vegas naming algorithms with respect to its expected running time O(n), given the amount O(1) of its available shared memory, by Corollary 1, and with respect to the expected number of random bits O(n log n), by Proposition 1 in Section 5.2.
6.2 Arbitrary with Unbounded Memory
We give an algorithm for Arbitrary PRAM in the case when there is an unbounded supply of initialized shared memory cells. This algorithm is called ArbitraryUnboundedLV.
The algorithm uses two arrays Bin and Counter of n/ln n shared memory cells each. An execution proceeds by repeated attempts to assign names. During each such attempt, the processors work to assign tentative names. Next, the number of distinct tentative names is obtained, and if the count equals n then the tentative names become final; otherwise another attempt is made. We assume that each such attempt uses a new segment of memory cells Counter initialized to 0s; this is only to simplify the exposition and analysis, because this memory can be reset to 0 with a straightforward randomized algorithm, which is omitted. An attempt to assign tentative names proceeds by each processor v selecting two integers bin_v and label_v uniformly at random, where bin_v ∈ [1, n/ln n] and label_v ∈ [1, n^β]. Next the processors
Algorithm ArbitraryUnboundedLV

repeat
    allocate Counter[1, n/ln n]    /* fresh memory cells initialized to 0s */
    initialize position_v ← (0, 0)
    bin_v ← a random integer in [1, n/ln n]
    label_v ← a random integer in [1, n^β]
    repeat
        initialize AllNamed ← true
        if position_v = (0, 0) then
            Bin[bin_v] ← label_v
            if Bin[bin_v] = label_v then
                Counter[bin_v] ← Counter[bin_v] + 1
                position_v ← (bin_v, Counter[bin_v])
            else AllNamed ← false
    until AllNamed    /* each processor has a tentative name */
    name_v ← rank of position_v
until n is the maximum name    /* no duplicates among tentative names */
Figure 6.2: A pseudocode for a processor v of an Arbitrary PRAM, where the number of shared memory cells is unbounded. The variables Bin and Counter denote arrays of n/ln n shared memory cells each; the variable AllNamed is also shared. The private variable name_v stores the acquired name. The constant β > 0 is a parameter to be determined by analysis.
repeatedly attempt to write label_v into Bin[bin_v]. Each such write is followed by a read, and the lucky writer uses Counter[bin_v] to create a pair of numbers (bin_v, Counter[bin_v]), after first incrementing Counter[bin_v]; this pair is called a position and is stored in the private variable position_v. After all processors have their positions determined, we define their ranks as follows. To find the rank of position_v, we arrange all such pairs in lexicographic order, comparing first on the bin number and then on the counter value, and the rank is the place of this entry in the resulting list, where the first entry has rank 1, the second 2, and so on. Ranks can be computed using a prefix-type algorithm operating in time O(log n). This algorithm first finds, for each bin ∈ [1, n/ln n], the prefix sum s(bin) of the values Counter[i] over i < bin, so that the rank of a position (bin, j) equals s(bin) + j. After the ranks have been computed, they are used as tentative names. Pseudocode for algorithm ArbitraryUnboundedLV is given in Figure 6.2.
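The whole attempt, including the rank computation by prefix sums over the counters, can be simulated as below. This is our own sketch: the Arbitrary semantics is modeled by letting the last unplaced writer to each bin win the round, and a labeled collision shows up as a maximum name smaller than n, triggering a retry as in the outer loop of Figure 6.2:

```python
import math
import random

def arbitrary_unbounded_lv(n, beta, rng):
    """Simulation of ArbitraryUnboundedLV; returns the list of final names."""
    m = max(1, round(n / math.log(n)))             # number of bins
    while True:
        bins = [rng.randrange(m) for _ in range(n)]
        labels = [rng.randrange(1, n ** beta + 1) for _ in range(n)]
        position = [None] * n
        counter = [0] * m
        while any(p is None for p in position):
            written = {}
            for v in range(n):
                if position[v] is None:
                    written[bins[v]] = labels[v]   # an arbitrary writer wins
            for b, lab in written.items():
                counter[b] += 1
                for v in range(n):
                    if position[v] is None and bins[v] == b and labels[v] == lab:
                        position[v] = (b, counter[b])  # shared on a labeled collision
        s = [0] * m                                # prefix sums of the counters
        for b in range(1, m):
            s[b] = s[b - 1] + counter[b - 1]
        names = [s[b] + j for (b, j) in position]  # rank of (b, j) is s(b) + j
        if max(names) == n:                        # no duplicate tentative names
            return names

rng = random.Random(4)
final_names = arbitrary_unbounded_lv(16, 3, rng)
print(sorted(final_names))
```

The maximum name equals the total of all counters, so it reaches n exactly when every position is held by a single processor, which is the verification the outer loop performs.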
In the analysis of algorithm ArbitraryUnboundedLV we will refer to the following bound on independent Bernoulli trials. Let Sn be the number of successes in n independent Bernoulli trials, with p as the probability of success. Let b(i;n,p) be the probability of an occurrence of exactly i successes. For r > np, the following bound holds
Pr(S_n ≥ r) ≤ b(r; n, p) · r(1 - p)/(r - np) ,    (6.2)

see Feller [42].
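Bound (6.2) can be checked numerically for concrete parameters; the helper names below are hypothetical. The code compares the exact binomial tail with the right-hand side of (6.2).

```python
from math import comb

def binom_pmf(i, n, p):
    """b(i; n, p): probability of exactly i successes in n Bernoulli trials."""
    return comb(n, i) * p**i * (1 - p)**(n - i)

def exact_tail(r, n, p):
    """Exact Pr(S_n >= r) for the binomial distribution."""
    return sum(binom_pmf(i, n, p) for i in range(r, n + 1))

def feller_bound(r, n, p):
    """Right-hand side of (6.2): b(r; n, p) * r(1 - p)/(r - np), for r > np."""
    return binom_pmf(r, n, p) * r * (1 - p) / (r - n * p)

n, p = 200, 0.1                      # example parameters with np = 20
for r in (30, 40, 60):               # the bound requires r > np
    assert exact_tail(r, n, p) <= feller_bound(r, n, p)
```

The bound follows from comparing the tail with a geometric series dominated by b(r; n, p), so it is tightest when r is well above np.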
Balls into bins. We consider throwing n balls into n/ln n bins. Each ball has a label assigned randomly from the range [1, n^β], for β > 0. We say that a labeled collision occurs when there are two balls with the same labels in the same bin. We refer to this process as the β-process.

Lemma 5 For each a > 0 there exist β > 0 and c > 0 such that when n balls are labeled with random integers in [1, n^β] and next are thrown into n/ln n bins during the β-process, then there are at most c ln n balls in every bin and no labeled collision occurs, with probability 1 - n^{-a}.
Proof: We estimate from above the probabilities of the event that there are more than c ln n balls in some bin and of the event that there is a labeled collision. We show that each of them can be made to be at most n^{-a}/2, from which it follows that at least one of these two events occurs with probability at most n^{-a}.
Let p denote the probability of selecting a specific bin when throwing a ball, which is p = ln n/n. When we set r = c ln n, for a sufficiently large c > 1, then

b(r; n, p) ≤ (en/(c ln n))^{c ln n} (ln n/n)^{c ln n} (1 - ln n/n)^{n - c ln n} = (e/c)^{c ln n} (1 - ln n/n)^{n - c ln n} .    (6.3)

Formula (6.3) translates (6.2) into the following bound:

Pr(S_n ≥ r) ≤ (e/c)^{c ln n} (1 - ln n/n)^{n - c ln n} · (c ln n (1 - ln n/n))/(c ln n - ln n) .    (6.4)
The right-hand side of (6.4) can be estimated by the following upper bound:

(e/c)^{c ln n} (1 - ln n/n)^{n - c ln n} · (c ln n (1 - ln n/n))/((c - 1) ln n)
    ≤ (e/c)^{c ln n} (1 - ln n/n)^n (1 - ln n/n)^{-c ln n} · c/(c - 1)
    ≤ n^{c - c ln c} e^{-ln n} (n/(n - ln n))^{c ln n} · c/(c - 1)
    ≤ n^{-c ln c + c - 1} · 2c/(c - 1) ,

for each sufficiently large n > 0. This is because

(n/(n - ln n))^{c ln n} = (1 + ln n/(n - ln n))^{c ln n} ≤ exp(c ln² n/(n - ln n)) ,

which converges to 1. The probability that the number of balls in some bin is greater than c ln n is therefore at most n · n^{-c ln c + c - 1} · 2c/(c - 1) ≤ n^{-c(ln c - 1) + 1}, by the union bound. This probability can be made smaller than n^{-a}/2 for a sufficiently large c > e.
The probability of a labeled collision is at most that of a collision when n balls are thrown into n^β bins. This probability is at most n^{-β+2}, by bound (6.1) used in the proof of Lemma 4. This number can be made at most n^{-a}/2 for a sufficiently large β.
Next we summarize the performance of algorithm ArbitraryUnboundedLV as a
Las Vegas algorithm.
Theorem 5 Algorithm ArbitraryUnboundedLV terminates almost surely and there is no error when the algorithm terminates. For any a > 0, there exist β > 0 and c > 0 such that the algorithm assigns names within c ln n time and generates at most cn ln n random bits with probability at least 1 - n^{-a}.
Proof: The algorithm terminates only when n different names have been assigned, which is guaranteed by the condition that controls the main repeat-loop in Figure 6.2. This means that there is no error when the algorithm terminates.
We map executions of the β-process on executions of algorithm ArbitraryUnboundedLV in a natural manner. The main repeat-loop ends after an iteration in which each group of processors that select the same value for variable bin next select distinct values for label. We interpret the random selections in an execution as throwing n balls into n/ln n bins, where the number bin determines a bin. The number of iterations of the inner repeat-loop equals the maximum number of balls in a bin.

For any a > 0, it follows that one iteration of the main repeat-loop suffices with probability at least 1 - n^{-a}, for a suitable β > 0, by Lemma 5. It follows that at least i iterations are executed by termination with probability at most n^{-(i-1)a}, which converges to 0, so the algorithm terminates almost surely.
Let us take c > 0 as in Lemma 5. It follows that an iteration of the main repeat-loop takes at most c ln n steps and one processor uses at most c ln n random bits in this one iteration, with probability at least 1 - n^{-a}.
Algorithm ArbitraryUnboundedLV is optimal among Las Vegas naming algorithms with respect to the following performance measures: the expected time O(log n), by Theorem 3, the number of shared memory cells Θ(n/log n) used to achieve this running time, by Corollary 1, and the expected number of used random bits O(n log n), by Proposition 1 in Section 5.2.
6.3 Common with Constant Memory
Now we consider the case of Common PRAM when the number of available shared memory cells is constant. We propose an algorithm called CommonConstantLV.
An execution of the algorithm is organized as repeated attempts to assign temporary names. During such an attempt, each processor without a name chooses uniformly at random an integer in the interval [1, number-of-bins], where number-of-bins is a parameter initialized to n; such a selection is interpreted in a probabilistic analysis as throwing a ball into number-of-bins many bins. Next, for each i ∈ [1, number-of-bins], the processors that selected i, if any, verify if they are unique in their selection of i by executing procedure VerifyCollision (given in Figure 4.1 in Section 4) β ln n times, where β > 0 is a number that is determined by analysis. After no collision has been detected, a processor that selected
Algorithm CommonConstantLV

repeat
    initialize number-of-bins ← n ; name_v ← 0 ; LastName ← 0
    repeat
        initialize CollisionDetected ← false ; collision_v ← false
        if name_v = 0 then
            bin_v ← random integer in [1, number-of-bins]
        for i ← 1 to number-of-bins do
            for j ← 1 to β ln n do
                if bin_v = i and name_v = 0 then
                    if VerifyCollision then
                        CollisionDetected ← true ; collision_v ← true
            if bin_v = i and name_v = 0 and not collision_v then
                LastName ← LastName + 1 ; name_v ← LastName
        if n - LastName > β ln n
            then number-of-bins ← n - LastName
            else number-of-bins ← n/(β ln n)
    until not CollisionDetected
until LastName = n
Figure 6.3: A pseudocode for a processor v of a Common PRAM, where there is a constant number of shared memory cells. Procedure VerifyCollision has its pseudocode in Figure 4.1; the lack of a parameter means the default parameter 1. The variables CollisionDetected and LastName are shared. The private variable name_v stores the acquired name. The constant β is a parameter to be determined by analysis.
i assigns itself a consecutive name by reading and incrementing the shared variable LastName. It takes up to β · number-of-bins · ln n verifications for collisions for all integers in [1, number-of-bins]. When this is over, the value of variable number-of-bins is modified by decrementing it by the number of new names just assigned while working with the last number-of-bins, unless such decrementing would result in a value of number-of-bins that is at most β ln n, in which case variable number-of-bins is set to n/(β ln n). An attempt ends when all processors have tentative names assigned. These names become final when there
are a total of n of them; otherwise there are duplicates, so another attempt is performed. A pseudocode for algorithm CommonConstantLV is in Figure 6.3, in which the main repeat-loop represents an attempt to assign tentative names to each processor. An iteration of the inner repeat-loop during which number-of-bins > n/(β ln n) holds is called shrinking, and otherwise it is called restored.
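The flow of one execution can be sketched with a centralized simulation in which collision checks are assumed to be perfect, whereas the real algorithm runs the randomized procedure VerifyCollision β ln n times per bin and may miss collisions. The function name common_constant_attempt is hypothetical.

```python
import math
import random

def common_constant_attempt(n, beta=2.0, seed=0):
    """Centralized sketch of the naming scheme: unnamed processors throw
    balls into bins, unique selectors take consecutive names, and the
    number of bins shrinks until it is reset to n/(beta ln n)."""
    rng = random.Random(seed)
    log_n = math.log(n)
    names = [0] * n
    last_name = 0
    bins = n
    while last_name < n:
        # each unnamed processor selects a bin in [1, number-of-bins]
        choice = {v: rng.randint(1, bins) for v in range(n) if names[v] == 0}
        counts = {}
        for b in choice.values():
            counts[b] = counts.get(b, 0) + 1
        for i in range(1, bins + 1):
            if counts.get(i, 0) == 1:         # unique selector: assign a name
                v = next(u for u, b in choice.items() if b == i)
                last_name += 1
                names[v] = last_name
        if n - last_name > beta * log_n:      # shrinking iteration
            bins = n - last_name
        else:                                 # restored iteration
            bins = max(1, round(n / (beta * log_n)))
    return names

names = common_constant_attempt(50)
assert sorted(names) == list(range(1, 51))    # names fill [1, n]
```

With perfect collision checks, a single attempt always succeeds; the point of the analysis below is that the randomized checks fail so rarely that the same behavior holds with high probability.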
Balls into bins. As a preparation for the analysis of performance of algorithm CommonConstantLV, we consider a related process of repeatedly throwing balls into bins, which we call the β-process. The β-process proceeds through stages, each representing one iteration of the inner repeat-loop in Figure 6.3. A stage results in some balls removed and some transitioning to the next stage, so that eventually no balls remain and the process terminates.

The balls that participate in a stage are called eligible for the stage. In the first stage, n balls are eligible and we throw n balls into n bins. Initially, we apply the principle that after all eligible balls have been placed into bins during a stage, the singleton bins along with the balls in them are removed. A stage after which bins are removed is called shrinking. There are k bins and k balls in a shrinking stage; we refer to k as the length of this stage. Given the balls and bins of any stage, we choose a bin uniformly at random and independently for each ball in the beginning of the stage, and next place the balls in their selected destinations. The bins that either are empty or multiple in a shrinking stage stay for the next stage. The balls from multiple bins become eligible for the next stage.

This continues until a shrinking stage after which at most β ln n balls remain. Then we restore bins for a total of n/(β ln n) of them to be used in the following stages, during which we never remove any bin; these stages are called restored. In these final restored stages, we keep removing singleton balls at the end of a stage, while balls from multiple bins stay as eligible for the next restored stage. This continues until all balls are removed.
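The shrinking and restored stages can be simulated directly; the sketch below (with hypothetical function names) reports the two quantities bounded in Lemma 6: the total length of the shrinking stages and the number of restored stages.

```python
import math
import random
from collections import Counter

def beta_process(n, beta=2.0, seed=1):
    """Simulate the staged ball process described above."""
    rng = random.Random(seed)
    threshold = beta * math.log(n)
    balls = bins = n                 # a shrinking stage has k balls and k bins
    shrink_total = 0
    while balls > threshold:         # shrinking stages: singleton bins removed
        shrink_total += balls        # the stage length is the number of balls
        counts = Counter(rng.randrange(bins) for _ in range(balls))
        singles = sum(1 for c in counts.values() if c == 1)
        balls -= singles
        bins -= singles
    bins = max(1, round(n / threshold))   # restore n/(beta ln n) bins
    restored = 0
    while balls > 0:                 # restored stages: only balls are removed
        restored += 1
        counts = Counter(rng.randrange(bins) for _ in range(balls))
        balls -= sum(1 for c in counts.values() if c == 1)
    return shrink_total, restored

shrink_total, restored = beta_process(1000)
```

Lemma 6 predicts that shrink_total is at most 2en and that there are at most β ln n restored stages, with high probability; typical runs land well inside both bounds.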
Lemma 6 For any a > 0, there exists β > 0 such that the sum of lengths of all shrinking stages in the β-process is at most 2en, where e is the base of natural logarithms, and there are at most β ln n restored stages, both events holding with probability 1 - n^{-a}, for sufficiently large n.
Proof: We consider two cases, depending on the kind of the analyzed stage. Let k ≤ n denote the length of a stage.

In a shrinking stage, we throw k balls into k bins, choosing bins independently and uniformly at random. The probability that a ball ends up singleton can be bounded from below as follows:

(1 - 1/k)^{k-1} ≥ e^{-(1/k + 1/k²)(k-1)} = e^{-1 + 1/k²} ≥ 1/e ,

where we used the inequality 1 - x ≥ e^{-x-x²}, which holds for 0 ≤ x ≤ 1/2.
Let Z_k be the number of singleton balls after k balls are thrown into k bins. It follows that the expectation of Z_k satisfies E[Z_k] ≥ k/e.

To estimate the deviation of Z_k from its expected value, we use the bounded-differences inequality [71, 75]. Let B_j be the bin of ball b_j, for 1 ≤ j ≤ k. Then Z_k is of the form Z_k = h(B_1, ..., B_k), where h satisfies the Lipschitz condition with constant 2, because moving one ball to a different bin changes the value of h by at most 2 with respect to the original value. The bounded-differences inequality specialized to this instance is as follows, for any d > 0:
Pr(Z_k ≤ E[Z_k] - d√k) ≤ exp(-d²/8) .    (6.5)
We use this inequality for d = √k/(2e). Then (6.5) implies the following bound:

Pr(Z_k ≤ k/e - k/(2e)) = Pr(Z_k ≤ k/(2e)) ≤ exp(-k/(32e²)) .
If we start a shrinking stage with k eligible balls, then the number of balls eligible for the next stage is at most

(1 - 1/(2e)) k = ((2e - 1)/(2e)) k ,

with probability at least 1 - exp(-k/(32e²)). Let us continue shrinking stages as long as the inequality k/(32e²) > 3a ln n holds. We denote this inequality concisely as k > β ln n, for β = 96e²a. Then the probability that every shrinking stage results in the size of the pool of eligible balls decreasing by a factor of at least

γ = 2e/(2e - 1)

is itself at least

(1 - e^{-3a ln n})^{log_γ n} ≥ 1 - log_γ n · n^{-3a} ≥ 1 - n^{-2a} ,

for sufficiently large n, by Bernoulli's inequality.

If all shrinking stages result in the size of the pool of eligible balls decreasing by a factor of at least γ, then the total number of eligible balls summed over all such stages is at most

n Σ_{i≥0} γ^{-i} = n · 1/(1 - 1/γ) = 2en .
In a restored stage, there are at most β ln n eligible balls and n/(β ln n) bins. A restored stage happens to be the last one when all the balls become single after their placement, which occurs with probability at least

((n/(β ln n) - β ln n) / (n/(β ln n)))^{β ln n} = (1 - β² ln² n/n)^{β ln n} ≥ 1 - β³ ln³ n/n ,

by Bernoulli's inequality. It follows that there are more than β ln n restored stages with probability at most

(β³ ln³ n / n)^{β ln n} .

This bound is at most n^{-2a} for sufficiently large n.

Both events, one about shrinking stages and the other about restored stages, hold with probability at least 1 - 2n^{-2a} ≥ 1 - n^{-a}, for sufficiently large n.

Next we summarize the performance of algorithm CommonConstantLV as a Las Vegas algorithm. In its proof, we rely on mapping executions of the β-process on executions of algorithm CommonConstantLV in a natural manner.
Theorem 6 Algorithm CommonConstantLV terminates almost surely and there is no error when the algorithm terminates. For any a > 0 there exist β > 0 and c > 0 such that the algorithm terminates within time cn ln n using at most cn ln n random bits with probability 1 - n^{-a}.
Proof: The condition controlling the main repeat-loop guarantees that an execution terminates only when the assigned names fill the interval [1, n], so they are distinct.
To analyze time performance, we consider the β-process of throwing balls into bins as in Lemma 6. Let β_1 > 0 be the number β specified in this lemma, as determined by a replaced by 2a in its assumptions. This lemma gives that the sum of all values of K summed over all shrinking stages is at most 2en with probability at least 1 - n^{-2a}.
For a given K and a number i ∈ [1, K], procedure VerifyCollision is executed β ln n times, where β is the parameter in Figure 6.3. If there is a collision, then it is detected with probability at least 1 - 2^{-β ln n}. We may take β_2 > β_1 sufficiently large so that the inequality 2en · 2^{-β_2 ln n} < n^{-2a} holds.
The total number of instances of executing VerifyCollision during an iteration of the main loop, while K is kept equal to n/(β ln n), is at most n. Observe that the inequality n · 2^{-β_2 ln n} < n^{-2a} also holds, because n < 2en.
If β is set in Figure 6.3 to β_2, then one iteration of the outer repeat-loop suffices with probability at least 1 - 2n^{-2a}, for sufficiently large n. This is because verifications for collisions detect all existing collisions with this probability. Similarly, this one iteration takes O(n log n) time with probability at least 1 - 2n^{-2a}, for sufficiently large n. The claimed performance holds therefore with probability at least 1 - n^{-a}, for sufficiently large n.
There are at least i iterations of the main repeat-loop with probability at most n^{-(i-1)a}, which converges to 0 as i increases, so the algorithm terminates almost surely.
Algorithm CommonConstantLV is optimal among Las Vegas algorithms with respect to the following performance measures: the expected time O(n log n), given the amount O(1) of its available shared memory, by Corollary 1, and the expected number of random bits O(n log n), by Proposition 1 in Section 5.2.
6.4 Common with Unbounded Memory
Now we consider the last case, when the PRAM is of its Common variant and there is an unbounded amount of shared memory. We propose an algorithm called CommonUnboundedLV. The algorithm invokes procedure VerifyCollision, whose pseudocode is in Figure 4.1.
An execution proceeds as a sequence of attempts to assign temporary names. When such an attempt results in assigning temporary names without duplicates, then these transient names become final. An attempt begins with each processor selecting an integer from the interval [1, (β + 1)n] uniformly at random and independently, where β is a parameter about which only β > 1 is assumed. Next, for lg n steps, each processor executes procedure VerifyCollision(x), where x is the currently selected integer. If a collision is detected, then the processor immediately selects another number in [1, (β + 1)n] and continues verifying for a collision. After lg n such steps, the processors count the total number of different selected integers. If this number equals exactly n, then the ranks of the selected integers are assigned as names; otherwise another attempt to find names is performed. Computing the number of selections and the ranks takes time O(log n). In order to amortize this time O(log n) against verifications, such a computation of ranks is performed only after lg n verifications. Here the rank of a selected x is the number of selected values that are at most x. A pseudocode for algorithm CommonUnboundedLV is given in Figure 6.4. Subroutines of prefix type, like computing the number of selections and the ranks of selected numbers, are not included in this pseudocode.
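The attempt structure can be sketched with a centralized simulation in which collision detection is assumed perfect, while the real algorithm uses the randomized procedure VerifyCollision. The function name is hypothetical.

```python
import math
import random

def common_unbounded_lv(n, beta=2, seed=0):
    """Centralized sketch: processors pick integers in [1, (beta+1)n], collided
    processors re-select for lg n steps, and an attempt succeeds when exactly
    n distinct integers are occupied; ranks then serve as names."""
    rng = random.Random(seed)
    size = (beta + 1) * n
    x = [rng.randint(1, size) for _ in range(n)]
    attempts = 0
    while True:
        attempts += 1
        for _ in range(max(1, round(math.log2(n)))):     # lg n steps
            counts = {}
            for v in x:
                counts[v] = counts.get(v, 0) + 1
            # every processor whose integer collides selects a new one
            x = [v if counts[v] == 1 else rng.randint(1, size) for v in x]
        if len(set(x)) == n:                              # n occupied "bins"
            break
    rank = {v: i + 1 for i, v in enumerate(sorted(set(x)))}
    return attempts, [rank[v] for v in x]

attempts, names = common_unbounded_lv(100)
assert sorted(names) == list(range(1, 101))
```

Since the attempt ends only when exactly n distinct integers are held, the ranks are necessarily a permutation of [1, n], which is the no-error property of the algorithm.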
Balls into bins. We consider auxiliary processes of placing balls into bins that abstract operations on shared memory as performed by algorithm CommonUnboundedLV.

The β-process consists of placing n balls into (β + 1)n bins. The process is structured as a sequence of stages. A stage represents an abstraction of one iteration of the inner for-loop in Figure 6.4, performed as if collisions were detected instantaneously and with certainty. When a ball is moved, it is placed in a bin selected uniformly at random, all such selections
Algorithm CommonUnboundedLV

x ← random integer in [1, (β + 1)n]    /* throw a ball into bin x */
repeat
    for i ← 1 to lg n do
        if VerifyCollision(x) then
            x ← random integer in [1, (β + 1)n]
    number-occupied-bins ← the total number of selected values of x
until number-occupied-bins = n
name_v ← the rank of bin x among nonempty bins

Figure 6.4: A pseudocode for a processor v of a Common PRAM, where the number of shared memory cells is unbounded. The constant β is a parameter that satisfies the inequality β > 1. The private variable name_v stores the acquired name.
independent from one another. The stages are performed as follows. In the first stage, n balls are placed into (β + 1)n bins. When a bin is singleton in the beginning of a stage, then the ball in the bin stays put through the stage. When a bin is multiple in the beginning of a stage, then all the balls in this bin participate actively in this stage: they are removed from the bin and placed in randomly selected bins. The process terminates after a stage in which all balls reside in singleton bins. It is convenient to visualize a stage as occurring by first removing all balls from multiple bins and then placing the removed balls in randomly selected bins one by one.

We associate a mimicking walk to each execution of the β-process. Such a walk is performed on points with integer coordinates on a line. The mimicking walk proceeds through stages, similarly to the ball process. When we are to relocate k balls in a stage of the ball process, this is represented by the mimicking walk starting the corresponding stage at coordinate k. Suppose that we process a ball in a stage and the mimicking walk is at some position i. Placing this ball in an empty bin decreases the number of balls for the next stage; the respective action in the mimicking walk is to decrement its position from i to i - 1. Placing this ball in an occupied bin increases the number of balls for the next stage; the
respective action in the mimicking walk is to increment its position from i to i + 1. The mimicking walk gives a conservative estimate on the behavior of the ball process, as we show next.
Lemma 7 If a stage of the mimicking walk ends at a position k, then the corresponding stage of the ball β-process ends with at most k balls to be relocated into bins in the next stage.
Proof: The argument is broken into three cases, in which we consider what happens in the ball process and what the corresponding actions in the mimicking walk are. The number of balls in a bin in a stage is meant to be the final number of balls in this bin at the end of the stage.
In the first case, just one ball is placed in a bin that begins the stage as empty. Then this ball will not be relocated in the next stage. This means that the number of balls for the next stage decreases by 1. At the same time, the mimicking walk decrements its position by 1.
In the second case, some j ≥ 1 balls land in a bin that is singleton at the start of this stage, so its resident ball was not eligible for the stage. Then the number of balls in the bin becomes j + 1 and these many balls will need to be relocated in the next stage. Observe that this contributes to incrementing the number of eligible balls in the next stage by 1, because only the original ball residing in the singleton bin is added to the set of eligible balls, while the other balls participate in both stages. At the same time, the mimicking walk increments its position by 1, j times.
In the third and final case, some j ≥ 2 balls land in a bin that is empty at the start of this stage. Then this does not contribute to a change in the number of balls eligible for relocation in the next stage, as these j balls participate in both stages. Let us consider these balls as placed in the bin one by one. The first ball makes the mimicking walk decrement its position. The second ball makes the walk increment its position, so that it returns to the original position as at the start of the stage. The following ball placements, if any, result in the walk incrementing its position.
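The coupling between a stage of the ball process and its mimicking walk can be checked experimentally. The sketch below (hypothetical function names) runs one stage while updating the walk, and confirms the inequality of Lemma 7 on the outcome.

```python
import random

def stage_with_walk(positions, num_bins, rng):
    """One stage of the ball process together with its mimicking walk.
    Returns (new ball positions, eligible balls afterwards, walk position)."""
    counts = {}
    for b in positions:
        counts[b] = counts.get(b, 0) + 1
    movers = [b for b in positions if counts[b] > 1]     # balls to relocate
    walk = len(movers)                                   # walk starts at k
    occupied = {b: 1 for b, c in counts.items() if c == 1}
    new_positions = [b for b in positions if counts[b] == 1]
    for _ in movers:
        dest = rng.randrange(num_bins)
        if occupied.get(dest, 0) == 0:
            walk -= 1                # landed in an empty bin: decrement
        else:
            walk += 1                # landed in an occupied bin: increment
        occupied[dest] = occupied.get(dest, 0) + 1
        new_positions.append(dest)
    final = {}
    for b in new_positions:
        final[b] = final.get(b, 0) + 1
    eligible = sum(c for c in final.values() if c > 1)
    return new_positions, eligible, walk

rng = random.Random(3)
positions = [rng.randrange(600) for _ in range(200)]     # beta = 2: 3n bins
for _ in range(5):
    positions, eligible, walk = stage_with_walk(positions, 600, rng)
    assert eligible <= walk          # Lemma 7: the walk dominates the process
```

A per-bin accounting, as in the proof of the lemma, shows the assertion holds deterministically, not just for these random draws.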
Random walks. Next we consider a random walk which will estimate the behavior of a ball process. One component of the estimation is provided by Lemma 7, in that we will interpret a random walk as a mimicking walk for the ball process.

The random walk is represented as movements of a marker placed on the nonnegative side of the integer number line. The movements of the marker are by distance 1 and they are independent. The random β-walk has the marker's position decremented with probability β/(β + 1) and incremented with probability 1/(β + 1). This may be interpreted as a sequence of independent Bernoulli trials, in which β/(β + 1) is chosen to be the probability of success. We will consider β > 1, for which β/(β + 1) > 1/(β + 1), which means that the probability of success is greater than the probability of failure.

Such a random β-walk proceeds through stages, which are defined as follows. The first stage begins at position n. When a stage begins at a position k, then it ends after k moves, unless the walk reaches the zero coordinate in the meantime. The zero point acts as an absorbing barrier: when the walk's position reaches it, the random walk terminates. This is the only way in which the walk terminates. A stage captures one round of the PRAM's computation, and the number of moves in a stage represents the number of writes processors perform in a round.
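The staged random β-walk can be simulated directly; the sketch below (hypothetical names) counts the stages and moves until absorption at 0, the quantities bounded in Lemma 8.

```python
import random

def beta_walk(n, beta=2.0, seed=0):
    """Simulate the staged random beta-walk: from position k a stage makes
    k moves, each towards 0 with probability beta/(beta+1), and the walk
    stops on reaching the absorbing barrier at 0."""
    rng = random.Random(seed)
    pos = n
    stages = moves = 0
    while pos > 0:
        stages += 1
        for _ in range(pos):         # a stage starting at k comprises k moves
            moves += 1
            if rng.random() < beta / (beta + 1):
                pos -= 1
            else:
                pos += 1
            if pos == 0:
                break                # absorbed: the walk terminates
    return stages, moves

stages, moves = beta_walk(1000)
```

Because each stage shrinks the position by a constant factor in expectation, typical runs use O(log n) stages and O(n) moves, matching Lemma 8.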
Lemma 8 For any numbers a > 0 and β > 1, there exists b > 0 such that the random β-walk starting at position n > 0 terminates within b ln n stages, with all of them comprising O(n) moves, with probability at least 1 - n^{-a}.
Proof: Suppose the random walk starts at position k > 0 when a stage begins. Let X_k be the number of moves towards 0 and Y_k = k - X_k be the number of moves away from 0 in such a stage. The total distance covered towards 0, which we call the drift, is

L(k) = X_k - Y_k = X_k - (k - X_k) = 2X_k - k .

The expected value of X_k is E[X_k] = kβ/(β + 1) = pk. The event X_k ≤ (1 - ε)pk holds with probability at most exp(-ε²pk/2), by the Chernoff bound [75], so that X_k > (1 - ε)pk occurs with the respective high probability. We say that such a stage is conforming when the event X_k > (1 - ε)pk holds.
If a stage is conforming, then the following inequality holds:

L(k) ≥ 2(1 - ε)pk - k = ((β - 1 - 2εβ)/(β + 1)) k .

We want the inequality (β - 1 - 2εβ)/(β + 1) > 0 to hold, which is the case when ε < (β - 1)/(2β). Let us fix such an ε > 0. Now the distance from 0 after k steps starting at k is

k - L(k) ≤ k - ((β - 1 - 2εβ)/(β + 1)) k = (2(1 + εβ)/(β + 1)) k ,

where 2(1 + εβ)/(β + 1) < 1 for ε < (β - 1)/(2β). Let ρ = (β + 1)/(2(1 + εβ)) > 1. Consecutive conforming stages make the distance from 0 decrease by at least a factor of ρ^{-1} each.
When we start the first stage at position n and the next log_ρ n stages are conforming, then after these many stages the random walk ends up at a position that is close to 0. For our purposes, it suffices that the position is of distance at most s ln n from 0, for some s > 0, because of its impact on probability. Namely, the event that all these stages are conforming and the bound s ln n on the distance from 0 holds, occurs with probability at least

1 - log_ρ n · exp(-(ε²β/(2(β + 1))) s ln n) ≥ 1 - log_ρ n · n^{-ε²βs/(2(β+1))} .

Let us choose s > 0 such that

log_ρ n · n^{-ε²βs/(2(β+1))} ≤ 1/(2n^a) ,

for sufficiently large n.
Having fixed s, let us take t > 0 such that the distance covered towards 0 is at least s ln n when starting from k = t ln n and performing k steps. We interpret these movements as a single conceptual stage for the sake of the argument, but its duration comprises all the real stages from when we first reach a position at most s ln n until we terminate at 0. It follows that the conceptual stage comprises at most t ln n real stages, because a stage takes at least one round.

If this last conceptual stage is conforming, then the distance covered towards 0 is bounded from below by

L(t ln n) ≥ ((β - 1 - 2εβ)/(β + 1)) t ln n .

We want this to be at least s ln n for k = t ln n, which is equivalent to

((β - 1 - 2εβ)/(β + 1)) t ≥ s .

Now it is sufficient to take t ≥ s(β + 1)/(β - 1 - 2εβ). This last conceptual stage is not conforming with probability at most exp(-(ε²β/(2(β + 1))) t ln n). Let us take t that is additionally big enough for the following inequality

exp(-(ε²β/(2(β + 1))) t ln n) = n^{-ε²βt/(2(β+1))} ≤ 1/(2n^a)

to hold.
Having selected s and t, we can conclude that there are at most b ln n stages, for a suitable b > 0 depending on s, t, and ρ, with probability at least 1 - n^{-a}.
Now let us consider the total numbers X_m of moves towards 0 and Y_m of moves away from 0 after m moves in total, when starting at position n. The event X_m ≤ (1 - ε)βm/(β + 1) holds with probability at most exp(-(ε²β/(2(β + 1))) m), by the Chernoff bound [75], so that X_m > (1 - ε)βm/(β + 1) occurs with the respective high probability 1 - exp(-(ε²β/(2(β + 1))) m). At the same time, the number of moves away from zero can be estimated as

Y_m = m - X_m ≤ m - ((1 - ε)β/(1 + β)) m = ((1 + εβ)/(1 + β)) m .

This gives an estimate on the corresponding drift:

L(m) = X_m - Y_m ≥ ((β - 1 - 2εβ)/(β + 1)) m .

We want the inequality (β - 1 - 2εβ)/(β + 1) > 0 to hold, which is the case when ε < (β - 1)/(2β). The drift is at least n, with the corresponding large probability, when m = dn for d = (β + 1)/(β - 1 - 2εβ). The drift is at least this large with probability exponentially close to 1 in n, which is at least 1 - n^{-a} for sufficiently large n.
Lemma 9 For any numbers a > 0 and β > 1, there exists b > 0 such that the β-process starting with n balls terminates within b ln n stages after performing O(n) ball throws with probability at least 1 - n^{-a}.
Proof: We estimate the behavior of the β-process on n balls by the behavior of the random β-walk starting at position n. The justification of the estimation is in two steps. One is the property of mimicking walks given as Lemma 7. The other is provided by Lemma 8 and is justified as follows. The probabilities of decrementing and incrementing position in the random β-walk are such that they reflect the probabilities of landing in an empty bin or in an occupied bin. Namely, we use the facts that, during an execution of the β-process, there are at most n occupied bins and at least βn empty bins in any round. In the β-process, the probability of landing in an empty bin is at least βn/((β + 1)n) = β/(β + 1), and the probability of landing in an occupied bin is at most n/((β + 1)n) = 1/(β + 1). This means that the random β-walk is consistent with Lemma 7 in providing estimates on the time of termination of the β-process from above.
Incorporating verifications. We consider the random β-walk with verifications, which is defined as follows. The process proceeds through stages, similarly to the regular random β-walk. For any round of the walk and a position at which the walk is, we first perform a Bernoulli trial with probability 1/2 of success. Such a trial is referred to as a verification, which is positive when a success occurs and negative otherwise. After a positive verification, a movement of the marker occurs as in the regular β-walk; otherwise the walk pauses at the given position for this round.
Lemma 10 For any numbers a > 0 and β > 1, there exists b > 0 such that the random β-walk with verifications starting at position n > 0 terminates within b ln n stages, with all of them comprising the total of O(n) moves, with probability at least 1 - n^{-a}.
Proof: We provide an extension of the proof of Lemma 8, which states a similar property of regular random β-walks. That proof estimated the durations of stages and the number of moves.

Suppose the regular random β-walk starts at a position k, so that the stage takes k moves. There is a constant d < 1 such that the walk ends at a position at most dk with probability that is exponentially close to 1 in k.

Moreover, the proof of Lemma 8 is such that all the values of k considered are at least logarithmic in n, which provides at most a polynomial bound on the error. A random walk with verifications is slowed down by negative verifications. Observe that a random walk with verifications that is performed for 3k rounds undergoes at least k positive verifications with probability that is exponentially close to 1 in k, by the Chernoff bound [75]. This means that the proof of Lemma 8 can be adapted to the case of random walks with verifications almost verbatim, with the modifications contributed by polynomial bounds on the error of estimates of the numbers of positive verifications in stages.
Next, we consider a β-process with verifications, which is defined as follows. The process proceeds through stages, similarly to the regular ball process. The first stage starts with placing n balls into (β + 1)n bins. In each following stage, we first go through the multiple bins and, for each ball in such a bin, we perform a Bernoulli trial with probability 1/2 of success, which we call a verification. A success in a trial is referred to as a positive verification; otherwise it is a negative one. If at least one positive verification occurs for a ball in a multiple bin, then all the balls in this bin are relocated in this stage to bins selected uniformly at random and independently for each such ball; otherwise the balls stay put in this bin until the next stage. The β-process with verifications terminates when all the balls are singleton.
Lemma 11 For any numbers a > 0 and β > 1, there exists b > 0 such that the β-process with verifications terminates within b ln n stages, with all of them comprising the total of O(n) ball throws, with probability at least 1 - n^{-a}.
Proof: The argument proceeds by combining Lemma 7 with Lemma 10, similarly as Lemma 9 is proved by combining Lemma 7 with Lemma 8. The details follow.

For any execution of a ball process with verifications, we consider a mimicking random walk, also with verifications, defined such that when a ball from a multiple bin is handled, the outcome of a random verification for this ball is mapped on a verification for the corresponding random walk. Observe that for a β-process with verifications just one positive verification among the j trials is sufficient when there are j ≥ 2 balls in a multiple bin, so a random β-walk with verifications provides an upper bound on the time of termination of the β-process with verifications. The probabilities of decrementing and incrementing position in the random β-walk with verifications are such that they reflect the probabilities of landing in an empty bin or in an occupied bin, similarly as without verifications. All this gives consistency of a β-walk with verifications with Lemma 7 in providing estimates on the time of termination of the β-process from above.
Next we summarize the performance of algorithm CommonUnboundedLV as a Las Vegas algorithm. The proof is based on mapping executions of the β-processes with verifications on executions of algorithm CommonUnboundedLV in a natural manner.
Theorem 7 Algorithm CommonUnboundedLV terminates almost surely and when the algorithm terminates then there is no error. For each a > 0 and any β > 1 in the pseudocode, there exists c > 0 such that the algorithm assigns proper names within time c lg n and using at most cn lg n random bits with probability at least 1 - n^{-a}.
Proof: The algorithm terminates when there are n different ranks, by the condition controlling the repeat-loop. As ranks are distinct and each is in the interval [1, n], each name is unique, so there is no error. The repeat-loop is executed O(1) times with probability at least 1 - n^{-a}, by Lemma 11. The repeat-loop is performed i times with probability at most n^{-(i-1)a}, which converges to 0 as i increases. It follows that the algorithm terminates almost surely.
An iteration of the repeat-loop in Figure 6.4 takes O(log n) steps. This is because of the following two facts. First, it consists of lg n iterations of the for-loop, each taking O(1) rounds. Second, it concludes with verifying the until-condition, which is carried out by counting nonempty bins by a prefix-type computation. It follows that the time until termination is O(log n) with probability 1 - n^{-a}.
By Lemma 11, the total number of ball throws is O(n) with probability 1 - n^{-a}. Each placement of a ball requires O(log n) random bits, so the number of used random bits is O(n log n) with the same probability.
Algorithm CommonUnboundedLV is optimal among Las Vegas naming algorithms with respect to the following performance measures: the expected time O(log n), by Theorem 3, the number of shared memory cells O(n) used to achieve this running time, by Corollary 1, and the expected number of random bits O(n log n), by Proposition 1.
6.5 Conclusion
We considered the naming problem for the anonymous synchronous PRAM when the number of processors n is known. We gave Las Vegas algorithms for four variants of the problem, which are determined by the restrictions on concurrent writing and on the amount of shared memory. Each of these algorithms is provably optimal for its case with respect to the natural performance metrics, such as the expected time (as determined by the amount of shared memory) and the expected number of used random bits.
7. PRAM: MONTE CARLO ALGORITHMS
We consider naming of anonymous processors of a PRAM when the number of processors n is unknown. The naming problems are determined by two independent specifications: the amount of shared memory and the PRAM variant.
7.1 Arbitrary with Constant Memory
We develop a naming algorithm for an Arbitrary PRAM with a constant number of shared memory cells. The algorithm is called ArbitraryBoundedMC.
The underlying idea is to have all processors repeatedly attempt to obtain tentative names and terminate when the probability of duplicate names is gauged to be sufficiently small. To this end, each processor writes an integer selected from a suitable selection range into a shared memory register and next reads this register to verify whether the write was successful or not. A successful write results in each such processor getting a tentative name by reading and incrementing another shared register operating as a counter. One of the challenges here is to determine a selection range from which random integers are chosen for writing. A good selection range is large enough with respect to the number of writers, which is unknown, because when the range is too small then multiple processors may select the same integer, and so all of them get the same tentative name after this integer gets written successfully. The algorithm keeps the size of a selection range growing with each failed attempt to assign tentative names.
There is an inherent tradeoff present, in that on the one hand we want to keep the size of used shared memory small, as a measure of efficiency of the algorithm, while at the same time the larger the selection range the smaller the probability of collision of random selections from it, and so of the resulting duplicate names. Additionally, increasing the selection range repeatedly costs time for each such repetition, while we also want to minimize the running time as a metric of performance. The algorithm keeps increasing the selection range at a quadratic rate, which turns out to be sufficient to optimize all the performance metrics we measure. The algorithm terminates when the number of selected
Algorithm ArbitraryBoundedMC

initialize k ← 1                              /* initial approximation of lg n */
repeat
    initialize LastName ← name_v ← 0
    k ← 2k
    bin_v ← random integer in [1, 2^k]        /* throw a ball into a bin */
    repeat
        AllNamed ← true
        if name_v = 0 then
            Pad ← bin_v
            if Pad = bin_v then
                LastName ← LastName + 1
                name_v ← LastName
            else
                AllNamed ← false
    until AllNamed
until LastName ≤ 2^{k/β}
Figure 7.1: A pseudocode for a processor v of an Arbitrary PRAM with a constant number of shared memory cells. The variables LastName, AllNamed and Pad are shared. The private variable name_v stores the acquired name. The constant β > 0 is a parameter to be determined by analysis.
integers from the current selection range makes a sufficiently small fraction of the size of the used range.
A pseudocode of algorithm ArbitraryBoundedMC is given in Figure 7.1. Its structure is determined by the main repeat-loop. Each iteration of the main loop begins with doubling the variable k, which determines the selection range [1, 2^k]. This means that the size of the selection range increases quadratically with consecutive iterations of the main repeat-loop. A processor begins an iteration of the main loop by choosing an integer uniformly at random from the current selection range [1, 2^k]. There is an inner repeat-loop, nested within the main loop, which assigns tentative names depending on the random selections just made.
All processors repeatedly write to a shared variable Pad and next read it to verify if the write was successful. It is possible that different processors attempt to write the same value and then each of them verifies that its write was successful. The shared variable LastName is used to progress through consecutive integers to provide tentative names to be assigned to the latest successful writers. When multiple processors attempt to write the same value to Pad and it gets written successfully, then all of them obtain the same tentative name. The variable LastName, at the end of each iteration of the inner repeat-loop, equals the number of occupied bins. The shared variable AllNamed is used to verify if all processors have tentative names. The outer loop terminates when the number of assigned names, which is the same as the number of occupied bins, is smaller than or equal to 2^{k/β}, where β > 0 is a parameter to be determined in analysis.
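The interplay of Pad and LastName can be illustrated with a small simulation (our sketch, not part of the dissertation; the function name is hypothetical, and we model the Arbitrary write semantics by letting one arbitrarily chosen writer succeed in each round):

```python
import random

def arbitrary_bounded_mc(n, beta, rng):
    """Sketch of ArbitraryBoundedMC for n simulated processors.
    Concurrent writes to Pad follow Arbitrary-PRAM semantics: one
    arbitrarily chosen writer's value wins each round."""
    k = 1
    while True:
        k *= 2
        last_name = 0
        name = [0] * n                                     # name_v = 0: unnamed
        ball = [rng.randint(1, 2 ** k) for _ in range(n)]  # throw balls into bins
        while True:
            writers = [v for v in range(n) if name[v] == 0]
            if not writers:
                break                                      # AllNamed holds
            pad = ball[rng.choice(writers)]                # an arbitrary write wins
            last_name += 1
            for v in writers:
                if ball[v] == pad:                         # write looks successful
                    name[v] = last_name
        if last_name <= 2 ** (k / beta):                   # few occupied bins: stop
            return name
```

Under this modeling, the returned names always form the range [1, LastName]; duplicate names, which make the algorithm Monte Carlo, can arise only when distinct processors draw the same ball value.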
Balls into bins. We consider the following auxiliary β-process of throwing balls into bins, for a parameter β > 0. The process proceeds through stages identified by consecutive positive integers. The i-th stage has the number parameter k equal to k = 2^i. During a stage, we first throw n balls into the corresponding 2^k bins and next count the number of occupied bins. A stage is the last in an execution of the β-process, and so the β-process terminates, when the number of occupied bins is smaller than or equal to 2^{k/β}. We observe that the β-process always terminates. This is because, by its specification, the β-process terminates by the first stage in which the inequality n ≤ 2^{k/β} holds, and n is an upper bound on the number of occupied bins in a stage. The inequality n ≤ 2^{k/β} is equivalent to n^β ≤ 2^k and so to β lg n ≤ k. Since k goes through consecutive powers of 2, we obtain that the number of stages of the β-process with n balls is at most lg(β lg n) = lg β + lg lg n.
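This stage structure is easy to simulate directly (a sketch; the helper name is ours):

```python
import random

def beta_process(n, beta, rng):
    """Simulate the beta-process: the i-th stage throws n balls into 2^k bins,
    for k = 2^i, and the process stops once the number of occupied bins is
    at most 2^(k/beta)."""
    k, stages = 2, 0
    while True:
        stages += 1
        occupied = {rng.randrange(2 ** k) for _ in range(n)}  # throw n balls
        if len(occupied) <= 2 ** (k / beta):
            return stages, k
        k *= 2
```

For n = 1000 and β = 3 the bound lg β + lg lg n evaluates to about 5, matching the number of stages the simulation reports.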
We say that such a β-process is correct when upon termination each ball is in a separate bin; otherwise the process is incorrect.
Lemma 12 For any α > 0 there exists β > 0 such that the β-process is incorrect with probability that is at most n^{−α}, for sufficiently large n.
Proof: The β-process is incorrect when there are collisions after the last stage. The probability of the intersection of the events that the β-process terminates and that there are collisions is bounded from above by the probability of any one of these events. Next we show that, for each pair of k and n, one of these two events occurs with probability that is at most n^{−α}, for a suitable β.
First, we consider the event that the β-process terminates. The probability that there are at most 2^{k/β} occupied bins is at most

    C(2^k, 2^{k/β}) · (2^{k/β}/2^k)^n ≤ (e · 2^{k(1−1/β)})^{2^{k/β}} · 2^{k(1/β−1)n} = e^{2^{k/β}} · 2^{k(1/β−1)(n − 2^{k/β})} .    (7.1)

We estimate from above the natural logarithm of the right-hand side of (7.1). We obtain the following upper bound:

    2^{k/β} + k(1/β − 1)(n − 2^{k/β}) ln 2 ≤ 2^{k/β} − (ln 2 / 2)(n − 2^{k/β}) = −(ln 2 / 2) n + ((2 + ln 2)/2) · 2^{k/β} ,    (7.2)

for β ≥ 4/3, as k ≥ 2. The estimate (7.2) is at most −(ln 2 / 4) n when 2^{k/β} ≤ δn, for δ = ln 2/(2(2 + ln 2)), by a direct algebraic verification. These restrictions on k and β can be restated as

    k ≤ β lg(δn)  and  β ≥ 4/3 .    (7.3)

When the condition (7.3) is satisfied, then the probability of at most 2^{k/β} occupied bins is at most

    exp(−(ln 2 / 4) n) ≤ n^{−α} ,

for sufficiently large n.
Next, let us consider the probability of collisions occurring. Collisions do not occur with probability that is at least

    (1 − n/2^k)^n ≥ 1 − n²/2^k ,

by the Bernoulli inequality. It follows that the probability of collisions occurring can be bounded from above by n²/2^k. This bound in turn is at most n^{−α} when

    k ≥ (2 + α) lg n .    (7.4)
In order to have one of the inequalities (7.3) and (7.4) hold for any k and n, it is sufficient to have

    (2 + α) lg n ≤ β lg(δn) .

This determines β as follows:

    β ≥ (2 + α) lg n / (lg n + lg δ) → 2 + α ,

as n → ∞. We obtain that the inequality β > 2 + α suffices, for n that is large enough.
Lemma 13 For each β > 0 there exists c > 0 such that when the β-process terminates then the number of bins ever needed is at most cn and the number of random bits ever generated is at most cn ln n.
Proof: The β-process terminates by the stage in which the inequality n ≤ 2^{k/β} holds, so k gets to be at most β lg n. We partition the range [2, β lg n] of values of k into two subranges and consider them separately.
First, when k ranges from 2 to lg n through the stages, the numbers of needed bins increase quadratically through the stages, because k is doubled with each transition to the next stage. This means that the total number of all these bins is O(n). At the same time, the number of random bits increases geometrically through the stages, so the total number of random bits a processor uses is O(log n).
Second, when k ranges from lg n to β lg n, the number of needed bins is at most n in each stage. There are only lg β + 1 such stages, so the total number of all these bins is (lg β + 1) n. At the same time, a processor uses at most β lg n random bits in each of these stages.
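The O(n) bound for the first subrange can be made explicit (our sketch of the calculation): since k doubles between stages, each bin count 2^k is the square of the previous one, so the sum telescopes against its largest term:

```latex
\sum_{i \,:\, 2^i \le \lg n} 2^{2^i}
  \;\le\; 2^{\lg n} + 2^{(\lg n)/2} + 2^{(\lg n)/4} + \cdots
  \;\le\; n + \sqrt{n}\,\lg\lg n
  \;\le\; 2n ,
```

for sufficiently large n.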
There is a direct correspondence between iterations of the outer repeat-loop and stages of a β-process. The i-th stage has the number k equal to the value of k during the i-th iteration of the outer repeat-loop of algorithm ArbitraryBoundedMC, that is, we have k = 2^i. We map an execution of the algorithm into a corresponding execution of a β-process in order to apply Lemmas 12 and 13 in the proof of the following theorem, which summarizes the performance of algorithm ArbitraryBoundedMC and justifies that it is Monte Carlo.
Theorem 8 Algorithm ArbitraryBoundedMC always terminates, for any β > 0. For each α > 0 there exist β > 0 and c > 0 such that the algorithm assigns unique names, works in time at most cn, and uses at most cn ln n random bits, all this with probability at least 1 − n^{−α}.
Proof: The number of stages of the β-process with n balls is at most lg(β lg n) = lg β + lg lg n. This is also an upper bound on the number of iterations of the main repeat-loop. We conclude that the algorithm always terminates.
The number of bins available in a stage is an upper bound on the number of bins occupied in this stage. The number of bins occupied in a stage equals the number of times the inner repeat-loop is iterated, because executing the instruction Pad ← bin eliminates one occupied bin. It follows that the number of bins ever needed is an upper bound on the time of the algorithm. The number of iterations of the inner repeat-loop is recorded in the variable LastName, so the termination condition of the algorithm corresponds to the termination condition of the β-process.
When the β-process is correct then the processors obtain distinct names. We conclude that Lemmas 12 and 13 apply when understood as statements about the behavior of the algorithm. This implies the following: the names are correct and the execution terminates in O(n) time while O(n log n) bits are used, all this with probability that is at least 1 − n^{−α}.

Algorithm ArbitraryBoundedMC is optimal with respect to the following performance measures: the expected time O(n), by Theorem 2, the expected number of random bits O(n log n), by Proposition 1, and the probability of error n^{−Ω(1)}, by Proposition 3.
7.2 Arbitrary with Unbounded Memory
We develop a naming algorithm for an Arbitrary PRAM with an unbounded amount of shared registers. The algorithm is called ArbitraryUnboundedMC.
The underlying idea is to parallelize the process of selection of names applied in Section 7.1 in algorithm ArbitraryBoundedMC, so that multiple processors could acquire information in the same round that later would allow them to obtain names. As algorithm ArbitraryBoundedMC used shared registers Pad and LastName, the new algorithm uses arrays of shared registers playing similar roles. The values read off from LastName cannot be used directly as names, because multiple processors can read the same values, so we need to distinguish between these values to assign names. To this end, we assign ranks to processors based on their lexicographic ordering by pairs of numbers determined by Pad and LastName.
A pseudocode for algorithm ArbitraryUnboundedMC is given in Figure 7.2. It is structured as a repeat-loop. In the first iteration, the parameter k equals 1, and in subsequent ones it is determined by iterations of an increasing integer-valued function r(k), which is a parameter. We consider two instantiations of the algorithm, determined by r(k) = k + 1 and by r(k) = 2k. In one iteration of the main repeat-loop, a processor uses two variables bin ∈ [1, 2^k/(βk)] and label ∈ [1, 2^{βk}], which are selected independently and uniformly at random from the respective ranges.
We interpret bin as a bin's number and label as a label for a ball. Processors write their values label into the respective bin by the instruction Pad[bin] ← label and verify what value got written. After a successful write, a processor increments LastName[bin] and assigns the pair (bin, LastName[bin]) as its position. This is repeated βk times by way of iterating the inner for-loop. This loop has the specific upper bound βk on the number of iterations because we want to ascertain that there are at most βk balls in each bin. The main repeat-loop terminates when all values attempted to be written actually get written. Then processors
Algorithm ArbitraryUnboundedMC

initialize k ← 1                              /* initial approximation of lg n */
repeat
    initialize AllNamed ← true
    initialize position_v ← (0,0)
    k ← r(k)
    bin_v ← random integer in [1, 2^k/(βk)]   /* choose a bin for the ball */
    label_v ← random integer in [1, 2^{βk}]   /* choose a label for the ball */
    for i ← 1 to βk do
        if position_v = (0,0) then
            Pad[bin_v] ← label_v
            if Pad[bin_v] = label_v then
                LastName[bin_v] ← LastName[bin_v] + 1
                position_v ← (bin_v, LastName[bin_v])
    if position_v = (0,0) then AllNamed ← false
until AllNamed
name_v ← the rank of position_v
Figure 7.2: A pseudocode for a processor v of an Arbitrary PRAM, when the number of shared memory cells is unbounded. The variables Pad and LastName are arrays of shared memory cells, and the variable AllNamed is shared as well. The private variable name_v stores the acquired name. The constant β > 0 and an increasing function r(k) are parameters.
assign themselves names according to the ranks of their positions. The array LastName is assumed to be initialized to 0s, and in each iteration of the repeat-loop we use a fresh region of shared memory to allocate this array.
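The ranking step, and the way a single Arbitrary write per bin resolves concurrent labels, can be sketched as a simulation (ours, not from the dissertation; the helper name and the modeling of an Arbitrary write as one arbitrarily chosen winner per bin are assumptions):

```python
import random

def arbitrary_unbounded_mc(n, beta, r, rng):
    """Sketch of ArbitraryUnboundedMC for n simulated processors.
    Each round, among the unnamed processors sharing a bin, one arbitrarily
    chosen label wins the write to Pad[bin]."""
    k = 1
    while True:
        k = r(k)
        bins = max(1, (2 ** k) // (beta * k))
        bin_of = [rng.randint(1, bins) for _ in range(n)]
        label = [rng.randint(1, 2 ** (beta * k)) for _ in range(n)]
        position = [None] * n
        last_name = {}                       # LastName[bin], implicitly zero
        for _ in range(beta * k):
            pending = {}                     # unnamed processors, grouped by bin
            for v in range(n):
                if position[v] is None:
                    pending.setdefault(bin_of[v], []).append(v)
            for b, vs in pending.items():
                pad = label[rng.choice(vs)]  # Arbitrary write into Pad[b]
                last_name[b] = last_name.get(b, 0) + 1
                for v in vs:
                    if label[v] == pad:      # write appears successful
                        position[v] = (b, last_name[b])
        if all(p is not None for p in position):   # AllNamed
            break
    order = sorted(range(n), key=lambda v: position[v])
    rank = {v: i + 1 for i, v in enumerate(order)}  # rank of position
    return [rank[v] for v in range(n)]
```

Whenever no label collision occurs, the returned names are a permutation of [1, n].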
Balls into bins. We consider a related process of placing labeled balls into bins, which is referred to as the β-process. Such a process proceeds through stages and is parametrized by a function r(k). In the first stage, we have k = 1, and given some value of k in a stage, the next stage has this parameter equal to r(k). In a stage with a given k, we place n balls into 2^k/(βk) bins, with labels from [1, 2^{βk}]. The selections of bins and labels are performed independently and uniformly at random. A stage terminates the β-process when there are at most βk labels of balls in each bin.
Lemma 14 The β-process always terminates.
Proof: The β-process terminates by a stage in which the inequality n ≤ βk holds, because n is an upper bound on the number of balls in a bin. Such a stage always occurs when the function r(k) is increasing.
We expect the β-process to terminate earlier, as the next lemma states.
Lemma 15 For each α > 0, if k ≤ lg n − 2 and β ≥ 1 + α then the probability of halting in the stage is smaller than n^{−α}, for sufficiently large n.
Proof: We show that when k is suitably small then the probability of at most βk different labels in each bin is small. There are n balls placed into 2^k/(βk) bins, so there are at least βkn/2^k balls in some bin, by the pigeonhole principle. We consider these balls and their labels. The probability that all these balls have at most βk labels is at most

    C(2^{βk}, βk) · (βk/2^{βk})^{βkn/2^k} ≤ (e · 2^{βk}/(βk))^{βk} · (βk/2^{βk})^{βkn/2^k} .    (7.5)
We want to show that this is at most n^{−α}. We compare the binary logarithms of n^{−α} and of the right-hand side of (7.5), and want the following inequality to hold:

    βk (lg e + βk − lg(βk)) + (βkn/2^k)(lg(βk) − βk) ≤ −α lg n ,

which is equivalent to the following inequality, by algebra:

    n/2^k ≥ 1 + lg e/(βk − lg(βk)) + α lg n/(βk(βk − lg(βk))) .    (7.6)
Observe now that, assuming β ≥ α + 1, if k ≤ √(lg n) then the right-hand side of (7.6) is at most 2 + lg n while the left-hand side is at least √n, and when √(lg n) < k ≤ lg n − 2 then the right-hand side of (7.6) is at most 3 while the left-hand side is at least 4, for sufficiently large n.
We say that a label collision occurs, in a configuration produced by the process, if some bin contains two balls with the same label.
Lemma 16 For any α > 0, if k ≥ (lg n)/2 and β ≥ 4α + 7 then the probability of a label collision is smaller than n^{−α}.
Proof: The number of pairs of a bin number and a label is 2^k · 2^{βk}/(βk). It follows that the probability that no two balls obtain the same pair of a bin and a label is at least

    (1 − 1/(2^{k+βk}/(βk)))^{n²} ≥ 1 − n²/(2^{k+βk}/(βk)) ,

by the Bernoulli inequality. So the probability that two different balls obtain the same label in the same bin is at most n²βk/2^{(1+β)k}. We want the following inequality to hold:

    n²βk/2^{(1+β)k} ≤ n^{−α} .

This is equivalent to the inequality obtained by taking binary logarithms:

    (2 + α) lg n ≤ (1 + β)k − lg(βk) ,

which holds when (2 + α) lg n ≤ (1 + β)k/2. It follows that it is sufficient for k to satisfy

    k ≥ (2(2 + α)/(1 + β)) lg n .

This inequality holds for k ≥ (lg n)/2 when β ≥ 4α + 7.
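The pair-collision event in this lemma is a birthday-style computation that is easy to check by simulation (our sketch; the helper name is an assumption):

```python
import random

def has_label_collision(n, k, beta, rng):
    """Throw n balls, each picking a bin in [1, 2^k/(beta*k)] and a label in
    [1, 2^(beta*k)]; report whether two balls share both bin and label."""
    bins = max(1, (2 ** k) // (beta * k))
    seen = set()
    for _ in range(n):
        pair = (rng.randint(1, bins), rng.randint(1, 2 ** (beta * k)))
        if pair in seen:
            return True
        seen.add(pair)
    return False
```

With k about (lg n)/2 and β = 4α + 7, the collision probability n²βk/2^{(1+β)k} is negligible, while a deliberately undersized label range forces a collision by the pigeonhole principle.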
We say that such a β-process is correct when upon termination no label collision occurs; otherwise the process is incorrect.
Lemma 17 For any α > 0, there exists β > 0 such that the β-process is incorrect with probability that is at most n^{−α}, for sufficiently large n.
Proof: The β-process is incorrect when there is a label collision after the last stage. The probability of the intersection of the events that the β-process terminates and that there are label collisions is bounded from above by the probability of any one of these events. Next we show that, for each pair of k and n, one of these two events occurs with probability that is at most n^{−α}, for a suitable β.
To this end we use Lemmas 15 and 16, in which we substitute 2α for α. We obtain that, on the one hand, if k ≤ lg n − 2 and β ≥ 1 + 2α then the probability of halting is smaller than n^{−2α}, and, on the other hand, that if k ≥ (lg n)/2 and β ≥ 8α + 7 then the probability of a label collision is smaller than n^{−2α}. It follows that one of the two considered events occurs with probability at most 2n^{−2α}, for sufficiently large β and any sufficiently large n. This probability is at most n^{−α}, for sufficiently large n.
Lemma 18 For any α > 0, there exist β > 0 and c > 0 such that the following two facts about the β-process hold. If r(k) = k + 1 then at most cn/ln n bins are ever needed and cn ln² n random bits are ever generated, each of these properties occurring with probability that is at least 1 − n^{−α}. If r(k) = 2k then at most cn²/ln n bins are ever needed and cn ln n random bits are ever generated, each of these properties occurring with probability that is at least 1 − n^{−α}.
Proof: We throw n balls into 2^k/(βk) bins. As k keeps increasing, the probability of termination increases as well, because both 2^k/(βk) and βk increase as functions of k. Let us take k = 1 + lg n, so that the number of bins is 2n/(βk). We want to show that no bin contains more than βk balls, with a suitably small probability of error.
Let us consider a specific bin and let X be the number of balls in this bin. The expected number of balls in the bin is μ = βkn/2^k = βk/2. We use the Chernoff bound for a sequence of Bernoulli trials in the form of

    Pr(X ≥ (1 + ε)μ) ≤ exp(−ε²μ/3) ,

which holds for 0 < ε ≤ 1; see [75]. Let us choose ε = 1, so that 1 + ε = 2 and (1 + ε)μ = βk.
We obtain that

    Pr(X > βk) ≤ Pr(X ≥ βk) ≤ exp(−μ/3) = exp(−(β/6)(1 + lg n)) ,

which can be made smaller than n^{−1−α} for a β sufficiently large with respect to α, and sufficiently large n. Using the union bound over the bins, some bin contains more than βk balls with probability at most n^{−α}. This implies that termination occurs as soon as k reaches or surpasses 1 + lg n, with the corresponding large probability 1 − n^{−α}.
In the case of r(k) = k + 1, the consecutive integer values of k are tried, so the β-process terminates by the time k = 1 + lg n, and for this k the number of bins needed is O(n/log n). To choose a bin for any value of k requires at most k random bits, so implementing such choices for k = 1, 2, ..., 1 + lg n requires O(log² n) random bits per processor.
In the case of r(k) = 2k, the β-process terminates by k equal to 2(1 + lg n), and for this value of k the number of bins needed is O(n²/log n). As k progresses through consecutive powers of 2, the sum of the numbers of random bits used per processor is a sum of a geometric progression, and so is of the order of the maximum term, that is, O(log n) random bits per processor.
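The Chernoff step in the proof, bounding the maximum load for k = 1 + lg n, can be checked numerically (our sketch; the helper name is an assumption):

```python
import math
import random

def max_load(n, beta, rng):
    """Throw n balls into 2^k/(beta*k) bins for k = 1 + ceil(lg n);
    return the maximum bin load and the cap beta*k from the Chernoff argument."""
    k = 1 + math.ceil(math.log2(n))
    bins = max(1, (2 ** k) // (beta * k))
    counts = [0] * bins
    for _ in range(n):
        counts[rng.randrange(bins)] += 1      # place one ball
    return max(counts), beta * k
```

For n = 1000 and β = 8 the mean load is about βk/2 ≈ 44, and the maximum load stays below the cap βk = 88 except with polynomially small probability.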
There is a direct correspondence between iterations of the outer repeat-loop of algorithm ArbitraryUnboundedMC and stages of the β-process. We map an execution of the algorithm into a corresponding execution of a β-process in order to apply Lemmas 17 and 18 in the proof of the following theorem, which summarizes the performance of algorithm ArbitraryUnboundedMC and justifies that it is Monte Carlo.
Theorem 9 Algorithm ArbitraryUnboundedMC always terminates, for any β > 0. For each α > 0 there exist β > 0 and c > 0 such that the algorithm assigns unique names and has the following additional properties with probability 1 − n^{−α}. If r(k) = k + 1 then at most cn/ln n memory cells are ever needed, cn ln² n random bits are ever generated, and the algorithm terminates in time O(log² n). If r(k) = 2k then at most cn²/ln n memory cells are ever needed, cn ln n random bits are ever generated, and the algorithm terminates in time O(log n).
Proof: The algorithm always terminates, by Lemma 14. By Lemma 17, the algorithm assigns correct names with probability that is at least 1 − n^{−α}. The remaining properties follow from Lemma 18, because the number of bins is proportional to the number of memory cells and the number of random bits per processor is proportional to time.
The instantiations of algorithm ArbitraryUnboundedMC are close to optimality with respect to some of the performance metrics we consider, depending on whether r(k) = k + 1 or r(k) = 2k. If r(k) = k + 1 then the algorithm's use of shared memory would be optimal if its time were O(log n), by Theorem 2, but it may miss optimality by at most a logarithmic factor, since the algorithm's time is O(log² n). Similarly, if r(k) = k + 1 then the number of random bits ever generated, O(n log² n), misses optimality by at most a logarithmic factor, by Proposition 1. On the other hand, if r(k) = 2k then the expected time O(log n) is optimal, by Theorem 3, the expected number of random bits O(n log n) is optimal, by Proposition 1, and the probability of error n^{−Ω(1)} is optimal, by Proposition 3, but the amount of used shared memory misses optimality by at most a polynomial factor, by Theorem 2.
7.3 Common with Bounded Memory
Algorithm CommonBoundedMC solves the naming problem for a Common PRAM with a constant number of shared read-write registers. To make its exposition more modular, we use two procedures, EstimateSize and ExtendNames. Procedure EstimateSize produces an estimate of the number n of processors. Procedure ExtendNames is iterated multiple times; each iteration is intended to assign names to a group of processors. This is accomplished by the processors selecting integer values at random, interpreted as throwing balls into bins, and verifying for collisions. Each selection of a bin is followed by a collision detection. A ball placement without a detected collision results in a name assigned; otherwise the involved processors try again to throw balls into a range of bins. The effectiveness of the algorithm hinges on calibrating the number of bins to the expected number of balls to be thrown.
Algorithm CommonBoundedMC has its pseudocode in Figure 7.5. The private variables have the following meaning: size is an approximation of the number of processors n, and numberofbins determines the size of the range of bins. The pseudocodes of procedures EstimateSize and ExtendNames are given in Figures 7.3 and 7.4, respectively.
Balls into bins for the first time. The role of procedure EstimateSize, when called by algorithm CommonBoundedMC, is to estimate the unknown number of processors n, which is returned as size, to assign a value to the variable numberofbins, and to assign a value to each private variable bin, which indicates the number of a selected bin in the range [1, numberofbins]. The procedure tries consecutive values of k as approximations of lg n. For a given k, an experiment is carried out to throw n balls into k·2^k bins. The execution stops when the number of occupied bins is at most 2^k, and then 3·2^k is treated as an approximation of n and k·2^k is the returned number of bins.
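Procedure EstimateSize can be simulated in a few lines (our sketch; the helper name is an assumption):

```python
import random

def estimate_size(n, rng):
    """Simulate EstimateSize: for k = 3, 4, ..., throw n balls into k*2^k
    bins and stop once at most 2^k bins are occupied.
    Returns (size, numberofbins) = (3*2^k, k*2^k)."""
    k = 2
    while True:
        k += 1
        occupied = {rng.randrange(k * 2 ** k) for _ in range(n)}  # throw balls
        if len(occupied) <= 2 ** k:
            return 3 * 2 ** k, k * 2 ** k
```

In line with Lemma 19, the returned size is at most 6n with certainty, and at least n with probability exponentially close to 1.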
Lemma 19 For n ≥ 20 processors, procedure EstimateSize returns an estimate size of n such that the inequality size ≤ 6n holds with certainty and the inequality n ≤ size holds with probability 1 − 2^{−Ω(n)}.
Proof: The procedure returns 3·2^k, for some integer k > 0. We interpret the selecting of values for the variable bin in an iteration of the main repeat-loop as throwing n balls into k·2^k bins; here k = j + 2 in the j-th iteration of this loop, because the smallest value of k is 3. Clearly, n is an upper bound on the number of occupied bins.
If n is a power of 2, say n = 2^i, then the procedure terminates by the time k = i + 1, so that 2^k ≤ 2^{i+1} = 2n. Otherwise, the maximum possible k equals ⌈lg n⌉, because 2^{⌊lg n⌋} < n < 2^{⌈lg n⌉}. This gives 2^{⌈lg n⌉} = 2^{⌊lg n⌋+1} < 2n. We obtain that the inequality 2^k ≤ 2n occurs with certainty, and so 3·2^k ≤ 6n does.
Now we estimate the lower bound on 2^k. Consider k such that 2^k < n/3. Then n balls fall
Procedure EstimateSize

initialize k ← 2                              /* initial approximation of lg n */
repeat
    k ← k + 1
    bin_v ← random integer in [1, k·2^k]
    initialize NonemptyBins ← 0
    for i ← 1 to k·2^k do
        if bin_x = i for some processor x then
            NonemptyBins ← NonemptyBins + 1
until NonemptyBins ≤ 2^k
return (3·2^k, k·2^k)                         /* 3·2^k is size, k·2^k is numberofbins */
Figure 7.3: A pseudocode for a processor v of a Common PRAM. This procedure is invoked by algorithm CommonBoundedMC in Figure 7.5. The variable NonemptyBins is shared.
into at most 2^k bins with probability that is at most

    C(k·2^k, 2^k) · (2^k/(k·2^k))^n ≤ (e·k·2^k/2^k)^{2^k} · k^{−n} = e^{2^k} k^{2^k − n} ≤ e^{n/3} k^{−2n/3} ,    (7.7)

where we used the bound 2^k < n/3. The right-hand side of (7.7) is at most e^{−n/3} when the inequality k ≥ e holds. The smallest k considered in the pseudocode in Figure 7.3 is k = 3 > e. The inequality k ≥ e is consistent with 2^k < n/3 when n ≥ 20. The number of possible values for k is O(log n), so the probability of the procedure returning for some k with 2^k < n/3 is e^{−n/3} · O(log n) = 2^{−Ω(n)}.
Procedure ExtendNames's behavior can also be interpreted as throwing balls into bins, where a processor v's ball is in a bin x when bin_v = x. The procedure first verifies the suitable range of bins [1, numberofbins] for collisions. A verification for collisions takes either just a constant time or O(log n) time.
A constant-time verification occurs when there is no ball in the considered bin i, which is verified when the line "if bin_x = i for some processor x" in the pseudocode in Figure 7.4 is to be executed. Such a verification is performed by using a shared register initialized to 0,
Procedure ExtendNames

initialize CollisionDetected ← collision_v ← false
for i ← 1 to numberofbins do
    if bin_x = i for some processor x then
        if bin_v = i then
            for j ← 1 to β lg size do
                if VerifyCollision then
                    CollisionDetected ← collision_v ← true
            if not collision_v then
                LastName ← LastName + 1
                name_v ← LastName
                bin_v ← 0
if numberofbins > size then
    numberofbins ← size
if collision_v then
    bin_v ← random integer in [1, numberofbins]
Figure 7.4: A pseudocode for a processor v of a Common PRAM. This procedure invokes procedure VerifyCollision, whose pseudocode is in Figure 4.1, and is itself invoked by algorithm CommonBoundedMC in Figure 7.5. The variables LastName and CollisionDetected are shared. The private variable name_v stores the acquired name. The constant β > 0 is to be determined in analysis.
into which all processors v with bin_v = i write 1; then all the processors read this register, and if the outcome of reading is 1 then all write 0 again, which indicates that there is at least one ball in the bin; otherwise there is no ball.
A logarithmic-time verification of collision occurs when there is some ball in the corresponding bin. This triggers calling procedure VerifyCollision precisely β lg size times; notice that this procedure has the default parameter 1, as only one bin is verified at a time. Ultimately, when a collision is not detected for some processor v whose ball is in the bin, then this processor increments LastName and assigns its new value as a tentative name. Otherwise, when a collision is detected, processor v places its ball in a new bin when the last line
in Figure 7.4 is executed. To prepare for this, the variable numberofbins may be reset. During one iteration of the main repeat-loop of the pseudocode of algorithm CommonBoundedMC in Figure 7.5, the number of bins is first set to a value that is O(n log n) by procedure EstimateSize. Immediately after that, it is reset to O(n) by the first call of procedure ExtendNames, in which the instruction numberofbins ← size is performed. Here, we need to notice that numberofbins = Θ(n log n) and size = Θ(n), by the pseudocodes in Figures 7.3 and 7.5 and Lemma 19.
Balls into bins for the second time. In the course of the analysis of performance of procedure ExtendNames, we consider a balls-into-bins process; we call it simply the ball process. It proceeds through stages so that in a stage we have a number of balls which we throw into a number of bins. The sets of bins used in different stages are disjoint. The numbers of balls and bins used in a stage are as determined in the pseudocode in Figure 7.4, which means that there are n balls and the numbers of bins are as determined by an execution of procedure EstimateSize, that is, the first stage uses numberofbins bins and subsequent stages use size bins, as returned by EstimateSize. The only difference from the actions of procedure ExtendNames is that collisions are detected with certainty in the ball process rather than being tested for, which implies that the parameter β is not involved. The ball process terminates in stage lg size or earlier, in the first stage in which no multiple bins are produced, when such a stage occurs.
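The ball process can be simulated stage by stage (our sketch; the helper name is an assumption, and the parameter values in the usage below merely stand in for numberofbins = Θ(n log n) and size = Θ(n)):

```python
import random

def ball_process(n, numberofbins, size, rng):
    """Simulate the ball process: the first stage throws n balls into
    numberofbins bins; balls landing in multiple bins are re-thrown into
    size fresh bins in each later stage. Returns (stages, total_throws)."""
    balls, throws, stages, m = n, 0, 0, numberofbins
    while balls > 0:
        stages += 1
        throws += balls
        counts = {}
        for _ in range(balls):
            b = rng.randrange(m)                          # throw a ball
            counts[b] = counts.get(b, 0) + 1
        balls = sum(c for c in counts.values() if c > 1)  # balls in multiple bins
        m = size                                          # later stages use size bins
    return stages, throws
```

In line with Lemma 20, the total number of throws stays O(n): most balls are singletons already in the first stage, and the stage sizes then decay rapidly.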
Lemma 20 The ball process results in all balls ending up singletons in their bins and in the number of times a ball is thrown, summed over all the stages, being O(n), both events occurring with probability 1 − n^{−Ω(log n)}.
Proof: The argument leverages the property that, in each stage, the number of bins exceeds the number of balls by at least a logarithmic factor. We will denote the number of bins in a stage by m. This number takes on two values: first m = k·2^k, returned as numberofbins by procedure EstimateSize, and then m = 3·2^k, returned as size by the same procedure EstimateSize, for k ≥ 3. Because m = k·2^k in the first stage, and also size = 3·2^k ≥ n, by Lemma 19, we obtain that m ≥ (n/3) lg(n/3) in the first stage, and that m is at least n in the following stages, with probability exponentially close to 1.
In the first stage, we throw ℓ₁ = n balls into at least m = (n/3) lg(n/3) bins, with large probability. Conditional on the event that there are at least these many bins, the probability that a given ball ends the stage as a singleton in a bin is
    (1 − 1/m)^{ℓ₁−1} ≥ 1 − (ℓ₁ − 1)/m ≥ 1 − n/((n/3) lg(n/3)) ≥ 1 − 4/lg n ,
for sufficiently large n, where we used the Bernoulli inequality. Let Y₁ be the number of singleton balls in the first stage. The expectancy of Y₁ satisfies

    E[Y₁] ≥ ℓ₁ (1 − 4/lg n) .
To estimate the deviation of Y₁ from its expected value E[Y₁], we use the bounded-differences inequality [71, 75]. Let B_j be the bin of ball b_j, for 1 ≤ j ≤ ℓ₁. Then Y₁ is of the form Y₁ = h(B₁, ..., B_{ℓ₁}), where h satisfies the Lipschitz condition with constant 2, because moving one ball to a different bin results in changing the value of h by at most 2 with respect to the original value. The bounded-differences inequality specialized to this instance is as follows, for any d > 0:
$$\Pr\left(Y_1 \le \mathrm{E}[Y_1] - d\sqrt{\ell_1}\right) \le \exp(-d^2/8)\,. \qquad (7.8)$$
We take d = lg n, which makes the right-hand side of (7.8) asymptotically equal to n^{−Ω(log n)}. The number of balls ℓ₂ eligible for the second stage can be estimated as follows, this bound holding with probability 1 − n^{−Ω(log n)}:
$$\ell_2 \le \ell_1 - Y_1 \le \frac{4n}{\lg n} + \lg n\sqrt{n} \le \frac{5n}{\lg n}\,, \qquad (7.9)$$
for sufficiently large n.
In the second stage, we throw ℓ₂ balls into m ≥ n bins, with large probability. Conditional on the bound (7.9) holding, the probability that a given ball ends up as a singleton in a bin is
$$\left(1-\frac{1}{m}\right)^{\ell_2-1} \ge 1 - \frac{\ell_2-1}{m} \ge 1 - \frac{\ell_2}{m} \ge 1 - \frac{5}{\lg n}\,,$$
where we used Bernoulli's inequality. Let Y₂ be the number of singleton balls in the second stage. The expectation of Y₂ satisfies
$$\mathrm{E}[Y_2] \ge \ell_2\left(1-\frac{5}{\lg n}\right).$$
To estimate the deviation of Y₂ from its expected value E[Y₂], we again use the bounded-differences inequality, which specialized to this instance is as follows, for any d > 0:
$$\Pr\left(Y_2 \le \mathrm{E}[Y_2] - d\sqrt{\ell_2}\right) \le \exp(-d^2/8)\,. \qquad (7.10)$$
We again take d = lg n, which makes the right-hand side of (7.10) asymptotically equal to n^{−Ω(log n)}. The number of balls ℓ₃ eligible for the third stage can be bounded from above as follows, the bound holding with probability 1 − n^{−Ω(log n)}:
$$\ell_3 \le \frac{5\ell_2}{\lg n} + \lg n\sqrt{\ell_2} = \frac{5\ell_2}{\lg n}\left(1+\frac{\lg^2 n}{5\sqrt{\ell_2}}\right) \le \frac{26n}{\lg^2 n}\,, \qquad (7.11)$$
for sufficiently large n.
Next, we generalize these estimates. In stages i, for i ≥ 2, among the first O(log n) ones, we throw balls into m ≥ n bins, with large probability. Let ℓ_i be the number of balls eligible for such a stage i. We show by induction that ℓ_i, for i ≥ 3, can be estimated as follows:
$$\ell_i \le \frac{26n}{\lg^2 n}\; 2^{3-i}\,, \qquad (7.12)$$
with probability 1 − n^{−Ω(log n)}. The estimate (7.11) provides the base of the induction for i = 3. In the inductive step, we assume (7.12), and consider what happens during stage i ≥ 3 in order to estimate the number of balls eligible for the next stage i + 1.
In stage i, we throw ℓ_i balls into m ≥ n bins, with large probability. Conditional on the bound (7.12), the probability that a given ball ends up single in a bin is
$$\left(1-\frac{1}{m}\right)^{\ell_i-1} \ge 1 - \frac{\ell_i}{m} \ge 1 - \frac{26\cdot 2^{3-i}}{\lg^2 n}\,,$$
by the inductive assumption, where we also used Bernoulli's inequality. If Y_i is the number of singleton balls in stage i, then its expectation E[Y_i] satisfies
$$\mathrm{E}[Y_i] \ge \ell_i\left(1-\frac{26\cdot 2^{3-i}}{\lg^2 n}\right). \qquad (7.13)$$
To estimate the deviation of Y_i from its expected value E[Y_i], we again use the bounded-differences inequality, which specialized to this instance is as follows, for any d > 0:
$$\Pr\left(Y_i \le \mathrm{E}[Y_i] - d\sqrt{\ell_i}\right) \le \exp(-d^2/8)\,. \qquad (7.14)$$
We take d = lg n, which makes the right-hand side of (7.14) asymptotically equal to n^{−Ω(log n)}. The number of balls ℓ_{i+1} eligible for the next stage i + 1 can be estimated from above in the following way, the estimate holding with probability 1 − n^{−Ω(log n)}:
$$\ell_{i+1} \le \frac{26\cdot 2^{3-i}}{\lg^2 n}\,\ell_i + \lg n\sqrt{\ell_i} \le \frac{26\cdot 2^{3-i}}{\lg^2 n}\cdot\frac{26n\cdot 2^{3-i}}{\lg^2 n} + \sqrt{26n}\cdot 2^{(3-i)/2} \le \frac{26n}{\lg^2 n}\; 2^{3-(i+1)}\,,$$
for sufficiently large n that does not depend on i. For the event Y_i ≤ E[Y_i] − d√ℓ_i in the estimate (7.14) to be meaningful, it is sufficient that the following estimate holds:
$$\lg n\,\sqrt{\ell_i} = o\bigl(\mathrm{E}[Y_i]\bigr)\,.$$
This is the case as long as ℓ_i ≥ lg³ n, because E[Y_i] = ℓ_i(1 + o(1)) by (7.13).
To summarize at this point, as long as ℓ_i is sufficiently large, that is, ℓ_i ≥ lg³ n, the number of eligible balls decreases by at least a factor of 2 in each stage, with probability at least 1 − n^{−Ω(log n)}. It follows that the total number of eligible balls, summed over these stages, is O(n) with this probability.
Algorithm CommonBoundedMC

repeat
    initialize LastName ← 0
    (size, number-of-bins) ← EstimateSize
    for i ← 1 to lg size do
        ExtendNames
    if not CollisionDetected then return
Figure 7.5: A pseudocode for a processor v of a Common PRAM, where there is a constant number of shared memory cells. Procedures EstimateSize and ExtendNames have their pseudocodes in Figures 7.3 and 7.4, respectively. The variables LastName and CollisionDetected are shared.
After at most lg n such stages, the number of balls becomes at most lg³ n with probability 1 − n^{−Ω(log n)}. It remains to consider the stages when ℓ_i ≤ lg³ n, so that we throw at most lg³ n balls into at least n bins. They all end up in singleton bins with a probability that is at least
$$\left(\frac{n-\lg^3 n}{n}\right)^{\lg^3 n} \ge \left(1-\frac{\lg^3 n}{n}\right)^{\lg^3 n} \ge 1 - \frac{\lg^6 n}{n}\,,$$
by Bernoulli's inequality. So the probability of a collision is at most lg⁶ n / n. One stage without any collision terminates the process. If we repeat such stages lg n times, without even removing singleton balls, then the probability of collisions occurring in all these stages is at most
$$\left(\frac{\lg^6 n}{n}\right)^{\lg n} = n^{-\Omega(\log n)}\,.$$
The number of eligible balls summed over these final stages is only at most lg⁷ n = o(n). The following theorem summarizes the performance of algorithm CommonBoundedMC (see the pseudocode in Figure 7.5) as a Monte Carlo algorithm.
Theorem 10 Algorithm CommonBoundedMC terminates almost surely. For each α > 0 there exist β > 0 and c > 0 such that the algorithm assigns unique names, works in time at most cn ln n, and uses at most cn ln n random bits, each of these properties holding with probability at least 1 − n^{−α}.
Proof: One iteration of the main repeat-loop suffices to assign names with probability 1 − n^{−Ω(log n)}, by Lemma 20. This means that the probability of not terminating by the i-th iteration is at most (n^{−Ω(log n)})^i, which converges to 0 as i grows to infinity.
The algorithm returns duplicate names only when a collision occurs that is not detected by procedure VerifyCollision. For a given multiple bin, one iteration of this procedure does not detect the collision with probability at most 1/2, by Lemma 1. Therefore β lg size iterations do not detect the collision with probability at most 2^{−β lg size} ≤ n^{−β}, as size ≥ n by Lemma 19. The number of nonempty bins ever tested is at most dn, for some constant d > 0, by Lemma 20, with the suitably large probability. Applying the union bound results in the estimate n^{−α} on the probability of error for sufficiently large β.
The duration of an iteration of the inner for-loop is either constant, in which case we call it short, or it takes time O(log size), in which case we call it long. First, we estimate the total time spent on short iterations. This time in the first iteration of the inner for-loop is proportional to number-of-bins returned by procedure EstimateSize, which is at most 6n lg(6n), by Lemma 19. Each of the subsequent iterations takes time proportional to size, which is at most 6n, again by Lemma 19. We obtain that the total number of short iterations is O(n log n) in the worst case. Next, we estimate the total time spent on long iterations. One such iteration takes time proportional to lg size, which is at most lg(6n) with certainty. The number of such iterations is at most dn with probability 1 − n^{−Ω(log n)}, for some constant d > 0, by Lemma 20. We obtain that the total time of long iterations is O(n log n), with the correspondingly large probability. Combining the estimates for short and long iterations, we obtain O(n log n) as a bound on the time of one iteration of the main repeat-loop. One such iteration suffices with probability 1 − n^{−Ω(log n)}, by Lemma 20.
Throwing one ball uses O(log n) random bits, by Lemma 19. The number of throws is O(n) with the suitably large probability, by Lemma 20.
Algorithm CommonBoundedMC is optimal with respect to the following performance metrics: the expected time O(n log n), by Theorem 1, the number of random bits O(n log n), by Proposition 1, and the probability of error n^{−Ω(1)}, by Proposition 3.
7.4 Common with Unbounded Memory
We consider naming on a Common PRAM in the case when the amount of shared memory is unbounded. The algorithm we propose, called CommonUnboundedMC, is similar to algorithm CommonBoundedMC in Section 7.3, in that it involves a randomized experiment to estimate the number of processors of the PRAM. Such an experiment is then followed by repeatedly throwing balls into bins, testing for collisions, and throwing again if a collision is detected, until eventually no collisions are detected.
Algorithm CommonUnboundedMC has its pseudocode given in Figure 7.7. The algorithm is structured as a repeat-loop. An iteration starts by invoking procedure GaugeSizeMC, whose pseudocode is in Figure 7.6. This procedure returns size as an estimate of the number of processors n. Next, a processor chooses randomly a bin in the range [1, 3·size]. Then it keeps verifying for collisions β lg size times, in such a manner that when a collision is detected, a new bin is selected from the same range. After such β lg size verifications and possible new selections of bins, another β lg size verifications follow, but without changing the selected bins. When no collision is detected in the second segment of β lg size verifications, then this terminates the repeat-loop, which is followed by assigning to each processor the rank of its selected bin, by a prefix-like computation. If a collision is detected in the second segment of β lg size verifications, then this starts another iteration of the main repeat-loop.
Procedure GaugeSizeMC returns an estimate of the number n of processors, of the order of 2^k for some positive integer k. It operates by trying various values of k and, for a considered k, by throwing n balls into 2^k bins and next counting how many bins contain balls. Such counting is performed by a prefix-like computation, whose pseudocode is omitted in Figure 7.6. The additional parameter β > 0 is a number that affects the probability of underestimating n.
The way in which the selection of the numbers k is performed is controlled by a function r(k), which is a parameter. We will consider two instantiations of this function:
Procedure GaugeSizeMC

k ← 1
repeat
    k ← r(k)
    bin_v ← random integer in [1, 2^k]
until the number of selected values of variable bin is ≤ 2^k/β
return ⌈2^{k+1}/β⌉
Figure 7.6: A pseudocode for a processor v of a Common PRAM, where the number of shared memory cells is unbounded. The constant β > 0 is the same parameter as in Figure 7.7, and an increasing function r(k) is also a parameter.
the function r(k) = k + 1 and the function r(k) = 2k.
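As an illustration, the following Python sketch (ours, not the thesis's code; the driver values of n and β are assumptions) simulates procedure GaugeSizeMC for both instantiations of r(k); the returned estimates can be compared against the guarantees of Lemma 21 below.

```python
import math
import random

def gauge_size_mc(n, beta, r, seed=0):
    """Sketch of procedure GaugeSizeMC: throw n balls into 2^k bins, growing
    k by r(k), until at most 2^k/beta bins are occupied; then return the
    estimate ceil(2^(k+1)/beta)."""
    rng = random.Random(seed)
    k = 1
    while True:
        k = r(k)
        occupied = len({rng.randrange(2**k) for _ in range(n)})
        if occupied <= 2**k / beta:
            return math.ceil(2**(k + 1) / beta)

n, beta = 500, 4
size_lin = gauge_size_mc(n, beta, r=lambda k: k + 1)   # r(k) = k + 1
size_exp = gauge_size_mc(n, beta, r=lambda k: 2 * k)   # r(k) = 2k
print(size_lin, size_exp)
```

With r(k) = k + 1 the estimate stays within a small constant factor of n, while with r(k) = 2k the last doubling of k can square the range, which is the overshoot quantified in Lemma 21.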
Lemma 21 If r(k) = k + 1 then the value of size as returned by GaugeSizeMC satisfies size ≤ 2n with certainty, and the inequality size ≥ n holds with probability 1 − β^{−n/3}.

If r(k) = 2k then the value of size as returned by GaugeSizeMC satisfies size ≤ 2βn² with certainty, and size ≥ βn²/2 with probability 1 − β^{−n/3}.
Proof: We model the procedure's executions by an experiment of throwing n balls into 2^k bins. If the parameter function r(k) is r(k) = k + 1, then we consider all possible consecutive values of k starting from k = 2, such that k = i + 1 in the i-th iteration of the repeat-loop. If the parameter r(k) is the function r(k) = 2k, then k takes on only the powers of 2.

There are at most n occupied bins in any such experiment. Therefore, the procedure returns by the time the inequality 2^k/β ≥ n holds for the k determining the range of bins. It follows that if r(k) = k + 1, then the returned value ⌈2^{k+1}/β⌉ is at most 2n. If r(k) = 2k, then the worst error of estimation occurs when 2^i/β = n − 1 for some i that is a power of 2. Then the returned value is 2^{2i+1}/β = 2(β(n−1))²/β = 2β(n−1)², which is at most 2βn².
Given 2^k bins, we estimate the probability that the number of occupied bins is at most 2^k/β. It is at most
$$\binom{2^k}{2^k/\beta}\left(\frac{2^k/\beta}{2^k}\right)^{n} = \binom{2^k}{2^k/\beta}\,\beta^{-n} \le (e\beta)^{2^k/\beta}\,\beta^{-n}\,.$$
Next, we identify a range of values of k for which this probability is exponentially close to 0 with respect to n. To this end, let 0 < p < 1 and let us consider the inequality
$$(e\beta)^{2^k/\beta}\,\beta^{-n} \le p^n\,. \qquad (7.15)$$
It is equivalent to the following one,
$$\frac{2^k}{\beta}\left(1+\ln\beta\right) - n\ln\beta \le n\ln p\,,$$
obtained by taking logarithms of both sides. This in turn is equivalent to
$$\frac{2^k}{\beta}\left(1+\ln\beta\right) \le n\left(\ln\beta - \ln\frac{1}{p}\right). \qquad (7.16)$$
Let us choose p = β^{−1/2} in (7.16). Then (7.15) specialized to this particular p is equivalent to the inequality (2^k/β)(1 + ln β) ≤ (n ln β)/2. This in turn leads to the estimate
$$2^k \le \frac{n\beta\ln\beta}{2(1+\ln\beta)} \le \frac{n\beta}{2}\,,$$
which means 2^{k+1}/β ≤ n. When k satisfies this inequality, then the probability of returning is at most β^{−n/2}. There are O(log n) such values of k considered by the procedure, so it returns for one of them with probability at most
$$O(\log n)\cdot\beta^{-n/2} \le \beta^{-n/3}\,,$$
for sufficiently large n.
Therefore, with probability at least 1 − β^{−n/3}, the returned value ⌈2^{k+1}/β⌉ is at least as large as the one determined by the first considered k that satisfies 2^{k+1}/β ≥ n. If r(k) = k + 1, then all the possible exponents k are considered, so the returned value ⌈2^{k+1}/β⌉ is at least n with probability 1 − β^{−n/3}. If r(k) = 2k, then the worst error of estimating n occurs when 2^{i+1}/β = n − 1 for some i that is a power of 2. Then the returned value is 2^{2i+1}/β = 2(β(n−1)/2)²/β = β(n−1)²/2, which is at least βn²/2 up to a lower-order term, this occurring with probability 1 − β^{−n/3}.
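As a numeric aside (ours, not part of the thesis; the values of β and n are arbitrary assumptions), the algebra above can be spot-checked: whenever 2^k is below the threshold nβ ln β / (2(1 + ln β)) derived from (7.16), the quantity (eβ)^{2^k/β}·β^{−n} is indeed at most β^{−n/2}.

```python
import math

def log_prob_bound(k, beta, n):
    # natural log of (e*beta)^(2^k/beta) * beta^(-n)
    return (2**k / beta) * (1 + math.log(beta)) - n * math.log(beta)

beta, n = 4.0, 200
threshold = n * beta * math.log(beta) / (2 * (1 + math.log(beta)))
ok = all(
    log_prob_bound(k, beta, n) <= -(n / 2) * math.log(beta)
    for k in range(1, 30)
    if 2**k <= threshold
)
print(ok)
```

Comparing logarithms avoids overflow; the check is exactly the specialization of (7.15) to p = β^{−1/2}.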
Algorithm CommonUnboundedMC

repeat
    size ← GaugeSizeMC
    bin_v ← random integer in [1, 3·size]
    for j ← 1 to β lg size do
        if VerifyCollision(bin_v) then
            bin_v ← random integer in [1, 3·size]
    CollisionDetected ← false
    for j ← 1 to β lg size do
        if VerifyCollision(bin_v) then CollisionDetected ← true
until not CollisionDetected
name_v ← the rank of bin_v among the selected bins
Figure 7.7: A pseudocode for a processor v of a Common PRAM, where the number of shared memory cells is unbounded. The constant β > 0 is a parameter impacting the probability of error. The private variable name_v stores the acquired name.
We discuss the performance of algorithm CommonUnboundedMC (see the pseudocode in Figure 7.7) by referring to the analysis of the related algorithm CommonUnboundedLV given in Section 6.4. We consider a β-process with verifications, which is defined as follows. The process proceeds through stages. The first stage starts by placing n balls into 3·size bins. In each of the subsequent stages, for each multiple bin and for each ball in such a bin, we perform a Bernoulli trial with probability 1/2 of success, which represents the outcome of procedure VerifyCollision. A success in a trial is referred to as a positive verification; otherwise it is a negative one. If at least one positive verification occurs for a ball in a multiple bin, then all the balls in this bin are relocated in this stage to bins selected uniformly at random and independently for each such ball; otherwise the balls stay put in this bin until the next stage. The process terminates when all balls are singletons.
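The β-process with verifications can be simulated directly. The following Python sketch is our illustration; the choice n = 300 with 3n bins is an assumption standing in for the range [1, 3·size].

```python
import random

def beta_process(n, num_bins, seed=0):
    """Sketch of the beta-process with verifications: a multiple bin is
    vacated only when some ball in it gets a positive (probability 1/2)
    verification; vacated balls are re-thrown uniformly at random."""
    rng = random.Random(seed)
    bins = [rng.randrange(num_bins) for _ in range(n)]  # initial placement
    throws, stages = n, 0
    while True:
        counts = {}
        for b in bins:
            counts[b] = counts.get(b, 0) + 1
        multiple = {b for b, c in counts.items() if c >= 2}
        if not multiple:
            return stages, throws
        stages += 1
        vacated = {b for b in multiple
                   if any(rng.random() < 0.5 for _ in range(counts[b]))}
        for i, b in enumerate(bins):
            if b in vacated:
                bins[i] = rng.randrange(num_bins)
                throws += 1

n = 300
stages, throws = beta_process(n, num_bins=3 * n)
print(stages, throws)
```

The number of stages grows only logarithmically in n and the total number of throws stays linear, which matches the guarantees of Lemma 22 below.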
Lemma 22 For any number α > 0 there exists β > 0 such that the β-process with verifications terminates within β lg n stages, with all of them comprising a total of O(n) ball throws, with probability at least 1 − n^{−α}.
Proof: We use the respective Lemma 11 in Section 6.4. The constant 3 determining our β-process with verifications corresponds to 1 + β in Section 6.4. The corresponding β-process with verifications considered in Section 6.4 is defined by referring to a known n. We use the approximation size instead, which is at least as large as n with probability 1 − β^{−n/3}, by Lemma 21 just proved. By Section 6.4, our β-process with verifications does not terminate within β lg n stages, when size ≥ n, with probability at most n^{−2α}, and the inequality size ≥ n does not hold with probability at most β^{−n/3}. Therefore the conclusion we want to prove does not hold with probability at most n^{−2α} + β^{−n/3}, which is at most n^{−α} for sufficiently large n.
The following theorem summarizes the performance of algorithm CommonUnboundedMC (see the pseudocode in Figure 7.7) as a Monte Carlo algorithm. Its proof relies on mapping an execution of the β-process with verifications onto executions of algorithm CommonUnboundedMC in a natural manner.
Theorem 11 Algorithm CommonUnboundedMC terminates almost surely, for sufficiently large β. For each α > 0 there exist β > 0 and c > 0 such that the algorithm assigns unique names and has the following additional properties with probability 1 − n^{−α}. If r(k) = k + 1, then at most cn memory cells are ever needed, at most cn ln² n random bits are ever generated, and the algorithm terminates in time O(log² n). If r(k) = 2k, then at most cn² memory cells are ever needed, at most cn ln n random bits are ever generated, and the algorithm terminates in time O(log n).
Proof: For a given α > 0, let us take the β that exists by Lemma 22. When the β-process with verifications terminates, then this models assigning unique names by the algorithm. It follows that one iteration of the repeat-loop results in the algorithm terminating with proper names assigned with probability 1 − n^{−α}. One iteration of the main repeat-loop does not result in termination with probability at most n^{−α}, so i iterations are not sufficient to terminate with probability at most n^{−iα}. This converges to 0 with increasing i, so the algorithm terminates almost surely.
The performance metrics rely mostly on Lemma 21. We consider two cases, depending on which function r(k) is used.
If r(k) = k + 1, then procedure GaugeSizeMC considers all the consecutive values of k up to lg n, and for each such k, throwing a ball requires k random bits. We obtain that procedure GaugeSizeMC uses O(n log² n) random bits. Similarly, computing the number of selected values in an iteration of the main repeat-loop of this procedure takes time O(k), for the corresponding k, so this procedure takes O(log² n) time. The value of size satisfies size ≤ 2n with certainty. Therefore, O(n) memory registers are ever needed and one throw of a ball uses O(log n) random bits, after size has been computed. It follows that one iteration of the main repeat-loop of the algorithm, after procedure GaugeSizeMC has been completed, uses O(n log n) random bits, by Lemmas 21 and 22, and takes O(log n) time. Since one iteration of the main repeat-loop suffices with probability 1 − n^{−α}, the overall time is dominated by the time performance of procedure GaugeSizeMC.
If r(k) = 2k, then procedure GaugeSizeMC considers only powers of 2 as values of k, up to O(log n), and for each such k, throwing a ball requires k random bits. Since the values of k form a geometric progression, procedure GaugeSizeMC uses O(log n) random bits per processor. Similarly, computing the number of selected values in an iteration of the main repeat-loop of this procedure takes time O(k), for the corresponding k that increase geometrically, so this procedure takes O(log n) time. The value of size satisfies size ≤ 2βn² with certainty. By Lemma 21, O(n²) memory registers are ever needed, so one throw of a ball uses O(log n) random bits. One iteration of the main repeat-loop, after procedure GaugeSizeMC has been completed, uses O(n log n) random bits, by Lemmas 21 and 22, and takes O(log n) time.
The instantiations of algorithm CommonUnboundedMC are close to optimality with respect to some of the performance metrics we consider, depending on whether r(k) = k + 1 or r(k) = 2k. If r(k) = k + 1, then the algorithm's use of shared memory would be optimal if its time were O(log n), by Theorem 2, but it misses space optimality by at most a logarithmic factor, since the algorithm's time is O(log² n). Similarly, for this case of r(k) = k + 1, the number of random bits ever generated, O(n log² n), misses optimality by at most a logarithmic factor, by Proposition 1. In the other case of r(k) = 2k, the expected time O(log n) is optimal, by Theorem 3, the expected number of random bits O(n log n) is optimal, by Proposition 1, and the probability of error n^{−Ω(1)} is optimal, by Proposition 3, but the amount of used shared memory misses optimality by at most a polynomial factor, by Theorem 3.
7.5 Conclusion
We considered four variants of the naming problem for an anonymous PRAM when the number of processors n is unknown and developed Monte Carlo naming algorithms for each of them. The two algorithms for a bounded number of shared registers are provably optimal with respect to the following three performance metrics: expected time, expected number of generated random bits, and probability of error.
8. NAMING A CHANNEL WITH BEEPS
In this chapter, we consider an anonymous channel with beeping. We present a Las Vegas and a Monte Carlo naming algorithm by which names can be assigned to the anonymous stations.
8.1 A Las Vegas Algorithm
We give a Las Vegas naming algorithm for the case when n is known. The idea is to have stations choose rounds to beep in from a segment of integers. As a convenient probabilistic interpretation, these integers are interpreted as bins, and after selecting a bin a ball is placed in it. The algorithm proceeds by considering all the consecutive bins. First, a bin is verified to be nonempty by making the owners of the balls in the bin beep. When no beep is heard, then the next bin is considered; otherwise the nonempty bin is verified for collisions. Such a verification is performed by O(log n) consecutive calls of procedure DetectCollision. When a collision is not detected, then the stations that placed their balls in this bin assign themselves the next available name; otherwise the stations whose balls are in this bin place their balls in a new set of bins. When each station has a name assigned, we verify whether the maximum assigned name is n. If this is the case, then the algorithm terminates; otherwise we repeat. The algorithm is called BeepNamingLV; its pseudocode is in Figure 8.1.
Algorithm BeepNamingLV is analyzed by modeling its executions by a process of throwing balls into bins, which we call the ball process. The process proceeds through stages. There are n balls in the first stage. When a stage begins and there are some i balls eligible for the stage, then the number of used bins is i lg n. Each ball is thrown into a randomly selected bin. Next, balls that are singleton in their bins are removed, and the remaining balls that participated in collisions advance to the next stage. The process terminates when no eligible balls remain.
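The following Python sketch (ours, not the thesis's code) simulates this ball process; each stage with k eligible balls draws from k·lg n fresh bins, and the total number of throws can be compared with the bound of Lemma 23 below.

```python
import math
import random

def beep_ball_process(n, seed=0):
    """Sketch of the ball process behind BeepNamingLV: a stage with k
    eligible balls uses k*lg(n) fresh bins; singletons are removed and
    collided balls advance to the next stage."""
    rng = random.Random(seed)
    lg_n = max(1, math.ceil(math.log2(n)))
    balls, throws, stages = n, 0, 0
    while balls > 0:
        stages += 1
        m = balls * lg_n
        counts = {}
        for _ in range(balls):
            b = rng.randrange(m)
            counts[b] = counts.get(b, 0) + 1
        throws += balls
        balls = sum(c for c in counts.values() if c >= 2)
    return stages, throws

n = 1000
stages, throws = beep_ball_process(n)
print(stages, throws)
```

Because each stage gives every ball a logarithmic-factor surplus of bins, only about a 1/lg n fraction of balls collides per stage, so the throw counts decay geometrically.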
Lemma 23 The number of times a ball is thrown into a bin during an execution of the ball process that starts with n balls is at most 3n with probability at least 1 − e^{−n/4}.
Proof: In each stage, we throw some k balls into at least k lg n bins. The probability that
Algorithm BeepNamingLV

repeat
    counter ← 0 ; left ← 1 ; right ← n lg n ; name_v ← null
    repeat
        slot_v ← random number in the interval [left, right]
        for i ← left to right do
            if i = slot_v then beep
            if a beep was just heard then
                collision ← false
                for j ← 1 to β lg n do
                    if DetectCollision then collision ← true
                if not collision then
                    counter ← counter + 1
                    if i = slot_v then name_v ← counter
        if name_v = null then beep
        if a beep was just heard then
            left ← counter ; right ← (n − counter) lg n
    until no beep was heard in the previous round
until counter = n
Figure 8.1: A pseudocode for a station v. The number of stations n is known. The constant β > 1 is a parameter determined in the analysis. Procedure DetectCollision has its pseudocode in Figure 4.2. The variable name_v is to store the assigned identifier.
a given ball ends up singleton in a bin is at least
$$1 - \frac{k}{k\lg n} = 1 - \frac{1}{\lg n}\,,$$
which we denote as p. A ball is thrown repeatedly in consecutive iterations until it lands single in a bin. Our immediate concern is the number of trials to have all balls as singletons in their bins.
Suppose that we perform some m independent Bernoulli trials, each with probability p of success, and let X be the number of successes. We show next that m = O(n) suffices with large probability to have the inequality X > n.
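This claim can be illustrated numerically. The sketch below is ours, not the thesis's argument; the choice of m = 3n trials matches the 3n throws in Lemma 23 but is otherwise an assumption.

```python
import math
import random

def count_successes(n, trials, seed=0):
    """Count successes among independent Bernoulli trials with success
    probability p = 1 - 1/lg(n)."""
    rng = random.Random(seed)
    p = 1.0 - 1.0 / math.log2(n)
    return sum(1 for _ in range(trials) if rng.random() < p)

n = 1000
X = count_successes(n, trials=3 * n)
print(X)
```

Since the expected number of successes is about 3n(1 − 1/lg n), far above n, a Chernoff-type bound makes X < n exponentially unlikely, which is the direction the proof takes next.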

1.INTRODUCTION Weconsideradistributedsysteminwhichsome n processorscommunicateusingreadwritesharedmemory.Itisassumedthatoperationsperformedonsharedmemoryoccur synchronously,inthatexecutionsofalgorithmsarestructuredassequencesofgloballysynchronizedrounds.Eachprocessorisanindependentrandomaccessmachinewithitsown privatememory.SuchasystemisknownasasynchronousParallelRandomAccessMachine PRAM.Weconsidertheproblemofassigningdistinctintegernamesfromtheinterval[1 ;n ] totheprocessorsofaPRAM,whenoriginallytheprocessorsdonothavedistinctidentiers. Theproblemtoassignuniquenamestoanonymousprocessesindistributedsystems canbeconsideredasastageineitherbuildingsuchsystemsormakingthemfullyoperational.Correspondingly,thismaybecategorizedaseitheranarchitecturalchallengeoran algorithmicone.Forexample,tightlysynchronizedmessagepassingsystemsaretypically consideredundertheassumptionthatprocessorsareequippedwithuniqueidentiersfrom acontiguoussegmentofintegers.Thisisbecausesuchsystemsimposestrongdemandson thearchitectureandthetaskofassigningidentierstoprocessorsismodestwhencompared toprovidingsynchrony.Similarly,whensynchronousparallelmachinesaredesigned,then processorsmaybeidentiedbyhowtheyareattachedtotheunderlyingcommunication network.Incontrasttothat,PRAMisavirtualmodelinwhichprocessorscommunicate viasharedmemory;seeanexpositionofPRAMasaprogrammingenvironmentgivenby Kelleretal.[62].Thismodeldoesnotassumeanyrelationbetweenthesharedmemoryand processorsthatidentiesindividualprocessors. Distributedsystemswithsharedreadwriteregistersareusuallyconsideredtobeasynchronous.Synchronyinsuchenvironmentscanbeaddedbysimulationsratherthanbya supportivearchitectureoranunderlyingcommunicationnetwork.Processesdonotneedto behardwarenodes,instead,theycanbevirtualcomputingagents.Whenasynchronous PRAMisconsidered,asobtainedbyasimulation,thentheunderlyingsystemarchitecture doesnotfacilitateidentifyingprocessors,andsowedonotnecessarilyexpectthatprocessors 1
PAGE 10
areequippedwithdistinctidentiersinthebeginningofasimulation. WeviewPRAMasanabstractconstructwhichprovidesadistributedenvironmentto developalgorithmswithmultipleagents/processorsworkingconcurrently;seeVishkin[89] foracomprehensiveexpositionofPRAMasavehiclefacilitatingparallelprogramingand harnessingthepowerofmulticorecomputerarchitectures.Assigningnamestoprocessors bythemselvesinadistributedmannerisaplausiblestageinanalgorithmicdevelopmentof suchenvironments,asitcannotbedelegatedtothestageofbuildinghardwareofaparallel machine. Whenprocessorsofadistributed/parallelsystemareanonymousthenthetaskofassigningauniqueidentiertoeachprocessorisakeysteptomakethesystemfullyoperational, becausenamesareneededforexecutingdeterministicalgorithms.Weconsider naming tobe thetaskofassigninguniqueintegersintherange[1 ;n ]toagivensetof n processorsastheir names.Distributedalgorithmsassigningnamestoanonymousprocessorsarecalled naming inthisthesis.Weassumethatanonymousprocessorsdonothaveanyfeaturesfacilitating identicationordistinguishing. Wedealwithtwokindsofrandomizednamingalgorithms,calledMonteCarloand LasVegas,whicharedenedasfollows.Arandomizedalgorithmis LasVegas whenit terminatesalmostsurelyandthealgorithmreturnsacorrectoutputupontermination.A randomizedalgorithmis MonteCarlo whenitterminatesalmostsurelyandanincorrect outputmaybeproducedupontermination,buttheprobabilityoferrorconvergestozero withthesizeofinputgrowingunbounded.Thenamingalgorithmswedevelophavequalities thatdependonwhether n isknownornot,accordingtothefollowingsimplerule:each algorithmforaknown n isLasVegaswhileeachalgorithmforanunknown n isMonte Carlo.OurMonteCarloalgorithmshavetheprobabilityoferrorconvergingtozerowitha ratethatispolynomialin n .Moreover,whenincorrectduplicatenamesareassigned,the setofintegersusedasnamesmakesacontiguoussegmentstartingatthesmallestname1. Wesaythataparameterofanalgorithmicproblemis known whenitcanbeusedina 2
code of an algorithm. We consider two groups of naming problems for a PRAM, depending on whether the number of processors n is known or not.

Additionally, we consider two categories of naming problems depending on how much shared memory is available. In one case, there is a constant number of memory cells, which means that the amount of memory is independent of n but as large as needed for algorithm design. In the other case, the number of shared memory cells is unbounded, but how much is used depends on an algorithm and n. When there is an unbounded amount of memory, then O(n) memory cells actually suffice for the algorithms we develop. We also categorize naming problems depending on whether it is an Arbitrary PRAM (distinct values may be concurrently attempted to be written into a register, and an arbitrary one of them gets written) or a Common PRAM variant (only equal values may be concurrently attempted to be written into a register).

Next, we investigate an anonymous channel with beeping. There are some n stations attached to the channel that are devoid of any identifiers. Communication proceeds in synchronous rounds. All the stations start together in the same round. The channel provides a binary feedback to all the attached stations: when no stations transmit then nothing is sensed on the communication medium, and when some station does transmit then every station detects a beep.

A beeping channel resembles multiple access channels, in that it can be interpreted as a single-hop radio network. The difference between the two models is in the feedback provided by each kind of channel. The traditional multiple access channel with collision detection provides the following ternary feedback: silence occurs when no station transmits, a message is heard when exactly one station transmits, and a collision is produced by multiple stations transmitting simultaneously, which results in no message heard and can be detected by carrier sensing as distinct from silence. Multiple access channels also come in a variant without collision detection. In such channels the binary feedback is as follows: when exactly one station transmits then the transmitted message is heard by every station, and otherwise,
when either no station or multiple stations transmit, then this results in silence. A channel with beeping has its communication capabilities restricted only to carrier sensing, without even the functionality of transmitting specific bits as messages. The only apparent mode of exchanging information on such a synchronous channel with beeping is to suitably encode it by sequences of beeps and silences.

Modeling communication by a mechanism as limited as beeping has been motivated by diverse aspects of communication and distributed computing. Beeping provides a detection of collision on a transmitting medium by sensing it. Communication by only carrier sensing can be placed in a general context of investigating wireless communication on the physical level and modeling interference of concurrent transmissions, of which the signal-to-interference-plus-noise ratio (SINR) model is among the most popular and well studied; see [54, 61, 85]. Beeping is then a very limited mode of wireless communication, with feedback in the form of either interference or lack thereof. Another motivation comes from biological systems, in which agents exchange information in a distributed manner, while the environment severely restricts how such agents communicate; see [2, 78, 86]. Finally, communication with beeps belongs to the area of distributed computing by weak devices, where the involved agents have restricted computational and communication capabilities. In this context, the devices are modeled as finite-state machines that communicate asynchronously by exchanging states or messages from a finite alphabet. Examples of this approach include the "population protocols" model introduced by Angluin et al. [7], see also [9, 11, 73], and the "stone age" distributed computing model proposed by Emek and Wattenhofer [41].
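The three feedback regimes described above can be contrasted in a minimal round simulation. The sketch below is an illustration only (the function names are ours, not from the thesis); a round is represented by the list of messages the stations attempt to transmit in it:

```python
def beep_feedback(transmitters):
    # Beeping channel: binary feedback -- every station hears a beep
    # exactly when at least one station beeps in the round.
    return "beep" if transmitters else "silence"

def mac_with_collision_detection(messages):
    # Multiple access channel with collision detection: ternary feedback.
    if not messages:
        return "silence"
    if len(messages) == 1:
        return messages[0]       # the unique transmitted message is heard
    return "collision"           # sensed as distinct from silence

def mac_without_collision_detection(messages):
    # Multiple access channel without collision detection: binary feedback,
    # in which a collision is indistinguishable from silence.
    return messages[0] if len(messages) == 1 else "silence"
```

For instance, a round in which two stations transmit yields "collision", "silence", and "beep" under the three models, respectively.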
2. THE SUMMARY OF THE RESULTS

We consider randomized algorithms executed by anonymous processors that operate in a synchronous manner using read-write shared memory, with the goal to assign unique names to the processors. This problem is investigated in eight specific cases, depending on additional assumptions, and we give an algorithm for each case. The three independent assumptions regard the following: (1) the knowledge of n, (2) the amount of shared memory, and (3) the PRAM variant.

The Las Vegas algorithms have been submitted as a journal paper and are taken from [26]. The naming algorithms we give terminate with probability 1. These algorithms are Las Vegas for a known number of processors n, and otherwise they are Monte Carlo. All our algorithms use the optimum expected number O(n log n) of random bits. We show that naming algorithms with n processors and C > 0 shared memory cells need to operate in Ω(n/C) expected time on an Arbitrary PRAM, and in Ω(n log n / C) expected time on a Common PRAM. We show that any naming algorithm needs to work in the expected time Ω(log n); this bound matters when there is an unbounded supply of shared memory. Based on these facts, all our Las Vegas algorithms for the case of known n operate in the asymptotically optimum time, and when the amount of memory is unlimited, they use only an expected amount of space that is provably necessary. The list of the naming problems' specifications and the corresponding Las Vegas algorithms with their performance bounds is summarized in Table 2.1.

The Monte Carlo algorithms have been submitted as a journal paper and are taken from [27]. We show that a Monte Carlo naming algorithm that uses O(n log n) random bits has to have the property that it fails to assign unique names with probability that is n^{-O(1)}. All Monte Carlo algorithms that we give have the optimum polynomial probability of error. The list of the naming problems' specifications and the corresponding Monte Carlo algorithms with their performance bounds are summarized in Table 2.2.

A Las Vegas and a Monte Carlo naming algorithm for a beeping channel have been submitted as a journal paper and are taken from [28]. We considered assigning names to anonymous
Table 2.1: Four naming problems, as determined by the PRAM model and the available amount of shared memory, with the respective performance bounds of their solutions. When the number of shared memory cells is not a constant, then the given usage is the expected number of shared memory cells that are actually used.

PRAM Model | Memory       | Time       | Algorithm
Arbitrary  | O(1)         | O(n)       | ArbitraryConstantLV in Section 6.1
Arbitrary  | O(n / log n) | O(log n)   | ArbitraryUnboundedLV in Section 6.2
Common     | O(1)         | O(n log n) | CommonConstantLV in Section 6.3
Common     | O(n)         | O(log n)   | CommonUnboundedLV in Section 6.4

Table 2.2: Four naming problems, as determined by the PRAM model and the available amount of shared memory, with the respective performance bounds of their solutions as functions of the number of processors n. When time is marked as "polylog", this means that the algorithm comes in two variants, such that in one the expected time is O(log n) and the amount of used shared memory is suboptimal n^{O(1)}, and in the other the expected time is suboptimal O(log² n) but the amount of used shared memory misses optimality only by at most a logarithmic factor.

PRAM Model | Memory    | Time       | Algorithm
Arbitrary  | O(1)      | O(n)       | ArbitraryBoundedMC in Section 7.1
Arbitrary  | unbounded | polylog    | ArbitraryUnboundedMC in Section 7.2
Common     | O(1)      | O(n log n) | CommonBoundedMC in Section 7.3
Common     | unbounded | polylog    | CommonUnboundedMC in Section 7.4

stations attached to a channel that allows only beeps to be heard. We present a Las Vegas naming algorithm and a Monte Carlo algorithm, and show that these algorithms are provably optimal with respect to the number of used random bits O(n log n), the expected time O(n log n), and the probability of error.
3. PREVIOUS AND RELATED WORK

Here we survey the previous work on anonymous naming.

Lipton and Park [70] were the first to consider the naming problem in asynchronous shared memory systems. They studied naming in asynchronous distributed systems with read-write shared memory controlled by adaptive schedulers; they proposed a solution that terminates with positive probability, which can be made arbitrarily close to 1, assuming a known n. They developed a randomized algorithm that solves the naming problem. Their algorithm is not guaranteed to terminate; however, if it terminates, no two processors will obtain the same names. Once the processors have terminated, the given names comprise completely the set {1, 2, …, n}.

Their algorithm operates as follows. In the beginning of an execution, all processors initialize the contents of shared registers to zeros. Every processor randomly selects an integer i from {1, 2, …, n²}, where n is the number of processors. Then each processor writes 1 to the selected cell i. All processors repeat this procedure until there is at least one row which contains n 1's. Finally, the integer chosen by each processor is used as a name. The algorithm presented in [70] uses O(Ln²) bits and terminates in O(Ln²) time with probability 1 − c^{−L}, for some constant c > 1.

Teng [87] provided a randomized two-layer solution for the naming problem considering the same setting as Lipton and Park (asynchronous processors, and the algorithm works regardless of the initial content of shared memory), but his solution improved the failure probability and decreased the space to O(n log² n) shared bits, with probability at least 1 − 1/n^c for a constant c. The algorithm terminates in O(n log² n) time.

The author developed a simple algorithm for asynchronous systems with known n that is a trivial modification of Lipton and Park's algorithm [70]. Teng assumed that the n processors are divided into K groups with high probability, that is, each group has about n/K processors. Hence, he reduced the problem size from n to n/K. Then he used a technique similar to that of Lipton and Park for the smaller-size problem. The number of processors is unknown in
each group. Therefore, every processor checks if the sum of the number of 1's equals n in the maximum-size rows of each group. The author also observed that if n is unknown then no algorithm is guaranteed to terminate with correct names.

Lim and Park [69] showed that the naming problem can be solved in O(n) space; however, they used word operations instead of bit operations. The authors used a shared memory array indexed 1 through n, where n is the number of processors. The basic idea of their algorithm is as follows. At the beginning of the execution, all processors initialize the contents of shared registers to zeros. Then every processor randomly chooses a key and an ID, which corresponds to the index of a cell, and tries to store its key in the claimed cell. When more than one processor chooses the same cell to write its ID, the processor with the maximum key keeps the claimed ID, and the rest of the processors, with smaller keys, claim a new ID. If there are no zero entries in the array, i.e., each processor confirms its own ID, then the algorithm terminates. Note that their protocol can fail when more than one processor chooses the same ID and the same key. Additionally, the authors answered the open questions which Teng posed at the conclusion of [87].

Egecioglu and Singh [39] proposed a synchronous algorithm in which each processor repeatedly chooses a new random index value, with each selection made independently and uniformly at random, and sets the corresponding shared register to 1. Then it counts the number of ones. If the total number of ones equals n, then the processor exits the loop and assigns itself the ID of the index value. The expected termination time for the synchronous algorithm is O(n²). They also proposed a polynomial-time Las Vegas naming algorithm for asynchronous systems with oblivious scheduling of events, for a known n, under a weak shared read-write memory system. Intuitively, the idea of their algorithm is as follows. It uses K copies of an array.
Each processor chooses a random index value for each of the K copies of the array, rather than for a single array, and sets the selected register to 1. Then every processor reads all K arrays. Processors perform a write operation to each of the K copies of the arrays in ascending order but, afterwards, read all arrays in descending order. A processor keeps executing the algorithm until the total number of ones in a row equals the number of processors. Processors may exit from the execution after detecting a successful read, because of asynchrony. The processor that exits from the execution after a successful scan records the IDs of the succeeded row in its private memory. Because of asynchrony, the rest of the processors may not be able to execute a successful scan at the same time. If there is a successful read and the number of detected processors is less than K, the rest of the processors repeat the same sequence of steps on a different array, with argument n − 1, and so on. When a processor exits from the execution after a successful read, it waits for the rest of the processors to exit, and then they assign names to themselves in the range [1, n]. The authors also showed that symmetry cannot be broken if the exact number of processes is unknown. Moreover, they observed that the participation of every processor is necessary in order to terminate.
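The synchronous repeated-throwing scheme of Egecioglu and Singh can be simulated directly. The sketch below is an illustration, not the authors' code: it assumes a fresh array of m > n single-bit cells per attempt, and, once all n choices are distinct, it lets each processor take the rank of its chosen cell among the occupied cells as its name, so that the names fill [1, n]:

```python
import random

def synchronous_naming(n, m, rng):
    """Simulate repeated throwing: in every round each of n anonymous
    processors picks a random cell in [0, m) and writes 1; the processors
    stop once exactly n cells hold a 1 (i.e., all picks were distinct)."""
    rounds = 0
    while True:
        rounds += 1
        memory = [0] * m                 # a fresh array of single-bit cells
        picks = [rng.randrange(m) for _ in range(n)]
        for p in picks:
            memory[p] = 1
        if sum(memory) == n:             # no collision occurred in this round
            occupied = sorted(set(picks))
            # ranks of the chosen cells serve as names in [1, n]
            return rounds, sorted(occupied.index(p) + 1 for p in picks)
```

With m much larger than n a collision-free round arrives quickly; the exponential-time behavior discussed in Chapter 4 arises when m is only a constant multiple of n.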
ofonesinarowequalstothenumberofprocessors.Processorsmayexitfromtheexecution afterdetectingasuccessfulreadbecauseofasynchrony.Theprocessorthatexitfromthe executionafterasuccessfulscanrecordstheIDsofthesucceededrowtoitsprivatememory. Becauseofasynchrony,therestoftheprocessorscannotbeabletoexecuteasuccessfulscan atthesametime.Ifthereisasuccessfulreadandthenumberofdetectedprocessorsare lessthan K ,therestofprocessorsrepeatthesamesequenceofstepsonadierentarray, withargument n )]TJ/F15 11.9552 Tf 11.971 0 Td [(1andsoon.Whenaprocessorexitsfromtheexecutionaftersuccessful readwaitsfortherestoftheprocessorstoexitandthentheyassignnamestothemselves inarangeof[1 ;n ].Theauthorsalsoshowedthatsymmetrycannotbebrokeniftheexact numberofprocessesisunknown.Moreover,theyobservedthattheparticipationofevery processorisnecessaryinordertoterminate. Kuttenetal.[68]consideredthenaminginasynchronoussystemsofsharedreadwrite memory.TheygaveaLasVegasalgorithmforanobliviousschedulerforthecaseofknown n whichworksintheexpectedtime O log n whileusing O n sharedregisters,andalso showedthatalogarithmictimeisrequiredtoassignnamestoanonymousprocesses.Authors providedanonterminatingdynamicalgorithm,whereprocessesmaystopandstarttaking stepsduringtheexecutionandthenaddedastaticterminationdetectionmechanismwhich workswhenthenumberofprocessors n isknown. 
Their dynamic algorithm operates as follows. When a process is active, it randomly selects an ID and always checks to see if the same ID is claimed by any other processes. A process repeatedly either reads the claimed register or writes a random bit, which is chosen independently and randomly, and records the chosen value. When a process reads a register, it checks whether the value of the register has changed since it last wrote to it. If the process observes that its claimed ID is also selected by any other processes, it randomly selects a new ID to claim. Note that in their algorithm each process detects a collision in constant time. If a process observes no change after reading the contents of the shared register, then it moves on to the next iteration. Next, they provided an algorithm to make the
dynamic algorithm terminate. They assume that all shared registers are initialized. The termination detection algorithm employs a binary tree, with its leaves corresponding to claimed IDs. Each process traverses the binary tree from the leaves to the root by updating the sum of the children in each node. When a process sees that the root of the tree has the value n, it exits the loop. In other words, the set of claimed IDs is fixed when the root of the binary tree has the value n.

They used registers of size O(log n) bits, whereas the algorithms of [39, 70, 87] used single-bit registers. Additionally, they showed that if n is unknown then a Las Vegas naming algorithm does not exist, and that a finite-state Las Vegas naming algorithm can work only for an oblivious scheduler; that is to say, there is no terminating algorithm if n is not known or the scheduler is adaptive.

The authors also gave a Las Vegas algorithm which works with unbounded space under any fair scheduler. Finally, they provided a deterministic solution for the naming problem in the read-modify-write model by using just one register. This is a much more advanced computational model, where a process can read and update a shared variable in just one step.

Panconesi et al. [79] gave a randomized wait-free naming algorithm in anonymous systems with processes prone to crashes that communicate by single-writer registers. They assume that different processes may address a register with different index numbers and can read all other shared variables. They gave an algorithm that is based on a wait-free implementation of Test&SetOnce objects for an adaptive scheduler for the case of known n, which works in the expected running time of O(n log n log log n) bit operations with probability at least 1 − o(1), while using a name space of size (1 + ε)n, where ε > 0. The model considered in that work assigns unique registers to nameless processes and so has a potential to defy the impossibility of wait-free naming for general multi-writer registers, as observed by Kutten et al. [68].

Buhrman et al. [23] considered the relative complexity of the naming and consensus problems in asynchronous systems with shared memory that are prone to crash failures, demonstrating
that naming is harder than consensus.

Now we review work on problems in anonymous distributed systems different from naming. Aspnes et al. [10] gave a comparative study of anonymous distributed systems with different communication mechanisms, including broadcast and shared memory objects of various functionalities, like read-write registers and counters. Alistarh et al. [5] gave randomized renaming algorithms that act like naming ones, in that process identifiers are not referred to; for more on renaming see [4, 13, 30]. Aspnes et al. [12] considered solving consensus in anonymous systems with infinitely many processes. Attiya et al. [15] and Jayanti and Toueg [60] studied the impact of initialization of shared registers on solvability of tasks like consensus and wakeup in fault-free anonymous systems. Bonnet et al. [21] considered solvability of consensus in anonymous systems with processes prone to crashes but augmented with failure detectors. Guerraoui and Ruppert [55] showed that certain tasks like time-stamping, snapshots and consensus have deterministic solutions in anonymous systems with shared read-write registers prone to process crashes. Ruppert [82] studied the impact of anonymity of processes on wait-free computing and mutual implementability of types of shared objects.

Lower bounds on PRAMs were given by Fich et al. [43], Cook et al. [31], and Beame [19], among others. A review of lower bounds based on the information-theoretic approach is given by Attiya and Ellen [14]. Yao's minimax principle was given by Yao [91]; the book by Motwani and Raghavan [77] gives examples of applications.

The problem of concurrent communication in anonymous networks was first considered by Angluin [6]. That work showed, in particular, that randomization is needed in naming algorithms when executed in environments that are perfectly symmetric; other related impossibility results are surveyed by Fich and Ruppert [44].

The work about anonymous networks that followed was either on specific network topologies or on problems in general message-passing systems. The most popular specific topologies included that of a ring and a hypercube. In particular, the ring topology was investigated
by Attiya et al. [16, 17], Flocchini et al. [45], Diks et al. [38], Itai and Rodeh [58], and Kranakis et al. [65], and the hypercube topology was studied by Kranakis and Krizanc [64] and Kranakis and Santoro [67].

Work on algorithmic problems in anonymous networks of general topologies, or on anonymous/named agents in anonymous/named networks, included the following specific contributions. Afek and Matias [3] and Schieber and Snir [84] considered leader election, finding spanning trees, and naming in general anonymous networks. Angluin et al. [8] studied adversarial communication by anonymous agents, and Angluin et al. [9] considered self-stabilizing protocols for anonymous asynchronous agents deployed in a network of unknown size. Chalopin et al. [24] studied naming and leader election in asynchronous networks when a node knows the map of the network but its position on the map is unknown. Chlebus et al. [29] investigated anonymous complete networks whose links and nodes are subject to random independent failures, in which a single fault-free node has to wake up all nodes by propagating a wakeup message through the network. Dereniowski and Pelc [36] considered leader election among anonymous agents in anonymous networks. Dieudonne and Pelc [37] studied teams of anonymous mobile agents in networks that execute a deterministic algorithm with the goal to convene at one node. Fraigniaud et al. [48] considered naming in anonymous networks with one node distinguished as leader. Gasieniec et al. [52] investigated anonymous agents pursuing the goal to meet at a node or edge of a ring. Glacet et al. [53] considered leader election in anonymous trees. Kowalski and Malinowski [63] studied named agents meeting in anonymous networks. Kranakis et al. [66] investigated computing boolean functions on anonymous networks. Metivier et al. [72] considered naming anonymous unknown graphs. Michail et al. [74] studied the problems of naming and counting nodes in dynamic anonymous networks. Pelc [80] considered activating an anonymous ad hoc radio network from a single source by a deterministic algorithm. Yamashita and Kameda [90] investigated topological properties of anonymous networks that allow for deterministic solutions for representative algorithmic problems.
General questions of computability in anonymous message-passing systems implemented in networks were studied by Boldi and Vigna [20], Emek et al. [40], and Sakamoto [83].

Next, we review work on problems for beeping networks. The model of communication by discrete beeping was introduced by Cornejo and Kuhn [32], who considered a general-topology wireless network in which nodes use only carrier sensing to communicate, and developed algorithms for node coloring. They were inspired by "continuous" beeping studied by Degesys et al. [35] and Motskin et al. [76], and by the implementation of coordination by carrier sensing given by Flury and Wattenhofer [46].

Afek et al. [1] considered the problem of finding a maximal independent set of nodes in a distributed manner when the nodes can only beep, under additional assumptions regarding the knowledge of the size of the network, waking up the network by beeps, collision detection among concurrent beeps, and synchrony. Brandes et al. [22] studied the problem of randomly estimating the number of nodes attached to a single-hop beeping network. Czumaj and Davies [34] approached systematically the tasks of deterministic broadcasting, gossiping, and multi-broadcasting on the bit level in general-topology symmetric beeping networks. In a related work, Hounkanli and Pelc [56] studied deterministic broadcasting in asynchronous beeping networks of general topology with various levels of knowledge about the network. Forster et al. [47] considered leader election by deterministic algorithms in general multi-hop networks with beeping. Gilbert and Newport [51] studied the efficiency of leader election in a beeping single-hop channel when nodes are state machines of constant size with a specific precision of randomized state transitions. Huang and Moscibroda [57] considered the problems of identifying subsets of stations connected to a beeping channel and compared their complexity to those on multiple access channels. Yu et al. [92] considered the problem of constructing a minimum dominating set in networks with beeping.
Networks of nodes communicating by beeping share common features with radio networks with collision detection. Ghaffari and Haeupler [49] gave an efficient leader election algorithm by treating collision detection as "beeping" and transmitting messages as bit strings. Their
approach by way of "beep waves" was adopted to broadcasting in networks with beeping by Czumaj and Davies [34]. In a related work, Ghaffari et al. [50] developed randomized broadcasting and multi-broadcasting in radio networks with collision detection.
4. TECHNICAL PRELIMINARIES

A synchronous shared memory system in which some n processors operate concurrently is the assumed model of computation. The essential properties of such systems are as follows: (1) shared memory cells have only reading/writing capabilities, and (2) operations of accessing the shared registers are globally synchronized, so that processors work in lockstep. An execution of an algorithm is structured as a sequence of rounds, so that each processor performs either a read from or a write to a shared memory cell, along with local computation. We assume that a processor carries out its private computation in a round in a negligible portion of the round. Processors can generate as many private random bits per round as needed; all the random bits generated in an execution are assumed to be independent.

Each shared memory cell is assumed to be initialized to 0 as a default value. This assumption simplifies the exposition, but it can be removed, as any algorithm assuming such an initialization can be modified in a relatively straightforward manner to work with dirty memory. A shared memory cell can store any value as needed in algorithms, in particular, integers of magnitude that may depend on n; all our algorithms require a memory cell to store O(log n) bits. An invocation of either reading from or writing to a memory location is completed in the round of invocation. This model of computation is referred to in the literature as the Parallel Random Access Machine (PRAM) [59, 81]. The PRAM is usually defined as a model with an unlimited number of shared memory cells, by analogy with the random access machine (RAM) model. We consider the following two instantiations of the model, determined by the amount of shared memory. In one situation, there is a constant number of shared memory cells, which is independent of the number of processors n but as large as needed in the specific algorithm. In the other case, the number of shared memory cells is unlimited in principle, but the expected number of shared registers accessed in an execution depends on n and is sought to be minimized.

A concurrent read occurs when a group of processors read from the same memory cell in the same round; this results in each of these processors obtaining the value stored in the
memory cell at the end of the preceding round. A concurrent write occurs when a group of processors invoke a write to the same memory cell in the same round. Without loss of generality, we may assume that a concurrent read of a memory cell and a concurrent write to the same memory cell do not occur simultaneously: this is because we could designate rounds only for reading and only for writing depending on their parity, thereby slowing the algorithm by a factor of two. A clarification is needed regarding which value gets written to a memory cell in a concurrent write, when multiple distinct values are attempted to be written; such stipulations determine suitable variants of the model. We will consider algorithms for the following two PRAM variants, determined by their respective concurrent-write semantics.

Common PRAM is defined by the property that when a group of processors want to write to the same shared memory cell in a round, then all the values that any of the processors want to write must be identical, otherwise the operation is illegal. Concurrent attempts to write the same value to a memory cell result in this value getting written in this round.

Arbitrary PRAM allows attempts to write any legitimate values to the same memory cell in the same round. When this occurs, then one of these values gets written, while the selection of this value is arbitrary. All possible selections of values that get written need to be taken into account when arguing about the correctness of an algorithm.

We will rely on certain standard algorithms developed for PRAMs, as explained in [59, 81]. One of them is for prefix-type computations. A typical situation in which it is applied occurs when there is an array of m shared memory cells, each memory cell storing either 0 or 1. This may represent an array of bins, where 1 stands for a nonempty bin while 0 for an empty bin. Let the rank of a nonempty bin of address x be the number of nonempty bins with addresses smaller than or equal to x. Ranks can be computed in time O(log m) by using an auxiliary memory of O(m) cells, assuming there is at least one processor assigned to a nonempty bin, while other processors do not participate. The bins are associated with the leaves of a binary tree. The processors traverse a binary tree from the leaves to the root and
back to the leaves. When updating information at a node, only the information stored at the parent, the sibling and the children is used. We may observe that the same memory can be used repeatedly when such a computation needs to be performed multiple times. A possible approach is to verify if the information at a needed memory cell, representing either a parent, a sibling or a child of a visited node, is fresh or rather stale from previous executions. This could be accomplished in the following three steps by a processor. First, the processor erases a memory cell it needs to read by rewriting its present value with a blank value. Second, the processor writes again the value at the node it visits, which may have been erased in the previous step by other processors that need the value. Finally, the processor reads again the memory cell it just erased, to see if it stays erased, which means its contents were stale, or not, which means its contents got rewritten so they are fresh.

Balls into bins. Assigning names to processors can be visualized as throwing balls into bins. Imagine that balls are handled by processors and bins are represented by either memory addresses or rounds in a segment of rounds. Throwing a ball means either writing into some memory address a value that represents a ball, or choosing a round from a segment of rounds. A collision occurs when two balls end up in the same bin; this means that two processors wrote to the same memory address, not necessarily in the same round, or that they selected the same round. The rank of a bin containing a ball is the number of bins with smaller or equal names that contain balls. When each in a group of processors throws a ball and there is no collision, then this in principle breaks symmetry in a manner that allows to assign unique names in the group, namely, the ranks of the selected bins may serve as names.

The following terms refer to the status of a bin in a given round. A bin is called empty when there are no balls in it. A bin is singleton when it contains a single ball. A bin is multiple when there are at least two balls in it. Finally, a bin with at least one ball is occupied.

The idea of representing attempts to assign names as throwing balls into bins is quite generic. In particular, it was applied by Egecioglu and Singh [39], who proposed a synchronous algorithm that repeatedly throws all balls together into all available bins, with the selections of bins for balls made independently and uniformly at random. In their algorithm for n processors, we can use αn memory cells, where α > 1. Let us choose α = 3 for the following calculations to be specific. This algorithm has an exponential expected-time performance. To see this, we estimate the probability that each bin is either singleton or empty. Let the balls be thrown one by one. After the first n/2 balls are in singleton bins, the probability to hit an empty bin is at most 2.5n/(3n) = 5/6; we treat this as a success in a Bernoulli trial. The probability of n/2 such successes is at most (5/6)^{n/2}, so the expected time to wait for the algorithm to terminate is at least (6/5)^{n/2}, which is exponential in n.

We consider related processes that could be as fast as O(log n) in expected time, while still using only O(n) shared memory cells; see Section 6.4. The idea is to let balls in singleton bins stay put and only move those that collided with other balls by landing in bins that thereby became multiple. To implement this on a Common PRAM, we need a way to detect collisions, which we explain next.

Collisions among balls. We will use a randomized procedure for the Common PRAM to verify if a collision occurs in a bin, say, a bin x, which is executed by each processor that selected bin x. This procedure VerifyCollision is represented in Figure 4.1. There are two arrays TAILS and HEADS of shared memory cells. Bin x is verified by using the memory cells TAILS[x] and HEADS[x]. First, the memory cells TAILS[x] and HEADS[x] are each set to false, and next one of these memory cells is selected randomly and set to true.

Lemma 1. For an integer x, procedure VerifyCollision(x) executed by one processor never detects a collision, and when multiple processors execute this procedure then a collision is detected with probability at least 1/2.

Proof: When only one processor executes the procedure, then first the processor sets both Heads[x] and Tails[x] to false and next only one of them to true. This guarantees that Heads[x] and Tails[x] store different values, and so a collision is not detected. When some
Procedure VerifyCollision(x)

  initialize Heads[x], Tails[x] ← false
  toss_v ← outcome of tossing a fair coin
  if toss_v = tails then Tails[x] ← true else Heads[x] ← true
  if Tails[x] = Heads[x] then return true else return false

Figure 4.1: A pseudocode for a processor v of a Common PRAM, where x is a positive integer. Heads and Tails are arrays of shared memory cells. When the parameter x is dropped in a call, then this means that x = 1. The procedure returns true when a collision has been detected.

m > 1 processors execute the procedure, then a collision is not detected only when either all processors set Heads[x] to true or all processors set Tails[x] to true. This means that the processors generate the same outcome in their coin tosses. This occurs with probability 2^{−m+1}, which is at most 1/2.

A beeping channel is related to multiple access channels [25]. It is a network consisting of some n stations connected to a communication medium. We consider synchronous beeping channels, in the sense that an execution of a communication algorithm is partitioned into consecutive rounds. All the stations start an execution together. In each round, a station may either beep or pause. When some station beeps in a round, then each station hears the beep; otherwise all the stations receive silence as feedback. When multiple stations beep together in a round, then we call this a collision.

We say that a parameter of a communication network is known when it can be used in codes of algorithms. The relevant parameter used in this thesis is the number of stations n. We consider two cases, in which either n is known or it is not.

Randomized algorithms use random bits, understood as outcomes of tosses of a fair coin. All different random bits used by our algorithms are considered stochastically independent from each other.

Our naming algorithms have as their goal to assign unique identifiers to the stations,
Procedure DetectCollision

  toss_v ← outcome of a random coin toss
  if toss_v = heads then beep else pause     /* the first round */
  if toss_v = tails then beep else pause     /* the second round */
  return (a beep was heard in each of the two rounds)

Figure 4.2: A pseudocode for a station v. The procedure takes two rounds to execute. It detects a collision and returns "true" when a beep is heard in each of the rounds; otherwise it does not detect a collision and returns "false."

moreover we want names to be integers in the contiguous range {1, 2, …, n}, which we denote as [n]. The Monte Carlo naming algorithm that we develop has the property that the names it assigns make an interval of integers of the form [k] for k ≤ n, so that when k
i ≥ 1. We argue by "deferred decisions." One of these stations tosses a coin and determines its outcome X. The other i − 1 stations participating concurrently in this call also toss their coins; here we have i − 1 ≥ 0, so there could be no such station. The only possibility not to detect a collision is for all of these i − 1 stations to also produce X. This happens with probability 2^{−i+1} in this one call. The probability of producing only false during the m calls is the product of these probabilities. When we multiply them out over the m instances of the procedure being performed, then the outcome is 2^{−k+m}, because the numbers i sum up to k and the number of factors is m.

Pseudocode conventions and notations. We give pseudocode representations of algorithms, as in Figure 4.1. The conventions of the pseudocode are summarized next.

We want that, at any round of an execution, all the processors that have not terminated yet be at the same line of the pseudocode. In particular, when an instruction is conditional on a statement, then a processor that does not meet the condition pauses as long as it would be needed for all the processors that meet the condition to complete their instructions, even when there are no such processors.

A pseudocode for a processor refers to a number of variables, both shared and private. We use the following notational conventions to emphasize their relevant properties. Shared variables have names starting with a capital letter, while private variables have names all in small letters. When a variable x is a private variable that may have different values at different processors at the same time, then we denote this variable used by a processor v by x_v. Private variables that have the same value at the same time in all the processors are usually used without subscripts, like variables controlling for loops.

Each station has its private copy of any among the variables used in the pseudocode.
When the values of these copies may vary across the stations, then we add the station's name as a subscript of the variable's name to emphasize that; otherwise, when all the copies of a variable are kept equal across all the stations, no subscript is used.

An assignment instruction of the form x ← y ← ··· ← z ← α, where x, y, ..., z are
variables and α is a value, means to assign α as the value to be stored in all the listed variables x, y, ..., z.

We use three notations for logarithms. The notation lg x stands for the logarithm of x to the base 2. The notation ln x denotes the natural logarithm of x. When the base of logarithms does not matter, then we use log x, like in the asymptotic notation O(log x).

Properties of naming algorithms. Naming algorithms in distributed environments involving multi-writer read-write shared memory have to be randomized to break symmetry [6, 18]. An eventual assignment of proper names cannot be a sure event, because, in principle, two processors can generate the same strings of random bits in the course of an execution. We say that an event is almost sure, or occurs almost surely, when it occurs with probability 1. When n processors generate their private strings of random bits, then it is an almost sure event that all these strings are eventually pairwise distinct. Therefore, the most advantageous scenario that we could expect, when a set of n processors is to execute a randomized naming algorithm, is that the algorithm eventually terminates almost surely and that at the moment of termination the output is correct, in that the assigned names are without duplicates and fill the whole interval [1, n].
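As an illustration, the two-round procedure DetectCollision of Figure 4.2 is easy to simulate. The harness below is our own sketch (a single global beeping channel, k stations participating in one call), not part of the thesis; it reproduces the probability 2^(−k+1) of missing a collision that the earlier analysis derives.

```python
import random

def detect_collision(k):
    """One call of DetectCollision executed concurrently by k stations.
    Each station tosses a coin; stations with heads beep in the first
    round, stations with tails beep in the second round.  A collision
    is reported exactly when a beep is heard in both rounds."""
    tosses = [random.choice(("heads", "tails")) for _ in range(k)]
    beep_in_first = any(t == "heads" for t in tosses)
    beep_in_second = any(t == "tails" for t in tosses)
    return beep_in_first and beep_in_second

# Empirical check: for k = 3 concurrent stations the probability of
# missing the collision is 2^(-3+1) = 0.25.
random.seed(1)
trials = 100_000
misses = sum(not detect_collision(3) for _ in range(trials)) / trials
print(misses)  # close to 0.25
```

A single station never reports a collision, since it beeps in exactly one of the two rounds, which matches the intent of the procedure.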
5. LOWER BOUNDS AND IMPOSSIBILITIES

In this section, we show impossibility results to justify the methodological approach to naming algorithms that we apply, and use lower bounds on performance metrics for such algorithms to argue about the optimality of the algorithms developed in subsequent sections.

5.1 Preliminaries

We start with basic definitions, terminology, and theorems that are discussed throughout this section.

Lower bounds prove that certain problems cannot be solved efficiently without sufficient resources such as time or space. They also give us an idea about when to stop looking for better solutions. Impossibility results show that certain problems cannot be solved under certain assumptions. To understand the nature of the naming problem it is necessary to understand lower bounds and impossibility results [14, 44].

The entropy [33] is the number of bits on average required to describe a random variable; it is a lower bound on the average number of bits required to represent the random variable. The entropy of a random variable X with a probability mass function p(x) is defined by

    H(X) = − Σ_x p(x) lg p(x).

Yao's Minimax Principle [91, 77] allows us to prove lower bounds on the performance of Las Vegas and Monte Carlo algorithms. Yao's Minimax Principle says that, for an arbitrarily chosen input distribution, the expected running time of the optimal deterministic algorithm is a lower bound on the expected running time of the optimal randomized algorithm. Yao's Minimax Principle for Las Vegas randomized algorithms is as follows. Let P be a problem with a finite set X of inputs and let A be the set of all possible deterministic algorithms that correctly solve the problem P. Let cost(X, A) be the running time of algorithm A ∈ A on input X ∈ X. Let p be a probability distribution over X and q a probability distribution over A.
Let X_p be a random input chosen according to p and A_q be a random algorithm chosen according to q. For all distributions p over X and q over A,

    min_{A ∈ A} E[cost(X_p, A)] ≤ max_{X ∈ X} E[cost(X, A_q)].

Yao's Minimax Principle for Monte Carlo randomized algorithms is analogous: for algorithms that err with probability ε ∈ [0, 1/2], half of the expected running time of the optimal deterministic algorithm that errs with probability at most 2ε on inputs drawn from p is a lower bound on the expected running time of the optimal Monte Carlo algorithm that errs with probability ε.

5.2 Lower Bounds for a PRAM

We give algorithms that use an expected number of O(n log n) random bits with a large probability. This amount of random information is necessary if an algorithm is to terminate almost surely. The following fact is essentially folklore, but since we do not know if it was proved anywhere in the literature, we give a proof for completeness' sake. Our arguments resort to the notions of information theory [33].

Proposition 1  If a randomized naming algorithm is correct with probability p_n, when executed by n anonymous processors, then it requires Ω(n log n) random bits with probability at least p_n. In particular, a Las Vegas naming algorithm for n processors uses Ω(n log n) random bits almost surely.

Proof: Let us assign conceptual identifiers to the processors, for the sake of argument. These unknown identifiers are known only to an external observer and not to algorithms. The purpose of executing the algorithm is to assign explicit identifiers, which we call given identifiers.

Let a processor with an unknown name u_i generate a string of bits b_i, for i = 1, ..., n. A distribution of given identifiers among the n anonymous processors, which results from executing the algorithm, is a random variable X_n with a uniform distribution on the set of all permutations of the unknown identifiers. This is because of symmetry: all processors execute the same code, without explicit private identifiers, and if we rearrange the generated strings of bits b_i among the processors u_i, then this results in the corresponding rearrangement of the given names.
The underlying probability space consists of n! elementary events, each determined by an assignment of the given identifiers to the processors identified by the unknown identifiers. It follows that each of these events occurs with probability 1/n!. The Shannon entropy of the random variable X_n is thus lg n! = Θ(n log n). The decision about which assignment of given names is produced is determined by the random bits, as they are the only source of entropy, so the expected number of random bits used by the algorithm needs to be as large as the entropy of the random variable X_n.

The property that all assigned names are distinct and in the interval [1, n] holds with probability p_n. An execution needs to generate a total of Ω(n log n) random bits with probability at least p_n, because of the bound on entropy. A Las Vegas algorithm terminates almost surely, and returns correct names upon termination. This means that p_n = 1, and so Ω(n log n) random bits are used almost surely.

We consider two kinds of algorithmic naming problems, as determined by the amount of shared memory. One case is for a constant number of shared memory cells, for which we give a lower bound on time that is optimal for O(1) shared memory. The other case is when the number of shared memory cells and their capacity are unbounded, for which we give an "absolute" lower bound on time. We begin with lower bounds that reflect the amount of shared memory.

Intuitively, as processors generate random bits, these bits need to be made common knowledge through some implicit process that assigns explicit names. There is an underlying flow of information spreading knowledge among the processors through the available shared memory. Time is bounded from below by the rate of flow of information and the total amount of bits that need to be shared.

On the technical level, in order to bound the expected time of a randomized algorithm, we apply Yao's minimax principle [91] to relate this expected time to the distributional expected time complexity. A randomized algorithm whose actions are determined by random bits can be considered as a probability distribution on deterministic algorithms. A deterministic algorithm has strings of bits given to processors as their inputs, with some probability distribution on such inputs. The expected time of such a deterministic algorithm, given any specific probability distribution on the inputs, is a lower bound on the expected time of a randomized algorithm.

To make such an interpretation of randomized algorithms possible, we consider strings of bits of equal length. With such a restriction on inputs, a deterministic algorithm may not be able to assign proper names for some assignments of inputs, for example, when all the inputs are equal. We augment such deterministic algorithms by adding an option for the algorithm to withhold a decision on assignment of names and output "no name" for some processors. This is interpreted as the deterministic algorithm needing longer inputs, for which the given inputs are prefixes, and which for the randomized algorithm means that some processors need to generate more random bits.

Regarding probability distributions for inputs of a given length, it will always be the uniform distribution. This is because we will use an assessment of the entropy of such a distribution.

Theorem 1  A randomized naming algorithm for a Common PRAM with n processors and C > 0 shared memory cells operates in Ω(n log n / C) expected time when it is either a Las Vegas algorithm or a Monte Carlo algorithm with the probability of error smaller than 1/2.

Proof: We consider Las Vegas algorithms in this argument; the Monte Carlo case is similar, the difference being in applying Yao's principle for Monte Carlo algorithms. We interpret a randomized algorithm as a deterministic one working with all possible assignments of random bits as inputs, with a uniform mass function on the inputs. The expected time of the deterministic algorithm is a lower bound on the expected time of the randomized algorithm.

There are n! possible assignments of given names to the processors. Each of them occurs with the same probability 1/n! when the input bit strings are assigned uniformly at random. Therefore the entropy of name assignments, interpreted as a random variable, is
lg n! = Θ(n log n).

Next we consider executions of such a deterministic algorithm on the inputs with a uniform distribution. We may assume without loss of generality that an execution is structured into the following phases, each consisting of C + 1 rounds. In the first round of a phase, each processor either writes into a shared memory cell or pauses. In the following rounds of a phase, every processor learns the current values of each among the C memory cells. This may take C rounds for every processor to scan the whole shared memory, but we do not include this reading overhead as contributing to the lower bound. Instead, since this is a simulation anyway, we conservatively assume that the process of learning all the contents of shared memory cells at the end of a phase is instantaneous and complete.

The Common variant of PRAM requires that if a memory cell is written into concurrently then there is a common value that gets written by all the writers. Such a value needs to be determined by the code and the address of a memory cell. This means that, for each phase and any memory cell, a processor choosing to write into this memory cell knows the common value to be written. By the structure of execution, in which all processors read all the registers after a round of writing, any processor knows what value gets written into each available memory cell in a phase, if any is written into a particular cell. This implies that the contents written into shared memory cells may not convey any new information, but are already implicit in the states of the processors represented by their private memories after reading the whole shared memory.

When a processor reads all the shared memory cells in a phase, then the only new information it may learn is the addresses of memory cells into which writes were performed and those into which there were no writes. This makes it possible to obtain at most C bits of information per phase, because each register was either written into or not.

There are Θ(n log n) bits of information that need to be settled, and one phase changes the entropy by at most C bits. It follows that the expected number of phases of the deterministic algorithm is Ω(n log n / C). By the Yao's principle, Ω(n log n / C) is a lower bound on the
expected time of a randomized algorithm.

For an Arbitrary PRAM, writing can spread information through the written values, because different processors can attempt to write distinct strings of bits. The rate of flow of information is constrained by the fact that when multiple writers attempt to write to the same memory cell, then only one of them succeeds, if the values written are distinct. This intuitively means that the size of a group of processors writing to the same register determines how much information the writers learn by subsequent reading. These intuitions are made formal in the proof of the following Theorem 2.

Theorem 2  A randomized naming algorithm for an Arbitrary PRAM with n processors and C > 0 shared memory cells operates in Ω(n/C) expected time when it is either a Las Vegas algorithm or a Monte Carlo algorithm with the probability of error smaller than 1/2.

Proof: We consider Las Vegas algorithms in this argument; the Monte Carlo case is similar, the difference being in applying Yao's principle for Monte Carlo algorithms. We again replace a given randomized algorithm by its deterministic version that works on assignments of strings of bits of the same length as inputs, with such inputs assigned uniformly at random to the processors. The goal is to use the property that the expected time of this deterministic algorithm, for a given probability distribution of inputs, is a lower bound on the expected time of the randomized algorithm. Next, we consider executions of this deterministic algorithm.

Similarly as in the proof of Theorem 1, we observe that there are n! assignments of given names to the processors and each of them occurs with the same probability 1/n!, when the input bit strings are assigned uniformly at random. The entropy of name assignments is again lg n! = Θ(n log n). The algorithm needs to make the processors learn Θ(n log n) bits using the available C > 0 shared memory cells.

We may interpret an execution as structured into phases, such that each processor performs at most one write in a phase and then reads all the registers. The time of a phase is
assumed conservatively to be O(1). Consider a register and a group of processors that attempt to write their values into this register in a phase. The values attempted to be written are represented as strings of bits. If some of these values have 0 and some have 1 at some bit position among the strings, then this bit position may convey one bit of information. The maximum amount of information is provided by a write when the written string of bits facilitates identifying the writer by comparing its written value to the other values attempted to be written concurrently to the same memory cell. It follows that this amount is at most the binary logarithm of the size of this group of processors, so that each memory cell written to in a round contributes at most lg n bits of information, because there may be at most n writers to it. So the maximum number of bits of information learnt by the processors in a phase is C lg n.

Since the entropy of the assignment of names is lg n! = Θ(n log n), the expected number of phases of the deterministic algorithm is Ω(n lg n / (C lg n)) = Ω(n/C). By the Yao's principle, this is also a lower bound on the expected time of a randomized algorithm.

Next, we consider "absolute" requirements on time for a PRAM to assign unique names to the n available processors. The generality of the lower bound we give stems from the weakness of assumptions. First, nothing is assumed about the knowledge of n. Second, concurrent writing is not constrained in any way. Third, shared memory cells are unbounded in their number and size. Kutten et al. [68] showed that any Las Vegas naming algorithm for asynchronous read-write shared memory systems has expected time Ω(log n) against a certain oblivious schedule.

We show next in Theorem 3 that any Las Vegas naming algorithm has Ω(log n) expected time for the synchronous schedule of events. The argument we give is in the spirit of similar arguments applied by Cook et al. [31] and Beame [19]. What these arguments share is a formalization of the notion of flow of information during an execution of an algorithm, combined with a recursive estimate of the rate of this flow.

The relation processor v knows processor w in round t is defined recursively as follows.
First, for any processor v, we have that v knows v in any round t > 0. Second, if a processor v writes to a shared memory cell R in a round t_1 and a processor w reads from R in a round t_2 > t_1 such that there was no other write into this memory cell after t_1 and prior to t_2, then processor w knows in round t_2 each processor that v knows in round t_1. Finally, the relation is the smallest transitive relation that satisfies the two postulates formulated above. This means that it is the smallest relation such that if processor v knows processor w in round t_1 and z knows v in round t_2 such that t_2 > t_1, then processor z knows w in round t_2. In particular, the knowledge accumulates with time, in that if a processor v knows processor z in round t_1 and round t_2 is such that t_2 > t_1, then v knows z in round t_2 as well.

Lemma 3  Let A be a deterministic algorithm that assigns distinct names to the processors, with the possibility that some processors output "no name" for some inputs, when each node has an input string of bits of the same length. When algorithm A terminates with proper names assigned to all the processors, then each processor knows all the other processors.

Proof: We may assume that n > 1, as otherwise one processor knows itself. Let us consider an assignment I of inputs that results in a proper assignment of distinct names to all the processors when algorithm A terminates. This implies that all the inputs in the assignment I are distinct strings of bits, as otherwise some two processors, say, v and w, that obtain the same input string of bits would either assign themselves the same name or declare "no name" as output. Suppose that processor v does not know w when v halts for inputs from I.

Consider an assignment of inputs J which is the same as I for processors different from w and such that the input of w is the same as the input for v in I. Then the actions of processor v would be the same with J as with I, because v is not affected by the input of w, so that v would assign itself the same name with J as with I. But the actions of processor w would be the same in J as those of v, because their input strings of bits are identical under J. It follows that w would assign itself the name of v, resulting in duplicate names.
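Lemma 3 says that every processor must come to know all the other processors, and the Ω(log n) bound of Theorem 3 then rests on how slowly the knows relation can grow. The sketch below (the simulation harness and its parameters are our own, not from the text) models an execution in two-round phases, with one write and one read per processor per phase, and checks the doubling bound: after phase i a processor can know at most 2^i processors.

```python
import random

def max_knowledge(n, phases, seed=0):
    """Simulate n processors for the given number of two-round phases.
    In round 1 of a phase every processor writes its knowledge set to a
    random cell (a later write overwrites an earlier one); in round 2 it
    reads one random cell.  Returns the largest knowledge-set size."""
    rng = random.Random(seed)
    known = [{v} for v in range(n)]   # initially a processor knows only itself
    cells = {}                        # shared memory: cell -> last written set
    for _ in range(phases):
        for v in range(n):            # round 1: writes
            cells[rng.randrange(n)] = known[v]
        updated = [set(s) for s in known]
        for v in range(n):            # round 2: reads
            cell = rng.randrange(n)
            if cell in cells:
                updated[v] |= cells[cell]
        known = updated
    return max(len(s) for s in known)

# The invariant from the proof of Theorem 3: at most 2^i processors
# are known at the end of phase i, however the reads and writes fall.
for i in range(1, 8):
    assert max_knowledge(64, i) <= 2 ** i
print("doubling bound holds")
```

The assertion holds deterministically, for any seed: a read can only add a set that some processor wrote out of its own, at most half-sized, knowledge.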
We will use Lemma 3 to assess running times by estimating the number of interleaved
reads and writes needed for processors to get to know all the processors. The rate of learning such information may depend on time, because we do not restrict the amount of shared memory, unlike in Theorems 1 and 2. Indeed, the rate may increase exponentially, under the most conservative estimates.

The following Theorem 3 holds for both Common and Arbitrary PRAMs. The argument used in the proof is general enough not to depend on any specific semantics of writing.

Theorem 3  A randomized naming algorithm for a PRAM with n processors operates in Ω(log n) expected time when it is either a Las Vegas algorithm or a Monte Carlo algorithm with the probability of error smaller than 1/2.

Proof: The argument is for a Las Vegas algorithm; the Monte Carlo case is similar. A randomized algorithm can be interpreted as a probability distribution on a finite set of deterministic algorithms. Such an interpretation works when input strings for a deterministic algorithm are of the same length. We consider all such possible lengths for deterministic algorithms, similarly as in the previous proofs of lower bounds.

Let us consider a deterministic algorithm A, and let inputs be strings of bits of the same length. We may structure an execution of this algorithm A into phases as follows. A phase consists of two rounds. In the first round of a phase, each processor either writes to a shared memory cell or pauses. In the second round of a phase, each processor either reads from a shared memory cell or pauses. Such structuring can be done without loss of generality at the expense of slowing down an execution by a factor of at most 2. Observe that the knowledge in the first round of a phase is the same as in the last round of the preceding phase.

Phases are numbered by consecutively increasing integers, starting from 1. A phase i comprises the pair of rounds {2i − 1, 2i}, for integers i ≥ 1. In particular, the first phase consists of rounds 1 and 2. We also add phase 0 that represents the knowledge before any reads or writes were performed.

We show the following invariant, for i ≥ 0: a processor knows at most 2^i processors at
the end of phase i. The proof of this invariant is by induction on i.

The base case is for i = 0. The invariant follows from the fact that a processor knows only one processor in phase 0, namely itself, and 2^0 = 1.

To show the inductive step, suppose the invariant holds for a phase i ≥ 0 and consider the next phase i + 1. A processor v may increase its knowledge by reading in the second round of phase i + 1. Suppose the read is from a shared memory cell R. The latest write into this memory cell occurred by the first round of phase i + 1. This means that the processor w that wrote to R by phase i + 1, as the last one that did write, knew at most 2^i processors in the round of writing, by the inductive assumption and the fact that what is written in phase i + 1 was learnt by the immediately preceding phase i. Moreover, by the semantics of writing, the value written to R by w in that round removed any previous information stored in R. Processor v starts phase i + 1 knowing at most 2^i processors, and also learns of at most 2^i other processors by reading in phase i + 1, namely, those known by the latest writer of the read contents. It follows that processor v knows at most 2^i + 2^i = 2^(i+1) processors by the end of phase i + 1.

When proper names are assigned by such a deterministic algorithm, then each processor knows every other processor, by Lemma 3. A processor knows every other processor in a phase j such that 2^j ≥ n, by the invariant just proved. Such a phase number j satisfies j ≥ lg n, and it takes 2 lg n rounds to complete lg n phases.

Let us consider input strings of bits assigned to processors uniformly at random. We need to estimate the expected running time of an algorithm A on such inputs. Let us observe that, in the context of interpreting deterministic executions for the sake of applying Yao's principle, terminating executions of A that do not result in names assigned to all the processors can be pruned from a bound on their expected running time, because such executions are determined by bounded input strings of bits that a randomized algorithm would extend to make them sufficiently long to assign proper names. In other words, from the perspective of randomized algorithms, such prematurely ending executions do not represent
real terminating ones.

The expected time of A, conditional on terminating with proper names assigned, is therefore at least 2 lg n. We conclude, by the Yao's principle, that any randomized naming algorithm has Ω(log n) expected run time.

The three lower bounds on time given in this section may be applied in two ways. One is to infer optimality of time for a given amount of shared memory used. Another is to infer optimality of shared memory use given a time performance. This is summarized in the following Corollary 1.

Corollary 1  If the expected time of a naming Las Vegas algorithm is O(n) on an Arbitrary PRAM with O(1) shared memory, then this time performance is asymptotically optimal. If the expected time of a naming Las Vegas algorithm is O(n log n) on a Common PRAM with O(1) shared memory, then this time performance is asymptotically optimal. If a Las Vegas naming algorithm operates in time O(log n) on an Arbitrary PRAM using O(n / log n) shared memory cells, then this amount of shared memory is asymptotically optimal. If a Las Vegas naming algorithm operates in time O(log n) on a Common PRAM using O(n) shared memory cells, then this amount of shared memory is optimal.

Proof: We verify that the lower bounds match the assumed upper bounds. By Theorem 2, a Las Vegas algorithm operates almost surely in Ω(n) time on an Arbitrary PRAM when space is O(1). By Theorem 1, a Las Vegas algorithm operates almost surely in Ω(n log n) time on a Common PRAM when space is O(1). By Theorem 2, a Las Vegas algorithm operates almost surely in Ω(log n) time on an Arbitrary PRAM when space is O(n / log n). By Theorem 1, a Las Vegas algorithm operates almost surely in Ω(log n) time on a Common PRAM when space is O(n).

A naming algorithm cannot be Las Vegas when n is unknown, as was observed by Kutten et al. [68] in a more general case of asynchronous computations against an oblivious adversary. We show an analogous fact for synchronous computations.
Proposition 2  There is no Las Vegas naming algorithm for a PRAM with at least two processors that does not refer to the total number of processors.

Proof: Let us suppose, to arrive at a contradiction, that such a naming Las Vegas algorithm exists. Consider a system of n ≥ 1 processors, where n is an arbitrary positive integer, and an execution E on these n processors that uses specific strings of random bits such that the algorithm terminates in E with these random bits. Such strings of random bits exist because the algorithm terminates almost surely.

Let v_1 be a processor that halts latest in E among the n processors. Let ρ be the string of random bits generated by processor v_1 by the time it halts in E. Consider an execution E′ on n + 1 ≥ 2 processors such that n processors obtain the same strings of random bits as in E and an extra processor v_2 obtains ρ as its random bits. The executions E and E′ are indistinguishable for the n processors participating in E, so they assign themselves the same names and halt. Processor v_2 performs the same reads and writes as processor v_1, assigns itself the same name as processor v_1 does, and halts in the same round as processor v_1. This is the termination round, because by that time all the other processors have halted as well.

It follows that execution E′ results in a name being duplicated. The probability of duplication for n + 1 processors is at least as large as the probability to generate the finite random strings for the n processors as in E, and additionally to generate ρ for the extra processor v_2, so this probability is positive.

If n is unknown, then the restriction O(n log n) on the number of random bits makes it inevitable that the probability of error is at least polynomially bounded from below, as we show next.

Proposition 3  For unknown n, if a randomized naming algorithm is executed by n anonymous processors, then an execution is incorrect, in that duplicate names are assigned to distinct processors, with probability that is at least n^(−O(1)), assuming that the algorithm uses O(n log n) random bits with probability 1 − n^(−O(1)).
Proof: Suppose the algorithm uses at most cn lg n random bits with a probability p_n when executed by a system of n processors, for some constant c > 0. Then one of these processors uses at most c lg n bits with a probability p_n, by the pigeonhole principle.

Consider an execution for n + 1 processors. Let us distinguish a processor v. Consider the actions of the remaining n processors: one of them, say w, uses at most c lg n bits with the probability p_n. Processor v generates the same string of bits with probability 2^(−c lg n) = n^(−c). The random bits generated by w and v are independent. Therefore duplicate names occur with probability at least n^(−c) · p_n. When we have a bound p_n = 1 − n^(−O(1)), then the probability of duplicate names is at least n^(−c) (1 − n^(−O(1))) = n^(−O(1)).

5.3 Lower Bounds for a Channel with Beeping

We begin with an observation, formulated as Proposition 4, that if the system is sufficiently symmetric then randomness is necessary to break symmetry. The given argument is standard and is given for completeness' sake; see [6, 14, 44].

Proposition 4  There is no deterministic naming algorithm for a synchronous channel with beeping with at least two stations, in which all stations are anonymous, such that it eventually terminates and assigns proper names.

Proof: We argue by contradiction. Suppose that there exists a deterministic algorithm that eventually terminates with proper names assigned to the anonymous stations. Let all the stations start initialized to the same initial state. The following invariant is maintained in each round: the internal states of the stations are all equal. We proceed by induction on the round number. The base of induction is satisfied by the assumption about the initialization.
For the inductive step, we assume that the stations are in the same state, by the inductive assumption. Then either all of them pause or all of them beep in the next round, so that either all of them hear their own beep or all of them pause and hear silence. This results in the same internal state transition, which shows the inductive step. When the algorithm eventually terminates, then each station assigns to itself the identifier determined by its
state. The identifier is the same in all stations because their states are the same, by the invariant. This violates the desired property of names to be distinct, because there are at least two stations with the same name.

Proposition 4 justifies developing randomized naming algorithms. We continue with "entropy" arguments; see the book by Cover and Thomas [33] for a systematic exposition of information theory. An execution of a naming algorithm coordinates and translates random bits into names. This same amount of entropy needs to be processed/communicated on the channel, by the Shannon's noiseless coding theorem. An analogue of the following Proposition 5 was stated in Proposition 1 for the model of synchronized processors communicating by reading and writing to shared memory.

Proposition 5  If a randomized naming algorithm for a channel with beeping is executed by n anonymous stations and is correct with probability p_n, then it requires Ω(n log n) random bits in total to be generated with probability at least p_n. In particular, a Las Vegas naming algorithm uses Ω(n log n) random bits almost surely.

One round of an execution of a naming algorithm allows the stations that do not transmit to learn at most one bit, because, from the perspective of these stations, a round is either silent or there is a beep. Intuitively, the running time is proportional to the amount of entropy that is needed to assign names. This intuition leads to Proposition 6. In its proof, we combine Shannon's entropy [33] with Yao's principle [91].

Proposition 6  A randomized naming algorithm for a beeping channel with n stations operates in Ω(n log n) expected time, when it is either a Las Vegas algorithm or a Monte Carlo algorithm with the probability of error smaller than 1/2.

Proof: We apply the Yao's minimax principle to bound the expected time of a randomized algorithm by the distributional complexity of naming. We consider Las Vegas algorithms first.
A randomized algorithm using strings of random bits generated by stations can be considered as a deterministic algorithm D on all possible assignments of such sufficiently long strings of bits to stations as their inputs. We consider assignments of strings of bits of an equal length, with the uniform distribution among all such assignments of strings of the same length. On a given assignment of input strings of bits to stations, the deterministic algorithm either assigns proper names or fails to do so. A failure to assign proper names with some input is interpreted as the randomized algorithm continuing to work with additional random bits, which comes at an extra time cost. This is justified by a combination of two factors. One is that the algorithm is Las Vegas and so it halts almost surely, and with a correct output. Another is that the probability to assign a specific finite sequence as a prefix of a used sequence of random bits is positive. So if starting with a specific string of bits, as a prefix of a possibly longer needed string, would mean inability to terminate with a positive probability, then the naming algorithm would not be Las Vegas.

The common length of these input strings is a parameter, and we consider all sufficiently large positive integer values for this parameter such that there exist strings of random bits of this length resulting in assignments of proper names. For a given length of input strings, we remove input assignments that do not result in an assignment of proper names and consider a uniform distribution on the remaining inputs. This is the same as the uniform distribution conditional on the algorithm terminating with input strings of bits of a given length.
Let us consider such a deterministic algorithm D assigning names, and using strings of bits at stations as inputs, these strings being of a fixed length, assigned under a uniform distribution for this length, and such that they result in termination. An execution of this algorithm produces a finite binary sequence of bits, where we translate the feedback from the channel round by round, say, with symbol 1 representing a beep and symbol 0 representing silence. Each such sequence is a binary codeword representing a specific assignment of names. These codewords also have a uniform distribution, by the same symmetry argument as used in the proof of Proposition 1. The expected length of a word in this code is the
expected time of algorithm D. The expected time of algorithm D is therefore at least lg n! = Θ(n log n), by the Shannon's noiseless coding theorem. We conclude, by the Yao's principle, that the original randomized Las Vegas algorithm has expected time that is Ω(n log n).

A similar argument, by the Yao's principle, applies to a Monte Carlo algorithm that is incorrect with a constant probability smaller than 1/2. The only difference in the argument is that when a given assignment of input strings of bits does not result in a proper assignment of names, then either the algorithm continues to work with more bits for an extra time, or terminates with error.

Next, we consider facts that hold when the number of stations n is unknown. The following Proposition 7 is about the inevitability of error. Intuitively, when two computing/communicating agents generate the same string of bits, then their actions are the same, and so they get the same name assigned. In other words, we cannot distinguish the case when there is only one such agent present from cases when at least two of them work in unison.

Proposition 7  For an unknown number of stations n, if a randomized naming algorithm is executed by n anonymous stations, then an execution is incorrect, in that duplicate names are assigned to different stations, with probability that is at least n^(−O(1)), assuming that the algorithm uses O(n log n) random bits with probability 1 − n^(−O(1)).

The proof of Proposition 3 is for the model of synchronous distributed computing in which processors communicate among themselves by reading from and writing to shared registers. The same argument applies to a synchronous beeping channel, when we understand actions of stations as either beeping or pausing in a round, which gives Proposition 7.

We conclude this section with a fact about the impossibility of developing a Las Vegas naming algorithm when the number of stations n is unknown.
Proposition 8  There is no Las Vegas naming algorithm for a channel with beeping with at least two stations such that it does not refer to the number of stations.

The proof given for Proposition 2 is for the model of synchronous distributed computing in which processors communicate among themselves by reading from and writing to shared registers, but it is general enough to be directly applicable here as well, as both models are synchronous; this gives Proposition 8. Proposition 8 justifies developing a Monte Carlo algorithm for unknown n, which we do in Section 8.2.
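The entropy bound lg n! = Θ(n log n) that drives Propositions 1, 5, and 6 is easy to evaluate numerically. The short sketch below (our own, not from the text) computes lg n! with the log-gamma function and shows the ratio lg n! / (n lg n) approaching 1 as n grows.

```python
from math import lgamma, log, log2

def lg_factorial(n):
    """lg n!, computed via the log-gamma function:
    ln(n!) = lgamma(n + 1), and lg n! = ln(n!) / ln 2."""
    return lgamma(n + 1) / log(2)

# lg n! is the entropy of a uniformly random assignment of n names,
# hence a lower bound on the random bits (and, on a beeping channel,
# the rounds) needed by a naming algorithm.
for n in (10, 1_000, 1_000_000):
    print(n, round(lg_factorial(n) / (n * log2(n)), 3))
```

By Stirling's approximation, ln n! = n ln n − n + O(log n), so the printed ratios creep up toward 1 from below.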
6. PRAM: LAS VEGAS ALGORITHMS

We consider naming of anonymous processors of a PRAM when the number of processors n is known. This problem is investigated in four specific cases, depending on the additional assumptions pertaining to the model, and we give an algorithm for each case. The two independent assumptions regard the amount of shared memory (constant versus unbounded) and the PRAM variant (Arbitrary versus Common).

6.1 Arbitrary with Constant Memory

We present a naming algorithm for an Arbitrary PRAM in the case when there is a constant number of shared memory cells. It is called ArbitraryConstantLV.

During an execution of this algorithm, processors repeatedly write random strings of bits representing integers to a shared memory cell called Pad, and next read Pad to verify the outcome of writing. A processor v that reads the same value as it attempted to write increments the integer stored in a shared register Counter and uses the obtained number as a tentative name, which it stores in a private variable name_v. The value of Counter could get incremented a total of fewer than n times, which occurs when some two processors chose the same random integer to write to the register Pad. The correctness of the assigned names is verified by the inequality Counter ≥ n, because Counter was initialized to zero. When such a verification fails, then this results in another iteration of a series of writes to register Pad; otherwise the execution terminates and the value stored at name_v becomes the final name of processor v. Pseudocode for algorithm ArbitraryConstantLV is given in Figure 6.1. It refers to a constant β > 0 which determines the bounded range [1, n^β] from which processors select integers to write to the shared register Pad.

Balls into bins. The selection of random integers in the range [1, n^β] by n processors can be interpreted as throwing n balls into n^β bins, which we call the β-process. A collision represents two processors assigning themselves the same name. Therefore an execution of the algorithm can be interpreted as performing such ball placements repeatedly until there is no collision.

Lemma 4  For each a > 0 there exists β > 0 such that when n balls are thrown into n^β bins during the β-process then the probability of a collision is at most n^(−a).
Algorithm Arbitrary-Constant-LV

    repeat
        initialize Counter ← 0 ; name_v ← 0 ; bin_v ← random integer in [1, n^β]
        for i ← 1 to n do
            if name_v = 0 then
                Pad ← bin_v
                if Pad = bin_v then
                    Counter ← Counter + 1 ; name_v ← Counter
    until Counter = n

Figure 6.1: A pseudocode for a processor $v$ of an Arbitrary PRAM, where the number of shared memory cells is a constant independent of $n$. The variables Counter and Pad are shared. The private variable $name_v$ stores the acquired name. The constant $\beta > 0$ is a parameter to be determined by analysis.

Proof: Consider the balls thrown one by one. When a ball is thrown, then at most $n$ bins are already occupied, so the probability of the ball ending in an occupied bin is at most $n/n^\beta = n^{-\beta+1}$. No collision occurs with probability at least
\[ \left(1-\frac{1}{n^{\beta-1}}\right)^n \ge 1-\frac{n}{n^{\beta-1}} = 1-n^{-\beta+2}, \tag{6.1} \]
by Bernoulli's inequality. If we take $\beta \ge a+2$ then just one iteration of the repeat loop is sufficient with probability at least $1-n^{-a}$. □

Next we summarize the performance of algorithm Arbitrary-Constant-LV as a Las Vegas algorithm.

Theorem 4. Algorithm Arbitrary-Constant-LV terminates almost surely and there is no error when it terminates. For any $a > 0$, there exist $\beta > 0$ and $c > 0$ such that the algorithm terminates within time $cn$ using at most $cn \ln n$ random bits with probability at least $1-n^{-a}$.

Proof: The algorithm assigns consecutive names from a contiguous interval starting from 1,
by the pseudocode in Figure 6.1. It terminates after $n$ different tentative names have been assigned, by the condition controlling the repeat loop in the pseudocode of Figure 6.1. This means that proper names have been assigned when the algorithm terminates.

We map an execution of the balls-into-bins process on an execution of algorithm Arbitrary-Constant-LV in a natural manner. Under such an interpretation, Lemma 4 estimates the probability of the event that the $n$ processors select different numbers in the interval $[1, n^\beta]$ as their values to write to Pad in one iteration of the repeat loop. This implies that just one iteration of the repeat loop is sufficient with probability at least $1-n^{-a}$. The probability of the event that $i$ iterations are not sufficient to terminate is at most $n^{-ia}$, which converges to 0 as $i$ increases, so the algorithm terminates almost surely. One iteration of the repeat loop takes $O(n)$ rounds and it requires $O(n \log n)$ random bits. □

Algorithm Arbitrary-Constant-LV is optimal among Las Vegas naming algorithms with respect to its expected running time $O(n)$, given the amount $O(1)$ of its available shared memory, by Corollary 1, and the expected number of random bits $O(n \log n)$, by Proposition 1 in Section 5.2.
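The round structure of Arbitrary-Constant-LV can be mirrored in a small simulation. The sketch below is a simplification under stated assumptions: the Arbitrary write is modeled by letting one randomly chosen contender's value land in Pad, one loop iteration stands for a whole write–read–verify round, and the function name is ours for illustration, not part of the thesis.

```python
import random

def arbitrary_constant_lv_attempt(n, beta, rng):
    """One iteration of the outer repeat loop; returns names, or None if Counter < n."""
    bins = [rng.randrange(1, n ** beta + 1) for _ in range(n)]  # private bin_v
    names = [0] * n
    counter = 0                                    # shared Counter
    for _ in range(n):
        writers = [v for v in range(n) if names[v] == 0]
        if not writers:
            break
        pad = bins[rng.choice(writers)]            # Arbitrary write: one value lands
        winners = [v for v in writers if bins[v] == pad]
        counter += 1                               # winners write Counter + 1 concurrently
        for v in winners:
            names[v] = counter                     # duplicate names possible here
    return names if counter == n else None         # verification: Counter = n

rng = random.Random(7)
names = None
while names is None:                               # the outer repeat loop
    names = arbitrary_constant_lv_attempt(8, 3, rng)
print(sorted(names))
```

An attempt fails exactly when two processors drew the same value of $bin_v$, in which case they grab the same name in one round and Counter falls short of $n$, just as in the analysis.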
6.2 Arbitrary with Unbounded Memory

We give an algorithm for an Arbitrary PRAM in the case when there is an unbounded supply of initialized shared memory cells. This algorithm is called Arbitrary-Unbounded-LV.

The algorithm uses two arrays Bin and Counter of $n/\ln n$ shared memory cells each. An execution proceeds by repeated attempts to assign names. During each such attempt, the processors work to assign tentative names. Next, the number of distinct tentative names is obtained, and if the count equals $n$ then the tentative names become final; otherwise another attempt is made. We assume that each such attempt uses a new segment of memory cells Counter initialized to 0s; this is only to simplify the exposition and analysis, because this memory can be reset to 0 with a straightforward randomized algorithm, which is omitted. An attempt to assign tentative names proceeds by each processor $v$ selecting two integers $bin_v$ and $label_v$ uniformly at random, where $bin_v \in [1, n/\ln n]$ and $label_v \in [1, n^\beta]$. Next the processors
Algorithm Arbitrary-Unbounded-LV

    repeat
        allocate Counter[1, n/ln n]    /* fresh memory cells initialized to 0s */
        initialize position_v ← (0, 0) ; bin_v ← a random integer in [1, n/ln n] ;
                   label_v ← a random integer in [1, n^β]
        repeat
            initialize AllNamed ← true
            if position_v = (0, 0) then
                Bin[bin_v] ← label_v
                if Bin[bin_v] = label_v then
                    Counter[bin_v] ← Counter[bin_v] + 1
                    position_v ← (bin_v, Counter[bin_v])
                else AllNamed ← false
        until AllNamed    /* each processor has a tentative name */
        name_v ← rank of position_v
    until n is the maximum name    /* no duplicates among tentative names */

Figure 6.2: A pseudocode for a processor $v$ of an Arbitrary PRAM, where the number of shared memory cells is unbounded. The variables Bin and Counter denote arrays of $n/\ln n$ shared memory cells each; the variable AllNamed is also shared. The private variable $name_v$ stores the acquired name. The constant $\beta > 0$ is a parameter to be determined by analysis.

repeatedly attempt to write $label_v$ into $Bin[bin_v]$. Each such write is followed by a read, and the lucky writer uses $Counter[bin_v]$ to create a pair of numbers $(bin_v, Counter[bin_v])$, after first incrementing $Counter[bin_v]$; this pair is called $bin_v$'s position and is stored in variable $position_v$. After all processors have their positions determined, we define their ranks as follows. To find the rank of $position_v$, we arrange all such pairs in lexicographic order, comparing first on $bin_v$ and then on $Counter[bin_v]$, and the rank is the position of this entry in the resulting list, where the first entry has position 1, the second 2, and so on. Ranks can be computed using a prefix-type algorithm operating in time $O(\log n)$. This algorithm first finds, for each $bin \in [1, n/\ln n]$, the sum $s_{bin} = \sum_{1 \le i < bin} Counter[i]$. Next, each processor $v$ with a position $(bin_v, c)$ assigns to itself $s_{bin_v} + c$ as its rank. After ranks have been computed,
they are used as tentative names. Pseudocode for algorithm Arbitrary-Unbounded-LV is given in Figure 6.2.

In the analysis of algorithm Arbitrary-Unbounded-LV we will refer to the following bound on independent Bernoulli trials. Let $S_n$ be the number of successes in $n$ independent Bernoulli trials, with $p$ as the probability of success. Let $b(i; n, p)$ be the probability of an occurrence of exactly $i$ successes. For $r > np$, the following bound holds:
\[ \Pr(S_n \ge r) \le b(r; n, p)\,\frac{r(1-p)}{r-np}, \tag{6.2} \]
see Feller [42].

Balls into bins. We consider throwing $n$ balls into $n/\ln n$ bins. Each ball has a label assigned randomly from the range $[1, n^\beta]$, for $\beta > 0$. We say that a labeled collision occurs when there are two balls with the same labels in the same bin. We refer to this as the labeled balls-into-bins process.

Lemma 5. For each $a > 0$ there exist $\beta > 0$ and $c > 0$ such that when $n$ balls are labeled with random integers in $[1, n^\beta]$ and next are thrown into $n/\ln n$ bins during the labeled process, then there are at most $c \ln n$ balls in every bin and no labeled collision occurs, with probability at least $1-n^{-a}$.

Proof: We estimate from above the probabilities of the event that there are more than $c \ln n$ balls in some bin and of the event that there is a labeled collision. We show that each of them can be made to be at most $n^{-a}/2$, from which it follows that at least one of these two events occurs with probability at most $n^{-a}$.

Let $p$ denote the probability of selecting a specific bin when throwing a ball, which is $p = \frac{\ln n}{n}$. When we set $r = c \ln n$, for a sufficiently large $c > 1$, then
\[ b(r; n, p) = \binom{n}{c \ln n} \left(\frac{\ln n}{n}\right)^{c \ln n} \left(1-\frac{\ln n}{n}\right)^{n-c\ln n}. \tag{6.3} \]
Formula (6.3) translates (6.2) into the following bound:
\[ \Pr(S_n \ge r) \le \binom{n}{c \ln n} \left(\frac{\ln n}{n}\right)^{c \ln n} \left(1-\frac{\ln n}{n}\right)^{n-c\ln n} \cdot \frac{c \ln n\,\bigl(1-\frac{\ln n}{n}\bigr)}{c \ln n - \ln n}. \tag{6.4} \]
The right-hand side of (6.4) can be estimated by the following upper bound:
\begin{align*}
\left(\frac{en}{c\ln n}\right)^{c\ln n} \left(\frac{\ln n}{n}\right)^{c\ln n} \left(1-\frac{\ln n}{n}\right)^{n-c\ln n} \frac{c}{c-1}
&= \left(\frac{e}{c}\right)^{c\ln n} \left(1-\frac{\ln n}{n}\right)^{n} \left(\frac{n}{n-\ln n}\right)^{c\ln n} \frac{c}{c-1} \\
&\le n^{c-c\ln c}\, e^{-\ln n} \left(\frac{n}{n-\ln n}\right)^{c\ln n} \frac{c}{c-1} \\
&\le n^{-(c\ln c - c + 1)},
\end{align*}
for each sufficiently large $n > 0$. This is because
\[ \left(\frac{n}{n-\ln n}\right)^{c\ln n} = \left(1+\frac{\ln n}{n-\ln n}\right)^{c\ln n} \le \exp\left(\frac{c\ln^2 n}{n-\ln n}\right), \]
which converges to 1. The probability that the number of balls in some bin is greater than $c \ln n$ is therefore at most $n \cdot n^{-(c\ln c - c + 1)} = n^{-(c\ln c - c)}$, by the union bound. This probability can be made smaller than $n^{-a}/2$ for a sufficiently large $c > e$.

The probability of a labeled collision is at most that of a collision when $n$ balls are thrown into $n^\beta$ bins. This probability is at most $n^{-\beta+2}$, by bound (6.1) used in the proof of Lemma 4. This number can be made at most $n^{-a}/2$ for a sufficiently large $\beta$. □

Next we summarize the performance of algorithm Arbitrary-Unbounded-LV as a Las Vegas algorithm.
Theorem 5. Algorithm Arbitrary-Unbounded-LV terminates almost surely and there is no error when the algorithm terminates. For any $a > 0$, there exist $\beta > 0$ and $c > 0$ such that the algorithm assigns names within $c \ln n$ time and generates at most $cn \ln n$ random bits with probability at least $1-n^{-a}$.

Proof: The algorithm terminates only when $n$ different names have been assigned, which is provided by the condition that controls the main repeat loop in Figure 6.2. This means that there is no error when the algorithm terminates.

We map executions of the labeled balls-into-bins process on executions of algorithm Arbitrary-Unbounded-LV in a natural manner. The main repeat loop ends after an iteration in which each group
of processors that select the same value for variable $bin_v$ next select distinct values for $label_v$. We interpret the random selections in an execution as throwing $n$ balls into $n/\ln n$ bins, where a number $bin_v$ determines a bin. The number of iterations of the inner repeat loop equals the maximum number of balls in a bin.

For any $a > 0$, it follows that one iteration of the main repeat loop suffices with probability at least $1-n^{-a}$, for a suitable $\beta > 0$, by Lemma 5. It follows that $i$ iterations are executed by termination with probability at most $n^{-ia}$, so the algorithm terminates almost surely.

Let us take $c > 0$ as in Lemma 5. It follows that an iteration of the main repeat loop takes at most $c \ln n$ steps and one processor uses at most $c \ln n$ random bits in this one iteration, with probability at least $1-n^{-a}$. □

Algorithm Arbitrary-Unbounded-LV is optimal among Las Vegas naming algorithms with respect to the following performance measures: the expected time $O(\log n)$, by Theorem 3, the number of shared memory cells $O(n/\log n)$ used to achieve this running time, by Corollary 1, and the expected number of used random bits $O(n \log n)$, by Proposition 1 in Section 5.2.

6.3 Common with Constant Memory

Now we consider the case of a Common PRAM when the number of available shared memory cells is constant. We propose an algorithm called Common-Constant-LV.

An execution of the algorithm is organized as repeated "attempts" to assign temporary names. During such an attempt, each processor without a name chooses uniformly at random an integer in the interval [1, number-of-bins], where number-of-bins is a parameter initialized to $n$; such a selection is interpreted in a probabilistic analysis as throwing a ball into number-of-bins many bins. Next, for each $i \in$ [1, number-of-bins], the processors that selected $i$, if any, verify whether they are unique in their selection of $i$ by executing procedure Verify-Collision, given in Figure 4.1 in Section 4, $\delta \ln n$ times, where $\delta > 0$ is a number that is determined by analysis. After no collision has been detected, a processor that selected
Algorithm Common-Constant-LV

    repeat
        initialize number-of-bins ← n ; name_v ← 0 ; LastName ← 0 ; collision_v ← false
        repeat
            initialize Collision-Detected ← false
            if name_v = 0 then
                bin_v ← random integer in [1, number-of-bins] ; collision_v ← false
            for i ← 1 to number-of-bins do
                for j ← 1 to δ ln n do
                    if bin_v = i then
                        if Verify-Collision() then
                            Collision-Detected ← true ; collision_v ← true
                if bin_v = i and not collision_v then
                    LastName ← LastName + 1 ; name_v ← LastName
            if n − LastName > γ ln n
                then number-of-bins ← n − LastName
                else number-of-bins ← n/ln n
        until not Collision-Detected
    until LastName = n

Figure 6.3: A pseudocode for a processor $v$ of a Common PRAM, where there is a constant number of shared memory cells. Procedure Verify-Collision has its pseudocode in Figure 4.1; the lack of a parameter means the default parameter 1. The variables Collision-Detected and LastName are shared. The private variable $name_v$ stores the acquired name. The constants $\gamma$ and $\delta$ are parameters to be determined by analysis.

$i$ assigns itself a consecutive name by reading and incrementing the shared variable LastName. It takes up to number-of-bins $\cdot\, \delta \ln n$ verifications for collisions for all integers in [1, number-of-bins]. When this is over, the value of variable number-of-bins is modified by decrementing it by the number of new names just assigned when working with the last number-of-bins, unless such decrementing would result in a number number-of-bins that is at most $\gamma \ln n$, in which case variable number-of-bins is set to $n/\ln n$. An attempt ends when all processors have tentative names assigned. These names become final when there
are a total of $n$ of them; otherwise there are duplicates, so another attempt is performed. A pseudocode for algorithm Common-Constant-LV is in Figure 6.3, in which the main repeat loop represents an attempt to assign tentative names to each processor. An iteration of the inner repeat loop during which number-of-bins $> n/\ln n$ is called shrinking, and otherwise it is called restored.

Balls into bins. As a preparation for the analysis of the performance of algorithm Common-Constant-LV, we consider a related process of repeatedly throwing balls into bins. The process proceeds through stages, each representing one iteration of the inner repeat loop in Figure 6.3. A stage results in some balls removed and some transitioning to the next stage, so that eventually no balls remain and the process terminates. The balls that participate in a stage are called eligible for the stage. In the first stage, $n$ balls are eligible and we throw $n$ balls into $n$ bins. Initially, we apply the principle that after all eligible balls have been placed into bins during a stage, the singleton bins along with the balls in them are removed. A stage after which bins are removed is called shrinking. There are $k$ bins and $k$ balls in a shrinking stage; we refer to $k$ as the length of this stage. Given balls and bins for any stage, we choose a bin uniformly at random and independently for each ball in the beginning of a stage and next place the balls into their selected destinations. The bins that either are empty or multiple in a shrinking stage stay for the next stage. The balls from multiple bins become eligible for the next stage.

This continues until such a shrinking stage after which at most $\gamma \ln n$ balls remain. Then we restore bins for a total of $n/\ln n$ of them to be used in the following stages, during which we never remove any bin; these stages are called restored. In these final restored stages, we keep removing singleton balls at the end of a stage, while balls from multiple bins stay as eligible for the next restored stage. This continues until all balls are removed.
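The shrinking and restored stages just described can be simulated directly. The following sketch is our own illustrative code, not the thesis's: `gamma` stands for the constant $\gamma$ of the analysis, a shrinking stage throws $k$ balls into $k$ bins, and a restored stage uses a fixed pool of $n/\ln n$ bins. By Lemma 6 below, the total length of shrinking stages should stay below $2en$ with high probability.

```python
import math
import random

def stage(balls, bins, rng):
    """Throw `balls` eligible balls into `bins` bins; return count of non-singleton balls."""
    load = {}
    for _ in range(balls):
        b = rng.randrange(bins)
        load[b] = load.get(b, 0) + 1
    return sum(c for c in load.values() if c > 1)

def run_process(n, gamma, rng):
    """Shrinking stages while more than gamma*ln(n) balls remain, then restored stages."""
    shrink_total, balls = 0, n
    while balls > gamma * math.log(n):          # shrinking: bins shrink with the balls
        shrink_total += balls                   # accumulate stage lengths
        balls = stage(balls, balls, rng)
    restored = 0
    while balls > 0:                            # restored: n/ln(n) bins, never removed
        balls = stage(balls, max(1, int(n / math.log(n))), rng)
        restored += 1
    return shrink_total, restored

rng = random.Random(3)
shrink_total, restored = run_process(n=4096, gamma=1.0, rng=rng)
print(shrink_total, restored)
```

Since a constant fraction of balls (roughly $1/e$) becomes singleton in each shrinking stage, the stage lengths decay geometrically and their sum stays linear in $n$.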
Lemma6 Forany a> 0 ,thereexists > 0 suchthatthesumoflengthsofallshrinking stagesinthe processisatmost 2 en ,where e isthebaseofnaturallogarithms,andthere areatmost ln n restoredstages,botheventsholdingwithprobability 1 )]TJ/F18 11.9552 Tf 10.502 0 Td [(n )]TJ/F24 7.9701 Tf 6.587 0 Td [(a ,forsuciently 48
large $n$.

Proof: We consider two cases depending on the kind of analyzed stages. Let $k \le n$ denote the length of a stage.

In a shrinking stage, we throw $k$ balls into $k$ bins, choosing bins independently and uniformly at random. The probability that a ball ends up singleton can be bounded from below as follows:
\[ k \cdot \frac{1}{k}\left(1-\frac{1}{k}\right)^{k-1} \ge \left(e^{-\frac{1}{k}-\frac{1}{k^2}}\right)^{k-1} = e^{-1+\frac{1}{k}-\frac{1}{k}+\frac{1}{k^2}} = \frac{1}{e}\, e^{1/k^2} \ge \frac{1}{e}, \]
where we used the inequality $1-x \ge e^{-x-x^2}$, which holds for $0 \le x \le \frac{1}{2}$.

Let $Z_k$ be the number of singleton balls after $k$ balls are thrown into $k$ bins. It follows that the expectation of $Z_k$ satisfies $E[Z_k] \ge k/e$.

To estimate the deviation of $Z_k$ from its expected value, we use the bounded differences inequality [71, 75]. Let $B_j$ be the bin of ball $b_j$, for $1 \le j \le k$. Then $Z_k$ is of the form $Z_k = h(B_1, \ldots, B_k)$, where $h$ satisfies the Lipschitz condition with constant 2, because moving one ball to a different bin results in changing the value of $h$ by at most 2 with respect to the original value. The bounded differences inequality specialized to this instance is as follows, for any $d > 0$:
\[ \Pr\left(Z_k \le E[Z_k] - d\sqrt{k}\right) \le \exp(-d^2/8). \tag{6.5} \]
We use this inequality for $d = \frac{\sqrt{k}}{2e}$. Then (6.5) implies the following bound:
\[ \Pr\left(Z_k \le \frac{k}{e}-\frac{k}{2e}\right) = \Pr\left(Z_k \le \frac{k}{2e}\right) \le \exp\left(-\frac{1}{8}\left(\frac{\sqrt{k}}{2e}\right)^2\right) = \exp\left(-\frac{k}{32e^2}\right). \]
If we start a shrinking stage with $k$ eligible balls then the number of balls eligible for the next stage is at most
\[ \left(1-\frac{1}{2e}\right)k = \frac{2e-1}{2e}\,k, \]
with probability at least $1-\exp(-k/(32e^2))$. Let us continue shrinking stages as long as the inequality $\frac{k}{32e^2} > 3a \ln n$ holds. We denote this inequality concisely as $k > \gamma \ln n$, for
$\gamma = 96e^2 a$. Then the probability that every shrinking stage results in the size of the pool of eligible balls decreasing by a factor of at least $\frac{2e-1}{2e} = \frac{1}{f}$ is itself at least
\[ \left(1-e^{-3a\ln n}\right)^{\log_f n} \ge 1-(\log_f n)\, n^{-3a} \ge 1-n^{-2a}, \]
for sufficiently large $n$, by Bernoulli's inequality.

If all shrinking stages result in the size of the pool of eligible balls decreasing by a factor of at least $1/f$, then the total number of eligible balls summed over all such stages is at most
\[ n \sum_{i \ge 0} f^{-i} = n\, \frac{1}{1-f^{-1}} = 2en. \]

In a restored stage, there are at most $\gamma \ln n$ eligible balls. A restored stage happens to be the last one when all the balls become singleton after their placement, which occurs with probability at least
\[ \left(\frac{n/\ln n - \gamma \ln n}{n/\ln n}\right)^{\gamma \ln n} = \left(1-\frac{\gamma \ln^2 n}{n}\right)^{\gamma \ln n} \ge 1-\frac{\gamma^2 \ln^3 n}{n}, \]
by Bernoulli's inequality. It follows that there are more than $\gamma \ln n$ restored stages with probability at most
\[ \left(\frac{\gamma^2 \ln^3 n}{n}\right)^{\gamma \ln n} = n^{-\Omega(\log n)}. \]
This bound is at most $n^{-2a}$ for sufficiently large $n$.

Both events, one about shrinking stages and the other about restored stages, hold with probability at least $1-2n^{-2a} \ge 1-n^{-a}$, for sufficiently large $n$. □

Next we summarize the performance of algorithm Common-Constant-LV as a Las Vegas algorithm. In its proof, we rely on mapping executions of the balls-into-bins process on executions of algorithm Common-Constant-LV in a natural manner.

Theorem 6. Algorithm Common-Constant-LV terminates almost surely and there is no error when the algorithm terminates. For any $a > 0$ there exist $\delta > 0$ and $c > 0$ such that the
algorithm terminates within time $cn \ln n$ using at most $cn \ln n$ random bits with probability at least $1-n^{-a}$.

Proof: The condition controlling the main repeat loop guarantees that an execution terminates only when the assigned names fill the interval $[1, n]$, so they are distinct.

To analyze time performance, we consider the process of throwing balls into bins as considered in Lemma 6. Let $\gamma_1 > 0$ be the constant $\gamma$ specified in this Lemma, as determined by $a$ replaced by $2a$ in its assumptions. This Lemma gives that the sum of all lengths $K$ of shrinking stages is at most $2en$ with probability at least $1-n^{-2a}$.

For a given stage of length $K$ and a number $i \in [1, K]$, procedure Verify-Collision is executed $\delta \ln n$ times, where $\delta$ is the parameter in Figure 6.3. If there is a collision then it stays undetected through all these executions with probability at most $2^{-\delta \ln n}$. We may take $\delta \ge \gamma_1$ sufficiently large so that the inequality $2en \cdot 2^{-\delta \ln n} \le n^{-2a}$ holds for sufficiently large $n$.
6.4 Common with Unbounded Memory

Now we consider the last case, when the PRAM is of its Common variant and there is an unbounded amount of shared memory. We propose an algorithm called Common-Unbounded-LV. The algorithm invokes procedure Verify-Collision, whose pseudocode is in Figure 4.1.

An execution proceeds as a sequence of "attempts" to assign temporary names. When such an attempt results in assigning temporary names without duplicates then these transient names become final. An attempt begins from each processor selecting an integer from the interval $[1, (\alpha+1)n]$ uniformly at random and independently, where $\alpha$ is a parameter such that only $\alpha > 1$ is assumed. Next, for $\lg n$ steps, each processor executes procedure Verify-Collision($x$), where $x$ is the currently selected integer. If a collision is detected then a processor immediately selects another number in $[1, (\alpha+1)n]$ and continues verifying for a collision. After $\lg n$ such steps, the processors count the total number of selections of different integers. If this number equals exactly $n$ then the ranks of the selected integers are assigned as names; otherwise another attempt to find names is made. Computing the number of selections and the ranks takes time $O(\log n)$. In order to amortize this time $O(\log n)$ by verifications, such a computation of ranks is performed only after $\lg n$ verifications. Here the rank of a selected $x$ is the number of selected integers that are at most $x$. A pseudocode for algorithm Common-Unbounded-LV is given in Figure 6.4. Subroutines of prefix type, like computing the number of selections and the ranks of selected numbers, are not included in this pseudocode.
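Ignoring the shared-memory details, the selection-and-re-pick dynamics of this attempt can be sketched as follows. This is our own illustrative Python, with Verify-Collision idealized as instantaneous and error-free, which the analysis below deliberately does not assume:

```python
import random

def common_unbounded_attempt(n, alpha, rng):
    """Idealized stages: colliding processors re-pick until all n values are distinct.

    Returns (number of re-pick stages, names assigned by rank of the chosen value)."""
    m = (alpha + 1) * n
    choices = [rng.randrange(m) for _ in range(n)]
    stages = 0
    while True:
        counts = {}
        for x in choices:
            counts[x] = counts.get(x, 0) + 1
        colliding = [i for i in range(n) if counts[choices[i]] > 1]
        if not colliding:                       # all selections unique: assign ranks
            rank = {x: r + 1 for r, x in enumerate(sorted(counts))}
            return stages, [rank[x] for x in choices]
        stages += 1
        for i in colliding:                     # every colliding processor re-picks
            choices[i] = rng.randrange(m)

rng = random.Random(11)
stages, names = common_unbounded_attempt(n=256, alpha=2, rng=rng)
print(stages, sorted(names) == list(range(1, 257)))
```

Because at least an $\frac{\alpha}{\alpha+1}$ fraction of bins stays empty, the number of colliding processors shrinks geometrically, so the number of stages is logarithmic in $n$, matching the $c \lg n$ bound proved below.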
Algorithm Common-Unbounded-LV

    x ← random integer in [1, (α+1)n]    /* throw a ball into bin x */
    repeat
        for i ← 1 to lg n do
            if Verify-Collision(x) then
                x ← random integer in [1, (α+1)n]
        number-occupied-bins ← the total number of selected values for x
    until number-occupied-bins = n
    name_v ← the rank of bin x among nonempty bins

Figure 6.4: A pseudocode for a processor $v$ of a Common PRAM, where the number of shared memory cells is unbounded. The constant $\alpha$ is a parameter that satisfies the inequality $\alpha > 1$. The private variable $name_v$ stores the acquired name.

Balls into bins. We consider auxiliary processes of placing balls into bins that abstract operations on shared memory as performed by algorithm Common-Unbounded-LV. The first such process is about placing $n$ balls into $(\alpha+1)n$ bins. The process is structured as a sequence of stages. A stage represents an abstraction of one iteration of the inner for loop in Figure 6.4, performed as if collisions were detected instantaneously and with certainty. When a ball is moved then it is placed in a bin selected uniformly at random, all such selections independent from one another. The stages are performed as follows. In the first stage, $n$ balls are placed into $(\alpha+1)n$ bins. When a bin is singleton in the beginning of a stage then the ball in the bin stays put through the stage. When a bin is multiple in the beginning of a stage, then all the balls in this bin participate actively in this stage: they are removed from the bin and placed in randomly selected bins. The process terminates after a stage in which all balls reside in singleton bins. It is convenient to visualize a stage as occurring by first removing all balls from multiple bins and then placing the removed balls in randomly selected bins one by one.

We associate the mimicking walk with each execution of this ball process. Such a walk is performed on points with integer coordinates on a line. The mimicking walk proceeds through stages, similarly as the ball process. When we are to relocate $k$ balls in a stage of the ball process then this is represented by the mimicking walk starting the corresponding stage at coordinate $k$. Suppose that we process a ball in a stage and the mimicking walk is at some position $i$. Placing this ball in an empty bin decreases the number of balls for the next stage; the respective action in the mimicking walk is to decrement its position from $i$ to $i-1$. Placing this ball in an occupied bin increases the number of balls for the next stage; the
respective action in the mimicking walk is to increment its position from $i$ to $i+1$. The mimicking walk gives a conservative estimate on the behavior of the ball process, as we show next.

Lemma 7. If a stage of the mimicking walk ends at a position $k$, then the corresponding stage of the ball process ends with at most $k$ balls to be relocated into bins in the next stage.

Proof: The argument is broken into three cases, in which we consider what happens in the ball process and what are the corresponding actions in the mimicking walk. A number of balls in a bin in a stage is meant to be the final number of balls in this bin at the end of the stage.

In the first case, just one ball is placed in a bin that begins the stage as empty. Then this ball will not be relocated in the next stage. This means that the number of balls for the next stage decreases by 1. At the same time, the mimicking walk decrements its position by 1.

In the second case, some $j \ge 1$ balls land in a bin that is singleton at the start of this stage, so the ball originally in it was not eligible for the stage. Then the number of balls in the bin becomes $j+1$ and these many balls will need to be relocated in the next stage. Observe that this contributes to incrementing the number of the eligible balls in the next stage by 1, because only the original ball residing in the singleton bin is added to the set of eligible balls, while the other balls participate in both stages. At the same time, the mimicking walk increments its position by 1, $j$ times.

In the third and final case, some $j \ge 2$ balls land in a bin that is empty at the start of this stage. Then this does not contribute to a change in the number of balls eligible for relocation in the next stage, as these $j$ balls participate in both stages. Let us consider these balls as placed in the bin one by one. The first ball makes the mimicking walk decrement its position. The second ball makes the walk increment its position, so that it returns to the original position as at the start of the stage. The following ball placements, if any, result in
the walk incrementing its position. □

Random walks. Next we consider a random walk which will estimate the behavior of a ball process. One component of the estimation is provided by Lemma 7, in that we will interpret a random walk as a mimicking walk for the ball process.

The random walk is represented as movements of a marker placed on the nonnegative side of the integer number line. The movements of the marker are by distance 1 and they are independent. The random walk has the marker's position incremented with probability $\frac{1}{\alpha+1}$ and decremented with probability $\frac{\alpha}{\alpha+1}$. This may be interpreted as a sequence of independent Bernoulli trials, in which $\frac{\alpha}{\alpha+1}$ is chosen to be the probability of success. We will consider $\alpha > 1$, for which $\frac{\alpha}{\alpha+1} > \frac{1}{\alpha+1}$, which means that the probability of success is greater than the probability of failure.

Such a random walk proceeds through stages, which are defined as follows. The first stage begins at position $n$. When a stage begins at a position $k$ then it ends after $k$ moves, unless it reaches the zero coordinate in the meantime. The zero point acts as an absorbing barrier, and when the walk's position reaches it then the random walk terminates. This is the only way in which the walk terminates. A stage captures one round of the PRAM's computation, and the number of moves in a stage represents the number of writes processors perform in a round.

Lemma 8. For any numbers $a > 0$ and $\alpha > 1$, there exists $b > 0$ such that the random walk starting at position $n > 0$ terminates within $b \ln n$ stages, with all of them comprising $O(n)$ moves, with probability at least $1-n^{-a}$.

Proof: Suppose the random walk starts at position $k > 0$ when a stage begins. Let $X_k$ be the number of moves towards 0 and $Y_k = k - X_k$ be the number of moves away from 0 in such a stage. The total distance covered towards 0, which we call the drift, is
\[ L_k = X_k - Y_k = X_k - (k - X_k) = 2X_k - k. \]
The expected value of $X_k$ is $E[X_k] = \frac{\alpha}{\alpha+1}\,k$. The event $X_k < (\frac{\alpha}{\alpha+1}-\varepsilon)k$ holds with
probability at most $\exp(-\frac{\varepsilon^2}{2}\cdot\frac{\alpha}{\alpha+1}\,k)$, by the Chernoff bound [75], so that $X_k \ge (\frac{\alpha}{\alpha+1}-\varepsilon)k$ occurs with the respective high probability. We say that such a stage is conforming when the event $X_k \ge (\frac{\alpha}{\alpha+1}-\varepsilon)k$ holds.

If a stage is conforming then the following inequality holds:
\[ L_k \ge 2\left(\frac{\alpha}{\alpha+1}-\varepsilon\right)k - k = \left(-2\varepsilon+\frac{\alpha-1}{\alpha+1}\right)k. \]
We want the inequality $\frac{\alpha-1}{\alpha+1}-2\varepsilon > 0$ to hold, which is the case when $\varepsilon < \frac{\alpha-1}{2(\alpha+1)}$. Let us fix such an $\varepsilon > 0$. Now the distance from 0 after $k$ steps starting at $k$ is
\[ k - L_k \le \left(1-\frac{\alpha-1}{\alpha+1}+2\varepsilon\right)k = \frac{2+2\varepsilon(\alpha+1)}{\alpha+1}\,k, \]
where $\frac{2+2\varepsilon(\alpha+1)}{\alpha+1} < 1$ for $\varepsilon < \frac{\alpha-1}{2(\alpha+1)}$. Let $\rho = \frac{\alpha+1}{2+2\varepsilon(\alpha+1)} > 1$. Consecutive $i$ conforming stages make the distance from 0 decrease by at least a factor of $\rho^{-i}$.

When we start the first stage at position $n$ and the next $\log_\rho n$ stages are conforming, then after these many stages the random walk ends up at a position that is close to 0. For our purposes, it suffices that the position is of distance at most $s \ln n$ from 0, for some $s > 0$, because of its impact on probability. Namely, the event that all these stages are conforming and the bound $s \ln n$ on the distance from 0 holds occurs with probability at least
\[ 1-(\log_\rho n)\exp\left(-\frac{\varepsilon^2}{2}\cdot\frac{\alpha}{\alpha+1}\,s\ln n\right) \ge 1-(\log_\rho n)\, n^{-\frac{\varepsilon^2}{2}\cdot\frac{\alpha}{\alpha+1}s}. \]
Let us choose $s > 0$ such that
\[ (\log_\rho n)\, n^{-\frac{\varepsilon^2}{2}\cdot\frac{\alpha}{\alpha+1}s} \le \frac{1}{2}\,n^{-a}, \]
for sufficiently large $n$.

Having fixed $s$, let us take $t > 0$ such that the distance covered towards 0 is at least $s \ln n$ when starting from $k = t \ln n$ and performing $k$ steps. We interpret these movements as if this was a single conceptual stage for the sake of the argument, but its duration comprises all stages from when we start from $s \ln n$ until we terminate at 0. It follows that the conceptual stage comprises at most $t \ln n$ real stages, because a stage takes at least one round.
If this last conceptual stage is conforming then the distance covered towards 0 is bounded by
\[ L_k \ge \left(\frac{\alpha-1}{\alpha+1}-2\varepsilon\right)k. \]
We want this to be at least $s \ln n$ for $k = t \ln n$, which is equivalent to
\[ \left(\frac{\alpha-1}{\alpha+1}-2\varepsilon\right)t > s. \]
Now it is sufficient to take $t > s\,\frac{\alpha+1}{\alpha-1-2\varepsilon(\alpha+1)}$. This last conceptual stage is not conforming with probability at most $\exp(-\frac{\varepsilon^2}{2}\cdot\frac{\alpha}{\alpha+1}\,t\ln n)$. Let us take $t$ that is additionally big enough for the following inequality to hold:
\[ \exp\left(-\frac{\varepsilon^2}{2}\cdot\frac{\alpha}{\alpha+1}\,t\ln n\right) = n^{-\frac{\varepsilon^2}{2}\cdot\frac{\alpha}{\alpha+1}t} \le \frac{1}{2}\,n^{-a}. \]
Having selected $s$ and $t$, we can conclude that there are at most $(s+t)\ln n$ stages with probability at least $1-n^{-a}$.

Now let us consider only the total number of moves to the left, $X_m$, and to the right, $Y_m$, after $m$ moves in total, when starting at position $n$. The event $X_m < (\frac{\alpha}{1+\alpha}-\varepsilon)m$ holds with probability at most $\exp(-\frac{\varepsilon^2}{2}\cdot\frac{\alpha}{1+\alpha}\,m)$, by the Chernoff bound [75], so that $X_m \ge (\frac{\alpha}{1+\alpha}-\varepsilon)m$ occurs with the respective high probability $1-\exp(-\frac{\varepsilon^2}{2}\cdot\frac{\alpha}{1+\alpha}\,m)$. At the same time, the number of moves away from zero is $Y_m = m - X_m$, so the drift satisfies
\[ X_m - Y_m = 2X_m - m \ge \left(\frac{\alpha-1}{\alpha+1}-2\varepsilon\right)m. \]
We want the inequality $\frac{\alpha-1}{\alpha+1}-2\varepsilon > 0$ to hold, which is the case when $\varepsilon < \frac{\alpha-1}{2(\alpha+1)}$. The drift is at least $n$ when $m = dn$ for $d = \frac{\alpha+1}{\alpha-1-2\varepsilon(\alpha+1)}$. The drift is at least such with probability exponentially close to 1 in $n$, which is at least $1-n^{-a}$ for sufficiently large $n$. □
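The biased walk of Lemma 8 is easy to simulate. The sketch below is our own illustrative code (names and parameter values are ours): it runs stages of the $\alpha$-biased walk starting from position $n$ and reports the number of stages and the total number of moves, which should be logarithmic and linear in $n$, respectively.

```python
import random

def biased_walk(n, alpha, rng):
    """Stages of the alpha-biased walk: a stage starting at position k lasts k moves.

    Returns (number of stages, total moves) until absorption at 0."""
    pos, stages, moves = n, 0, 0
    while pos > 0:
        stages += 1
        k = pos
        for _ in range(k):
            moves += 1
            if rng.random() < alpha / (alpha + 1.0):
                pos -= 1                  # move towards the absorbing barrier
            else:
                pos += 1                  # move away from the barrier
            if pos == 0:                  # absorbed: the walk terminates
                break
    return stages, moves

rng = random.Random(5)
stages, moves = biased_walk(n=1000, alpha=2, rng=rng)
print(stages, moves)
```

For $\alpha = 2$ the expected drift per move is $\frac{\alpha-1}{\alpha+1} = \frac{1}{3}$, so a stage starting at $k$ typically ends near $\frac{2k}{3}$, and the total number of moves concentrates around $3n$.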
Lemma 9. For any numbers $a > 0$ and $\alpha > 1$, there exists $b > 0$ such that the ball process starting with $n$ balls terminates within $b \ln n$ stages after performing $O(n)$ ball throws, with probability at least $1-n^{-a}$.

Proof: We estimate the behavior of the ball process on $n$ balls by the behavior of the random walk starting at position $n$. The justification of the estimation is in two steps. One is the property of mimicking walks given as Lemma 7. The other is provided by Lemma 8 and is justified as follows. The probabilities of decrementing and incrementing the position in the random walk are such that they reflect the probabilities of landing in an empty bin or in an occupied bin. Namely, we use the facts that during an execution of the ball process, there are at most $n$ occupied bins and at least $\alpha n$ empty bins in any round. In the ball process, the probability of landing in an empty bin is at least $\frac{\alpha n}{(\alpha+1)n} = \frac{\alpha}{\alpha+1}$, and the probability of landing in an occupied bin is at most $\frac{n}{(\alpha+1)n} = \frac{1}{\alpha+1}$. This means that the random walk is consistent with Lemma 7 in providing estimates on the time of termination of the ball process from above. □

Incorporating verifications. We consider the random walk with verifications, which is defined as follows. The process proceeds through stages, similarly as the regular random walk. For any round of the walk and a position at which the walk is, we first perform a Bernoulli trial with the probability $\frac{1}{2}$ of success. Such a trial is referred to as a verification, which is positive when a success occurs; otherwise it is negative. After a positive verification a movement of the marker occurs as in the regular walk; otherwise the walk pauses at the given position for this round.

Lemma 10. For any numbers $a > 0$ and $\alpha > 1$, there exists $b > 0$ such that the random walk with verifications starting at position $n > 0$ terminates within $b \ln n$ stages, with all of them comprising a total of $O(n)$ moves, with probability at least $1-n^{-a}$.

Proof: We provide an extension of the proof of Lemma 8, which states a similar property of regular random walks. That proof estimated the durations of stages and the number of moves.
Suppose the regular random walk starts at a position $k$, so that the stage takes $k$ moves. There is a constant $d < 1$ such that the walk ends at a position at most $dk$ with probability exponentially close to 1 in $k$.

Moreover, the proof of Lemma 8 is such that all the values of $k$ considered are at least logarithmic in $n$, which provides at most a polynomial bound on error. A random walk with verifications is slowed down by negative verifications. Observe that a random walk with verifications that is performed $3k$ times undergoes at least $k$ positive verifications with probability exponentially close to 1 in $k$, by the Chernoff bound [75]. This means that the proof of Lemma 8 can be adapted to the case of random walks with verifications almost verbatim, with the modifications contributed by polynomial bounds on error of estimates of the number of positive verifications in stages. $\Box$

Next, we consider a process with verifications, which is defined as follows. The process proceeds through stages, similarly as the regular ball process. The first stage starts with placing $n$ balls into $(\beta+1)n$ bins. For any following stage, we first go through multiple bins and, for each ball in such a bin, we perform a Bernoulli trial with the probability $\frac12$ of success, which we call a verification. A success in a trial is referred to as a positive verification, otherwise it is a negative one. If at least one positive verification occurs for a ball in a multiple bin then all the balls in this bin are relocated in this stage to bins selected uniformly at random and independently for each such ball, otherwise the balls stay put in this bin until the next stage. The process terminates when all the balls are singletons.

Lemma 11. For any numbers $a > 0$ and $\beta > 1$, there exists $b > 0$ such that the process with verifications terminates within $b \ln n$ stages, with all of them comprising the total of $O(n)$ ball throws, with probability at least $1 - n^{-a}$.

Proof: The argument proceeds by combining Lemma 7 with Lemma 10, similarly as Lemma 9 is proved by combining Lemma 7 with Lemma 8. The details follow. For any execution of a ball process with verifications, we consider a "mimicking random walk," also with verifications, defined such that when a ball from a multiple bin is handled then the outcome of a random verification for this ball is mapped on a verification for the corresponding random walk. Observe that for a process with verifications just one positive verification is sufficient among $j - 1$ trials when there are $j > 1$ balls in a multiple bin, so a random walk with verifications provides an upper bound on the time of termination of the process with verifications. The probabilities of decrementing and incrementing position in the random walk with verifications are such that they reflect the probabilities of landing in an empty bin or in an occupied bin, similarly as without verifications. All this gives consistency of a walk with verifications with Lemma 7 in providing estimates on the time of termination of the process from above. $\Box$

Next we summarize the performance of algorithm CommonUnboundedLV as a Las Vegas one. The proof is based on mapping executions of the processes with verifications on executions of algorithm CommonUnboundedLV in a natural manner.

Theorem 7. Algorithm CommonUnboundedLV terminates almost surely, and when the algorithm terminates then there is no error. For each $a > 0$ and any $\beta > 1$ in the pseudocode, there exists $c > 0$ such that the algorithm assigns proper names within time $c \lg n$ and using at most $cn \lg n$ random bits with probability at least $1 - n^{-a}$.

Proof: The algorithm terminates when there are $n$ different ranks, by the condition controlling the repeat loop. As ranks are distinct and each is in the interval $[1, n]$, each name is unique, so there is no error. The repeat loop is executed $O(1)$ times with probability at least $1 - n^{-a}$, by Lemma 11. The repeat loop is performed $i$ times with probability that is at most $n^{-ia}$, so this probability converges to 0 with $i$ increasing. It follows that the algorithm terminates almost surely.
An iteration of the repeat loop in Figure 6.4 takes $O(\log n)$ steps. This is because of the following two facts. First, it consists of $\lg n$ iterations of the for loop, each taking $O(1)$ rounds. Second, it concludes with verifying the until condition, which is carried out by
counting nonempty bins by a prefix-type computation. It follows that the time until termination is $O(\log n)$ with probability $1 - n^{-a}$.

By Lemma 11, the total number of ball throws is $O(n)$ with probability $1 - n^{-a}$. Each placement of a ball requires $O(\log n)$ random bits, so the number of used random bits is $O(n \log n)$ with the same probability. $\Box$

Algorithm CommonUnboundedLV is optimal among Las Vegas naming algorithms with respect to the following performance measures: the expected time $O(\log n)$, by Theorem 3, the number of shared memory cells $O(n)$ used to achieve this running time, by Corollary 1, and the expected number of random bits $O(n \log n)$, by Proposition 1.

6.5 Conclusion

We considered the naming problem for the anonymous synchronous PRAM when the number of processors $n$ is known. We gave Las Vegas algorithms for four variants of the problem, which are determined by the suitable restrictions on concurrent writing and the amount of shared memory. Each of these algorithms is provably optimal for its case with respect to the natural performance metrics, such as expected time as determined by the amount of shared memory and the expected number of used random bits.
7. PRAM: MONTE CARLO ALGORITHMS

We consider naming of anonymous processors of a PRAM when the number of processors $n$ is unknown. The naming problems are determined by two independent specifications: the amount of shared memory and the PRAM variant.

7.1 Arbitrary with Constant Memory

We develop a naming algorithm for an Arbitrary PRAM with a constant number of shared memory cells. The algorithm is called ArbitraryBoundedMC.

The underlying idea is to have all processors repeatedly attempt to obtain tentative names and terminate when the probability of duplicate names is gauged to be sufficiently small. To this end, each processor writes an integer selected from a suitable "selection range" into a shared memory register and next reads this register to verify whether the write was successful or not. A successful write results in each such processor getting a tentative name by reading and incrementing another shared register operating as a counter. One of the challenges here is to determine a selection range from which random integers are chosen for writing. A good selection range is large enough with respect to the number of writers, which is unknown, because when the range is too small then multiple processors may select the same integer, and so all of them get the same tentative name after this integer gets written successfully. The algorithm keeps the size of a selection range growing with each failed attempt to assign tentative names.

There is an inherent tradeoff present, in that on the one hand we want to keep the size of used shared memory small, as a measure of efficiency of the algorithm, while at the same time the larger the range of memory the smaller the probability of collision of random selections from a selection range, and so of the resulting duplicate names. Additionally, increasing the selection range repeatedly costs time for each such repetition, while we also want to minimize the running time as a metric of performance. The algorithm keeps increasing the selection range with a quadratic rate, which turns out to be sufficient to optimize all the performance metrics we measure. The algorithm terminates when the number of selected
Algorithm ArbitraryBoundedMC

initialize k ← 1   /* initial approximation of lg n */
repeat
    initialize LastName ← 0 ; name_v ← 0
    k ← 2k
    bin_v ← random integer in [1, 2^k]   /* throw a ball into a bin */
    repeat
        AllNamed ← true
        if name_v = 0 then
            Pad ← bin_v
            if Pad = bin_v then
                LastName ← LastName + 1
                name_v ← LastName
            else AllNamed ← false
    until AllNamed
until LastName ≤ 2^{k/γ}

Figure 7.1: A pseudocode for a processor v of an Arbitrary PRAM with a constant number of shared memory cells. The variables LastName, AllNamed and Pad are shared. The private variable name stores the acquired name. The constant γ > 0 is a parameter to be determined by analysis.

integers from the current selection range makes a sufficiently small fraction of the size of the used range.

A pseudocode of algorithm ArbitraryBoundedMC is given in Figure 7.1. Its structure is determined by the main repeat loop. Each iteration of the main loop begins with doubling the variable $k$, which determines the selection range $[1, 2^k]$. This means that the size of the selection range increases quadratically with consecutive iterations of the main repeat loop. A processor begins an iteration of the main loop by choosing an integer uniformly at random from the current selection range $[1, 2^k]$. There is an inner repeat loop, nested within the main loop, which assigns tentative names depending on the random selections just made.

All processors repeatedly write to a shared variable Pad and next read to verify if the
write was successful. It is possible that different processors attempt to write the same value and then verify that their write was successful. The shared variable LastName is used to progress through consecutive integers to provide tentative names to be assigned to the latest successful writers. When multiple processors attempt to write the same value to Pad and it gets written successfully, then all of them obtain the same tentative name. The variable LastName, at the end of each iteration of the inner repeat loop, equals the number of occupied bins. The shared variable AllNamed is used to verify if all processors have tentative names. The outer loop terminates when the number of assigned names, which is the same as the number of occupied bins, is smaller than or equal to $2^{k/\gamma}$, where $\gamma > 0$ is a parameter to be determined in analysis.

Balls into bins. We consider the following auxiliary process of throwing balls into bins, for a parameter $\gamma > 0$. The process proceeds through stages identified by consecutive positive integers. The $i$th stage has the number parameter $k$ equal to $k = 2^i$. During a stage, we first throw $n$ balls into the corresponding $2^k$ bins and next count the number of occupied bins. A stage is last in an execution of the process, and so the process terminates, when the number of occupied bins is smaller than or equal to $2^{k/\gamma}$. We observe that the process always terminates. This is because, by its specification, the process terminates by the first stage in which the inequality $n \le 2^{k/\gamma}$ holds, and $n$ is an upper bound on the number of occupied bins in a stage. The inequality $n \le 2^{k/\gamma}$ is equivalent to $n^{\gamma} \le 2^k$ and so to $\gamma \lg n \le k$. Since $k$ goes through consecutive powers of 2, we obtain that the number of stages of the process with $n$ balls is at most $\lg(\gamma\lg n) = \lg\gamma + \lg\lg n$.

We say that such a process is correct when upon termination each ball is in a separate bin, otherwise the process is incorrect.

Lemma 12. For any $a > 0$ there exists $\gamma > 0$ such that the process is incorrect with probability that is at most $n^{-a}$, for sufficiently large $n$.

Proof: The process is incorrect when there are collisions after the last stage. The probability of the intersection of the events "the process terminates" and "there are collisions" is
bounded from above by the probability of any one of these events. Next we show that, for each pair of $k$ and $n$, some of these two events occurs with probability that is at most $n^{-a}$, for a suitable $\gamma$.

First, we consider the event that the process terminates. The probability that there are at most $2^{k/\gamma}$ occupied bins is at most
$$\binom{2^k}{2^{k/\gamma}} \Bigl(\frac{2^{k/\gamma}}{2^k}\Bigr)^{n} \le \Bigl(\frac{e\,2^k}{2^{k/\gamma}}\Bigr)^{2^{k/\gamma}} 2^{-k(1-1/\gamma)\,n} = e^{2^{k/\gamma}}\, 2^{-k(1-1/\gamma)\,(n - 2^{k/\gamma})}. \quad (7.1)$$
We estimate from above the natural logarithm of the right-hand side of (7.1). We obtain the following upper bound:
$$2^{k/\gamma} - k\Bigl(1 - \frac{1}{\gamma}\Bigr)\bigl(n - 2^{k/\gamma}\bigr)\ln 2 < 2^{k/\gamma} - \frac{1}{2}\bigl(n - 2^{k/\gamma}\bigr)\ln 2 = -\frac{\ln 2}{2}\, n + 2^{k/\gamma}\cdot\frac{2 + \ln 2}{2}, \quad (7.2)$$
for $\gamma > 4/3$, as $k \ge 2$. The estimate (7.2) is at most $-n\,\frac{\ln 2}{4}$ when $2^{k/\gamma} \le \varepsilon n$, for $\varepsilon = \frac{\ln 2}{2(2 + \ln 2)}$, by a direct algebraic verification. These restrictions on $k$ and $\gamma$ can be restated as
$$k \le \gamma\lg(\varepsilon n) \quad\text{and}\quad \gamma > 4/3. \quad (7.3)$$
When this condition (7.3) is satisfied, then the probability of at most $2^{k/\gamma}$ occupied bins is at most
$$\exp\Bigl(-n\,\frac{\ln 2}{4}\Bigr) \le n^{-a}$$
for sufficiently large $n$.

Next, let us consider the probability of collisions occurring. Collisions do not occur with probability that is at least
$$\Bigl(1 - \frac{n}{2^k}\Bigr)^{n} \ge 1 - \frac{n^2}{2^k},$$
by Bernoulli's inequality. It follows that the probability of collisions occurring can be bounded from above by $n^2\, 2^{-k}$. This bound in turn is at most $n^{-a}$ when
$$k \ge (2 + a)\lg n. \quad (7.4)$$
In order to have some of the inequalities (7.3) and (7.4) hold for any $k$ and $n$, it is sufficient to have
$$(2 + a)\lg n \le \gamma\lg(\varepsilon n).$$
This determines $\gamma$ as follows:
$$\gamma \ge \frac{(2 + a)\lg n}{\lg n + \lg\varepsilon} \to 2 + a,$$
with $n \to \infty$. We obtain that the inequality $\gamma > 2 + a$ suffices, for $n$ that is large enough. $\Box$

Lemma 13. For each $\gamma > 0$ there exists $c > 0$ such that when the process terminates then the number of bins ever needed is at most $cn$ and the number of random bits ever generated is at most $cn\ln n$.

Proof: The process terminates by the stage in which the inequality $n \le 2^{k/\gamma}$ holds, so $k$ gets to be at most $\gamma\lg n$. We partition the range $[2, \gamma\lg n]$ of values of $k$ into two subranges and consider them separately.

First, when $k$ ranges from 2 to $\lg n$ through the stages, then the numbers of needed bins increase quadratically through the stages, because $k$ is doubled with each transition to the next stage. This means that the total number of all these bins is $O(n)$. At the same time, the number of random bits increases geometrically through the stages, so the total number of random bits a processor uses is $O(\log n)$.

Second, when $k$ ranges from $\lg n$ to $\gamma\lg n$, the number of needed bins is at most $n$ in each stage. There are only $\lg\gamma + 1$ such stages, so the total number of all these bins is at most $(\lg\gamma + 1)\, n$. At the same time, a processor uses at most $\gamma\lg n$ random bits in each of these stages. $\Box$
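A minimal simulation of this auxiliary process may help to see the quantities involved. The sketch below is ours, not from the thesis; `gamma` plays the role of $\gamma$. It runs the stages with $k = 2^i$, throws the balls, and applies the termination test with threshold $2^{k/\gamma}$; occupied bins are represented sparsely as a set, because $2^k$ grows very quickly.

```python
import random

def gamma_process(n, gamma, rng):
    """Stage i uses k = 2**i: throw n balls into 2**k bins chosen uniformly
    at random; stop at the first stage in which the number of occupied bins
    is at most 2**(k/gamma).  Returns the final k and whether the process
    is correct, i.e., whether every ball ended up alone in its bin."""
    i = 0
    while True:
        i += 1
        k = 2 ** i
        balls = [rng.randrange(2 ** k) for _ in range(n)]
        occupied = set(balls)
        if len(occupied) <= 2 ** (k / gamma):
            return k, len(occupied) == n

rng = random.Random(7)
k, correct = gamma_process(100, 3.0, rng)
print(k, correct)
```

For $n = 100$ and $\gamma = 3$, termination requires $2^{k/\gamma} \ge n$, that is $k \ge 3\lg 100 \approx 20$, so the process typically stops at $k = 32$, by which point collisions are very unlikely.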
There is a direct correspondence between iterations of the outer repeat loop and stages of a process. The $i$th stage has the number $k$ equal to the value of $k$ during the $i$th iteration of the outer repeat loop of algorithm ArbitraryBoundedMC, that is, we have $k = 2^i$. We map an execution of the algorithm into a corresponding execution of a process in order to apply Lemmas 12 and 13 in the proof of the following theorem, which summarizes the performance of algorithm ArbitraryBoundedMC and justifies that it is Monte Carlo.

Theorem 8. Algorithm ArbitraryBoundedMC always terminates, for any $\gamma > 0$. For each $a > 0$ there exist $\gamma > 0$ and $c > 0$ such that the algorithm assigns unique names, works in time at most $cn$, and uses at most $cn\ln n$ random bits, all this with probability at least $1 - n^{-a}$.

Proof: The number of stages of the process with $n$ balls is at most $\lg(\gamma\lg n) = \lg\gamma + \lg\lg n$. This is also an upper bound on the number of iterations of the main repeat loop. We conclude that the algorithm always terminates.

The number of bins available in a stage is an upper bound on the number of bins occupied in this stage. The number of bins occupied in a stage equals the number of times the inner repeat loop is iterated, because executing the instruction Pad ← bin eliminates one occupied bin. It follows that the number of bins ever needed is an upper bound on the time of the algorithm. The number of times the inner repeat loop is iterated is recorded in the variable LastName, so the termination condition of the algorithm corresponds to the termination condition of the process.

When the process is correct then the processors obtain distinct names. We conclude that Lemmas 12 and 13 apply when interpreted as statements about the behavior of the algorithm. This implies the following: the names are correct, and the execution terminates in $O(n)$ time while $O(n\log n)$ random bits are used, all this with probability that is at least $1 - n^{-a}$. $\Box$

Algorithm ArbitraryBoundedMC is optimal with respect to the following performance measures: the expected time $O(n)$, by Theorem 2, the expected number of random
bits $O(n\log n)$, by Proposition 1, and the probability of error $n^{-O(1)}$, by Proposition 3.

7.2 Arbitrary with Unbounded Memory

We develop a naming algorithm for an Arbitrary PRAM with an unbounded amount of shared registers. The algorithm is called ArbitraryUnboundedMC.

The underlying idea is to parallelize the process of selection of names applied in Section 7.1 in algorithm ArbitraryBoundedMC, so that multiple processors could acquire information in the same round that later would allow them to obtain names. As algorithm ArbitraryBoundedMC used shared registers Pad and LastName, the new algorithm uses arrays of shared registers playing similar roles. The values read off from LastName cannot be used directly as names, because multiple processors can read the same values, so we need to distinguish between these values to assign names. To this end, we assign ranks to processors based on their lexicographic ordering by pairs of numbers determined by Pad and LastName.

A pseudocode for algorithm ArbitraryUnboundedMC is given in Figure 7.2. It is structured as a repeat loop. In the first iteration, the parameter $k$ equals 1, and in subsequent ones it is determined by iterations of an increasing integer-valued function $r(k)$, which is a parameter. We consider two instantiations of the algorithm, determined by $r(k) = k + 1$ and by $r(k) = 2k$. In one iteration of the main repeat loop, a processor uses two variables $bin \in [1, 2^k/k]$ and $label \in [1, 2^{\gamma k}]$, which are selected independently and uniformly at random from the respective ranges.
Algorithm ArbitraryUnboundedMC

initialize k ← 1   /* initial approximation of lg n */
repeat
    initialize AllNamed ← true
    initialize position_v ← (0, 0)
    k ← r(k)
    bin_v ← random integer in [1, 2^k/k]   /* choose a bin for the ball */
    label_v ← random integer in [1, 2^{γk}]   /* choose a label for the ball */
    for i ← 1 to k do
        if position_v = (0, 0) then
            Pad[bin_v] ← label_v
            if Pad[bin_v] = label_v then
                LastName[bin_v] ← LastName[bin_v] + 1
                position_v ← (bin_v, LastName[bin_v])
    if position_v = (0, 0) then
        AllNamed ← false
until AllNamed
name_v ← the rank of position_v

Figure 7.2: A pseudocode for a processor v of an Arbitrary PRAM, when the number of shared memory cells is unbounded. The variables Pad and LastName are arrays of shared memory cells; the variable AllNamed is shared as well. The private variable name stores the acquired name. The constant γ > 0 and an increasing function r(k) are parameters.

We interpret $bin$ as a bin's number and $label$ as a label for a ball. Processors write their values $label$ into the respective bin by the instruction Pad[bin] ← label and verify what value got written. After a successful write, a processor increments LastName[bin] and assigns the pair (bin, LastName[bin]) as its position. This is repeated $k$ times by way of iterating the inner for loop. This loop has a specific upper bound $k$ on the number of iterations because we want to ascertain that there are at most $k$ balls in each bin. The main repeat loop terminates when all values attempted to be written actually get written. Then processors assign themselves names according to the ranks of their positions. The array LastName is assumed to be initialized to 0's, and in each iteration of the repeat loop we use a fresh region of shared memory to allocate this array.

Balls into bins. We consider a related process of placing labeled balls into bins, which is referred to simply as the process. Such a process proceeds through stages and is parametrized by a function $r(k)$. In the first stage, we have $k = 1$, and given some value of $k$ in a stage, the next stage has this parameter equal to $r(k)$. In a stage with a given $k$, we place $n$ balls into $2^k/k$ bins, with labels from $[1, 2^{\gamma k}]$. The selections of bins and labels are performed independently and uniformly at random. A stage terminates the process when there are
at most $k$ labels of balls in each bin.

Lemma 14. The process always terminates.

Proof: The process terminates by a stage in which the inequality $n \le k$ holds, because $n$ is an upper bound on the number of balls in a bin. This always occurs when the function $r(k)$ is increasing. $\Box$

We expect the process to terminate earlier, as the next lemma states.

Lemma 15. For each $a > 0$, if $k \le \lg n - 2$ and $\gamma \ge 1 + a$ then the probability of halting in the stage is smaller than $n^{-a}$, for sufficiently large $n$.

Proof: We show that when $k$ is suitably small then the probability of at most $k$ different labels in each bin is small. There are $n$ balls placed into $2^k/k$ bins, so there are at least $kn/2^k$ balls in some bin, by the pigeonhole principle. We consider these balls and their labels. The probability that all these balls have at most $k$ labels is at most
$$\binom{2^{\gamma k}}{k}\Bigl(\frac{k}{2^{\gamma k}}\Bigr)^{kn/2^k} \le \Bigl(\frac{e\,2^{\gamma k}}{k}\Bigr)^{k}\Bigl(\frac{k}{2^{\gamma k}}\Bigr)^{kn/2^k} = e^{k}\Bigl(\frac{k}{2^{\gamma k}}\Bigr)^{kn/2^k - k}. \quad (7.5)$$
We want to show that this is at most $n^{-a}$. We compare the logarithms, to the base of 2, of $n^{-a}$ and of the right-hand side of (7.5), and want the following inequality to hold:
$$k\lg e + \Bigl(\frac{kn}{2^k} - k\Bigr)(\lg k - \gamma k) \le -a\lg n,$$
which is equivalent to the following inequality, by algebra:
$$\frac{n}{2^k} \ge \frac{\lg e + a\,(\lg n)/k}{\gamma k - \lg k} + 1. \quad (7.6)$$
Observe now that, assuming $\gamma \ge a + 1$: if $k < \sqrt{\lg n}$ then the right-hand side of (7.6) is at most $2 + \lg n$ while the left-hand side is at least $\sqrt{n}$, and when $\sqrt{\lg n} \le k \le \lg n - 2$ then the
right-hand side of (7.6) is at most 3 while the left-hand side is at least 4, for sufficiently large $n$. $\Box$

We say that a label collision occurs, in a configuration produced by the process, if some bin contains two balls with the same label.

Lemma 16. For any $a > 0$, if $k > \frac{1}{2}\lg n$ and $\gamma > 4a + 7$ then the probability of a label collision is smaller than $n^{-a}$.

Proof: The number of pairs of a bin number and a label is $2^{\gamma k}\cdot 2^k/k$. It follows that the probability that no two balls in the same bin obtain the same label is at least
$$\Bigl(1 - \frac{n}{2^{(\gamma+1)k}/k}\Bigr)^{n} \ge 1 - \frac{n^2}{2^{(\gamma+1)k}/k},$$
by Bernoulli's inequality. So the probability that two different balls obtain the same label is at most $n^2 k\, 2^{-(\gamma+1)k}$. We want this to be at most $n^{-a}$, which is equivalent to the following inequality:
$$(2 + a)\lg n + \lg k \le (1 + \gamma)\, k.$$
This inequality holds for $k > \frac{1}{2}\lg n$ when $\gamma > 4a + 7$. $\Box$

We say that such a process is correct when upon termination no label collision occurs, otherwise the process is incorrect.

Lemma 17. For any $a > 0$, there exists $\gamma > 0$ such that the process is incorrect with probability that is at most $n^{-a}$, for sufficiently large $n$.
Proof: The process is incorrect when there is a label collision after the last stage. The probability of the intersection of the events "the process terminates" and "there are label collisions" is bounded from above by the probability of any one of these events. Next we show that, for each pair of $k$ and $n$, some of these two events occurs with probability that is at most $n^{-a}$, for a suitable $\gamma$.

To this end we use Lemmas 15 and 16, in which we substitute $2a$ for $a$. We obtain that, on the one hand, if $k \le \lg n - 2$ and $\gamma \ge 1 + 2a$ then the probability of halting is smaller than $n^{-2a}$, and, on the other hand, that if $k > \frac{1}{2}\lg n$ and $\gamma > 8a + 7$ then the probability of a label collision is smaller than $n^{-2a}$. It follows that some of the two considered events occurs with probability at most $2n^{-2a}$, for sufficiently large $\gamma$ and any sufficiently large $n$. This probability is at most $n^{-a}$, for sufficiently large $n$. $\Box$

Lemma 18. For any $a > 0$, there exist $\gamma > 0$ and $c > 0$ such that the following two facts about the process hold. If $r(k) = k + 1$ then at most $cn^{\gamma}/\ln n$ bins are ever needed and $cn\ln^2 n$ random bits are ever generated, each among these properties occurring with probability that is at least $1 - n^{-a}$. If $r(k) = 2k$ then at most $cn^{2\gamma}/\ln n$ bins are ever needed and $cn\ln n$ random bits are ever generated, each among these properties occurring with probability that is at least $1 - n^{-a}$.

Proof: We throw $n$ balls into $2^k/k$ bins. As $k$ keeps increasing, the probability of termination increases as well, because both $2^k/k$ and $k$ increase as functions of $k$. Let us take $k = \gamma(1 + \lg n)$, so that the number of bins is $(2n)^{\gamma}/k$. We want to show that no bin contains more than $k$ balls with a suitably small probability.
Let us consider a specific bin and let $X$ be the number of balls in this bin. The expected number of balls in the bin is $\mu \le k/2$, and we may conservatively estimate $X$ from above with $\mu = k/2$. We use the Chernoff bound for a sequence of Bernoulli trials in the form of
$$\Pr\bigl(X > (1 + \varepsilon)\mu\bigr) < \exp(-\varepsilon^2\mu/3),$$
which holds for $0 < \varepsilon < 1$, see [75]. Let us choose $\varepsilon = \frac12$, so that $1 + \varepsilon = \frac32$ and $\frac32\,\mu = \frac34\, k$.
We obtain that
$$\Pr(X > k) \le \Pr\Bigl(X > \frac{3}{4}\, k\Bigr) < \exp\Bigl(-\frac{1}{4}\cdot\frac{k}{6}\Bigr) = \exp\Bigl(-\frac{\gamma}{24}\,(1 + \lg n)\Bigr),$$
which can be made smaller than $n^{-1-a}$ for a $\gamma$ sufficiently large with respect to $a$, and sufficiently large $n$. Using the union bound, each of the at most $n$ nonempty bins contains at most $k$ balls with probability at least $1 - n^{-a}$. This implies that termination occurs as soon as $k$ reaches or surpasses $\gamma(1 + \lg n)$, with the corresponding large probability $1 - n^{-a}$.

In the case of $r(k) = k + 1$, the consecutive integer values of $k$ are tried, so the process terminates by the time $k = \gamma(1 + \lg n)$, and for this $k$ the number of bins needed is $\Theta(n^{\gamma}/\log n)$. To choose a bin and a label for any value of $k$ requires $O(k)$ random bits, so implementing such choices for $k = 1, 2, \ldots, \gamma(1 + \lg n)$ requires $O(\log^2 n)$ random bits per processor.

In the case of $r(k) = 2k$, the process terminates by $k$ equal to $2\gamma(1 + \lg n)$, and for this value of $k$ the number of bins needed is $\Theta(n^{2\gamma}/\log n)$. As $k$ progresses through consecutive powers of 2, the sum of these numbers of random bits is a sum of a geometric progression, and so is of the order of the maximum term, that is $\Theta(\log n)$, which is the number of random bits per processor. $\Box$

There is a direct correspondence between iterations of the outer repeat loop of algorithm ArbitraryUnboundedMC and stages of the process. We map an execution of the algorithm into a corresponding execution of a process in order to apply Lemmas 17 and 18 in the proof of the following theorem, which summarizes the performance of algorithm ArbitraryUnboundedMC and justifies that it is Monte Carlo.
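The labeled process lends itself to the same kind of simulation. In the sketch below (ours, not from the thesis; `gamma` corresponds to $\gamma$ and `r` to $r(k)$), a stage with parameter $k$ places each ball by an independent pair of a bin and a label, and the process stops at the first stage in which every bin sees at most $k$ distinct labels.

```python
import random
from collections import defaultdict

def labeled_process(n, gamma, r, rng):
    """Each ball picks a bin in [0, 2**k // k) and a label in
    [0, 2**(gamma*k)); the process stops at the first stage in which every
    bin holds at most k distinct labels.  Returns the final k and whether
    the process is correct, i.e., no bin holds two balls with one label."""
    k = 1
    while True:
        bins = defaultdict(list)
        for _ in range(n):
            bins[rng.randrange(max(1, 2 ** k // k))].append(
                rng.randrange(2 ** (gamma * k)))
        if all(len(set(labels)) <= k for labels in bins.values()):
            correct = all(len(set(ls)) == len(ls) for ls in bins.values())
            return k, correct
        k = r(k)

rng = random.Random(3)
k, correct = labeled_process(200, 8, lambda k: 2 * k, rng)
print(k, correct)
```

With $r(k) = 2k$ the parameter $k$ doubles between stages, so only logarithmically many stages occur before $k$ exceeds $\lg n$, mirroring the count of stages in the analysis above.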
Theorem 9. Algorithm ArbitraryUnboundedMC always terminates, for any $\gamma > 0$. For each $a > 0$ there exist $\gamma > 0$ and $c > 0$ such that the algorithm assigns unique names and has the following additional properties with probability $1 - n^{-a}$. If $r(k) = k + 1$ then at most $cn^{\gamma}/\ln n$ memory cells are ever needed, $cn\ln^2 n$ random bits are ever generated, and the algorithm terminates in time $O(\log^2 n)$. If $r(k) = 2k$ then at most $cn^{2\gamma}/\ln n$ memory cells are ever needed, $cn\ln n$ random bits are ever generated, and the algorithm terminates in time
$O(\log n)$.

Proof: The algorithm always terminates, by Lemma 14. By Lemma 17, the algorithm assigns correct names with probability that is at least $1 - n^{-a}$. The remaining properties follow from Lemma 18, because the number of bins is proportional to the number of memory cells and the number of random bits per processor is proportional to time. $\Box$

The instantiations of algorithm ArbitraryUnboundedMC are close to optimality with respect to some of the performance metrics we consider, depending on whether $r(k) = k + 1$ or $r(k) = 2k$. If $r(k) = k + 1$ then the algorithm's use of shared memory would be optimal if its time were $O(\log n)$, by Theorem 2, but it may miss space optimality by at most a logarithmic factor, since the algorithm's time is $O(\log^2 n)$. Similarly, if $r(k) = k + 1$ then the number of random bits ever generated, $O(n\log^2 n)$, misses optimality by at most a logarithmic factor, by Proposition 1. On the other hand, if $r(k) = 2k$ then the expected time $O(\log n)$ is optimal, by Theorem 3, the expected number of random bits $O(n\log n)$ is optimal, by Proposition 1, and the probability of error $n^{-O(1)}$ is optimal, by Proposition 3, but the amount of used shared memory misses optimality by at most a polynomial factor, by Theorem 2.

7.3 Common with Bounded Memory

Algorithm CommonBoundedMC solves the naming problem for a Common PRAM with a constant number of shared read-write registers. To make its exposition more modular, we use two procedures, EstimateSize and ExtendNames. Procedure EstimateSize produces an estimate of the number $n$ of processors. Procedure ExtendNames is iterated multiple times; each iteration is intended to assign names to a group of processors. This is accomplished by the processors selecting integer values at random, interpreted as throwing balls into bins, and verifying for collisions. Each selection of a bin is followed by a collision detection. A ball placement without a detected collision results in a name assigned, otherwise the involved processors try again to throw balls into a range of bins. The effectiveness of
the algorithm hinges on calibrating the number of bins to the expected number of balls to be thrown.

Algorithm CommonBoundedMC has its pseudocode in Figure 7.5. The private variables have the following meaning: size is an approximation of the number of processors $n$, and number-of-bins determines the size of the range of bins. The pseudocodes of procedures EstimateSize and ExtendNames are given in Figures 7.3 and 7.4, respectively.

Balls into bins for the first time. The role of procedure EstimateSize, when called by algorithm CommonBoundedMC, is to estimate the unknown number of processors $n$, which is returned as size, to assign a value to the variable number-of-bins, and to assign values to each private variable bin, which indicates the number of a selected bin in the range [1, number-of-bins]. The procedure tries consecutive values of $k$ as approximations of $\lg n$. For a given $k$, an experiment is carried out to throw $n$ balls into $k\, 2^k$ bins. The execution stops when the number of occupied bins is at most $2^k$, and then $3\cdot 2^k$ is treated as an approximation of $n$ and $k\, 2^k$ is the returned number of bins.

Lemma 19. For $n \ge 20$ processors, procedure EstimateSize returns an estimate size of $n$ such that the inequality size $< 6n$ holds with certainty and the inequality $n <$ size holds with probability $1 - 2^{-\Omega(n)}$.

Proof: The procedure returns $3\cdot 2^k$, for some integer $k > 0$. We interpret selecting values for the variable bin in an iteration of the main repeat loop as throwing $n$ balls into $k\, 2^k$ bins; here $k = j + 2$ in the $j$th iteration of this loop, because the smallest value of $k$ is 3. Clearly, $n$ is an upper bound on the number of occupied bins.

If $n$ is a power of 2, say $n = 2^i$, then the procedure terminates by the time $i = k$, so that $2^k < 2^{i+1} = 2n$. Otherwise, the maximum possible $k$ equals $\lceil\lg n\rceil$, because $2^{\lfloor\lg n\rfloor}$
$< n$ implies $2^{\lceil\lg n\rceil} < 2n$. In both cases size $= 3\cdot 2^k < 6n$, with certainty. The inequality $n <$ size could fail only if the procedure returned for some $k$ with $3\cdot 2^k \le n$. For such a $k$, the $n$ balls would have to fall into at most $2^k$ bins, which happens with probability that is at most
$$\binom{k\,2^k}{2^k}\Bigl(\frac{2^k}{k\,2^k}\Bigr)^{n} \le \Bigl(\frac{e\,k\,2^k}{2^k}\Bigr)^{2^k}\Bigl(\frac{1}{k}\Bigr)^{n} = (ek)^{2^k}\, k^{-n} = e^{2^k}\, k^{2^k - n} \le e^{n/3}\, k^{-2n/3}. \quad (7.7)$$
The right-hand side of (7.7) is at most $e^{-n/3}$ when the inequality $k > e$ holds. The smallest $k$ considered in the pseudocode in Figure 7.3 is $k = 3 > e$. The inequality $k > e$ is consistent with $2^k \le n/3$ when $n \ge 20$. The number of possible values for $k$ is $O(\log n)$, so the probability of the procedure returning for $2^k \le n/3$ is $e^{-n/3}\cdot O(\log n) = 2^{-\Omega(n)}$. $\Box$

Procedure EstimateSize

initialize k ← 2   /* initial approximation of lg n */
repeat
    k ← k + 1
    bin_v ← random integer in [1, k·2^k]
    initialize NonemptyBins ← 0
    for i ← 1 to k·2^k do
        if bin_x = i for some processor x then
            NonemptyBins ← NonemptyBins + 1
until NonemptyBins ≤ 2^k
return (3·2^k, k·2^k)   /* 3·2^k is size, k·2^k is number-of-bins */

Figure 7.3: A pseudocode for a processor v of a Common PRAM. This procedure is invoked by algorithm CommonBoundedMC in Figure 7.5. The variable NonemptyBins is shared.

Procedure ExtendNames's behavior can also be interpreted as throwing balls into bins, where a processor $v$'s ball is in a bin $x$ when bin$_v = x$. The procedure first verifies the suitable range of bins [1, number-of-bins] for collisions. A verification for collisions takes either just a constant time or $\Theta(\log n)$ time.

A constant-time verification occurs when there is no ball in the considered bin $i$, which is verified when the line "if bin$_x = i$ for some processor $x$" in the pseudocode in Figure 7.4 is to be executed. Such a verification is performed by using a shared register initialized to 0,
Procedure ExtendNames

initialize CollisionDetected ← collision_v ← false
for i ← 1 to number-of-bins do
    if bin_x = i for some processor x then
        if bin_v = i then
            for j ← 1 to δ lg size do
                if VerifyCollision then
                    CollisionDetected ← collision_v ← true
            if not collision_v then
                LastName ← LastName + 1
                name_v ← LastName
                bin_v ← 0
if number-of-bins > size then
    number-of-bins ← size
if collision_v then
    bin_v ← random integer in [1, number-of-bins]

Figure 7.4: A pseudocode for a processor v of a Common PRAM. This procedure invokes procedure VerifyCollision, whose pseudocode is in Figure 4.1, and is itself invoked by algorithm CommonBoundedMC in Figure 7.5. The variables LastName and CollisionDetected are shared. The private variable name stores the acquired name. The constant δ > 0 is to be determined in analysis.

into which all processors $v$ with bin$_v = i$ write 1; then all the processors read this register, and if the outcome of reading is 1 then all write 0 again, which indicates that there is at least one ball in the bin, otherwise there is no ball.

A logarithmic-time verification of a collision occurs when there is some ball in the corresponding bin. This triggers calling procedure VerifyCollision precisely $\delta\lg$ size times; notice that this procedure has the default parameter 1, as only one bin is verified at a time. Ultimately, when a collision is not detected for some processor $v$ whose ball is in the bin, then this processor increments LastName and assigns its new value as a tentative name. Otherwise, when a collision is detected, processor $v$ places its ball in a new bin when the last line
in Figure 7.4 is executed. To prepare for this, the variable numberofbins may be reset.

During one iteration of the main repeat loop of the pseudocode of algorithm CommonBoundedMC in Figure 7.5, the number of bins is first set to a value that is $\Theta(n\log n)$ by procedure EstimateSize. Immediately after that, it is reset to $\Theta(n)$ by the first call of procedure ExtendNames, in which the instruction numberofbins ← size is performed. Here, we need to notice that numberofbins $= \Theta(n\log n)$ and size $= \Theta(n)$, by the pseudocodes in Figures 7.3 and 7.5 and Lemma 19.

Balls into bins for the second time. In the course of the analysis of the performance of procedure ExtendNames, we consider a balls-into-bins process; we call it simply the ball process. It proceeds through stages, so that in a stage we have a number of balls which we throw into a number of bins. The sets of bins used in different stages are disjoint. The numbers of balls and bins used in a stage are as determined in the pseudocode in Figure 7.4, which means that there are n balls and the numbers of bins are as determined by an execution of procedure EstimateSize; that is, the first stage uses numberofbins bins and the subsequent stages use size bins, as returned by EstimateSize. The only difference from the actions of procedure ExtendNames is that collisions are detected with certainty in the ball process, rather than being tested for, which implies that the parameter β is not involved. The ball process terminates in stage lg size, or earlier in the first stage in which no multiple bins are produced, when such a stage occurs.

Lemma 20. The ball process results in all balls ending up singleton in their bins, and in the number of times a ball is thrown, summed over all the stages, being $O(n)$, both events occurring with probability $1 - n^{-\Omega(\log n)}$.

Proof: The argument leverages the property that, in each stage, the number of bins exceeds the number of balls by at least a logarithmic factor. We will denote the number of bins in a stage by m. This number will take on two values: first $m = k2^k$, returned as numberofbins by procedure EstimateSize, and then $m = 3\cdot 2^k$, returned as size by the same procedure
EstimateSize, for $k > 3$. Because $m = k2^k$ in the first stage, and also $size = 3\cdot 2^k > n$ by Lemma 19, we obtain that $m > \frac{n}{3}\lg\frac{n}{3}$ in the first stage, and that m is at least n in the following stages, with probability exponentially close to 1.

In the first stage, we throw $\ell_1 = n$ balls into at least $m = \frac{n}{3}\lg\frac{n}{3}$ bins, with large probability. Conditional on the event that there are at least these many bins, the probability that a given ball ends the stage as a singleton in a bin is
$$m\cdot\frac{1}{m}\Bigl(1-\frac{1}{m}\Bigr)^{\ell_1-1} \ge 1 - \frac{\ell_1-1}{m} \ge 1 - \frac{n-1}{\frac{n}{3}\lg\frac{n}{3}} \ge 1 - \frac{4}{\lg n},$$
for sufficiently large n, where we used Bernoulli's inequality. Let $Y_1$ be the number of singleton balls in the first stage. The expectation of $Y_1$ satisfies
$$E[Y_1] \ge \ell_1\Bigl(1 - \frac{4}{\lg n}\Bigr).$$
To estimate the deviation of $Y_1$ from its expected value $E[Y_1]$ we use the bounded-differences inequality [71, 75]. Let $B_j$ be the bin of ball $b_j$, for $1 \le j \le \ell_1$. Then $Y_1$ is of the form $Y_1 = h(B_1,\ldots,B_{\ell_1})$, where h satisfies the Lipschitz condition with constant 2, because moving one ball to a different bin results in changing the value of h by at most 2 with respect to the original value. The bounded-differences inequality specialized to this instance is as follows, for any $d > 0$:
$$\Pr\bigl(Y_1 \le E[Y_1] - d\sqrt{\ell_1}\bigr) \le \exp(-d^2/8). \quad (7.8)$$
We employ $d = \lg n$, which makes the right-hand side of (7.8) asymptotically equal to $n^{-\Omega(\log n)}$. The number of balls $\ell_2$ eligible for the second stage can be estimated as follows, this bound holding with probability $1 - n^{-\Omega(\log n)}$:
$$\ell_2 \le \frac{4\ell_1}{\lg n} + \lg n\sqrt{\ell_1} = \frac{4\ell_1}{\lg n}\Bigl(1 + \frac{\lg^2 n}{4\sqrt{\ell_1}}\Bigr) \le \frac{5n}{\lg n}, \quad (7.9)$$
for sufficiently large n.

In the second stage, we throw $\ell_2$ balls into $m \ge n$ bins, with large probability. Conditional on the bound (7.9) holding, the probability that a given ball ends up as a singleton in
a bin is
$$m\cdot\frac{1}{m}\Bigl(1-\frac{1}{m}\Bigr)^{\ell_2-1} \ge 1 - \frac{\ell_2-1}{m} \ge 1 - \frac{5}{\lg n},$$
where we used Bernoulli's inequality. Let $Y_2$ be the number of singleton balls in the second stage. The expectation of $Y_2$ satisfies
$$E[Y_2] \ge \ell_2\Bigl(1 - \frac{5}{\lg n}\Bigr).$$
To estimate the deviation of $Y_2$ from its expected value $E[Y_2]$, we again use the bounded-differences inequality, which specialized to this instance is as follows, for any $d > 0$:
$$\Pr\bigl(Y_2 \le E[Y_2] - d\sqrt{\ell_2}\bigr) \le \exp(-d^2/8). \quad (7.10)$$
We again employ $d = \lg n$, which makes the right-hand side of (7.10) asymptotically equal to $n^{-\Omega(\log n)}$. The number of balls $\ell_3$ eligible for the third stage can be bounded from above as follows, the bound holding with probability $1 - n^{-\Omega(\log n)}$:
$$\ell_3 \le \frac{5\ell_2}{\lg n} + \lg n\sqrt{\ell_2} = \frac{5\ell_2}{\lg n}\Bigl(1 + \frac{\lg^2 n}{5\sqrt{\ell_2}}\Bigr) \le \frac{6n}{\lg^2 n}, \quad (7.11)$$
for sufficiently large n.

Next, we generalize these estimates. In stages i, for $i \ge 2$, among the first $O(\log n)$ ones, we throw balls into $m \ge n$ bins with large probability. Let $\ell_i$ be the number of balls eligible for such a stage i. We show by induction that $\ell_i$, for $i \ge 3$, can be estimated as follows:
$$\ell_i \le \frac{6n}{\lg^2 n}\,2^{3-i} \quad (7.12)$$
with probability $1 - n^{-\Omega(\log n)}$. The estimate (7.11) provides the base of the induction for $i = 3$. In the inductive step, we assume (7.12), and consider what happens during stage $i > 3$ in order to estimate the number of balls eligible for the next stage $i+1$.
In stage i, we throw $\ell_i$ balls into $m \ge n$ bins, with large probability. Conditional on the bound (7.12), the probability that a given ball ends up single in a bin is
$$m\cdot\frac{1}{m}\Bigl(1-\frac{1}{m}\Bigr)^{\ell_i-1} \ge 1 - \frac{\ell_i-1}{m} \ge 1 - \frac{6\cdot 2^{3-i}}{\lg^2 n},$$
by the inductive assumption, where we also used Bernoulli's inequality. If $Y_i$ is the number of singleton balls in stage i, then its expectation $E[Y_i]$ satisfies
$$E[Y_i] \ge \ell_i\Bigl(1 - \frac{6\cdot 2^{3-i}}{\lg^2 n}\Bigr). \quad (7.13)$$
To estimate the deviation of $Y_i$ from its expected value $E[Y_i]$, we again use the bounded-differences inequality, which specialized to this instance is as follows, for any $d > 0$:
$$\Pr\bigl(Y_i \le E[Y_i] - d\sqrt{\ell_i}\bigr) \le \exp(-d^2/8). \quad (7.14)$$
We employ $d = \lg n$, which makes the right-hand side of (7.14) asymptotically equal to $n^{-\Omega(\log n)}$. The number of balls $\ell_{i+1}$ eligible for the next stage $i+1$ can be estimated from above in the following way, the estimate holding with probability $1 - n^{-\Omega(\log n)}$:
$$\ell_{i+1} \le \frac{6\cdot 2^{3-i}\ell_i}{\lg^2 n} + \lg n\sqrt{\ell_i}
= \frac{6\cdot 2^{3-i}\ell_i}{\lg^2 n}\Bigl(1 + \frac{2^{i-3}\lg^3 n}{6\sqrt{\ell_i}}\Bigr)
\le \frac{6n}{\lg^2 n}\,2^{3-i}\Bigl(\frac{6\cdot 2^{3-i}}{\lg^2 n} + \frac{2^{(i-3)/2}\lg^2 n}{\sqrt{6n}}\Bigr)
\le \frac{6n}{\lg^2 n}\,2^{3-(i+1)},$$
for sufficiently large n that does not depend on i; in the last step we use that in the range of stages we consider $\ell_i > \lg^3 n$, so that $2^{(i-3)/2} \le \sqrt{6n}/\lg^{5/2} n$. For the event $Y_i \le E[Y_i] - d\sqrt{\ell_i}$ in the estimate (7.14) to be meaningful, it is sufficient that the following estimate holds:
$$\lg n\sqrt{\ell_i} = o\bigl(E[Y_i]\bigr).$$
This is the case as long as $\ell_i > \lg^3 n$, because $E[Y_i] = \ell_i(1 + o(1))$ by (7.13).
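Before the argument is summarized, the two ingredients of the stage estimates, the expected singleton count and the $d\sqrt{\ell}$ deviation allowance, can be checked numerically for a single first stage. The sketch below is our own illustration and is not part of the proof; the bin count $m = \frac{n}{3}\lg\frac{n}{3}$ and the choice $d = \lg n$ follow the text, while the function name, the seed, and the value of n are arbitrary:

```python
import math
import random

def singletons_after_one_stage(n, rng):
    # Throw n balls into m ~ (n/3) lg(n/3) bins and count singleton balls.
    m = int((n / 3) * math.log2(n / 3))
    counts = {}
    for _ in range(n):
        b = rng.randrange(m)
        counts[b] = counts.get(b, 0) + 1
    return sum(1 for c in counts.values() if c == 1)

n = 2000
y1 = singletons_after_one_stage(n, random.Random(3))
lg = math.log2(n)
# lower bound E[Y1] >= n(1 - 4/lg n), minus the deviation d*sqrt(l1), d = lg n
lower = n * (1 - 4 / lg) - lg * math.sqrt(n)
```

Running this, the observed singleton count sits comfortably above the analytical lower bound, as the concentration argument predicts.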
To summarize at this point: as long as $\ell_i$ is sufficiently large, that is, $\ell_i > \lg^3 n$, the number of eligible balls decreases by at least a factor of 2 with probability that is at least $1 - n^{-\Omega(\log n)}$. It follows that the total number of eligible balls, summed over these stages, is $O(n)$ with this probability.
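The staged ball process itself can be simulated to watch this linear bound on the total number of throws. This is a centralized sketch under our own simplifying assumptions: the first stage uses $3n\lg n$ bins standing in for numberofbins, and every later stage uses $3n$ bins standing in for size; singleton balls are removed and colliding balls advance:

```python
import random

def ball_process(n, rng):
    # Stages of the ball process: colliding balls advance to the next stage,
    # which uses a fresh (disjoint) set of bins.
    lg = n.bit_length()
    balls, throws, stage = n, 0, 0
    while balls > 0:
        stage += 1
        m = 3 * n * lg if stage == 1 else 3 * n  # numberofbins, then size
        throws += balls
        counts = {}
        for _ in range(balls):
            b = rng.randrange(m)
            counts[b] = counts.get(b, 0) + 1
        # balls that are singleton in their bins are removed
        balls = sum(c for c in counts.values() if c > 1)
    return throws, stage

throws, stages = ball_process(500, random.Random(1))
```

With the bins outnumbering the balls by a logarithmic factor in the first stage, only a small fraction of balls ever advances, so the total count of throws stays close to n.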
Algorithm CommonBoundedMC

  repeat
    initialize LastName ← 0
    (size, numberofbins) ← EstimateSize
    for ℓ ← 1 to lg size do
      ExtendNames
    if not CollisionDetected then return

Figure 7.5: A pseudocode for a processor v of a Common PRAM, where there is a constant number of shared memory cells. Procedures EstimateSize and ExtendNames have their pseudocodes in Figures 7.3 and 7.4, respectively. The variables LastName and CollisionDetected are shared.

After at most lg n such stages, the number of balls becomes at most $\lg^3 n$ with probability $1 - n^{-\Omega(\log n)}$. It remains to consider the stages when $\ell_i \le \lg^3 n$, so that we throw at most $\lg^3 n$ balls into at least n bins. They all end up in singleton bins with a probability that is at least
$$\Bigl(\frac{n - \lg^3 n}{n}\Bigr)^{\lg^3 n} \ge \Bigl(1 - \frac{\lg^3 n}{n}\Bigr)^{\lg^3 n} \ge 1 - \frac{\lg^6 n}{n},$$
by Bernoulli's inequality. So the probability of a collision is at most $\lg^6 n/n$. One stage without any collision terminates the process. If we repeat such stages lg n times, without even removing singleton balls, then the probability of collisions occurring in all these stages is at most
$$\Bigl(\frac{\lg^6 n}{n}\Bigr)^{\lg n} = n^{-\Omega(\log n)}.$$
The number of eligible balls summed over these final stages is only at most $\lg^7 n = o(n)$. □

The following theorem summarizes the performance of algorithm CommonBoundedMC (see the pseudocode in Figure 7.5) as a Monte Carlo one.

Theorem 10. Algorithm CommonBoundedMC terminates almost surely. For each $a > 0$ there exist $\beta > 0$ and $c > 0$ such that the algorithm assigns unique names, works in time at most $cn\ln n$, and uses at most $cn\ln n$ random bits, each among these properties holding with probability at least $1 - n^{-a}$.
Proof: One iteration of the main repeat loop suffices to assign names with probability $1 - n^{-\Omega(\log n)}$, by Lemma 20. This means that the probability of not terminating by the i-th iteration is at most $\bigl(n^{-\Omega(\log n)}\bigr)^i$, which converges to 0 with i growing to infinity.

The algorithm returns duplicate names only when a collision occurs that is not detected by procedure VerifyCollision. For a given multiple bin, one iteration of this procedure does not detect the collision with probability at most 1/2, by Lemma 1. Therefore β lg size iterations do not detect the collision with probability $O(n^{-\beta/2})$, by Lemma 19. The number of nonempty bins ever tested is at most dn, for some constant $d > 0$, by Lemma 20, with the suitably large probability. Applying the union bound results in the estimate $n^{-a}$ on the probability of error, for sufficiently large β.

The duration of an iteration of the inner for loop is either constant, in which case we call it short, or it takes time $O(\log size)$, in which case we call it long. First, we estimate the total time spent on short iterations. This time in the first iteration of the inner for loop is proportional to numberofbins returned by procedure EstimateSize, which is at most $6n\lg n$, by Lemma 19. Each of the subsequent iterations takes time proportional to size, which is at most 6n, again by Lemma 19. We obtain that the total time spent on short iterations is $O(n\log n)$ in the worst case. Next, we estimate the total time spent on long iterations. One such iteration takes time proportional to lg size, which is at most lg 6n with certainty. The number of such iterations is at most dn with probability $1 - n^{-\Omega(\log n)}$, for some constant $d > 0$, by Lemma 20. We obtain that the total time spent on long iterations is $O(n\log n)$, with the correspondingly large probability. Combining the estimates for short and long iterations, we obtain $O(n\log n)$ as a bound on the time of one iteration of the main repeat loop. One such iteration suffices with probability $1 - n^{-\Omega(\log n)}$, by Lemma 20.
Throwing one ball uses $O(\log n)$ random bits, by Lemma 19. The number of throws is $O(n)$ with the suitably large probability, by Lemma 20. □

Algorithm CommonBoundedMC is optimal with respect to the following performance metrics: the expected time $O(n\log n)$, by Theorem 1, the number of random bits
$O(n\log n)$, by Proposition 1, and the probability of error $n^{-\Omega(1)}$, by Proposition 3.

7.4 Common with Unbounded Memory

We consider naming on a Common PRAM in the case when the amount of shared memory is unbounded. The algorithm we propose, called CommonUnboundedMC, is similar to algorithm CommonBoundedMC in Section 7.3, in that it involves a randomized experiment to estimate the number of processors of the PRAM. Such an experiment is then followed by repeatedly throwing balls into bins, testing for collisions, and throwing again if a collision is detected, until eventually no collisions are detected.

Algorithm CommonUnboundedMC has its pseudocode given in Figure 7.7. The algorithm is structured as a repeat loop. An iteration starts by invoking procedure GaugeSizeMC, whose pseudocode is in Figure 7.6. This procedure returns size as an estimate of the number of processors n. Next, a processor chooses randomly a bin in the range [1, 3·size]. Then it keeps verifying for collisions δ lg size times, in such a manner that when a collision is detected then a new bin is selected from the same range. After such δ lg size verifications and possible new selections of bins, another δ lg size verifications follow, but without changing the selected bins. When no collision is detected in the second segment of δ lg size verifications, then this terminates the repeat loop, which is followed by assigning to each processor the rank of the selected bin, by a prefix-like computation. If a collision is detected in the second segment of δ lg size verifications, then this starts another iteration of the main repeat loop.

Procedure GaugeSizeMC returns an estimate of the number n of processors of the form $\lceil 2^{k+1}/\delta\rceil$, for some positive integer k. It operates by trying various values of k, and, for a considered k, by throwing n balls into $2^k$ bins and next counting how many bins contain balls. Such counting is performed by a prefix-like computation, whose pseudocode is omitted in Figure 7.6. The additional parameter δ > 0 is a number that affects the probability of underestimating n.

The way in which the selection of the numbers k is performed is controlled by a function r(k), which is a parameter. We will consider two instantiations of this function:
Procedure GaugeSizeMC k 1 repeat k r k bin v randomintegerin[1 ; 2 k ] until thenumberofselectedvaluesofvariable bin is 2 k = return d 2 k +1 = e Figure7.6: Apseudocodeforaprocessor v ofaCommonPRAM,wherethe numberofsharedmemorycellsisunbounded.Theconstant > 0isthesame parameterasinFigure7.7,andanincreasingfunction r k isalsoaparameter. tion r k = k +1andtheotherisfunction r k =2 k Lemma21 If r k = k +1 thenthevalueof size asreturnedby GaugeSizeMC satises size 2 n withcertaintyandtheinequality size n holdswithprobability 1 )]TJ/F18 11.9552 Tf 11.956 0 Td [( )]TJ/F24 7.9701 Tf 6.586 0 Td [(n= 3 If r k =2 k thenthevalueof size asreturnedby GaugeSizeMC satises size 2 n 2 withcertaintyand size n 2 = 2 withprobability 1 )]TJ/F18 11.9552 Tf 11.955 0 Td [( )]TJ/F24 7.9701 Tf 6.587 0 Td [(n= 3 Proof: Wemodelprocedure'sexecutionbyanexperimentofthrowing n ballsinto2 k bins. Iftheparameterfunction r k is r k = k +1thenweconsiderallpossibleconsecutive valuesof k startingfrom k =2,suchthat k = i +1inthe i thiterationoftherepeatloop. Ifparameter r k isfunction r k =2 k then k takesononlythepowersof2. Thereareatmost n binsoccupiedinanysuchanexperiment.Therefore,theprocedure returnsbythetimetheinequality2 k = n holdsand k isconsideredasdeterminingthe rangeofbins.Itfollowsthatif r k = k +1thenthereturnedvalue d 2 k +1 = e isatmost2 n If r k =2 k thentheworsterrorinestimatingoccurswhen2 i = = n )]TJ/F15 11.9552 Tf 11.987 0 Td [(1forsome i thatis apowerof2.Thenthereturnedvalueis2 2 i = = n )]TJ/F15 11.9552 Tf 11.643 0 Td [(1 2 = ,whichisatmost2 n 2 ,this occurringwithprobability1 )]TJ/F18 11.9552 Tf 11.955 0 Td [( )]TJ/F24 7.9701 Tf 6.587 0 Td [(n= 3 Given2 k bins,weestimatetheprobabilitythatthenumberofoccupiedbinsisatmost 85
$2^k/\delta$. It is
$$\binom{2^k}{2^k/\delta}\Bigl(\frac{2^k/\delta}{2^k}\Bigr)^{n} \le \Bigl(\frac{2^k e}{2^k/\delta}\Bigr)^{2^k/\delta}\Bigl(\frac{1}{\delta}\Bigr)^{n} = (e\delta)^{2^k/\delta}\,\delta^{-n}.$$
Next, we identify a range of values of k for which this probability is exponentially close to 0 with respect to n. To this end, let $0 < \gamma < 1$ and let us consider the inequality
$$(e\delta)^{2^k/\delta}\,\delta^{-n} < \gamma^{n}. \quad (7.15)$$
It is equivalent to the following one:
$$\frac{2^k}{\delta}(1 + \ln\delta) - n\ln\delta < n\ln\gamma.$$
Algorithm CommonUnboundedMC

  repeat
    size ← GaugeSizeMC
    bin_v ← random integer in [1, 3·size]
    for i ← 1 to δ lg size do
      if VerifyCollision(bin_v) then
        bin_v ← random number in [1, 3·size]
    CollisionDetected ← false
    for i ← 1 to δ lg size do
      if VerifyCollision(bin_v) then CollisionDetected ← true
  until not CollisionDetected
  name_v ← the rank of bin_v among the selected bins

Figure 7.7: A pseudocode for a processor v of a Common PRAM, where the number of shared memory cells is unbounded. The constant δ > 0 is a parameter impacting the probability of error. The private variable name stores the acquired name.

We discuss the performance of algorithm CommonUnboundedMC (see the pseudocode in Figure 7.7) by referring to the analysis of a related algorithm CommonUnboundedLV given in Section 6.4. We consider a process with verifications, which is defined as follows. The process proceeds through stages. The first stage starts with placing n balls into 3·size bins. In any of the subsequent stages, for each multiple bin and for each ball in such a bin, we perform a Bernoulli trial with the probability 1/2 of success, which represents the outcome of procedure VerifyCollision. A success in a trial is referred to as a positive verification, otherwise it is a negative one. If at least one positive verification occurs for a ball in a multiple bin, then all the balls in this bin are relocated in this stage to bins selected uniformly at random and independently for each such ball; otherwise the balls stay put in this bin until the next stage. The process terminates when all balls are singleton.

Lemma 22. For any number $a > 0$ there exists $\delta > 0$ such that the process with verifications
terminates within δ lg n stages, with all of them comprising a total of $O(n)$ ball throws, with probability at least $1 - n^{-a}$.

Proof: We use the respective Lemma 11 in Section 6.4. The constant 3 determining our process with verifications corresponds to $1+\varepsilon$ in Section 6.4. The corresponding process with verifications considered in Section 6.4 is defined by referring to known n. We use the approximation size instead, which is at least as large as n with probability $1 - \delta^{-n/3}$, by Lemma 21 just proved. By Section 6.4, our process with verifications does not terminate within δ lg n stages when $size \ge n$ with probability at most $n^{-2a}$, and the inequality $size \ge n$ does not hold with probability at most $\delta^{-n/3}$. Therefore the conclusion we want to prove does not hold with probability at most $n^{-2a} + \delta^{-n/3}$, which is at most $n^{-a}$ for sufficiently large n. □

The following theorem summarizes the performance of algorithm CommonUnboundedMC (see the pseudocode in Figure 7.7) as a Monte Carlo one. Its proof relies on mapping an execution of the process with verifications on executions of algorithm CommonUnboundedMC in a natural manner.
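The process with verifications lends itself to direct simulation. The sketch below is our own sequential rendering under stated assumptions: size is set to n exactly, fair coin flips stand in for VerifyCollision, and one success relocates the whole bin, as in the definition above:

```python
import random

def process_with_verifications(n, rng):
    # n balls land in 3*size bins; in each stage every ball of a multiple bin
    # runs a fair Bernoulli trial, and any success relocates the whole bin.
    size = n
    m = 3 * size
    bins = {}
    for ball in range(n):
        bins.setdefault(rng.randrange(m), []).append(ball)
    stages = throws = 0
    while any(len(v) > 1 for v in bins.values()):
        stages += 1
        multiple = [(s, v) for s, v in bins.items() if len(v) > 1]
        for slot, balls in multiple:
            if any(rng.random() < 0.5 for _ in balls):  # positive verification
                del bins[slot]
                for ball in balls:
                    bins.setdefault(rng.randrange(m), []).append(ball)
                    throws += 1
    return stages, throws

stages, throws = process_with_verifications(300, random.Random(2))
```

Because a multiple bin with c balls survives a stage only with probability $2^{-c}$, collisions are cleared after a handful of stages, and relocations total a small fraction of n.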
Theorem 11. Algorithm CommonUnboundedMC terminates almost surely, for sufficiently large δ. For each $a > 0$ there exist $\delta > 0$ and $c > 0$ such that the algorithm assigns unique names and has the following additional properties with probability $1 - n^{-a}$. If $r(k) = k+1$ then at most cn memory cells are ever needed, $cn\ln^2 n$ random bits are ever generated, and the algorithm terminates in time $O(\log^2 n)$. If $r(k) = 2k$ then at most $cn^2$ memory cells are ever needed, $cn\ln n$ random bits are ever generated, and the algorithm terminates in time $O(\log n)$.

Proof: For a given $a > 0$, let us take δ that exists by Lemma 22. When the process with verifications terminates, then this models assigning unique names by the algorithm. It follows that one iteration of the repeat loop results in the algorithm terminating with proper names assigned with probability $1 - n^{-a}$. One iteration of the main repeat loop does not result in
termination with probability at most $n^{-a}$, so i iterations are not sufficient to terminate with probability at most $n^{-ia}$. This converges to 0 with increasing i, so the algorithm terminates almost surely.

The performance metrics rely mostly on Lemma 21. We consider two cases, depending on which function r(k) is used.

If $r(k) = k+1$ then procedure GaugeSizeMC considers all the consecutive values of k, up to $O(\log n)$ of them, and for each such k, throwing a ball requires k random bits. We obtain that procedure GaugeSizeMC uses $O(n\log^2 n)$ random bits. Similarly, computing the number of selected values in an iteration of the main repeat loop of this procedure takes time O(k), for the corresponding k, so this procedure takes $O(\log^2 n)$ time. The value of size satisfies $size \le 2n$ with certainty. Therefore, $O(n)$ memory registers are ever needed and one throw of a ball uses $O(\log n)$ random bits, after size has been computed. It follows that one iteration of the main repeat loop of the algorithm, after procedure GaugeSizeMC has been completed, uses $O(n\log n)$ random bits, by Lemmas 21 and 22, and takes $O(\log n)$ time. Since one iteration of the main repeat loop suffices with probability $1 - n^{-a}$, the overall time is dominated by the time performance of procedure GaugeSizeMC.

If $r(k) = 2k$ then procedure GaugeSizeMC considers only the powers of 2 as values of k, up to $O(\log n)$, and for each such k, throwing a ball requires k random bits. Since the values k form a geometric progression, procedure GaugeSizeMC uses $O(\log n)$ random bits per processor. Similarly, computing the number of selected values in an iteration of the main repeat loop of this procedure takes time O(k), for the corresponding k that increase geometrically, so this procedure takes $O(\log n)$ time. The value of size satisfies $size = O(n^2)$ with certainty. By Lemma 21, $O(n^2)$ memory registers are ever needed, so one throw of a ball uses $O(\log n)$ random bits. One iteration of the main repeat loop, after procedure GaugeSizeMC has been completed, uses $O(n\log n)$ random bits, by Lemmas 21 and 22, and takes $O(\log n)$ time. □
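The two instantiations of r(k) can be compared in a quick sequential simulation of GaugeSizeMC. In the sketch below (our code; the parameter value δ = 4, the seed, and the value of n are arbitrary choices), $r(k) = k+1$ sweeps the values of k linearly, while $r(k) = 2k$ jumps through powers of 2, finishing in fewer iterations at the price of overshooting the estimate:

```python
import math
import random

def gauge_size(n, delta, r, rng):
    # Simulation of GaugeSizeMC: throw n balls into 2^k bins until the
    # number of distinct selected bins is at most 2^k / delta.
    k = 1
    while True:
        k = r(k)
        distinct = len({rng.randrange(2 ** k) for _ in range(n)})
        if distinct <= 2 ** k / delta:
            return math.ceil(2 ** (k + 1) / delta)

rng = random.Random(5)
n = 200
est_linear = gauge_size(n, 4, lambda k: k + 1, rng)
est_doubling = gauge_size(n, 4, lambda k: 2 * k, rng)
```

Both estimates dominate n with overwhelming probability; the doubling variant typically returns a quadratically larger value, mirroring the $O(n)$ versus $O(n^2)$ memory bounds of Theorem 11.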
The instantiations of algorithm CommonUnboundedMC are close to optimality with
respect to some of the performance metrics we consider, depending on whether $r(k) = k+1$ or $r(k) = 2k$. If $r(k) = k+1$ then the algorithm's use of shared memory would be optimal if its time were $O(\log n)$, by Theorem 2, but it misses space optimality by at most a logarithmic factor, since the algorithm's time is $O(\log^2 n)$. Similarly, for this case of $r(k) = k+1$, the number of random bits ever generated, $O(n\log^2 n)$, misses optimality by at most a logarithmic factor, by Proposition 1. In the other case of $r(k) = 2k$, the expected time $O(\log n)$ is optimal, by Theorem 3, the expected number of random bits $O(n\log n)$ is optimal, by Proposition 1, and the probability of error $n^{-\Omega(1)}$ is optimal, by Proposition 3, but the amount of used shared memory misses optimality by at most a polynomial factor, by Theorem 3.

7.5 Conclusion

We considered four variants of the naming problem for an anonymous PRAM when the number of processors n is unknown, and developed Monte Carlo naming algorithms for each of them. The two algorithms for a bounded number of shared registers are provably optimal with respect to the following three performance metrics: expected time, expected number of generated random bits, and probability of error.
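As a concrete companion to the algorithms of this chapter, the size-approximation experiment of procedure EstimateSize from Section 7.3 can be replayed sequentially. The sketch below is our own centralized stand-in for the PRAM procedure (the function name, seed, and value of n are arbitrary); it only mirrors the stopping rule of the pseudocode in Figure 7.3:

```python
import random

def estimate_size(n, rng):
    # Centralized simulation of EstimateSize: in round k, n processors throw
    # balls into k*2^k bins; stop as soon as at most 2^k bins are nonempty.
    k = 2
    while True:
        k += 1
        bins = [rng.randrange(1, k * 2 ** k + 1) for _ in range(n)]  # bin_v
        nonempty = len(set(bins))                                    # NonemptyBins
        if nonempty <= 2 ** k:
            return 3 * 2 ** k, k * 2 ** k  # (size, numberofbins)

rng = random.Random(7)
n = 1000
size, numberofbins = estimate_size(n, rng)
```

Since at most n bins can ever be nonempty, the loop is certain to stop no later than at the first k with $2^k \ge n$, which caps size at $3\cdot 2^{\lceil\lg n\rceil}$; size ≥ n then holds with probability exponentially close to 1, as in the analysis of Section 7.3.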
8. NAMING A CHANNEL WITH BEEPS

In this section, we consider an anonymous channel with beeping. We present how names can be assigned to the anonymous stations by a Las Vegas and by a Monte Carlo naming algorithm.

8.1 A Las Vegas Algorithm

We give a Las Vegas naming algorithm for the case when n is known. The idea is to have stations choose rounds to beep from a segment of integers. As a convenient probabilistic interpretation, these integers are interpreted as bins, and after selecting a bin a ball is placed in the bin. The algorithm proceeds by considering all the consecutive bins. First, a bin is verified to be nonempty by making the owners of the balls in the bin beep. When no beep is heard then the next bin is considered; otherwise the nonempty bin is verified for collisions. Such a verification is performed by $O(\log n)$ consecutive calls of procedure DetectCollision. When a collision is not detected, then the stations that placed their balls in this bin assign themselves the next available name; otherwise the stations whose balls are in this bin place their balls in a new set of bins. When each station has a name assigned, we verify if the maximum assigned name is n. If this is the case then the algorithm terminates, otherwise we repeat. The algorithm is called BeepNamingLV; its pseudocode is in Figure 8.1.

Algorithm BeepNamingLV is analyzed by modeling its executions by a process of throwing balls into bins, which we call the ball process. The process proceeds through stages. There are n balls in the first stage. When a stage begins and there are some i balls eligible for the stage, then the number of used bins is α i lg n. Each ball is thrown into a randomly selected bin. Next, the balls that are singleton in their bins are removed, and the remaining balls, those that participated in collisions, advance to the next stage. The process terminates when no eligible balls remain.

Lemma 23. The number of times a ball is thrown into a bin during an execution of the ball process that starts with n balls is at most 3n with probability at least $1 - e^{-n/4}$.

Proof: In each stage, we throw some k balls into at least αk lg n bins. The probability that
Algorithm BeepNamingLV

  repeat
    counter ← 0 ; left ← 1 ; right ← α n lg n ; name_v ← null
    repeat
      slot_v ← random number in the interval [left, right]
      for i ← left to right do
        if i = slot_v then beep
        if a beep was just heard then
          collision ← false
          for j ← 1 to α lg n do
            if DetectCollision then collision ← true
          if not collision then
            counter ← counter + 1
            if i = slot_v then name_v ← counter
      if name_v = null then beep
      if a beep was just heard then
        left ← counter ; right ← α (n − counter) lg n
    until no beep was heard in the previous round
  until counter = n

Figure 8.1: A pseudocode for a station v. The number of stations n is known. Constant α > 1 is a parameter determined in the analysis. Procedure DetectCollision has its pseudocode in Figure 4.2. The variable name is to store the assigned identifier.

a given ball ends up singleton in a bin is at least
$$1 - \frac{k}{\alpha k\lg n} = 1 - \frac{1}{\alpha\lg n},$$
which we denote as p. A ball is thrown repeatedly in consecutive iterations until it lands single in a bin. Our immediate concern is the number of trials needed to have all balls as singletons in their bins.

Suppose that we perform some m independent Bernoulli trials, each with probability p of success, and let X be the number of successes. We show next that m = 3n suffices with large probability to have the inequality $X \ge n$.
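The ball process behind BeepNamingLV can be simulated to watch the 3n bound on the total number of throws from Lemma 23. The sketch below is our own sequential stand-in for the distributed process, with α = 2 fixed arbitrarily:

```python
import math
import random

def beep_ball_process(n, alpha, rng):
    # A stage with i eligible balls uses ceil(alpha * i * lg n) bins; singleton
    # balls are removed and the colliding balls advance to the next stage.
    lg = math.log2(n)
    balls, throws = n, 0
    while balls > 0:
        m = math.ceil(alpha * balls * lg)
        throws += balls
        counts = {}
        for _ in range(balls):
            b = rng.randrange(m)
            counts[b] = counts.get(b, 0) + 1
        balls = sum(c for c in counts.values() if c > 1)
    return throws

throws = beep_ball_process(400, 2, rng=random.Random(9))
```

Because each stage keeps the bins ahead of the balls by the factor α lg n, only about a 1/(α lg n) fraction of balls survives a stage, so the total throw count stays well inside 3n.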
The expected number of successes is $E[X] = \mu = pm$. We use the Chernoff bound in the form
$$\Pr\bigl(X < (1-\varepsilon)\mu\bigr) \le \exp(-\varepsilon^2\mu/2),$$
for any $0 < \varepsilon < 1$. With m = 3n and $p \ge 1 - 1/(\alpha\lg n)$, we have $\mu \ge 2n$ for sufficiently large n, so taking $\varepsilon = 1/2$ gives
$$\Pr(X < n) \le \Pr(X < \mu/2) \le \exp(-\mu/8) \le e^{-n/4}.$$
□

Theorem 12. Algorithm BeepNamingLV, for any α > 1, terminates almost surely and there is no error when it terminates. For each $a > 0$, there exist α > 1 and $c > 0$ such that the algorithm assigns unique names, works in time at most $cn\lg n$, and uses at most $cn\lg n$ random bits, all these properties holding with probability at least $1 - n^{-a}$, for sufficiently large n.

Proof: Consider an iteration of the main repeat loop. An error can occur in this iteration only when there is a collision that is not detected by procedure DetectCollision in any of its α lg n calls. Such an error results in duplicate names, so that the number of assigned different names is smaller than n. The maximum name assigned in an iteration is the value of the variable counter, which has the same value at each station. The algorithm terminates
by having an iteration that produces counter = n, but then there are no repetitions among the names, and so there is no error.

Next we show that termination is a sure event. Consider an iteration of the main repeat loop. There are n balls, and each of them keeps being thrown until either it is not involved in a collision, or there is a collision but it is not detected. Eventually each ball is left to reside in its bin with probability 1. This means that each iteration ends almost surely.

We introduce notation for two events in an iteration of the main repeat loop. Let A be the event that there is a collision that passes undetected. The iteration fails to assign proper names if and only if event A holds. Let B be the event that the total number of throws of balls into bins is at most 3n. We denote by ¬E the complement of an event E. We have that $\Pr(\neg B) \le e^{-n/4}$, by Lemma 23.

When a ball lands in a bin, then it is verified for a collision α lg n times. If there is a collision, then it passes undetected with probability at most $n^{-\alpha}$. This is because one call of procedure DetectCollision detects a collision with probability at least 1/2, by Lemma 2, in which m = 1 and $k \ge 2$.

We estimate the probability of the event that an iteration fails to assign proper names, which is the same as of event A. This is accomplished as follows:
$$\Pr(A) = \Pr(A\cap B) + \Pr(A\cap\neg B)
= \Pr(A\mid B)\Pr(B) + \Pr(A\mid\neg B)\Pr(\neg B)
\le \Pr(A\mid B) + \Pr(\neg B)
\le 3n\cdot n^{-\alpha} + e^{-n/4}, \quad (8.3)$$
where we used the union bound to obtain the last line (8.3). It follows that at least i iterations are needed with probability at most $\bigl(e^{-n/4} + 3n^{1-\alpha}\bigr)^i$, which converges to 0 as i grows unbounded, assuming only that α > 1 and n is sufficiently large.

Let us consider the event $\neg A\cap B$, which occurs when balls are thrown at most 3n times and all collisions are detected, when modeling an iteration of the main repeat loop. The
probability that the event $\neg A\cap B$ holds can be estimated from below as follows:
$$\Pr(\neg A\cap B) = \Pr(\neg A\mid B)\Pr(B) \ge \bigl(1 - 3n^{1-\alpha}\bigr)\bigl(1 - e^{-n/4}\bigr) \ge 1 - e^{-n/4} - 3n^{1-\alpha}. \quad (8.4)$$
This bound (8.4) is at least $1 - n^{-a}$ for sufficiently large α > 1, when also n is large enough. Bound (8.4) holds for the first iteration of the main repeat loop. So with probability at least $1 - n^{-a}$ the first iteration assigns proper names with at most 3n balls thrown in total. Let us assume that this event occurs. Then the whole execution takes time at most $cn\lg n$, for a suitably large $c > 0$. This is because procedure DetectCollision is executed at most $3\alpha n\lg n$ times, and each of its calls takes two rounds. One assignment of a value to variable slot requires $\lg(\alpha n\lg n) < 2\lg n$ bits, for sufficiently large n. There are at most 3n such assignments, for a total of at most $cn\lg n$ random bits, for a suitably large $c > 0$. □

Algorithm BeepNamingLV runs in the optimal expected time $O(n\log n)$, by Proposition 6, and it uses the optimum expected number of random bits $O(n\log n)$, by Proposition 5, these propositions given in Section 5.3.

8.2 A Monte Carlo Algorithm

We give a randomized naming algorithm for the case when n is unknown. In view of Proposition 8, no Las Vegas algorithm exists in this case, so we develop a Monte Carlo one.
The algorithm again can be interpreted as repeatedly throwing balls into bins and verifying for collisions. A bin is determined by a string of some k bits. Each station chooses one such string randomly. The algorithm proceeds to repeatedly identify the lexicographically smallest string among those not considered yet. This is accomplished by procedure NextString, which operates as a search implemented by using beeps. Having identified a nonempty bin, all the stations that placed their balls into this bin verify if there is a collision in this bin by calling DetectCollision a suitably large number of times. In case no collision has been detected, the stations whose balls are in the bin assign themselves the
consecutive available name as a temporary one. This continues until all the balls have been considered. If no collision has ever been detected in the current stage, then the algorithm terminates and the temporary names are considered as the final assigned names; otherwise the algorithm proceeds to the next stage.

Next, we specify procedure NextString. It operates as a radix search to identify the smallest string of bits by considering consecutive bit positions. It uses two variables, mystring and k, where k is the length of the bit strings considered and $mystring_v$ is the string of k bits generated by station v. The procedure begins by setting to 1 all bit positions in variable string, which has k such bit positions. Then the consecutive bit positions $i = 1, 2, \ldots, k$ are considered one by one. For a given bit position i, all the stations v that still can possibly have the smallest string and whose bit on position i in $mystring_v$ is 0 do beep. This determines the first i bits of the smallest string, because if a beep is heard then the i-th bit of the smallest string is 0, and otherwise it is 1. This is recorded by setting the i-th bit position in the variable string to the determined bit. The stations eligible for beeping, if their i-th bit is 0, are those whose strings agree on the first $i-1$ positions with the smallest string. After all k bit positions have been considered, the variable string is returned.

Procedure NextString has its pseudocode in Figure 8.2. Its relevant property is summarized as the following lemma.

Lemma 24. Procedure NextString returns the lexicographically smallest string among the non-null string values of the private copies of the variable mystring.

Proof: The string that is output is obtained by processing all the input strings mystring through consecutive bit positions. We show the invariant that after i bits have been considered, for $0 \le i \le k$, the bits on these positions make the prefix of the first i bits of the smallest string.

The invariant is shown by induction on i. When i = 1, then the bits on previously considered positions make an empty string, as no positions have been considered yet, and the empty string is a prefix of the smallest string. Suppose that the invariant holds for all i
Procedure NextString

    string ← a string of k bit positions, with all of them set to 1
    for i ← 1 to k do
        if mystring_v matches string on the first i - 1 bit positions
            and the i-th bit of mystring_v is 0 then beep
        if a beep was heard in the previous round then
            set the i-th bit of string to 0
    return string

Figure 8.2: A pseudocode for a station v. This procedure is used by algorithm BeepNamingMC. The variables mystring and k are the same as those in the pseudocode in Figure 8.3.

such that $0 \le i < k$. Consider position i + 1. By the invariant, the stations eligible to beep on this position are exactly those whose strings agree with the smallest string on the first i positions. If any of them has 0 on position i + 1, then a beep is heard and this bit of string is set to 0; otherwise all these strings, in particular the smallest one, have 1 on this position. In either case the first i + 1 bits of string make the prefix of the smallest string, which extends the invariant. Applying the invariant with i = k gives the claim.
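The radix search of Figure 8.2 can be simulated centrally, with a single predicate standing in for the beeps heard on the channel. The following is a hypothetical Python sketch of the procedure's logic, not code from the dissertation; the function name and the list-based representation of the stations' private strings are assumptions made for illustration.

```python
def next_string(strings):
    # Centralized simulation of procedure NextString: `strings` holds the
    # private k-bit values mystring_v of the stations, with None standing
    # for the null string of a station that has already acquired a name.
    # The any(...) predicate plays the role of the beep heard by everyone.
    k = len(next(s for s in strings if s is not None))
    string = [1] * k  # a string of k bit positions, all set to 1
    for i in range(k):
        # A station beeps iff it matches `string` on the first i positions
        # and has bit 0 on the position considered in this round.
        beep = any(
            s is not None and list(s[:i]) == string[:i] and s[i] == 0
            for s in strings
        )
        if beep:
            string[i] = 0
    return tuple(string)

# Lemma 24: the result is the lexicographically smallest non-null string.
balls = [(1, 0, 1), (0, 1, 1), None, (0, 1, 0)]
assert next_string(balls) == min(s for s in balls if s is not None)
```

The loop performs exactly k rounds, matching the round-per-bit cost of the beep-based search used in the time analysis.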
Algorithm BeepNamingMC

    k ← 1
    repeat
        k ← 2k
        collision ← false
        counter ← 0
        mystring_v ← a random string of k bits
        repeat
            if mystring_v ≠ null then smallest-string ← NextString
            if mystring_v = smallest-string then
                for i ← 1 to γk do
                    if DetectCollision then collision ← true
                if not collision then
                    counter ← counter + 1
                    name_v ← counter
                    mystring_v ← null
            if mystring_v ≠ null then beep
        until no beep was heard in the previous round
    until not collision

Figure 8.3: A pseudocode for a station v. Constant γ > 0 is an integer parameter determined in the analysis. Procedure DetectCollision has its pseudocode in Figure 4.2 and procedure NextString has its pseudocode in Figure 8.2. The variable name is to store the assigned identifier.

The pseudocode of algorithm BeepNamingMC is in Figure 8.3. Each nonempty bin is identified by calling procedure NextString. Next, this bin is verified for collisions by calling procedure DetectCollision γk times, for a constant γ > 0, which is a parameter to be settled in the analysis. During such a verification, only the stations whose balls are in this bin participate.

The next theorem summarizes the good properties of algorithm BeepNamingMC, in particular, that it is a Monte Carlo algorithm with a suitably small probability of error.

Theorem 13. Algorithm BeepNamingMC, for any γ > 0, terminates almost surely. For each a > 0, there exist γ > 0 and c > 0 such that the algorithm assigns unique names, works in time at most $c n \lg n$, and uses at most $c n \lg n$ random bits, all these properties holding with probability at least $1 - n^{-a}$.

Proof: We interpret an iteration of the outer repeat loop as a stage in a process of throwing
n balls into $2^k$ bins and verifying $\gamma k$ times for collisions. The string selected by a station is the name of the bin. When at least one collision is detected, then k gets doubled and another iteration is performed. An error occurs when there is a collision but it is not detected.

Next we estimate from above the probability of not detecting a collision. To this end, we consider two cases, depending on which of the inequalities $2^k \le n/2$ or $2^k > n/2$ holds. If $2^k \le n/2$, then $n - 2^k \ge n/2$, so that the probability of not detecting any collision is at most $2^{(-n+2^k)\gamma k} \le 2^{-\gamma n}$. If $2^k > n/2$, then $n - 2^k \ge 1$ may be used, along with $k \ge \lg n - 1$, so that we obtain the following estimate:
$$2^{(-n+2^k)\gamma k} \le 2^{-\gamma (\lg n - 1)} \le 2^{\gamma} n^{-\gamma} .$$
We obtained the following two estimates: $2^{-\gamma n}$ and $2^{\gamma} n^{-\gamma}$, of which the latter is larger for sufficiently large n. It is sufficient to take $\gamma > a$, as then the inequality $2^{\gamma} n^{-\gamma} < n^{-a}$ holds for sufficiently large n.

Consider a stage in which $k = d \lg n$, for a number $d > 2$. Then the number of bins is $2^k = n^d$. The probability that there is no collision at all in this stage is at least
$$\left(1 - \frac{n}{n^d}\right)^{n} \ge 1 - \frac{n}{n^{d-1}} = 1 - n^{-d+2} . \qquad (8.5)$$
Choosing $d = a + 2$ we obtain that the algorithm terminates by the iteration of the outer repeat loop with $k = d \lg n$ with probability at least $1 - n^{-a}$. The time of one iteration of the outer repeat loop, for some k, is proportional to $kn$. The total time spent up to and including $k = d \lg n$ is proportional to
$$\sum_{i=1}^{\lg ((a+2) \lg n)} 2^{i} n \le n \cdot 2 (a+2) \lg n = O(n \log n) \qquad (8.6)$$
with probability at least $1 - n^{-a}$.

The number of bits generated up to and including the iteration for $k = d \lg n$ is also proportional to (8.6). This is because the number of bits generated in one iteration of the main repeat loop is proportional to $kn$, similarly as the running time.

To show that the algorithm terminates almost surely, it is sufficient to demonstrate that the probability of a collision converges to zero as k increases. The probability of a
collision for $k = d \lg n$ is at most $n^{-d+2}$, by (8.5). If k grows to infinity, then $d = k / \lg n$ increases to infinity as well, and then $n^{-d+2}$ converges to 0 as a function of d.

Algorithm BeepNamingMC is optimal with respect to the following performance measures: the expected running time $O(n \log n)$, by Proposition 6; the expected number of used random bits $O(n \log n)$, by Proposition 5; and the probability of error, as determined by the number of used bits, by Proposition 7.

8.3 Conclusion

We considered a channel in which synchronized beeping is the only means of communication. We showed that names can be assigned to the anonymous stations by randomized algorithms. The algorithms are either Las Vegas or Monte Carlo, depending on whether the number of stations n is known or not, respectively. The performance characteristics of the two algorithms, such as the running time, the number of random bits, and the probability of error, are proved to be optimal.
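The stage process analyzed in the proof of Theorem 13 — throwing n balls into $2^k$ bins, verifying every occupied bin γk times, and doubling k after a detected collision — can be illustrated by a small centralized simulation. This is a hypothetical Python sketch, not the dissertation's code: the beeping channel is replaced by a global view of the bins, the bins are examined all at once rather than one by one via NextString, and detect_collision only mimics a coin-flipping verification in the spirit of procedure DetectCollision (whose pseudocode, in Figure 4.2, is not part of this chapter).

```python
import random

def detect_collision(bin_size, rounds):
    # Assumed coin-flipping verification: in each of `rounds` calls every
    # station of the bin picks a random bit, beeping in one round for 0 and
    # in the other for 1; a collision goes unnoticed in a call only if all
    # the stations happen to pick the same bit.
    if bin_size < 2:
        return False  # a single station never signals a collision
    for _ in range(rounds):
        coins = [random.randrange(2) for _ in range(bin_size)]
        if len(set(coins)) > 1:  # beeps heard in both rounds
            return True
    return False

def beep_naming_mc(n, gamma=3):
    # Stage process behind algorithm BeepNamingMC: throw n balls into 2^k
    # bins, verify each occupied bin gamma*k times, and either assign
    # consecutive names (bins taken in lexicographic order, as NextString
    # would produce them) or double k and start a new stage.
    k = 1
    while True:
        k *= 2
        bins = {}
        for v in range(n):
            ball = tuple(random.randrange(2) for _ in range(k))
            bins.setdefault(ball, []).append(v)
        if not any(detect_collision(len(b), gamma * k) for b in bins.values()):
            names = {}
            for counter, ball in enumerate(sorted(bins), start=1):
                for v in bins[ball]:
                    names[v] = counter  # shared only if a collision escaped
            return names

names = beep_naming_mc(20)
# With high probability all 20 names are distinct values in [1, 20].
```

Raising gamma buys extra verification rounds in exchange for a smaller probability that a collision escapes detection, mirroring the trade-off between γ and the error bound $n^{-a}$ in Theorem 13.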
9. OPEN PROBLEMS AND FUTURE WORK

Here we give some of the open problems and future work. The algorithms cover the "boundary" cases for the anonymous synchronous PRAM. One case is about a minimum amount of shared memory, that is, when only a constant number of shared memory cells are available. The other case is about a minimum expected running time, that is, when the expected running time is $O(\log n)$; such performance requires a number of shared registers that grows unbounded with n. It would be interesting to have these results generalized by investigating naming on a PRAM when the number of processors and the number of shared registers are independent parameters of the model.

It is an open problem to develop Monte Carlo algorithms for Arbitrary and Common PRAMs for the case when the amount of shared memory is unbounded, such that they are simultaneously asymptotically optimal with respect to these same three performance metrics: expected time, expected number of generated random bits, and probability of error.

The algorithms we developed for beeping channels rely in an essential manner on synchronization of the channel. It would be interesting to consider an anonymous asynchronous beeping channel and investigate how to assign names to stations in such a communication environment.