﻿1
Energy Scaling Laws for Distributed Inference
in Random Fusion Networks
Animashree Anandkumar, Student Member, IEEE, Joseph E. Yukich,
Lang Tong † , Fellow, IEEE, and Ananthram Swami, Fellow, IEEE
Abstract—The energy scaling laws of multihop data fusion
networks for distributed inference are considered. The fusion
network consists of randomly located sensors distributed i.i.d.
according to a general spatial distribution in an expanding
region. Among the class of data fusion schemes that enable
optimal inference at the fusion center for Markov random field
(MRF) hypotheses, the scheme with minimum average energy
consumption is bounded below by average energy of fusion along
the minimum spanning tree, and above by a suboptimal scheme,
referred to as Data Fusion for Markov Random Fields (DFMRF).
Scaling laws are derived for the optimal and suboptimal fusion
policies. It is shown that the average asymptotic energy of the
DFMRF scheme is finite for a class of MRF models.
Index Terms—Distributed detection, graphical models, random
graphs, stochastic geometry and data fusion.
I. INTRODUCTION
WE consider the problem of distributed statistical inference
in a network of randomly located sensors, each
taking a measurement and transporting the locally processed
data to a designated fusion center. The fusion center then
makes an inference about the underlying phenomenon based
on the data collected from all the sensors.
For statistical inference using wireless sensor networks,
energy consumption is an important design parameter. The
transmission power required to reach a receiver distance d
away with a certain signal-to-noise ratio (SNR) scales in
the order of dν , where 2 ≤ ν ≤ 6 is the path loss [3].
Therefore, the cost of moving data from sensor locations to the
fusion center, either through direct transmissions or multihop
forwarding, significantly affects the lifetime of the network.
A. Scalable data fusion
We investigate the cost of data fusion for inference, and
its scaling behavior with the size of the network and the
† Corresponding author.
A. Anandkumar and L. Tong are with the School of Electrical and
Computer Engineering, Cornell University, Ithaca, NY 14853, USA. Email:
{aa332@,ltong@ece.}cornell.edu
J.E. Yukich is with the Department of Mathematics, Lehigh University, Bethlehem,
Pa. 18015. E-mail: joseph.yukich@lehigh.edu.
A. Swami is with the Army Research Laboratory, Adelphi, MD 20783 USA E-mail:
a.swami@ieee.org.
This work was supported in part through collaborative participation in Communications
and Networks Consortium sponsored by the U. S. Army Research Laboratory under
the Collaborative Technology Alliance Program, Cooperative Agreement DAAD19-01-2-
0011 and by the Army Research Office under Grant ARO-W911NF-06-1-0346. The first
author is supported by the IBM Ph.D Fellowship for the year 2008-09 and is currently
a visiting student at MIT, Cambridge, MA 02139. The second author was partially
supported by NSA grant H98230-06-1-0052 and NSF grant DMS-0805570. The U. S.
Government is authorized to reproduce and distribute reprints for Government purposes
notwithstanding any copyright notation thereon.
Parts of this paper are presented at [1], [2]
area of deployment. In particular, for a network of n random
sensors located at points Vn = {V1, · · · ,Vn} in R 2 , a fusion
policy πn maps Vn to a set of scheduled transmissions and
computations. The average cost (e.g., energy) of a policy is
given by
Ē(πn(Vn)):= 1
n
∑
Ei(πn(Vn)), (1)
i∈Vn
where Ei(πn(Vn)) is the cost at node i under policy πn. The
above average cost is random, and we are interested in its
scalability in random networks as n → ∞.
Definition 1 (Scalable Policy): A sequence of policies
π:=(πn)n≥1 is scalable on average if
lim
n→∞ E(Ē(πn(Vn))) = Ē∞(π) < ∞
where Ē∞(π) is referred to as the scaling constant. A sequence
of policies πn is weakly scalable if
p lim Ē(π(Vn))) =
n→∞
Ē∞(π) < ∞,
where p lim denotes convergence in probability. It is strongly
scalable if the above average energy converges almost surely
and is L 2 (mean squared) scalable if the convergence is in
mean square.
We focus mostly on the L 2 scalability of the fusion policies,
which implies weak and average scalability. We are interested
in scalable data fusion policies that enable optimal statistical
inference at the fusion center implying finite average energy
expenditure as the network size increases.
To motivate this study, first consider two simple fusion
policies: the direct transmission policy (DT) in which all
sensors transmit directly to the fusion center (single hop), and
the shortest path (SP) policy, where each node forwards its
data to the fusion center using the shortest path route without
any data combination at the intermediate nodes.
We assume, for now, that n sensor nodes are uniformly
distributed in a square having area n. It is perhaps not surprising
that neither of the two policies is scalable as n → ∞.
For the DT policy, intuitively, the average transmission from
the sensors to the fusion center range scales as √ n, thus
Ē(DT(Vn)) scales as n ν
2 . On the other hand, we expect the
SP policy to have better scaling since it chooses the best multihop
path to forward data from each node to the fusion center.
However, even in this case, there is no finite scaling. Here, the
average number of hops scales in the order of √ n, and thus,
Ē(SP(Vn)) scales in the order of √ n. Rigorously establishing2
the scaling laws for these two non-scalable policies is not
crucial at this point since the same scaling laws can be easily
established for regular networks when sensor nodes are on
two-dimensional lattice points. See [4].
Are there scalable policies for data fusion? Among all
the fusion policies not performing data combination at the
intermediate nodes, the shortest path (SP) policy minimizes
the total energy. Thus, no scalable policy exists unless nodes
cooperatively combine their information, a process known
as data aggregation. Data aggregation, however, must be
considered in conjunction with the performance requirements
of specific applications. In this paper, we assume that optimal
inference is made at the fusion center, and this places a
constraint on data aggregation. It rules out sub-sampling of
the sensor field, which is dealt in [5].
B. Summary of results and contributions
In this paper, we allow data aggregation at intermediate
nodes, but require that the fusion center achieves the same
inference performance as if all raw observations were collected
without any data combination. We assume that the underlying
hypotheses can be modeled as Markov random fields (MRF)
and investigate the energy scaling laws.
Given sensor locations Vn and possibly correlated sensor
measurements, finding the minimum energy fusion policy
under the constraint of optimal inference is, in general, NPhard
[6], and hence, intractable. We will establish upper and
lower bounds on the fusion energy of this optimal scheme
and analyze their scaling behavior. The lower bound is obtained
by a scheme conducting fusion along the Euclidean
minimum spanning tree (MST), which is shown to be optimal
when the observations are statistically independent under
both hypotheses. The upper bound on the optimal fusion
scheme is established through a specific suboptimal fusion
scheme, referred to as Data Fusion over Markov Random
Fields (DFMRF). DFMRF becomes optimal for independent
observations where it reduces to fusion along the MST. For
certain spatial dependencies among sensor measurements of
practical significance, such as the Euclidean 1-nearest neighbor
graph, DFMRF has an approximation ratio 2, i.e., it costs
no more than twice the cost of the optimal fusion scheme,
independent of the size and configuration of the network.
We then proceed to establish a number of asymptotic
properties of the DFMRF scheme in Section IV, including
the scalability of DFMRF, its performance bounds, and the
approximation ratio with respect to the optimal fusion policy
when the sensor measurements have dependencies described
by a k-nearest neighbor graph or a disk graph (continuum
percolation). Applying techniques developed in [7]–[9], we
provide a precise characterization of the scaling bounds as a
function of sensor density and sensor placement distribution.
These asymptotic bounds for DFMRF, in turn, imply that
the optimal fusion scheme is also scalable. Hence, we use
the DFMRF scheme as a vehicle to establish scaling laws
for optimal fusion. Additionally, we use the energy scaling
constants to optimize the distribution of the sensor placements.
For independent measurements, we show that the uniform
distribution minimizes the average energy consumption over
all i.i.d spatial placements when the path-loss coefficient of
transmission is greater than two (ν > 2). For ν ∈ [0,2),
we show that the uniform distribution is, in fact, the most
expensive 1 node configuration in terms of routing costs. We
further show the optimality of the uniform distribution applies
for both the lower and upper bounds on the average energy
consumption for correlated measurements with k-nearest
neighbor dependency graph or disk dependency graph under
certain conditions.
To the best of our knowledge, our results are the first to
establish the scalability of data fusion for certain correlation
structures of the sensor measurements. The use of energy
scaling laws for the design of efficient sensor placement is
new and has direct engineering implications. The heuristic
policy DFMRF first appeared in [10], and is made precise
here with detailed asymptotic analysis using the weak law of
large numbers for stabilizing graph functionals. One should
not expect that scalable data fusion is always possible, and at
the end of Section IV, we will discuss examples of correlation
structures where scalable data fusion does not exist.
C. Prior and related work
The seminal work of Gupta and Kumar [11] on the capacity
of wireless networks has stimulated extensive studies
covering a broad range of networking problems with different
performance metrics. See also [12]. Here, we restrict ourselves
to related works on energy consumption and data fusion for
statistical inference.
Results on scaling laws for energy consumption are limited.
In [13], energy scaling laws for multihop wireless networks
(without any data fusion) are derived under different routing
strategies. The issue of node placement for desirable energy
scaling has been considered in [14], [15], where it is argued
that uniform node placement, routinely considered in the
literature, has poor energy performance. It is interesting to
note that, for fusion networks, uniform sensor distribution is
in fact optimal among a general class of distributions. See
Section IV-B.
Energy-efficient data fusion has received a great deal of
attention over the past decade. See a few recent surveys in [16],
[17]. It has been recognized that sensor observations tend to be
correlated, and that correlations should be exploited through
data fusion. One line of approach is the use of distributed
compression with the aim of routing all the measurements to
the fusion center. Examples of such approaches can be found
in [18]–[20].
While sending data from all sensors to the fusion center
certainly ensures optimal inference, it is not necessary for
statistical inference. More relevant to our work is the idea of
data aggregation, e.g., [21]–[23]. Finding aggregation policies
for correlated data, however, is nontrivial; it depends on the
specific applications for which the sensor network is designed.
Perhaps a more precise notion of aggregation is in-network
function computation where certain functions are computed by
1 The path-loss coefficient for wireless transmissions, in general, satisfies
ν > 2.3
passing intermediate values among nodes [24]–[27]. However,
these works are mostly concerned with computing symmetric
functions such as the sum function.
In the context of statistical inference using wireless sensor
networks, the idea of aggregation and in-network processing
has been explored by several authors. See, e.g., [28]–[34].
Most relevant to our work are [28]–[32] where the Markovian
correlation structures of sensor measurements are exploited
explicitly. These results, however, do not deal with randomly
placed sensors, and energy scaling laws are not established.
The results presented in this paper extend some of our
earlier work in the direction of scaling-law analysis in random
fusion networks. In [6], [10], [35], for fixed network size
and node placement, we analyzed the minimum energy fusion
scheme for optimal inference and showed that it reduces to
a Steiner tree under certain constraints. We also proposed
a heuristic called the DFMRF 2 . In [36], we analyzed the
optimal sensor density for uniform node placement which
maximizes the inference error exponent under an average
energy constraint, and in [37], [48], we derived the error
exponent for MRF hypotheses. In [5], we analyzed optimal
sensor selection (i.e., sub-sampling) policies for achieving
tradeoff between fusion costs and inference performance.
The energy scaling laws derived in this paper rely heavily
on several results on the law of large numbers on geometric
random graphs. We have extensively borrowed the formulations
and techniques of Penrose and Yukich [9], [38]. See
Appendix A for a brief description and [7], [8], [39] for
detailed expositions of these ideas.
II. SYSTEM MODEL
In this paper, we will consider various graphs. Chief among
these are (i) dependency graphs specifying the correlation
structure of sensor measurements, (ii) network graphs denoting
feasible links for communication, and (iii) fusion digraphs
denoting the (directed) links used by a policy to route and
aggregate data.
A. Stochastic model of sensor locations
We assume that n sensor nodes (including the fusion center)
are placed randomly with sensor i located at Vi ∈ R2 .
By convention, the location of the fusion center is denoted
by V1. We denote the set of locations of the n sensors by
Vn:={V1,...,Vn}. For our scaling law analysis, we consider
a sequence of sensor populations on expanding square regions
n
Q n of area
λ λ
and centered at the origin, where we fix λ as the
overall sensor density and let the number of sensors n → ∞.
To generate sensor locations Vi, first let Q1 := [−1 be the unit area square3 i.i.d.
, and Xi ∼
2
,
1
2 ]2
κ,1 ≤ i ≤ n, be a set
of n independent and identically distributed (i.i.d.) random
variables distributed on support Q1 according to κ. Here, κ is
a probability density function (pdf) on Q1 which is bounded
away from zero and infinity. We then generate Vi by scaling
Xi accordingly: Vi = √ n
λXi ∈ Q n
λ
. A useful special case is
2 The DFMRF scheme is referred to as AggMST in [6], [35].
3 The results in this paper hold for κ defined on any convex unit area.
the uniform distribution (κ ≡ 1). Let Pλ be the homogeneous
Poisson distribution on R 2 with density λ.
B. Graphical inference model: dependency graphs
We consider the inference problem of simple binary hypothesis
testing, H0 vs. H1, on a pair of Markov random fields
(MRF). Under regularity conditions [40], a MRF is defined
by its (undirected) dependency graph G and an associated pdf
f(· | G).
Under hypothesis Hk, we assume that the dependency graph
Gk := (Vn,Ek) models the correlation among the sensor
observations, where Vn = {V1, · · · ,Vn} is the set of vertices
corresponding to sensor locations, generated according to the
stochastic model in Sec II-A. Note that the vertex sets under
the two hypotheses are identical. Set Ek is the set of edges
of the dependency graph Gk, and it defines the correlations of
the sensor observations, as described in the next section.
We restrict our attention to proximity-based dependency
graphs. In particular, we consider two classes of dependency
graphs 4 : the (undirected) k-nearest neighbor graph (k-NNG)
and the disk graph, also known as continuum percolation.
We expect that our results extend to other locally-defined
dependency structures such as the Delaunay, Voronoi, the minimum
spanning tree, the sphere of influence and the Gabriel
graphs. An important property of the aforementioned graphs
is a certain stabilization property (discussed in Appendix A)
facilitating asymptotic scaling analysis.
C. Graphical inference model: likelihood functions
We denote the (random) measurements from all the sensors
in a vertex set V by YV, and YU denotes the vector that contains
observations on a vertex subset U ⊂ V. The inference
problem can now be stated as the following hypothesis test:
H0 : YV ∼ f(y | G0, H0) vs. H1 : YV ∼ f(y | G1, H1), (2)
where f(y | Gk, Hk) is the pdf of YV conditioned on the
dependency graph Gk under hypothesis Hk. Note that the
sensor locations Vn have the same distribution under either
hypothesis. Therefore, only the conditional distribution of YV
under each hypothesis is relevant for inference.
Under each hypothesis, the dependency graph involves
conditional-independence relations between the measurements
[40]
Yi ⊥ Y V\N(i;Gk) | {Y N(i;Gk),V}, under Hk, (3)
where N(i;Gk) is the set of neighbors of i in Gk, and ⊥
denotes conditional independence. In words, given the node
locations and the measurements at neighbors of a node in the
dependency graph, the measurement at a node is conditionally
independent of the rest of the network.
4 The k-nearest neighbor graph (k-NNG) has edges (i, j) if i is one of the
k nearest neighbors of j or viceversa, and ties are arbitrarily broken. The disk
graph has edges between any two points within a certain specified distance
(radius).4
The celebrated Hammersley-Clifford theorem states that,
under the positivity condition [41], the log-likelihood function
of a MRF with dependency graph Gk can be expressed as
−log f(yV | Gk, Hk) = ∑
ψk,c(yc), k = 0,1, (4)
c∈Ck
where Ck is a collection of (maximal) cliques in Gk, the
functions ψk,c, known as clique potentials, are real valued,
non-negative and not zero everywhere on the support of
distribution of yc. We assume that the normalization constant
is already incorporated in the potential functions to ensure that
(4) indeed describes a pdf. In general, it is NP-hard to evaluate
the normalization constant given arbitrary potential functions
[42], but can be carried out at the fusion center without any
need for communication of sensor measurements.
D. Graphical fusion model and energy consumption
The set of feasible communications links form the (directed)
network graph denoted by Ng(V). We assume that it is
connected but not necessarily fully connected, and that it
contains the Euclidean minimum spanning tree over the node
set Vn and directed towards the fusion center V1, denoted
by DMST(Vn;V1). Transmissions are scheduled so as to not
interfere with one other. Nodes are capable of adjusting their
transmission power depending on the location of the receiver.
A fusion policy π consists of a transmission schedule with
the transmitter-receiver pairs, the time of transmission, and
the aggregation algorithm that allows a node to combine its
own and received values to produce a new communicating
value. We model a fusion policy π by a directed fusion
graph, Fπ := (V, −→ E π), where V is the same set of vertices
corresponding to sensor locations, and −→ E π contains directed
links. A directed5 link < i,j > denotes a direct transmission
from i to j and is required to be contained in the network
graph Ng(V) for the transmissions to be feasible. If one node
communicates with another node k times, k direct links will
be added between these two nodes in the edge set −→ E π of the
fusion policy π. Since we are only interested in characterizing
the overall energy expenditure, the order of transmissions is
not important; we only need to consider the associated cost
with each link in −→ E π and calculate the sum cost for π.
Nodes communicate in the form of packets. Each packet
contains bits for at most one (quantized) real variable and other
overhead bits independent of the network size. We assume
that all real variables 6
are quantized to K bits, and K is
independent of network size and is sufficiently large that
quantization errors can be ignored. Thus, for node i to transmit
data to node j distance |i,j| away, we assume that node i
spends energy 7 γ|i,j| ν . Without loss of generality, we assume
γ = 1. Hence, given a fusion policy Fπ = (V, −→ E π) of network
5 We denote a directed link by < i, j > and an undirected link by (i, j).
6 In principle, the raw and aggregated data may require different amount
of energy for communication, and this can be easily incorporated into our
framework.
7 Since nodes only communicate a finite number of bits, we use energy
instead of power as the cost measure.
size n, the average energy consumption is given by
Ē(π(V)) = 1 1 ∑
E(π(V)) = |i,j|
n n
ν , 2 ≤ ν ≤ 6.
<i,j>∈ −→ E π
The model specification is now complete.
III. MINIMUM ENERGY DATA FUSION
In this section, we present data fusion policies aimed at
minimizing energy expenditure under the constraint of optimal
inference at the fusion center. The scalability of these policies
is deferred to Section IV.
A. Optimal data fusion: a reformulation
The inference problem, defined in (2), involves two different
graphical models, each with its own dependency graph and
associated likelihood function. They do share the same vertex
set V which allows us to join the two graphical models into
one.
Define
⋃
the joint dependency graph G:=(V,E), where
E:=E0 E1, as the union of the two dependency graphs G0
and G1. The minimal sufficient statistic8 is given by the loglikelihood
ratio (LLR) [43]. With the substitution of (4), it is
given by
LG(YV) := log f(YV | G0, H0)
f(YV | G1, H1)
= ∑
ψ1,a(Ya) − ∑
a∈C1
:= ∑
φc(Yc),
c∈C
b∈C0
ψ0,b(Yb)
(5)
⋃
C:=C0 C1, (6)
where the effective potential functions φc are given by
φc(Yc):= ∑
a∈C1,a⊂c
ψ1,a(Ya) − ∑
b∈C0,b⊂c
ψ0,b(Yb), ∀c ∈ C. (7)
Hereafter, we will work with (G,LG(YV)). Note that the
LLR is minimally sufficient (i.e., maximum dimensionality
reduction) implying maximum possible savings in routing
costs under the constraint of optimal inference.
Given the node set V, we can now reformulate the optimal
data fusion problem as the following optimization
E(π ∗ ∑
(V)) = inf Ei(π(V)), (8)
π∈FG
i∈V
where FG is the set of valid data fusion policies
FG:={π : LG(YV) computable at the fusion center}.
Note that the optimization in (8) is a function of the dependency
graph G, and that the optimal solution is attained by
some policy.
8 A sufficient statistic is a well-behaved function of the data, which is as
informative as the raw data for inference. It is minimal if it is a function of
every other sufficient statistic.5
B. Minimum energy data fusion: a lower bound
The following theorem gives a lower bound on minimum
energy in (8), given the joint dependency graph G and the
path-loss coefficient ν.
Theorem 1 (Lower bound on minimum energy expenditure):
Let MST(V) be the Euclidean minimum spanning tree over
node set V. Then,
1) the energy cost for the optimal fusion policy π ∗ in (8)
satisfies
E(π ∗ (V)) ≥
∑
e∈MST(V)
|e| ν :=E(MST(V)), (9)
2) the lower bound (9) is achieved (i.e., equality holds)
when the observations are independent under both hypotheses.
In this case, the optimal fusion policy π∗ aggregates data along DMST(V;V1), the directed minimum
spanning tree, with all the edges directed toward
the fusion center V1. Hence, the optimal fusion digraph
Fπ∗ is the DMST(V;V1).
Proof: We will first prove part 2), for which we consider
the case when observations are independent, and the loglikelihood
ratio is given by
LG(YV) = ∑
Li(Yi), Li(Yi):= log f1,i(Yi)
f0,i(Yi) .
i∈V
Consider MST(V), whose links minimize the sum of the
power weighted edges
∑
|e| ν . It is easy to check
e∈Tree(V)
that at the fusion center, the log-likelihood ratio can be
computed using the following aggregation scheme along the
DMST(V;V1) as illustrated in (1): each node i computes the
aggregated variable qi(YV) from its predecessor and sends
it to its immediate successor. The variable qi is given by the
summation
qi(YV):= ∑
qj(YV) + Li(Yi), (10)
j∈Np(i)
where Np(i) is the set of immediate predecessors of i in
DMST(V;V1).
To show part 1), we note that any data fusion policy must
have each node transmit at least once and the transmission
must ultimately reach the fusion center. This implies that the
fusion digraph must be connected with the fusion center and
the DMST with edge-weight |e| ν minimizes the total energy
under the above constraints. Hence, we have (9). ✷
Note that the above lower bound is tight in the sense that the
bound is achievable when the measurements are independent
under both hypotheses. It is interesting to note that data
correlations, in general, increase the fusion cost under the
constraint of optimal inference performance.
C. Minimum energy data fusion: an upper bound
We now consider the general case of correlated measurements
and devise a suboptimal data fusion scheme which gives
an upper bound on the optimal energy in (8). The suboptimal
scheme, referred to as Data Fusion on Markov Random Fields
Fig. 1.
V3
V4
q3 q4
q5
V2
V7
q1
V1
V5
Fusion center
V6
q6
q2 = L2(Y2) + q4 + q5
The optimal fusion graph DMST for independent observations.
(DFMRF), is a natural generalization of the MST aggregation
scheme described in Theorem 1.
Recall the form of the log-likelihood ratio for hypothesis
testing of Markov random fields, given in (6)
LG(YV) = ∑
φc(Yc).
c∈C
We shall use Fig. 2 to illustrate the idea behind DFMRF. It
is made of two phases:
1) In the data forwarding phase, for each clique c in the set
of maximal cliques C, a processor, denoted by Proc(c),
is chosen randomly amongst the members of clique
c. Each node in clique c then forwards its raw data
to Proc(c) and Proc(c) computes the clique potential
φc(Yc).
2) In the data aggregation phase, processors compute the
sum of the clique potentials along DMST(V;V1), the
directed MST towards the fusion center.
Hence, the fusion digraph for the DFMRF scheme is the
union of the two graphs in the above stages, viz., forwarding
subgraph (FG(V)) and aggregation subgraph (AG(V)). The
total energy consumption of DFMRF is given by
E(DFMRF(V)) = ∑
c∈C(V) i⊂c
∑
E SP (i, Proc(c);Ng)
+ E(MST(V)), (11)
where E SP (i,j;Ng) denotes the energy consumption for the
shortest path between i and j using the links in the network
graph Ng(V) (set of feasible links for direct transmission).
For independent measurements, the maximal clique set is
trivially the set of vertices V and hence, DFMRF reduces
to aggregation along the DMST(V;V1), which is optimal for
independent observations. However, in general, DFMRF is
not optimal. For the 1-nearest neighbor dependency graph,
DFMRF has a constant approximation ratio with respect to
the optimal data fusion scheme π ∗ in (8).
Theorem 2 (Approximation under 1-NNG dependency [10]):
DFMRF is a 2-approximation algorithm when the dependency
graph G is the 1-nearest neighbor graph6
Fusion center
Processor
(a) Maximal cliques of dependency
graph
Fig. 2.
(b) Forwarding subgraph computes
clique potentials
Schematic of dependency graph of Markov random field and stages of data fusion.
+
(c) Aggregation subgraph adds
computed potentials
Dependency graph
Forwarding subgraph (FG)
Aggregation graph (AG)
(d) Legend
E(DFMRF(V))
E(π ∗ (V))
≤ 2, (12)
over all node sets V in R2 .
Proof: Sine 1-NNG is acyclic, the maximum clique size is
2. Hence, for DFMRF, the forwarding subgraph (FG) is the
1-NNG with arbitrary directions on the edges. We have
Thus,
E(FG(V)) = E(1-NNG(V)) ≤ E(MST(V)).
E(DFMRF(V)) = E(FG(V)) + E(AG(V)), (13)
≤ 2 E(MST(V)) ≤ 2E(π ∗ (V)), (14)
where the last inequality comes from Theorem 1. ✷
Note that the above result does not extend to general k-NNG
dependency graphs (k > 1) for finite network size. However,
as the network size goes to infinity, we will show in Section
IV-B that a constant-factor approximation ratio is achieved.
IV. ENERGY SCALING LAWS
We now establish the scaling laws for optimal and suboptimal
fusion policies. From the expression of average energy
cost in (5), we see that the scaling laws will rely on the law
of large numbers (LLN) for stabilizing graph functionals. An
overview of the LLN is provided in Appendix A.
We recall some notations and definitions used in this section.
i.i.d.
Xi ∼ κ, where κ is defined on Q1, the unit square centered
at the origin. The node set is Vn:= √ n
λ (Xi) n i=1 and the limit
is obtained by letting n → ∞ with fixed λ > 0.
A. Energy scaling for optimal fusion: independent case
We first provide the scaling result for the case when
the measurements are independent under either hypothesis.
From Theorem 1, the optimal fusion scheme minimizing total
energy consumption is given by summation along the directed
minimum spanning tree. Hence, the energy scaling is obtained
by the analysis of the MST.
For node set Vn, the average energy consumption of the
optimal fusion scheme for independent measurements is
Ē(π ∗ (Vn)) = Ē(MST(Vn)) = 1
n
∑
e∈MST(Vn)
|e| ν . (15)
Let ζ(ν; MST) be the constant arising in the asymptotic
analysis of the MST edge lengths, that is
[ ∑
ζ(ν; MST):=E
e∈E(0;MST(P1∪{0}))
1
2 |e|ν], (16)
where 0 is a point at the origin of R2 , Pτ is the homogeneous
Poisson process of intensity τ, and E(0; ∪ {0})) denotes
the set of edges incident to the origin in MST(P1∪{0}).
The above constant is half the expectation of the powerweighted
edges incident to the origin in the minimum spanning
tree over a homogeneous unit intensity Poisson process, and is
discussed in Appendix A in (41). Although ζ(ν; MST) is not
available in closed form, we evaluate it through simulations
in Section V.
We now provide the scaling result for the optimal fusion
scheme when the measurements are independent based on the
LLN for the MST obtained in [9, Thm 2.3(ii)].
Theorem 3 (Scaling for independent data [9]): When the
sensor measurements are independent under each hypothesis,
the limit of the average energy consumption of the optimal
fusion scheme in (15) is given by
lim Ē(π
n→∞
∗ (Vn)) L2
∫
ν
ν
−
= λ 2 1−
ζ(ν; MST) κ(x) 2 dx. (17)
Q1
Hence, asymptotically the average energy consumption of
optimal fusion is a constant in the mean square sense for
independent measurements. In contrast, forwarding all the raw
data to the fusion center according to the shortest-path (SP)
policy has an unbounded average energy growing in the order
of √ n.
The scaling constant for average energy in (17) brings out
the influence of several factors on energy consumption. It is
inversely proportional to the node density λ. This is intuitive
since placing the nodes with a higher density (smaller area)
decreases the average inter-node distances and hence, also the
energy consumption.
The node-placement pdf κ influences the asymptotic energy
consumption through the term
∫
Q1
ν 1−
κ(x) 2 dx.
When the placement is uniform (κ ≡ 1), the above term
evaluates to unity. Hence, the scaling constant in (17) for7
∫
κ(x)
Q1 1− ν 2 dx
2.5
2
1.5
1
0.5
Uniform is
Worst−Case
Uniform
is Optimal
0
0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5
Path Loss ν
Fig. 3. Ratio of energy consumption under node placement distribution κ
and uniform distribution as a function of path-loss ν. See (18) and (19).
B. Energy scaling for optimal fusion: MRF case
We now evaluate the scaling laws for energy consumption of
the DFMRF scheme for a general Markov random field dependency
among sensor measurements. The DFMRF aggregation
scheme involves cliques of the dependency graph which arise
from correlation between sensor measurements. Recall that the
total energy consumption of DFMRF in (11) is given by
E(DFMRF(V)) = ∑
c∈C(V) i⊂c
∑
E SP (i, Proc(c);Ng)
+ E(MST(V)), (21)
uniform placement simplifies to
ν −
λ 2 ζ(ν; MST).
The next theorem shows that the energy under uniform node
placement (κ ≡ 1) optimizes the scaling limit in (17) when
the path loss ν > 2. Also, see Fig.3.
Theorem 4 (Minimum energy placement: independent case):
For any pdf κ supported on the unit square Q1, we have
Proof:
∫
∫
Q1
Q1
ν 1−
κ(x) 2 dx ≥ 1, ∀ν > 2, (18)
ν 1−
κ(x) 2 dx ≤ 1, ∀ν ∈ [0,2). (19)
We have the Hölder inequality
‖f1f2‖1≤‖f1‖p‖f2‖q, ∀p > 1,q = p
, (20)
p − 1
where for any positive function f,
(
‖f‖p :=
∫
f(x) p ) p
dx .
Q1
When ν > 2, in (20), substitute f1(x) with κ(x) 1
p
, f2(x) with
1 −
κ(x)
p ν
, and p with
ν−2
≥ 1 which ensures that p > 1, to
obtain (18).
For ν ∈ (0,2), in (20), substitute f1(x) with κ(x) 1
p
, f2(x)
with 1, p = 2
2−ν
> 1 to obtain (19).
✷
The above result implies that, in the context of i.i.d. node
placements, it is asymptotically energy-optimal to place the
nodes uniformly when the path-loss coefficient ν > 2. The
intuitive reason is as follows: without loss of generality,
consider a clustered distribution in the unit square, where
nodes are more likely to be placed near the origin. The MST
over such a point set has many short edges, but a few very long
edges, since some nodes are likely to occur near the boundary.
On the other hand, for uniform point sets, the edges of the
MST are more likely to be of similar lengths. Since for energy
consumption, we have a power law on edge lengths with path
loss ν > 2, long edges are penalized harshly, leading to the
result that the uniform distribution is optimal.
where E SP (i,j;Ng) denotes the energy consumption for the
shortest path between i and j using the links in the network
graph Ng(V) (set of feasible links for direct transmission).
We now assume that the network graph Ng(V) is a local u-
energy spanner. In the literature [44], a graph Ng(V) is called
a u-energy spanner, for some constant u > 0 called its energy
stretch factor, when it satisfies
E
max
i,j∈V
SP (i,j;Ng)
ESP ≤ u, (22)
(i,j;Cg)
where Cg(V) denotes the complete graph. In other words,
the energy consumption between any two nodes is no worse
than u-times the optimal value. Examples of energy spanners
include the Gabriel graph 9 (with stretch factor u = 1 when
the path-loss ν ≥ 2), the Yao graph, and its variations [44].
In this paper, we only require a weaker version 10 of the above
property that there is at most u-energy stretch between the
neighbors in the dependency graph
From (23), we have
E
max
(i,j)∈G
SP (i,j;Ng)
ESP ≤ u. (23)
(i,j;Cg)
E(FG(V)) ≤ u ∑
≤
c∈C(V) i⊂c
u ∑
c∈C(V) i⊂c
∑
E SP (i, Proc(c);Cg),
∑
|i, Proc(c)| ν . (24)
Recall that the processors are local within the clique, i.e.,
Proc(c) ⊂ c, for each clique c in the dependency graph. Hence,
in (24), only the edges of the processors of all the cliques
are included in the summation. This is upper bounded by the
sum of all the power-weighted edges of the dependency graph
G(V). Hence, we have
E(FG(V)) ≤ u ∑
e∈G(V)
|e| ν . (25)
√
9The longest edge in Gabriel graph is O( log n), the same order as that
of the MST [45]. Hence, the maximum power at a node needed to ensure u-
energy spanning property is of the same order as that needed for connectivity.
10 In fact, it suffices to have lim sup
n→∞
max
(i,j)∈G(Vn)
ESP (i, j; Ng(Vn))
ESP ≤ u.
(i, j; Cg(Vn))8
From (21), for the total energy consumption of the DFMRF
scheme, we have the upper bound,
E(DFMRF(V)) ≤ u ∑
e∈G(V)
|e| ν + E(MST(V)). (26)
By (26), the total cost of this scheme E(DFMRF) is upper
bounded by the sum of powers of edge lengths of the dependency
graph, allowing us to draw upon the general methods
of [9], [46].
From (26), the DFMRF scheme will scale whenever the
right-hand side of (25) scales. By Theorem 3, the energy
consumption along the MST scales. Hence, we only need to
establish the scaling behavior of the first term in (25).
We now prove scaling laws governing the energy consumption
of DFMRF and we also establish its approximation ratio
with respect to the optimal fusion scheme. This in turn also
establishes the scaling behavior of the optimal scheme.
Theorem 5 (Scaling of DFMRF Scheme): When the dependency
graph G is either the k-nearest neighbor or the disk
graph, the average energy of DFMRF scheme satisfies
limsup Ē(DFMRF(Vn))
n→∞
a.s. (
1 ∑
≤ limsup u |e|
n→∞ n
e∈G(Vn)
ν + Ē(MST(Vn))
)
∫
L 2
= u
2
Q1
[ ∑
E
j:(0,j)∈G(P λκ(x)∪{0})
ν −
+λ 2 ζ(ν; MST)
∫
Q1
|0,j| ν]
κ(x)dx
ν 1−
κ(x) 2 dx. (27)
Proof: See Appendix B. ✷
Hence, the above result establishes the scalability of the
DFMRF scheme. In the theorem below, we use this result
to prove the scalability of the optimal fusion scheme and
establish asymptotic upper and lower bounds on its average
energy.
Theorem 6 (Scaling of Optimal Scheme): When the dependency
graph G is either the k-nearest neighbor or the disk
graph, the limit of the average energy consumption of the
optimal scheme π ∗ satisfies the upper bound
limsup Ē(π
n→∞
∗ (Vn)) a.s.
≤ limsup Ē(DFMRF(Vn)), (28)
n→∞
where the right-hand side satisfies the upper bound in (27).
Also, π ∗ satisfies the lower bound given by the MST
liminf
n→∞
Ē(DFMRF(Vn)) a.s.
≥ liminf Ē(π
n→∞
∗ (Vn))
∫
a.s.
≥ lim Ē(MST(Vn))
n→∞
L2 ν −
= λ 2 ζ(ν; MST)
Q1
ν 1−
κ(x) 2 dx.(29)
Proof: From (9), the DFMRF and the optimal scheme satisfy
the lower bound given by the MST.
✷
Hence, the limiting average energy consumption for both
the DFMRF scheme and the optimal scheme is strictly finite,
and is bounded by (27) and (29). These bounds also establish
that the approximation ratio of the DFMRF scheme is asymptotically
bounded by a constant, as stated below. Define the
constant ρ := ρ(u,λ,κ,ν), given by
ρ:=1 +
∫
u
Q1
1
2 E
[ ∑
j:(0,j)∈G(P λκ(x)∪{0})
ν
λ− 2 ζ(ν; MST)
∫
Q1
|0,j| ν]
κ(x)dx
ν 1−
κ(x) 2 dx
Lemma 1 (Approximation Ratio for DFMRF): The
approximation ratio of DFMRF is given by
limsup
n→∞
a.s.
≤
E(DFMRF(Vn))
E(π∗ (Vn))
limsup
n→∞
E(DFMRF(Vn))
E(MST(Vn))
. (30)
L 2
= ρ, (31)
where ρ is given by (30).
Proof: Combine Theorem 5 and Theorem 6. ✷
We further simplify the above results for the k-nearest
neighbor dependency graph in the corollary below by exploiting
its scale invariance. The results are expected to hold for
other scale-invariant stabilizing graphs as well. The edges of
a scale-invariant graph are invariant under a change of scale,
or put differently, G is scale invariant if scalar multiplication
by α induces a graph isomorphism from G(V) to G(αV) for
all node sets V and all α > 0.
Along the lines of (16), let ζ(ν;k-NNG) be the constant
arising in the asymptotic analysis of the k-NNG edge lengths,
that is
[ ∑
ζ(ν;k-NNG):=E
j:(0,j)∈k-NNG(P1∪{0})
1
2 |0,j|ν]. (32)
Corollary 1 (k-NNG Dependency Graph): We obtain a
simplification of Theorem 5 and 6 for average energy
consumption, namely
limsup
n→∞
a.s.
≤ limsup
n→∞
Ē(π ∗ (Vn)) a.s.
(
1
n
≤ limsup Ē(DFMRF(Vn))
n→∞
∑
u |e| ν + Ē(MST(Vn))
)
e∈G(Vn)
L 2 ν −
= λ 2 [u ζ(ν;k-NNG) + ζ(ν; MST)]
The approximation ratio of DFMRF satisfies
limsup
n→∞
E(DFMRF(Vn))
E(π ∗ (Vn))
a.s.
≤ limsup
L 2
=
(
n→∞
∫
Q1
ν 1−
κ(x) 2 dx. (33)
E(DFMRF(Vn))
E(MST(Vn))
1 + u ζ(ν;k-NNG)
ζ(ν; MST)
)
. (34)9
Proof: This follows from [9, Thm 2.2]. ✷
Hence, the expressions for scaling bounds and the approximation
ratio are simplified when the dependency graph is the
k-nearest neighbor graph. A special case of this scaling result
for nearest-neighbor dependency under uniform placement was
proven in [36, Thm 2].
It is interesting to note that the approximation factor for
the k-NNG dependency graph in (34) is independent of the
node placement pdf κ and node density λ. Hence, DFMRF
has the same efficiency under different node placements. The
results of Theorem 4 on the optimality of uniform placement
are applicable here, but for the lower and upper bounds on
energy consumption. We formally state it below.
Theorem 7 (Minimum energy bounds for k-NNG):
Uniform node placement minimizes the asymptotic lower
and upper bounds on the average energy consumption in (29)
and (33) for k-NNG dependency graph over all i.i.d. node
placements κ.
Proof: From Theorem 4 and (33). ✷
We also prove the optimality of uniform distribution under
disk-dependency graphs, but over a limited set of node
placements κ.
Theorem 8 (Minimum energy bound for disk graph):
Uniform node placement minimizes the asymptotic lower and
upper bounds on the average energy consumption in (29) and
(33) for disk dependency graph over all i.i.d. node placements
κ satisfying the lower bound
κ(x) > 1
λ , ∀x ∈ Q1, (35)
where λ > 1 is the (fixed) node placement density.
Proof: We use the fact that for the disk graph with a fixed
radius, more edges are added as we scale down the area.
Hence, for Poisson processes with intensities λ1 > λ2 > 0,
[ ∑
E
j:(0,j)∈G(Pλ 1 ∪{0})
|0,j| ν]
[ ∑
≥ E
j:(0,j)∈G(Pλ 2 ∪{0})
|0,j| ν] [ λ2
λ1
] ν
2
,
where the right-hand side is obtained by merely rescaling the
edges present at intensity λ2. Since, new edges are added at
λ1, this is an inequality, unlike the case of k-NNG where the
edge set is invariant under scaling. Substituting λ1 with λκ(x),
and λ2 by 1, we have
≥
≥
∫
[ ∑
E
Q1 j:(0,j)∈G(Pλκ(x)∪{0}) ν −
λ 2 E
ν −
λ 2 E
[ ∑
j:(0,j)∈G(P1∪{0})
[ ∑
j:(0,j)∈G(P1∪{0})
|0,j| ν]
κ(x)dx
∫
|0,j|
ν]
Q1
|0,j| ν]
, ν > 2.
ν 1−
κ(x) 2 dx,
✷
Hence, uniform placement is optimal if we limit to distributions
κ satisfying (35). We have so far established the finite
scaling of the average energy when the dependency graph
describing correlations among the sensor observations is either
the k-NNG or the disk graph. However, we cannot expect finite
scaling for any general dependency graph. For instance, for the
the complete graph, the optimal fusion scheme reduces to a
version of the shortest path (SP) routing, where the average
energy consumption grows as √ n. Since the LLR in (6) is
now function over a single clique containing all the nodes,
the optimal scheme consists of a unique processor chosen
optimally, to which all the other nodes forward their raw data
along shortest paths, and the processor then forwards the value
of the LLR to the fusion center.
V. NUMERICAL ILLUSTRATIONS
As described in Section II-A, n nodes are placed in area
n
λ
and one of them is randomly chosen as the fusion center.
We conduct 500 independent simulation runs and average the
results. We fix node density λ = 1. We plot results for two
cases of dependency graph, viz., the k-nearest neighbor graph
and the disk graph with radius δ.
In Fig.4, we plot the simulation results for k-nearest neighbor
dependency and uniform node placement. Corollary 1 establishes
that the average energy consumption of the DFMRF
scheme in (33) is finite and bounded for asymptotic networks
under k-NNG dependency. The results in Fig.4a agree with
theory and we note that the convergence to asymptotic values
is quick, and occurs in networks with as little as 30 nodes.
Moreover, the energy for DFMRF scheme increases with
the number of neighbors k in the dependency graph since
more edges are added. On the other hand, the average energy
under no aggregation (SP policy) increases without bound, as
predicted in Section I-A.
We plot the approximation ratio of the DFMRF scheme for
k-NNG in (34) vs. the number of nodes in Fig.4b and vs. the
path-loss coefficient ν in Fig.4c. As predicted by Corollary
1, the approximation ratio is a constant for large networks,
and find a quick convergence to this value in Fig.4b as we
increase the network size. In Fig.4c, we also find that the
approximation ratio is insensitive with respect to the path loss
ν. Hence, DFMRF scheme has nearly the same efficiency in
the entire range of ν ∈ [2,6] under the k-NNG dependency.
In Fig.5a, we plot the average energy consumption of
DFMRF in (27) under uniform node placement and disk dependency
graph with radius δ. The average energy is bounded,
as predicted by Theorem 5. As in the k-NNG case, on
increasing the network size, there is a quick convergence to the
asymptotic values. Moreover, as expected, energy consumption
increases with δ since more edges are added to the dependency
graph. Note that the energy consumption at δ = 0 and δ = 0.3
are nearly the same, since at δ = 0.3, the disk graph is very
sparse, and hence, energy consumed in the forwarding stage
(FG) of LLR computation is small.
We now study the effect of node placement distribution
on energy consumption. In Fig.5b and 5c, we compare the
uniform node placement with i.i.d. placement according to pdf
κ given by
κ(x) = κ1(x(1))κ1(x(2)), x ∈ R 2 , (36)
where, for some a̸=0, κ1 is given by the truncated exponential10
Avg. energy per node
10
9
8
7
6
5
4
3
2
1
0
0-NNG: MST
1-NNG: DFMRF
2-NNG: DFMRF
3-NNG: DFMRF
No Fusion: SPR
20 40 60 80 100 120 140 160 180
Number of nodes n
(a) Avg energy vs. no. of nodes, ν = 2
Approx. ratio for DFMRF
5
4.5
4
3.5
3
2.5
2
1.5
1
0.5
No correlation
1-NNG dependency
2-NNG dependency
3-NNG dependency
20 40 60 80 100 120 140 160 180
Number of nodes n
(b) Approx. ratio vs. no. of nodes, ν = 2
Approx. ratio for DFMRF
5.5
5
4.5
4
3.5
3
2.5
2
1.5
1
0.5
No correlation
1-NNG dependency
2-NNG dependency
3-NNG dependency
0
2 2.5 3 3.5 4 4.5 5 5.5 6
Path loss coefficient ν
(c) Approx. ratio vs. path loss, n = 190
Fig. 4. Average energy consumption for DFMRF scheme and shortest-path routing for uniform distribution and k-NNG dependency. See Corollary 1.
Avg. energy for DFMRF
1.3
1.2
1.1
1
0.9
0.8
0.7
0.6
0.5
δ = 0
δ = 0.3
δ = 0.6
δ = 0.9
20 40 60 80 100 120 140 160 180
Number of nodes n
(a) Disk graph, ν = 2, uniform
Avg. energy for SPR
30
25
20
15
10
5
Uniform: a = 0
Clustered: a = 5
Spread out: a = −5
0
2 2.2 2.4 2.6 2.8 3 3.2 3.4 3.6 3.8 4
Path Loss Coefficient ν
(b) Avg energy for SPR, n = 190
Fig. 5. Average energy consumption for DFMRF and shortest path (SPR) scheme. See Theorem 5.
Avg. Energy for DFMRF
0.7
0.65
0.6
0.55
0.5
Uniform: a = 0
Clustered: a = 5
Spread out: a = −5
0.45
1 1.2 1.4 1.6 1.8 2 2.2 2.4 2.6 2.8 3
Path Loss ν
(c) Avg. Energy for DFMRF, i.i.d., n = 190
⎧
⎨ ae
κ1(z) =
⎩
−a|z|
1
a , if z ∈ [−1
2(1 − e− 2
,
2 2 ) ],
0, o.w. (37)
Note that as a → 0, we obtain the uniform distribution in
the limit. A positive (negative) a corresponds to clustering
(spreading out) of the points with respect to the origin. In
Fig.6, a sample realization for cases a = ±5 and anda = 0 is
shown.
1
0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
0.5
0.4
0.3
0.2
0.1
For i.i.d. data, from Theorem 4, the uniform node placement
minimizes the asymptotic average energy consumption of
the optimal scheme, which is aggregation along the MST,
whenever the path-loss coefficient ν ≥ 2. For ν ∈ [0,2], the
uniform distribution has the worst-case energy. This is verified
in Fig.5c, where for ν ∈ [1,3], the uniform distribution initially
has high energy consumption but decreases as we increase
ν. We see that at threshold of around ν = 2.4, the uniform
distribution starts having lower energy than the non-uniform
placements (clustered and spread-out), while according to
Theorem 4, the threshold is ν = 2. Moreover, Theorem 4
also predicts that the clustered and spread-out distributions
will have the same energy consumption since ∫
ν
κ(x)1− 2 dx
Q1
are equal for a = 5 and a = −5 for κ given by (36) and (37),
and this approximately holds in Fig.5c.
0.5
0
0
0.4
0.3
0.2
0.1
0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
(a) Uniform a → 0
−0.1
−0.2
−0.3
−0.4
−0.5
−0.5 −0.4 −0.3 −0.2 −0.1 0 0.1 0.2 0.3 0.4 0.5
(b) Clustered a = 5
−0.1
−0.2
−0.3
−0.4
−0.5
−0.5 −0.4 −0.3 −0.2 −0.1 0 0.1 0.2 0.3 0.4 0.5
(c) Spread-out a = −5
Fig. 6. Sample realization of n = 190 points on unit square. See (36), (37).
Intuitively, for shortest-path (SP) policy, if we cluster the
nodes close to one another, the average energy consumption
decreases. On the other hand, spreading the nodes out
towards the boundary increases the average energy. Indeed
this behavior is validated by the results in Fig.5b, for kappa
defined above in (36) and (37). However, as we analyzed in
the previous section, optimal node placement for the DFMRF
scheme does not follow this simple intuition.
VI. CONCLUSION
We analyzed the scaling laws for energy consumption of
data fusion schemes for optimal distributed inference. Forwarding
all the raw data without fusion has an unbounded
average energy as we increase the network size, and hence,
is not a feasible strategy. We established finite average energy
scaling for a fusion heuristic known as Data Fusion for Markov
Random Fields (DFMRF) for certain class of spatial correlation
models. We analyzed the influence of the correlation
structure, node placement distribution, node density and the
transmission environment on the energy consumption.
There are many issues that are not handled in this paper.
Our fusion scheme DFMRF needs centralized topology information,
and has to be extended to a distributed scheme, where11
only local topology information is available. Our model currently
only incorporates i.i.d. node placements. We expect our
results to extend to the correlated node placement according to
a Gibbs point process through the results in [47]. We have not
considered here the scaling behavior of inference performance
with network size, and is a topic of study in [37], [48]. We
have not considered the time required for data fusion, and it
will be interesting to establish bounds in this case. Our current
correlation model assumes a discrete Markov random field. A
more natural but difficult approach is to consider Markov field
over continuous space [49] and then, sample it through node
placements.
Acknowledgment
The authors thank A. Ephremides, T. He, D. Shah and the
anonymous reviewers for helpful comments.
APPENDIX
A. Functionals on random points sets
In [9], [38], [50], Penrose and Yukich introduce the concept
of stabilizing functionals to establish weak laws of large
numbers for functionals on graphs with random vertex sets.
As in this paper, the vertex sets may be marked (sensor
measurements constituting one example of marks), but for
simplicity of exposition we will work with unmarked vertices.
We briefly describe the general weak law of large numbers
after introducing the necessary definitions.
Graph functionals on a vertex set V are often represented
as sums of spatially dependent terms
∑
ξ(x,V),
x∈V
where V ⊂ R 2 is locally finite (contains only finitely many
points in any bounded region), and the measurable function
ξ, defined on all pairs (x,V), with x ∈ V, represents the
interaction of x with other points in V. We see that the
functionals corresponding to energy consumption can be cast
in this framework.
When V is random, the range of spatial dependence of ξ
at node x ∈ V is random, and the purpose of stabilization
is to quantify this range in a way useful for asymptotic
analysis. There are several similar notions of stabilization, but
the essence is captured by the notion of stabilization of ξ
with respect to homogeneous Poisson points on R 2 , defined
as follows. Recall that Pτ is a homogeneous Poisson point
process with intensity τ.
We say that ξ is translation invariant if ξ(x,V) = ξ(x +
z,V + z) for all z ∈ R 2 . Let 0 denote the origin of R 2 and
let Br(x) denote the Euclidean ball centered at x with radius
r. A translation-invariant ξ is homogeneously stabilizing if for
all intensities τ > 0 there exists almost surely a finite random
variable R := R(τ) such that
ξ(0,(Pτ ∩ BR(0)) ∪ A) = ξ(0, Pτ ∩ BR(0))
for all locally finite A ⊂ R 2 \ BR(0). Thus ξ stabilizes if the
value of ξ at 0 is unaffected by changes in point configurations
outside BR(0).
ξ satisfies the moment condition of order p > 0 if
sup E[ξ(n
n∈N
1
2 X1,n 1
2 {Xi} n i=1) p ] < ∞. (38)
We will use the following weak laws of large numbers
throughout. Recall that Xi are i.i.d. with density κ.
Theorem 9 (WLLN [9], [46]): Put q = 1 or q = 2. Let ξ
be a homogeneously stabilizing translation-invariant functional
satisfying the moment condition (38) for some p > q. Then
=
1
lim
n→∞ n
∫
Q1
n∑
i=1
(
ξ
√ n
λ Xi,
√
n
λ {Xj} n )
j=1
E[ξ(0, P λκ(x))]κ(x)dx in L q . (39)
We interpret the right-hand side of the above equation
as a weighted average of the values of ξ on homogeneous
Poisson point processes P λκ(x). When ξ satisfies scaling such
as E[ξ(0, Pτ)] = τ −α E[ξ(0, P1)], then the limit on the righthand
side of (39) simplifies to
λ −α ∫
E[ξ(0, P1)]
Q1
(κ(x)) 1−α dx in L q , (40)
a limit appearing regularly in problems in Euclidean combinatorial
optimization. For uniform node placement (κ(x) ≡ 1),
the expression in (39) reduces to E[ξ(0, Pλ)], and the LLN
result for this instance is pictorially depicted in Fig.7.
For example, if ξ(x,V) is one half the sum of the ν-
power weighted edges incident to x in the MST (or any scaleinvariant
stabilizing graph) on V, i.e.,
ξ(x,V):= 1
2
then substituting α with ν
2
1
lim
n→∞ n
n∑
i=1
∑
e∈E(x,MST(V))
in (40),
(
ξ
√ n
ν −
= λ 2 E[ξ(0, P1)]
ν −
= λ 2 ζ(ν; MST)
λ Xi,
∫
∫
Q1
Q1
where ζ(ν; MST) is defined in (16).
B. Proof of Theorem 5
|e| ν ,
√
n
λ {Xi} n )
i=1
ν 1−
(κ(x)) 2 dx
ν 1−
(κ(x)) 2 dx, (41)
The energy consumption of DFMRF satisfies the inequality
in (27). For the MST we have the result in Theorem 3. We
now use stabilizing functionals to show that
1
n
∑
e∈G(Vn)
|e| ν
converges in L 2 to a constant. For all locally finite vertex sets
X ⊂ R 2 supporting some dependency graph G(X) and for all
x ∈ X , define the functional η(x, X) by12
Normalized sum of edges
1
n
∑
e∈G(Vn)
n → ∞
Origin
Expectation of edges
of origin of Poisson process
ν 1
|e| 2 λ− ν 2 E ∑
|e| ν
e∈E(0,G(P λ ∪{0}))
Fig. 7. LLN for sum graph edges on uniform point sets (κ ≡ 1).
η(x, X):=
∑
y:(x,y)∈G(X)
|x,y| ν . (42)
Notice that ∑
x∈X
η(x, X) = 2∑e∈G(X) |e|ν.
From [9, Thm 2.4], the sum of power-weighted edges of
the k-nearest neighbors graph is a stabilizing functional and
satisfies the bounded-moments condition (38). Hence, the limit
in (39) holds when the dependency graph is the k-NNG.
We now show that the sum of power-weighted edges of the
continuum percolation graph is a stabilizing functional which
satisfies the bounded-moments condition (38), thus implying
that the limit in (39) holds.
It is clear that η stabilizes with respect to Pτ , τ ∈ (0, ∞),
since points distant from x by more than the deterministic
disc radius do not modify the value of η(x, Pτ). Moreover, η
satisfies the bounded moments condition (38) since each |x,y|
is bounded by the deterministic disc radius and the number of
nodes in n 1
2 {Xi} n 1
i=1 which are joined to n 2 X1 is a random
variable with moments of all orders.
REFERENCES
[1] A. Anandkumar, J.E.Yukich, A. Swami, and L. Tong, “Energy-
Performance Scaling Laws for Statistical Inference in Large Random
Networks,” in Proc. of ASA Joint Stat. Meet., Denver, USA, Aug. 2008.
[2] A. Anandkumar, J. Yukich, L. Tong, and A. Swami, “Scaling Laws for
Statistical Inference in Random Networks,” in Proc. of Allerton Conf. on
Communication, Control and Computing, Monticello, USA, Sept. 2008.
[3] A. Ephremides, “Energy Concerns in Wireless Networks,” IEEE Wireless
Communications, no. 4, pp. 48–59, August 2002.
[4] W. Li and H. Dai, “Energy-Efficient Distributed Detection Via Multihop
Transmission in Sensor Networks,” IEEE Signal Processing Letters,
vol. 15, pp. 265–268, 2008.
[5] A. Anandkumar, M. Wang, L. Tong, and A. Swami, “Prize-Collecting
Data Fusion for Cost-Performance Tradeoff in Distributed Inference,” in
Proc. of IEEE INFOCOM, Rio De Janeiro, Brazil, April 2009.
[6] A. Anandkumar, L. Tong, A. Swami, and A. Ephremides, “Minimum
Cost Data Aggregation with Localized Processing for Statistical Inference,”
in Proc. of INFOCOM, Phoenix, USA, April 2008, pp. 780–788.
[7] J. Steele, “Growth Rates of Euclidean Minimal Spanning Trees with
Power Weighted Edges,” The Annals of Probability, vol. 16, no. 4, pp.
1767–1787, 1988.
[8] J. Yukich, “Asymptotics for Weighted Minimal Spanning Trees on
Random Points,” Stochastic Processes and their Applications, vol. 85,
no. 1, pp. 123–138, 2000.
[9] M. Penrose and J. Yukich, “Weak Laws Of Large Numbers In Geometric
Probability,” Annals of Applied Probability, vol. 13, no. 1, pp. 277–303,
2003.
[10] A. Anandkumar, L. Tong, and A. Swami, “Energy Efficient Routing for
Statistical Inference of Markov Random Fields,” in Proc. of CISS ’07,
Baltimore, USA, March 2007, pp. 643–648.
[11] P. Gupta and P. R. Kumar, “The Capacity of Wireless Networks,” IEEE
Tran. Information Theory, vol. 46, no. 2, pp. 388–404, March 2000.
[12] M. Franceschetti and R. Meester, Random Networks for Communication:
From Statistical Physics to Information Systems. Cambridge University
Press, 2008.
[13] Q. Zhao and L. Tong, “Energy Efficiency of Large-Scale Wireless
Networks: Proactive vs. Reactive Networking,” IEEE JSAC Special Issue
on Advances in Military Wireless Communications, May 2005.
[14] X. Liu and M. Haenggi, “Toward Quasiregular Sensor Networks: Topology
Control Algorithms for Improved Energy Efficiency,” IEEE Tran.
on Parallel and Distributed Systems, pp. 975–986, 2006.
[15] X. Wu, G. Chen, and S. Das, “Avoiding Energy Holes in Wireless Sensor
Networks with Nonuniform Node Distribution,” IEEE Tran. on Parallel
and Distributed Systems, vol. 19, no. 5, pp. 710–720, May 2008.
[16] Q. Zhao, A. Swami, and L. Tong, “The Interplay Between Signal Processing
and Networking in Sensor Networks,” IEEE Signal Processing
Magazine, vol. 23, no. 4, pp. 84–93, 2006.
[17] A. Giridhar and P. Kumar, “Toward a Theory of In-network Computation
in Wireless Sensor Networks,” IEEE Comm. Mag., vol. 44, no. 4, pp.
98–107, 2006.
[18] R. Cristescu, B. Beferull-Lozano, M. Vetterli, and R. Wattenhofer,
“Network Correlated Data Gathering with Explicit Communication: NP-
Completeness and Algorithms,” IEEE/ACM Transactions on Networking
(TON), vol. 14, no. 1, pp. 41–54, 2006.
[19] P. von Rickenbach and R. Wattenhofer, “Gathering Correlated Data
in Sensor Networks,” in Joint workshop on Foundations of Mobile
Computing, 2004, pp. 60–66.
[20] H. Gupta, V. Navda, S. Das, and V. Chowdhary, “Efficient gathering of
correlated data in sensor networks,” in Proc. of ACM Intl. symposium
on Mobile ad hoc networking and computing, 2005, pp. 402–413.
[21] S. Madden, M. Franklin, J. Hellerstein, and W. Hong, “TinyDB: an
acquisitional query processing system for sensor networks,” ACM Transactions
on Database Systems, vol. 30, no. 1, pp. 122–173, 2005.
[22] C. Intanagonwiwat, R. Govindan, and D. Esterin, “Directed Diffusion
: A Scalable and Robust Paradigm for Sensor Networks,” in Proc. 6th
ACM/Mobicom Conference, Boston,MA, 2000, pp. pp 56–67.
[23] B. Krishnamachari, D. Estrin, and S. Wicker, “Modeling Data-centric
Routing in Wireless Sensor Networks,” in IEEE INFOCOM, New York,
USA, 2002.
[24] A. Giridhar and P. Kumar, “Maximizing the functional lifetime of sensor
networks,” in Proc. of IPSN, 2005.
[25] ——, “Computing and Communicating Functions over Sensor Networks,”
IEEE JSAC, vol. 23, no. 4, pp. 755–764, 2005.
[26] S. Subramanian, P. Gupta, and S. Shakkottai, “Scaling Bounds for
Function Computation over Large Networks,” in IEEE ISIT, June 2007.
[27] O. Ayaso, D. Shah, and M. Dahleh, “Counting Bits for Distributed
Function Computation,” in Proc. ISIT, Toronto, Canada, July 2008, pp.
652–656.
[28] Y. Sung, S. Misra, L. Tong, and A. Ephremides, “Cooperative Routing
for Signal Detection in Large Sensor Networks,” IEEE JSAC, vol. 25,
no. 2, pp. 471–483, 2007.
[29] J. Chamberland and V. Veeravalli, “How Dense Should a Sensor Network
Be for Detection With Correlated Observations?” IEEE Tran. on
Information Theory, vol. 52, no. 11, pp. 5099–5106, 2006.
[30] S. Misra and L. Tong, “Error Exponents for Bayesian Detection with
Randomly Spaced Sensors,” IEEE Tran. on Signal Processing, vol. 56,
no. 8, 2008.
[31] Y. Sung, X. Zhang, L. Tong, and H. Poor, “Sensor Configuration and
Activation for Field Detection in Large Sensor Arrays,” IEEE Tran. on
Signal Processing, vol. 56, no. 2, pp. 447–463, 2008.
[32] Y. Sung, H. Yu, and H. V. Poor, “Information, Energy and Density
for Ad-hoc Sensor Networks over Correlated Random Fields: Largedeviation
Analysis,” in IEEE ISIT, July 2008, pp. 1592–1596.
[33] N. Katenka, E. Levina, and G. Michailidis, “Local Vote Decision Fusion
for Target Detection in Wireless Sensor Networks,” in Joint Research
Conf. on Statistics in Quality Industry and Tech., Knoxville, USA, June
2006.
[34] L. Yu, L. Yuan, G. Qu, and A. Ephremides, “Energy-driven Detection
Scheme with Guaranteed Accuracy,” in Proc. IPSN, 2006, pp. 284–291.
[35] A. Anandkumar, A. Ephremides, A. Swami, and L. Tong, “Routing
for Statistical Inference in Sensor Networks,” in Handbook on Array
Processing and Sensor Networks, S. Haykin and R. Liu, Eds. John
Wiley & Sons, 2009, ch. 25.
[36] A. Anandkumar, L. Tong, and A. Swami, “Optimal Node Density for
Detection in Energy Constrained Random Networks,” IEEE Tran. Signal
Proc., vol. 56, no. 10, pp. 5232–5245, Oct. 2008.
[37] ——, “Detection of Gauss-Markov Random Fields with Nearestneighbor
Dependency,” IEEE Tran. Information Theory, vol. 55, no. 2,
Feb. 2009.[38] M. Penrose and J. Yukich, “Limit Theory For Random Sequential
Packing And Deposition,” Annals of Applied probability, vol. 12, no. 1,
pp. 272–301, 2002.
[39] M. Penrose, Random Geometric Graphs. Oxford University Press,
2003.
[40] P. Brémaud, Markov Chains: Gibbs fields, Monte Carlo simulation, and
queues. Springer, 1999.
[41] P. Clifford, “Markov Random Fields in Statistics,” Disorder in Physical
Systems, pp. 19–32, 1990.
[42] M. Jerrum and A. Sinclair, “Polynomial Time Approximations for the
Ising Model,” SIAM J. Computing, vol. 22, no. 5, pp. 1087–1116, 1993.
[43] E. Dynkin, “Necessary and Sufficient Statistics for a Family of Probability
Distributions,” Tran. Math, Stat. and Prob., vol. 1, pp. 23–41,
1961.
[44] X. Li, “Algorithmic, Geometric and Graphs Issues in Wireless Networks,”
Wireless Comm. and Mobile Computing, vol. 3, no. 2, March
2003.
[45] P. Wan and C. Yi, “On the Longest Edge of Gabriel Graphs in Wireless
Ad Hoc Networks,” IEEE Tran. on Parallel and Distributed Systems,
pp. 111–125, 2007.
[46] M. Penrose, “Laws Of Large Numbers In Stochastic Geometry With
Statistical Applications,” Bernoulli, vol. 13, no. 4, pp. 1124–1150, 2007.
[47] T. Schreiber and J. Yukich, “Stabilization and Limit Theorems for
Geometric Functionals of Gibbs Point Processes,” Arxiv preprint
arXiv:0802.0647, 2008.
[48] A. Anandkumar, J. Yukich, L. Tong, and A. Willsky, “Detection Error
Exponent for Spatially Dependent Samples in Random Networks,” in
submitted to Proc. of IEEE ISIT, Seoul, S. Korea, July 2009.
[49] H. Kunsch, “Gaussian Markov Random Fields,” J. Fac. Sci. Univ. of
Tokyo, no. 26, pp. 53–73, 1979.
[50] M. Penrose and J. Yukich, “Central Limit Theorems For Some Graphs
In Computational Geometry,” Annals of Applied Probability, vol. 11,
no. 4, pp. 1005–1041, 2001.
13