finding all dominator trees

  • Follow


Hi,

Is there an efficient algorithm for finding all dominator trees of a
graph? That is, for every node, find its dominator tree. I'm looking
for something better than simply running a dominator tree algorithm for
each node in the graph.

Amir

0
Reply amichail (76) 11/23/2006 2:18:11 AM

> Is there an efficient algorithm for finding all dominator trees of a
> graph? That is, for every node, find its dominator tree. I'm looking
> for something better than simply running a dominator tree algorithm
> for each node in the graph.

Unless I am missing some subtlety of your situation, you should just be
able to run the dominator tree algorithm for the root node of the
graph. Dominator trees have the property that each subtree for a node
corresponds to the dominator tree for that node.

See "Efficiently Computing Static Single Assignment Form and the
Control Dependence Graph" by Cytron et al.

0
Reply diablovision 11/27/2006 2:54:02 AM


On Thursday 23 Nov 2006 02:18, you wrote:
> Is there an efficient algorithm for finding all dominator trees of a
> graph? That is, for every node, find its dominator tree. I'm looking
> for something better than simply running a dominator tree algorithm for
> each node in the graph.

Once you have the domiator tree for the root, you have the dominator
tree for every node (take the subtree rooted at that node).

Pingali and Bilardi developed optimal algorithms for dominators, control
dependence and Static Single Assignment construction:

%A Keshav Pingali
%A Gianfranco Bilardi
%T Optimal Control Dependence Computation and the Roman Chariots Problem
%J |TOPLAS|
%S 19
%N 3
%P 462-491
%D |MAY|, 1997
%U http://www.cs.cornell.edu/Info/Projects/Bernoulli/papers/toplas97.ps

%A Gianfranco Bilardi
%A Keshav Pingali
%T The Static Single Assignment Form and its Computation
%R Cornell University Technical Report
%U http://www.cs.cornell.edu/Info/Projects/Bernoulli/papers/ssa.ps
%D |JUL|, 1999
%P 1-37
%K SSA

I have implemented these algorithms the perl modules Blocks.pm
and Control_Dep.pm in the FermaT Program Transformation System:

http://www.cse.dmu.ac.uk/~mward/fermat.html

--
			Martin

martin@gkc.org.uk http://www.cse.dmu.ac.uk/~mward/
Don't send email to: d3457f@gkc.org.uk

0
Reply Martin 11/27/2006 10:44:21 PM

diablovision@yahoo.com wrote:
> > Is there an efficient algorithm for finding all dominator trees of a
> > graph? That is, for every node, find its dominator tree. I'm looking
> > for something better than simply running a dominator tree algorithm
> > for each node in the graph.
>
> Unless I am missing some subtlety of your situation, you should just be
> able to run the dominator tree algorithm for the root node of the
> graph. Dominator trees have the property that each subtree for a node
> corresponds to the dominator tree for that node.
>

There is no root.  This is an arbitrary graph.

Amir

> See "Efficiently Computing Static Single Assignment Form and the
> Control Dependence Graph" by Cytron et al.

0
Reply Amir 11/27/2006 10:44:34 PM

Amir Michail wrote:
> diablovision@yahoo.com wrote:
> > > Is there an efficient algorithm for finding all dominator trees of a
> > > graph? That is, for every node, find its dominator tree. I'm looking
> > > for something better than simply running a dominator tree algorithm
> > > for each node in the graph.
> >
> > Unless I am missing some subtlety of your situation, you should just be
> > able to run the dominator tree algorithm for the root node of the
> > graph. Dominator trees have the property that each subtree for a node
> > corresponds to the dominator tree for that node.
> >
>
> There is no root.  This is an arbitrary graph.
>
> Amir
>
> > See "Efficiently Computing Static Single Assignment Form and the
> > Control Dependence Graph" by Cytron et al.


This is from Torczon's book: "In a CFG, node i <dominates> node j if
every path from the entry node to j passes through i". What this tells
us is that we need a root (entry) node.

Suppose that we have a graph G, directed or not, doesnt matter, and
lets say that i, j are in V(G) (G's vertex set), now 1) i dom j, or 2)
j dom i or 3) neither. How can we know that if we dont have a root
node from where to start?. In teo. all 3 are posible. If we take a
node "n" as a root (the node from which my paths will start) then it
could happen that i dom j. But if we take a node "m", "m \== n" it
could happen that j dom i. So either u have a root node already or u
choose one (it doesnt matter) or u try every posible node, O(V(G) *
O(Dom)).  Something like that. But if u do that, I feel that many
cases are going to repeat or it could be like a fixed point algorithm
for all the graph. Domination can be computed using fix point.

Alfredo Cruz

0
Reply s1l3ntr3p 11/28/2006 5:13:49 AM

s1l3ntr3p wrote:
> Amir Michail wrote:
> > diablovision@yahoo.com wrote:
> > > > Is there an efficient algorithm for finding all dominator trees of a
> > > > graph? That is, for every node, find its dominator tree. I'm looking
> > > > for something better than simply running a dominator tree algorithm
> > > > for each node in the graph.
> > >
> > > Unless I am missing some subtlety of your situation, you should just be
> > > able to run the dominator tree algorithm for the root node of the
> > > graph. Dominator trees have the property that each subtree for a node
> > > corresponds to the dominator tree for that node.
> > >
> >
> > There is no root.  This is an arbitrary graph.
> >
> > Amir
> >
> > > See "Efficiently Computing Static Single Assignment Form and the
> > > Control Dependence Graph" by Cytron et al.
>
>
> This is from Torczon's book: "In a CFG, node i <dominates> node j if
> every path from the entry node to j passes through i". What this tells
> us is that we need a root (entry) node.

The idea is to consider each node in turn as a root and build a
dominator tree with respect to that root. A naive way to do this is to
run a dominator tree algorithm for each such root separately.

But maybe there's a more efficient way (even though the relationship
between these dominator trees may not be obvious in general).

The application here has nothing to do with compilers: I would like to
compute a dominator tree for each person in a social network to build
a better social news service:

http://targetyournews.com/ajax/TargetYourNews.html#cmd%3DMore%26link_id%3D656584

Amir
0
Reply Amir 11/29/2006 5:51:06 AM

s1l3ntr3p wrote:
> Amir Michail wrote:
>> diablovision@yahoo.com wrote:
>>>> Is there an efficient algorithm for finding all dominator trees of a
>>>> graph? That is, for every node, find its dominator tree. I'm looking
>>>> for something better than simply running a dominator tree algorithm
>>>> for each node in the graph.
>>> Unless I am missing some subtlety of your situation, you should just be
>>> able to run the dominator tree algorithm for the root node of the
>>> graph. Dominator trees have the property that each subtree for a node
>>> corresponds to the dominator tree for that node.
>>>
>> There is no root.  This is an arbitrary graph.
>
> This is from Torczon's book: "In a CFG, node i <dominates> node j if
> every path from the entry node to j passes through i". What this tells
> us is that we need a root (entry) node.

Yes, but isn't what Amir is asking for the set of dominator trees
obtained when one chooses each node in turn as the entry node?  So any
given node will be the root for one of the trees, but he wants some
way to calculate all of them.

- Brooks

--
The "bmoses-nospam" address is valid; no unmunging needed.
0
Reply Brooks 11/29/2006 5:52:26 AM

Amir Michail wrote:
> s1l3ntr3p wrote:
> > Amir Michail wrote:
> > > diablovision@yahoo.com wrote:
> > > > > Is there an efficient algorithm for finding all dominator trees of a
> > > > > graph? That is, for every node, find its dominator tree. I'm looking
> > > > > for something better than simply running a dominator tree algorithm
> > > > > for each node in the graph.
> > > >
> > > > Unless I am missing some subtlety of your situation, you should just be
> > > > able to run the dominator tree algorithm for the root node of the
> > > > graph. Dominator trees have the property that each subtree for a node
> > > > corresponds to the dominator tree for that node.
> > > >
> > >
> > > There is no root.  This is an arbitrary graph.
> > >
> > > Amir
> > >
> > > > See "Efficiently Computing Static Single Assignment Form and the
> > > > Control Dependence Graph" by Cytron et al.
> >
> >
> > This is from Torczon's book: "In a CFG, node i <dominates> node j if
> > every path from the entry node to j passes through i". What this tells
> > us is that we need a root (entry) node.
>
> The idea is to consider each node in turn as a root and build a
> dominator tree with respect to that root. A naive way to do this is to
> run a dominator tree algorithm for each such root separately.
>
> But maybe there's a more efficient way (even though the relationship
> between these dominator trees may not be obvious in general).
>
> The application here has nothing to do with compilers: I would like to
> compute a dominator tree for each person in a social network to build
> a better social news service:
>
> http://targetyournews.com/ajax/TargetYourNews.html#cmd%3DMore%26link_id%3D656584
>
> Amir
Hi everyone:

Well with the risk of sounding repetitive and I become a pain, u know
where... Here I come...
I hope Im not too wrong...

We all have stated what a dominator is: i dominates j (i in Dom(j)) iff
every posible path P from the entry (root) node n0 to j we have that i
is in V(P) (Vertex set of P).

What does it tell us? Well, acordding to my coffee and pizza, in Graph
Theory it tells us that i is a cut vertex. A cut vertex is a vertex of
a graph such that when u remove it from the graph and all edges
incidents to it we desconects at least 2 vertices of the graph, i.e. we
have more components.

So, lets say we have a graph G and cut vertices i,j,k then the cut is:
(V, V-C) where C={i,j,k}. Now if we remove i from G, we know that we
have at least 2 components. Lets call them K1 and K2. Now is it true
that i dominates each node of K2 (K1) when the root node is in K1 (K2)?
I guess it is. Then the problem in our hands is reduced to know all the
cut vertices of a graph G, and if that problem has something is many
many solutions. The naive solution runs in time O(V(G)*E(G)), a more
smart solution, still "easy" was proposed by Robert E. Tarjan (whos
work in compilers is well know) using DFS or BFS around 1970's and it
runs in O(V(G) + E(G)). But there is a lot of work on that specialy if
u know certain caracteristics of ur graph (I have read the link u
posted Amir... and Im not quite sure that the graph would follow a
predeterminate "shape"). For example, Hon-Chan Chen proposed a linear
algorithm to discover cut vertices and bridges (the same as cut
vertices but with edges) in trapeziod graphs. Etc. U can also check
parallel algoriths for that propouse, for example, u could check
Roosta's book: Seyed H. Roosta, "Parallel Processing and Parallel
Algorithms, Theory and Computation", Springer-Verlag, 2000.

As a note: If we have a strongly connected graph G then we do not have
any dominator but the nodes themselfs that it seems they wouldnt work
for you.

As I said, I dont know if Im right or Im wrong...

Alfredo Cruz
0
Reply s1l3ntr3p 11/29/2006 4:08:06 PM

Amir Michail wrote:

> But maybe there's a more efficient way (even though the relationship
> between these dominator trees may not be obvious in general).

I'm not sure what you mean with dominator "trees". Isn't it sufficient
to find the immediate dominator for each each node, in order to create
any other dominator structure subsequently? AFAIR all the basic
algorithms are O(n).

A root node is not normally required by graph algorithms, the only
requirement IMO is an ordered list of all nodes. Within that list more
orders can be implemented, e.g. by (reverse) DFS numbering, to simplify
procedures like finding dominators.

The results of properly implemented algorithms are independent from any
order of the nodes. For completeness: A graph can have zero or more
"root" nodes, which are not reachable from any other nodes. There can
exist singletons and isolated subgraphs, too, which must be treated as
separate graphs.

DoDi
0
Reply Hans 11/29/2006 4:51:18 PM

>> > > diablovision@yahoo.com wrote:
>> > > > > Is there an efficient algorithm for finding all dominator trees of a
>> > > > > graph? That is, for every node, find its dominator tree. I'm looking
>> > > > > for something better than simply running a dominator tree algorithm
>> > > > > for each node in the graph.

"s1l3ntr3p" <s1l3ntR3p@gmail.com> writes:
> Hi everyone:
>
> Well with the risk of sounding repetitive and I become a pain, u know
> where... Here I come...
> I hope Im not too wrong...
>
> We all have stated what a dominator is: i dominates j (i in Dom(j)) iff
> every posible path P from the entry (root) node n0 to j we have that i
> is in V(P) (Vertex set of P).
>
> What does it tell us? Well, acordding to my coffee and pizza, in Graph
> Theory it tells us that i is a cut vertex. A cut vertex is a vertex of
> a graph such that when u remove it from the graph and all edges
> incidents to it we desconects at least 2 vertices of the graph, i.e. we
> have more components.
>
....
> As a note: If we have a strongly connected graph G then we do not have
> any dominator but the nodes themselfs that it seems they wouldnt work
> for you.
>
> As I said, I dont know if Im right or Im wrong...
>
> Alfredo Cruz

Yes, there are more efficient algorithms for computing the sets of
dominators in a graph considering each vertex in the graph as a root.
Both reachable vertexes and strongly-connected-components (cycles)* are
part of the algorithm.

First, if there is one unique vertex that can reach all other vertexes
(verticies if you prefer) in the graph, consider that vertex the root.
The dominator algorithm given that root will calculate the correct
dominators (the dominator tree) for every vertex (v1) in the graph
assuming that each other vertex (v2) is the root.  That is, for any
target vertex v1 and root vertex v2, some vertex in the dominator tree
of v1 will be the dominator of v1 given the root v2.

Note, in the above case, the root had no incoming edges (and is the
only vertex with no incoming edges).  So, the next generallization is
to deal with multiple vertexes with no-incoming edges.  If there is
more than one vertex in the graph with no-incoming edges, create a
fake-root vertex and draw an edge from that fake-root to each vertex
that has no-incoming edges.  Now, run the dominator algorithm from
that root, and the algorithm will work correctly again.

We, are left now, with only the case where all of the vertexes of the
graph have incoming edges.  In that case, all the vertexes in the
graph are elements of strongly-connected-components (cycles).  Assume
without-loss-of generality that there is one cycle which contains all
the vertexes of the graph.  We can assume that because, the second
part of this explanation showed us how to deal with disjoint graphs by
making a fake-root to join them.  The only question is how to pick a
root element for a cycle.  If you work through the dominator
algorithm, you will see that picking any element that is part of the
"outermost" cycle (and not part of any inner cycles), will achieve the
same results.

That should give you enough hints that you can work through the
possibilities and fill out the algortihm (and prove that it works once
you have it filled out), so I'll leave the rest to you.  The basic
point is the you can either select a root from the given graph (or
construct a fake-root if no appropriate element of the given graph
exists) and from that [fake-]root, run the normal dominator algorithm
once and get the data you need.  You don't need to run the dominator
algorithm n-times considering each distinct vertex a root.

Hope this helps,
-Chris

*****************************************************************************
Chris Clark                    Internet   :  compres@world.std.com
Compiler Resources, Inc.       Web Site   :  http://world.std.com/~compres
23 Bailey Rd                   voice      :  (508) 435-5016
Berlin, MA  01503  USA         fax        :  (978) 838-0263  (24 hours)
------------------------------------------------------------------------------

* from above, If I recall correctly, a strongly-connected-component is
  a set of vertexes such that each element in the set is reachable
  from any other element in the set.  If that is not true (because an
  scc is something else), then that set is a cycle (it is a cycle even
  if it is also known as an scc).  I think the terms scc and cycle are
  synonymous, but I haven't used the term scc in so long, I may be
  incorrect on its definition (and in which case, I just mean cycle).
0
Reply Chris 11/30/2006 7:06:36 AM

Chris F Clark wrote:
> ...
>
> Yes, there are more efficient algorithms for computing the sets of
> dominators in a graph considering each vertex in the graph as a root.
> Both reachable vertexes and strongly-connected-components (cycles)* are
> part of the algorithm.
>
> First, if there is one unique vertex that can reach all other vertexes
> (verticies if you prefer) in the graph, consider that vertex the root.
> The dominator algorithm given that root will calculate the correct
> dominators (the dominator tree) for every vertex (v1) in the graph
> assuming that each other vertex (v2) is the root.  That is, for any
> target vertex v1 and root vertex v2, some vertex in the dominator tree
> of v1 will be the dominator of v1 given the root v2.
>

I don't understand this solution. How would it work in this example?

A -> B <-> C

The dominator tree for A:

A
|
B
|
C

The dominator tree for B:

B
|
C

The dominator tree for C:

C
|
B

Amir
0
Reply Amir 12/4/2006 2:32:18 AM

Amir Michail wrote:
> Chris F Clark wrote:
> > ...
> >
> > Yes, there are more efficient algorithms for computing the sets of
> > dominators in a graph considering each vertex in the graph as a root.
> > Both reachable vertexes and strongly-connected-components (cycles)* are
> > part of the algorithm.
> >
> > First, if there is one unique vertex that can reach all other vertexes
> > (verticies if you prefer) in the graph, consider that vertex the root.
> > The dominator algorithm given that root will calculate the correct
> > dominators (the dominator tree) for every vertex (v1) in the graph
> > assuming that each other vertex (v2) is the root.  That is, for any
> > target vertex v1 and root vertex v2, some vertex in the dominator tree
> > of v1 will be the dominator of v1 given the root v2.
> >
>
> I don't understand this solution. How would it work in this example?
>
> A -> B <-> C
>
> The dominator tree for A:
>
> A
> |
> B
> |
> C
>
> The dominator tree for B:
>
> B
> |
> C
>
> The dominator tree for C:
>
> C
> |
> B

The problem seems to be the definition of dominance. Dominance in
compiler land is defined with respect to some (already chosen) root
node, because the definition states that a node X dominates Y iff every
path from the root node R to Y goes through node X.

In your example, the dominator tree for C is not correct if we chose A
to be the root node. C cannot dominate B because there exists a path
from A to B that does not go through C (the direct path).

I'm not sure what property you are going after in the general graph
problem, but the standard definitions of dominance (in compiler land)
are going to assume the existence of a unique root node.

Perhaps if you collapse cycles (do a SCC decomposition) of your graph
first, and consider cyclic structures to be a single node, the
resulting DAG can be sorted topologically, and the top node can be
considered the root?

Because, in truth, the dominance relation doesn't make any sense unless
you have either or both of the following conditions:

1. a unique entry node.
2. an acyclic graph.

Hope this helps.

0
Reply diablovision 12/4/2006 1:28:54 PM

diablovision@yahoo.com wrote:
> Amir Michail wrote:
> > ...
> > I don't understand this solution. How would it work in this example?
> >
> > A -> B <-> C
> >
> > The dominator tree for A:
> >
> > A
> > |
> > B
> > |
> > C
> >
> > The dominator tree for B:
> >
> > B
> > |
> > C
> >
> > The dominator tree for C:
> >
> > C
> > |
> > B
>
> The problem seems to be the definition of dominance. Dominance in
> compiler land is defined with respect to some (already chosen) root
> node, because the definition states that a node X dominates Y iff every
> path from the root node R to Y goes through node X.
>
> In your example, the dominator tree for C is not correct if we chose A
> to be the root node. C cannot dominate B because there exists a path
> from A to B that does not go through C (the direct path).
>

But we don't choose A to be the root. In my problem, the dominator tree
for a node X is determined by taking X as the root. So a simple way to
get what I want is to run a dominator tree algorithm with respect to
each node in the graph taken as root, but I was hoping there would be a
more efficient way to do this. And as part of this more efficient
algorithm, maybe we could have an efficient way of storing all of these
dominator trees, as they may share some common parts.

> I'm not sure what property you are going after in the general graph
> problem, but the standard definitions of dominance (in compiler land)
> are going to assume the existence of a unique root node.
>
> Perhaps if you collapse cycles (do a SCC decomposition) of your graph
> first, and consider cyclic structures to be a single node, the
> resulting DAG can be sorted topologically, and the top node can be
> considered the root?

Why would this help in my problem?

>
> Because, in truth, the dominance relation doesn't make any sense unless
> you have either or both of the following conditions:
>
> 1. a unique entry node.
> 2. an acyclic graph.
>

My problem is well defined as discussed above, although it's a bit
different from what people normally do in compilers.

Amir

> Hope this helps.
0
Reply Amir 12/5/2006 11:19:39 AM

> Chris F Clark wrote:
>> An incorrect algorithm...

"Amir Michail" <amichail@gmail.com> writes:
> I don't understand this solution. How would it work in this example?
>
> A -> B <-> C
>
> The dominator tree for A:
>
> A
> |
> B
> |
> C
>
....
> The dominator tree for C:
>
> C
> |
> B

The issue is for cases where the desired root is a member of a cycle.

You can construct the dominator tree for C from the one for A. One
simply slits the tree at C, in this case pulls C from the bottom.
Next, one removes from the tree all vertexes unreachable from C
(that's the node A in this case) and adds C to the top along with any
descendants of C in the original tree. I was thinking this
modification would always work.  Thus, is you had the dominator tree
for A, you could calculate the dominator tree for every node reachable
from A.

However, there are cases where this won't work.

First, there is the problem with cycles.  If you pick a root that is a
member of a cycle (and that node doesn't dominate all nodes in the
cycle given the root originally used), it will have different
dominator for some elements of the cycle.  Your A->B->C->B graph shows
the basics of that case, a graph like A->B->C->D->E, D->B gives an
even better example.  Essentially loop back edges, resulting in an
extra path of domination for roots that are within the cycle.  I think
one could supplement the normal dominator graph with a ring to deal
with the elements in the cycle, or in the example A-B-C-D-E as the
normal dominator tree, with C-D-B-C as the dominators within the
cycle.  You use the cycle for the case where the desired root is
within a cycle to help you disconnet the original dominator tree and
construct a new dominator tree.

Next there are cases involve fork/join trees (DAGs), such as:
A->B->D and A->C->D

In this case, A immediately dominates D.  However, if you pick either
B or C as a root, they also dominate D when you have excluded the
other path (i.e. B dominates D from any root that can't reach C and
vice-versa).  Note, that if you the fork a the DAG (A in this case) is
reachable from the root you choose, then that root dominates the join
node of the DAG (D in this case).  However, any node along the path
from the fork of a DAG to the join of the DAG will have some
intermediate dominator, B and C are examples of this.  You could model
this by making multiple dominator trees, one the includes the fork of
the jag, and one for each path through the jag.  In the example, given
this would be A-D for the fork, and B-D and C-D, for the two paths.

Note, that both of these problems are local to a specific type of
sub-graph.  That is if you root node and target node are not part of
either a DAG or a cycle, then the information from any dominator tree
where the root node reached both vertexes will still contain the right
dominators.  It seems that dealing with these two issues would allow
one to modify the dominator algorithm to construct the extra trees
efficiently, especially if one allows sharing.  Perhaps if you work
through the two problem cases, you can fill such ans algorithm out.

Of course, don't forget that the two issues can be mixed, as in:
A->B->C->D->F->G->H->J, A->E->F, G->C, J->E, G->I->J which is 2 DAGs
(A->B->C->D->F, A->E->F and G->H-J, G->I->J) with 2 cycles
(C->D->F->G->C and E->F->G->H/I->J->E) interlinked.

Note, that it may help to replace these sub-problems when you
encounter them with nodes that summarize the subproblem.  So, if you
find a DAG in the graph, you do the computation on the DAG, and
replace the DAG with a single node that summarizes the DAG, similarly
for cycles.  However, such solutions often work best on reducible
graphs and the hardest cases are the irreducible graphs.  But even if
you simply dealth with irreducible sub-graphs as chunks and summarized
on them, it still might be effective.

My apologies for providing a wrong first answer.  I had thought about
the cycle case (and the DAG case also), but I made a mistake in
thinking that constructing the relevant dominator tree would be
trivial given the original dominator tree.  After reading your
question and giving the problem more thought, I realized that I was
wrong about that.  I hope this revision is sufficient to patch the
problems in the original algorithm.  I hope I haven't made the problem
worse.

Sorry,
-Chris

*****************************************************************************
Chris Clark                    Internet   :  compres@world.std.com
Compiler Resources, Inc.       Web Site   :  http://world.std.com/~compres
23 Bailey Rd                   voice      :  (508) 435-5016
Berlin, MA  01503  USA         fax        :  (978) 838-0263  (24 hours)
0
Reply Chris 12/5/2006 11:20:38 AM

On Monday 04 Dec 2006 13:28, diablovision@yahoo.com wrote:
> Because, in truth, the dominance relation doesn't make any sense unless
> you have either or both of the following conditions:
>
> 1. a unique entry node.
> 2. an acyclic graph.

Several people, including myself, assumed that when Amir
referred to "dominators in a graph", he meant a control
flow graph (with a unique entry node from which the meaning
of dominance can be defined). He actually wants to compute
the set of dominator trees for the set of control flow graphs
which can be constructed from a given graph by taking each node
in turn to be the entry node (and, presumably, ignoring nodes
which are not reachable from that node). Each of these CFGs
has the same nodes and edges, but a different entry node,
and therefore a different dominance relation and dominator tree.

--
			Martin

martin@gkc.org.uk http://www.cse.dmu.ac.uk/~mward/ Erdos number: 4
Don't email: d3457f@gkc.org.uk

0
Reply Martin 12/5/2006 11:20:50 AM

On 5 Dec 2006 06:19:39 -0500, Amir Michail wrote:

> My problem is well defined as discussed above, although it's a bit
> different from what people normally do in compilers.

Well, actually if the language being compiled allowed ad-hoc supertypes,
then the resulting inheritance tree might look exactly as you described. B
and C were subtypes of each other.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

0
Reply Dmitry 12/6/2006 1:58:06 PM

First, my apologies to the group for so many posts on this topic,
which is actually only marginally connected to compilers, despite
being about dominators.  If people ask, I will take this to private
email.  However, given that graph problems are generally useful in
compilers, I will follow this a bit further.

Reading the web page, I realize that part of the reason for computing
dominators in your case is to find the sub-cycles (and that's often
true, often one cdomputes dominators to find cycles).  Thus, an
algorithm that assumes you know the cycles is not very useful.

However, let's try to distnguish what is different about your case
from typical compiler users of the dominator algorithm. In the typical
compiler case, one often (perhaps even generally) has a basic linear
flow, that is there is some root "entrypoint" (or small set of roots)
and some small set of exits and the flow is from the entries to the
exits.  Most flow is not bidirectional.

I think in a social networking case, one has a lot more bidirectional
flow. If I am your friend, you are my friend (and vice-versa).  There
are exceptions, but it is common enough an occurance to alter the
basic model.  This is particularly true of a social network that grows
from a group.  People in the group become listed in the social network
to exploit their contacts.  Thus, most people in the group are
connected, well at least for the intial (founding) group.  Thus, the
whole graph is likely to be one big cycle (strongly-connected-
component), in that if there is a path from me to you, there is likely
a path from you back to me and most likely that path can be found by
simply reversing the edges that got from me to you in the first place.

Now, in the cases like facebook there will be certain (many)
exceptions to the rule, which I am an example of.  I have a facebook
entry but I am connected to no one and I don't think anyone is
connected to me, because I did not join as part of a group of friends.
Moreover, if links get established, they may be more likely one-way
links, at least at first.  So, in some (many) social networking
databases, there will be outlier vertexes, which are either disjoint
from the network, or sources (point into the network, but are not
pointed to) or sinks (the reverse).  Note, because you are trying to
filter out "false" accounts, such vertexes may be particularly
interesting to identify.  Any disjoint, source, or sink vertex is
likely to be a false account.

However, for the most part of the network, we can exclude such
isolated vertexes.  In fact, I suspect that the body of the network
consists of one or more large cycles, and that the bulk of the network
is in fact one large cycle.  Indeed, I'm pretty certain, I've ready a
Scientific American article (on how computer viruses spread) where the
author "proved" just such a point for nodes on the internet and their
connections.  So, for simplicity, let's assume that any interesting
vertexs is part of this one large cycle.  If the assumption is wrong,
one can repeat the algorithm starting at some vertex that was omitted
from the set already covered, and continue until all vertexes have
been covered.

At the same time, I think the bi-directional nature of most links will
also spoil use of a traditional dominator algorithm.  Almost every
edge has a corresponding back-edge.  I suspect, you are more likely to
see 3 element cycles of the form a<->b<->c, than a->b->c->a.  That
does not mean their are not cutpoints or articulation points in the
graph, vertexes, which if removed the graph would be come disjoint.
It just means that I think the dominator algorithm isn't the best way
to find them, as there may be alternate paths from sub-clusters to
each other, but a small set of vertexes which if removed would cause
the graph to fall apart.  In other words, just becasue a vertex isn't
a dominator to a sub-cluster of the graph, doesn't mean it isn't one
of the key vertexes which if removed would cause the graph to fall
apart.  Again, I think such vertexes are called articulation points
and their are algorithms for finideng them, ande they would be more
useful starting points.

Note, if you still want to find dominators, I think that if you start
with any node in your graph, and find the dominators from that point,
while at the same time finding cycles, you will collect the info you
need (in roughly one pass over the graph).

Speculation: because of the special properties of social networking
graphs, I might be tempted to break (remove) back-edges caused by
bi-directional flows (e.g. a<->b) as I traverse an edge in either
direction.  I can't tell you exactly how that will effect the
algorithm, but intuitively, it feels right.  In fact, if easy to do, I
would process bi-directional edges before uni-directional ones.  I
would even be willing to make a special pass to identify
bi-directional edges (and mark them) before running the normal
algorithm.

Hope this helps,
-Chris

*****************************************************************************
Chris Clark                    Internet   :  compres@world.std.com
Compiler Resources, Inc.       Web Site   :  http://world.std.com/~compres
23 Bailey Rd                   voice      :  (508) 435-5016
Berlin, MA  01503  USA         fax        :  (978) 838-0263  (24 hours)
0
Reply Chris 12/6/2006 1:58:33 PM

16 Replies
101 Views

(page loaded in 0.088 seconds)

9/9/2012 3:53:18 AM


Reply: