[Problem Solving #3] Polygon Phobia

wolfsinner · July 13, 2014

Problem #3 - Polygon Phobia

So here's Problem #3 of the series. I was really busy this week, so I didn't have time to come up with a problem. I used a problem from the Portuguese Inter-University Programming Marathon. I hope you guys enjoy it.

This one is simple, especially if you use a particular algorithm/data structure. You don't need to know more than basic math for this problem, in spite of what its name and geometrical basis suggests, you don't need to do any kind of geometrical calculations or even use any mathematical formulae. Make sure you understand what is requested, as the task itself involves very little math.

I recommend you read the problem description through the gateway or the PDF, instead of here, since I couldn't do some of the pretty formatting with BBCode.

Problem Description:

(This problem was a part of the 2013 edition of the Portuguese Inter-University Programming Marathon. You can download the original PDF file here.)

Bob is a young artist, whose current works are paintings with coloured line segments (like the one depicted in the figure below). But Bob is phobic about polygons, he simply cannot draw any closed chain of line segments.

In his next painting workshop Bob has decided to try a new approach, "the luck of the draw". He will start by carefully choosing the possible line segments and by writing the two different endpoints of each segment on a small piece of paper, just as if the painting canvas was a Cartesian coordinate system.

Then, all those pieces of paper will be put into a large jar, which will be very well shaken. At this point, the painting process will begin, using his "luck of the draw" approach. Whilst the jar is not empty, Bob will pick a piece of paper out of it, one at a time, and will then paint the corresponding line segment unless it closes an existing chain (because of his phobia).

Papers taken out of the jar will be put directly into the recycling bin.

A segment (P₁, P_n) closes an existing chain if the painting already has segments of the form:

(P₁, P₂), (P₂, P₃), (P₃, P₄), ..., (P_n-1, P_n)

where (p, q) denotes the line segment whose endpoints are p and q. Notice that (p, q) = (q, p). Notice also that chains are defined by segment endpoints and not by line intersections. For instance:

line segment ((1, 1), (5, 1)) closes the chain ((1, 1), (5, 5)), ((5, 5), (5, 1)), but
line segment ((1, 1), (6, 1)) does not close the chain ((1, 1), (5, 5)), ((5, 5), (5, 1)).

The figure shows the line segments Bob would paint if the pieces of paper picked from the jar contained the line segments presented in the table, in the specified order.

Task

Given a sequence of distinct line segments, each one defined by two different endpoints, the goal is to find out how many line segments Bob would actually paint.

Input:

(You can download the zip file containing all of the input files here.)

The first line of the input contains one integer, S, which is the number of (distinct) line segments. Each of the following S lines contains four integers, x₁, y₁, x₂, and y₂, which indicate that (x₁, y₁) and (x₂, y₂) are the endpoints of a line segment, defined in Cartesian coordinates. Integers in the same line are separated by a single space.

Input example 1:

10

5 40 195 40

5 40 125 100

70 90 70 10

150 10 70 10

150 10 70 90

125 100 195 40

170 80 150 40

150 40 150 10

70 90 170 80

70 90 150 40

Input example 2:

3

5 7 15 7

5 7 20 7

10 7 20 7

(All of the input data is too large to place here, so please use the zip file.)

Output:

The output has a single line with the number of line segments that Bob would paint.

Output example 1:

6

Output example 2:

3

How to test your solution:

To submit your output solutions you need to run your code with each of the input files provided in the zip file, and place your output in the appropriate text box.

So your program's output with input from file "input1.txt" should be placed in the "Output 1" box.

I recommend downloading the zip file, and using command-line input redirection to send the file contents into the standard input stream of your program. In Windows, Linux or Mac OS, you would do something like:

[command to run your program] < inputN.txt

You can submit your output solutions through the problem solving gateway here.

Final notes:

Create your account at the gateway with the same username as the one you use at the forums.
Include your code in spoiler tags, so that the thread isn't cluttered with code.
If you're having trouble solving the problem, ask the community for help! You may learn a lot from them.
Don't worry too much about efficiency if you're starting out. Solve the problem first, we'll then tell you how to improve it.
Please don't share the output solution if you've solved the problem.
If the problem description isn't clear, please ask for further explanation!

Askew · July 13, 2014

I feel inadequate after clicking this topic.

fizzlesticks · July 13, 2014

Soo I did it using graphs, I'm sure there is a better way.

But my code is embarassingly slow so I'll hold off on posting it for now.

Nuluvius · July 13, 2014

Disjoint-set data structure perhaps.

Nuluvius · July 13, 2014

Done

I'll post code shortly; I want to make it pretty.

Ciccioo · July 13, 2014

./polygonPhobia < input6.txt

*sits back and wait*

here's my solving code, i tried with C++ this time, i'm currently studying it so it may or may not be any good

the algorithm itself is a depth-first search this time, i'll try to make it bredth-first, hopefully it'll become much faster (edit: apparently not)

edit again: turns out, BFS is almost twice as slow as DFS in all the 3 'big' test cases provided, but actually i guess it's just about luck and how the test cases are structured, the two algorithms should be equivalent in average

there must be a better solution though

#include <iostream>#include <vector>#include <map>class point{	private:		int x, y;	public:		point(int X, int Y): x{X}, y{Y} {}		point(const point& p): x{p.x}, y{p.y} {}		bool operator == (const point& p) const{			return p.x == x && p.y == y;		}		bool operator < (const point& p) const{			return x < p.x || (x == p.x && y < p.y);		}};class node{	private:		point p;		unsigned int visitMark;		std::vector<node*> neighbors;	public:		node(const point& P): p(P), visitMark{0}, neighbors(0) {}		void addNeighbor(node* n){			neighbors.push_back(n);		}		bool canReach(const point& P, unsigned int visited){			if(p == P) return true;			visitMark = visited;			for(auto& n : neighbors)				if(n -> visitMark != visited)					if(n -> canReach(P, visited))						return true;			return false;		}};node* getOrAddNode(std::map<point, node*>& m, const point& P){	auto it = m.find(P);	if(it == m.end())		it = m.insert({P, new node(P)}).first;	return it -> second;}int main(){	size_t S;	std::cin >> S;	std::map<point, node*> points;	unsigned int visitMark = 1, addedNodes = 0;	for(size_t i = 0; i < S; i++){		int x1, y1, x2, y2;		std::cin >> x1 >> y1 >> x2 >> y2;		point A{x1, y1}, B{x2, y2};		node* nA = getOrAddNode(points, A);		node* nB = getOrAddNode(points, B);		if(!nA -> canReach(B, visitMark)){			addedNodes++;			nA -> addNeighbor(nB);			nB -> addNeighbor(nA);		}		visitMark++;	}	for(auto& P : points)		delete P.second;	std::cout << addedNodes;}

wolfsinner · July 13, 2014

Disjoint-set data structure perhaps.

That's it.

The optimal solution is to note the following: if you represent the drawing as a graph, closing a chain means introducing a cycle within the graph.

A fast way to detect whether a graph has a cycle is by managing its structure with a disjoint-set using a union-find algorithm. If adding an edge to the graph creates a cycle, then you don't add that edge, and as such Bob won't draw that line segment.

A good union-find has inverse-ackermann complexity, which is very good (it's amortized O(1)).

Nuluvius · July 13, 2014

So hard to do refactoring with screaming children running around <_<

I'm being naughty again by not following TDD. As before I hit all files at once, also note how I throw away the redundant data (line 12)

MainViewModel

UnionFindAlgorithm

Result (just a POD)

        private void ProcessFiles()        {            IsBusy = true;            _cs = new CancellationTokenSource();            UpdateCommands();            Task.Run(                () => Task.WaitAll(Directory.GetFiles(InputDirectory, "*.txt").Select(f => Task.Run(                    () =>                        {                            var rawText = File.ReadAllText(f).Trim().Split(new[] { Environment.NewLine }, StringSplitOptions.None).ToList();                            rawText.RemoveAt(0);                            var result = new Result { Index = f };                            var index = 0;                            var pointMapping = new Dictionary<Point, int>();                            var unionFindAlgorithm = new UnionFindAlgorithm(rawText.Count * 2);                            foreach (var line in rawText)                            {                                _cs.Token.ThrowIfCancellationRequested();                                var coordinates = line.Split(' ');                                var p1 = new Point(int.Parse(coordinates[0]), int.Parse(coordinates[1]));                                var p2 = new Point(int.Parse(coordinates[2]), int.Parse(coordinates[3]));                                if (!pointMapping.ContainsKey(p1))                                {                                    pointMapping.Add(p1, index++);                                }                                if (!pointMapping.ContainsKey(p2))                                {                                    pointMapping.Add(p2, index++);                                }                                var idx1 = pointMapping[p1];                                var idx2 = pointMapping[p2];                                if (!unionFindAlgorithm.IsConnectedSet(idx1, idx2))                                {                                    result.Total++;                                }                                unionFindAlgorithm.UnionSet(idx1, idx2);                            }                            AddResult(result);                        }, _cs.Token)).ToArray()), _cs.Token).ContinueWith(                        _ =>                        {                            IsBusy = false;                            UpdateCommands();                        });        }

    class UnionFindAlgorithm    {        private readonly int[] _p;        private readonly int[] _rank;        public UnionFindAlgorithm(int limit)        {            _p = new int[limit];            _rank = new int[limit];            for (var i = 0; i < limit; i++)            {                _p[i] = i;            }        }        public bool IsConnectedSet(int i, int j)        {            return FindSet(i) == FindSet(j);        }        public void UnionSet(int i, int j)        {            if (IsConnectedSet(i, j))            {                return;            }            var p = new Point(FindSet(i), FindSet(j));            if (_rank[p.X] > _rank[p.Y])            {                _p[p.Y] = p.X;            }            else            {                _p[p.X] = p.Y;                if (_rank[p.X] == _rank[p.Y])                {                    _rank[p.Y]++;                }            }        }        private int FindSet(int i)        {            return (_p[i] == i) ? i : (_p[i] = FindSet(_p[i]));        }    }

    class Result    {        public string Index { get; set; }        public int Total { get; set; }    }

fizzlesticks · July 13, 2014

Still slowish but much better than my original code.

def getSet(point,sets):    for i in range(0,len(sets)):        if point in sets[i]:            return i    return -1def main():    numLines = int(input())    sets = []    numSets = 0    drawnLines = 0        for i in range(0,numLines):        l = [int(x) for x in input().split()]        p1 = (l[0],l[1])        p2 = (l[2],l[3])        s1 = getSet(p1,sets)        if s1 == -1:            sets.append(set([p1]))            s1 = numSets            numSets += 1        s2 = getSet(p2,sets)        if s2 == -1:            sets.append(set([p2]))            s2 = numSets            numSets += 1                    if s1 != s2:            drawnLines += 1            sets[s1] |= sets[s2]            del(sets[s2])            numSets -= 1    print(drawnLines)if __name__ == "__main__":    from timeit import Timer    t = Timer(lambda: main())    print(t.timeit(number=1))

Ciccioo · July 13, 2014

Disjoint-set data structure perhaps.

A fast way to detect whether a graph has a cycle is by managing its structure with a disjoint-set using a union-find algorithm.

that's some pretty cool stuff, i would have never come up with that by myself

i did notice that i just needed to detect cycles, but i could just think of the naive way

in case anyone wants to read about the union-find algorithm, i'll leave a link to the small presentation i found, and the code that came out of that reading (not much more than a mere copypaste)

#include <iostream>#include <vector>#include <map>class unionFind{    private:        std::vector<size_t> parent;        size_t root(size_t A){            size_t *tmp, root = parent[A];            while(root != parent[root])                root = parent[root];            while(parent[A] != root){                    tmp = &parent[A];                    parent[A] = root;                    A = *tmp;            }            return root;        }    public:        unionFind(size_t n): parent(n){            for(size_t i = 0; i < n; i++)                parent[i] = i;        }        bool canReach(size_t A, size_t B){            return root(A) == root(B);        }        void unite(size_t A, size_t B){            parent[root(B)] = root(A);        }};class point{    private:        int x, y;    public:        point(int X, int Y): x{X}, y{Y} {}        point(const point& p): x{p.x}, y{p.y} {}        bool operator == (const point& p) const{            return p.x == x && p.y == y;        }        bool operator < (const point& p) const{            return x < p.x || (x == p.x && y < p.y);        }};size_t getOrAddPoint(std::map<point, size_t>& m, const point& P, size_t& id){    auto it = m.find(P);    if(it == m.end())        it = m.insert({P, id++}).first;    return it -> second;}int main(){    std::map<point, size_t> map;    size_t S;    std::cin >> S;    unionFind UF(2 * S);    size_t addedNodes = 0, uniqueID = 0;    for(size_t i = 0; i < S; i++){        int x1, y1, x2, y2;        std::cin >> x1 >> y1 >> x2 >> y2;        point A{x1, y1}, B{x2, y2};        size_t idA = getOrAddPoint(map, A, uniqueID);        size_t idB = getOrAddPoint(map, B, uniqueID);        if(!UF.canReach(idA, idB)){            addedNodes++;            UF.unite(idA, idB);        }    }    std::cout << addedNodes << std::endl;}

Maximation · July 15, 2014

in case anyone wants to read about the union-find algorithm, i'll leave a link to the small presentation i found, and the code that came out of that reading (not much more than a mere copypaste)

Very good, I will read through the presentation. Thank you.

Maximation · July 15, 2014

Will any of the segments in the input files be exactly the same as another line in the file?

wolfsinner · July 15, 2014

Will any of the segments in the input files be exactly the same as another line in the file?

If I understand your question correctly, all of the line segments within each file are distinct.

Maximation · July 15, 2014

If I understand your question correctly, all of the line segments within each file are distinct.

Thank you. You understood it correctly.

Maximation · July 17, 2014

Mh, I tried this :

But it returns 0 as linesDrawn.

Why does it do this?

Here is the full code :

for (int j = 0; j < connectingSegments.size(); j++) {					for (int k = 0; k < connectingSegments.get(j).size(); k++) {						if ((segments[j].getX1() == connectingSegments.get(j).get(k).getX1() && segments[j].getY1() == connectingSegments.get(j).get(k).getY1()) && (segments[j].getX2() == connectingSegments.get(j).get(k).getX2() && segments[j].getY2() == connectingSegments.get(j).get(k).getY2())) {							// if both of the endpoints of a segment are already in a linked list  then Bob will not paint it, so we do nothing						}						else if ((segments[j].getX1() == connectingSegments.get(j).get(k).getX1() && segments[j].getY1() == connectingSegments.get(j).get(k).getY1()) && (segments[j].getX2() != connectingSegments.get(j).get(k).getX2() && segments[j].getY2() != connectingSegments.get(j).get(k).getY2())) {							connectingSegments.get(j).add(segments[j]);							linesDrawn++;						}						else if ((segments[j].getX1() != connectingSegments.get(j).get(k).getX1() && segments[j].getY1() != connectingSegments.get(j).get(k).getY1()) && (segments[j].getX2() == connectingSegments.get(j).get(k).getX2() && segments[j].getY2() == connectingSegments.get(j).get(k).getY2())) {							connectingSegments.get(j).add(segments[j]);							linesDrawn++;						}						else {							LinkedList<Segment> linkedListToAdd = new LinkedList<Segment>();							linkedListToAdd.add(segments[j]);							connectingSegments.add(linkedListToAdd);							linesDrawn++;						}					}				}

/* * Author : maeb * 2014 */import java.util.Scanner;import java.util.LinkedList;public class Main {	public static void main(String[] args) {		Scanner in = new Scanner(System.in);		int linesDrawn = 0;		/* integer on the first line represents the number of segments it will exist out of */		int numSegments = in.nextInt();		Segment[] segments = new Segment[numSegments];		LinkedList<LinkedList<Segment>> connectingSegments = new LinkedList<LinkedList<Segment>>();		try {			/* makes a Segment object with its 4 coordinates and stores all the Segment objects in the segments array */			for (int i = 0; i < numSegments; i++) {				int x1 = 0, y1 = 0, x2 = 0, y2 = 0;				for (int j = 0; j < 4; j++) {					if (j == 0)						x1 = in.nextInt();					if (j == 1)						y1 = in.nextInt();					if (j == 2)						x2 = in.nextInt();					if (j == 3)						y2 = in.nextInt();				}				segments[i] = new Segment(x1, y1, x2, y2);			}		} catch (Exception e) {}		for (int i = 0; i < segments.length; i++) {			if (connectingSegments.size() == 0) {				LinkedList<Segment> linkedListToAdd = new LinkedList<Segment>();				linkedListToAdd.add(segments[i]);				connectingSegments.add(linkedListToAdd);			}			else {				for (int j = 0; j < connectingSegments.size(); j++) {					for (int k = 0; k < connectingSegments.get(j).size(); k++) {						if ((segments[j].getX1() == connectingSegments.get(j).get(k).getX1() && segments[j].getY1() == connectingSegments.get(j).get(k).getY1()) && (segments[j].getX2() == connectingSegments.get(j).get(k).getX2() && segments[j].getY2() == connectingSegments.get(j).get(k).getY2())) {							// if both of the endpoints of a segment are already in a linked list  then Bob will not paint it, so we do nothing						}						else if ((segments[j].getX1() == connectingSegments.get(j).get(k).getX1() && segments[j].getY1() == connectingSegments.get(j).get(k).getY1()) && (segments[j].getX2() != connectingSegments.get(j).get(k).getX2() && segments[j].getY2() != connectingSegments.get(j).get(k).getY2())) {							connectingSegments.get(j).add(segments[j]);							linesDrawn++;						}						else if ((segments[j].getX1() != connectingSegments.get(j).get(k).getX1() && segments[j].getY1() != connectingSegments.get(j).get(k).getY1()) && (segments[j].getX2() == connectingSegments.get(j).get(k).getX2() && segments[j].getY2() == connectingSegments.get(j).get(k).getY2())) {							connectingSegments.get(j).add(segments[j]);							linesDrawn++;						}						else {							LinkedList<Segment> linkedListToAdd = new LinkedList<Segment>();							linkedListToAdd.add(segments[j]);							connectingSegments.add(linkedListToAdd);							linesDrawn++;						}					}				}			}		}		System.out.println("Lines drawn : " + linesDrawn);	}}

/* *	Author: maeb *	2014 */public class Segment {	private int x1, y1, x2, y2;	public Segment (int x1, int y1, int x2, int y2) {		this.x1 = x1;		this.y1 = y1;		this.x2 = x2;		this.y2 = y2;	}	public int getX1() {		return x1;	}	public int getY1() {		return y1;	}	public int getX2() {		return x2;	}	public int getY2() {		return y2;	}}

wolfsinner · July 17, 2014

But it returns 0 as linesDrawn.

Why does it do this?

Notice that when the connectingSegments list is empty, you add a list with 1 element representing the first segment. Then you move on to the next segment, and if that segment's endpoints are not in that list you do nothing. You don't add that new segment anywhere, so it will never be compared to other segments. If there is not a segment with common endpoints to the first segment, you will never add anything besides the first line segment to these lists.

Actually, nevermind that. I don't know what I was thinking when I wrote that. I guess I was tired. Your problem is that you're always testing against the first segment, and you're never adding anything to any list (except for when the list is empty).

Notice that you're always comparing to segments[j]. It still wouldn't work, and if you altered it to work it still wouldn't scale efficiently.

This implementation might be an attempt at union-find though, so if it is, please tell me.

Notice that you should increment linesDrawn when you add the first segment as well.

I recommend you make an effort to understand the union-find algorithm though, because it really simplifies this whole process, and is very efficient.

Maximation · July 18, 2014

This implementation might be an attempt at union-find though, so if it is, please tell me.

I recommend you make an effort to understand the union-find algorithm though, because it really simplifies this whole process, and is very efficient.

Yes this was an attempt at union-find. I read up on disjoint-set structures and with that knowledge (although I still do not fully understand it) I implemented some sort of union-find algorithm. It looks not very good to me though, it seems way too complicated. Could you maybe explain more about the union-find / disjoint-set algorithm?

Thank you very much.

Ciccioo · July 18, 2014

Yes this was an attempt at union-find. I read up on disjoint-set structures and with that knowledge (although I still do not fully understand it) I implemented some sort of union-find algorithm. It looks not very good to me though, it seems way too complicated. Could you maybe explain more about the union-find / disjoint-set algorithm?

Thank you very much.

i'll leave here how i understood it

problem:

we have a graph, and we must avoid cycles (because a cycle, if you draw it on the paper, is a closed polygon, and bob is scared of those)

solution:

when you add an arc between two nodes A and B (when you draw a line on the segment AB), you first need to check if there is already a path that goes from A to B

If there is one, adding another [direct] path will close a cycle. so you skip it

implementation:

every time you add an arc/draw a line you need to check whether or not there is a path between the two end points. that's a lot of times, so you need a way to do that real quick

you start by noting that, if there is a path from A to B, and there is a path from B to C, then there is of course a path from A to C: you just go from A to B, then from B to C

continuing the previous example, if there is a path from a certain node D to A, then there is a path from D to B as well as from D to C, because they're all connected

if E can reach C, then E can reach A, B, C, D

see what's going on here? if a node can find a path to any of those nodes in that group, then there is a path to all of them. you're creating a set of nodes that can reach each other

these are the rules:

if a node X can reach any of the nodes in the set s, then X belongs to s too
if two nodes belong to the same set, then there is a path between them
if there are two sets s and t, a node A that belongs to s, a node B that belongs to t, and there is a path (an arc) between A and B, then you can unite s and t in a single set, because every node of s can reach every node of t passing through the arc AB

so, to solve this problem you need to keep track of all these "sets of reachability". the union-find algorithm uses a smart data structure to handle these sets

i think the presentation i linked does a good job at explaining how it works, but if you have questions about the actual implementation you can indeed ask

Maximation · July 18, 2014

i'll leave here how i understood it

problem:

we have a graph, and we must avoid cycles (because a cycle, if you draw it on the paper, is a closed polygon, and bob is scared of those)

solution:

when you add an arc between two nodes A and B (when you draw a line on the segment AB), you first need to check if there is already a path that goes from A to B

If there is one, adding another [direct] path will close a cycle. so you skip it

implementation:

every time you add an arc/draw a line you need to check whether or not there is a path between the two end points. that's a lot of times, so you need a way to do that real quick

you start by noting that, if there is a path from A to B, and there is a path from B to C, then there is of course a path from A to C: you just go from A to B, then from B to C

continuing the previous example, if there is a path from a certain node D to A, then there is a path from D to B as well as from D to C, because they're all connected

if E can reach C, then E can reach A, B, C, D

see what's going on here? if a node can find a path to any of those nodes in that group, then there is a path to all of them. you're creating a set of nodes that can reach each other

these are the rules:

if a node X can reach any of the nodes in the set s, then X belongs to s too

if two nodes belong to the same set, then there is a path between them

if there are two sets s and t, a node A that belongs to s, a node B that belongs to t, and there is a path (an arc) between A and B, then you can unite s and t in a single set, because every node of s can reach every node of t passing through the arc AB

so, to solve this problem you need to keep track of all these "sets of reachability". the union-find algorithm uses a smart data structure to handle these sets

i think the presentation i linked does a good job at explaining how it works, but if you have questions about the actual implementation you can indeed ask

Very good. I already read the presentation you linked, it indeed explains it very well.

There are still some things on how to implement it in this problem that I do not understand. Some points can have two parents because there are two points to go to from there, is that correct? How should I implement that? How would I go about checking paths in a set? According to the presentations and articles I read I should recursively call the parent on each point, but what if there are two parents?

As you can see there are still some things on the implementation I do not understand yet.

Thank you for helping me!

Maximation · July 18, 2014

Also, should I keep refering to points using the x1 and y1 or x2 and y2 coordinates of the segment?

wolfsinner · July 18, 2014

Ciccioo wrote a pretty good explanation for a practical union-find, but here's a general explanation of it.

The general idea for union-find is to group elements together that satisfty a certain property in respect to each other. This can be anything, and that's why union-find is such a versatile algorithm.

Its basis lies in set theory. Here are some basics of set theory:

{} is an empty set. It is a set with 0 elements.
{ x ∈ D | P(x) } is the set of elements in domain D, which satisfy condition P. You can think of P as a boolean function that checks something in the given value, and returns true if that value satisfies that condition. P is called a predicate.
{1, 2, 3} is an example of a set.
|S| is the cardinality function. It is equivalent to a "count" function. It returns the number of elements in a set. So |{1, 2, 3}| = 3.
Sets can contain sets. So {{1, 2}, {3}, {4, 5, 6}} is a set of 3 elements, and all of those 3 elements are sets.
S₁ U S₂ is set-union. Set union is, as the name suggests, a binary operation that produces the union of the two sets provided. Remember that there are never duplicates in a set.
- {1, 2} U {3, 4} = {1, 2, 3, 4}
- {1, 2} U {2, 3} = {1, 2, 3}
S₁ U S₂ is set-intersection. Set intersection is, as the name suggests, a binary operation that produces the intersection of the two sets provided. So any element that S₁ has in common with S₂ is present in the result set.
- {1, 2} ∩ {3, 4} = {}
- {1, 2} ∩ {2, 3} = {2}

Now that you know some set theory, union-find works as follows. For n elements, you start by creating a set that contains sub-sets of cardinality 1, for each element n.

For n = 5:

S = {{1}, {2}, {3}, {4}, {5}}, you can visualize these as separate graph vertices.

This means that none of those elements respects the required property amongst each other. Let's look at the example use for this problem. What you want to make sure is that there are no cycles. You know that introducing an edge (x -> y) would create a cycle if there is already a path from x to y.

So let's say that the condition is: within a sub-set of S, all of the elements have a path to each other. Let's imagine that we add edge (1->2). What would it do?

1 and 2 are in different sub-sets of S. This means there isn't a path between the two.
So, we join {1} and {2} to produce {1, 2}. So now S = {{1, 2}, {3}, {4}, {5}}.

Now let's try to add edge (3->4).

3 and 4 are in different sub-sets of S. This means there isn't a path between the two.
So, we join {3} and {4} to produce {3, 4}. So now S = {{1, 2}, {3, 4}, {5}}.

Now, let's try to add edge (2->3).

2 and 3 are in different sub-sets of S. This means there isn't a path between the two.
So, we join {1, 2} and {3, 4} to produce {1, 2, 3, 4}. So now S = {{1, 2, 3, 4}, {5}}.

Notice now that there is a set {1, 2, 3, 4}. This means that there is a path between all of those vertices. If this seems confusing to you, try drawing this on a paper.

Now, let's try to add edge (2->4).

2 and 4 are in the same sub-set. This means that there is a path between them already. Adding this edge would introduce a cycle, so we don't want it in there.

Using this technique you know exactly when adding an edge introduces a cycle in the graph. Which is Bob's problem. I hope this explanation wasn't confusing.

Your task would be to transform this same idea into code, by using appropriate structures to represent these things. The page that Ciccioo supplied gives you a very good example of implemented union-find in a disjoint-set.

You also asked what union-find can be used for, besides "cycle checking". It can be used for an enormous amounts of applications. Most notable uses are for checking graph connectivity, and finding a minimum spanning tree in a graph. The last one is especially useful for optimizing network design costs.

Ciccioo · July 18, 2014

There are still some things on how to implement it in this problem that I do not understand. Some points can have two parents because there are two points to go to from there, is that correct? How should I implement that? How would I go about checking paths in a set? According to the presentations and articles I read I should recursively call the parent on each point, but what if there are two parents?

As you can see there are still some things on the implementation I do not understand yet.

nope

don't look at it as a graph, there is no path and there is no multiple parents, there are no parents at all: there are just sets. a uniform blob of nodes, that you know can reach each other

the union-find algorithm uses a tree to represent those sets, but remember it's a set. the root of the tree has no meaning, it could be any other node of the set, because in a set every element is at the same level as the others

so how does it work? the whole thing is done so that you can check whether or not two nodes can reach each other, that is equivalent to say that those two nodes belong to the same set

with this particular data structure, a set is identified by the root of the tree that represents it. if you want to know if two nodes belong to the same set, you just have to check if the root of those two nodes are the same node

the implementation goes like this: you have N nodes, and you have an array of N integers

the Nth element of the array tells you what is the parent of the Nth node

if a node is parent of itself, then that node is the root of its set

int N = 10;int[] parent = new int[N];for(int i = 0; i < N; i++)           parent[i] = i;                // initialization: every node is parent of itself (every node is the root of its own little set, which only contains him)parent[3] = 4;                        // 4 is now the parent of 3. 4 is the root of this little tree, so, 4 identifies the set (3, 4) that just got createdparent[5] = 3;                        // 3 is the parent of 5. the root of this tree is still 4parent[1] = 2;parent[6] = 2;                        // 2 is the root of this tree, and it has two childs. it represents a set of 3 nodes (1, 2, 6)parent[2] = 3;                        //2 is not parent of itself, so it's not a root anymore. its whole tree is now attached to the tree whose root is 4//1, 2, 3, 4, 5, 6 are all in the same set, and can all reach each other, and you can tell that because the root of all of them is 4

Also, should I keep refering to points using the x1 and y1 or x2 and y2 coordinates of the segment?

that was kind of the painful part for me

as you can see, every node needs to have a unique index, so that you can use it as the index of parent[]

you need a data structure to associate a point with an index

wolfsinner · July 18, 2014

that was kind of the painful part for me

as you can see, every node needs to have a unique index, so that you can use it as the index of parent[]

you need a data structure to associate a point with an index

A simple way to do this is to use a Hash Map, where the key is a String that represents the point (something like x + " " + y), and the value is a pointer to its parent.

Maximation could use the following in Java:

Map<String, Node> map = new HashMap<String, Node>();

Where Node is a structure that has a pointer to its parent node as well as other info that might come in handy. I hope I'm not adding to the confusion.

By the way Maximation, a Map is a dictionary/index, it associates a key to a value. For example:

Map<String, String> map = new HashMap<String, String>();map.put("Something", "this is value");System.out.println(map.get("Something")); //this would print: this is value

Maximation · July 18, 2014

-snip-

I read your posts. As I am really tired now I will read them again tomorrow morning and then start working on a new solution.

I am still not sure how to create a structure for the different points/nodes, how do I know how many of them there are? Oh well, I will come up with a solution tomorrow morning, I guess.

As always, thank you!

wolfsinner · July 18, 2014

I read your posts. As I am really tired now I will read them again tomorrow morning and then start working on a new solution.

I am still not sure how to create a structure for the different points/nodes, how do I know how many of them there are? Oh well, I will come up with a solution tomorrow morning, I guess.

As always, thank you!

Alright! If you need any help we're here.

In case you want something to base your implementation off, here's a UnionFind implementation I wrote some time back that I keep around for when I need it (it's similar to the one I used for this problem, I removed path compression for less confusion). Node and UnionFind are the interfaces, you don't really need to look at their contents. What matters is in NodeClass and UnionFindInMap.

Sign In

[Problem Solving #3] Polygon Phobia

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites