Jump to content

[Problem Solving #3] Polygon Phobia

wolfsinner

Problem #3 - Polygon Phobia

 

So here's Problem #3 of the series. I was really busy this week, so I didn't have time to come up with a problem. I used a problem from the Portuguese Inter-University Programming Marathon. I hope you guys enjoy it.

This one is simple, especially if you use a particular algorithm/data structure. You don't need to know more than basic math for this problem, in spite of what its name and geometrical basis suggests, you don't need to do any kind of geometrical calculations or even use any mathematical formulae. Make sure you understand what is requested, as the task itself involves very little math.

I recommend you read the problem description through the gateway or the PDF, instead of here, since I couldn't do some of the pretty formatting with BBCode.

 

Problem Description:

 
(This problem was a part of the 2013 edition of the Portuguese Inter-University Programming Marathon. You can download the original PDF file here.)
 
Bob is a young artist, whose current works are paintings with coloured line segments (like the one depicted in the figure below). But Bob is phobic about polygons, he simply cannot draw any closed chain of line segments.
In his next painting workshop Bob has decided to try a new approach, "the luck of the draw". He will start by carefully choosing the possible line segments and by writing the two different endpoints of each segment on a small piece of paper, just as if the painting canvas was a Cartesian coordinate system.
Then, all those pieces of paper will be put into a large jar, which will be very well shaken. At this point, the painting process will begin, using his "luck of the draw" approach. Whilst the jar is not empty, Bob will pick a piece of paper out of it, one at a time, and will then paint the corresponding line segment unless it closes an existing chain (because of his phobia).
Papers taken out of the jar will be put directly into the recycling bin.
A segment (P1, Pncloses an existing chain if the painting already has segments of the form:
 
(P1, P2), (P2, P3), (P3, P4), ..., (Pn-1, Pn)
 
where (p, q) denotes the line segment whose endpoints are p and q. Notice that (p, q) = (q, p). Notice also that chains are defined by segment endpoints and not by line intersections. For instance:
 
  • line segment ((1, 1), (5, 1)) closes the chain ((1, 1), (5, 5)), ((5, 5), (5, 1)), but
  • line segment ((1, 1), (6, 1)) does not close the chain ((1, 1), (5, 5)), ((5, 5), (5, 1)).

 

1.png
 
The figure shows the line segments Bob would paint if the pieces of paper picked from the jar contained the line segments presented in the table, in the specified order.
 
2.png
 
 
Task
Given a sequence of distinct line segments, each one defined by two different endpoints, the goal is to find out how many line segments Bob would actually paint.
 
 
Input:
 
(You can download the zip file containing all of the input files here.)
 
The first line of the input contains one integer, S, which is the number of (distinct) line segments. Each of the following S lines contains four integers, x1, y1, x2, and y2, which indicate that (x1y1) and (x2, y2) are the endpoints of a line segment, defined in Cartesian coordinates. Integers in the same line are separated by a single space.
 
Input example 1:
10
5 40 195 40
5 40 125 100
70 90 70 10
150 10 70 10
150 10 70 90
125 100 195 40
170 80 150 40
150 40 150 10
70 90 170 80
70 90 150 40
 
Input example 2:
3
5 7 15 7
5 7 20 7
10 7 20 7
 
(All of the input data is too large to place here, so please use the zip file.)
 
Output:
The output has a single line with the number of line segments that Bob would paint.
 
Output example 1:
6
 
Output example 2:
3

 

How to test your solution:

To submit your output solutions you need to run your code with each of the input files provided in the zip file, and place your output in the appropriate text box.

So your program's output with input from file "input1.txt" should be placed in the "Output 1" box.

 

I recommend downloading the zip file, and using command-line input redirection to send the file contents into the standard input stream of your program. In Windows, Linux or Mac OS, you would do something like:

[command to run your program] < inputN.txt

You can submit your output solutions through the problem solving gateway here.

 
Final notes:

  • Create your account at the gateway with the same username as the one you use at the forums.
  • Include your code in spoiler tags, so that the thread isn't cluttered with code.
  • If you're having trouble solving the problem, ask the community for help! You may learn a lot from them.
  • Don't worry too much about efficiency if you're starting out. Solve the problem first, we'll then tell you how to improve it.
  • Please don't share the output solution if you've solved the problem.
  • If the problem description isn't clear, please ask for further explanation!

Want to solve problems? Check this out.

Link to comment
Share on other sites

Link to post
Share on other sites

Soo I did it using graphs, I'm sure there is a better way.

But my code is embarassingly slow so I'll hold off on posting it for now.

1474412270.2748842

Link to comment
Share on other sites

Link to post
Share on other sites

Done :P

 

I'll post code shortly; I want to make it pretty.

The single biggest problem in communication is the illusion that it has taken place.

Link to comment
Share on other sites

Link to post
Share on other sites

./polygonPhobia < input6.txt

*sits back and wait*

 

here's my solving code, i tried with C++ this time, i'm currently studying it so it may or may not be any good

the algorithm itself is a depth-first search this time, i'll try to make it bredth-first, hopefully it'll become much faster (edit: apparently not)

 

edit again: turns out, BFS is almost twice as slow as DFS in all the 3 'big' test cases provided, but actually i guess it's just about luck and how the test cases are structured, the two algorithms should be equivalent in average

there must be a better solution though

#include <iostream>#include <vector>#include <map>class point{	private:		int x, y;	public:		point(int X, int Y): x{X}, y{Y} {}		point(const point& p): x{p.x}, y{p.y} {}		bool operator == (const point& p) const{			return p.x == x && p.y == y;		}		bool operator < (const point& p) const{			return x < p.x || (x == p.x && y < p.y);		}};class node{	private:		point p;		unsigned int visitMark;		std::vector<node*> neighbors;	public:		node(const point& P): p(P), visitMark{0}, neighbors(0) {}		void addNeighbor(node* n){			neighbors.push_back(n);		}		bool canReach(const point& P, unsigned int visited){			if(p == P) return true;			visitMark = visited;			for(auto& n : neighbors)				if(n -> visitMark != visited)					if(n -> canReach(P, visited))						return true;			return false;		}};node* getOrAddNode(std::map<point, node*>& m, const point& P){	auto it = m.find(P);	if(it == m.end())		it = m.insert({P, new node(P)}).first;	return it -> second;}int main(){	size_t S;	std::cin >> S;	std::map<point, node*> points;	unsigned int visitMark = 1, addedNodes = 0;	for(size_t i = 0; i < S; i++){		int x1, y1, x2, y2;		std::cin >> x1 >> y1 >> x2 >> y2;		point A{x1, y1}, B{x2, y2};		node* nA = getOrAddNode(points, A);		node* nB = getOrAddNode(points, B);		if(!nA -> canReach(B, visitMark)){			addedNodes++;			nA -> addNeighbor(nB);			nB -> addNeighbor(nA);		}		visitMark++;	}	for(auto& P : points)		delete P.second;	std::cout << addedNodes;}
Link to comment
Share on other sites

Link to post
Share on other sites

 

That's it.

 

The optimal solution is to note the following: if you represent the drawing as a graph, closing a chain means introducing a cycle within the graph.

A fast way to detect whether a graph has a cycle is by managing its structure with a disjoint-set using a union-find algorithm. If adding an edge to the graph creates a cycle, then you don't add that edge, and as such Bob won't draw that line segment.

A good union-find has inverse-ackermann complexity, which is very good (it's amortized O(1)).

Want to solve problems? Check this out.

Link to comment
Share on other sites

Link to post
Share on other sites

So hard to do refactoring with screaming children running around  <_<

 

I'm being naughty again by not following TDD. As before I hit all files at once, also note how I throw away the redundant data (line 12)  :P

 

MainViewModel

UnionFindAlgorithm

Result (just a POD)

        private void ProcessFiles()        {            IsBusy = true;            _cs = new CancellationTokenSource();            UpdateCommands();            Task.Run(                () => Task.WaitAll(Directory.GetFiles(InputDirectory, "*.txt").Select(f => Task.Run(                    () =>                        {                            var rawText = File.ReadAllText(f).Trim().Split(new[] { Environment.NewLine }, StringSplitOptions.None).ToList();                            rawText.RemoveAt(0);                            var result = new Result { Index = f };                            var index = 0;                            var pointMapping = new Dictionary<Point, int>();                            var unionFindAlgorithm = new UnionFindAlgorithm(rawText.Count * 2);                            foreach (var line in rawText)                            {                                _cs.Token.ThrowIfCancellationRequested();                                var coordinates = line.Split(' ');                                var p1 = new Point(int.Parse(coordinates[0]), int.Parse(coordinates[1]));                                var p2 = new Point(int.Parse(coordinates[2]), int.Parse(coordinates[3]));                                if (!pointMapping.ContainsKey(p1))                                {                                    pointMapping.Add(p1, index++);                                }                                if (!pointMapping.ContainsKey(p2))                                {                                    pointMapping.Add(p2, index++);                                }                                var idx1 = pointMapping[p1];                                var idx2 = pointMapping[p2];                                if (!unionFindAlgorithm.IsConnectedSet(idx1, idx2))                                {                                    result.Total++;                                }                                unionFindAlgorithm.UnionSet(idx1, idx2);                            }                            AddResult(result);                        }, _cs.Token)).ToArray()), _cs.Token).ContinueWith(                        _ =>                        {                            IsBusy = false;                            UpdateCommands();                        });        } 
    class UnionFindAlgorithm    {        private readonly int[] _p;        private readonly int[] _rank;        public UnionFindAlgorithm(int limit)        {            _p = new int[limit];            _rank = new int[limit];            for (var i = 0; i < limit; i++)            {                _p[i] = i;            }        }        public bool IsConnectedSet(int i, int j)        {            return FindSet(i) == FindSet(j);        }        public void UnionSet(int i, int j)        {            if (IsConnectedSet(i, j))            {                return;            }            var p = new Point(FindSet(i), FindSet(j));            if (_rank[p.X] > _rank[p.Y])            {                _p[p.Y] = p.X;            }            else            {                _p[p.X] = p.Y;                if (_rank[p.X] == _rank[p.Y])                {                    _rank[p.Y]++;                }            }        }        private int FindSet(int i)        {            return (_p[i] == i) ? i : (_p[i] = FindSet(_p[i]));        }    } 
    class Result    {        public string Index { get; set; }        public int Total { get; set; }    } 

The single biggest problem in communication is the illusion that it has taken place.

Link to comment
Share on other sites

Link to post
Share on other sites

Still slowish but much better than my original code.

def getSet(point,sets):    for i in range(0,len(sets)):        if point in sets[i]:            return i    return -1def main():    numLines = int(input())    sets = []    numSets = 0    drawnLines = 0        for i in range(0,numLines):        l = [int(x) for x in input().split()]        p1 = (l[0],l[1])        p2 = (l[2],l[3])        s1 = getSet(p1,sets)        if s1 == -1:            sets.append(set([p1]))            s1 = numSets            numSets += 1        s2 = getSet(p2,sets)        if s2 == -1:            sets.append(set([p2]))            s2 = numSets            numSets += 1                    if s1 != s2:            drawnLines += 1            sets[s1] |= sets[s2]            del(sets[s2])            numSets -= 1    print(drawnLines)if __name__ == "__main__":    from timeit import Timer    t = Timer(lambda: main())    print(t.timeit(number=1))

1474412270.2748842

Link to comment
Share on other sites

Link to post
Share on other sites

A fast way to detect whether a graph has a cycle is by managing its structure with a disjoint-set using a union-find algorithm.

that's some pretty cool stuff, i would have never come up with that by myself

i did notice that i just needed to detect cycles, but i could just think of the naive way

 

in case anyone wants to read about the union-find algorithm, i'll leave a link to the small presentation i found, and the code that came out of that reading (not much more than a mere copypaste)

#include <iostream>#include <vector>#include <map>class unionFind{    private:        std::vector<size_t> parent;        size_t root(size_t A){            size_t *tmp, root = parent[A];            while(root != parent[root])                root = parent[root];            while(parent[A] != root){                    tmp = &parent[A];                    parent[A] = root;                    A = *tmp;            }            return root;        }    public:        unionFind(size_t n): parent(n){            for(size_t i = 0; i < n; i++)                parent[i] = i;        }        bool canReach(size_t A, size_t B){            return root(A) == root(B);        }        void unite(size_t A, size_t B){            parent[root(B)] = root(A);        }};class point{    private:        int x, y;    public:        point(int X, int Y): x{X}, y{Y} {}        point(const point& p): x{p.x}, y{p.y} {}        bool operator == (const point& p) const{            return p.x == x && p.y == y;        }        bool operator < (const point& p) const{            return x < p.x || (x == p.x && y < p.y);        }};size_t getOrAddPoint(std::map<point, size_t>& m, const point& P, size_t& id){    auto it = m.find(P);    if(it == m.end())        it = m.insert({P, id++}).first;    return it -> second;}int main(){    std::map<point, size_t> map;    size_t S;    std::cin >> S;    unionFind UF(2 * S);    size_t addedNodes = 0, uniqueID = 0;    for(size_t i = 0; i < S; i++){        int x1, y1, x2, y2;        std::cin >> x1 >> y1 >> x2 >> y2;        point A{x1, y1}, B{x2, y2};        size_t idA = getOrAddPoint(map, A, uniqueID);        size_t idB = getOrAddPoint(map, B, uniqueID);        if(!UF.canReach(idA, idB)){            addedNodes++;            UF.unite(idA, idB);        }    }    std::cout << addedNodes << std::endl;}
Link to comment
Share on other sites

Link to post
Share on other sites

in case anyone wants to read about the union-find algorithm, i'll leave a link to the small presentation i found, and the code that came out of that reading (not much more than a mere copypaste)

 

Very good, I will read through the presentation. Thank you.

Learning

Link to comment
Share on other sites

Link to post
Share on other sites

Will any of the segments in the input files be exactly the same as another line in the file?

Learning

Link to comment
Share on other sites

Link to post
Share on other sites

Will any of the segments in the input files be exactly the same as another line in the file?

If I understand your question correctly, all of the line segments within each file are distinct.

Want to solve problems? Check this out.

Link to comment
Share on other sites

Link to post
Share on other sites

If I understand your question correctly, all of the line segments within each file are distinct.

 

Thank you. You understood it correctly.

Learning

Link to comment
Share on other sites

Link to post
Share on other sites

Mh, I tried this :

But it returns 0 as linesDrawn.

Why does it do this?

 

Here is the full code : 

for (int j = 0; j < connectingSegments.size(); j++) {					for (int k = 0; k < connectingSegments.get(j).size(); k++) {						if ((segments[j].getX1() == connectingSegments.get(j).get(k).getX1() && segments[j].getY1() == connectingSegments.get(j).get(k).getY1()) && (segments[j].getX2() == connectingSegments.get(j).get(k).getX2() && segments[j].getY2() == connectingSegments.get(j).get(k).getY2())) {							// if both of the endpoints of a segment are already in a linked list  then Bob will not paint it, so we do nothing						}						else if ((segments[j].getX1() == connectingSegments.get(j).get(k).getX1() && segments[j].getY1() == connectingSegments.get(j).get(k).getY1()) && (segments[j].getX2() != connectingSegments.get(j).get(k).getX2() && segments[j].getY2() != connectingSegments.get(j).get(k).getY2())) {							connectingSegments.get(j).add(segments[j]);							linesDrawn++;						}						else if ((segments[j].getX1() != connectingSegments.get(j).get(k).getX1() && segments[j].getY1() != connectingSegments.get(j).get(k).getY1()) && (segments[j].getX2() == connectingSegments.get(j).get(k).getX2() && segments[j].getY2() == connectingSegments.get(j).get(k).getY2())) {							connectingSegments.get(j).add(segments[j]);							linesDrawn++;						}						else {							LinkedList<Segment> linkedListToAdd = new LinkedList<Segment>();							linkedListToAdd.add(segments[j]);							connectingSegments.add(linkedListToAdd);							linesDrawn++;						}					}				}
/* * Author : maeb * 2014 */import java.util.Scanner;import java.util.LinkedList;public class Main {	public static void main(String[] args) {		Scanner in = new Scanner(System.in);		int linesDrawn = 0;		/* integer on the first line represents the number of segments it will exist out of */		int numSegments = in.nextInt();		Segment[] segments = new Segment[numSegments];		LinkedList<LinkedList<Segment>> connectingSegments = new LinkedList<LinkedList<Segment>>();		try {			/* makes a Segment object with its 4 coordinates and stores all the Segment objects in the segments array */			for (int i = 0; i < numSegments; i++) {				int x1 = 0, y1 = 0, x2 = 0, y2 = 0;				for (int j = 0; j < 4; j++) {					if (j == 0)						x1 = in.nextInt();					if (j == 1)						y1 = in.nextInt();					if (j == 2)						x2 = in.nextInt();					if (j == 3)						y2 = in.nextInt();				}				segments[i] = new Segment(x1, y1, x2, y2);			}		} catch (Exception e) {}		for (int i = 0; i < segments.length; i++) {			if (connectingSegments.size() == 0) {				LinkedList<Segment> linkedListToAdd = new LinkedList<Segment>();				linkedListToAdd.add(segments[i]);				connectingSegments.add(linkedListToAdd);			}			else {				for (int j = 0; j < connectingSegments.size(); j++) {					for (int k = 0; k < connectingSegments.get(j).size(); k++) {						if ((segments[j].getX1() == connectingSegments.get(j).get(k).getX1() && segments[j].getY1() == connectingSegments.get(j).get(k).getY1()) && (segments[j].getX2() == connectingSegments.get(j).get(k).getX2() && segments[j].getY2() == connectingSegments.get(j).get(k).getY2())) {							// if both of the endpoints of a segment are already in a linked list  then Bob will not paint it, so we do nothing						}						else if ((segments[j].getX1() == connectingSegments.get(j).get(k).getX1() && segments[j].getY1() == connectingSegments.get(j).get(k).getY1()) && (segments[j].getX2() != connectingSegments.get(j).get(k).getX2() && segments[j].getY2() != connectingSegments.get(j).get(k).getY2())) {							connectingSegments.get(j).add(segments[j]);							linesDrawn++;						}						else if ((segments[j].getX1() != connectingSegments.get(j).get(k).getX1() && segments[j].getY1() != connectingSegments.get(j).get(k).getY1()) && (segments[j].getX2() == connectingSegments.get(j).get(k).getX2() && segments[j].getY2() == connectingSegments.get(j).get(k).getY2())) {							connectingSegments.get(j).add(segments[j]);							linesDrawn++;						}						else {							LinkedList<Segment> linkedListToAdd = new LinkedList<Segment>();							linkedListToAdd.add(segments[j]);							connectingSegments.add(linkedListToAdd);							linesDrawn++;						}					}				}			}		}		System.out.println("Lines drawn : " + linesDrawn);	}}
/* *	Author: maeb *	2014 */public class Segment {	private int x1, y1, x2, y2;	public Segment (int x1, int y1, int x2, int y2) {		this.x1 = x1;		this.y1 = y1;		this.x2 = x2;		this.y2 = y2;	}	public int getX1() {		return x1;	}	public int getY1() {		return y1;	}	public int getX2() {		return x2;	}	public int getY2() {		return y2;	}}

Learning

Link to comment
Share on other sites

Link to post
Share on other sites

But it returns 0 as linesDrawn.

Why does it do this?

 

Notice that when the connectingSegments list is empty, you add a list with 1 element representing the first segment. Then you move on to the next segment, and if that segment's endpoints are not in that list you do nothing. You don't add that new segment anywhere, so it will never be compared to other segments. If there is not a segment with common endpoints to the first segment, you will never add anything besides the first line segment to these lists.

Actually, nevermind that. I don't know what I was thinking when I wrote that. I guess I was tired. :D Your problem is that you're always testing against the first segment, and you're never adding anything to any list (except for when the list is empty).

Notice that you're always comparing to segments[j]. It still wouldn't work, and if you altered it to work it still wouldn't scale efficiently.

This implementation might be an attempt at union-find though, so if it is, please tell me.

Notice that you should increment linesDrawn when you add the first segment as well.

 

I recommend you make an effort to understand the union-find algorithm though, because it really simplifies this whole process, and is very efficient.

Want to solve problems? Check this out.

Link to comment
Share on other sites

Link to post
Share on other sites

This implementation might be an attempt at union-find though, so if it is, please tell me.

I recommend you make an effort to understand the union-find algorithm though, because it really simplifies this whole process, and is very efficient.

 

Yes this was an attempt at union-find. I read up on disjoint-set structures and with that knowledge (although I still do not fully understand it) I implemented some sort of union-find algorithm. It looks not very good to me though, it seems way too complicated. Could you maybe explain more about the union-find / disjoint-set algorithm?

Thank you very much.

Learning

Link to comment
Share on other sites

Link to post
Share on other sites

Yes this was an attempt at union-find. I read up on disjoint-set structures and with that knowledge (although I still do not fully understand it) I implemented some sort of union-find algorithm. It looks not very good to me though, it seems way too complicated. Could you maybe explain more about the union-find / disjoint-set algorithm?

Thank you very much.

i'll leave here how i understood it

 

problem:

we have a graph, and we must avoid cycles (because a cycle, if you draw it on the paper, is a closed polygon, and bob is scared of those)

 

solution:

when you add an arc between two nodes A and B (when you draw a line on the segment AB), you first need to check if there is already a path that goes from A to B

If there is one, adding another [direct] path will close a cycle. so you skip it

 

implementation:

every time you add an arc/draw a line you need to check whether or not there is a path between the two end points. that's a lot of times, so you need a way to do that real quick

 

you start by noting that, if there is a path from A to B, and there is a path from B to C, then there is of course a path from A to C: you just go from A to B, then from B to C

continuing the previous example, if there is a path from a certain node D to A, then there is a path from D to B as well as from D to C, because they're all connected

if E can reach C, then E can reach A, B, C, D

 

see what's going on here? if a node can find a path to any of those nodes in that group, then there is a path to all of them. you're creating a set of nodes that can reach each other

these are the rules:

  1. if a node X can reach any of the nodes in the set s, then X belongs to s too
  2. if two nodes belong to the same set, then there is a path between them
  3. if there are two sets s and t, a node A that belongs to s, a node B that belongs to t, and there is a path (an arc) between A and B, then you can unite s and t in a single set, because every node of s can reach every node of t passing through the arc AB

so, to solve this problem you need to keep track of all these "sets of reachability". the union-find algorithm uses a smart data structure to handle these sets

i think the presentation i linked does a good job at explaining how it works, but if you have questions about the actual implementation you can indeed ask

Link to comment
Share on other sites

Link to post
Share on other sites

i'll leave here how i understood it

 

problem:

we have a graph, and we must avoid cycles (because a cycle, if you draw it on the paper, is a closed polygon, and bob is scared of those)

 

solution:

when you add an arc between two nodes A and B (when you draw a line on the segment AB), you first need to check if there is already a path that goes from A to B

If there is one, adding another [direct] path will close a cycle. so you skip it

 

implementation:

every time you add an arc/draw a line you need to check whether or not there is a path between the two end points. that's a lot of times, so you need a way to do that real quick

 

you start by noting that, if there is a path from A to B, and there is a path from B to C, then there is of course a path from A to C: you just go from A to B, then from B to C

continuing the previous example, if there is a path from a certain node D to A, then there is a path from D to B as well as from D to C, because they're all connected

if E can reach C, then E can reach A, B, C, D

 

see what's going on here? if a node can find a path to any of those nodes in that group, then there is a path to all of them. you're creating a set of nodes that can reach each other

these are the rules:

  1. if a node X can reach any of the nodes in the set s, then X belongs to s too
  2. if two nodes belong to the same set, then there is a path between them
  3. if there are two sets s and t, a node A that belongs to s, a node B that belongs to t, and there is a path (an arc) between A and B, then you can unite s and t in a single set, because every node of s can reach every node of t passing through the arc AB

so, to solve this problem you need to keep track of all these "sets of reachability". the union-find algorithm uses a smart data structure to handle these sets

i think the presentation i linked does a good job at explaining how it works, but if you have questions about the actual implementation you can indeed ask

 

Very good. I already read the presentation you linked, it indeed explains it very well.

 

There are still some things on how to implement it in this problem that I do not understand. Some points can have two parents because there are two points to go to from there, is that correct? How should I implement that? How would I go about checking paths in a set? According to the presentations and articles I read I should recursively call the parent on each point, but what if there are two parents?

As you can see there are still some things on the implementation I do not understand yet.

 

Thank you for helping me!

Learning

Link to comment
Share on other sites

Link to post
Share on other sites

Also, should I keep refering to points using the x1 and y1 or x2 and y2 coordinates of the segment?

Learning

Link to comment
Share on other sites

Link to post
Share on other sites

Ciccioo wrote a pretty good explanation for a practical union-find, but here's a general explanation of it.

The general idea for union-find is to group elements together that satisfty a certain property in respect to each other. This can be anything, and that's why union-find is such a versatile algorithm.

 

Its basis lies in set theory. Here are some basics of set theory:

  • {} is an empty set. It is a set with 0 elements.
  • { x ∈ D | P(x) } is the set of elements in domain D, which satisfy condition P. You can think of P as a boolean function that checks something in the given value, and returns true if that value satisfies that condition. P is called a predicate.
  • {1, 2, 3} is an example of a set.
  • |S| is the cardinality function. It is equivalent to a "count" function. It returns the number of elements in a set. So |{1, 2, 3}| = 3.
  • Sets can contain sets. So {{1, 2}, {3}, {4, 5, 6}} is a set of 3 elements, and all of those 3 elements are sets.
  • S1 U S2  is set-union. Set union is, as the name suggests, a binary operation that produces the union of the two sets provided. Remember that there are never duplicates in a set.
    • {1, 2} U {3, 4} = {1, 2, 3, 4}
    • {1, 2} U {2, 3} = {1, 2, 3}
  • S1 U S2  is set-intersection. Set intersection is, as the name suggests, a binary operation that produces the intersection of the two sets provided. So any element that S1 has in common with S2 is present in the result set.
    • ​{1, 2} ∩ {3, 4} = {}
    • {1, 2}  {2, 3} = {2}

 

Now that you know some set theory, union-find works as follows. For n elements, you start by creating a set that contains sub-sets of cardinality 1, for each element n

For n = 5:

S = {{1}, {2}, {3}, {4}, {5}}, you can visualize these as separate graph vertices.

 

This means that none of those elements respects the required property amongst each other. Let's look at the example use for this problem. What you want to make sure is that there are no cycles. You know that introducing an edge (x -> y) would create a cycle if there is already a path from x to y.

So let's say that the condition is: within a sub-set of S, all of the elements have a path to each other. Let's imagine that we add edge (1->2). What would it do?

  • 1 and 2 are in different sub-sets of S. This means there isn't a path between the two.
  • So, we join {1} and {2} to produce {1, 2}. So now S = {{1, 2}, {3}, {4}, {5}}.

Now let's try to add edge (3->4).

  • 3 and 4 are in different sub-sets of S. This means there isn't a path between the two.
  • So, we join {3} and {4} to produce {3, 4}. So now S = {{1, 2}, {3, 4}, {5}}.

Now, let's try to add edge (2->3).

  • 2 and 3 are in different sub-sets of S. This means there isn't a path between the two.
  • So, we join {1, 2} and {3, 4} to produce {1, 2, 3, 4}. So now S = {{1, 2, 3, 4}, {5}}.

Notice now that there is a set {1, 2, 3, 4}. This means that there is a path between all of those vertices. If this seems confusing to you, try drawing this on a paper. 

 

Now, let's try to add edge (2->4).

  • 2 and 4 are in the same sub-set. This means that there is a path between them already. Adding this edge would introduce a cycle, so we don't want it in there.

 

Using this technique you know exactly when adding an edge introduces a cycle in the graph. Which is Bob's problem. I hope this explanation wasn't confusing.

Your task would be to transform this same idea into code, by using appropriate structures to represent these things. The page that Ciccioo supplied gives you a very good example of implemented union-find in a disjoint-set.

 

You also asked what union-find can be used for, besides "cycle checking". It can be used for an enormous amounts of applications. Most notable uses are for checking graph connectivity, and finding a minimum spanning tree in a graph. The last one is especially useful for optimizing network design costs.

Want to solve problems? Check this out.

Link to comment
Share on other sites

Link to post
Share on other sites

There are still some things on how to implement it in this problem that I do not understand. Some points can have two parents because there are two points to go to from there, is that correct? How should I implement that? How would I go about checking paths in a set? According to the presentations and articles I read I should recursively call the parent on each point, but what if there are two parents?

As you can see there are still some things on the implementation I do not understand yet.

nope

don't look at it as a graph, there is no path and there is no multiple parents, there are no parents at all: there are just sets. a uniform blob of nodes, that you know can reach each other

 

the union-find algorithm uses a tree to represent those sets, but remember it's a set. the root of the tree has no meaning, it could be any other node of the set, because in a set every element is at the same level as the others

 

so how does it work? the whole thing is done so that you can check whether or not two nodes can reach each other, that is equivalent to say that those two nodes belong to the same set

with this particular data structure, a set is identified by the root of the tree that represents it. if you want to know if two nodes belong to the same set, you just have to check if the root of those two nodes are the same node

 

the implementation goes like this: you have N nodes, and you have an array of N integers

the Nth element of the array tells you what is the parent of the Nth node

if a node is parent of itself, then that node is the root of its set

int N = 10;int[] parent = new int[N];for(int i = 0; i < N; i++)           parent[i] = i;                // initialization: every node is parent of itself (every node is the root of its own little set, which only contains him)parent[3] = 4;                        // 4 is now the parent of 3. 4 is the root of this little tree, so, 4 identifies the set (3, 4) that just got createdparent[5] = 3;                        // 3 is the parent of 5. the root of this tree is still 4parent[1] = 2;parent[6] = 2;                        // 2 is the root of this tree, and it has two childs. it represents a set of 3 nodes (1, 2, 6)parent[2] = 3;                        //2 is not parent of itself, so it's not a root anymore. its whole tree is now attached to the tree whose root is 4//1, 2, 3, 4, 5, 6 are all in the same set, and can all reach each other, and you can tell that because the root of all of them is 4

Also, should I keep refering to points using the x1 and y1 or x2 and y2 coordinates of the segment?

that was kind of the painful part for me

as you can see, every node needs to have a unique index, so that you can use it as the index of parent[]

you need a data structure to associate a point with an index

Link to comment
Share on other sites

Link to post
Share on other sites

that was kind of the painful part for me

as you can see, every node needs to have a unique index, so that you can use it as the index of parent[]

you need a data structure to associate a point with an index

 

A simple way to do this is to use a Hash Map, where the key is a String that represents the point (something like x + " " + y), and the value is a pointer to its parent.

Maximation could use the following in Java:

Map<String, Node> map = new HashMap<String, Node>();
Where Node is a structure that has a pointer to its parent node as well as other info that might come in handy. I hope I'm not adding to the confusion.

 

By the way Maximation, a Map is a dictionary/index, it associates a key to a value. For example:

Map<String, String> map = new HashMap<String, String>();map.put("Something", "this is value");System.out.println(map.get("Something")); //this would print: this is value

Want to solve problems? Check this out.

Link to comment
Share on other sites

Link to post
Share on other sites

-snip-

 

 

-snip-

 

I read your posts. As I am really tired now I will read them again tomorrow morning and then start working on a new solution.

I am still not sure how to create a structure for the different points/nodes, how do I know how many of them there are? Oh well, I will come up with a solution tomorrow morning, I guess.

As always, thank you!

Learning

Link to comment
Share on other sites

Link to post
Share on other sites

I read your posts. As I am really tired now I will read them again tomorrow morning and then start working on a new solution.

I am still not sure how to create a structure for the different points/nodes, how do I know how many of them there are? Oh well, I will come up with a solution tomorrow morning, I guess.

As always, thank you!

Alright! If you need any help we're here. :)

In case you want something to base your implementation off, here's a UnionFind implementation I wrote some time back that I keep around for when I need it (it's similar to the one I used for this problem, I removed path compression for less confusion). Node and UnionFind are the interfaces, you don't really need to look at their contents. What matters is in NodeClass and UnionFindInMap.

Want to solve problems? Check this out.

Link to comment
Share on other sites

Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

×