Recently my team needed to find out if a point was inside a given polygon and, as usual, we found some code on the internet that did what we wanted. And, as usual, there wasn’t much explanation as to what it was doing. Here’s the code (after we converted it to Ruby):

1

2

3

4

5

6

7

8

9

**10**

11

12

13

14

15

16

17

def contains_point?(point)

c = false

i = -1

j = self.size - 1

while (i += 1) < self.size

if ((self[i].y <= point.y && point.y < self[j].y) ||

(self[j].y <= point.y && point.y < self[i].y))

if (point.x < (self[j].x - self[i].x) * (point.y - self[i].y) /

(self[j].y - self[i].y) + self[i].x)

c = !c

end

end

j = i

end

return c

end

end

So, you know, it works and I’m grateful to somebody for posting it but we had no idea how it works. So we wrapped 2 tests around it (one where the point is not in the polygon and another where it is). But that one method had a Flog score of 80 – it was killing me. Also 2 tests where nowhere near exercising all the possible paths through this code and since it was completely un-obvious as to what was going on this was just a bug waiting to happen.

I decided to refactor this method so that it made a wee bit more sense. I did a little bit of reading on the internet about the ray casting algorithm, got out a pen and paper, and figured out what the hell was going on. Before I go on to explain how I changed the code, I should probably explain that we are using GeoRuby to give us a bunch of geometric classes and we added the contains_point? method to GeoRuby::SimpleFeatures::LinearRing. Inside a LinearRing instance self[0] is the first point in the polygon, self[1] is the second, and so on. Also, point.x gives you the x value of the point. Oh, and self.size returns the number of points in the polygon

Here’s the refactored code:

1

2

3

4

5

6

7

8

9

**10**

11

12

13

14

15

16

17

18

19

**20**

21

22

23

24

25

26

27

28

def contains_point?(point)

contains_point = false

i = -1

j = self.size - 1

while (i += 1) < self.size

a_point_on_polygon = self[i]

trailing_point_on_polygon = self[j]

if point_is_between_the_ys_of_the_line_segment?(point, a_point_on_polygon, trailing_point_on_polygon)

if ray_crosses_through_line_segment?(point, a_point_on_polygon, trailing_point_on_polygon)

contains_point = !contains_point

end

end

j = i

end

return contains_point

end

private

def point_is_between_the_ys_of_the_line_segment?(point, a_point_on_polygon, trailing_point_on_polygon)

(a_point_on_polygon.y <= point.y && point.y < trailing_point_on_polygon.y) ||

(trailing_point_on_polygon.y <= point.y && point.y < a_point_on_polygon.y)

end

def ray_crosses_through_line_segment?(point, a_point_on_polygon, trailing_point_on_polygon)

(point.x < (trailing_point_on_polygon.x - a_point_on_polygon.x) * (point.y - a_point_on_polygon.y) /

(trailing_point_on_polygon.y - a_point_on_polygon.y) + a_point_on_polygon.x)

end

And here’s what I think this algorithm is doing: If you take any point and draw a ray from it, in any direction, and that ray crosses through the sides of a polygon 0 or and even number of times then the point is outside the polygon. If the ray crosses the sides of a polygon an odd number of times then it is inside said polygon. If you don’t believe me get out a pen and paper and draw it or look at this wikipedia article.

So the contains_point? method starts out by assuming the point is outside the polygon since its ray has yet to cross any of the borders. Then it needs to look at each line segment of the polygon (otherwise known as a border or edge) so, if you look at the i and j iterators, they are working their way around the polygon and are used to define a point on the polygon and the point immediately preceding it. Now the algorithm looks to see if the point in question is between the max and min y values for a given line segment using the appropriately named point_is_between_the_ys_of_the_line_segment? method. If that’s true then a horizontal ray could travel through the current line segment. But it might not, depending on which way it is traveling. So the next method ‘ray_crosses_through_line_segment?,’ checks to make sure. Now if the ray has crossed through the line segment, or edge, of the polygon you set contains_point to the opposite of whatever it was. Why? Well if it had crossed 0 times contains_point would be false and one more crossing would bring it to a total of 1 (an odd number) and so contains_point should now be true. Every time you find another crossing you flip from odd to even or vice-versa and so you must flip the variable also.

Not too hard, huh? But if there are many sides to your polygon (and we sometimes have lots of sides) then this can be quite a slow operation to do en masse (and we are planning to do it very en masse) so while doing some research for the refactor I found an optimization:

1

2

3

4

5

6

7

8

9

def outside_bounding_box?(point)

bb_point_1, bb_point_2 = bounding_box

max_x = [bb_point_1.x, bb_point_2.x].max

max_y = [bb_point_1.y, bb_point_2.y].max

min_x = [bb_point_1.x, bb_point_2.x].min

min_y = [bb_point_1.y, bb_point_2.y].min

point.x < min_x || point.x > max_x || point.y < min_y || point.y > max_y

end

A bounding box is the smallest possible box that can completely contain a polygon. If you draw a bounding box around the polygon, check to see if the point is outside the bounding box, and it is then you’re done. You fail fast because if the point is outside the bounding box there is no way it is inside the polygon. GeoRuby gives you a bounding_box method for free (but it’s not too hard to find the max and min x’s and y’s of a polygon) so I was able to create the above method and insert it into the original pretty easily:

1

2

3

4

def contains_point?(point)

return false if outside_bounding_box?(point)

# rest of method as shown above

end

Now contains_point? will give up early if there’s no chance.