Tuesday, July 17, 2012

Two problems in one: density and minimum enclosing circle

I think a lot about density. If there is a relatively homogeneously distributed element in a clearly defined space, density is easy: amount of substance divided by amount of space. But it makes much less sense for a small, discrete set of elements in a poorly defined space. Consider this example of six cities.



There is not a uniform density, but an investigation of local density turns up the question of how to define the boundaries. One possibility is to draw in the perpendicular bisectors, like so:


The polygons indicate all the places in the space which are closest to each point. (This is a Voronoi diagram.) The numbers indicate the areas of each polygon. To the good, the polygon around D is the largest, reflecting that D is the least dense. On the other hand, neither E nor F is given "credit" for how close they are to each other. I think density at a point ought to be more influenced by the local density than by the global density -- small, isolated, dense clusters ought to be given a rating of "relatively dense." So for that reason, I abandoned areas. How one draws them makes an enormous difference, and the above scheme is the only non-arbitrary way I can see to draw them -- and it does not work.

That is why I started thinking about distances. The density at a point ought to be inversely proportional to the "average" distance a traveler must go to get to any other point, weighted more heavily for the shorter distances. This makes a lot of sense to me. A neighborhood is dense if a resident can walk to many shops, restaurants, and parks, even if it is a small, isolated pocket of the city; a neighborhood is relatively dense if a resident can drive less than 5 minutes to many shops, restaurants, and parks; and a neighborhood is not dense if the closest shops, restaurants, and parks are not close at all. Therefore, we must weight the closest amenities more heavily in the average, like this:



where d_ i_ j is the distance between two points, and p is a function based inversely on distance. I do not believe it works if p is a linear function, so I tried this for p:


where k is any number greater than the maximum length in the whole set. This formula results in very, very heavily weighting the shortest distance. A lot of other formulas could be used for p that do not weight the shortest distance quite so heavily. Here is what the results look like (avg. distances in blue, some actual path lengths marked in black):



A and B are the most dense (close proximity to each other), followed by E and F (again, local density), then largely trailed by D, and even farther, C. One surprising result is that A, which is farther from D, E, and F, and closer only to C, has a lower average distance than B. One way to think about it is that the farther distances of D, E, and F mean that A's average is more influenced by B's proximity (higher distances push down p, so that these longer distances count for very little in the average). In real terms, this might mean that someone living in city A is less likely to travel to cities D, E, and F than someone living in city B -- thus, city A's residents have a lower overall average trip. Another way to think about it is that, under this measure, density is increased by isolation: the self-contained bubble. If two small cities had equal density, but one was in the orbit of a major city, which is less dense? The orbit city residents are more likely to venture into the major city, so perhaps this should count as less dense. An interesting question is whether a function p can be created which does not have this effect.

* * *

Another, seemingly dissimilar problem, is finding a minimum enclosing circle for a set of points. The minimum enclosing circle (MEC) is the smallest-area circle, and by definition, it must go through either three of the points (forming an acute triangle) or two (on the diameter of the circle). There are a lot of algorithms to find the MEC, but here is one I have not seen written up before.

To start, find the centroid of the points (merely average the x-coordinates, and then the y-coordinates, to find the x- and y-coordinate of the centroid, respectively), like so:



Next, compare the distances from each point to the centroid:


Point F is the farthest (and is definitely on the MEC). So, move the centroid closer to F by weighting F more heavily in the centroid calculation (ah, this is why it was in the same post). Here is the centroid's movement:


It turns out that F only needs to go up to a weight of 1.7 before C is now the farthest point from the adjusted centroid (and therefore, C is definitely on the MEC). So, move the centroid closer to C. In this case, it takes only a slight tweak to C before A, C, and F are equidistant from the adjusted centroid (so now, these are the three points on the MEC, and the adjusted centroid is its center). However, in other cases, it takes a lot of bouncing back and forth between C and F until a third point ties. In essence, bouncing back and forth between C and F moves the adjusted centroid along the perpendicular bisector of C and F toward the midpoint of C and F:



Of course, the centroid does not have to reach the midpoint; it only needs to approach the midpoint until a third point is equidistant (in this case, point A). And here is the minimum enclosing circle:



It works.

No comments:

Post a Comment