Splines extraction on Kopf-Lischinski algorithm, part 1
If you don’t follow this blog and don’t know what is the Kopf-Lischinski algorithm, this small section is for you. Kopf-Lischinski algorithm is a vectorization algorithm specialized on pixel art images giving excellent results (see their supplementary material page). This “summer”, I’m working on bringing this algorithm to Inkscape in the form of a library. This post is like a “progress update status”.
Well, one of the phases in the algorithm is splines extraction. This phase creates a pipeline (output from one step is the input of the next) like the following:
- It takes the connectivity graph
- It generates a generalized Voronoi diagram
- It groups the voronoi cells to identify visible edges
- It generates the splines
In a future post, I’ll explaing a fast method (already implemented in libdepixelize) to compute the Voronoi diagram exploring special properties of the input graph. In this post, I’ll explain a fast method to add polygons together exploring special properties of the generated Voronoi diagram and the data layout used in libdepixelize.
Polygons can be represented by vertice points. libdepixelize store them in clockwise order, with the first point being part of the polygon’s northwest/top-left and all Voronoi cells generated in step 2 are convex polygons. Another important feature of the generated Voronoi diagram is that all “connected” cells share a common edge.
The algorithm can be described in 4 major steps:
- Find the largest common edge
- Remove the in-between points of the common edge
- Choose one polygon and refer to it as P (choose the smallest polygon for better performance)
- Shift P such as P’s head and tail are points of the common edge
- Choose the other and refer to it as Q
- Remove one of the common edge’s points in Q
- Replace the remaining point that is part of the common edge in Q by P’s points
The Voronoi cells are iterated one by one, line by line, from the left-most cell to the right-most cell. We check if (1) we already have a group of the current color, (2) the existing group has a common edge with current Voronoi cell and, then, (3) we add the current Voronoi cell to the existing polygon. Beware that the Voronoi cells from the input are always convex polygons, but the existing polygon (that groups the Voronoi cells) might be concave.
Let’s see an example of the algorithm. In the example, we are iterating in the second line and we found an existing grouping polygon with a matching color (the two entities are connected), then we must add them togheter. The image below represents the situation we are interested in:
Polygon A is represented by the list of points [1, 2, 3, 4, 5] (common edge’s points in underline). Polygon B is represented by the list of points [6, 7, 8, 9, 10] (common edge’s points in underline). Points 4 and 8 are equal and points 5 and 7 are equal. Polygon A is the grouping polygon while polygon B is the current Voronoi cell. We shift B’s list to [8, 9, 10, 6, 7] and use it to get the final polygon [1, 2, 3, 8, 9, 10, 6, 7].
Let’s do it one more time:
Polygon A is the grouping polygon, represented by [1, 2, 3, 8, 9, 10, 6, 7]. Polygon C is the current Voronoi cell and is represented by [11, 12, 13, 14]. This time the largest common edge have 3 points and point 8/11 is the in-between point, then we remove it. The resulting A polygon is [1, 2, 3, 9, 10, 6, 7] and the resulting C polygon is [12, 13, 14]. When we replace the common edge interval in A by C, we get the final polygon, [1, 2, 12, 13, 14, 10, 6, 7]. The final polygon is show in the image below:
One last note that might be useful in later steps is the access pattern used by the algorithm. When we add a voronoi cell to the grouping polygon, there is a hot area and a cold area. The cold area is the area where there will never be a common edge. These areas always are concetrated in the same places, like exemplified by the following image:
Do the previous step look simple? I want to keep things simple, but there may be additional info we may want to store for each node. This (polygon-union) is the last step where we still can gather locality info without executing a deep lookup.
Let’s refer to nodes where at most two different grouping polygons share a point as valence-2 nodes. There are valence-2 nodes, valence-3 nodes and valence-4 nodes. Valence-2 nodes are always smooth and valence-4 nodes never are smooth, but valence-3 nodes vary.
Most of the points are shared by nodes of different polygons and when we have three valence-3 nodes, exactly only one of them will be smooth. We apply Kopf-Lischinski algorithm heuristics to determine which one will be and store this info for later usage. I want to play more with the code to determine the best phase to compute this data, but I’m thinking about merging Voronoi generation and node type computing in one.
The complicated part about the splines extraction on Kopf-Lischinski algorithm is the overlapping between these last steps. I wasn’t able to get the code right before “put all the cards in the table”. Now I’ll code using this guide as a reference and I’ll provide the remaining algorithm in the next post.
A bit of performance
So, Kopf-Lischinski algorithm resembles a compiler. You have several data representation types and each type offers different operations. You explore the operations each type offers and convert it to another representation until you reach the final result. In several of the type representations used by the Kopf-Lischinski algorithm, you have a matrix and you access each element and its neighbours.
The particular implementation for libdepixelize stores and access the elements linearly like [1, 2, 3, 4, 5, 6, 7, 8, 9]. It could make good use of processor cache, but ain’t that easy. Suppose we are iterating on element 5, then we need to access all its neighbours, but only neighbours 4 and 6 may be in cache, especially in large images. This is the first problem in cache usage of the implementation, but we cannot remove this usage pattern, because it’s part of the algorithm and there is a data dependency among the elements.
What we can is reduce it, because this same pattern is used again and again over most of the passes. Another day I was reading about Xiph.org’s Daala codec and an interesting idea idea hit me: overlapping transforms. We can make each element store information about its neighbours that may not be in cache (1, 2, 3, 7, 8 and 9). This task would require more CPU time, but this cost would be mostly needed during the intialization of the first data representation only and it would be alleviated in the next passes. This change may increase memory requirements also, but the CPU would thank us for a major increase in cache hits.
Another problem with the current data access pattern is that each element store a complex object that may point to other regions of memory and add a level of indirection that can contribute to cache miss. One idea that can increase the cache hit is the one behind the Small String Optimization. This change would highly increase data locality and fits well in Kopf-Lischinski algorithm, because the complex objects stored by each element in every phase tends to have a small maximum number of subelements.
I need to combine the previous ideas and measure the impact. I have two computers and I can request access to the cluster in my university, but it’d be great to have a large database of pixel-art images ready to test. Another interesting project to use during the measurements is Stabilizer.
All that said, these are only optimizations exploring knowledge of computer architecture and I want to improve performance exploring better algorithms also. There are already a few of these, but Nathan Hurst pointed me a hint to research about Eigenvalues and I need to reserve some time to do it.
Oh, and I LOVE to work on projects where I can toy with my knowledge and projects where I can learn more and it’s so much fun to work in this project that I cannot share the joy I have right now with you guys, but I can tell this: This is the last section on this blog post, but is the first section I wrote and I only started to write the other sections after I finished this one.