I’ve invested some effort to improve libdepixelize response to evil patterns. Because libdepixelize won’t discard color information, the connections of the similarity graph aren’t a transitivity relation and extra topological patterns can happen. I refer to the extra patterns as evil patterns. The name comes from the fact that I enjoy to play Zelda and defeat the evil from the Hyrule land. Same happened in the last libdepixelize’s commits, where I overcame some patterns to improve the output quality. Quality can be something subjective sometimes, then I was a little conservative and limited my changes to things that I could reason about. The effort end up in new samples to the libdepixelize documentation, new rules for the algorithm and its related documentation and lines of code in libdepixelize itself.
The new rules are added as an extra step on the process. They are not treated like the Kopf-Lischinski’s heuristics to resolve crossing connections. I think maybe would be possible to get some of the new rules and describe versions that are more general and can be added to the “container” that holds the old heuristics. To make this happen, I’d need to define the concept of “similar color” as a set and operations on top of the set, the notion of interval and other things and logical reasoning on top of all that. A bit of mathematical work to improve the quality a little more, but I wanna to investigate the use of La*b* colors (an old suggestion by Nathan) to replace the current concept of “similar colors”. I didn’t replaced the current space color until now, because YUV and La*b* behave differently and I couldn’t just convert the YUV constants that define the boundary between dissimilar colors to La*b* to achieve better results. The current YUV constants were taken from HQx filter and I need to define a methodology to find new constants.
The advantage of a rule that is more general and unifies behaviour is just the beauty of a simpler rule that handles the real problem, as opposed to several branches that are consequence of the real problem. It’d abstract the nature of the problem better. It’d make the code simpler. It’d handle more branches. Being a single rule that affect more branches, it’d be easier to test and better at convincing me that the improvement is real and there will be no loss of quality in other images.
It’d be interesting to investigate the range of voting in all heuristics and try to come up with “fair” multipliers/strength.
New idea to represent splines
Previously I used a technique to carefully insert new nodes to obey the technique “adjust splines” from the Kopf-Lischinski paper while being limited by the old splines representation. This technique has caused troubles to several later steps.
The first problem was to remove the extra invisible nodes that are present in the output SVG. These extra node not only make the output larger and make rendering consume more CPU time, but also can be troublesome for artists wanting to manurally edit the generated image. I’ve made several unsuccessful attempts to remove these nodes and I’m sure it’s possible, but I’ll try to move away from this problem trying a completely different approach to the problem.
The second problem is similar, in essence, to the first one, described in the paragraph above. When I originally disabled the optimization step in the Inkscape GUI and marked it as experimental, one of the reasons was because it required “extra preprocessing steps” (it was a pretty vague explanation, but I should try to improve my communication skills). With no extra invisible points, the optimization step will be way simpler. The step related to optimization that avoid overlapping shapes and holes will partly go away and the approach I mentioned previously (“A new idea to keep the shape of optimized splines correct“) will be affected.
The idea is to split splines. I was previously using a list of points. The new idea is to use a list of splines, where spline itself is a list of points. I hope that the new representation will allow a representation closer to “arbitrary”, just enough to apply the operation “adjust splines”. The new representation should do without extra points and easy the last processing steps (in terms of processing power and my productivity). Of course this change requires a lot of refactoring and will take a bit of time to be finished. Also, the “hope” word used previously means that I haven’t thought about all implications and I’m sharing this idea very early. I’m trying to improve my communication skills and this means that you’ll see some “flood” on this blog. The “flood” might include ideas that eventually prove to be bad at a later point and doesn’t hit the libdepixelize code.
First, the software is free software, then anyone can use it (even the development versions). So, if you find a crash or something that obviously is a bug, fill a bug report and I’ll fix it.
Second, I still want to improve the quality of the algorithm, then a good pixel art database can help me a lot. Currently the algorithm behaves bad at images with too many gradients, but this doesn’t mean that a lot of images with gradients will help me to improve the quality of the algorithm. I need to publish a page explaining what kind of image can help me to improve the quality of the algorithm.
Third, you can help me with profiling info to improve the performance of the algorithm. I’ll probably send a message to the Inkscape mailing list with the title “request for beta testers/profiling data on libdepixelize”. I’ll probably (re-)share the message through Google+, this blog and maybe more. If you follow any of these communication channels, you’ll know when I need help.
Thanks for the support.
The project doesn’t have a logo and uses an ugly icon in the Inkscape GUI dialog. I was thinking about use the character introduced by Jabier on the Inkscape mailing list to represent libdepixelize project:
This image already is highlighted on the algorithmic documentation report anyway and is unique enough.
I’ll have a test at the end of the week and later I’ll share more of my time to play with POWER and LLVM. Then libdepixelize will have to wait a little until I can do more commits.
I’m aiming to deliver all the improvements related to the original Kopf-Lischinski algorithm before Inkscape 0.91. Later I’ll try to improve performance. “Beyond Kopf-Lischinski” extensions have no timeline and depend on creativity.
Lots of months without updates, I thought it was about time to post something.
Interestingly, this is the post that document the slowest progress, but it was the post that consumed more from my time.
The frogatto game database
I found an awesome pixel art database distributed with the Frogatto game and from now on I’ll use it on my research and tests. The license is kind of confuse (you don’t know precisely all the things you can do) and I’d prefer a Creative Commons license, but it’s safer (for me) than use copyrighted graphics from Nintendo or other companies. Not only the license is better, but also the beauty of these graphics is outstanding.
Most of the images don’t have an Alpha channel and use a placeholder color as the removable background, but there are some images where real use of Alpha channel (not only on-off) is there.
I want also to add that I liked the Frogatto game so much that I was thinking about the possibility to join forces with their developers to provide a hi-rez version of the game using the Kopf-Lischinski algorithm. Maybe I can borrow some processing power from some cluster at my university to generate the SVG files.
The Alpha channel heuristic
If you’re unfamiliar with Kopf-Lischinski heuristics, I suggest you to read the “trace pixelart” manual bundled with the very (bzr) latest version of Inkscape.
The frogatto game pixel art database made me realize one simple fact: The alpha channel heuristic to resolve similarity graph ambiguities that I was planning to develop is pointless for most of the cases, because the extra color patch will work against the heuristic. The extra color patch will see different colors and will create square-like pixels. The below image from the Frogatto database (with a red background inserted) sumarize this problem:
Do you know how a good a conversion from current libdepixelize would be? Well, it would generate a similar image as output, but a an alpha channel heuristic wouldn’t help. Also, there aren’t really any cross-connections to resolve here (because the alpha channel + extra color patch turn each one of the white pixels into different colors/shapes). Even if there was an ambiguity in the similarity graph, I don’t see how an alpha channel heuristic would help (mainly because the lack of problem hurts my imagination).
There is, although, one case where I see an improvement/extra-safety-guarantee that could be achieved through an alpha channel heuristic. Look at the magnified image below:
The image is ugly and the reason is because I created it. It’s so ugly that maybe you don’t understand what drawing I tried to achive here, then I need to explain. The drawing is a part of a chain. There were chain images on the Frogatto database, but they weren’t affected by the issue I wanted to mention.
The image squares are pixels and the image was magnified 12x. The Alpha channel information is still there. The other heuristhics (long curve and sparse pixels) will likely vote to keep the chain shape, then the result is already good and there is no need for an Alpha channel heuristic. Also, it’s possible (but very unlikely) that the transparent color has random bits that will create no connections and won’t affect the chain shape.
Well, pretty much an Alpha channel heuristic is useless, but kind of nice to have as an extra safety over the other heuristics. But I won’t create this extra safety guarantee, because the above image is artificial (you won’t find it in any game) and I haven’t find a good Alpha channel database affected by this issue yet. If I do find such database, I can reconsider this issue, because even if the heuristic go wrong for some images, the libdepixelize design allow you to disable or invert any heuristic (neat feature).
Just to be clear for once… I won’t waste my time looking for a pixel art database making good use of alpha channels affected by this issue. I don’t even bother anymore. But… if you do find such database, just share it with me and I’ll see what I can do. In fact, this is one of the changes in libdepixelize where I don’t need to invest much effort or thought (apart from the pixel art database research).
A new idea to keep the shape of optimized splines correct
One of the things that I’m really failing to achieve is to keep the shape of optimized splines correct. I had a new idea that I want to test soon and I’ll describe in the text below.
So, the idea is to create an index of all points that share the same position before the optimization begins. Every time a point is optimized, the position for all points sharing that position change. The problem is, among others, to keep invisible lines invisible.
The approach is simple, but it’s a bunch of code to do. Stay tuned.
A new competitive filter
The post is getting kind of big, then I’ll leave analysis of one possibly interesting algorithm for later.
Correctness (C++ memory-wise) tests
This was a task originally suggested by Nathan. I think he meant to use Valgrind, but I went on a different direction and I went on LLVM suite (clang sanitizers) instead.
First test is simple and it didn’t show much. This was done using clang static analyzer tool and theoretically can be as good as the compiler warnings. The output was “No bugs found.”.
Second test was the combo address sanitizer+leak detector, also using LLVM tools. I fixed some bugs regarding the use of the popt command line arguments parsing library and suppressed another warning (source outside of libdepixelize cannot be fixed within libdepixelize). Looks like the use of RAII helped to keep libdepixelize free of memory leaks and the use of standard containers instead self-made structures helped to keep libdepixelize free of wrong-address errors (although I do some weird pointer arithmetic at some places to try to improve the performance, but the analyzer hasn’t spotted errors).
Third test was the memory sanitizer, that would help to spot the use of non-initialized variables, but this sanitizer aborts on the first error and due to an error within popt I couldn’t go farther. I tried to use the suggested attributes and also the blacklist file, but it didn’t help. In short, I’d need to rewrite the binary without popt to inspect libdepixelize code through the memory sanitizer, then I’m postponing this task. At least the next planned big refactor is more sensible to addresses, which has a sanitizer working fine.
I tested all output modes (voronoi, grouped voronoi, non-smooth and default). The set of images used was chosen to increase the test of code paths executed, including the images that are used as examples in the original paper to describe the heuristics and smoothing techniques and some big images taken from real-world games just to stress the code. Something better would be some unit tests. This set of image was used for all tests described in this post.
This was also a task originally suggested by Nathan. The task was not to find the slowest element in the processing pipeline, because maybe the element is the slowest, because indeed the task is tough and it may be doing the best possible. The task was to find which processing element wasn’t scaling linearly with the size of the input and improve it.
I haven’t measured memory consumption like old Rasterman told us, but memory consumption isn’t my focus with libdepixelize. I use a templated argument to customize the precision and you can “freely” (actually it’s limited by 2geom) use any type (float, double, long double, …) you think fits your application the best. I try to avoid cache misuse and too many allocations and memory usage is just a consequence of this design principle. Thus, don’t expect any changes focused on memory consumption. Memory usage may receive some love, but just as a mean to achieve better performance. Also, this isn’t the kind of application you hope to keep around and you just want to see it finishing fast. A last thought is that structuring packing could decrease memory usage without affecting the algorithm, which can improve the use of the cache and the performance.
You can find the result below (libdepixelize profiling output + zsh + python tricks) and the method at the end of this section. The image was generated using R.
And below, you have a plot of the same data using a logarithmic scale to move away the curves from 0.
I’m sorry for being that n00b, incapable of producing a legend that don’t cover parts of the curves.
Now it’s clear what processing units I need to investigate to improve the performance.
Improving cache usage
One of the ideas to improve the performance was to use a cache oblivious algorithm. This means that I should use a different memory layout to make better use of cache. I wanna know what kind of improvement I should expect and I did a small test converting a large image to a single row and did some measurements (you can find a code snippet below as a proof), but the libdepixelize’s code take different paths and the comparassion is very unfair (to not say useless).
I will do a second approach mentioned in this blog post to measure cache misuse. I’ll have to implement the new memory layout and compare the results to be sure about the improvement. Before comparing the results, I’ll have to rerun the old tests, because my computer will be using newer libraries and newer compilers and the comparassion wouldn’t be fair.
Let’s define a run as the execution of the script below (given that libdepixelize was configured with the option OUTPUT_PROFILE_INFO enabled).
I did 3 runs and aggregated all common values per image (even equal steps in different output modes, like “Tracer::Kopf2011::_disconnect_neighbors_with_dissimilar_colors(Tracer::PixelGraph)”, with exception of “Tracer::Splines construction”, which would be unfair). I discarded the largest (the slowest) value for each field per image, because I was considering it as cold cache effect and it’s more difficult to measure/replicate this condition (I’d need to learn how to clear the cache without rebooting the PC), then I’m more interested in hot cache condition. Then, I computed the arithmetic mean. Now I have a table of values to print for a set of images. I’d like to compare the time taken by some processing element (this is how to interpret each value) as the “difficult” increases. The difficult was related to each image and to rank them I just sorted them by number of pixels.
And in case you want aggregate data like I did, I’m sorry, but I cannot help you. I used at least 3 technologies to do interactive exploration (ZSH + BASH + Python) and I cannot provide scripts that don’t exist to you. And about the other tools used, they were used just because I know how to use them (accidental vs planned) and the communication among them was rather unusual (clipboard text + simple files + json formatted files + MongoDB + CSV). The description of the method given above is just better than the tools by themself. Accept my apologies in the form of an AA cow:
I ran all the tests on this machine and I kept myself from using this machine while the tests were being executed. Yes, not a very “scientific” approach, but I think it’s on the right track and I need to reserve some time to development. The purpose is just to identify the “worst” processing element anyway.
All posts related to the Kopf-Lischinski on this blog were collected, updated and extended in the “libdepixelize algorithmic documentation”. If you have any doubts about this algorithm, use it as a guide: http://bazaar.launchpad.net/~inkscape.dev/libdepixelize/trunk/view/head:/the%20algorithmic%20documentation%20report/index.pdf
Posts that were grouped in this pdf:
Before vote and choose any option, know that this ain’t democracy and I don’t promise to follow the winner, but I’m interested in the popular option because I care. Also, before vote anything, read the rest of this post to understand the consequences of each choice.
How does the full Kopf-Lischinski algorithm looks like? Just like the image below:
Like it? Me too. How close is libdepixelize of getting this result? This depends on what libdepixelize will aim to, but let’s skip this discussion for later. First, the below image has an ultra-summarized answer to this question:
Like it? Now look some images generated by libdepixelize for the same input used to generate the image above:
The output from the left is buggy by Kopf-Lischinski and my own standard. If this result were correct, then it should group these two gray blocks in a single block and I should code a bit more to make sure this happen.
The output from the right is “innovative” and I’m coding since the beginning to preserve as much color information as possible and I don’t need to fix anything, but I need to extend Kopf-Lischinski algorithm to handle the “unfilled” space between two (non-)grouping blocks. The result wouldn’t be Kopf-Lischinski, but it’d preserve more color information and if Kopf-Lischinski algorithm is ever going to be patented, we’d have a different algorithm (partially patent-free).
Now let me explain the technical details using simple and summarized language. Kopf-Lischinski algorithm uses a similarity graph to group blocks of similar color aiming two results:
- Define a new shape to the image (and I very much like the result)
- Remove colors (and I very dislike this choice)
The algorithm remove color information to simplify the vectorization process, but I came all this way without being limited by its “not that much increased complexity” and I think I can extend the algorithm to include descriptions of how to handle more color information in these final steps.
If you think the “non-enhanced” version would be easy to implement, it’s because you haven’t thought about any image where a gradient is present. Close blocks of colors would have similar colors, but distant blocks would have different colors anyway, so there is not an easy way to choose which blocks should be grouped. Most of the ideas I thought about (since the beginning of the project) would have different results depending on the “iteration order” and as such I consider this a too “fragile”/unstable algorithm. The only “correct” solutions I can think of at the moment are too expensive and based on the benchmark present in the paper, I doubt them were chosen.
Keep in mind that any path I choose will bring me code to produce.
This is a sequel of a series of posts. The last one was “part 0” and this order is a bit insane, but nothing to worry about. Check the first one to understand what all this is about. I hope this will be the final post in this series.
Once upon a time… a polygon with holes appeared
The polygon-union described in the previous posts works correctly and there are no better answers than the correct answer, but these polygons were meant to be B-Splines and when I described them, I already knew that the mentioned algorithm would be insufficient.
Consider the following image.
The left polygon is the polygon that you see before and/or after the polygon-union. They are visually indistinguishable. The right polygon has an outline to allow you understand its internal representation. You can understand how the polygon-union generates a polygon with a “hole” seeing the right polygon.
If the polygon-union don’t generate visually identifiable pixels, then there shouldn’t exist any problem, but when we convert the polygon to a B-Spline, some areas of the image won’t be filled. The polygons won’t fit correctly, like show in the image below.
The solution is to create some holes. With the representation of polygons used in libdepixelize, this task is very simple and key operation to accomplish the goal is to encounter the “edges that the polygon share with itself”. I’ll explain this task further in the following paragraphs.
The above image has 2 pairs of hidden points, pair <5, 6>, that is equal to pair <11, 10>, and pair <12, 13>, that is equal to pair <18, 17>. Well, to extract internal polygons that represent the holes, you just iterate over the points and, for each point, try to find another point that is equal to it, then get the internal sequence and use it to construct the internal polygon.
- Remark #1: libdepixelize will do these steps in backward order, to avoid too much moves of elements in the internal vector.
- Remark #2: The polygon is represented in clockwise order, but the holes will be represented in counter-clockwise order, but there is no reason to normalize. You can safely ignore this (de)normalization feature.
I forgot to mention that edges not always are two points only and you should use the same “largest common edge” from the algorithm of previous posts.
Things became more complicated than I predicted and now involve some recursive functions and I needed to extend (extend != change) the algorithm. To see the whole extension, check out the Splines::_fill_holes function in libdepixelize source code and where this function is used.
Things became even more complicated than the complicated things that I didn’t anticipate. Complicated code to fix the position of T-junction nodes created a pattern where the polygon itself shared a point with itself and this pattern propagated wrongly in the “holes-detection” algorithm and afterwards.
The algorithm “3rd ed.” check if the common edge has “lenth” 1 to solve the problem.
And then, the polygon were meant to be a B-Spline
It’s a series about splines extraction and it’s fair to end with this step.
The algorithm is very simple. The points you get through the polygon-union algorithm in the previous posts will be the control points of quadratic Bézier curves and the interpolating/ending points of each Bézier curve will be the middle points between two adjacent control points. This is the special case where all points are smooth. The image below can help you get the idea.
If you have non-smooth points, then all you need to do is interpret them as interpolating points, instead of control points. You’ll have three interpolating points that are part of two straight lines. There is a bit more of theory to understand why the generated B-Spline is smooth, but we’re ALMOST done! In next post (a small one, I hope), I’ll detail how to get correctly positioned points for T-junctions.
This post is a continuation of “Splines extraction on Kopf-Lischinski algorithm, part 1“. The title contains “part 0”, because the algorithm step described here should be executed before the algorithm step described in “part 1”.
This blog post is kind of unstructured, but don’t worry, because I’m aware of that and is intentional.
Generalized Voronoi diagram generation
The algorithm described in this section was kind of documented already, but the documentation wasn’t good enough to be part of a post, then it was keep as a private home-made PDF.
Well, a Voronoi diagram is a black box where you put some points (the seeds) and you get some polygons (the cells). Each polygon contains all points that are closer to its seed than any other seed. There is a good article on Wikipédia and I won’t explain any further.
Kopf-Lischinski algorithm executes a bunch of operations on a graph and it uses a Voronoi diagrams to extract a visual representation from this graph. The simplest form of a Voronoi diagram works with 2D points-seeds, but we have higher-dimentional Voronoi diagrams, Voronoi diagrams using different distance functions and even Voronoi diagrams using complex non-points seeds. We are interested in these Voronoi diagrams using complex non-points seeds.
The below image has a representation of the output graph of the Kopf-Lischinski algorithm and its Voronoi diagram representation. The graph nodes are represented by circles, where nodes with the same color are similar and usually are connected. The connections of the nodes are represented by green lines.
The generalized Voronoi diagram is visual, then there is no special meaning to explain it in the previous image. The seeds of this diagram aren’t points, they are line segments. You just need to break a green line in two and add each half to its containing cell. You can note that some polygons aren’t convex.
The graph used as input has certain constraints that enable us to use some simple and fast operations instead of a full-featured and complex algorithm.
If a Voronoi cell is a polygon containing all points that are closer to its seed than any other seed, we can determine the edge of a Voronoi cell by the midpoint of two adjacent seeds. If we generate a vertex for each of its 8 directions, we will get an accurate Voronoi diagram.
We can get a simplified Voronoi diagram by forgetting about the top, bottom, left and right vertices (if we just generate the diagonal vertices). The simplified version doesn’t contain concave polygons.
The act of generating diagonal vertices is more complex than the act of generating the other vertices. We need to check if there is a connection with the other cell and, if this connection exists, generate two vertices. If the connection doesn’t exist, we generate a single vertex, but its position depends on the existence of a connection between its neighbors. Look the following figure.
All information we need to generate the Voronoi diagram is located within its neighbors and the only extra tool we need to generate the points is the midpoint procedure. This is old news and it was already implemented in libdepixelize.
When we generate B-Splines using Kopf-Lischinski algorithm we need a way to separate points that create sharp corners and smooth points. The Kopf-Lischinski algorithm has techniques just to handle this issue. In libdepixelize, I decided to merge the point smoothness computation and Voronoi generation in one single phase. This is the phase where we can gather lot of info about the graph and we can propagate the smoothness data to later phases.
One note about the algorithm of this “metadata extraction” section is that some points will disappear during polygon union and we don’t care if the metadata about these points are accurate, then we can simplify the algorithm exploring this property. The below image will be citated all time during this time to how explain this algorithm:
There are two types of vertices generated in this special Voronoi diagram:
- White type: The type of the one labelled with “#1”. Node “#1” is contained by three cells, one purple and two yellows. The two yellow cells will be merged during polygon-union. There are two possibilities for the remaining cell in this kind of node:
- When it has the same color. Then the node will disappear during polygon-union and we don’t care about its “smoothness” attribute.
- When it has a different color. Then we say it is a valence-2 node.
- Cyan type: The type of the one labelled with “#2”. This type of node can appear in the border of the image and isn’t smooth or in the center of 4 cells and its smoothness isn’t predictable in advance. If it appears in center of four cells, then:
- It can be in the middle of 4-connected cells and we don’t care about its “smoothness” attribute, because this node will disappear.
- It can be in the middle of a valence-2 node and will be smooth.
- It can be a valence-3 node and things start to get complex. After the polygon union, this node will be part of three different polygon and only two of these three nodes will share the same value for the “smoothness” attribute.
Problem explained, it’s time to algorithm! The algorithm is kind of repetitive (if we iterate the bottom-right node, compare with bottom and right nodes, then do all it again, but use nodes of different directions…), but the principle is the same for all “repetitive” code, then I’m only going to put the important parts and you’ll be able to see the full algorithm implemented in libdepixelize within a few hours. Lastly, this isn’t the first algorithm I came up with and I made a lot of iteractions already (lots of crumpled paper on my room’s floor), but I think the result can perform pretty fast and I’m content. This post is kind of a documentation and it will help me to not forget what I wanted to do in the code.
The above image represents the analysed data in the following code example, except for the fact that we don’t know what are the colors of the cells. We are iterating on the middle point of the image and the current iterated cell is cell A. The algorithm also uses the concept of “shading edge” and “contour edge” described in Kopf-Lischinski algorithm paper.
All images in this post were created using Inkscape and its awesome alignment tools.
If you don’t follow this blog and don’t know what is the Kopf-Lischinski algorithm, this small section is for you. Kopf-Lischinski algorithm is a vectorization algorithm specialized on pixel art images giving excellent results (see their supplementary material page). This “summer”, I’m working on bringing this algorithm to Inkscape in the form of a library. This post is like a “progress update status”.
Well, one of the phases in the algorithm is splines extraction. This phase creates a pipeline (output from one step is the input of the next) like the following:
- It takes the connectivity graph
- It generates a generalized Voronoi diagram
- It groups the voronoi cells to identify visible edges
- It generates the splines
In a future post, I’ll explaing a fast method (already implemented in libdepixelize) to compute the Voronoi diagram exploring special properties of the input graph. In this post, I’ll explain a fast method to add polygons together exploring special properties of the generated Voronoi diagram and the data layout used in libdepixelize.
Polygons can be represented by vertice points. libdepixelize store them in clockwise order, with the first point being part of the polygon’s northwest/top-left and all Voronoi cells generated in step 2 are convex polygons. Another important feature of the generated Voronoi diagram is that all “connected” cells share a common edge.
The algorithm can be described in 4 major steps:
- Find the largest common edge
- Remove the in-between points of the common edge
- Choose one polygon and refer to it as P (choose the smallest polygon for better performance)
- Shift P such as P’s head and tail are points of the common edge
- Choose the other and refer to it as Q
- Remove one of the common edge’s points in Q
- Replace the remaining point that is part of the common edge in Q by P’s points
The Voronoi cells are iterated one by one, line by line, from the left-most cell to the right-most cell. We check if (1) we already have a group of the current color, (2) the existing group has a common edge with current Voronoi cell and, then, (3) we add the current Voronoi cell to the existing polygon. Beware that the Voronoi cells from the input are always convex polygons, but the existing polygon (that groups the Voronoi cells) might be concave.
Let’s see an example of the algorithm. In the example, we are iterating in the second line and we found an existing grouping polygon with a matching color (the two entities are connected), then we must add them togheter. The image below represents the situation we are interested in:
Polygon A is represented by the list of points [1, 2, 3, 4, 5] (common edge’s points in underline). Polygon B is represented by the list of points [6, 7, 8, 9, 10] (common edge’s points in underline). Points 4 and 8 are equal and points 5 and 7 are equal. Polygon A is the grouping polygon while polygon B is the current Voronoi cell. We shift B’s list to [8, 9, 10, 6, 7] and use it to get the final polygon [1, 2, 3, 8, 9, 10, 6, 7].
Let’s do it one more time:
Polygon A is the grouping polygon, represented by [1, 2, 3, 8, 9, 10, 6, 7]. Polygon C is the current Voronoi cell and is represented by [11, 12, 13, 14]. This time the largest common edge have 3 points and point 8/11 is the in-between point, then we remove it. The resulting A polygon is [1, 2, 3, 9, 10, 6, 7] and the resulting C polygon is [12, 13, 14]. When we replace the common edge interval in A by C, we get the final polygon, [1, 2, 12, 13, 14, 10, 6, 7]. The final polygon is show in the image below:
One last note that might be useful in later steps is the access pattern used by the algorithm. When we add a voronoi cell to the grouping polygon, there is a hot area and a cold area. The cold area is the area where there will never be a common edge. These areas always are concetrated in the same places, like exemplified by the following image:
Do the previous step look simple? I want to keep things simple, but there may be additional info we may want to store for each node. This (polygon-union) is the last step where we still can gather locality info without executing a deep lookup.
Let’s refer to nodes where at most two different grouping polygons share a point as valence-2 nodes. There are valence-2 nodes, valence-3 nodes and valence-4 nodes. Valence-2 nodes are always smooth and valence-4 nodes never are smooth, but valence-3 nodes vary.
Most of the points are shared by nodes of different polygons and when we have three valence-3 nodes, exactly only one of them will be smooth. We apply Kopf-Lischinski algorithm heuristics to determine which one will be and store this info for later usage. I want to play more with the code to determine the best phase to compute this data, but I’m thinking about merging Voronoi generation and node type computing in one.
The complicated part about the splines extraction on Kopf-Lischinski algorithm is the overlapping between these last steps. I wasn’t able to get the code right before “put all the cards in the table”. Now I’ll code using this guide as a reference and I’ll provide the remaining algorithm in the next post.
A bit of performance
So, Kopf-Lischinski algorithm resembles a compiler. You have several data representation types and each type offers different operations. You explore the operations each type offers and convert it to another representation until you reach the final result. In several of the type representations used by the Kopf-Lischinski algorithm, you have a matrix and you access each element and its neighbours.
The particular implementation for libdepixelize stores and access the elements linearly like [1, 2, 3, 4, 5, 6, 7, 8, 9]. It could make good use of processor cache, but ain’t that easy. Suppose we are iterating on element 5, then we need to access all its neighbours, but only neighbours 4 and 6 may be in cache, especially in large images. This is the first problem in cache usage of the implementation, but we cannot remove this usage pattern, because it’s part of the algorithm and there is a data dependency among the elements.
What we can is reduce it, because this same pattern is used again and again over most of the passes. Another day I was reading about Xiph.org’s Daala codec and an interesting idea idea hit me: overlapping transforms. We can make each element store information about its neighbours that may not be in cache (1, 2, 3, 7, 8 and 9). This task would require more CPU time, but this cost would be mostly needed during the intialization of the first data representation only and it would be alleviated in the next passes. This change may increase memory requirements also, but the CPU would thank us for a major increase in cache hits.
Another problem with the current data access pattern is that each element store a complex object that may point to other regions of memory and add a level of indirection that can contribute to cache miss. One idea that can increase the cache hit is the one behind the Small String Optimization. This change would highly increase data locality and fits well in Kopf-Lischinski algorithm, because the complex objects stored by each element in every phase tends to have a small maximum number of subelements.
I need to combine the previous ideas and measure the impact. I have two computers and I can request access to the cluster in my university, but it’d be great to have a large database of pixel-art images ready to test. Another interesting project to use during the measurements is Stabilizer.
All that said, these are only optimizations exploring knowledge of computer architecture and I want to improve performance exploring better algorithms also. There are already a few of these, but Nathan Hurst pointed me a hint to research about Eigenvalues and I need to reserve some time to do it.
Oh, and I LOVE to work on projects where I can toy with my knowledge and projects where I can learn more and it’s so much fun to work in this project that I cannot share the joy I have right now with you guys, but I can tell this: This is the last section on this blog post, but is the first section I wrote and I only started to write the other sections after I finished this one.