I’m currently enjoying the summer with a bit of research that I’ve been meaning to do for a long time. The topic is demosaicing, and I’ve had my mind set on trying neural networks for the task.
Again, I base my choice of area on an AviSynth plugin named ‘nnedi’ (short for Neural Network Edge Directed Interpolation), which was written by tritical, AKA Kevin Stone. It is a filter that doubles the height of the image with very impressive results. Unlike many of his other works, this one is not GPL, so there is no direct reference. Even though he wants to keep some aspects secret because of his research, he is very open and has already given me some pointers in the right direction.
My basic idea is to see whether something along the same lines can be used for Bayer grid demosaicing: you feed the local area around the pixel to be interpolated into a neural network, and train it to output the missing values.
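To make the idea a bit more concrete, here is a minimal sketch of how training data for such a network could be prepared. It is only an illustration under assumptions of my own: an RGGB Bayer layout, a 5x5 neighbourhood, and the hypothetical helper names `bayer_mosaic` and `training_pairs`. None of this is taken from nnedi. It simulates a Bayer mosaic from a full RGB image and, for every red site, pairs the surrounding mosaic values (the network input) with the true green and blue values at that pixel (the network target).

```python
import numpy as np

def bayer_mosaic(rgb):
    """Simulate an RGGB Bayer sensor (assumed layout): one colour sample per pixel."""
    h, w, _ = rgb.shape
    mosaic = np.zeros((h, w), dtype=rgb.dtype)
    mosaic[0::2, 0::2] = rgb[0::2, 0::2, 0]  # red sites
    mosaic[0::2, 1::2] = rgb[0::2, 1::2, 1]  # green sites on red rows
    mosaic[1::2, 0::2] = rgb[1::2, 0::2, 1]  # green sites on blue rows
    mosaic[1::2, 1::2] = rgb[1::2, 1::2, 2]  # blue sites
    return mosaic

def training_pairs(rgb, radius=2):
    """Build (inputs, targets) for the red sites of the mosaic.

    inputs:  flattened (2*radius+1)^2 mosaic neighbourhoods
    targets: the true green and blue values at the centre pixel
    """
    mosaic = bayer_mosaic(rgb)
    h, w = mosaic.shape
    inputs, targets = [], []
    # Red sites sit on even rows and columns; start at the first one
    # that has a full neighbourhood inside the image.
    start = radius + (radius % 2)
    for y in range(start, h - radius, 2):
        for x in range(start, w - radius, 2):
            patch = mosaic[y - radius:y + radius + 1, x - radius:x + radius + 1]
            inputs.append(patch.ravel())
            targets.append(rgb[y, x, 1:3])
    return np.array(inputs), np.array(targets)
```

The green and blue sites would need their own sets of pairs (and probably their own networks), since the pattern of known colours around the centre pixel differs for each site type.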
Right now I’m still in the research phase, trying to knock together some code just to see how well the theory maps to practice. I’ve already learned a lot, but starting from virtually nothing, that doesn’t say much! ;)
I’ll keep you posted when I have something more that works.
I’ve done a fair bit of work with machine learning as a CS student, specifically on classification. My first impression is that this sounds like the wrong tool for the job. What’s the motivation for using a machine learning approach as opposed to an analytical one?
It might very well be sub-optimal, but that’s why I’m doing the research :)
Anyway, for interpolating images (i.e. guessing “missing” values), it shows significant improvements over analytical approaches, since it’s able to detect longer-running patterns and create more natural-looking images.
I saw a very interesting article where they used traditional EEDI (Enhanced Edge Directed Interpolation) for demosaicing, with some pretty good results. Basically, it used EEDI for the green channel, and then used that result to interpolate the blue and red channels.
Since nnedi has by far the best results compared to the “known” analytical algorithms (MEDI, ICBI, iNEDI), I want to see how far I can get with that. Another bonus is that the NN could also be trained to output the surrounding missing colours.
But if all else fails, I can go back and look at some of the approaches above. :)
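For reference, the green-first scheme mentioned above (interpolate green, then use it to guide red and blue) can be sketched roughly as follows. This is not the method from the article: a plain neighbourhood average stands in for EEDI, the RGGB layout is an assumption, and the function names are made up for the illustration. The point is only the structure: green is filled in first, and red and blue are then interpolated as colour differences (R-G and B-G) with the green added back afterwards.

```python
import numpy as np

def box_sum3(a):
    """Sum over each 3x3 neighbourhood, with zero padding at the borders."""
    h, w = a.shape
    p = np.pad(a, 1)
    return sum(p[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3))

def fill_missing(values, mask):
    """Keep known pixels (mask == 1); fill the rest with the mean of the
    known pixels in their 3x3 neighbourhood."""
    total = box_sum3(values * mask)
    count = box_sum3(mask)
    filled = total / np.maximum(count, 1)  # guard against empty neighbourhoods
    return np.where(mask > 0, values, filled)

def green_first_demosaic(mosaic):
    """Demosaic an RGGB mosaic (assumed layout) in two steps."""
    h, w = mosaic.shape
    yy, xx = np.mgrid[0:h, 0:w]
    r_mask = ((yy % 2 == 0) & (xx % 2 == 0)).astype(float)
    b_mask = ((yy % 2 == 1) & (xx % 2 == 1)).astype(float)
    g_mask = 1.0 - r_mask - b_mask

    # Step 1: interpolate the green channel (simple average standing in for EEDI).
    green = fill_missing(mosaic, g_mask)

    # Step 2: interpolate the colour differences, then add green back.
    diff = mosaic - green
    red = green + fill_missing(diff, r_mask)
    blue = green + fill_missing(diff, b_mask)
    return np.stack([red, green, blue], axis=-1)
```

Swapping the averaging in step 1 for an edge-directed or neural network interpolator would leave step 2 unchanged, which is what makes this split attractive.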
Hmm… it sounds a bit familiar from the world of databases where machine learning approaches are often used in complex query planning. Such solutions are rarely optimal, but they’re good enough that spending an eternity on the query planning itself certainly isn’t worth it. :-)
Anyway, this strikes me as the type of problem for which the best solutions will be analytical, but as lazy programmers, we might be excused for just letting the computer do the thinking for us. :-)