No, I am not one of them :) Thanks for the reference! I am drawing my inspiration from Bailey, and more recently Ma et al.
They label an image line by line and merge the labels during the blanking period.
If you start merging labels while the image is still being processed, data can get lost when a label that has already been merged shows up again after the merge.
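To make the "label line by line, merge afterwards" idea concrete, here is a minimal software sketch of classic two-pass labelling with an equivalence table. It is only an illustration under my own assumptions (4-connectivity, a binary image as nested lists, the hypothetical function name `label_components`), not the hardware architecture from Bailey or Ma et al. The second pass stands in for the merge that the hardware does during the blanking period.

```python
def label_components(image):
    """Two-pass connected-component labelling, 4-connectivity (assumed).

    image: list of rows, each a list of 0/1 pixels.
    Returns (labels, count) with labels renumbered 1..count.
    """
    parent = [0]  # equivalence table; index 0 is the background

    def find(x):
        # Follow the table to the representative label (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    labels = [[0] * len(row) for row in image]

    # Pass 1: raster scan, one row at a time; only record equivalences.
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if not pixel:
                continue
            left = labels[y][x - 1] if x > 0 else 0
            up = labels[y - 1][x] if y > 0 else 0
            if left == 0 and up == 0:
                parent.append(len(parent))      # allocate a new label
                labels[y][x] = len(parent) - 1
            elif left and up:
                a, b = find(left), find(up)
                lo, hi = min(a, b), max(a, b)
                parent[hi] = lo                 # note that hi and lo are the same blob
                labels[y][x] = lo
            else:
                labels[y][x] = left or up

    # Pass 2: resolve the table only after the whole frame has been seen.
    roots = {}
    for y, row in enumerate(labels):
        for x, lab in enumerate(row):
            if lab:
                labels[y][x] = roots.setdefault(find(lab), len(roots) + 1)
    return labels, len(roots)


# A U shape gets two provisional labels that only meet on the last row.
u_shape = [[1, 0, 1],
           [1, 0, 1],
           [1, 1, 1]]
_, count = label_components(u_shape)
print(count)  # 1
```

The U shape is the awkward case: the two arms look like separate objects until the bottom row, so any per-label data has to be combined once the equivalence is known.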
The paper that you reference divides the image into regions, so that the merging can start earlier, because labels used in one region are independent of the other regions.
If it starts earlier, it also ends earlier, so that new data can be processed.
In my case, there is no need for such high performance, just a real-time requirement of 100 fps for 640x480 images, where CCL is used for feature extraction.
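As a rough sanity check on that requirement, the active pixel rate works out as below. This is back-of-the-envelope arithmetic only; it ignores blanking, so the real pixel clock of a given sensor will be somewhat higher, and the function name is just for illustration.

```python
WIDTH, HEIGHT, FPS = 640, 480, 100


def active_pixel_rate(width, height, fps):
    """Active pixels per second, ignoring horizontal/vertical blanking."""
    return width * height * fps


rate = active_pixel_rate(WIDTH, HEIGHT, FPS)
print(f"{rate / 1e6:.2f} Mpixel/s")  # 30.72 Mpixel/s
```

So a design that handles one pixel per clock needs a bit over 30 MHz, which is modest for an FPGA.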
The work by Bailey and his group is good enough, and the approach from the paper you reference can be implemented in the future, if there is need for more throughput!
My workflow is a lot different from the one that you describe.
I don't use any soft cores, and write everything in VHDL!
I have used soft cores before, but they were not really to my liking.
I missed the short feedback loop (my PC is a Mac and the synthesis tools run in a VM).
After trying out a couple of environments, I ended up using open source tools: GHDL for compiling and simulating the VHDL, and GTKWave for waveform inspection.
Usually, I start with a testbench that instantiates my empty design under test.
The testbench reads a test image that I draw in Photoshop.
It prints some debugging values, and the wave inspection helps to figure out what's going on.
If it works in the simulator, it usually works on the FPGA!
But the biggest advantage is that it takes just a few seconds to do all that.
I will give the softcore approach another chance once my deadline is over!
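For anyone who wants to script the test images instead of drawing them by hand, a small sketch like the following could generate them. Everything here is an assumption for illustration: the helper names `make_test_image` and `to_plain_pgm` are hypothetical, and plain-text PGM (P2) is chosen only because a text format is easy for a testbench to read; it is not necessarily the format used in the workflow above.

```python
def make_test_image(width, height, rects):
    """Binary image with filled rectangles given as (x0, y0, w, h)."""
    image = [[0] * width for _ in range(height)]
    for x0, y0, w, h in rects:
        for y in range(y0, min(y0 + h, height)):
            for x in range(x0, min(x0 + w, width)):
                image[y][x] = 1
    return image


def to_plain_pgm(image, maxval=1):
    """Serialise the image as plain-text PGM (magic number P2)."""
    height, width = len(image), len(image[0])
    lines = ["P2", f"{width} {height}", str(maxval)]
    lines += [" ".join(str(p) for p in row) for row in image]
    return "\n".join(lines) + "\n"


# Two separate blobs in an 8x4 image, so the expected label count is known.
image = make_test_image(8, 4, [(0, 0, 2, 2), (5, 1, 3, 2)])
with open("test_image.pgm", "w") as f:
    f.write(to_plain_pgm(image))
```

Because the rectangles are placed by the script, the expected number of components is known in advance, which makes the debug output easier to check.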
One quick note. Sometimes in image processing you can gain advantages by frame-buffering (to external SDR or DDR memory, not internal resources) and then operating on the data at many times the native video clock rate.
If your data is coming in at 13.5MHz and you can run your internal evaluation core at 500MHz there's a lot you can do that, all of a sudden, appears "magical".
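To put a number on that, the ratio of the two clocks gives the cycle budget per incoming pixel. This is a rough sketch using the figures from the post; it assumes the frame buffer can actually deliver data fast enough, which is a separate question.

```python
def cycles_per_pixel(core_hz, pixel_hz):
    """Core clock cycles available for each pixel arriving at pixel_hz."""
    return core_hz / pixel_hz


budget = cycles_per_pixel(500e6, 13.5e6)
print(f"{budget:.1f} cycles per pixel")  # 37.0 cycles per pixel
```

Roughly 37 cycles per pixel is enough for several passes over the buffered frame within one frame time.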