
Bit of coding and my 383 is now an integrating video camera :D


NickK


Sign language in the video as well Nick... brilliant. Lol, only kidding, great work. What's next to keep you busy? ...Davy

*pout* Sandrine said almost the same thing.. and laughed at your comment!

I have an idea.. *evil grin*


2013-08-07 20:45:25.388 ExampleApplication[15092:303] Changing live view to ART-11002 (fd120000)

2013-08-07 20:45:25.489 ExampleApplication[15092:303] Testing cam with testExposure

2013-08-07 20:45:51.906 ExampleApplication[15092:2203] image w:4007, h:2671, stack w:4007 h:2671

...

2013-08-07 20:48:16.993 ExampleApplication[15092:18707] Processing subsequent image..

2013-08-07 20:48:20.789 ExampleApplication[15092:18707] translate source (0,0) 4007x2670, dest 0,0

Hmm.. playing a little - 4 seconds to align, translate and stack the 22MB frame from the 11000, so not much different from the 17MB 383L. More testing on that required.

I used the test camera (i.e. the option in the example app) and profiled the code - the bottleneck is OS X dispatch for the small frames the ATIK Titan produces, so I may decide to make a fast-and-furious mode for the Titan.


  • 8 months later...

Hmm, well, I have been thinking about this more.. it gets a bit more difficult when you're doing things in parallel.

There's an interesting technique that I think will work for star detection (not that the FFT registration uses it) and for deconvolution. Part of Einstein's relativity uses folds in space; I think it may be possible to use this for detecting stars more reliably.. still researching…


Ok, so.. I'm starting to feel the pressure from Paul's sterling work ;):p

I have the OpenGL rendering (so the display is lightning fast) and the OpenCL pipeline feeding into it working with a static stack already.. no big surprise there. I've just laid out the design on paper for dark/flat/debayer/FFT registration/curve stretching/stacking, plus an optic-flow-based turbulence filter. Unlike CPU-based processing, OpenCL requires a little planning in terms of memory, as you have about 1GB of GPU space for all of the images, temporary data sets etc. and finite processing power..

I think I have come to a nice compromise: the larger cameras are pushing the GPU's 4096x4096 texture limit to the maximum, but high speed cameras provide the opportunity for super resolution and other such mechanisms.
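
As a back-of-envelope check on that memory budget, here's a small Python sketch (the buffer counts are illustrative assumptions, not the app's actual layout) showing why a 4096x4096 LRGB float frame eats a quarter of a 1GB card:

# Back-of-envelope GPU memory budget for the pipeline described above.
# Illustrative only: the resident-buffer count is an assumption, not the real design.

def frame_bytes(width, height, channels=4, bytes_per_sample=4):
    """Size of one frame stored as floating-point texture data (LRGB = 4 channels)."""
    return width * height * channels * bytes_per_sample

GPU_BUDGET = 1 * 1024**3  # roughly 1 GB of GPU memory

for w, h in [(659, 492), (2048, 2048), (4007, 2671), (4096, 4096)]:
    fb = frame_bytes(w, h)
    # assume the incoming frame, an FFT workspace, the running stack and a render
    # buffer all need to be resident at once (an assumption for illustration)
    resident = 4 * fb
    print(f"{w}x{h}: {fb / 2**20:6.1f} MB per frame, "
          f"~{resident / 2**20:6.1f} MB resident, "
          f"{GPU_BUDGET // fb} whole frames fit in 1 GB")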

In terms of the star detection, there's some hideous maths for Ricci tensors (this was the General Relativity bit) etc. that I have a tentative grip on, and thinking more about the PSF being both a PSF vector field and a probability makes me think that's not the way to go - the reality here is that the topmost point of a star is not always the top of the PSF for deconvolution - it's one of the tops (and the corresponding PSF may not be parallel on the plane).. so I think this will become an iterative field fitting approach.

Think of it this way - for each bucket of sand you drop onto the spot, the sand distributes itself - so the top of the peak for one bucket isn't the top once the next bucket is placed on it. Therefore by the 10th bucket you have a mess.. or a probability distribution of PSFs.. where the top of the peak is just down to the highest number of photon strikes (not that it's the top photons that have stacked up).

I think the better option would be to go with a compromise - use a vector field (mesh) of PSFs, one at each point in the mesh.. then interpolate the corresponding PSF for deconvolution across the 2D image. This approach requires the detection of stars.. which, as Paul will know, is not quite as straightforward as it seems. However for this approach we're not attempting to calculate the centroid or FWHM for accurate pinpointing - instead the PSF is what it is.. you take it as the mess it is.
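
To make the mesh idea concrete, here's a rough numpy sketch of a per-cell PSF grid with bilinear interpolation between cells - the grid size, PSF size and extraction method are illustrative choices, not the actual implementation:

import numpy as np

def psf_grid_from_stars(image, star_xy, grid=(4, 4), psf_size=15):
    """Average star cutouts falling in each grid cell into a per-cell PSF estimate."""
    h, w = image.shape
    r = psf_size // 2
    psfs = np.zeros((grid[0], grid[1], psf_size, psf_size))
    counts = np.zeros(grid)
    for x, y in star_xy:
        gy, gx = int(y * grid[0] / h), int(x * grid[1] / w)
        cut = image[int(y) - r:int(y) + r + 1, int(x) - r:int(x) + r + 1]
        if cut.shape == (psf_size, psf_size):
            psfs[gy, gx] += cut / max(cut.sum(), 1e-12)   # normalise each star
            counts[gy, gx] += 1
    counts[counts == 0] = 1
    return psfs / counts[..., None, None]

def local_psf(psfs, x, y, shape):
    """Bilinearly interpolate the four surrounding cell PSFs for pixel (x, y)."""
    gh, gw = psfs.shape[:2]
    fy = np.clip(y / shape[0] * gh - 0.5, 0, gh - 1)
    fx = np.clip(x / shape[1] * gw - 0.5, 0, gw - 1)
    y0, x0 = int(fy), int(fx)
    y1, x1 = min(y0 + 1, gh - 1), min(x0 + 1, gw - 1)
    wy, wx = fy - y0, fx - x0
    return ((1 - wy) * (1 - wx) * psfs[y0, x0] + (1 - wy) * wx * psfs[y0, x1]
            + wy * (1 - wx) * psfs[y1, x0] + wy * wx * psfs[y1, x1])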

Next after that is the remodelling of the image back into a deconvolved image (sticking to linear), which may or may not be a perfect representation but would be a cool feature.


Hi Nick

I wish I understood half of what you say in your post!!!

Kudos to people like you and Paul81 that put so much effort into developing applications that we mere mortals can use.

Keep us updated on your progress.

CS

Paul


Ahh, deconvolution is something I've done before - but not in realtime.

Starting point: http://stargazerslounge.com/topic/158573-what-did-the-point-spread-function-psf-ever-do-for-us/

PI processing steps: http://stargazerslounge.com/topic/157001-using-dynamic-psf-to-add-some-clarity/

Image "sharpening" is effectively a blind deconvolution, where it uses an ideal representation and attempts to reshape the image according to that. The steps above explain what I'm seeking to do..

As you may have gathered (and I get the impression that Paul is the same), we don't take no for an answer, and pushing the limits is all part of the fun.

Core notes here are:
* motion blur - this is the average vector field (i.e. the same mess recurring over the entire image, i.e. the image shifted over repeatedly)

* drizzle - this can be done (even with a 45-degree rotation to prevent blockiness) on the fly - GPUs have hardware pixel interpolation too so you get that for free :D (there's a sketch of the idea after this list)

* each of the stars detected provides a sample you can interpolate between, so you can create the PSF for each pixel (caveats apply, to simplify)
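
Here's the promised drizzle sketch - a deliberately simple nearest-pixel version in numpy, just to show why many slightly-shifted frames recover detail on a finer grid (real drizzle uses drop footprints and weights):

import numpy as np

def drizzle(frames, offsets, scale=2):
    """frames: list of 2-D arrays; offsets: list of (dy, dx) shifts in input pixels."""
    h, w = frames[0].shape
    out = np.zeros((h * scale, w * scale))
    weight = np.zeros_like(out)
    ys, xs = np.mgrid[0:h, 0:w]
    for frame, (dy, dx) in zip(frames, offsets):
        # drop each input pixel onto the nearest cell of the finer output grid
        oy = np.rint((ys + dy) * scale).astype(int)
        ox = np.rint((xs + dx) * scale).astype(int)
        ok = (oy >= 0) & (oy < h * scale) & (ox >= 0) & (ox < w * scale)
        np.add.at(out, (oy[ok], ox[ok]), frame[ok])
        np.add.at(weight, (oy[ok], ox[ok]), 1)
    return out / np.maximum(weight, 1)   # average where multiple drops landed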

There's a nice paper by Henry Stark (no, not related to Craig Stark of PHD/Nebulosity) and Yongyi Yang, snappily entitled "Vector Space Projections: A Numerical Approach to Signal and Image Processing, Neural Nets, and Optics"..


Planets.

There are two ways to track:

a) follow the centroid - in this case you treat the circular planet like a star and calculate its centre.. then track that

b) follow features - where the tracker follows features (i.e. bright bits and dark bits). Rotation of the planet means long exposures need additional processing to track new features on the planet etc., otherwise the features will smear into a long line over the period of the rotation.. not to mention features like the Great Red Spot rotate too..

The problem with high magnification is atmospheric turbulence and the tracking attempting to follow the turbulence. One option is to use optic flow to provide a set of suggested movements (one for each feature); then if there's turbulence you simply filter out the rogue suggestions - think of 10 people giving you directions.. then go with the majority direction.
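
The "10 people giving you directions" filter, as a tiny numpy sketch - the threshold and the plain median are illustrative choices:

import numpy as np

def consensus_motion(vectors, max_dev=2.0):
    """vectors: (N, 2) array of (dy, dx) suggestions from individual features."""
    v = np.asarray(vectors, dtype=float)
    median = np.median(v, axis=0)                      # robust central estimate
    keep = np.linalg.norm(v - median, axis=1) < max_dev  # drop turbulence outliers
    return v[keep].mean(axis=0) if keep.any() else median

# Example: eight features agree, two are thrown off by the seeing.
moves = [(1.0, 0.5)] * 8 + [(6.0, -4.0), (-5.0, 3.0)]
print(consensus_motion(moves))   # ~ (1.0, 0.5)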

Solar has the same issue, however the feature mapping cannot track the rotation, as solar loops etc. do not have a repeatable or definable pattern.

Viewing on an Alt-Az mount

The curse of field rotation.. not a problem with an alignment that copes with rotation and translation.. however not many people will do live stacking on a single target for hours at a time - that's where traditional AP and PI come in.


Super resolution - or Drizzle - and getting detail out of small fuzzies.

People are probably sick of me mentioning it, however here is an example of the little bear paw with a 10.5cm refractor:

http://stargazerslounge.com/topic/156355-old-data-new-processing-bear-paw-galaxy/

Nytecam did this too: http://stargazerslounge.com/topic/137276-bears-paw-galaxy/ and the 30cm (aperture = detail!) with reducer+lodestar combo works really well.

Also here's the smallest fuzzie I've managed realistically - MGC +08-15-056 - 28 arcseconds across, with a drizzle of 4 exposures on a small 4" refractor:

http://stargazerslounge.com/topic/159646-under-the-microsope-28-across-its-mgc

[attached image: post-9952-0-83807100-1345584503.png]

You'll note the travelling dot - that is a hot pixel from the original images (5.45um) - I can't remember the last time I used a dark for viewing!

The above are with PI.. but the same technique is possible for live viewing.

The reason I keep banging on about it is that it is *perfect* for small sensors. The large number of frames from smaller sensors (especially those with small pixels) works really well. The second point is that it can be used for binned sensors - 2x2 binning gives more sensitivity, and then using multiple frames to drizzle makes the system even more sensitive!


Hi Nick, sounds like you have been very hard at work! Lots of great info in your posts! I certainly feel your pain with star detection, it's not as easy as it first sounds. I have gone around several ideas but I think things are half coming together on that front. All my stuff after star detection (through to stacking) seems to hold up so far. I have a couple of bits to work on and then I'll try every data set I have and see what else breaks! Lol!

Why do I feel this thing will be never ending?! Haha!  :rolleyes:


  • 2 weeks later...

Hehe - yup, there's an amazing amount of things you can do (and more than I can think of ;)) with a set of 3D boxes of data (I mean astro subs)! I.e. a 4D (x, y, depth - over time) image to play with.

I thought I'd elaborate further based on the suggestions about light pollution. Some musings based on my experience of LP from my old place in Bracknell - the place in Guildford doesn't seem as bad..

No mistake - the best form of observing through light pollution is to remove the light pollution itself - naturally an LP filter is one method, a scanning spectrograph is another (simply subtract the sodium spectrum).. or for the super rich - move to a nice dark site.

Mono CCDs cannot distinguish between signal and noise (including LP). All they see is a bright spot.

So you can't remove LP from the image and reclaim what is lost in the noise as you don't have enough information about the targets that sit in the noise.

Now that last point is rather important.. especially as LP is usually reflected off moving water vapour.. hence over time there are subtle changes in the LP. Now if your camera is sensitive enough.. with enough dynamic range.. it will detect small differences in the background - those differences are likely to be glimpses of fuzzies and targets that may exist within the LP (given that you can subtract bias and perform flats on the image).

Thinking about making the LP background dark - to please the human eye..

Now, finding the average background - using a Gaussian filter to remove detail - will give you what looks like an uneven level over the image. However it also means that if you set the histogram black level there and then stretch the remaining signal in the image, you have a problem. The signal dynamic range is now just that between the top of the Gaussian level and full saturation of the pixel. In a single frame this means you could drop from 16 bits to 12 or even 8 bits per pixel of information.

By black-level stretching you requantise in the 4D domain. That is, you could recover information during the stacking of the signal using drizzle and interpolation. However, by using a non-linear stretch here.. you're going to lose information by compressing the lower values of noise and image into a smaller number of bits (or removing them completely). So kiss goodbye to the small fuzzies that the human brain is far more adept at picking out than current computer algorithms.

My thinking here is that if you upscale the stack's dynamic range first, but in a special way (I'll explain in a second), then the range you have to calculate the noise/LP level and stretch over is far greater.

So if you stack by just doubling the dynamic range using add, the image will look identical to before. The only difference is the level step size (i.e. what was a 16-bit value of 1 is now a 32-bit value of 2), so all the calculations would just be scaled up. Therefore there needs to be a modifier that causes interpolation - a calculation that gives 0.5-type results, for example. That could be as simple as averaging the same pixel between two frames, interpolating over time, or using the differences between the Gaussian background levels to rescale..

In one line - the previous step is looking to recover information below a single 16-bit step across a set of images, using a function to pull out the in-betweeners between images.
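
As a rough CPU-side illustration of those steps (accumulate in a wider type, estimate the smooth LP level, then set the black point), here's a numpy/scipy sketch - the sigma and stretch values are illustrative, not real parameters:

import numpy as np
from scipy.ndimage import gaussian_filter

def stack_and_debackground(frames, sigma=64):
    """frames: list of uint16 subs; returns a background-subtracted, stretched stack."""
    stack = np.zeros(frames[0].shape, dtype=np.float64)
    for f in frames:
        stack += f.astype(np.float64)
    stack /= len(frames)                   # the mean keeps fractional "in-between" values
                                           # a single 16-bit frame would have quantised away
    background = gaussian_filter(stack, sigma)   # slowly varying LP/gradient estimate
    residual = stack - background                # what sits on top of the LP
    lo, hi = np.percentile(residual, [5, 99.9])  # black point and white point
    return np.clip((residual - lo) / max(hi - lo, 1e-9), 0, 1)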

With an upscaled pixel depth you can cull the LP with more granularity, and you can stretch more efficiently to detect targets within the LP.

You can also do lots of different filtering (even FFT-based filters) in the time domain to get that detail out of the varying LP..


Just thinking about this point: "Thinking about making the LP background dark - to please the human eye.."

The human eye is more sensitive to blue light. For mono images (or a special OSC mode) you could render the bright signal as red but colour the identified parts of fuzzies blue, leaving the LP level darker.


Not finished yet, by a long way, but for fun I've just timed all the FFTs used in registration on the 2048x2048 image sizes:

2014-05-19 19:00:07.645 ExampleApplication[1774:303] setImage start

2014-05-19 19:00:07.684 ExampleApplication[1774:303] setImage registration complete

So that's 0.039 seconds without optimisation, so it's looking good to beat the original partially optimised stacking of 4 seconds per image registration (17MB).

I still have a way to go.. but the pipeline is there.. and it's looking good so far.


So when it's done, Nick, what will be the overview of what the ATIK will be able to do with your software? Will it stack the images in near real time and be a fast integration camera, or will it stack long-exposure images??? Davy


The driver work is a spare-time thing for me; I don't work for ATIK nor for the others that are using it, it's just useful for them. The Example App was developed as a driver exercise tool and solves a problem - using my cameras on my Mac..

However - aside from the driver work - I have a bit of a warped obsession with image processing. It's not the first time, and this time around OpenCL exists, which was missing when I last looked at attempting to process images like this - here's a visualisation of what I mean (this was from 2006-2008), but the processing hardware at the time didn't have enough oomph:

[attached image: post-9952-0-69545000-1400541289_thumb.pn]

What I want is a "realtime" look through the telescope to see more of what's out there. To be honest the majority of my AP is more "I wonder.." focused on small targets in the hour or so I may have. Realtime to me means you sit and look at it.. and the cameras I'll use are my current ones.

I think the "ExampleApp" will probably get a rename.. I have an idea of what I want to call it.. but for now I want to get the registration working and then get some data through it - I have some ideas for filters that will help overcome the less flexible aspects of FFT-based alignment (no warping, for example, but rotation, scaling and translation are supported). The fast element means this can then be used for getting a peep at DSOs, planets and also solar, with an offline ability allowing very fast captures to queue up the frames. However, when you have 1000 frames of solar.. that needs speed, so if I can get multiple frames per second it's all good.

There's more I want to build on this base but it's not going to appear in a big bang. I have lots of ideas.. but there will naturally be a limit in the GPU horse power and memory available.

Will it do what PI/Neb/.. ad infinitum do? No.. I have PI and it's not designed for that. It's about "now" visualisation.


  • 3 weeks later...

It lives Igor..

Well I had enough time over the last couple of nights to get some hacking done and in true JamesF style, it's held together with gaffer tape and balancing of gravitational forces.. but behold the first image registration using the GPU exclusively.. 

[attached image: post-9952-0-86235300-1402070888_thumb.pn]

Now there's not much difference from the input image.. well, zero difference (that's a good thing).. but each of the steps saves out images so I can see from those that each step is working (and given a bit of a prod they rotate or translate etc. as required).

There's still lots to do - but it's getting there.

It would be so easy.. if it wasn't for those pesky buggy GPU drivers!


Well I've been playing around, seeing what the maximum image size for the GPU is.

It works with 4096x4096 LRGB (as GPUs store these as floating point.. that's a 256MB image) but it starts to feel it, as the driver starts swapping textures in/out of GPU space - my GPU has 1GB of memory (and the GPU is then tied to the PCI-E bus speed). Not that a 35MB image for each frame from a camera over USB is going to be "realtime" :D

I'll probably limit the maximum size to 2048x2048 for the realtime side.. either by cropping, sub-framing or binning :) it makes the camera download faster :D


2014-06-10 18:28:48.170 ExampleApplication[1512:303] setImage start
2014-06-10 18:28:48.209 ExampleApplication[1512:303] IMG: /Users/Nick/Desktop/fft/rotatedImage
2014-06-10 18:28:48.227 ExampleApplication[1512:303] IMG: /Users/Nick/Desktop/fft/registeredImage
2014-06-10 18:28:48.228 ExampleApplication[1512:303] setImage update stack complete
2014-06-10 18:28:48.228 ExampleApplication[1512:303] setImage render stack complete

Not optimised at all, actually processing LRGB (rather than mono), and with Titan-sized images it's looking good - 0.057 seconds to rotate and translate each image frame. So that's 17 FPS registration :D 2048x2048 images are being LRGB-processed in less than a second, but the poor Kodak sensor takes longer than that to read out..

Now mono optimisation could be done in two ways - batch 4 frames and process them simultaneously (more efficient), giving something like 60 FPS.. or a true mono path, which has some neat optimisations that could be done but would then probably suffer from kernel-calling overheads..
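
The batching idea in sketch form - packing four mono frames into the four channels an LRGB pipeline already handles, so one pass processes all four (this is only a CPU-side illustration of the packing, not the GPU code):

import numpy as np

def pack_mono4(frames):
    """Pack 4 mono frames (H, W) into one (H, W, 4) image, one channel per frame."""
    assert len(frames) == 4
    return np.stack(frames, axis=-1)

def unpack_mono4(batched):
    return [batched[..., i] for i in range(4)]

frames = [np.random.rand(492, 659).astype(np.float32) for _ in range(4)]
batched = pack_mono4(frames)          # one 4-channel "image" -> one GPU pass
assert np.allclose(unpack_mono4(batched)[2], frames[2])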

Now I can add more processing to do all sorts of funky stuff.. but for now that seems like a pretty good .. well actually it's bananas.


Nothing says it better than a video..

Once I have a lens on the camera it will be more sensitive. The camera is bare, with just the sensor pointing at the screen.. but it still managed to pick up the moving light caused by my hand..

You can see the text going in the background - those are the frames being logged..


Trying the mono option… 

So, currently the performance is (with rotate and translate):

* pseudo test camera (i.e. 659x492 static image from a file) gives 30 fps mono; profiling the code shows 38% in the CPU

* Titan (659x492) gives 10 fps mono; profiling the code shows 46% of the time spent in the CPU, which logically means 54% in the GPU or spent waiting on I/O - at the moment it takes an image, processes it, then takes the next and processes that.. using multiple threads it could download the next whilst processing the last (see the sketch after this list).. maybe in the next version.
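
Here's the threading sketch mentioned in the second bullet - a simple producer/consumer arrangement where one thread downloads frames while another processes them; grab_frame() and process_frame() are placeholders, not the app's real functions:

import queue
import threading

def capture_loop(grab_frame, frames, stop):
    while not stop.is_set():
        frames.put(grab_frame())              # camera download happens here

def process_loop(process_frame, frames, stop):
    while not (stop.is_set() and frames.empty()):
        try:
            process_frame(frames.get(timeout=0.1))   # GPU registration/stacking would go here
        except queue.Empty:
            pass

# Usage (with real grab/process callables):
#   frames, stop = queue.Queue(maxsize=2), threading.Event()
#   threading.Thread(target=capture_loop, args=(grab, frames, stop)).start()
#   threading.Thread(target=process_loop, args=(process, frames, stop)).start()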

You'll note the vignetting - that is a technique to prevent the registration detecting the hard edge of the image as a registration reference point.

I still have some hardcore testing to do on the algorithm, but I wanted to show the progress so far.

First the real camera at 10fps:

Next the pseudo camera at 30fps - yes 30FPS!:

As the technique uses an FFT of the image, the main issue is the image boundary giving a stronger matching characteristic (a stronger correlation signal) than the image content - to overcome this I've implemented a 2D Hann window (a Hamming window is almost the same, just different parameters). This makes the image edges fade softly to zero, so it prevents the FFT locking onto the strong hard edge of the image. This only has to be applied to the FFT input, but for now the rendering also uses it (hence the image looks like a 1920s film). The stack is a few frames and is then cleared (which makes it look even more like a 1920s film).. The GPU mono mode uses the red channel - again this can be sorted out during stacking, so not a serious issue.
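
For reference, a 2D Hann window is just the outer product of two 1D Hann windows - a minimal numpy sketch:

import numpy as np

def hann2d(height, width):
    wy = np.hanning(height)          # 0 at the edges, 1 in the middle
    wx = np.hanning(width)
    return np.outer(wy, wx)

def window_for_fft(frame):
    """Fade the frame edges to zero so the FFT has no hard boundary to lock onto."""
    return frame * hann2d(*frame.shape)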

I've optimised the GPU kernels by partially unrolling loops (i.e. steps of 4 etc.), however the main cost is probably the invocation of the kernels. The only PCI-E data transfer is the new image that is uploaded; everything else is already on the GPU - including the rendering to the window. There is more GPU optimisation I can do - however if the single thread is causing the slowdown, then fixing that would be the next best thing to do.

There is a case where I could start processing the next frame on the CPU, however from experience GPUs are only just starting to be able to perform an upload and a GPU calculation at the same time. They're built for speed and as such are simple beasts.

15/06/2014 11:28:52.460 ExampleApplication[20629]: FPS: 30
15/06/2014 11:28:52.482 ExampleApplication[20629]: setImage start
15/06/2014 11:28:52.516 ExampleApplication[20629]: setImage start
15/06/2014 11:28:52.550 ExampleApplication[20629]: setImage start
15/06/2014 11:28:52.550 ExampleApplication[20629]: reference frame 9888
15/06/2014 11:28:52.566 ExampleApplication[20629]: setImage start
15/06/2014 11:28:52.600 ExampleApplication[20629]: setImage start
15/06/2014 11:28:52.635 ExampleApplication[20629]: setImage start
15/06/2014 11:28:52.669 ExampleApplication[20629]: setImage start
15/06/2014 11:28:52.704 ExampleApplication[20629]: setImage start
15/06/2014 11:28:52.739 ExampleApplication[20629]: setImage start
15/06/2014 11:28:52.775 ExampleApplication[20629]: setImage start
15/06/2014 11:28:52.811 ExampleApplication[20629]: setImage start
15/06/2014 11:28:52.846 ExampleApplication[20629]: setImage start
15/06/2014 11:28:52.879 ExampleApplication[20629]: setImage start
15/06/2014 11:28:52.914 ExampleApplication[20629]: setImage start
15/06/2014 11:28:52.948 ExampleApplication[20629]: setImage start
15/06/2014 11:28:52.982 ExampleApplication[20629]: setImage start
15/06/2014 11:28:53.022 ExampleApplication[20629]: setImage start
15/06/2014 11:28:53.058 ExampleApplication[20629]: setImage start
15/06/2014 11:28:53.092 ExampleApplication[20629]: setImage start
15/06/2014 11:28:53.092 ExampleApplication[20629]: reference frame 9904
15/06/2014 11:28:53.107 ExampleApplication[20629]: setImage start
15/06/2014 11:28:53.141 ExampleApplication[20629]: setImage start
15/06/2014 11:28:53.176 ExampleApplication[20629]: setImage start
15/06/2014 11:28:53.210 ExampleApplication[20629]: setImage start
15/06/2014 11:28:53.249 ExampleApplication[20629]: setImage start
15/06/2014 11:28:53.283 ExampleApplication[20629]: setImage start
15/06/2014 11:28:53.318 ExampleApplication[20629]: setImage start
15/06/2014 11:28:53.353 ExampleApplication[20629]: setImage start
15/06/2014 11:28:53.392 ExampleApplication[20629]: setImage start
15/06/2014 11:28:53.426 ExampleApplication[20629]: setImage start
15/06/2014 11:28:53.460 ExampleApplication[20629]: FPS: 30

Impressive stuff Nick!

Not sure if it's of any interest, but I came across this FFT-based image registration algorithm a while ago (Matlab).

Martin

%% Efficient subpixel image registration by cross-correlation.
% Registers two images (2-D rigid translation) within a fraction
% of a pixel specified by the user. Instead of computing a zero-padded FFT
% (fast Fourier transform), this code uses selective upsampling by a
% matrix-multiply DFT (discrete FT) to dramatically reduce computation time and memory
% without sacrificing accuracy. With this procedure all the image points are used to
% compute the upsampled cross-correlation in a very small neighborhood around its peak. This
% algorithm is referred to as the single-step DFT algorithm in [1].
%
% [1] Manuel Guizar-Sicairos, Samuel T. Thurman, and James R. Fienup,
% "Efficient subpixel image registration algorithms," Opt. Lett. 33,
% 156-158 (2008).


Yup, already using DFTs.. most image routines on computers compute DFTs but refer to them as FFTs.

The main issue with GPUs is that they're not normally great at dynamic changes in dimensions, so I use a static size for rotation - this is why you see zero padding outside the image, to make it easy for the GPU.

You'll note that the algorithm you've pointed to states it handles translation only. I'm doing rotation and translation. If I were just translating then I could cut the image size down.

I'll probably add a DFT upscale for a subregion to make the registration more accurate; it would be an easy step to implement. However for now I think I need to re-architect the threading model to permit multi-GPU use (OS X Mavericks supports using both GPUs) and also to get the CPU more involved.
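
For anyone curious about the FFT building block itself, here's a minimal phase-correlation sketch for the translation part - rotation/scale are typically handled by repeating this on a log-polar resampling of the magnitude spectra (Fourier-Mellin); that's the general technique, not necessarily the exact implementation used here:

import numpy as np

def phase_correlation_shift(ref, img):
    """Return (dy, dx) such that np.roll(img, (dy, dx), axis=(0, 1)) realigns img with ref."""
    F1, F2 = np.fft.fft2(ref), np.fft.fft2(img)
    cross = F1 * np.conj(F2)
    cross /= np.maximum(np.abs(cross), 1e-12)        # keep phase only
    corr = np.fft.ifft2(cross).real                  # sharp peak at the shift
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = ref.shape
    if dy > h // 2: dy -= h                          # wrap to signed shifts
    if dx > w // 2: dx -= w
    return dy, dx

# Quick check: rolling a frame by (-5, 3) should be undone by the reported (5, -3).
ref = np.random.rand(128, 128)
print(phase_correlation_shift(ref, np.roll(ref, (-5, 3), axis=(0, 1))))  # -> (5, -3)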


  • 3 weeks later...

Update video :)

Still not optimised, but I've fixed a large number of bugs (as is the case when GPUs are involved - not all are mine!):

If you look carefully at the edge of the image, you'll see the border move as I press against the scope. I still need to re-optimise but I thought I'd show you the progress.

The system is taking a key frame every 16 frames, the remaining 15 frames after the key are aligned to it.
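
In sketch form, the cadence looks like this (register() and accumulate() stand in for the GPU steps; how the key frames themselves relate to the running stack isn't shown):

KEY_INTERVAL = 16

def live_stack(frames, register, accumulate):
    key = None
    for i, frame in enumerate(frames):
        if i % KEY_INTERVAL == 0:
            key = frame                       # every 16th frame becomes the new reference
            accumulate(frame)
        else:
            accumulate(register(frame, key))  # align the other 15 to the key frame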

This is good progress - it seems to work with garden images, the planetary and solar images I have (including Ha proms).


Testing DSOs.. using the data I got from Olly P's place for M42.. note these are a set of 600-second, 10-second and 4-second images shot through the Pentax 4" APO. They're 17MB 383L images - without any stretching etc. I should perhaps do a saturation fix - masking on the fly :D

[attached image: post-9952-0-63167300-1404757948_thumb.pn]

Took Sunday off from anything computery/DIY, so I ended up oil painting..

