Storing/Archiving Image Data


Open question to AP astronomers: what data type is best to archive?

Having started my “end-of-year” data clean-up, I am amazed at the amount of stored image data. With 3 systems running new high-pixel-count cameras, the scale of the activity is daunting. With each image frame between 50 and 117MB, I appear to have about 4TB of data across 3 PCs that needs managing.

May I raise the following points for consideration:

What data type for archiving:

1. Light and Cal data in their raw state - Highest amount of data to archive.

2. Raw Lights with master Cal frames only - Reduced level of data.

3. Only calibrated (PixInsight or other data processing software) master lights - should they be Cal only (a set of registered frames) or Cal and integrated, i.e. master Lu, Re, Gr, Bl, Ha, S2, O3 lights?

4. Which, if any, intermediate processed files (PixInsight or other data processing software) should be kept, i.e. raw RGB, LRGB, HSO etc.?

Any thoughts on this matter are greatly appreciated.

Martin

I've just spent the recent cloudy period clearing up several terabytes of data. For each object I kept all calibrated and cosmetically corrected subs and all LRGB, Ha, OIII and SII masters, together with several versions of the final result. I ditched all the intermediate files from processing, as I couldn't remember what any of them were for and figured that if I want to reprocess it's best to start from scratch.

I kept all the calibrated, cosmetically corrected subs because the future may hold better ways of normalising and integrating them (e.g. PixInsight's recent NormalizeScaleGradient script).

Dave


Hi Martin

Clearly storage is too cheap! I also forgot to say that with my colour camera subs I only keep the calibrated files, to save (a lot of) space. The other thing you need, of course, is some kind of system for naming.

Dave


I do not even store the calibrated images, just the raw images, the stacked image, and then all my versions during processing. I can always recreate / reprocess from scratch using the raw files (although I have never done it so far).

I save everything on two hard drives (4 TB each). I have now filled two pairs of those and am about to start on the third pair.


I hit trouble when I started using a full-frame mono CMOS camera at Roboscopes.com. Each original FITS sub is 120MB, and each PixInsight-processed sub is 240MB. So a recent image with all the filters and the intermediate process steps generated by WBPP in PixInsight came to 241GB.

I now keep only the original subs and the master integrated frames. I found that I very rarely went back to calibrated, cosmetically corrected or registered subs. If there was a problem, I usually wanted to make a different selection of the raw original subs, or choose different darks/flats, and re-run WBPP. So I found no value in retaining all the intermediates in .XISF format. I now dump those once I am reasonably happy with the integrated masters. If I need to reprocess, I go back to the original subs.
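Before dumping intermediates it can help to see how much space they actually take. The following is only a rough sketch: the folder names in `INTERMEDIATE_DIRS` are assumptions about what a batch pipeline writes, and `intermediate_usage` is a hypothetical helper, not part of PixInsight. It reports sizes and deletes nothing.

```python
from pathlib import Path

# Assumed names of intermediate output folders; adjust to what your pipeline writes.
INTERMEDIATE_DIRS = {"calibrated", "cosmetized", "debayered", "registered"}


def intermediate_usage(project_root: Path) -> dict:
    """Map each intermediate folder found under project_root to its size in bytes."""
    usage = {}
    for d in project_root.rglob("*"):
        if d.is_dir() and d.name.lower() in INTERMEDIATE_DIRS:
            # Sum every file below the folder (nested folders included).
            usage[d] = sum(f.stat().st_size for f in d.rglob("*") if f.is_file())
    return usage


# Example use (the path is a placeholder):
# for folder, size in intermediate_usage(Path("D:/astro/M31")).items():
#     print(f"{size / 1e9:8.2f} GB  {folder}")
```

Raw subs and master frames sit outside the listed folders, so they are not counted. Reviewing the report by hand before deleting anything keeps the step reversible.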

Yes it takes more time to run WBPP again, but it is a background task I can leave running whilst I am doing other things or overnight.

For older images I store the original subs as a zip file in an archive.
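A minimal sketch of that zip step using Python's standard `zipfile` module. The function name `archive_subs` and the folder layout are assumptions for illustration. It re-opens the archive and checks the stored CRCs before the originals are touched.

```python
import zipfile
from pathlib import Path


def archive_subs(src_dir: Path, zip_path: Path) -> int:
    """Zip every file under src_dir and return the number of files stored."""
    files = sorted(p for p in src_dir.rglob("*") if p.is_file())
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for p in files:
            # Store paths relative to src_dir so the archive unpacks cleanly.
            zf.write(p, arcname=p.relative_to(src_dir).as_posix())
    # Re-open and check member CRCs before trusting the archive.
    with zipfile.ZipFile(zip_path) as zf:
        bad = zf.testzip()
        if bad is not None:
            raise RuntimeError(f"corrupt member in archive: {bad}")
    return len(files)


# Example use (paths are placeholders):
# archive_subs(Path("D:/astro/M31/raw"), Path("E:/archive/M31_raw.zip"))
```

How much space this saves depends on the data. Noisy raw frames may compress only modestly, so it is worth trying on one project first.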

I am still running out of space, but not so fast.


The same for me. I used to store all my data, but eventually gave up on it. Mono cameras are quite alright, but OSC ones are a pain. One 24Mpx frame takes 280MB after debayering :( A 100-frame project run through PixInsight batch processing with calibration frames takes 83GB ...


I keep all the frames for photometry and spectroscopy projects now, but for pretty imaging I keep only stacks and processed images. I know that disk space is not expensive now, but in the 10 years I have been imaging I have never needed any source frames.
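The OSC frame size can be roughly reproduced with some arithmetic. This is a sketch under assumptions the post does not state: 16-bit mono raw data, and 32-bit floating point with three colour channels after debayering. File headers and compression are ignored.

```python
def frame_size_bytes(megapixels: float, channels: int, bytes_per_sample: int) -> int:
    """Uncompressed pixel data size, ignoring file headers."""
    return int(megapixels * 1_000_000) * channels * bytes_per_sample


# Assumed formats: 16-bit mono raw, 32-bit float RGB after debayer.
raw = frame_size_bytes(24, channels=1, bytes_per_sample=2)
debayered = frame_size_bytes(24, channels=3, bytes_per_sample=4)

print(f"raw sub:       {raw / 1e6:.0f} MB")              # 48 MB
print(f"debayered sub: {debayered / 1e6:.0f} MB")        # 288 MB
print(f"100 debayered: {100 * debayered / 1e9:.1f} GB")  # 28.8 GB
```

Under these assumptions one debayered frame is about 288MB, in the same ballpark as the 280MB quoted, and each processing stage that writes a full set of 100 such frames adds close to 29GB.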


I use Seagate hard drives. The last couple I bought are 5 TB and I think they were about 200 Euro each, so not astronomically expensive. They will take me 2-3 years to fill up (if I am lucky). So far none of these hard drives has crashed, and the oldest two I have are from 2014. If one crashes, which I expect will happen one day, I will just back the other one up onto a new one.


58 minutes ago, gorann said:

I use Seagate hard drives. The last couple I bought are 5 TB and I think they were about 200 Euro each, so not astronomically expensive. They will take me 2-3 years to fill up (if I am lucky). So far none of these hard drives has crashed, and the oldest two I have are from 2014. If one crashes, which I expect will happen one day, I will just back the other one up onto a new one.

Hi Gorann,

I belong to an old generation of IT users who believe that if the data is not stored in at least three separate places it does not exist. Paranoiacs-R-Us!

So my astro data is stored on a hard drive in my processing computer, on a NAS in another part of my home and in the cloud. Still works out cheaper than losing expensively obtained data sets!

 


27 minutes ago, old_eyes said:

Hi Gorann,

I belong to an old generation of IT users who believe that if the data is not stored in at least three separate places it does not exist. Paranoiacs-R-Us!

So my astro data is stored on a hard drive in my processing computer, on a NAS in another part of my home and in the cloud. Still works out cheaper than losing expensively obtained data sets!

 

If you've not restored and verified, it still does not exist. ;)


1 hour ago, UKDiver said:

If you've not restored and verified, it still does not exist. ;)

Yep, I do a trial restore from the cloud occasionally, not everything just a sample. And I test the NAS data once in a while. So far, so good 😉
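One way to make that kind of spot check repeatable is a checksum manifest. This is only a sketch, not a description of what anyone in this thread actually runs: `build_manifest` and `verify_copy` are hypothetical helper names, and SHA-256 is an assumed choice of hash.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large subs do not need to fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path relative to root to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_copy(manifest: dict, copy_root: Path) -> list:
    """Return relative paths that are missing or differ in the copy."""
    problems = []
    for rel, digest in manifest.items():
        candidate = copy_root / rel
        if not candidate.is_file() or sha256_of(candidate) != digest:
            problems.append(rel)
    return problems
```

Build the manifest from the working copy, then run `verify_copy` against the NAS folder or against a sample restored from the cloud. An empty list means every file in the manifest was found and matched.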


I store only the following, which massively reduces the space needed. For long-term storage they get compressed too (zip).

  • Raw Files
  • Master Flat, Bias, Darks
  • Final Image

From that I can reproduce anything I like, and if I were to reprocess I would not want to produce the same image, but something different with new techniques or more data.

 


Thank you guys for your informative replies. I had initially thought that saving only the calibrated light frames and processed master images would be the most efficient archive option, but Lucas has reminded me of the importance of retaining raw frames for scientific purposes (photometry and spectroscopy).

In light of the advice you all kindly provided, I have decided to archive the raw files, master cal files and final processed master images.

This exercise has also reminded me of the importance of a structured file naming convention that supports not only day-to-day data storage but also archiving and data mining in the future.

Thanks Martin
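As an illustration of what a structured naming convention could look like, here is a small sketch. The field order, the separators and the `sub_filename` helper are all hypothetical choices, not a convention anyone in the thread has described.

```python
from datetime import date


def sub_filename(target: str, filt: str, exposure_s: int, night: date,
                 seq: int, ext: str = "fits") -> str:
    """Build a sortable file name: target, filter, exposure, date, sequence."""
    safe_target = target.strip().replace(" ", "_")
    return f"{safe_target}_{filt}_{exposure_s:04d}s_{night:%Y%m%d}_{seq:04d}.{ext}"


print(sub_filename("M 31", "Ha", 300, date(2022, 1, 15), 7))
# M_31_Ha_0300s_20220115_0007.fits
```

Zero-padded numbers and a year-month-day date mean a plain alphabetical sort groups files by target and filter and orders them by night, which helps with later data mining.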


  • 2 weeks later...
On 15/01/2022 at 08:43, Xsubmariner said:

Thank you guys for your informative replies. I had initially thought that saving only the calibrated light frames and processed master images would be the most efficient archive option, but Lucas has reminded me of the importance of retaining raw frames for scientific purposes (photometry and spectroscopy).

In light of the advice you all kindly provided, I have decided to archive the raw files, master cal files and final processed master images.

This exercise has also reminded me of the importance of a structured file naming convention that supports not only day-to-day data storage but also archiving and data mining in the future.

Thanks Martin

I have a template directory structure saved and copy it each time I work on a new image. Here is my example below, but obviously change it to suit your needs.

I then only keep the raw data, calibration files and the final image.

If you set this up, your data management becomes a lot easier.

[Image: screenshot of the template directory structure]
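A template like this can also be created by a short script instead of copying folders by hand. The folder names below are purely hypothetical, since the actual layout is only shown in the screenshot, and `create_target_dirs` is an illustrative helper name.

```python
from pathlib import Path

# Hypothetical layout; replace with your own template folders.
TEMPLATE = [
    "raw/lights",
    "raw/flats",
    "raw/darks",
    "raw/bias",
    "masters",
    "processing",
    "final",
]


def create_target_dirs(archive_root: Path, target: str) -> Path:
    """Create the template folder tree for one target and return its root."""
    target_dir = archive_root / target
    for sub in TEMPLATE:
        # exist_ok=True makes the call safe to repeat on an existing project.
        (target_dir / sub).mkdir(parents=True, exist_ok=True)
    return target_dir


# Example use (the path is a placeholder):
# create_target_dirs(Path("D:/astro"), "M31")
```

With a layout along these lines, the `processing` folder would be the only one to empty once the final image is done.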

Edited by Catanonia