Xsubmariner Posted January 9

Open question to AP astronomers: what data type is best to archive?

Having started my "end-of-year" data clean-up, I am amazed at the large amount of stored image data. With 3 systems running new high-pixel-count cameras, the scale of the activity is daunting. With each image frame between 50 and 117 MB, I appear to have about 4 TB of data across 3 PCs that needs management.

May I raise the following points for consideration. What data type for archiving:

1. Light and cal data in their raw state - highest amount of data to archive.
2. Raw lights with master cal frames only - reduced level of data.
3. Only calibrated (PixInsight or other data processing software) master lights - should they be cal only (set of registered frames), or cal and integrated, i.e. master Lu, Re, Gr, Bl, Ha, S2, O3 lights?
4. Which, if any, stage-processed files (PixInsight or other data processing software), i.e. raw RGB, LRGB, HSO etc.?

Any thoughts on this matter are greatly appreciated.

Martin

Laurin Dave Posted January 9

I've just spent the recent cloudy period clearing up several terabytes of data. For each object I kept all calibrated and cosmetically corrected subs and all LRGB, Ha, Oiii and Sii masters, together with several versions of the final result. I ditched all the intermediate files from processing, as I couldn't remember what any of them were for and figured that if I want to reprocess it's best to start from scratch. I kept all the calibrated, cosmetically corrected subs because the future may hold better ways of normalising and integrating them (e.g. PixInsight's recent NormalizeScaleGradient script).

Dave

Xsubmariner Posted January 9 (Author)

Thanks Dave for your informative reply. I agree that ditching intermediate files frees up drive space and reduces demand on archive storage. Any other opinions out there?

Laurin Dave Posted January 10

Hi Martin,

Clearly storage is too cheap! I also forgot to say that with my colour camera subs I only keep the calibrated files, to save (a lot of) space. The other thing you need, of course, is some kind of system for naming.

Dave

gorann Posted January 10

I do not even store the calibrated images, just the raw images, the stacked image, and then all my versions during processing. I can always recreate or reprocess from scratch using the raw files (although I have never done it so far). I save everything on two hard drives (4 TB each). I have now filled two pairs of those and am about to start on the third pair.

old_eyes Posted January 10

I hit trouble when I started using a full-frame mono CMOS camera at Roboscopes.com. Each original FITS sub is 120 MB, and each PixInsight-processed sub is 240 MB. So a recent image, with all the filters and the intermediate process steps generated by WBPP in PixInsight, is 241 GB.

I now keep only the original subs and the master integrated frames. I found that I very rarely went back to calibrated, cosmetically corrected or registered subs. If there was a problem, I usually wanted to make a selection of the raw original subs, or choose different darks/flats, and re-run WBPP. So I found no value in retaining all the intermediates in .XISF format. I now dump those once I am reasonably happy with the integrated masters.

If I need to reprocess, I go back to the original subs. Yes, it takes more time to run WBPP again, but it is a background task I can leave running whilst I am doing other things, or overnight.

For older images I store the original subs as a zip file in an archive. I am still running out of space, but not so fast.

drjolo Posted January 10

The same for me. I used to store all my data, but eventually gave up on it. Mono cameras are quite alright, but OSC ones are a pain. One 24 Mpx frame after debayering takes 280 MB, and a 100-frame project made with PixInsight batch processing, with calibration frames, takes 83 GB.

I keep all the frames for photometry and spectroscopy projects now, but for pretty imaging I keep only stacks and processed images. I know that disk space is not expensive now, but I have not needed any source frames in the 10 years I have been imaging.

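The 280 MB figure is roughly what simple arithmetic predicts if the debayered frame is stored as three colour channels of 32-bit floating point. That storage format is an assumption here, not something stated in the post, and file headers are ignored. A minimal sketch of the calculation:

```python
def frame_size_mb(megapixels: float, channels: int, bytes_per_sample: int) -> float:
    """Uncompressed image size in decimal megabytes (1 MB = 1,000,000 bytes)."""
    total_bytes = megapixels * 1_000_000 * channels * bytes_per_sample
    return total_bytes / 1_000_000


# Raw OSC frame: one sample per pixel, assumed to be stored as 16-bit integers.
raw_mb = frame_size_mb(24, channels=1, bytes_per_sample=2)

# Debayered frame: three colour channels, assumed to be 32-bit floats.
debayered_mb = frame_size_mb(24, channels=3, bytes_per_sample=4)

print(f"raw: {raw_mb:.0f} MB, debayered: {debayered_mb:.0f} MB")
print(f"growth factor: {debayered_mb / raw_mb:.0f}x")
```

Under those assumptions each raw frame grows about sixfold once debayered, which is why intermediate files from a colour camera fill a disk so much faster than the raw subs do.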
gorann Posted January 10

I use Seagate hard drives. The last pair I bought are 5 TB and I think they were about 200 Euro each, so not astronomically expensive. They will take me 2-3 years to fill up (if I am lucky). So far none of these hard drives have crashed, and the oldest two I have are from 2014. If one crashes, which I expect will happen one day, I will just back the other one up onto a new one.

old_eyes Posted January 10

58 minutes ago, gorann said: "I use Seagate hard drives. The last couple I bought are 5 TB and I think they were about 200 Euro each so not astronomically expensive. They will take me 2-3 years to fill up (if I am lucky). So far none of these hard drives have crashed and the oldest two I have are from 2014. If one crashes, which I expect will happen one day, I will just back the other one up on a new one."

Hi Gorann,

I belong to an old generation of IT users who believe that if the data is not stored in at least three separate places, it does not exist. Paranoiacs-R-Us! So my astro data is stored on a hard drive in my processing computer, on a NAS in another part of my home, and in the cloud. Still works out cheaper than losing expensively obtained data sets!

UKDiver Posted January 10

27 minutes ago, old_eyes said: "Hi Gorann, I belong to an old generation of IT users who believe that if the data is not stored in at least three separate places it does not exist. Paranoiacs-R-Us! So my astro data is stored on a hard drive in my processing computer, on a NAS in another part of my home and in the cloud. Still works out cheaper than losing expensively obtained data sets!"

If you've not restored and verified, it still does not exist.

old_eyes Posted January 10

1 hour ago, UKDiver said: "If you've not restored and verified, it still does not exist."

Yep, I do a trial restore from the cloud occasionally, not everything, just a sample. And I test the NAS data once in a while. So far, so good.

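One way to make a trial restore checkable is to record a checksum for every file before backing up, then compare the restored copy against that record. The sketch below is only an illustration using Python's standard library; the function names and the idea of a manifest keyed by relative path are assumptions, not a description of anyone's actual setup in this thread.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large subs never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_restore(manifest: dict[str, str], restored_root: Path) -> list[str]:
    """Return the relative paths that are missing or differ in the restored copy."""
    problems = []
    for rel_path, expected in manifest.items():
        candidate = restored_root / rel_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            problems.append(rel_path)
    return problems
```

Build the manifest from the original folder, restore a sample from the cloud or NAS to a scratch folder, and call `verify_restore` on it; an empty list means every file in the manifest came back intact.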
Catanonia Posted January 12

I store only the following, which massively reduces the space needed. For long-term storage, they get compressed too (zip):

Raw files
Master flat, bias and darks
Final image

From that I can reproduce anything I like. And if I were to reprocess, I would not want to produce the same image, but something different, with new techniques or more data.

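The "keep three things and zip them" approach could be scripted along these lines. This is a sketch under stated assumptions: the folder names in `KEEP_DIRS` and the `archive_project` helper are hypothetical, and the archive file should be written somewhere outside the folders being archived.

```python
import zipfile
from pathlib import Path

# Hypothetical sub-folder names for raw files, master calibration frames
# and the final image; adjust to match your own layout.
KEEP_DIRS = ("raw", "masters", "final")


def archive_project(project_dir: Path, archive_path: Path) -> int:
    """Zip only the KEEP_DIRS sub-folders of project_dir; return the file count."""
    count = 0
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name in KEEP_DIRS:
            folder = project_dir / name
            if not folder.is_dir():
                continue  # skip folders this project does not have
            for path in sorted(folder.rglob("*")):
                if path.is_file():
                    # Store paths relative to the project so the zip unpacks cleanly.
                    zf.write(path, arcname=path.relative_to(project_dir).as_posix())
                    count += 1
    return count
```

Anything outside those folders, such as registered or otherwise intermediate files, is simply left out of the archive. How much the zip step itself saves depends on the data.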
Xsubmariner Posted January 15 (Author)

Thank you guys for your informative replies. I initially thought that saving only the calibrated light frames and processed master images would be the most efficient archive option, but Lucas has reminded me of the importance of retaining raw frames for scientific purposes (photometry and spectroscopy).

In light of the advice you all kindly provided, I have decided to archive the raw files, master cal files and final processed master images.

This exercise has also reminded me of the importance of a structured file naming convention that supports not only day-to-day data storage but also archiving and data mining in the future.

Thanks,
Martin

Catanonia Posted January 25 (edited)

On 15/01/2022 at 08:43, Xsubmariner said: "Thank you guys for your informative replies. I initially thought that saving only the calibrated light frames and processed master images would be the most efficient archive option, but Lucas has reminded me of the importance of retaining raw frames for scientific purposes (photometry and spectroscopy). In light of the advice you all kindly provided, I have decided to archive the raw files, master cal files and final processed master images. This exercise has also reminded me of the importance of a structured file naming convention that supports not only day-to-day data storage but also archiving and data mining in the future. Thanks, Martin"

I have a template directory structure saved, and I copy it each time I work on a new image. Here is my example below, but obviously change it to your needs.

I then only keep the raw data, calibration files and the final image.

If you set this up, your data management becomes a lot easier.

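A template like this can also be stamped out with a short script rather than copied by hand. The sketch below is only an illustration: the folder names in `TEMPLATE`, the `create_project` helper and the date-plus-target folder naming are all assumptions, not Catanonia's actual layout.

```python
from pathlib import Path

# Hypothetical template: raw data, master calibration frames, a disposable
# working area for intermediate files, and the final image.
TEMPLATE = (
    "raw/lights",
    "raw/flats",
    "raw/darks",
    "raw/bias",
    "masters",
    "working",
    "final",
)


def create_project(archive_root: Path, target: str, date: str) -> Path:
    """Create an empty folder tree for one imaging project and return its path."""
    project = archive_root / f"{date}_{target}"
    for sub in TEMPLATE:
        # exist_ok=True makes the call safe to repeat on an existing project.
        (project / sub).mkdir(parents=True, exist_ok=True)
    return project
```

For example, `create_project(Path("astro"), "M31", "2022-01-25")` would create `astro/2022-01-25_M31` with the sub-folders above. Starting the name with an ISO-style date keeps projects in chronological order when listed, and everything under `working` can be deleted once the masters and final image are done.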
Xsubmariner Posted January 26 (Author)

Thanks Catanonia for sharing your file structure.