
Digital Media Storage Methodologies

I have a mess of digital media to sort out, photos and videos, and I want a robust system to move forward with. I've been doing some research and want to offer up what I've found so far for discussion.

Three Tier

Format

YYYY->YYYY-MM-DD Event->Photos

Example

  • 2020
    • 2020-01-04 Sleepover Evening at In Laws
      • Photos
    • 2020-01-05 Sleepover Morning at In Laws
      • Photos
    • 2020-01-05 Walk to the Park
      • Photos
    • 2020-01-05 Dinner at In Laws
      • Photos

Comments

  • Flattest, most denormalized methodology
  • Feels like year directories could get very bloated
  • Special events, e.g. holidays, that span month boundaries would be visually contiguous
  • Complex day events could have further sub directories:
    • Those with many sources such as other people and camera types
    • Events that are long like graduations may be divided into sections
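
As a rough illustration of the points above, here is a minimal Python sketch that parses a `YYYY-MM-DD Event` directory name and shows why an event spanning a month boundary stays contiguous in a flat year directory. The regular expression and the "Ski Trip" names are assumptions made up for the example; they are not part of the proposal itself.

```python
import re
from datetime import date

# Matches names like "2020-01-05 Walk to the Park". The exact pattern is an
# assumption based on the examples in this post.
EVENT_DIR = re.compile(r"^(\d{4})-(\d{2})-(\d{2}) (.+)$")

def parse_event_dir(name):
    """Split an event directory name into (date, title), or return None."""
    m = EVENT_DIR.match(name)
    if m is None:
        return None
    year, month, day, title = m.groups()
    try:
        return date(int(year), int(month), int(day)), title
    except ValueError:  # impossible dates such as month 13 or day 32
        return None

# Hypothetical event that crosses a month boundary.
names = [
    "2020-02-01 Ski Trip Day 3",
    "2020-01-31 Ski Trip Day 2",
    "2020-01-30 Ski Trip Day 1",
]
# ISO-style date prefixes sort chronologically as plain strings, so a file
# browser sorting by name keeps the three days next to each other.
ordered = sorted(names)
```

A check like `parse_event_dir` could also be used to flag directories that drift away from the naming convention over time.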

Four Tier

Variant One

Format

YYYY->YYYY-MM->YYYY-MM-DD Event->Photos

Example

  • 2020
    • 2020-01
      • 2020-01-04 Sleepover Evening at In Laws
        • Photos
      • 2020-01-05 Sleepover Morning at In Laws
        • Photos
      • 2020-01-05 Walk to the Park
        • Photos
      • 2020-01-05 Dinner at In Laws
        • Photos

Other Variant Formats

YYYY->MM->DD Event->Photos
YYYY->MM Name->DD Event->Photos

Comments

  • Breaks things down further by month under the year level
  • More nested than Three Tier methodology
  • Holiday events that span month boundaries would be broken up by month so not visually contiguous
  • Same rules possible for complex events and multiple sources as Three Tier
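
To make the month-boundary point concrete, here is a small Python sketch that builds the directory path for Variant One and for the shorter `YYYY->MM->DD Event` variant. The function names are hypothetical and `PurePosixPath` is used only so the output looks the same on every OS.

```python
from datetime import date
from pathlib import PurePosixPath

def four_tier(d, event):
    """YYYY -> YYYY-MM -> YYYY-MM-DD Event (Variant One)."""
    return PurePosixPath(f"{d:%Y}", f"{d:%Y-%m}", f"{d:%Y-%m-%d} {event}")

def four_tier_short(d, event):
    """YYYY -> MM -> DD Event (first of the other variant formats)."""
    return PurePosixPath(f"{d:%Y}", f"{d:%m}", f"{d:%d} {event}")

# Two consecutive days of one (made-up) holiday land under different month
# directories, which is the "not visually contiguous" drawback.
day_one = four_tier(date(2020, 1, 31), "Holiday")
day_two = four_tier(date(2020, 2, 1), "Holiday")
split_across_months = day_one.parent != day_two.parent
```

Note that with the short variant the event directory name on its own (`05 Walk to the Park`) no longer says which year or month it belongs to if it is ever copied out of the tree.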

Five Tier

Format

YYYY->YYYY-MM->YYYY-MM-DD->YYYY-MM-DD Event->Photos

Example

  • 2020
    • 2020-01
      • 2020-01-04
        • 2020-01-04 Sleepover Evening at In Laws
          • Photos
      • 2020-01-05
        • 2020-01-05 Sleepover Morning at In Laws
          • Photos
        • 2020-01-05 Walk to the Park
          • Photos
        • 2020-01-05 Dinner at In Laws
          • Photos

Comments

  • All the naming variants from Four Tier apply here too
  • Deep nesting
  • Day directories could end up holding only one or two event directories
    • These could contain many or few photos
  • Same rules possible for complex events and multiple sources as Three Tier

General Comments

  • With the Three Tier methodology the concern of ‘keys’ does not arise because everything is flat under the year directory.
  • With more than three tiers we might start thinking in terms of a relational database, i.e. primary and foreign keys:
    • YYYY is a primary key [PK] at level one but is a foreign key [FK] at level two and so on
    • YYYY[FK]-MM[PK]
    • YYYY[FK]-MM[FK]-DD[PK]
  • Do we even want such a bloated naming convention with more than three tiers?
  • It might be unwieldy to deal with anything beyond four tiers; we are humans and not computers after all
  • It is undesirable to rely on additional software such as Lightroom
  • The system must be as future proof as possible
  • It should be intuitive for all that may encounter it – this will be handed down and added to through the generations
  • Metadata is nice but it should be an independent concern that can be dealt with later/separately

I would greatly appreciate your thoughts and input.


4 minutes ago, Lordess said:

Deep nesting

Careful you don't hit the 255-character limit on file/folder names, or the overall path length limit on Windows.
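
A quick way to sanity-check a proposed layout is to measure it. The sketch below is only illustrative: the two constants are assumptions (255 is a common per-name limit on filesystems such as NTFS and ext4, and 260 is the traditional Windows full-path limit), and the real limits depend on the filesystem and OS settings.

```python
from pathlib import PurePosixPath

MAX_NAME = 255  # assumed per-name limit; adjust for your filesystem
MAX_PATH = 260  # assumed full-path limit (traditional Windows default)

def length_problems(path):
    """Return a list of human-readable length problems for one path."""
    problems = []
    p = PurePosixPath(path)
    for part in p.parts:
        if len(part) > MAX_NAME:
            problems.append(f"name too long ({len(part)} chars): {part[:30]}...")
    if len(str(p)) > MAX_PATH:
        problems.append(f"path too long ({len(str(p))} chars)")
    return problems

ok = length_problems("2020/2020-01/2020-01-05 Walk to the Park/Photos")
```

Even the Five Tier layout with long event names stays well under these numbers for typical family photo names; the risk grows mainly when the whole tree sits under an already long parent path.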

 


I'm "2 tier" with media category first:

- Photos
   - 200105 Event
   - 200402 Event2
- Videos
   - 200105 Event
   - 200402 Event2

Mixed media types often cause headaches. For example, resyncing folders in Lightroom from the top folder would import all the videos I don't want in there, so I'd have to sync the folders manually one by one...


20 minutes ago, Kilrah said:

I'm "2 tier" with media category first

...

Yes, I should have been more specific about that. Photos and Videos would be two distinct root directories, as you have it. I think it would be ill-advised to combine these types together.

Two videos I found useful as reference points were:

 

 


I'd go Four Tier with one of the other variant formats; there's no need to have the year or month listed 3 times.


By way of an update, I consider the research concluded. I asked the question in a number of different places and the consensus was the three tier system, separated by media type at the root level:

Pictures|Videos->YYYY->YYYY-MM-DD Event->Media Files
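
For anyone who wants to script the sorting, here is a minimal Python sketch of how a destination path could be derived under that layout. The extension sets and the `destination` function are assumptions for illustration only, and the capture date and event name are supplied by hand rather than read from metadata.

```python
from datetime import date
from pathlib import PurePosixPath

# Illustrative extension lists; extend them for your own cameras and phones.
PICTURE_EXT = {".jpg", ".jpeg", ".png", ".heic", ".cr2", ".nef", ".dng"}
VIDEO_EXT = {".mp4", ".mov", ".avi", ".mkv"}

def destination(filename, taken, event):
    """Build Pictures|Videos -> YYYY -> YYYY-MM-DD Event -> file."""
    ext = PurePosixPath(filename).suffix.lower()
    if ext in PICTURE_EXT:
        root = "Pictures"
    elif ext in VIDEO_EXT:
        root = "Videos"
    else:
        raise ValueError(f"unrecognised media type: {filename}")
    return PurePosixPath(root, f"{taken:%Y}", f"{taken:%Y-%m-%d} {event}", filename)

example = destination("IMG_0001.JPG", date(2020, 1, 5), "Walk to the Park")
```

The sketch only computes where a file should go; actually moving files is left out on purpose so nothing is touched until the resulting paths have been reviewed.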

 


My storage system is as follows, so far on four 2 TB drives (two on-premises, two off-premises).

 

My folder structure can be found in the attached screenshots.


 

Screenshot at 2020-05-23 00-49-09.png

Screenshot at 2020-05-23 00-48-43.png


You can follow the folder structure at the top of each screenshot.

For the first screenshot, each folder holds the RAW files from that shoot. I usually delete any RAWs where the exposure is way off to save space (less of a problem now that I shoot mirrorless).

For the second screenshot, "Finished Edits" has the final edited TIFF and PSD files. "RAWs" contains the RAW file; I like to keep two copies for redundancy, one here and one under "Masters". Having the RAW handy also tells me exactly which of the bazillion exposures I settled on if I ever want to revisit the shot. "Misc" holds any edits that didn't make the final cut but that I didn't want to throw away.


On 5/15/2020 at 8:51 AM, Lordess said:

I have a mess of digital media to sort out, photos and videos, and I want a robust system to move forward with. I've been doing some research and want to offer up what I've found so far for discussion.

...

I'm a professional photographer with 15 years' experience and this is very similar to what I do.

  • 2020
    • MMDDYY_Job Name
      • Capture
        • Shot#/Look
          • RAW Photos
      • Output
        • QuickProof JPG's
          • Small JPG Photos
        • Retouching
          • Whatever Retouching Assets
          • Probably another folder of working PSDs
        • Deliver
          • Final Deliverable Photos (sometimes there will be a RAW and JPEG deliverable folder)
      • Selects
        • On Set Selects
      • Trash
        • Bad RAW Photos
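
If you create this skeleton for every job, it can be scripted. The following is a rough Python sketch, not the poster's actual tooling: the `create_job` function and the example job name are hypothetical, and only the fixed top-level folders from the tree above are created (per-shot folders under Capture are left to be added by hand).

```python
from pathlib import Path

# Folder names copied from the tree above.
JOB_SUBDIRS = [
    "Capture",
    "Output/QuickProof JPG's",
    "Output/Retouching",
    "Output/Deliver",
    "Selects",
    "Trash",
]

def create_job(root, year, job_dir_name):
    """Create <root>/<year>/<job_dir_name>/ with the standard subfolders."""
    job = Path(root) / str(year) / job_dir_name
    for sub in JOB_SUBDIRS:
        # exist_ok makes the call safe to repeat on an existing job folder.
        (job / sub).mkdir(parents=True, exist_ok=True)
    return job
```

Usage would be something like `create_job("/path/to/archive", 2020, "053120_Example Job")`, where the archive path and job name are placeholders.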

I back up all my incomplete jobs (in their entirety) and final deliverable files to Dropbox. I use Dropbox Transfer or hard drives for asset delivery to clients, and when it's all said and done, I'll have a backup of my jobs on 3 hard drives in 3 different locations (1 at the studio, 1 in the safe in my car, 1 in a safety deposit box at the bank). All of those drives are also encrypted with 30-digit passwords.

Ever since I started doing this and abandoned the dumpster fire of Adobe Lightroom, I've been able to find images and jobs lightning quick.

 

If I do motion, it's a little different, I'll have a folder for proxies and stuff like that, but the overarching system is the same.


On 5/31/2020 at 5:54 AM, Action_Johnson said:

I'm a professional photographer with 15 years' experience and this is very similar to what I do.

It is good to know that you follow that sort of structure too.

On 5/31/2020 at 5:54 AM, Action_Johnson said:

I'll have a backup of my jobs on 3 hard drives

I've recently been reflecting on this. I too keep a local drive with a backup of all of my data, critical and non-critical. I also keep multiple cloud-based backups for just the critical data. The issue is that local HDDs are not generally going to be bit rot protected. It worried me quite a bit because this premise basically suggests that all storage not protected by Btrfs/ZFS is non-viable. Personally I would only use the local HDD to recover the non-critical data and defer to online storage for the rest. Just something to think about.

 

I wondered if anyone might have some insight as to a workflow for some of the online photo platforms. I made a separate topic about that here.


14 minutes ago, Lordess said:

The issue is that local HDDs are not generally going to be bit rot protected.

Every 6 months or so I do a full integrity check of my main drive and its 2 backups so any such thing can be detected and corrected. The probability that the same file/sector would fail on all 3 is almost nil.

 

Also I literally have 20+ year old PCs that still run flawlessly on the original drive and install, so it's not THAT likely at all.


6 minutes ago, Kilrah said:

Every 6 months or so I do a full integrity check of my main drive and its 2 backups...

Just curious as to what you use for this integrity check? If an error were found then is there not any parity information present to attempt to correct it?

7 minutes ago, Kilrah said:

...it's not THAT likely at all.

I do agree with you.


I use FreeFileSync, in "compare file contents" mode. It reads the entirety of each file, which exercises every part of the drives that holds data and shows any discrepancy.

 

No parity since all 3 copies are identical RAID0 sets.
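
The idea of using three identical copies in place of parity can be sketched in a few lines of Python. This is not how FreeFileSync works internally; it is just an illustration using the standard library's byte-for-byte comparison, and the `odd_one_out` function is a hypothetical name.

```python
import filecmp

def odd_one_out(a, b, c):
    """Return the path whose contents differ from the other two copies.

    Returns None when all three match and raises when all three differ,
    because then there is no majority to trust.
    """
    # shallow=False forces a comparison of the actual file contents
    # instead of trusting size and modification time.
    ab = filecmp.cmp(a, b, shallow=False)
    ac = filecmp.cmp(a, c, shallow=False)
    if ab and ac:
        return None
    if ab:
        return c
    if ac:
        return b
    if filecmp.cmp(b, c, shallow=False):
        return a
    raise ValueError("all three copies differ; no majority to trust")
```

When one copy is flagged, it can be overwritten from either of the two that agree, which is the "detected and corrected" step without any parity data.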


3 minutes ago, Kilrah said:

I use FreeFileSync, in "compare file contents" mode. It reads the entirety of each file, which exercises every part of the drives that holds data and shows any discrepancy.

Ah ok, so does it create some kind of hash the first time and then all subsequent passes compare against this?


No, the standard sync is based on file date/time/size like most sync programs.

With compare file contents it actually... compares the entire contents, not hashes.
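
For comparison, the hash-based approach asked about above would look roughly like the Python sketch below. To be clear, this is a different technique from what FreeFileSync does: it records a SHA-256 per file once and compares later runs against that record, which works even with a single copy. The function names are hypothetical.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large videos do not need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

def changed_files(old, new):
    """Files present in both manifests whose digests no longer match."""
    return sorted(k for k in old if k in new and old[k] != new[k])
```

One caveat: a hash mismatch only says the file changed, not why, so a deliberately edited file shows up the same way as a corrupted one. It also cannot repair anything by itself; a good copy is still needed for that.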


7 hours ago, Lordess said:

...

The issue is that local HDDs are not generally going to be bit rot protected.

...

3 hard drives in my mind protect against that, and honestly so much of what I shoot isn't so mission critical that it needs to be kept forever.


9 hours ago, Lordess said:

 I also keep multiple cloud based backups for just the critical data.

That is overkill. 1 good one like Backblaze and you will be fine.


23 minutes ago, GDRRiley said:

That is overkill. 1 good one like Backblaze and you will be fine.

Better to have data in several different locations with different providers.


4 minutes ago, Lordess said:

Better to have data in several different locations with different providers.

given they all have different geo locations no need to


1 hour ago, GDRRiley said:

given they all have different geo locations no need to

You are still relying on only one provider though.
