Need help determining duplicate files inside a 7z archive

So, after seeing today's video, I wanted to do something about the 300 GB 7z archive sitting on my D drive. It's a compressed archive of all the files that were on my PC at the time I made it, so it contains lots of files that I still have today. What I want to do is delete every file in the archive that I still have, leaving only the files that I have deleted or lost over time. This turned out to be harder than I thought.


My strategy was to use MD5 checksums to determine which files I had duplicates of, but this ran into some problems:
1) You can't calculate an MD5 checksum for an entire folder, so the best I could do is recurse into every folder, calculate the MD5 checksum of every single file, and list them in a txt file. Then I could compare the txt files to see which lines don't match. This runs into a running-time problem though: even if calculating a single MD5 checksum is quick, having thousands of files means that even a moderate 15 GB folder takes minutes or even hours.

2) I don't know if it's possible to calculate the MD5 of a folder that is still inside a 7z file, so to calculate it I would need to extract the files from the archive first. That takes disk space, which I am currently very short on, not to mention the waiting time for each extraction.

3) Furthermore, deleting a file from within a 7z archive (apparently) requires adequate free disk space. So even if I find two folders that contain exactly the same files, I wouldn't be able to delete the archived copy and free up the space.
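
For reference, the per-file approach from point 1 looks roughly like the Python sketch below. It is only a minimal sketch, not a finished tool: the function names (md5_of_file, build_manifest, only_in_first) are made up for illustration, and it compares checksums only, so a file that was merely renamed or moved still counts as a duplicate.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 of one file, read in chunks to keep memory use flat."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path relative to root (POSIX style) to its MD5."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): md5_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def only_in_first(manifest_a, manifest_b):
    """Paths in manifest_a whose checksum appears nowhere in manifest_b."""
    known = set(manifest_b.values())
    return sorted(path for path, digest in manifest_a.items() if digest not in known)
```

Calling build_manifest on the extracted folder and on the live folder, then only_in_first(extracted, live), would give the files that exist only in the old copy. It still has to read every byte of every file, so it does nothing about the running time.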


Is there a way to tackle this problem? I know that if I buy some additional storage (maybe an external hard drive) I could get around the storage space issues, but that still wouldn't solve how long it would take to calculate all the checksums.
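
One idea that might sidestep the extraction in point 2: as far as I can tell, running `7z l -slt` on a 7z archive prints a size and a CRC32 for each stored file without extracting anything. Assuming that output has been saved to a text file and consists of blank-line-separated blocks of `Key = Value` lines with `Path`, `Size`, and `CRC` fields, a rough sketch could match archived files against files on disk by size and CRC32. The function names below are hypothetical, the listing layout is my assumption, and CRC32 is only 32 bits, so a match is a strong hint rather than proof.

```python
import zlib
from pathlib import Path


def parse_slt_listing(text):
    """Parse 'Key = Value' blocks into (path, size, crc) tuples.

    Assumes the block layout described above; blocks without a CRC
    value (folders, the archive header, empty files) are skipped.
    """
    entries = []
    for block in text.replace("\r\n", "\n").split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(" = ")
            if sep:
                fields[key.strip()] = value.strip()
        if "Path" in fields and fields.get("CRC") and fields.get("Size", "").isdigit():
            entries.append((fields["Path"], int(fields["Size"]), fields["CRC"].upper()))
    return entries


def crc32_of_file(path, chunk_size=1 << 20):
    """Return the CRC32 of one file as 8 uppercase hex digits."""
    crc = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return f"{crc & 0xFFFFFFFF:08X}"


def archived_files_still_on_disk(entries, root):
    """Return archive paths whose (size, CRC32) matches some file under root."""
    wanted_sizes = {size for _, size, _ in entries}
    on_disk = set()
    for p in Path(root).rglob("*"):
        if p.is_file():
            size = p.stat().st_size
            if size in wanted_sizes:  # only hash files whose size could match
                on_disk.add((size, crc32_of_file(p)))
    return sorted(path for path, size, crc in entries if (size, crc) in on_disk)
```

Files on disk whose size matches nothing in the archive are never read, which should cut down the checksum time, but every candidate file still has to be read once.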


Anything helps, thanks
