
Sorting out corrupted files.


Hello, so basically the problem is that I accidentally formatted my hard drive and ended up losing my files. Using PhotoRec I got the files back, but many of them are corrupted, so now I'm trying to separate the corrupted files from the working ones. I used pdfinfo to see which PDFs are corrupted, with this command:

FORFILES /S /M *.pdf /C "cmd /c echo. & echo @path & C:\Users\HP\Desktop\xpdfbin-win-3.04\bin64\pdfinfo.exe @file" 1>text.txt 2>&1

Now I have a text file that lists both the corrupted files and the ones that are fine. It looks something like this.

For a normal/working PDF:

Quote

"D:\Users\**user\*folder1\file.pdf"  
Title:          
Subject:        
Keywords:       
Author:         
Creator:        Rich
Producer:      tectex_1.40.13
CreationDate:   07/02/15 18:33:47
ModDate:        07/02/15 18:33:47
Tagged:         no
Form:           none
Pages:          76
Encrypted:      no
Page size:      612 x 792 pts (letter) (rotated 0 degrees)
File size:      2215875 bytes
Optimized:      no
PDF version:    1.5

For corrupted:

Quote

"D:\Users\**user\folder1\file2.PDF"  
Syntax Warning: May not be a PDF file (continuing anyway)
Syntax Error: Couldn't read xref table
Syntax Warning: PDF file is damaged - attempting to reconstruct xref table...
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't read xref table
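
The two outputs above differ in one obvious way: the damaged file produces lines that begin with "Syntax", the working one does not. A minimal Python sketch of that test, assuming pdfinfo always reports damage this way (the function name `looks_corrupted` and the sample strings are made up for illustration, taken from the output quoted above):

```python
def looks_corrupted(pdfinfo_output: str) -> bool:
    """Return True if any line of the pdfinfo output begins with 'Syntax'."""
    return any(
        line.lstrip().startswith("Syntax")
        for line in pdfinfo_output.splitlines()
    )


# Shortened samples based on the two outputs quoted in this post.
good = "Creator:        Rich\nPages:          76\nEncrypted:      no\n"
bad = (
    "Syntax Warning: May not be a PDF file (continuing anyway)\n"
    "Syntax Error: Couldn't read xref table\n"
)

print(looks_corrupted(good))  # False
print(looks_corrupted(bad))   # True
```

Checking the start of the line rather than the whole line avoids flagging a healthy PDF whose title happens to contain the word "Syntax".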

So now to my question: how can I save only the normal/working files, without saving the corrupted ones?

My idea is that the extension of the corrupted files could be changed to, for example, .old, and then I could use this command:

for /R c:\source %%f in (*.PDF) do copy "%%f" x:\destination\

Can someone help me with this one? The idea is there, but I just can't connect the pieces.

Thanks in advance!

https://linustechtips.com/topic/581176-sorting-out-corrupted-files/

Are all of these in one text file? Or is it a single text file for each PDF?

 

Just read the command -___- hahah. I will try some stuff. brb

CPU: i5 4670k @ 3.4GHz + Corsair H100i | GPU: Gigabyte GTX 680 SOC (+215 Core|+162 Mem) | SSD: Kingston V300 240GB (OS) | Headset: Logitech G930

Case: Corsair Vengeance C70 (white) | RAM: 16GB TeamGroup Elite Black DDR3 1600MHz | HDD: 1TB WD Blue | Mouse: Logitech G602

OS: Windows 7 Home Premium | PSU: XFX Core Edition 750w | Motherboard: MSI Z97-G45 | Keyboard: Logitech G510


Welcome to the forums @rich1187

Sorry cannot help with the command line text, however I do know that each and every *.pdf will have its own digital signature. That is how programs tell if you have more than one of the same *.pdf 
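
To make the "signature" idea concrete: duplicate finders typically compare a content hash of each file, and two files with the same hash are treated as copies. The sketch below is only an illustration of that, assuming SHA-256 as the hash; the function names `sha256_of` and `find_duplicates` are made up. Note that a hash only finds identical copies, it says nothing about whether a PDF is corrupted.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large PDFs are not loaded at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> Dict[str, List[Path]]:
    """Group the PDFs under root by content hash; keep groups of 2 or more."""
    groups: Dict[str, List[Path]] = defaultdict(list)
    for path in root.rglob("*"):
        if path.is_file() and path.suffix.lower() == ".pdf":
            groups[sha256_of(path)].append(path)
    return {h: sorted(paths) for h, paths in groups.items() if len(paths) > 1}
```

Calling `find_duplicates(Path("some_folder"))` on a folder of recovered files (the folder name is a placeholder) returns one entry per set of identical PDFs.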

Those who deny freedom to others deserve it not for themselves (Abraham Lincoln, 1809-1865; 16th US president).


Many hours later:

forfiles /m *.pdf /c "cmd /c echo. & echo @path & pdfinfo.exe @file 1>@fname.txt 2>&1 & findstr /N "Syntax" @fname.txt && (echo File corrupted... Moving... & del @fname.txt & move "@path" "Corrupted") || (echo File not corrupted... Keeping... & del @fname.txt)" 1>Files.txt 2>&1
pause

Just put pdfinfo.exe and this batch file in the SAME folder as all of the PDFs, and create a folder in there called "Corrupted". Then run it.

I tried this with a working and a broken PDF. It separated the two and created a log file (Files.txt) of what happened, which is nice.
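
For anyone who would rather not gather everything into one folder first, here is a rough Python sketch of the same idea that walks subfolders itself. It is not the batch file above, just the same logic under some assumptions: `pdfinfo` can be found on the PATH (otherwise pass the full path as `exe`), a file counts as corrupted when any output line begins with "Syntax", and the names `pdfinfo_output` and `sort_pdfs` are made up for this example.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Callable, List


def pdfinfo_output(pdf: Path, exe: str = "pdfinfo") -> str:
    """Run pdfinfo on one file and return stdout and stderr combined."""
    result = subprocess.run(
        [exe, str(pdf)],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # same effect as 2>&1 in the batch file
        text=True,
        errors="replace",
    )
    return result.stdout


def sort_pdfs(
    root: Path,
    corrupted_dir: Path,
    get_output: Callable[[Path], str] = pdfinfo_output,
) -> List[Path]:
    """Move every PDF under root that pdfinfo complains about."""
    corrupted_dir.mkdir(parents=True, exist_ok=True)
    # Build the list up front so moving files does not disturb the walk.
    pdfs = [p for p in root.rglob("*")
            if p.is_file() and p.suffix.lower() == ".pdf"]
    moved = []
    for pdf in pdfs:
        if corrupted_dir in pdf.parents:
            continue  # already sorted on an earlier run
        lines = get_output(pdf).splitlines()
        if any(line.lstrip().startswith("Syntax") for line in lines):
            target = corrupted_dir / pdf.name
            counter = 1
            while target.exists():  # never overwrite a same-named file
                target = corrupted_dir / f"{pdf.stem}_{counter}{pdf.suffix}"
                counter += 1
            shutil.move(str(pdf), str(target))
            moved.append(pdf)
    return moved
```

A call such as `sort_pdfs(Path("recovered"), Path("recovered") / "Corrupted")` (the folder name is a placeholder) returns the original paths of the files it moved. The `get_output` parameter is only there so the logic can be tried without pdfinfo installed.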



Thank you!!! I couldn't use this batch file straight away, because the PDFs were not all in one folder but in subfolders, so I used the command below to copy all the PDFs into one folder and then used your batch file to sort them. I can't thank you enough, really, thanks!!

for /r %%p in (*.pdf) do copy "%%p" C:\*Destinationfolder
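
One thing to watch with copying everything into a single folder: two PDFs with the same name in different subfolders end up competing for one file name, and the earlier copy may be overwritten. A small Python sketch of the same flattening step that renames instead of overwriting, under the assumption that adding a numeric suffix is acceptable (the function name `flatten_pdfs` and the `_1`, `_2` suffix scheme are made up for illustration):

```python
import shutil
from pathlib import Path
from typing import List


def flatten_pdfs(source: Path, destination: Path) -> List[Path]:
    """Copy every PDF below source into destination without overwriting."""
    destination.mkdir(parents=True, exist_ok=True)
    # Sorted so the renaming is the same on every run.
    pdfs = sorted(p for p in source.rglob("*")
                  if p.is_file() and p.suffix.lower() == ".pdf")
    copied = []
    for pdf in pdfs:
        if destination in pdf.parents:
            continue  # skip files that are already in the destination
        target = destination / pdf.name
        counter = 1
        while target.exists():
            # Name clash: try file_1.pdf, file_2.pdf, ...
            target = destination / f"{pdf.stem}_{counter}{pdf.suffix}"
            counter += 1
        shutil.copy2(pdf, target)
        copied.append(target)
    return copied
```

The function returns the list of files it created, so the result can be checked against the number of PDFs that were expected.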

 

It's alright! Don't forget to "Mark as Answered" in case anyone else has this problem!

