
Indexing and 'caching' Imgur images to avoid linkrot for forums

So apparently Imgur is going to start removing(?)/banning uploads from anonymous users (also NSFW content, but that isn't relevant for this forum). I remember clicking through dead forums after the whole Photobucket thing and would prefer if this didn't happen here.

Is there any way that the website (forum) could deal with this? Maybe check posts for Imgur links and then create a sort of cache for them (maybe through the Wayback Machine?), so that if an image gets removed, the thread doesn't become useless? (I assume it would mostly affect help posts?)

 

Sorry, I don't really know too much about how that would work, and I hope I posted this in the right place!

Pumpkins Deserve Hats!

 


On this forum you can actually upload images directly, which is neat, though I guess some people are still using Imgur.

 

7 minutes ago, Pumpkin said:

I remember clicking through dead forums after the whole Photobucket thing and would prefer if this didn't happen here

I feel your pain

 

Looking at ancient posts from many years ago about overclocking some old 775 hardware, voltmods, or some other form of modding, and the fucking pictures are gone 🗿


IPS forum search is thankfully pretty robust. You can find most of the imgur links here:

 

Showing results for 'i.imgur.com'. - Linus Tech Tips

Showing results for 'imgur.com/a/'. - Linus Tech Tips

Showing results for 'imgur.com/gallery'. - Linus Tech Tips

 

These searches seem to be limited to 1000 results at a time, though.

PLEASE QUOTE ME IF YOU ARE REPLYING TO ME

Desktop Build: Ryzen 7 2700X @ 4.0GHz, AsRock Fatal1ty X370 Professional Gaming, 48GB Corsair DDR4 @ 3000MHz, RX5700 XT 8GB Sapphire Nitro+, Benq XL2730 1440p 144Hz FS

Retro Build: Intel Pentium III @ 500 MHz, Dell Optiplex G1 Full AT Tower, 768MB SDRAM @ 133MHz, Integrated Graphics, Generic 1024x768 60Hz Monitor


 


If one of the server mods could run a full search for anything 'imgur.com' and just give us the list of URLs, we could import that into ArchiveTeam's imgur scraper.


3 hours ago, Thibaultmol said:

If one of the server mods could run a full search for anything 'imgur.com' and just give us the list of URLs, we could import that into ArchiveTeam's imgur scraper.

Any preferred format? XML, CSV, plain text w/ 1 url per line?



 


57 minutes ago, rcmaehl said:

Any preferred format? XML, CSV, plain text w/ 1 url per line?

A txt file is fine, yeah.

I'll then download each forum page, extract the imgur URLs from it, and upload those to ArchiveTeam.

 

For anyone that wants to help: https://wiki.archiveteam.org/index.php/ArchiveTeam_Warrior

Or you can run the imgur downloader directly as a docker container if you're used to that: https://github.com/ArchiveTeam/imgur-grab
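
A rough sketch of what that extraction step could look like, assuming the forum pages have already been saved as local `.html` files. The function names, the directory layout, and the regex are all made up for illustration, and the one-URL-per-line output is only based on the "txt file is fine" answer above, not on any documented ArchiveTeam input format.

```python
import re
from pathlib import Path

# Assumption: this pattern covers the link shapes mentioned in this thread
# (i.imgur.com/..., imgur.com/a/..., imgur.com/gallery/...). It is not an
# official definition of what counts as an imgur URL.
IMGUR_URL = re.compile(
    r"https?://(?:[a-z0-9-]+\.)?imgur\.com/[A-Za-z0-9_./-]+",
    re.IGNORECASE,
)


def extract_imgur_urls(html: str) -> list[str]:
    """Return the unique imgur URLs found in a page, in first-seen order."""
    seen: dict[str, None] = {}
    for match in IMGUR_URL.finditer(html):
        # Drop sentence punctuation that got glued onto the end of a link.
        url = match.group(0).rstrip(".,")
        seen.setdefault(url, None)
    return list(seen)


def collect(page_dir: str, out_file: str) -> int:
    """Scan every *.html file in page_dir and write one URL per line.

    Returns the number of unique URLs written.
    """
    urls: dict[str, None] = {}
    for page in sorted(Path(page_dir).glob("*.html")):
        text = page.read_text(encoding="utf-8", errors="replace")
        for url in extract_imgur_urls(text):
            urls.setdefault(url, None)
    Path(out_file).write_text(
        "".join(url + "\n" for url in urls), encoding="utf-8"
    )
    return len(urls)
```

Calling `collect("pages", "imgur_urls.txt")` (both paths hypothetical) would then produce the plain text list. Deduplicating before upload keeps the list small, since the same image often gets quoted several times in one thread.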


There are 113k posts that link to imgur, so I don't think a list of imgur links is going to be feasible here; even if they do all get archived, it's not clear what we would do with them afterwards. That is out of a total of 16M posts (about 0.7%), so if all imgur images do go away, that will impact a non-negligible fraction of posts.

 

Imgur is much less common in more recent posts though - of the last 2M posts, only a little over 3.5k (roughly 0.2%) reference imgur. There is certainly value in the very old posts, but a post's relevance does tend to decay over time, and many of those old posts will likely never be viewed again. I do also wonder how often the image is actually central to a post - in troubleshooting posts for example, images are often provided by the OP to show their problem, but less often included in responses.

 

It's not clear what the imgur cleanup will actually mean in practice - imgur's announcement just says the decidedly vague

Quote

We will be focused on removing old, unused, and inactive content that is not tied to a user account

and none of the news outlets that picked it up were able to get any more details, despite having reached out to imgur for comment. It's possible that this will actually have no impact, but unless they provide some more details I'm not sure how we'll be able to confirm that.

HTTP/2 203


8 hours ago, colonel_mortis said:

There are 113k posts that link to imgur, so I don't think a list of imgur links is going to be feasible here; even if they do all get archived, it's not clear what we would do with them afterwards. That is out of a total of 16M posts, so if all imgur images do go away, that will impact a non-negligible fraction of posts.

 

Imgur is much less common in more recent posts though - of the last 2M posts, only a little over 3.5k reference imgur. There is certainly value in the very old posts, but posts' relevancy does tend to decay over time, and many of those old posts will likely never be viewed again. I do also wonder how often the image is actually central to a post - in troubleshooting posts for example, images are often provided by the OP to show their problem, but less often included in responses.

 

It's not clear what the imgur cleanup will actually mean in practice - imgur's announcement just says the decidedly vague

and none of the news outlets that picked it up were able to get any more details, despite having reached out to imgur for comment. It's possible that this will actually have no impact, but unless they provide some more details I'm not sure how we'll be able to confirm that.

ArchiveTeam is saving them and uploading them to archive.org.

Dashboard: https://tracker.archiveteam.org/imgur/

 

They're trying to collect as many imgur URLs as possible, and seeing as the LTT forums have a long history and indeed a lot of imgur links, I think this would be a great addition.

