Hello.

 

I'm looking for a scraper script that is easy to understand and use.

 

Currently I have looked at:

http://simplehtmldom.sourceforge.net/

 

http://www.oooff.com/php-scripts/basic-php-scrape-tutorial/basic-php-scraping.php

 

https://gist.github.com/anchetaWern/6150297

 

And right now I use:

$start_description1 = '<a href="secretlinko">';
$end_description1 = '</a>';
$description1_start_pos = strpos($link, $start_description1) + strlen($start_description1);
// Search for the closing tag from the start position, so an earlier </a> in $link is not matched.
$description1_end_pos = strpos($link, $end_description1, $description1_start_pos) - $description1_start_pos;
$description1 = substr($link, $description1_start_pos, $description1_end_pos);

And it works, but it tends to break because of the HTML code :P

 

So could anyone quickly explain how to use one of the ones I found, or something better? I'm having trouble understanding even the simple ones, since they jump right into parts they assume I already know.

Back-end developer, electronics "hacker"

https://linustechtips.com/topic/758472-php-simple-scraper/

These are some of the elements I'm trying to collect:

<span style="padding-left: 3px;">47,262,677 Dollar</span>

<div class="successBox">
  <span style="text-shadow: 1px 1px 1px #000; color: #FFF;">
  <span style="display: block; font-weight: bold; color: #FFF;">Success</span>
  Action was success and you got <span styl="font-weight: bold;">400</span> Dollar.
  </span>
</div>

 


I personally would use the PHP DOMDocument class. I think it's easier to load the full document as a DOM object (much like an XML tree) and grab the nodes you need.

The third link you posted (the gist) offers a pretty good example. You should try doing it that way.
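
To give a rough idea of how that could look with the markup you posted, here is a minimal sketch. The variable names and the XPath expressions are my own assumptions based only on that snippet, so the real page may need different selectors.

```php
<?php
// Sample markup copied from the earlier post (including the "styl" typo).
$html = <<<'HTML'
<span style="padding-left: 3px;">47,262,677 Dollar</span>

<div class="successBox">
  <span style="text-shadow: 1px 1px 1px #000; color: #FFF;">
  <span style="display: block; font-weight: bold; color: #FFF;">Success</span>
  Action was success and you got <span styl="font-weight: bold;">400</span> Dollar.
  </span>
</div>
HTML;

// Real-world HTML is rarely valid, so collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Balance: the first <span> whose style attribute contains "padding-left" (assumed selector).
$balanceNode = $xpath->query('//span[contains(@style, "padding-left")]')->item(0);
// Strip everything that is not a digit, so "47,262,677 Dollar" becomes 47262677.
$balance = $balanceNode ? (int) preg_replace('/\D/', '', $balanceNode->textContent) : null;

// Status and reward: first and second <span> inside the outer <span> of the success box.
$statusNode = $xpath->query('//div[@class="successBox"]/span/span[1]')->item(0);
$rewardNode = $xpath->query('//div[@class="successBox"]/span/span[2]')->item(0);
$status = $statusNode ? trim($statusNode->textContent) : null;
$reward = $rewardNode ? (int) trim($rewardNode->textContent) : null;

echo "Balance: $balance\n";
echo "Status: $status\n";
echo "Reward: $reward\n";
```

For a live page you would fetch the HTML first (for example with file_get_contents($url), if allow_url_fopen is enabled, or with cURL) and pass that string to loadHTML() instead of the hard-coded sample. Because the nodes are found by structure rather than by string position, extra whitespace or attributes in the markup should not break it the way the strpos approach does.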

