web crawler – Scraping class name on a website using php

Posted by: admin February 25, 2020

Questions:

So, I want to scrape a class name from a website. Here’s the HTML source:

<td title="Complexity" class="cvss6" itemscope itemtype="http://schema.org/Rating">

I want to scrape only the “cvss6” value. I tried this:

$nilai1 = explode('<td title="Complexity" class="', $kodeHTML);
$nilai_show2 = explode('" itemscope="', $nilai1[1]);

echo "<tr><td width='85%' align='left' bgcolor='#F5F5F5'>".$judul_show[0]."</td>";

if ($nilai_show2[0] == 'cvss6') {
    echo "<td width='15%' align='center' bgcolor='#FF0000'>High</td></tr>";
}

but it didn’t work: nothing shows up on my site. I managed to scrape the page’s plain text, but how do you scrape a value that is inside the class attribute?
Thanks

How to & Answers:

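To answer your question: you could use a regular expression to find what you need.
The code below uses preg_match_all (https://www.php.net/manual/fr/function.preg-match-all.php) to find every class="something" attribute and capture its value. The m (multiline) flag is set, although it only changes how ^ and $ match, so this pattern behaves the same without it: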

preg_match_all(
    '/class="(.+?)"/m',
    '<b>exemple : </b><div class="test test1 test2 test3" align=left>This is a test</div class="t1 t2">',
    $out
);

var_dump($out[1]);

/* output
  array(2) {
    [0]=>
    string(22) "test test1 test2 test3"
    [1]=>
    string(5) "t1 t2"
  }
*/
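
Applied to the markup from the question, a narrower pattern can capture only the class of the Complexity cell. Note that the explode() attempt in the question cannot isolate the value, because the delimiter '" itemscope="' never occurs in the markup: itemscope is a valueless attribute followed by a space. The following is a minimal sketch, not a definitive fix. It assumes the attributes appear in exactly the order shown in the question, the sample string stands in for the fetched $kodeHTML, and complexityClass is a hypothetical helper name.

```php
<?php
// Sample input standing in for the fetched page ($kodeHTML in the question).
$kodeHTML = '<tr><td title="Complexity" class="cvss6" itemscope itemtype="http://schema.org/Rating"></td></tr>';

// Hypothetical helper: returns the class attribute of the first
// <td title="Complexity" class="..."> cell, or null when none is found.
function complexityClass(string $html): ?string
{
    // [^"]* captures everything up to the closing quote of the class value.
    $pattern = '/<td\s+title="Complexity"\s+class="([^"]*)"/';
    if (preg_match($pattern, $html, $match) === 1) {
        return $match[1];
    }
    return null;
}

$class = complexityClass($kodeHTML);
if ($class === 'cvss6') {
    echo "<td width='15%' align='center' bgcolor='#FF0000'>High</td></tr>";
}
```

A pattern tied to attribute order is fragile: if the site reorders title and class, the helper returns null and nothing is printed.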

I also advise you to use a library to crawl web pages with PHP, for example the Symfony DomCrawler component:

https://symfony.com/doc/current/components/dom_crawler.html
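
To illustrate the parser-based approach without installing anything, here is a rough sketch that uses PHP's built-in DOM extension (DOMDocument and DOMXPath) instead of DomCrawler, so it is a swapped-in technique rather than the library linked above. The sample string stands in for the fetched $kodeHTML, and complexityClassFromDom is a hypothetical helper name.

```php
<?php
// Sample input standing in for the fetched page ($kodeHTML in the question).
$kodeHTML = '<table><tr><td title="Complexity" class="cvss6" itemscope itemtype="http://schema.org/Rating"></td></tr></table>';

// Hypothetical helper: returns the class attribute of the first
// td element whose title is "Complexity", or null when none is found.
function complexityClassFromDom(string $html): ?string
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of printing them;
    // real-world pages are rarely well-formed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $cells = $xpath->query('//td[@title="Complexity"]');
    if ($cells === false || $cells->length === 0) {
        return null;
    }

    $cell = $cells->item(0);
    return $cell->hasAttribute('class') ? $cell->getAttribute('class') : null;
}

echo complexityClassFromDom($kodeHTML);
```

Unlike the string-splitting and regex approaches, this lookup does not depend on the order of the attributes inside the td tag.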