How to remove a link from Simple HTML DOM data





I have this code. It gets the info I want, but it also returns the link along with the data, for example:


require_once('simple_html_dom.php');
set_time_limit(0);

$url = 'www.domain.com';
$html = file_get_html($url);
// read the outer container
foreach ($html->find('#content') as $element) {
    // read each paragraph inside it
    foreach ($element->find('p') as $phone) {
        echo $phone;
    }
}



Mobile Pixel 2 -
google << this is the link



But I need to remove this link. The problem is that the HTML I scrape looks like this:


<p>the info that i really need is here<p>
<p class="text-right"><a class="btn btn-default espbott aplus" role="button"
href="brand/google.html">Google</a></p>



I read this question:
Simple HTML Dom: How to remove elements?
But I can't find the answer there.



Update: if I use this:


foreach ($element->find('p[class="text-right"]') as $link) { /* ... */ }



It selects the links, but I can't remove them from the scraped data.




2 Answers



You can use file_get_contents with str_get_html and strip the matched elements with str_replace:


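include 'simple_html_dom.php';

// $url is the address of the page to scrape
$content = file_get_contents($url);

$html = str_get_html($content);
// read the outer container
foreach ($html->find('#content') as $element) {
    // find the paragraphs that hold the link
    foreach ($element->find('p[class="text-right"]') as $phone) {
        // cut each matched paragraph out of the raw page source
        $content = str_replace($phone, '', $content);
    }
}
// prints the whole page, minus the removed paragraphs
print $content;
die;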





Hi, with this code I get the complete page that I'm trying to scrape, not just the part I need.
– AndrewPP
Jun 29 at 18:08



Or here is a native version:



PHP-CODE


$sHtml = '<p>the info that i really need is here<p>
<p class="text-right"><a class="btn btn-default espbott aplus" role="button"
href="brand/google.html">Google</a></p>';

$sHtml = '<div id="wrapper">' . $sHtml . '</div>';
echo "org:\n";
echo $sHtml;

echo "\n\n";

$doc = new DOMDocument();
$doc->loadHtml($sHtml);

foreach( $doc->getElementsByTagName( 'a' ) as $element ) {
$element->parentNode->removeChild( $element );
}

echo "res:\n";
echo $doc->saveHTML($doc->getElementById('wrapper'));



Output


org:

the info that i really need is here


Google



res:

the info that i really need is here








https://3v4l.org/RhuEU
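
Note that the loop above removes every <a> in the document, and it removes nodes while iterating the live list returned by getElementsByTagName, which can skip elements when there is more than one match. Here is a minimal sketch that removes only the p.text-right paragraphs. It assumes the markup from the question with the first paragraph closed properly and everything wrapped in a <div id="content">, which is my guess at the page structure. DOMXPath is used because its result list is a snapshot, so removing nodes inside the loop is safe:

```php
<?php
// Sample markup is an assumption: the question's snippet, with the first
// <p> closed, wrapped in the #content container the question selects.
$sHtml = '<div id="content">
<p>the info that i really need is here</p>
<p class="text-right"><a class="btn btn-default espbott aplus" role="button"
href="brand/google.html">Google</a></p>
</div>';

$doc = new DOMDocument();
// Collect parser warnings internally instead of printing them;
// real-world pages are rarely valid HTML.
libxml_use_internal_errors(true);
$doc->loadHTML($sHtml);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Match <p> elements inside #content whose class list contains "text-right".
$query = '//*[@id="content"]//p[contains(concat(" ", normalize-space(@class), " "), " text-right ")]';
foreach ($xpath->query($query) as $p) {
    $p->parentNode->removeChild($p);
}

// Collect the text of the paragraphs that are left.
$lines = [];
foreach ($xpath->query('//*[@id="content"]//p') as $p) {
    $lines[] = trim($p->textContent);
}

echo implode("\n", $lines), "\n";
```

With the sample markup this prints only the text of the first paragraph. For a real page you would load the downloaded HTML instead of $sHtml and adjust the two XPath expressions to the actual structure.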





I've already edited the code, but I only see the plain text of the URL.
– AndrewPP
Jun 29 at 18:20






Take a look at the HTML source (Ctrl+U).
– Pilan
Jun 29 at 18:43
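
If the point of the comment above is that the browser renders whatever markup the script echoes, so tags never show up on the page itself, one way to inspect the result without opening the page source is to escape it before printing. This is a small sketch under that assumption; $markup is a hypothetical stand-in for the scraped string:

```php
<?php
// Hypothetical stand-in for the markup returned by the scraper.
$markup = '<p class="text-right"><a href="brand/google.html">Google</a></p>';

// Convert <, >, & and quotes to entities so the browser shows the tags
// as text instead of rendering them.
$escaped = htmlspecialchars($markup, ENT_QUOTES, 'UTF-8');

echo '<pre>', $escaped, '</pre>';
```

This only changes how the string is displayed; it does not remove anything from it.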






