php – I need to split text delimited by paragraph tag

Posted by: admin July 12, 2020

Questions:
$text = "<p>this is the first paragraph</p><p>this is the first paragraph</p>";

I need to split the above into an array delimited by the paragraph tags. That is, I need to split the above into an array with two elements:

array([0] => "this is the first paragraph", [1] => "this is the first paragraph")

Answers:

Remove the closing </p> tags, as we don’t need them, and then explode the string into an array on the opening <p> tags.

$text = "<p>this is the first paragraph</p><p>this is the first paragraph</p>";
$text = str_replace('</p>', '', $text);
$array = explode('<p>', $text);

As you can see, this code will leave you with an empty array entry at index 0, because the string starts with the <p> delimiter. If this is a problem, it can easily be removed by calling array_shift($array) before using the array.
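A minimal sketch of the complete approach, with the array_shift() call applied so that only the two paragraphs remain. It assumes the input always starts with a plain <p> tag:

```php
<?php
$text = "<p>this is the first paragraph</p><p>this is the first paragraph</p>";

// Drop the closing tags, then split on the opening tags.
$text  = str_replace('</p>', '', $text);
$array = explode('<p>', $text);

// The string starts with "<p>", so index 0 holds an empty string; remove it.
array_shift($array);

print_r($array);
```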

Answer:

For anyone else who finds this, don’t forget that a <p> tag may have styles, ids or any other attributes, so you should probably split on a pattern like this (the \b keeps it from also matching tags such as <pre>):

$ps = preg_split('#<p\b[^>]*>#i', $input);
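Splitting only on the opening tag still leaves the closing </p> in each piece and an empty first element. The sketch below is one possible variation, not the original answer's code: it also matches the closing tag and passes PREG_SPLIT_NO_EMPTY to discard empty pieces. The sample input is made up for illustration.

```php
<?php
// Hypothetical input with attributes on the <p> tags.
$input = '<p class="intro">first</p><p id="second" style="color:red">second</p>';

// Match opening tags (with any attributes) and closing tags alike.
// -1 means "no limit"; the flag removes the empty strings between tags.
$ps = preg_split('#</?p\b[^>]*>#i', $input, -1, PREG_SPLIT_NO_EMPTY);

print_r($ps);
```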

Answer:

This is an old question, but I was not able to find any reasonable solution in an hour of looking through Stack Overflow answers. If you have a string full of HTML tags (<p> tags) and you want to get the paragraphs (or the first paragraph), use DOMDocument.

$long_description is a string that has <p> tags in it.

$long_descriptionDOM = new DOMDocument();
// This is how you use it with UTF-8
// (note: the 'HTML-ENTITIES' encoding is deprecated as of PHP 8.2)
$long_descriptionDOM->loadHTML(mb_convert_encoding($long_description, 'HTML-ENTITIES', 'UTF-8'));
$paragraphs = $long_descriptionDOM->getElementsByTagName('p');
// textContent is a property, not a method
$first_paragraph = $paragraphs->item(0)->textContent;

I guess that this is the right solution. No need for regex.

edit: YOU SHOULD NOT USE REGEX TO PARSE HTML.
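To get every paragraph into an array, as the question asks, the DOMDocument approach can be extended with a loop. This is a sketch under assumptions: the sample $html is made up, and prefixing an XML encoding declaration is a commonly used workaround for UTF-8 input rather than something the answer above prescribes.

```php
<?php
// Hypothetical input; attributes and nested tags are handled by the parser.
$html = '<p class="a">this is the first paragraph</p><p>this is the <b>second</b> paragraph</p>';

$dom = new DOMDocument();
// The encoding prefix hints that the input is UTF-8 (assumed workaround).
// The LIBXML_* flags silence warnings about imperfect markup.
$dom->loadHTML('<?xml encoding="UTF-8">' . $html, LIBXML_NOERROR | LIBXML_NOWARNING);

$paragraphs = [];
foreach ($dom->getElementsByTagName('p') as $p) {
    // textContent returns the text of the element and its descendants.
    $paragraphs[] = $p->textContent;
}

print_r($paragraphs);
```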

Answer:

$text = "<p>this is the first paragraph</p><p>this is the first paragraph</p>";

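$exptext = explode("<p>", $text);

// $exptext[0] is an empty string because $text starts with "<p>",
// and each remaining piece still ends with "</p>", so strip it.
echo strip_tags($exptext[1]);
echo "<br>";
echo strip_tags($exptext[2]);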

//////////////// OUTPUT /////////////////

this is the first paragraph
this is the first paragraph

Answer:

Try this code:

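<?php
$text = "<p>this is the first paragraph</p><p>this is the first paragraph</p>";
$textArray = explode("<p>", $text);
array_shift($textArray); // drop the empty element before the first <p>

for ($i = 0; $i < count($textArray); $i++) {
    // strip_tags removes the trailing </p> left in each piece
    $textArray[$i] = strip_tags($textArray[$i]);
}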

Answer:

If your input is somewhat consistent you can use a simple split method as:

 $paragraphs = preg_split('~(</?p>\s*)+~', $text, -1, PREG_SPLIT_NO_EMPTY);

Here preg_split looks for combinations of <p> and </p> plus optional whitespace and splits the string there.
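One detail worth checking when using this call: in preg_split the third parameter is the limit and the flags come fourth. The sketch below, with a made-up input containing newlines, compares both argument orders:

```php
<?php
// Hypothetical input with whitespace between the paragraphs.
$text = "<p>first paragraph</p>\n<p>second paragraph</p>\n";

// Flags in the fourth position; -1 means "no limit".
$paragraphs = preg_split('~(</?p>\s*)+~', $text, -1, PREG_SPLIT_NO_EMPTY);

// Pitfall: passing the flag as the third argument sets the limit to 1,
// so the whole string comes back as a single element.
$wrong = preg_split('~(</?p>\s*)+~', $text, PREG_SPLIT_NO_EMPTY);

print_r($paragraphs);
echo count($wrong), "\n";
```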

As an unnecessary alternative, you can also use a library such as QueryPath (whose htmlqp() function is used here) to extract only complete paragraph contents:

 foreach (htmlqp($text)->find("p") as $p) { print $p->text(); }

Answer:

Try the following:

<?php
$text = "<p>this is the first paragraph</p><p>this is the first paragraph</p>";

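$array = [];

// Capture the array by reference instead of relying on a global.
preg_replace_callback("`<p>(.+)</p>`isU", function ($matches) use (&$array) {
    $array[] = $matches[1];
    return $matches[0]; // leave the matched text unchanged
}, $text);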

var_dump($array);

?>

This can be modified by putting the array in a class that manages it with an add-value method and a getter.
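If the callback is only there to collect the matches, preg_match_all can gather the captured groups directly. This is a minimal sketch of that alternative, not the original answer's code; the pattern is an assumption that also tolerates attributes on the <p> tag:

```php
<?php
$text = "<p>this is the first paragraph</p><p>this is the first paragraph</p>";

// (.*?) is a lazy capture of everything up to the next </p>;
// the "s" modifier lets "." match newlines, "i" ignores case.
preg_match_all('~<p\b[^>]*>(.*?)</p>~is', $text, $matches);

// $matches[1] holds the contents of the first capture group for every match.
$paragraphs = $matches[1];

print_r($paragraphs);
```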

Answer:

Try this.

<?php
$text = "<p>this is the first paragraph</p><p>this is the first paragraph</p>";
$array = json_decode(json_encode((array) simplexml_load_string('<data>'.$text.'</data>')),1);
print_r($array['p']);
?>
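The same SimpleXML idea can be written without the JSON round trip by iterating over the <p> children. This sketch assumes the input is well-formed XML once wrapped in a root element (the <data> wrapper name is taken from the answer above), and it always yields an array, even for a single paragraph:

```php
<?php
$text = "<p>this is the first paragraph</p><p>this is the first paragraph</p>";

// simplexml_load_string returns false if the markup is not well-formed XML.
$xml = simplexml_load_string('<data>' . $text . '</data>');

$paragraphs = [];
if ($xml !== false) {
    foreach ($xml->p as $p) {
        // Casting an element to string yields its direct text content.
        $paragraphs[] = (string) $p;
    }
}

print_r($paragraphs);
```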