
How to find children of nodes using Beautiful Soup

Posted by: admin November 29, 2017

Questions:

I want to get all the <a> tags that are children of <li>:

<div>
<li class="test">
    <a>link1</a>
    <ul> 
       <li>  
          <a>link2</a> 
       </li>
    </ul>
</li>
</div>

I know how to find an element with a particular class, like this:

soup.find("li", { "class" : "test" }) 

But I don't know how to find all <a> tags which are children of <li class="test"> but not any others.

For example, I want to select only:

<a>link1</a>
Answers:

Try this:

li = soup.find('li', {'class': 'test'})
children = li.findChildren()
for child in children:
    print(child)
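
Note that findChildren() with no filters returns every descendant tag of the li, not just the <a> elements. A minimal, self-contained sketch of narrowing it to direct <a> children, using the markup from the question and Python's built-in html.parser (the variable names are just for illustration):

from bs4 import BeautifulSoup

# the markup from the question
html = """
<div>
<li class="test">
    <a>link1</a>
    <ul>
       <li>
          <a>link2</a>
       </li>
    </ul>
</li>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
li = soup.find("li", {"class": "test"})

# findChildren accepts the same filters as find_all;
# recursive=False limits the search to direct children, so link2 is skipped
print(li.findChildren("a", recursive=False))  # [<a>link1</a>]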

Answers:

There's a short section in the docs that shows how to use find/find_all to get direct children.

http://www.crummy.com/software/BeautifulSoup/bs4/doc/#the-recursive-argument

In your case:

soup.find("li", { "class" : "test" }, recursive=False)
soup.find_all("li", { "class" : "test" }, recursive=False)

Answers:

Try this:

li = soup.find("li", { "class" : "test" })
children = li.find_all("a") # returns a list of all <a> descendants of li

Other reminders:

The find method returns only the first matching element.
The find_all method returns all matching descendant elements in a list.
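
For the question's markup, the difference looks roughly like this (continuing from the snippet above, with soup parsed as in the first sketch):

print(li.find("a"))      # <a>link1</a>  (first match only)
print(li.find_all("a"))  # [<a>link1</a>, <a>link2</a>]  (every descendant match)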

Answers:

Perhaps you want to do:

soup.find("li", { "class" : "test" }).find('a')

Answers:

Yet another method – create a filter function that returns True for all desired tags:

def my_filter(tag):
    # match <a> tags whose direct parent is <li class="test">;
    # .get avoids a KeyError on <li> tags that have no class attribute
    return (tag.name == 'a' and
            tag.parent.name == 'li' and
            'test' in tag.parent.get('class', []))

Then just pass that function to find_all:

for a in soup(my_filter): # or soup.find_all(my_filter)
    print(a)
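
Calling the soup object directly is shorthand for find_all. With the question's markup this filter matches only link1, since the inner <li> has no class attribute; a quick usage sketch (soup as in the first sketch):

print(soup.find_all(my_filter))  # [<a>link1</a>]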