Asked by: 小点点

Parsing <ul> tags with Beautiful Soup


Consider this code:

divTag = soup.find_all("div", {"class":"classname"})
print(divTag)
for tag in divTag:
    ulTag = soup.find_all("ul", {"class":"classname"})
    print(ulTag)
    for tag in ulTag:
        liTag = soup.find_all("li", {"class":"classname"})
        print(liTag)
        for tag in liTag:
            diTag = soup.find_all("div", {"class":"classname"})
            print(diTag)
            for tag in diTag:
                aTags = tag.find_next("a")
                value = aTags.string
                print(value)

It only prints "divTag".

Update:

<div class="classname">
<ul auto-load="true" class="classname" data-href="">
<li class="classname">
<div class="classname"><a href="">"value"</a>  string <a href="">string1</a> <a class="muted"><abbr class="timeago" title=" 1 Jun, 2015, 10:23 am">7 hours ago</abbr></a>
</div>
</li>
<li>
</li>
</ul>
</div>

I basically want to extract the "string" values from the 'a' tags.


2 answers

Anonymous user

A complete solution using next_sibling:

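# soup is assumed to be an existing BeautifulSoup object
ulTag = soup.find("ul", {"class": "classname"})
aTags = ulTag.find_all("a")
for aTag in aTags:
    # next_sibling is the node that directly follows the <a> tag
    sibling = aTag.next_sibling
    if sibling is None:
        continue
    siblingString = str(sibling).strip()
    if len(siblingString) > 0:
        print(siblingString)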
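
To try the next_sibling idea end to end, here is a minimal self-contained sketch. It assumes Python 3, the bs4 package, and the built-in "html.parser"; the HTML is the snippet from the question, and the helper name text_after_links is made up for illustration. It keeps only siblings that are plain text nodes, which is a guard added for this sketch rather than something the question requires.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

# The HTML snippet from the question.
HTML = """<div class="classname">
<ul auto-load="true" class="classname" data-href="">
<li class="classname">
<div class="classname"><a href="">"value"</a>  string <a href="">string1</a> <a class="muted"><abbr class="timeago" title=" 1 Jun, 2015, 10:23 am">7 hours ago</abbr></a>
</div>
</li>
<li>
</li>
</ul>
</div>"""


def text_after_links(html):
    """Return the non-empty text nodes that directly follow an <a> tag.

    Hypothetical helper, not part of Beautiful Soup.
    """
    soup = BeautifulSoup(html, "html.parser")
    ul_tag = soup.find("ul", {"class": "classname"})
    if ul_tag is None:
        return []
    results = []
    for a_tag in ul_tag.find_all("a"):
        sibling = a_tag.next_sibling
        # next_sibling may be None or another Tag; keep plain text only.
        if isinstance(sibling, NavigableString):
            text = str(sibling).strip()
            if text:
                results.append(text)
    return results


for text in text_after_links(HTML):
    print(text)
```

For the question's HTML this should print only the bare text that sits between the links, i.e. string.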

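Anonymous user

Every one of your loops searches the whole soup again instead of the tag from the enclosing loop, so the nesting has no effect. Search for each tag inside its parent tag instead. Try it like this: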

# soup is assumed to be an existing BeautifulSoup object
divTag = soup.find_all("div", {"class":"classname"})
for ulTag in divTag:
    # search inside the current tag, not the whole soup
    for liTag in ulTag.find_all("li", {"class":"classname"}):
        for tag in liTag.find_all("div", {"class":"classname"}):
            for aTag in tag.find_all('a'):
                print(aTag.string)

For the HTML you provided, the output is:

"value"
string1
7 hours ago
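
The same parent-to-child search can also be written as a single CSS selector. This is only a sketch of an alternative, not what the answer above uses: it assumes Python 3, a bs4 version that provides soup.select, and the built-in "html.parser". The helper name link_texts is made up for illustration, and the HTML is the snippet from the question.

```python
from bs4 import BeautifulSoup

# The HTML snippet from the question.
HTML = """<div class="classname">
<ul auto-load="true" class="classname" data-href="">
<li class="classname">
<div class="classname"><a href="">"value"</a>  string <a href="">string1</a> <a class="muted"><abbr class="timeago" title=" 1 Jun, 2015, 10:23 am">7 hours ago</abbr></a>
</div>
</li>
<li>
</li>
</ul>
</div>"""


def link_texts(html):
    """Return the text of every <a> inside div > ul > li > div.

    Hypothetical helper, not part of Beautiful Soup.
    """
    soup = BeautifulSoup(html, "html.parser")
    # ">" matches direct children, mirroring the nested loops;
    # the final " a" matches any <a> below the innermost div.
    selector = "div.classname > ul.classname > li.classname > div.classname a"
    return [a_tag.get_text(strip=True) for a_tag in soup.select(selector)]


for text in link_texts(HTML):
    print(text)
```

The selector keeps the nesting in one place, so there is no chance of accidentally searching the whole soup in an inner step. The trade-off is that a typo in the selector simply yields an empty list.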