Python - Get text with BeautifulSoup without using str.text.strip()
I want to get the text of a tag using BeautifulSoup. When I try my code on my computer (running Mac OS X Yosemite) it works correctly, but when I run the same code on a Linux server (running Ubuntu 10.4) I get this error:
mtemp = div_tag.text.strip()
AttributeError: 'NoneType' object has no attribute 'text'
And the code is:
div_tag = soup.find('div', class_='span12 path_item')
mtemp = div_tag.text.strip()
print mtemp
I need the text of the tag. I don't know why the code doesn't run on the server, so I have to find a way to get the pure text of the tag without using div_tag.text.strip(). If you want to see the content of div_tag (the text I want and the HTML code), div_tag itself is here:
صفحه اصلی مکانها گردشگری میراث فرهنگی کاخ موزه گلستان
<div class="span12 path_item">
<a href="/" style="margin-right: 5px;"><i class="fa fa-arrow-left"></i> صفحه اصلی</a>
<a href="/list/show-places" id="placeholderdivmaincontent_maincontent_maincontent_hamgardisiteview_navigationbar_asites" style="cursor:pointer"><i class="fa fa-angle-left"></i>مکانها</a>
<a href="/list/show-places/category-tourism" id="placeholderdivmaincontent_maincontent_maincontent_hamgardisiteview_navigationbar_acategory" style="cursor:pointer"><i class="fa fa-angle-left"></i>گردشگری</a>
<a href="/list/show-places/category-tourism/subcategory-59" id="placeholderdivmaincontent_maincontent_maincontent_hamgardisiteview_navigationbar_asubcategory" style="cursor:pointer"><i class="fa fa-angle-left"></i>میراث فرهنگی</a>
<a id="placeholderdivmaincontent_maincontent_maincontent_hamgardisiteview_navigationbar_title"><i class="fa fa-angle-left"></i>کاخ موزه گلستان</a>
</div>
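Firstly, your selector does not match the class_ attribute you have specified, since there are 2 classes assigned to the div. When nothing matches, find() returns None, which is why calling .text on the result raises the AttributeError.
To make BeautifulSoup match more than 1 class, you should use a CSS selector.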
This code works, though I don't like it much, and it can be improved if something better comes to mind:
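from bs4 import BeautifulSoup as bs

# s = your HTML string
soup = bs(s, 'html.parser')
# the CSS selector matches a div that carries both classes
d = soup.select('div.span12.path_item')
# re-parse the first matching div and walk its links
e = bs(str(d[0]), 'html.parser')
for x in e.find_all('a'):
    print(x.text.strip())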
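As a possible refinement, here is a minimal sketch that skips the second parse and guards against the case where nothing matches, which is what produced the 'NoneType' error in the question. The `html` string is a shortened, hypothetical stand-in for the real page, and `breadcrumb_texts` is a helper name made up for this illustration; `select()`, `find_all()` and `get_text(strip=True)` are regular bs4 calls.

```python
from bs4 import BeautifulSoup

# Hypothetical, shortened stand-in for the page's HTML (not the real markup).
html = '''
<div class="span12 path_item">
  <a href="/"><i class="fa fa-arrow-left"></i> Home</a>
  <a href="/list/show-places"><i class="fa fa-angle-left"></i>Places</a>
</div>
'''


def breadcrumb_texts(markup):
    """Return the stripped text of each link in the div, or [] if the div is absent."""
    soup = BeautifulSoup(markup, "html.parser")
    # The CSS selector requires both classes, regardless of their order.
    matches = soup.select("div.span12.path_item")
    if not matches:
        # No matching div: return an empty list instead of failing on None.
        return []
    # Search inside the matched tag directly; no need to re-parse it.
    return [a.get_text(strip=True) for a in matches[0].find_all("a")]


print(breadcrumb_texts(html))
```

Returning an empty list when the div is missing lets the caller decide how to handle a page with a different layout, rather than crashing with an AttributeError.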