BeautifulSoup "AttributeError: 'NoneType' object has no attribute 'split'": Causes and Fixes

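BeautifulSoup is a third-party Python library for extracting data from HTML or XML files. A common stumbling block when using it is the error "AttributeError: 'NoneType' object has no attribute 'split'". It appears when a lookup or attribute access returns None and the code then calls split() on that None value. This article walks through the cause of the error and the ways to fix it.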

1. Cause of the error

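When extracting data with BeautifulSoup, find() returns None if no matching HTML element exists, and an attribute such as .string can also be None even when the tag itself was found. Calling split() on such a None value raises "AttributeError: 'NoneType' object has no attribute 'split'".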

Example code:

from bs4 import BeautifulSoup

html_doc = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>

<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>

<p class="story">...</p>
"""

soup = BeautifulSoup(html_doc, 'html.parser')
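# No element has class "title1", so find() returns None.
element = soup.find(class_='title1')
# Raises AttributeError, because element is None.
print(element.string.split())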

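In the code above, we want the text of the element whose class is "title1", but html_doc contains no such element, so find() returns None. Strictly speaking, this snippet already fails one step earlier, at element.string, with "AttributeError: 'NoneType' object has no attribute 'string'". The 'split' variant of the message appears when the tag is found but an intermediate value such as .string is None, for example because the tag has more than one child. In both cases the root cause is the same: a method or attribute is used on None.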
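To reproduce the exact message with 'split', here is a minimal sketch in which the tag is found but its .string is None. It relies on the documented behavior that .string is None when a tag has more than one child; the shortened HTML and the variable names are illustrative assumptions, not part of the original example.

```python
from bs4 import BeautifulSoup

# Shortened, illustrative HTML: the <p> has several children (text plus <a> tags).
html_doc = """
<p class="story">Once upon a time there were three little sisters:
<a class="sister" id="link1">Elsie</a> and
<a class="sister" id="link2">Lacie</a>.</p>
"""

soup = BeautifulSoup(html_doc, 'html.parser')
element = soup.find(class_='story')

# The tag exists, but .string is None because the tag has more than one child.
print(element is None)   # False
print(element.string)    # None

error_message = ""
try:
    element.string.split()
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)

# get_text() joins the text of all descendants and returns a str,
# so split() can be called on it safely.
words = element.get_text().split()
print(words[:4])
```

If the goal is simply "all the text inside this tag", get_text() is usually the safer choice than .string, since it does not depend on how many children the tag has.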

2. Solutions

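The "AttributeError: 'NoneType' object has no attribute 'split'" error means the program did not get the HTML element (or the text) it expected. The fix is therefore to confirm that the lookup actually succeeded before using its result.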

2.1 Check whether the HTML element was found

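When extracting data with BeautifulSoup, we can look up HTML elements with find() or find_all(). If nothing matches, find() returns None, while find_all() returns an empty list rather than None. We therefore need to check the result of find() for None before operating on it; the result of find_all() can be iterated over safely, but indexing it (for example with [0]) fails when it is empty.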

Example code:

from bs4 import BeautifulSoup

html_doc = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>

<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>

<p class="story">...</p>
"""

soup = BeautifulSoup(html_doc, 'html.parser')
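element = soup.find(class_='title')
# find() returns None when nothing matches, so test for None explicitly.
if element is not None:
    print(element.string.split())
else:
    print("No matching HTML element found")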

In the code above, an if statement checks whether the returned element is None. If it is not None, the text is processed; otherwise a message is printed.
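The check above can be folded into a small helper that also guards against .string being None. The sketch below is only one possible way to do it: words_of is a hypothetical name, the one-line HTML is a shortened stand-in for html_doc, and returning an empty list for "nothing found" is an assumption about what the caller wants.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for html_doc, for illustration only.
html_doc = '<p class="title"><b>The Dormouse\'s story</b></p>'
soup = BeautifulSoup(html_doc, 'html.parser')

# find() and find_all() signal "no match" differently.
missing_one = soup.find(class_='title1')       # None
missing_all = soup.find_all(class_='title1')   # empty list-like result


def words_of(soup, class_name):
    """Hypothetical helper: words of the first matching element, or []."""
    element = soup.find(class_=class_name)
    # Guard both failure points: no element, or an element without a single string.
    if element is None or element.string is None:
        return []
    return element.string.split()


print(words_of(soup, 'title'))
print(words_of(soup, 'title1'))
```

Returning an empty list keeps the calling code free of None checks; a caller that needs to tell "missing" apart from "empty" would want a different return value.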

2.2 Use a try...except statement

Besides checking for the element with an if statement, we can also catch the exception with a try...except statement.

Example code:

from bs4 import BeautifulSoup

html_doc = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>

<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>

<p class="story">...</p>
"""

soup = BeautifulSoup(html_doc, 'html.parser')
element = soup.find(class_='title1')
try:
    print(element.string.split())
except AttributeError:
    print("No matching HTML element found")

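In the code above, a try...except statement catches the AttributeError. If the exception is raised, the program prints a message saying that no matching HTML element was found instead of crashing.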
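One caveat with this approach is that a broad try block can hide unrelated AttributeErrors. A minimal sketch of keeping the try block narrow is shown below; split_text_or_empty is a hypothetical helper name, the one-line HTML is a shortened stand-in for html_doc, and the empty-list fallback is an assumption about the desired behavior.

```python
from bs4 import BeautifulSoup


def split_text_or_empty(soup, class_name):
    """Hypothetical helper: split the element's text, or return [] on failure."""
    try:
        # Only the risky expression sits inside try. It fails with AttributeError
        # if find() returns None or if .string is None.
        words = soup.find(class_=class_name).string.split()
    except AttributeError:
        return []
    return words


# Shortened stand-in for html_doc, for illustration only.
soup = BeautifulSoup('<p class="title"><b>The Dormouse\'s story</b></p>', 'html.parser')

found = split_text_or_empty(soup, 'title')
missing = split_text_or_empty(soup, 'title1')
print(found)
print(missing)
```

Because the except clause covers both failure points, this version handles the 'string' and the 'split' variants of the message alike.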

3. Summary

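The "AttributeError: 'NoneType' object has no attribute 'split'" error occurs when a BeautifulSoup lookup yields None and split() is then called on that None value. The fix is to make sure the element, and the value read from it, actually exists before using it: either test for None with an if statement, or catch the exception with a try...except statement.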
