BeautifulSoup "ValueError: invalid literal for int() with base 10: ''" error: causes and fixes


Post published: April 15, 2023
Post category: Python

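When parsing HTML with Python's BeautifulSoup library, you may run into the error "ValueError: invalid literal for int() with base 10: ''". The exception is usually raised right after a find() or find_all() lookup, when the text or attribute value taken from the matched tag is passed to int(). It is int() that raises, not BeautifulSoup: the value is an empty string (or otherwise not a valid integer), so it cannot be converted.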
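The failure can be reproduced without any HTML at all. The snippet below is a minimal sketch that assumes the extracted text is an empty string, which is what .text returns for an empty element such as `<div id="test2"></div>`; the variable names are illustrative only.

```python
# Stand-in for the text extracted from an empty element.
text = ""

try:
    int(text)
except ValueError as exc:
    # Keep the message so it can be inspected or logged.
    message = str(exc)

print(message)  # invalid literal for int() with base 10: ''
```

The two quote characters at the end of the message show the offending value, which here is empty.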

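There are several ways to solve this problem:

1. Fix the values in the HTML document so that they are well-formed integers.

2. Clean the value before converting it, for example by filtering out non-digit characters.

3. Catch the exception with a try-except statement so the program does not crash.

Each of the three approaches is described in more detail below: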

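1. Fix the values in the HTML document

Because the root cause is a malformed value, the most direct fix is to correct it at the source, if you control the HTML. Make sure every element or attribute that you later pass to int() contains a valid integer literal such as "100", rather than an empty string or non-numeric text.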

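2. Clean the value before converting it

When parsing the document, you can clean each value before converting it, for example by using a regular expression or string methods to strip out non-digit characters. This keeps malformed values from propagating into the rest of the program. Note that the cleaned string can still be empty, so check for that before calling int().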
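One way to sketch this filtering step is shown below. The helper name `digits_to_int` and its `default` parameter are hypothetical, introduced only for illustration; the sketch assumes that dropping every non-digit character is acceptable for your data.

```python
import re


def digits_to_int(text, default=None):
    """Strip non-digit characters, then convert; return default if nothing is left."""
    # text may be None or empty, so fall back to an empty string first.
    cleaned = re.sub(r"\D", "", text or "")
    return int(cleaned) if cleaned else default


print(digits_to_int("100"))     # 100
print(digits_to_int("¥1,200"))  # 1200
print(digits_to_int(""))        # None
print(digits_to_int("n/a"))     # None
```

Be aware that this approach also discards minus signs and decimal points, so "-5" becomes 5 and "3.14" becomes 314. If such values can occur, a stricter pattern or explicit validation is a safer choice.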

3. Catch the exception with a try-except statement

If you cannot change the HTML at its source, catch the exception with a try-except statement so the program does not crash:

from bs4 import BeautifulSoup

html_doc = """
<html>
<head>
<title>Test HTML</title>
</head>
<body>
<div id="test">100</div>
<div id="test2"></div>
</body>
</html>
"""

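# Parse the document with Python's built-in HTML parser
soup = BeautifulSoup(html_doc, 'html.parser')
try:
    # The text of the "test" div is "100"
    test = soup.find('div', {'id': 'test'}).text
    # The "test2" div is empty, so .text is '' and int('') raises ValueError
    test2 = int(soup.find('div', {'id': 'test2'}).text)
    print(test)
    print(test2)
except ValueError:
    # Reached whenever the text is not a valid integer
    print("Invalid attribute value")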

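In the code above, the try-except statement keeps the exception from terminating the program. The text of the test2 div is an empty string, so int() raises ValueError and the program prints "Invalid attribute value". Neither print() call inside the try block runs, because the exception is raised before they are reached. If test2 contained a valid integer, both values would be printed instead.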
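When many values need converting, the same try-except logic can be wrapped in a small helper so that one bad value does not skip the others. The following is only a sketch: `text_to_int` and its `default` parameter are hypothetical names, and treating a missing tag the same as an empty one is an assumption that may not suit every program.

```python
from bs4 import BeautifulSoup


def text_to_int(tag, default=None):
    """Convert a tag's text to int; return default if the tag is missing or not an integer."""
    if tag is None:
        # find() returns None when nothing matches.
        return default
    try:
        return int(tag.get_text(strip=True))
    except ValueError:
        return default


html_doc = '<div id="test">100</div><div id="test2"></div>'
soup = BeautifulSoup(html_doc, "html.parser")

# "missing" does not exist in the document, so find() returns None for it.
values = {
    name: text_to_int(soup.find("div", id=name), default=0)
    for name in ("test", "test2", "missing")
}
print(values)  # {'test': 100, 'test2': 0, 'missing': 0}
```

Returning a default keeps the calling code short, but it also hides bad data. If you need to tell a real 0 apart from a failed conversion, leave the default as None and check for it.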

Tags: python-beautifulsoup
