Python 工匠 Reading Notes 5: Exceptions and Error Handling


Basics

Prefer catching exceptions

A simple function

Write a simple function that takes an integer argument and returns the result of adding 1 to it.

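# LBYL style: check the input before converting it
def incr_by_one(value):
    """Add 1 to the input integer and return the new value.

    :param value: an int, or a string that can be converted to an int
    :return: the integer result
    """
    if isinstance(value, int):
        return value + 1
    elif isinstance(value, str) and value.isdigit():
        return int(value) + 1
    else:
        print(f'Unable to perform incr for value: "{value}"')


# EAFP style: attempt the conversion and catch the failure
def incr_by_one(value):
    """Add 1 to the input integer and return the new value.

    :param value: an int, or a string that can be converted to an int
    :return: the integer result
    """
    try:
        return int(value) + 1
    except (TypeError, ValueError) as e:
        print(f'Unable to perform incr for value: "{value}", error: {e}')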

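Two programming styles

LBYL (look before you leap): check first, then act
EAFP (easier to ask for forgiveness than permission): act first, then handle the failure

Summary

The Python community leans toward the exception-based EAFP style.
The code is more concise: the developer does not need branches that cover every possible failure, only handlers for the exceptions that may occur.
EAFP code often performs better. For an input such as '73', the LBYL version runs extra isinstance and isdigit checks on every call, whereas the EAFP version converts the value directly and returns the result.
Raising and catching exceptions in Python is relatively lightweight.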
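
The point about branches not covering every case can be made concrete. The sketch below is not from the book: it uses two hypothetical variants of the function above, incr_lbyl and incr_eafp, which return None on failure instead of printing, and feeds them inputs where the isdigit() check and int() disagree.

```python
def incr_lbyl(value):
    """LBYL: check the input before converting; return None if a check fails."""
    if isinstance(value, int):
        return value + 1
    if isinstance(value, str) and value.isdigit():
        return int(value) + 1
    return None


def incr_eafp(value):
    """EAFP: attempt the conversion and handle the failure."""
    try:
        return int(value) + 1
    except (TypeError, ValueError):
        return None


# isdigit() is False for a leading minus sign, so LBYL rejects a valid input.
print(incr_lbyl('-5'), incr_eafp('-5'))    # None -4
# int() tolerates surrounding whitespace; isdigit() does not.
print(incr_lbyl(' 7 '), incr_eafp(' 7 '))  # None 8
# isdigit() accepts a superscript digit that int() cannot parse, so the
# LBYL check passes and the conversion still raises.
try:
    incr_lbyl('²')
except ValueError as e:
    print(f'LBYL check passed but int() failed: {e}')
print(incr_eafp('²'))                      # None
```

In each case the hand-written check is either too strict or too loose, while the EAFP version simply defers to what int() accepts.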

Common knowledge about the try statement

def safe_int(value):
    """Try to convert the input to an integer."""
    try:
        return int(value)
    except TypeError:
        # When an exception of this type is raised, the statements under
        # the matching except clause run
        print(f'type error: {type(value)} is invalid')
    except ValueError:
        # A single try block can have multiple except clauses
        print(f'value error: {value} is invalid')
    finally:
        # Statements in finally always run, even after a return has executed
        print('function completed')
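
To see the order of execution more directly, here is a small sketch that is not from the book. It is a variant of safe_int that records each step in a module-level list named events (a name introduced only for this illustration) instead of printing.

```python
events = []


def safe_int(value):
    """Return int(value), or None when the conversion fails."""
    try:
        result = int(value)
        events.append('try')
        return result
    except (TypeError, ValueError):
        events.append('except')
        return None
    finally:
        # Runs on both paths, after the return value has been computed.
        events.append('finally')


print(safe_int('42'), events)   # 42 ['try', 'finally']
events.clear()
print(safe_int('abc'), events)  # None ['except', 'finally']
```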

Put more specific except clauses first

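# Wrong order: Exception is a base class of KeyError, so the first clause
# catches everything and the KeyError clause is never reached
def incr_by_key(d, key):
    try:
        d[key] += 1
    except Exception as e:
        print(f'Unknown error: {e}')
    except KeyError:
        print(f'key {key} does not exist')


# Correct order: the more specific KeyError clause comes first
def incr_by_key(d, key):
    try:
        d[key] += 1
    except KeyError:
        print(f'key {key} does not exist')
    except Exception as e:
        print(f'Unknown error: {e}')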
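
The ordering rule follows from how except clauses are matched: top to bottom, and a clause matches any subclass of the class it names. A minimal sketch, with a hypothetical helper named classify that reports which clause handled the failure:

```python
def classify(func):
    """Call func() and report which except clause handled the failure."""
    try:
        func()
    except KeyError:
        return 'KeyError'
    except LookupError:
        return 'LookupError'
    except Exception:
        return 'Exception'
    return 'ok'


print(issubclass(KeyError, LookupError))  # True
print(classify(lambda: {}['missing']))    # KeyError
print(classify(lambda: [][0]))            # LookupError (IndexError is a LookupError)
print(classify(lambda: 1 / 0))            # Exception
```

If the LookupError clause were listed before the KeyError clause, the first call would report LookupError instead.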

Use the else branch

When catching exceptions with try, you sometimes need to do something only when everything went well.

# Sync the user profile to an external system; send a notification only when the sync succeeds
sync_succeeded = False
try:
    sync_profile(user.profile, to_external=True)
    sync_succeeded = True
except Exception as e:
    print("Error while syncing user profile")
if sync_succeeded:
    send_notification(user, 'profile sync succeeded')


try:
    sync_profile(user.profile, to_external=True)
except Exception as e:
    print("Error while syncing user profile")
else:
    send_notification(user, 'profile sync succeeded')

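In exception handling, the else branch runs only when the try block raises no exception. Note that if the try block leaves through a jump statement such as return or break, the else branch does not run even though no exception was raised.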
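
The caveat about jump statements is easy to miss, so here is a short sketch. The two functions, load and load_with_else, are hypothetical; each records which branches ran in the calls list passed in.

```python
def load(value, calls):
    try:
        return int(value)      # returning here skips the else branch
    except ValueError:
        calls.append('except')
    else:
        calls.append('else')   # never reached in this function


def load_with_else(value, calls):
    try:
        number = int(value)
    except ValueError:
        calls.append('except')
        return None
    else:
        calls.append('else')   # runs because try finished without a jump
        return number


calls = []
print(load('1', calls), calls)            # 1 []
calls = []
print(load_with_else('1', calls), calls)  # 1 ['else']
```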

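Use a bare raise statement

Scenario: when handling an exception, we sometimes only want to record it and then re-raise it so that an upper layer can handle it.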

def incr_by_key(d, key):
    try:
        d[key] += 1
    except KeyError:
        print(f'key {key} does not exist, re-raise the exception')
        # A bare raise re-raises the exception currently being handled
        raise
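
A caller further up sees the original exception unchanged. The sketch below is a variant of the function above that appends to a log list (a parameter added only for this illustration) instead of printing, so the effect can be checked.

```python
def incr_by_key(d, key, log):
    try:
        d[key] += 1
    except KeyError:
        log.append(f'key {key!r} does not exist, re-raising')
        raise  # re-raises the KeyError being handled, traceback intact


log = []
try:
    incr_by_key({'a': 1}, 'b', log)
except KeyError as e:
    print(type(e).__name__, e.args)  # KeyError ('b',)
print(log)  # ["key 'b' does not exist, re-raising"]
```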

Raise exceptions instead of returning errors

def create_item(name):
    """Take a name and create an Item object.

    :return: (object, error message); the error message is '' on success
    """
    if len(name) > MAX_LENGTH_OF_NAME:
        return None, 'name of item is too long'
    if len(get_current_items()) > MAX_ITEMS_QUOTA:
        return None, 'items is full'
    return Item(name=name), ''


def create_from_input():
    name = input()
    item, err_msg = create_item(name)
    if err_msg:
        print(f'create item failed: {err_msg}')
    else:
        print(f'item<{name}> created')


class CreateItemError(Exception):
    """Failed to create an Item."""


def create_item(name):
    """Create a new Item.

    :raises: CreateItemError when the item cannot be created
    """
    if len(name) > MAX_LENGTH_OF_NAME:
        raise CreateItemError('name of item is too long')
    if len(get_current_items()) > MAX_ITEMS_QUOTA:
        raise CreateItemError('items is full')
    return Item(name=name)


def create_from_input():
    name = input()
    try:
        item = create_item(name)
    except CreateItemError as e:
        print(f'create item failed: {e}')
    else:
        print(f'item<{name}> created')

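Advantages:

Returning errors is not the best way to solve this kind of problem, because it raises the cost of error handling for callers, especially when many functions follow the convention and the call chain is many layers deep. Python has a mature exception mechanism.
The new function has a more stable return type: it always either returns an Item or raises an exception.
An exception keeps propagating up the call stack until it is caught. The direct caller of create_item() can therefore leave CreateItemError unhandled and let a higher layer deal with it. This gives us more flexibility, but also more risk: if the program lacks a unified top-level exception handler, an exception that everyone overlooked may propagate all the way up and bring down the whole program.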
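
The notes do not show what a unified top-level handler looks like. One possible shape, as a rough sketch only: run() stands in for the real program and main() is a hypothetical entry point. Neither name comes from the book.

```python
import logging

logger = logging.getLogger(__name__)


class CreateItemError(Exception):
    """Hypothetical domain error, mirroring the example above."""


def run():
    """Stand-in for the real program; here it always fails."""
    raise CreateItemError('items is full')


def main():
    """Top-level entry point: the last place an exception can be caught."""
    try:
        run()
    except Exception:
        # Log the full traceback, then report failure through the return
        # value instead of letting the exception take the process down.
        logger.exception('unhandled error')
        return 1
    return 0


exit_code = main()
print(exit_code)  # 1
```

A broad except Exception is acceptable here only because it sits at the very top and logs the traceback; lower layers should still catch precise types.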

Use context managers

The with statement is also closely related to exception handling.

# Open the file using with; the file descriptor is released automatically when the block ends
with open('foo.txt') as fp:
    content = fp.read()

To create a context manager, you only need to implement the two magic methods __enter__ and __exit__.
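
As a minimal sketch of the protocol, the hypothetical Tracker class below records when each method is called and what __exit__ receives, both for a normal exit and for an exit caused by an exception.

```python
class Tracker:
    """Record what the with statement passes to __enter__ and __exit__."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append('enter')
        return self  # bound to the name after "as"

    def __exit__(self, exc_type, exc_value, traceback):
        # exc_type is None when the block finished without an exception.
        name = exc_type.__name__ if exc_type else None
        self.events.append(f'exit({name})')
        return False  # a false value lets any exception propagate


with Tracker() as t:
    pass
print(t.events)  # ['enter', 'exit(None)']

t2 = Tracker()
try:
    with t2:
        raise ValueError('boom')
except ValueError:
    pass
print(t2.events)  # ['enter', 'exit(ValueError)']
```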

Replace finally for resource cleanup

conn = create_conn(host, port, timeout=None)
try:
    conn.send_text('Hello, world!')
except Exception as e:
    print(f'Unable to use connection: {e}')
finally:
    conn.close()


class create_conn_obj:
    """Create a connection object and close it automatically on exit."""

    def __init__(self, host, port, timeout=None):
        self.conn = create_conn(host, port, timeout=timeout)

    def __enter__(self):
        return self.conn

    def __exit__(self, exc_type, exc_value, traceback):
        # __exit__ is called when the with block exits
        self.conn.close()
        return False


# Create the connection with the context manager
with create_conn_obj(host, port, timeout=None) as conn:
    try:
        conn.send_text('Hello, world!')
    except Exception as e:
        print(f'Unable to use connection: {e}')

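Ignoring exceptions

Scenario: sometimes a program raises exceptions that do not affect its normal logic. For example, when you close a connection that is already closed, an AlreadyClosedError is raised. To let the program keep running, you have to catch and ignore this exception with a try statement.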

try:
    close_conn(conn)
except AlreadyClosedError:
    pass

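This code is simple, but it cannot be reused. When many places in a project need to ignore this kind of exception, the try/except statements end up scattered everywhere and look messy.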

class ignore_closed:
    """Ignore connections that are already closed."""

    def __enter__(self):
        pass

    def __exit__(self, exc_type, exc_value, traceback):
        # Returning True tells the with statement to suppress the exception
        if exc_type == AlreadyClosedError:
            return True
        return False


with ignore_closed():
    close_conn(conn)
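
The standard library already ships a general form of this manager, contextlib.suppress. In the sketch below, AlreadyClosedError and close_conn are hypothetical stand-ins defined only to make the example runnable, and the connection is modelled as a plain dict.

```python
from contextlib import suppress


class AlreadyClosedError(Exception):
    """Hypothetical error raised when closing a closed connection."""


def close_conn(conn):
    """Hypothetical close function; conn is a dict with a 'closed' flag."""
    if conn['closed']:
        raise AlreadyClosedError('connection already closed')
    conn['closed'] = True


conn = {'closed': False}
close_conn(conn)
with suppress(AlreadyClosedError):
    close_conn(conn)  # raises, but the exception is swallowed
print(conn['closed'])  # True
```

Unlike the == comparison in ignore_closed, suppress also swallows subclasses of the listed exception types.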

Use the contextmanager decorator

Purpose: simplify the definition of a manager object that conforms to the context manager protocol.

from contextlib import contextmanager


@contextmanager
def create_conn_obj(host, port, timeout=None):
    """Create a connection object and close it automatically on exit."""
    conn = create_conn(host, port, timeout=timeout)
    try:
        yield conn
    finally:
        conn.close()
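
Since create_conn is not defined in these notes, here is a self-contained sketch of the same idea. FakeConn and open_conn are hypothetical names; the example checks that the cleanup after yield runs both on a normal exit and when the with block raises.

```python
from contextlib import contextmanager


class FakeConn:
    """Hypothetical stand-in for a real connection object."""

    def __init__(self):
        self.closed = False

    def send_text(self, text):
        if not text:
            raise ValueError('empty message')

    def close(self):
        self.closed = True


@contextmanager
def open_conn():
    conn = FakeConn()      # code before yield plays the role of __enter__
    try:
        yield conn         # the with block body runs here
    finally:
        conn.close()       # code after yield plays the role of __exit__


with open_conn() as conn:
    conn.send_text('Hello, world!')
print(conn.closed)  # True

try:
    with open_conn() as failing:
        failing.send_text('')
except ValueError:
    pass
print(failing.closed)  # True
```

The try/finally around yield matters: an exception raised in the with block is re-raised at the yield, and without finally the close() call would be skipped.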

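Case study

Crashing early is fine

Catch precise exceptions, not a vague Exception

Consistency between exceptions and abstraction levels

A small demo

A Web API project defines some common exceptions.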

raise error_codes.UNABLE_TO_UPVOTE
raise error_codes.USER_HAS_BEEN_BANNED
...

{PROJECT}/util/image/processor.py

def process_image(...):
    try:
        image = Image.open(fp)
    except Exception:
        raise error_codes.INVALID_IMAGE_UPLOADED
    ...

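process_image() tries to open a file object. If the file is not a valid image format, it raises error_codes.INVALID_IMAGE_UPLOADED. The exception is caught by a Django middleware, which finally returns an "INVALID_IMAGE_UPLOADED" (the uploaded image format is invalid) error-code response to the user.

The problem:

When process_image was first written, its only caller was the view handling the POST request for user-uploaded images, so, to save effort, it raised an API exception directly.
Later, a background batch image-processing script was needed, and this function looked like a perfect candidate for reuse. That is where the problems appear:
1. The script must import the API exception classes just to catch the exception.
2. It must catch the INVALID_IMAGE_UPLOADED exception, even though the images were not uploaded by a user at all.

Avoid raising exceptions above the current module's abstraction level

The APIErrorCode exception class exists to express an "error code" that can be directly recognized and consumed by the end user (a person). It is one of the highest-level abstractions in the whole project.
For convenience, it was raised inside a low-level image processing module. This broke the abstraction consistency of process_image() and made it impossible for me to reuse the function in a background script.

Best practice:

Let a module raise only exceptions that match its own abstraction level.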

Revised demo:

# {PROJECT}/util/image/processor.py
class ImageOpenError(Exception):
    """Raised when an image cannot be opened.

    :param exc: the original exception
    """

    def __init__(self, exc):
        self.exc = exc
        # Call the parent class to initialize the error message
        super().__init__(f'Image open error: {self.exc}')


def process_image(...):
    try:
        image = Image.open(fp)
    except Exception as e:
        raise ImageOpenError(exc=e)
    ... ...


# {PROJECT}/app/views.py
def foo_view_function(request):
    try:
        process_image(fp)
    except ImageOpenError:
        raise error_codes.INVALID_IMAGE_UPLOADED
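
The demo above stores the original exception on a custom attribute. Python's built-in exception chaining (raise ... from ...) is another way to keep it, on the standard __cause__ attribute. The sketch below is an assumption-laden illustration, not the book's code: decode_header is a hypothetical low-level helper that replaces Image.open so the example runs without an imaging library.

```python
class ImageOpenError(Exception):
    """Module-level error for images that cannot be opened."""


def decode_header(data):
    """Hypothetical low-level helper; rejects data without a PNG signature."""
    if not data.startswith(b'\x89PNG'):
        raise ValueError('missing PNG signature')
    return data


def process_image(data):
    try:
        return decode_header(data)
    except ValueError as e:
        # "from e" stores the original exception on __cause__.
        raise ImageOpenError(f'Image open error: {e}') from e


try:
    process_image(b'not an image')
except ImageOpenError as err:
    print(err)                           # Image open error: missing PNG signature
    print(type(err.__cause__).__name__)  # ValueError
```

Callers only need to know about ImageOpenError, while the traceback still shows the low-level cause for debugging.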

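Wrap exceptions below the current module's abstraction level

Besides avoiding exceptions above the current abstraction level, we should also avoid leaking exceptions below it.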

>>> try:
...     requests.get('https://www.invalid-host-foo.com')
... except Exception as e:
...     print(type(e))
...
<class 'requests.exceptions.ConnectionError'>

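The urllib3 module is a low-level implementation detail that requests depends on, and that detail may change in the future. As the output above shows, requests raises its own requests.exceptions.ConnectionError rather than leaking a urllib3 exception. If requests ever changes its underlying implementation, these wrapped exception classes keep users' error handling logic from being affected.

Programming advice

Do not casually ignore exceptions

When an exception occurs, there are three reasonable ways to respond:

Catch the exception in except, handle it, and continue with the code that follows
Catch the exception in except, send a notification, and stop execution
Do not catch the exception; let it keep propagating upward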

try:
    send_sms_notification(user, message)
except RequestError:
    pass

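The usual justification for swallowing an exception like this is: "This SMS notification is not important at all; it does not matter if it fails." Even so, recording the exception in a log is always better: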

try:
    send_sms_notification(user, message)
except RequestError:
    logger.warning('RequestError while sending SMS notification to %s', user.username)
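
If the traceback is also worth keeping, the standard logging module accepts exc_info=True on any logging call. The sketch below is self-contained under stated assumptions: RequestError and send_sms_notification are hypothetical stand-ins, and the wrapper function notify is introduced only for illustration.

```python
import logging

logger = logging.getLogger('sms_demo')


class RequestError(Exception):
    """Hypothetical error raised by the SMS gateway client."""


def send_sms_notification(username, message):
    """Hypothetical sender that always fails, to exercise the handler."""
    raise RequestError('gateway timeout')


def notify(username, message):
    """Send a notification; log and report failure instead of hiding it."""
    try:
        send_sms_notification(username, message)
    except RequestError:
        # exc_info=True attaches the current traceback to the log record.
        logger.warning(
            'RequestError while sending SMS notification to %s',
            username,
            exc_info=True,
        )
        return False
    return True


print(notify('piglei', 'hello'))  # False
```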

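Do not validate data by hand

Do not write validation checks by hand; use well-established libraries and move data validation into dedicated logic.
Examples include pydantic, Django serializer, WTForms, and so on.

Raise distinguishable exceptions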

def create_from_input():
    name = input()
    try:
        item = create_item(name)
    except CreateItemError as e:
        print(f'create item failed: {e}')
    else:
        print(f'item<{name}> created')


# If the caller wants special handling for the "items is full" error,
# such as clearing all items, the code above has to become the following

def create_from_input():
    name = input()
    try:
        item = create_item(name)
    except CreateItemError as e:
        # If the container is full, clear all items
        if str(e) == 'items is full':
            clear_all_items()

        print(f'create item failed: {e}')
    else:
        print(f'item<{name}> created')

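Matching on the error message string is fragile. Instead, use inheritance between exceptions to design more precise exception subclasses: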

class CreateItemError(Exception):
    """Failed to create an Item."""


class CreateErrorItemsFull(CreateItemError):
    """The current Item container is full."""


def create_item(name):
    if len(name) > MAX_LENGTH_OF_NAME:
        raise CreateItemError('name of item is too long')
    if len(get_current_items()) > MAX_ITEMS_QUOTA:
        raise CreateErrorItemsFull('items is full')
    return Item(name=name)


def create_from_input():
    name = input()
    try:
        item = create_item(name)
    except CreateErrorItemsFull as e:
        clear_all_items()
        print(f'create item failed: {e}')
    except CreateItemError as e:
        print(f'create item failed: {e}')
    else:
        print(f'item<{name}> created')

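Besides designing more precise exception subclasses, you can also create exception classes that carry extra attributes, such as a CreateItemError class with an error code (error_code):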

class CreateItemError(Exception):
    """Failed to create an Item.

    :param error_code: the error code
    :param message: the error message
    """

    def __init__(self, error_code, message):
        self.error_code = error_code
        self.message = message
        super().__init__(f'{self.error_code} - {self.message}')


# Specify the error_code when raising the exception
raise CreateItemError('name_too_long', 'name of item is too long')
raise CreateItemError('items_full', 'items is full')
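
On the calling side, the handler can then branch on error_code rather than on the message text. A self-contained sketch follows; the create_item signature with explicit limits and the try_create wrapper are assumptions made for this illustration, not the book's code.

```python
class CreateItemError(Exception):
    def __init__(self, error_code, message):
        self.error_code = error_code
        self.message = message
        super().__init__(f'{self.error_code} - {self.message}')


def create_item(name, current_count, max_items=3, max_length=8):
    """Hypothetical variant taking limits as parameters to stay self-contained."""
    if len(name) > max_length:
        raise CreateItemError('name_too_long', 'name of item is too long')
    if current_count >= max_items:
        raise CreateItemError('items_full', 'items is full')
    return name


def try_create(name, current_count):
    try:
        create_item(name, current_count)
    except CreateItemError as e:
        # Branch on the structured attribute, not on the message text.
        if e.error_code == 'items_full':
            return 'cleared items'
        return f'failed: {e}'
    return 'created'


print(try_create('apple', 0))             # created
print(try_create('apple', 3))             # cleared items
print(try_create('a-very-long-name', 0))  # failed: name_too_long - name of item is too long
```

The message can now be reworded freely without breaking any caller.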

Do not use assert to validate arguments

def print_string(s):
    assert isinstance(s, str), 's must be string'
    print(s)

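assert is a keyword intended for developers debugging a program. The assertion checks it provides can be skipped entirely by running Python with the -O option.

Do not use assert for argument validation; use a raise statement instead.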
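
One way the print_string example could be rewritten with raise, as a minimal sketch. The choice of TypeError and the message wording are assumptions.

```python
def print_string(s):
    if not isinstance(s, str):
        # Unlike assert, this check still runs under "python -O".
        raise TypeError(f's must be a string, got {type(s).__name__}')
    print(s)


print_string('hello')
try:
    print_string(42)
except TypeError as e:
    print(e)  # s must be a string, got int
```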

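The best error handling is no error handling

A short story

The Tcl language has an unset command for deleting a variable. When designing it, the author assumed that deleting a nonexistent variable with unset must be abnormal, so the program should naturally raise an error.

In practice, when people call unset they are often in an ambiguous program state: they are not sure whether the variable exists. This makes unset awkward to use, and most people end up writing extra code to catch the error that unset may raise.

If the author could redesign the unset command, he would adjust its responsibility: rather than treating unset as an act of deleting a variable that may fail, he would treat it as a command that ensures a variable does not exist. With that change, unset would not need to raise any error when the variable is missing; it could simply return.

A shift in thinking: when designing an API, if you adjust your point of view slightly and change the API's abstract definition, errors that used to need handling may simply disappear. If the API raises no error, the caller has no error to handle, which greatly reduces everyone's mental burden.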
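
Python's built-in containers offer a comparable pair of designs, which makes a convenient analogy (the analogy is an addition here, not part of the original story): set.remove() treats a missing element as an error, while set.discard() and dict.pop() with a default only ensure the element is absent.

```python
names = {'alice', 'bob'}

# set.remove() treats a missing element as an error ...
try:
    names.remove('carol')
except KeyError:
    print('remove raised KeyError')

# ... while set.discard() only guarantees the element is absent afterwards.
names.discard('carol')
names.discard('bob')
print(sorted(names))  # ['alice']

# dict.pop() with a default gives the same "ensure absent" behaviour.
settings = {'debug': True}
settings.pop('verbose', None)  # no KeyError even though the key is missing
print(settings)  # {'debug': True}
```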

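The Null Object pattern

A survey produces records that are all strings; a normal score record has the format {username} {points}. Write a script that counts the qualified score records (points greater than or equal to 80).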

data = ['piglei 96', 'joe 100', 'invalid-data', 'roland $invalid_points', ...]

QUALIFIED_POINTS = 80


class CreateUserPointError(Exception):
    """Raised when creating a score record fails."""


class UserPoint:
    """A user's score record."""

    def __init__(self, username, points):
        self.username = username
        self.points = points

    def is_qualified(self):
        """Return whether the score is qualified."""
        return self.points >= QUALIFIED_POINTS


def make_userpoint(point_string):
    """Initialize a score record from a string.

    :param point_string: a string representing a score record, such as 'piglei 1'
    :return: a UserPoint object
    :raises: CreateUserPointError when the input data is invalid
    """
    try:
        username, points = point_string.split()
        points = int(points)
    except ValueError:
        raise CreateUserPointError(
            'input must follow pattern "{username} {points}"'
        )

    if points < 0:
        raise CreateUserPointError('points can not be negative')
    return UserPoint(username=username, points=points)


def count_qualified(points_data):
    """Count the number of users with a qualified score.

    :param points_data: a list of user scores in string format
    """
    result = 0
    for point_string in points_data:
        try:
            point_obj = make_userpoint(point_string)
        except CreateUserPointError:
            pass
        else:
            result += point_obj.is_qualified()
    return result


data = [
    'piglei 96',
    'nobody 61',
    'cotton 83',
    'invalid_data',
    'roland $invalid_points',
    'alfred -3',
]

print(count_qualified(data))
# Output:
# 2

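The drawback: every time a caller uses make_userpoint(), it has to add a try/except statement to catch the exception. The version below removes that burden.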

QUALIFIED_POINTS = 80


class UserPoint:
    """A user's score record."""

    def __init__(self, username, points):
        self.username = username
        self.points = points

    def is_qualified(self):
        """Return whether the score is qualified."""
        return self.points >= QUALIFIED_POINTS


class NullUserPoint:
    """An empty user score record."""

    username = ''
    points = 0

    def is_qualified(self):
        return False


def make_userpoint(point_string):
    """Initialize a score record from a string.

    :param point_string: a string representing a score record, such as 'piglei 1'
    :return: a UserPoint object if the input is valid, otherwise a NullUserPoint
    """
    try:
        username, points = point_string.split()
        points = int(points)
    except ValueError:
        return NullUserPoint()

    if points < 0:
        return NullUserPoint()
    return UserPoint(username=username, points=points)


def count_qualified(points_data):
    """Count the number of users with a qualified score.

    :param points_data: a list of user scores in string format
    """
    return sum(make_userpoint(s).is_qualified() for s in points_data)

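make_userpoint() now always returns an object that meets the caller's expectations (UserPoint() or NullUserPoint()).
The Null Object pattern is another technique that avoids error handling by changing the design perspective. When a function hits an edge case, it no longer raises an error but returns a special object that resembles a normal result, so callers do not have to handle any error and the code becomes easier to write.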

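Summary

Basics

A try statement supports multiple except clauses, but remember to put the more specific exception classes first
The else branch of a try statement runs when no exception occurs, so it can replace a flag variable
A raise statement without arguments re-raises the current exception
Context managers are often used to handle exceptions; their most common use is replacing the finally clause
A context manager can be used to ignore exceptions in a block of code
The @contextmanager decorator makes it easy to define a context manager

Error handling and argument validation

When you can choose between a conditional check and catching an exception, prefer catching the exception (EAFP)
Do not make functions return error messages; raise a custom exception instead
Validating data by hand is tedious; use a dedicated library for it whenever possible
Do not use assert for argument validation; use raise instead
Handling errors has a cost, so avoiding them through design is even better
When designing an API, consider carefully whether raising an error is really necessary
The Null Object pattern can remove some of the error handling needed for edge cases

When you catch exceptions

Overly vague and broad exception handling may keep the program from crashing, but it can also cause bigger trouble
Exception handling should be precise: wrap only the statements that may raise, and catch only the exception types that may occur
Sometimes letting the program crash early is not a bad thing
Ignoring an exception entirely is very risky; in most cases, at least write an error log entry

When you raise exceptions

Make sure the exceptions raised inside a module match the module's own abstraction level
If an exception's abstraction level is too high, replace it with a new lower-level exception
If an exception's abstraction level is too low, wrap it in a higher-level exception and raise that instead
Do not make callers match strings to tell exceptions apart; provide distinguishable exceptions whenever possible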

References

Python 工匠
