Pydantic Series: Serialization

model_dump

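model_dump converts a model into a dictionary and converts nested models recursively. The result can then be serialized to a JSON string with the Python standard library (pass mode='json' if the values themselves must be JSON-compatible). dict(model) also produces a dictionary, but nested models are left as model instances rather than converted to dictionaries.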
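
A minimal sketch of that difference, assuming pydantic v2. The Address and Person models are hypothetical names introduced only for illustration:

```python
import json

from pydantic import BaseModel


# Hypothetical models, used only to illustrate nesting
class Address(BaseModel):
    city: str


class Person(BaseModel):
    name: str
    address: Address


p = Person(name='Ann', address=Address(city='Paris'))

# model_dump converts nested models recursively
dumped = p.model_dump()
print(dumped)
#> {'name': 'Ann', 'address': {'city': 'Paris'}}

# dict(model) only converts the top level; the nested model stays a model
shallow = dict(p)
print(type(shallow['address']).__name__)
#> Address

# The recursive result can be handed to the standard library
print(json.dumps(dumped))
#> {"name": "Ann", "address": {"city": "Paris"}}
```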

Custom serialization

@field_serializer

Decorates an instance method or a static method. The decorated method can have any of the following four signatures:

  • (self, value: Any, info: FieldSerializationInfo)

  • (self, value: Any, nxt: SerializerFunctionWrapHandler, info: FieldSerializationInfo)

  • (value: Any, info: SerializationInfo)

  • (value: Any, nxt: SerializerFunctionWrapHandler, info: SerializationInfo)

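    The default mode is 'plain' (equivalent to PlainSerializer), which bypasses pydantic's own serialization logic. In this mode only the first or third signature can be used.

    The nxt parameter is the handler for pydantic's serialization chain; calling it runs pydantic's standard serialization.

    mode='wrap' uses the second or fourth signature, which makes it possible to pre-process the value, run pydantic's serialization logic through nxt, and post-process the result before returning it.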

from datetime import datetime, timedelta, timezone

from pydantic import BaseModel, ConfigDict, field_serializer
from pydantic_core.core_schema import FieldSerializationInfo, SerializerFunctionWrapHandler


class WithCustomEncoders(BaseModel):
    model_config = ConfigDict(ser_json_timedelta='iso8601')

    dt: datetime
    diff: timedelta
    diff2: timedelta

    @field_serializer('dt')
    def serialize_dt(self, dt: datetime, _info: FieldSerializationInfo):
        print(_info)
        return dt.timestamp()

    @field_serializer('diff')
    def serialize_diff(self, diff: timedelta, info: FieldSerializationInfo):
        print(info)
        return diff.total_seconds()

    # Stacked decorators are applied bottom-up, so @staticmethod runs first
    @field_serializer('diff2', mode='wrap')
    @staticmethod
    def diff2_ser(diff2: timedelta, nxt: SerializerFunctionWrapHandler, info: FieldSerializationInfo):
        value = nxt(diff2)
        return value + 'postprocess'


m = WithCustomEncoders(
    dt=datetime(2032, 6, 1, tzinfo=timezone.utc), diff=timedelta(minutes=2),
    diff2=timedelta(minutes=1)
)

print(m.model_dump_json())
# {"dt":1969660800.0,"diff":120.0,"diff2":"PT60Spostprocess"}

@model_serializer

  • (self, info: SerializationInfo), mode='plain'
  • (self, nxt: SerializerFunctionWrapHandler, info: SerializationInfo), mode='wrap'

from typing import Dict, Any

from pydantic import BaseModel, model_serializer
from pydantic_core.core_schema import SerializerFunctionWrapHandler, SerializationInfo


class Model(BaseModel):
    x: str

    @model_serializer
    def ser_model(self, info: SerializationInfo):
        print(info)
        return {'x': f'xxxxxx {self.x}'}

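    # Wrap mode: nxt runs pydantic's default model serialization.
    # Per the output below, this serializer is the one that takes effect.
    @model_serializer(mode='wrap')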
    def ser_model_wrap(self, nxt: SerializerFunctionWrapHandler, info: SerializationInfo) -> Dict[str, Any]:
        print(info)
        return {'x': f'serialized {nxt(self)}'}


print(Model(x='test value').model_dump_json())
# {"x":"serialized {'x': 'test value'}"}

PlainSerializer and WrapSerializer

from typing import Any

from typing_extensions import Annotated

from pydantic import BaseModel, SerializerFunctionWrapHandler
from pydantic.functional_serializers import WrapSerializer, PlainSerializer


def ser_wrap(v: Any, nxt: SerializerFunctionWrapHandler) -> str:
    return f'{nxt(v + 1):,}'


FancyInt = Annotated[int, WrapSerializer(ser_wrap, when_used='json')]
DoubleInt = Annotated[int, PlainSerializer(lambda x: x * 2)]


class MyModel(BaseModel):
    x: FancyInt
    y: DoubleInt


print(MyModel(x=1234, y=2).model_dump())
# {'x': 1234, 'y': 4}

print(MyModel(x=1234, y=2).model_dump(mode='json'))
# {'x': '1,235', 'y': 4}

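How to specify serialization behavior for a particular type

In pydantic v1, the model config has a json_encoders option that configures how a given type is serialized.
In pydantic v2, json_encoders is discouraged; the following approach can be used instead: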

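import datetime
from typing_extensions import Annotated
from pydantic.functional_serializers import WrapSerializer
from pydantic_core.core_schema import SerializationInfo, SerializerFunctionWrapHandler

def serialize_datetime(value: datetime.datetime, __: SerializerFunctionWrapHandler, _: SerializationInfo):
    # The wrapped handler and info are ignored; the value is formatted directly
    return value.strftime('%Y-%m-%d %H:%M:%S')

LocalDateTime = Annotated[datetime.datetime, WrapSerializer(serialize_datetime, when_used='json')]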
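
As a rough usage sketch, the annotated type can then be used as a field type. The Event model and its created_at field are hypothetical names for illustration, and the serializer definition is repeated so the example runs on its own:

```python
import datetime

from typing_extensions import Annotated

from pydantic import BaseModel
from pydantic.functional_serializers import WrapSerializer
from pydantic_core.core_schema import SerializationInfo, SerializerFunctionWrapHandler


def serialize_datetime(value: datetime.datetime, _handler: SerializerFunctionWrapHandler, _info: SerializationInfo):
    # The wrapped handler is deliberately ignored: the value is formatted directly
    return value.strftime('%Y-%m-%d %H:%M:%S')


LocalDateTime = Annotated[datetime.datetime, WrapSerializer(serialize_datetime, when_used='json')]


# Hypothetical model for illustration
class Event(BaseModel):
    created_at: LocalDateTime


e = Event(created_at=datetime.datetime(2024, 1, 2, 3, 4, 5))

# when_used='json' leaves Python-mode dumps untouched
print(e.model_dump())
#> {'created_at': datetime.datetime(2024, 1, 2, 3, 4, 5)}

# JSON-mode dumps go through the custom serializer
print(e.model_dump_json())
#> {"created_at":"2024-01-02 03:04:05"}
```

Because the behavior is attached to the type alias, every field declared as LocalDateTime picks it up without repeating a per-model decorator.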

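Serializing by declared type rather than actual type

When a field's declared type is a serializable model-like type such as BaseModel, dataclass, or TypedDict, the value is serialized according to the declared type, not its actual runtime type. To change this behavior, use SerializeAsAny.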

from pydantic import BaseModel, SerializeAsAny


class User(BaseModel):
    name: str


class UserLogin(User):
    password: str


class OuterModel(BaseModel):
    # Declared as User, so it is serialized as User: only the name field is emitted
    user: User
    user1: SerializeAsAny[User] = UserLogin(name='serialize as any', password='hunter')

# The actual runtime type is UserLogin
user = UserLogin(name='pydantic', password='hunter2')

m = OuterModel(user=user)
print(m)
# user=UserLogin(name='pydantic', password='hunter2') user1=UserLogin(name='serialize as any', password='hunter')
print(m.model_dump())
# {'user': {'name': 'pydantic'}, 'user1': {'name': 'serialize as any', 'password': 'hunter'}}

pickle

import pickle

from pydantic import BaseModel


class FooBarModel(BaseModel):
    a: str
    b: int


m = FooBarModel(a='hello', b=123)
print(m)
#> a='hello' b=123
data = pickle.dumps(m)
print(data[:20])
#> b'\x80\x04\x95\x95\x00\x00\x00\x00\x00\x00\x00\x8c\x08__main_'
m2 = pickle.loads(data)
print(m2)
#> a='hello' b=123

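Flexible exclude and include

  • exclude and include accept both sets and dicts
  • For sequence fields, a dict keyed by index selects which positions are serialized or skipped, e.g. exclude = {'items': {0: True, -1: False}} or include = {'items': {'__all__': {'id': False}}}
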
from pydantic import BaseModel, SecretStr


class User(BaseModel):
    id: int
    username: str
    password: SecretStr


class Transaction(BaseModel):
    id: str
    user: User
    value: int


t = Transaction(
    id='1234567890',
    user=User(id=42, username='JohnDoe', password='hashedpassword'),
    value=9876543210,
)

# using a set:
print(t.model_dump(exclude={'user', 'value'}))
#> {'id': '1234567890'}

# using a dict:
print(t.model_dump(exclude={'user': {'username', 'password'}, 'value': True}))
#> {'id': '1234567890', 'user': {'id': 42}}

print(t.model_dump(include={'id': True, 'user': {'id'}}))
#> {'id': '1234567890', 'user': {'id': 42}}
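
The index-based form mentioned in the list above can be sketched as follows, assuming pydantic v2. The Item and Order models are hypothetical and exist only for this illustration:

```python
from typing import List

from pydantic import BaseModel


# Hypothetical models for illustration
class Item(BaseModel):
    id: int
    name: str


class Order(BaseModel):
    items: List[Item]


order = Order(items=[Item(id=1, name='a'), Item(id=2, name='b'), Item(id=3, name='c')])

# Exclude the first item entirely by its index
first_dropped = order.model_dump(exclude={'items': {0: True}})
print(first_dropped)
#> {'items': [{'id': 2, 'name': 'b'}, {'id': 3, 'name': 'c'}]}

# '__all__' applies the nested rule to every item: drop 'id' everywhere
ids_dropped = order.model_dump(exclude={'items': {'__all__': {'id'}}})
print(ids_dropped)
#> {'items': [{'name': 'a'}, {'name': 'b'}, {'name': 'c'}]}

# Negative indexes count from the end: drop 'name' from the last item only
last_trimmed = order.model_dump(exclude={'items': {-1: {'name'}}})
print(last_trimmed)
#> {'items': [{'id': 1, 'name': 'a'}, {'id': 2, 'name': 'b'}, {'id': 3}]}
```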