Output of devlink port show
- in kernel 5.15
pci/0000:08:00.0/65535: type eth netdev eth2 flavour physical port 0 splittable false
function:
hw_addr 00:00:00:00:00:00
pci/0000:08:00.1/131071: type eth netdev eth3 flavour physical port 1 splittable false
function:
hw_addr 00:00:00:00:00:00
- in kernel 6.13
auxiliary/mlx5_core.eth.0/65535: type eth netdev ens1f0np0 flavour physical port 0 splittable false
The two kernels print different port handles. On the old kernel the handle contains the pci bus name and the PCI address (BDF). On the new kernel it contains the auxiliary bus name and the auxiliary device name, so the PCI address no longer appears.
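To make the difference between the two handles concrete, here is a minimal userspace sketch that splits a handle of the form bus/dev/index into its three parts. The struct port_handle type and the parse_port_handle() helper are hypothetical names used only for illustration; they are not part of devlink or iproute2.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type: the three parts of a devlink port handle. */
struct port_handle {
    char bus[32];
    char dev[64];
    unsigned int index;
};

/*
 * Split "bus/dev/index" into its parts. The bus name ends at the first '/',
 * the port index starts after the last '/'.
 * Returns 0 on success, -1 if the handle is malformed.
 */
static int parse_port_handle(const char *s, struct port_handle *out)
{
    const char *first = strchr(s, '/');
    const char *last = strrchr(s, '/');
    size_t bus_len, dev_len;

    if (!first || first == last)
        return -1;

    bus_len = (size_t)(first - s);
    dev_len = (size_t)(last - first - 1);
    if (bus_len == 0 || dev_len == 0 ||
        bus_len >= sizeof(out->bus) || dev_len >= sizeof(out->dev))
        return -1;

    memcpy(out->bus, s, bus_len);
    out->bus[bus_len] = '\0';
    memcpy(out->dev, first + 1, dev_len);
    out->dev[dev_len] = '\0';

    if (sscanf(last + 1, "%u", &out->index) != 1)
        return -1;
    return 0;
}
```

Parsing the 5.15 handle yields bus "pci" and dev "0000:08:00.0", while the 6.13 handle yields bus "auxiliary" and dev "mlx5_core.eth.0"; the port index is 65535 in both.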
devlink callback function
For the devlink port show command, the kernel devlink framework functions build the netlink reply.
- in kernel 6.13
devlink port show ==> devlink_nl_port_get_dumpit()
we use mlx5e_devlink_port_register() to register the port information
- in kernel 5.15
devlink port show ==> devlink_nl_cmd_port_get_dumpit()
"devlink port show" is handled by the devlink framework functions. The command does not call the driver-specific (mlx5e) struct devlink_ops (mlx5_devlink_ops) interfaces.
Both of them call devlink_nl_put_handle() to fill in the devlink handle (bus name and device name) that precedes the port index:
static int devlink_nl_put_handle(struct sk_buff *msg, struct devlink *devlink)
{
    if (nla_put_string(msg, DEVLINK_ATTR_BUS_NAME, devlink->dev->bus->name))
        return -EMSGSIZE;
    if (nla_put_string(msg, DEVLINK_ATTR_DEV_NAME, dev_name(devlink->dev)))
        return -EMSGSIZE;
    return 0;
}
This means the "pci/0000:08:00.0" and "auxiliary/mlx5_core.eth.0" parts of the handle come from devlink->dev->bus->name and dev_name(devlink->dev), and the trailing 65535 is the port index. So the key structure is devlink->dev.
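The composition of the handle can be sketched in userspace as follows. The structs here are simplified stand-ins for the kernel ones (an assumption made for illustration), and format_port_handle() is a hypothetical helper that mimics how the printed handle is assembled from the two netlink attributes plus the port index.

```c
#include <stdio.h>

/* Simplified userspace stand-ins for the kernel structures (assumptions). */
struct bus_type {
    const char *name;
};

struct device {
    const struct bus_type *bus;
    const char *init_name;
};

struct devlink {
    struct device *dev;
};

/* Stand-in for the kernel's dev_name(): returns the device name. */
static const char *dev_name(const struct device *dev)
{
    return dev->init_name;
}

/* Build "<bus name>/<device name>/<port index>" from devlink->dev. */
static int format_port_handle(char *buf, size_t len,
                              const struct devlink *devlink,
                              unsigned int port_index)
{
    return snprintf(buf, len, "%s/%s/%u",
                    devlink->dev->bus->name,
                    dev_name(devlink->dev),
                    port_index);
}
```

Whatever struct device the devlink instance points to decides both the bus name and the device name in the output.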
devlink->dev
pci level devlink
The devlink instance of the mlx5_core device is at the PCI device level. The PCI device name is set in pci_setup_device(), which is called before probe_one().
=> pci_setup_device() sets the PCI device name to the BDF
=> probe_one()
    => dev_name(dev->device) = 0000:08:00.0 (mlx5_core_dev *dev)
    => devlink = mlx5_devlink_alloc(&pdev->dev);
        => devlink_alloc()
devlink_alloc() sets devlink->dev to &pdev->dev.
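For reference, the BDF name has the form domain:bus:slot.function. The following userspace sketch reproduces that format; format_pci_name() is a hypothetical helper, not a kernel function.

```c
#include <stdio.h>

/*
 * Format a PCI address as domain:bus:slot.function (hex, zero padded),
 * the form that shows up as the PCI device name.
 */
static int format_pci_name(char *buf, size_t len, unsigned int domain,
                           unsigned int bus, unsigned int slot,
                           unsigned int func)
{
    return snprintf(buf, len, "%04x:%02x:%02x.%u", domain, bus, slot, func);
}
```

With domain 0, bus 8, slot 0 and functions 0 and 1, this gives the two names seen in the 5.15 output.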
auxiliary dev level devlink
The _mlx5e_probe() function creates the devlink port instance under a devlink instance.
HAVE_DEVLINK_PER_AUXDEV enabled
The new kernel enables the HAVE_DEVLINK_PER_AUXDEV flag, which means each auxiliary device has its own devlink instance.
=> _mlx5e_probe()
    => mlx5e_create_devlink(&adev->dev, mdev);
        => devlink_alloc(&mlx5e_devlink_ops, sizeof(*mlx5e_dev), dev);
            => devlink->dev = dev // set devlink->dev to &adev->dev, dev_name is "mlx5_core.eth.0"
So when we run "devlink port show", devlink_nl_put_handle() uses the per-adev devlink instance, gets dev_name() as mlx5_core.eth.0 and the bus name as auxiliary. That is why the PCI information is missing on the new kernel.
devlink_alloc() also allocates the private data of the devlink instance, which is used as mlx5e_dev.
The _mlx5e_probe() function associates the adev with mlx5e_dev:
auxiliary_set_drvdata(adev, mlx5e_dev);
Register the devlink port on the per-adev devlink instance
=> _mlx5e_probe()
    => err = mlx5e_devlink_port_register(mlx5e_dev, mdev);
In mlx5e_devlink_port_register(), the devlink port is registered on the per-adev devlink instance:
struct devlink *devlink = priv_to_devlink(mlx5e_dev);
return devlink_port_register(devlink, &mlx5e_dev->dl_port,
dl_port_index);
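The relation between the private data and its devlink instance can be sketched in userspace as below. This is a simplified model under the assumption that the private area sits at the tail of the devlink object and is recovered with container_of; the model_* functions and the reduced struct layouts are hypothetical, not the kernel definitions.

```c
#include <stddef.h>
#include <stdlib.h>

/* Reduced stand-in for the kernel's struct device (assumption). */
struct device {
    const char *name;
};

/* Simplified model: driver-private data lives at the tail of the object. */
struct devlink {
    struct device *dev;
    _Alignas(max_align_t) char priv[];
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Model of devlink_alloc(): allocate the instance plus priv_size bytes. */
static struct devlink *model_devlink_alloc(size_t priv_size,
                                           struct device *dev)
{
    struct devlink *devlink = calloc(1, sizeof(*devlink) + priv_size);

    if (!devlink)
        return NULL;
    devlink->dev = dev; /* this is what later decides bus and device name */
    return devlink;
}

/* Model of devlink_priv(): the driver's private area. */
static void *model_devlink_priv(struct devlink *devlink)
{
    return devlink->priv;
}

/* Model of priv_to_devlink(): walk back from private data to the instance. */
static struct devlink *model_priv_to_devlink(void *priv)
{
    return container_of(priv, struct devlink, priv);
}
```

Because the private data and the devlink instance are one allocation, passing mlx5e_dev to priv_to_devlink() returns exactly the instance whose dev was set at allocation time.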
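BTW, adev->dev.bus is initialized in auxiliary_device_init(). In upstream kernels the device name is set afterwards in __auxiliary_device_add(), as "<modname>.<name>.<id>" (here mlx5_core.eth.0). Simplified:
int auxiliary_device_init(struct auxiliary_device *auxdev)
{
    struct device *dev = &auxdev->dev;
    dev->bus = &auxiliary_bus_type; // set the bus type to the auxiliary bus
    device_initialize(dev);
    return 0;
}
int __auxiliary_device_add(struct auxiliary_device *auxdev, const char *modname)
{
    struct device *dev = &auxdev->dev;
    dev_set_name(dev, "%s.%s.%d", modname, auxdev->name, auxdev->id); // set the device name
    return device_add(dev);
}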
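As an illustration of where the mlx5_core.eth.0 name comes from, the following userspace sketch assumes the upstream naming scheme, in which the auxiliary bus builds the device name from the registering module name, the auxiliary device name, and its id. format_aux_name() is a hypothetical helper, not a kernel function.

```c
#include <stdio.h>

/*
 * Build an auxiliary device name as "<modname>.<name>.<id>".
 * Assumption: this mirrors the upstream auxiliary bus naming scheme.
 */
static int format_aux_name(char *buf, size_t len, const char *modname,
                           const char *name, unsigned int id)
{
    return snprintf(buf, len, "%s.%s.%u", modname, name, id);
}
```

With module name mlx5_core, device name eth and id 0, the result is the device name seen in the 6.13 output.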
HAVE_DEVLINK_PER_AUXDEV disabled
The old kernel does not enable the HAVE_DEVLINK_PER_AUXDEV flag, which means only the PCI-level device (mlx5_core) has a devlink instance.
In mlx5e_devlink_port_register(), the driver gets the devlink instance from the mlx5_core_dev (PCI level):
struct devlink *devlink = priv_to_devlink(priv->mdev);
and registers the devlink port with the PCI-level devlink instance:
return devlink_port_register(devlink, dl_port, dl_port_index);
So when we run "devlink port show", devlink_nl_put_handle() uses the PCI-level devlink instance, gets dev_name() as 0000:08:00.0 and the bus name as pci. That is why the PCI information is present on the old kernel.
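The two cases can be summarized with a small compile-time switch. This is only a sketch under the assumption that HAVE_DEVLINK_PER_AUXDEV is a plain preprocessor define; struct device is reduced to the two fields that matter here, and port_devlink_dev() is a hypothetical helper, not driver code.

```c
#include <stdio.h>

/* Reduced stand-in for struct device: only bus name and device name. */
struct device {
    const char *bus_name;
    const char *name;
};

/*
 * Pick the struct device that the devlink instance owning the port is
 * bound to. That device decides what "devlink port show" prints.
 */
static const struct device *port_devlink_dev(const struct device *pci_dev,
                                             const struct device *aux_dev)
{
#ifdef HAVE_DEVLINK_PER_AUXDEV
    (void)pci_dev;
    return aux_dev; /* per-adev instance: auxiliary/mlx5_core.eth.0 */
#else
    (void)aux_dev;
    return pci_dev; /* PCI-level instance: pci/0000:08:00.0 */
#endif
}
```

Building with -DHAVE_DEVLINK_PER_AUXDEV selects the auxiliary device; without it the PCI device is selected.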