SQL LeetCode: 1179. Reformat Department Table

1179. Reformat Department Table

Table: Department

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| id          | int     |
| revenue     | int     |
| month       | varchar |
+-------------+---------+
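In SQL, (id, month) is the composite primary key of this table.
The table holds the revenue of each department for each month.
month takes one of the values ["Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"].

Reformat the table so that there is a department id column and a revenue column for each month.

Return the result table in any order.

The result format is shown in the following example.

Example 1:

Input: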
Department table:
+------+---------+-------+
| id   | revenue | month |
+------+---------+-------+
| 1    | 8000    | Jan   |
| 2    | 9000    | Jan   |
| 3    | 10000   | Feb   |
| 1    | 7000    | Feb   |
| 1    | 6000    | Mar   |
+------+---------+-------+
Output:
+------+-------------+-------------+-------------+-----+-------------+
| id   | Jan_Revenue | Feb_Revenue | Mar_Revenue | ... | Dec_Revenue |
+------+-------------+-------------+-------------+-----+-------------+
| 1    | 8000        | 7000        | 6000        | ... | null        |
| 2    | 9000        | null        | null        | ... | null        |
| 3    | null        | 10000       | null        | ... | null        |
+------+-------------+-------------+-------------+-----+-------------+
Explanation: the revenue from April to December is null.
Note that the result table has 13 columns (1 for the department id, the other 12 for the months).

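Solution

Reformat the table so that there is a department id column and a revenue column for each month.

  • This is a classic rows-to-columns (pivot) problem. It can be solved with an aggregate function + GROUP BY + CASE WHEN.

Method 1: SUM + GROUP BY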

select
    id
    ,SUM(case when month='Jan' then revenue else null end) as Jan_Revenue
    ,SUM(case when month='Feb' then revenue else null end) as Feb_Revenue
    ,SUM(case when month='Mar' then revenue else null end) as Mar_Revenue
    ,SUM(case when month='Apr' then revenue else null end) as Apr_Revenue
    ,SUM(case when month='May' then revenue else null end) as May_Revenue
    ,SUM(case when month='Jun' then revenue else null end) as Jun_Revenue
    ,SUM(case when month='Jul' then revenue else null end) as Jul_Revenue
    ,SUM(case when month='Aug' then revenue else null end) as Aug_Revenue
    ,SUM(case when month='Sep' then revenue else null end) as Sep_Revenue
    ,SUM(case when month='Oct' then revenue else null end) as Oct_Revenue
    ,SUM(case when month='Nov' then revenue else null end) as Nov_Revenue
    ,SUM(case when month='Dec' then revenue else null end) as Dec_Revenue
from Department
group by id

Method 2: MAX + GROUP BY

select
    id
    ,MAX(case when month='Jan' then revenue else null end) as Jan_Revenue
    ,MAX(case when month='Feb' then revenue else null end) as Feb_Revenue
    ,MAX(case when month='Mar' then revenue else null end) as Mar_Revenue
    ,MAX(case when month='Apr' then revenue else null end) as Apr_Revenue
    ,MAX(case when month='May' then revenue else null end) as May_Revenue
    ,MAX(case when month='Jun' then revenue else null end) as Jun_Revenue
    ,MAX(case when month='Jul' then revenue else null end) as Jul_Revenue
    ,MAX(case when month='Aug' then revenue else null end) as Aug_Revenue
    ,MAX(case when month='Sep' then revenue else null end) as Sep_Revenue
    ,MAX(case when month='Oct' then revenue else null end) as Oct_Revenue
    ,MAX(case when month='Nov' then revenue else null end) as Nov_Revenue
    ,MAX(case when month='Dec' then revenue else null end) as Dec_Revenue
from Department
group by id

Method 3: MIN + GROUP BY

select
    id
    ,MIN(case when month='Jan' then revenue else null end) as Jan_Revenue
    ,MIN(case when month='Feb' then revenue else null end) as Feb_Revenue
    ,MIN(case when month='Mar' then revenue else null end) as Mar_Revenue
    ,MIN(case when month='Apr' then revenue else null end) as Apr_Revenue
    ,MIN(case when month='May' then revenue else null end) as May_Revenue
    ,MIN(case when month='Jun' then revenue else null end) as Jun_Revenue
    ,MIN(case when month='Jul' then revenue else null end) as Jul_Revenue
    ,MIN(case when month='Aug' then revenue else null end) as Aug_Revenue
    ,MIN(case when month='Sep' then revenue else null end) as Sep_Revenue
    ,MIN(case when month='Oct' then revenue else null end) as Oct_Revenue
    ,MIN(case when month='Nov' then revenue else null end) as Nov_Revenue
    ,MIN(case when month='Dec' then revenue else null end) as Dec_Revenue
from Department
group by id

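At first sight it may not be clear why SUM, MAX, or MIN is needed at all.

[Figures: two diagrams of the GROUP BY step]

The two diagrams help here. The intermediate grouped result is held internally; it is a virtual result that cannot be queried directly. Each box in the diagrams is the content of one group (a set of rows), which makes it easier to see why an aggregate function is needed: it collapses each set into a single value. Because (id, month) is the primary key, each group contains at most one row per month, so SUM, MAX, and MIN all return that one value.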
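
To make the point about interchangeable aggregates concrete, here is a small sketch that computes the February column three ways. The inline `Department` CTE simply reproduces the five sample rows so the query runs on its own (it assumes a database with CTE support, such as MySQL 8 or SQLite); the column aliases `feb_sum`, `feb_max`, `feb_min`, and `feb_rows` are made up for illustration.

```sql
-- Sample rows from the problem statement, inlined so the query is self-contained.
WITH Department AS (
    SELECT 1 AS id, 8000 AS revenue, 'Jan' AS month UNION ALL
    SELECT 2, 9000,  'Jan' UNION ALL
    SELECT 3, 10000, 'Feb' UNION ALL
    SELECT 1, 7000,  'Feb' UNION ALL
    SELECT 1, 6000,  'Mar'
)
SELECT
    id,
    -- CASE yields revenue for Feb rows and NULL otherwise; aggregates skip NULLs.
    SUM(CASE WHEN month = 'Feb' THEN revenue END)   AS feb_sum,
    MAX(CASE WHEN month = 'Feb' THEN revenue END)   AS feb_max,
    MIN(CASE WHEN month = 'Feb' THEN revenue END)   AS feb_min,
    -- Number of non-NULL values the aggregates actually saw (0 or 1 here).
    COUNT(CASE WHEN month = 'Feb' THEN revenue END) AS feb_rows
FROM Department
GROUP BY id
ORDER BY id;
-- id = 1: 7000, 7000, 7000, 1
-- id = 2: NULL, NULL, NULL, 0
-- id = 3: 10000, 10000, 10000, 1
```

Since `feb_rows` is never greater than 1, the three aggregates have only one value to work with and cannot disagree. For a department with no February row, all three return NULL, which is what the expected output asks for.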

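What happens without an aggregate function?
Without aggregation (and without GROUP BY), the number of rows does not shrink: the output has as many rows as the input, and the rows that belong to one id still have to be merged.
The result looks roughly like this: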
1, 100,null,null,null,…
2,null,100,null,null,…
1,null,100,null,null,…

Clearly the rows for id=1 have not been merged, which defeats the purpose of the rows-to-columns conversion.
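
The unmerged shape can be reproduced with the problem's own sample rows. This is only a sketch: the `Department` CTE inlines the five rows so the query is self-contained, and only the first three month columns are shown to keep it short.

```sql
-- Sample rows from the problem statement, inlined so the query is self-contained.
WITH Department AS (
    SELECT 1 AS id, 8000 AS revenue, 'Jan' AS month UNION ALL
    SELECT 2, 9000,  'Jan' UNION ALL
    SELECT 3, 10000, 'Feb' UNION ALL
    SELECT 1, 7000,  'Feb' UNION ALL
    SELECT 1, 6000,  'Mar'
)
-- No GROUP BY and no aggregate: every input row becomes one output row.
SELECT
    id,
    CASE WHEN month = 'Jan' THEN revenue END AS Jan_Revenue,
    CASE WHEN month = 'Feb' THEN revenue END AS Feb_Revenue,
    CASE WHEN month = 'Mar' THEN revenue END AS Mar_Revenue
FROM Department;
-- Five rows come back (row order is not guaranteed), for example:
-- 1, 8000, NULL,  NULL
-- 2, 9000, NULL,  NULL
-- 3, NULL, 10000, NULL
-- 1, NULL, 7000,  NULL
-- 1, NULL, NULL,  6000
```

Each row carries at most one non-NULL month value, and id=1 is spread over three rows. GROUP BY id plus an aggregate is what folds those three rows into one.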

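Worked example

Approach
Each id is one record in the result, so GROUP BY id.
Each month is one column, so each column uses CASE [WHEN…THEN…] END to pick out only its own month.
An aggregate function such as SUM() is required: without one, the query returns only a single row of each group (typically the first) after GROUP BY and CASE…END.

For example, the Department table: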
+------+---------+-------+
| id   | revenue | month |
+------+---------+-------+
| 1    | 8000    | Jan   |
| 2    | 9000    | Jan   |
| 3    | 10000   | Feb   |
| 1    | 7000    | Feb   |
| 1    | 6000    | Mar   |
+------+---------+-------+
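
For readers who want to try the queries locally, the sample table can be created with a short setup script. This is a sketch under assumptions: the problem only says `month` is a varchar, so the `VARCHAR(3)` length is a guess, and the script has not been tuned for any particular database.

```sql
-- Hypothetical setup script; VARCHAR(3) is an assumed length for month.
CREATE TABLE Department (
    id      INT,
    revenue INT,
    month   VARCHAR(3),
    PRIMARY KEY (id, month)  -- (id, month) is the composite primary key
);

-- The five sample rows from the example.
INSERT INTO Department (id, revenue, month) VALUES
    (1, 8000,  'Jan'),
    (2, 9000,  'Jan'),
    (3, 10000, 'Feb'),
    (1, 7000,  'Feb'),
    (1, 6000,  'Mar');
```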

GROUP BY ID
+------+---------+-------+
| id   | revenue | month |
+------+---------+-------+
| 1    | 8000    | Jan   |
| 1    | 7000    | Feb   |
| 1    | 6000    | Mar   |
-------------------------
| 2    | 9000    | Jan   |
-------------------------
| 3    | 10000   | Feb   |
+------+---------+-------+


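Without an aggregate function, only one row per group is output. For example,
SELECT ID, (CASE WHEN MONTH='Jan' THEN REVENUE END) AS JAN_REVENUE,
(CASE WHEN MONTH='Feb' THEN REVENUE END) AS FEB_REVENUE
FROM Department GROUP BY ID;
outputs the following on MySQL with ONLY_FULL_GROUP_BY disabled (with that mode enabled, which is the default since MySQL 5.7.5, the query is rejected with an error instead):
+------+-------------+-------------+
| ID   | JAN_REVENUE | FEB_REVENUE |
+------+-------------+-------------+
| 1    | 8000        | NULL        |
| 2    | 9000        | NULL        |
| 3    | NULL        | 10000       |
+------+-------------+-------------+
Here FEB_REVENUE for ID=1 is wrong. For ID=1, (CASE WHEN MONTH='Feb' THEN REVENUE END) evaluates to [NULL, 7000, NULL] across the rows of the group. Without an aggregate function only one of these values is taken, typically the first, which is NULL.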
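
For comparison, here is a sketch of the same two-column query with an aggregate added. MAX is used here, but SUM or MIN would behave the same way. The `Department` CTE inlines the sample rows so the query runs on its own.

```sql
-- Sample rows from the problem statement, inlined so the query is self-contained.
WITH Department AS (
    SELECT 1 AS id, 8000 AS revenue, 'Jan' AS month UNION ALL
    SELECT 2, 9000,  'Jan' UNION ALL
    SELECT 3, 10000, 'Feb' UNION ALL
    SELECT 1, 7000,  'Feb' UNION ALL
    SELECT 1, 6000,  'Mar'
)
SELECT
    id,
    -- MAX skips the NULLs in [NULL, 7000, NULL] and returns 7000 for id = 1.
    MAX(CASE WHEN month = 'Jan' THEN revenue END) AS Jan_Revenue,
    MAX(CASE WHEN month = 'Feb' THEN revenue END) AS Feb_Revenue
FROM Department
GROUP BY id
ORDER BY id;
-- id = 1: 8000, 7000
-- id = 2: 9000, NULL
-- id = 3: NULL, 10000
```

Now FEB_REVENUE for id=1 is 7000 as expected, because the aggregate looks at every row of the group rather than at a single one.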