Bootstrap

滴滴SQL真题整理

第一题

问:请统计2021年国庆7天期间在北京接单至少3次的司机的平均接单数和平均兼职收入(暂不考虑平台佣金,直接计算完成的订单费用总额),结果保留三位小数。

输出示例:

cityavg_order_num

avg_income

北京3.500121.000
with table1 as(
select driver_id, count(order_id) as order_num,sum(fare) as income
from tb_get_car_order a
join tb_get_car_record b on a.order_id = b.order_id
where city = "北京"
      and date_format(order_time, "%Y%m%d") between "20211001" and "20211007"
group by driver_id
having count(order_id) >= 3 
)

select 
"北京" as city,
round(avg(order_num),3) as avg_order_num,
round(avg(income),3) as avg_income
from table1

第二题

问:找到2021年10月有过取消订单记录的司机,计算他们每人全部已完成的有评分订单的平均评分及总体平均评分,保留1位小数。先按driver_id升序输出,再输出总体情况。

输出示例:

driver_id

avg_grade
202

4.3

2034.8
总体4.6
select 
ifnull(driver_id,"总体) as driver_id,
round(avg(grade),1) as avg_grade
from tb_get_car_order
where driver_id in (select driver_id
                    from tb_get_car_order
                    where isnull(fare)
                          and date_format(order_time, "%Y-%m") = "2021-10")
      and not isnull(grade)
group by driver_id
with rollup  

第三题

问:已知用户登陆表log_t, 包含登陆日期和登录用户ID,请查询出截止到当前日期累积登录用户数及登录用户列表

输入示例:log_t

log_dateuser_id
2024-01-01

a

2024-01-02a
2024-01-02b
2024-01-03

b

2024-01-04c
2024-01-05b
2024-01-05

c

2024-01-05d
2024-01-05e

输出示例:

log_dateuser_cntuser_list
2024-01-011["a"]
2024-01-022["a","b"]
2024-01-032["a","b"]
2024-01-043["a","b","c"]
2024-01-055["a","b","c","d","e"]
with temp as(
select log_date,
       collect_set(user_id) over(order by log_date rows between unbounded preceding and current row) as user_list
from log_t
)

select log_date,
size(user_list) as user_cnt,
user_list
from temp

第四题

问:输出’2023-01-01‘当天,每个司机的最大接单间隔(按照秒计算),要注意存在司机上一单未完成就提前接到下一单的情况。

输入示例:

订单开始表

driver_idorder_idstart_time
110001672534145
110041672535945
110081672537745
110101672538945

订单结束表

driver_idorder_idclose_time
110001672536145
110041672537705
110081672538925
110101672539345

输出

driver_idtimegap
140
WITH temp AS (
    SELECT 
        t1.driver_id, t1.order_id, t1.start_time, t2.close_time,
        unix_timestamp(LEAD(t1.start_time, 1, NULL) OVER (PARTITION BY t1.driver_id ORDER BY t1.start_time)) - unix_timestamp(t2.close_time) AS timediff
    FROM t1
    LEFT JOIN t2 ON t1.order_id = t2.order_id
)
SELECT 
    driver_id, MAX(timediff) AS max_timediff
FROM temp
GROUP BY driver_id;

第五题

问:2023年1月1日完成首单的新司机的首单后的首周完单留存率(首单后的第一至第七天有完单行为),单个司机平均的7日及30日订单总金额ARPU(ARPU从首单当日算起,平均计算包括位留存的司机)

输入:t1(分区DATE, yyyy-MM-dd)

字段d_idorder_idorder_statusorder_amount
中文司机ID订单ID订单状态订单金额
类型intintintfloat
备注————完单1;取消0若取消则为空
字段d_idcreate_datefirst_order_date
中文司机ID司机注册日期司机首单日期
类型intintstring(yyyy-MM-dd)
备注————司机无首单则为空

输出:

Rate

ARPU_7dARPU_30d
0.620

35

WITH temp AS (
    SELECT  a.d_id, a.order_status, a.order_amount, a.DATE, b.first_order_date,
    DATEDIFF(a.DATE, b.first_order_date) AS DATEGAP
    FROM 
        (SELECT * FROM t1 WHERE DATE BETWEEN '2023-01-01' AND '2023-02-28') a
    LEFT JOIN 
        (SELECT * FROM t2 WHERE first_order_date = '2023-01-01') b
    ON a.d_id = b.d_id
),

retention AS (
    SELECT 
        d_id, 
        COUNT(DISTINCT CASE WHEN DATEGAP BETWEEN 1 AND 7 AND order_status = 1 THEN d_id ELSE NULL END) AS is_rnt,
        SUM(CASE WHEN DATEGAP BETWEEN 0 AND 6 THEN order_amount ELSE 0 END) AS arpu_7d,
        SUM(CASE WHEN DATEGAP BETWEEN 0 AND 29 THEN order_amount ELSE 0 END) AS arpu_30d
    FROM temp
    GROUP BY d_id
)

SELECT
    SUM(is_rnt) / COUNT(DISTINCT d_id) AS rate,
    SUM(arpu_7d) / COUNT(DISTINCT d_id) AS arpu_7d_avg,
    SUM(arpu_30d) / COUNT(DISTINCT d_id) AS arpu_30d_avg
FROM 
    retention;

;