
A frontend JS classic: splitting large files into chunks

Preface: when a user uploads a large file, we need to keep the page responsive, report upload progress, and decide whether, if the user cancels and later uploads the same file again, the transfer has to start over from the beginning. The chunking pipeline is built in the steps below; a sketch of the upload side itself follows step 5.

1. HTML structure

An input element lets the user pick the file to upload. We will use MD5 hashes of the file's chunks so that an interrupted upload can be resumed: parts that have already been uploaded are not uploaded again.

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <meta http-equiv="X-UA-Compatible" content="IE=edge" />
    <meta
      name="viewport"
      content="initial-scale=1.0, user-scalable=no, width=device-width"
    />
    <title>document</title>
    <style></style>
  </head>
  <body>
    <input type="file" />
    <script
      src="https://lf9-cdn-tos.bytecdntp.com/cdn/expire-1-M/spark-md5/3.0.2/spark-md5.min.js"
      type="application/javascript"
    ></script>
    <script src="./main.js"></script>
  </body>
</html>

2. main.js

Listen for changes on the input, grab the selected file, and hand it to the cutFile function to be split into chunks; the chunking result is then returned.

import { cutFile } from "./cutFile.js";

const inputFile = document.querySelector('input[type="file"]');

inputFile.onchange = async (e) => {
  const file = e.target.files[0];
  const chunks = await cutFile(file); // ordered array of chunk descriptors
  console.log(chunks);
};

3. cutFile.js

Write the chunking function: define CHUNK_SIZE (the size of each chunk) and use the number of CPU cores the machine reports to decide how many worker threads to start. The cutFile function takes the file and splits the chunking work across those threads; a worked example of the partitioning follows the code.

const CHUNK_SIZE = 1024 * 1024 * 5; // 5 MB per chunk
const THREAD_COUNT = navigator.hardwareConcurrency || 4; // number of logical CPU cores, with a fallback of 4

export function cutFile(file) {
  return new Promise((resolve) => {
    const chunkCount = Math.ceil(file.size / CHUNK_SIZE);
    // How many chunk indices each worker thread is responsible for.
    const threadChunkCount = Math.ceil(chunkCount / THREAD_COUNT);
    const result = [];
    let finishCount = 0;

    for (let i = 0; i < THREAD_COUNT; i++) {
      const worker = new Worker("./worker.js", {
        type: "module",
      });
      // This worker handles the half-open range [start, end) of chunk indices.
      const start = i * threadChunkCount;
      let end = (i + 1) * threadChunkCount;
      if (end > chunkCount) {
        end = chunkCount; // the last worker may get fewer chunks
      }
      worker.postMessage({
        file,
        CHUNK_SIZE,
        startChunkIndex: start,
        endChunkIndex: end,
      });
      worker.onmessage = (e) => {
        // Place this worker's chunks back into their global positions.
        for (let i = start; i < end; i++) {
          result[i] = e.data[i - start];
        }
        worker.terminate();
        finishCount++;
        if (finishCount === THREAD_COUNT) {
          resolve(result); // all workers done: return the ordered chunk list
        }
      };
    }
  });
}
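
As a concrete example of the partitioning: a 52 MB file produces Math.ceil(52 / 5) = 11 chunks; with THREAD_COUNT = 4, each worker is assigned Math.ceil(11 / 4) = 3 chunk indices, so the workers receive the ranges [0, 3), [3, 6), [6, 9) and, after clamping end to chunkCount, [9, 11).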

4. worker.js

Define the chunking handler that runs inside each worker thread.

// worker.js
import { createChunk } from "./createChunk.js";

onmessage = async (e) => {
  // The payload posted from cutFile.js arrives on e.data (not e.target).
  const {
    file,
    CHUNK_SIZE,
    startChunkIndex: start,
    endChunkIndex: end,
  } = e.data;

  // Kick off hashing for every chunk in this worker's range concurrently.
  const proms = [];
  for (let i = start; i < end; i++) {
    proms.push(createChunk(file, i, CHUNK_SIZE));
  }
  const chunks = await Promise.all(proms);
  postMessage(chunks);
};

5. createChunk.js

Give every chunk an MD5 hash so the server can tell whether that chunk has already been uploaded.

import SparkMD5 from "./sparkmd5.js"; // local module copy of spark-md5 (the page's CDN script is not visible inside a worker)

export function createChunk(file, index, chunkSize) {
  return new Promise((resolve) => {
    const start = index * chunkSize;
    const end = Math.min(start + chunkSize, file.size); // clamp the last chunk to the file size
    const spark = new SparkMD5.ArrayBuffer();
    const fileReader = new FileReader();
    const blob = file.slice(start, end);

    fileReader.onload = (e) => {
      // Hash the chunk's bytes and resolve with the chunk's metadata.
      spark.append(e.target.result);
      resolve({
        start,
        end,
        index,
        hash: spark.end(),
        blob,
      });
    };

    fileReader.readAsArrayBuffer(blob);
  });
}
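
The five steps above stop at producing the chunk list. To connect back to the goals in the preface (progress reporting and skipping chunks that were already uploaded), here is a minimal sketch of the upload side. The endpoints /upload/check and /upload/chunk, and the shape of their request and response bodies, are assumptions made for illustration rather than part of the original code; adapt them to whatever your backend expects.

// upload.js — a sketch only; the endpoints and response shapes are assumed.
export async function uploadChunks(chunks, onProgress) {
  // Ask the server which chunk hashes it already has, so those can be skipped.
  const checkRes = await fetch("/upload/check", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ hashes: chunks.map((c) => c.hash) }),
  });
  const { uploaded } = await checkRes.json(); // assumed shape: { uploaded: [hash, ...] }
  const uploadedSet = new Set(uploaded);

  let finished = 0;
  for (const chunk of chunks) {
    if (!uploadedSet.has(chunk.hash)) {
      // Only chunks the server does not yet have are sent.
      const form = new FormData();
      form.append("hash", chunk.hash);
      form.append("index", String(chunk.index));
      form.append("blob", chunk.blob);
      await fetch("/upload/chunk", { method: "POST", body: form });
    }
    finished++;
    onProgress?.(finished / chunks.length); // progress as a fraction between 0 and 1
  }
}

In main.js you could then replace console.log(chunks) with something like await uploadChunks(chunks, (p) => console.log(`progress: ${Math.round(p * 100)}%`)). Uploading sequentially keeps the sketch short; in practice a small pool of concurrent requests is usually faster.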
