Linux Notes--Converting scanned PDFs into searchable PDFs with OCRmyPDF

1--Official repository

https://github.com/ocrmypdf/OCRmyPDF

2--Basic steps

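# Install the ocrmypdf package
sudo apt install ocrmypdf

# Install the Simplified Chinese language data for Tesseract
sudo apt-get install tesseract-ocr-chi-sim

# Convert
# -l selects the OCR language (chi_sim = Simplified Chinese)
# --force-ocr runs OCR even on pages that already contain text; without it, such files abort with:
#   ERROR - PriorOcrFoundError: page already has text! - aborting (use --force-ocr to force OCR)
# input.pdf is the PDF to convert
# output.pdf is where the converted PDF is saved
ocrmypdf -l chi_sim input.pdf output.pdf --force-ocr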
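
Before converting, it can help to confirm that the chi_sim language data is actually visible to Tesseract, since -l chi_sim fails if it is not. The following is a minimal sketch, not part of the original notes: has_lang is a hypothetical helper name, and the check relies on tesseract --list-langs printing one language code per line.

```shell
#!/bin/sh
# has_lang LANG (hypothetical helper): succeeds if LANG appears on stdin
# as a whole line, e.g. in the output of `tesseract --list-langs`.
has_lang() {
    grep -qx -- "$1"
}

# Only query Tesseract if it is installed; older versions print the list
# on stderr, so both streams are merged before matching.
if command -v tesseract >/dev/null 2>&1; then
    if tesseract --list-langs 2>&1 | has_lang chi_sim; then
        echo "chi_sim is installed"
    else
        echo "chi_sim is missing; install tesseract-ocr-chi-sim"
    fi
fi
```

If the check reports that chi_sim is missing, rerunning the language-data install step above should fix it.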

3--Common errors

Error 1:

ERROR - PriorOcrFoundError: page already has text! - aborting (use --force-ocr to force OCR)

Solution:

Add --force-ocr so that OCR runs even on pages that already contain text:

ocrmypdf -l chi_sim input.pdf output.pdf --force-ocr
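
Note that --force-ocr rasterizes every page before running OCR again, so any existing text layer is discarded. OCRmyPDF also documents two gentler options: --skip-text leaves pages that already have text untouched, and --redo-ocr replaces only a previous hidden OCR layer. The sketch below is an illustration rather than part of the original notes: ocr_flag and its mode names are hypothetical, and the flags should be checked against ocrmypdf --help for the installed version.

```shell
#!/bin/sh
# ocr_flag MODE (hypothetical helper): map a mode name to an ocrmypdf option.
ocr_flag() {
    case "$1" in
        force) echo "--force-ocr" ;;  # rasterize every page and OCR it again
        skip)  echo "--skip-text" ;;  # do not OCR pages that already have text
        redo)  echo "--redo-ocr" ;;   # redo an existing hidden OCR text layer
        *)     echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# Example run, guarded so nothing happens unless ocrmypdf and input.pdf exist.
if command -v ocrmypdf >/dev/null 2>&1 && [ -f input.pdf ]; then
    ocrmypdf -l chi_sim "$(ocr_flag skip)" input.pdf output.pdf
fi
```

For a file that mixes scanned pages with pages that already have real text, skip is usually the safer choice, because the existing text is kept as is.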
