1. Download the project
2noise/ChatTTS: ChatTTS is a generative speech model for daily dialogue. (github.com)
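Simply git clone the repository (or download the ZIP archive and extract it).
2. Download the model
I tested two approaches; downloading the model from code is the simplest and most convenient.
# Download the model with the SDK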
from modelscope import snapshot_download
model_dir = snapshot_download('pzc163/chatTTS')
If you do not specify an absolute path, the model is saved to this location:
C:\Users\Administrator\.cache\modelscope\hub\pzc163
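After the download finishes, move the model into your own project directory.
3. Install the environment
The following libraries are required: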
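The move can also be scripted. The following is a minimal sketch, not part of the original post: `move_model` is a hypothetical helper, and both paths are placeholders that you would replace with the cache folder printed by `snapshot_download` and the folder you later pass as `local_path`.

```python
import shutil
from pathlib import Path


def move_model(downloaded_dir, target_dir):
    """Move a downloaded model folder to target_dir (hypothetical helper).

    Refuses to overwrite an existing target so that an existing folder
    in the project is never replaced silently.
    """
    src = Path(downloaded_dir)
    dst = Path(target_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"model directory not found: {src}")
    if dst.exists():
        raise FileExistsError(f"target already exists: {dst}")
    # Create the parent folder of the target if it is missing.
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(dst))
    return dst
```

For example, `move_model(model_dir, "path/to/project/models")` would relocate the folder returned by `snapshot_download`; the target path here is only an illustration.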
omegaconf~=2.3.0
torch~=2.1.0
tqdm
einops
vector_quantize_pytorch
transformers~=4.41.1
vocos
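In a fresh environment, install everything at once; if some of the libraries are already present, pip install only the missing ones.
To install everything, run: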
pip install -r requirements.txt
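If the environment already has some of these libraries, it helps to see which ones are still missing before installing. The sketch below uses only the standard library; `missing_packages` is a hypothetical helper name, and it checks presence only, not the version constraints listed above.

```python
from importlib.metadata import PackageNotFoundError, version

# Names taken from the requirements list above (version pins omitted).
REQUIRED = [
    "omegaconf",
    "torch",
    "tqdm",
    "einops",
    "vector_quantize_pytorch",
    "transformers",
    "vocos",
]


def missing_packages(names):
    """Return the names for which no installed distribution is found."""
    missing = []
    for name in names:
        try:
            version(name)  # raises PackageNotFoundError if not installed
        except PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("missing:", missing_packages(REQUIRED))
```

Any name it reports can then be installed individually with pip install.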
4. Run Demo.py and save the result
import scipy.io.wavfile  # import the submodule explicitly so scipy.io.wavfile.write resolves
import ChatTTS
from IPython.display import Audio
chat = ChatTTS.Chat()
chat.load_models(source='local', local_path='ChatTTS')
params_infer_code = {'prompt':'[speed_5]', 'temperature':.3}
params_refine_text = {'prompt':'[oral_2][laugh_0][break_6]'}
texts = ["四川美食可多了,[uv_break] 有麻辣火锅、宫保鸡丁、麻婆豆腐、[uv_break] 担担面、回锅肉、夫妻肺片等, [uv_break] 每样都让人垂涎三尺。"]
wav = chat.infer(
    texts,
    params_refine_text=params_refine_text,
    params_infer_code=params_infer_code,
)
#texts = ["This is a test of the ChatTTS script. Peter Piper picked a peck of pickled peppers. Red leather. Yellow leather. Red leather. Yellow leather. Red leather. Yellow leather.",]
# wavs = chat.infer(texts, use_decoder=True)
Audio(wav[0], rate=24_000, autoplay=True)
scipy.io.wavfile.write(filename = "output.wav", rate = 24_000, data = wav[0].T)
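Some players handle 16-bit PCM files more reliably than floating-point WAV files. As an alternative way to save the result, here is a minimal sketch that uses only the standard library; `write_pcm16` is a hypothetical helper, and it assumes the samples are mono floats in the range [-1.0, 1.0].

```python
import struct
import wave


def write_pcm16(path, samples, rate=24_000):
    """Write mono float samples as a 16-bit PCM WAV file.

    Values outside [-1.0, 1.0] are clipped. Returns the number of
    frames written.
    """
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        frames += struct.pack("<h", int(round(s * 32767)))
    with wave.open(str(path), "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes = 16 bits per sample
        f.setframerate(rate)
        f.writeframes(bytes(frames))
    return len(frames) // 2
```

Assuming `wav[0]` is a NumPy array as in the demo above, a call might look like `write_pcm16("output_pcm16.wav", wav[0].flatten().tolist())`.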
5. Errors
On Windows the demo raises an error. To fix it, edit line 75 of ChatTTS/core.py so that the compile parameter defaults to False:
compile: bool = False,
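Instead of editing the library source, the flag could be chosen at call time based on the operating system. This is a sketch under an assumption: that `compile` is a keyword argument of `load_models`, which the line above suggests but which you should confirm in your copy of core.py. `should_compile` is a hypothetical helper.

```python
import platform


def should_compile(system_name=None):
    """Return False on Windows, True elsewhere.

    system_name defaults to the value reported by platform.system().
    """
    name = system_name if system_name is not None else platform.system()
    return name != "Windows"
```

Under that assumption, the call in Demo.py would become `chat.load_models(source='local', local_path='ChatTTS', compile=should_compile())`.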
Closing note: the content above is for learning purposes only.