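I ran into the following problem when running speech synthesis inference with the ModelScope Cantonese model. Has anyone else encountered this?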
Traceback (most recent call last):
  File "kantts/bin/text_to_wav.py", line 234, in <module>
    args.lang,
  File "kantts/bin/text_to_wav.py", line 161, in text_to_wav
    am_infer(symbols_file, am_ckpt, output_dir, se_file)
  File "/root/KAN-TTS/kantts/bin/infer_sambert.py", line 222, in am_infer
    line[1], fsnet, ling_unit, device, se=se
  File "/root/KAN-TTS/kantts/bin/infer_sambert.py", line 87, in am_synthesis
    [inputs_sy, inputs_tone, inputs_syllable, inputs_ws], dim=-1
RuntimeError: stack expects each tensor to be equal size, but got [5] at entry 0 and [21] at entry 1
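
For anyone debugging the same error: the message says the stack call received tensors of different lengths. Going by the order of the list in the traceback, entry 0 is presumably inputs_sy (5 elements) and entry 1 is inputs_tone (21 elements). Below is a minimal sketch of a length check that could be run on the four sequences just before the stack call. The helper name find_length_mismatch is hypothetical and not part of KAN-TTS, plain lists stand in for the tensors, and the lengths of inputs_syllable and inputs_ws are assumed since the error does not report them.

```python
def find_length_mismatch(named_seqs):
    """Return {name: length} if the sequences differ in length, else {}."""
    lengths = {name: len(seq) for name, seq in named_seqs.items()}
    if len(set(lengths.values())) > 1:
        return lengths
    return {}


# Hypothetical stand-ins for the four feature sequences in am_synthesis.
# Lengths 5 and 21 come from the error message; the other two are assumed.
features = {
    "inputs_sy": [0] * 5,
    "inputs_tone": [0] * 21,
    "inputs_syllable": [0] * 21,
    "inputs_ws": [0] * 21,
}

mismatch = find_length_mismatch(features)
if mismatch:
    print("length mismatch before stack:", mismatch)
```

If only one sequence comes out shorter than the others, that usually points at the step that produced it, for example how the input symbols were parsed, rather than at the stack call itself.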