
PTTS-basemodel fine-tuning error

I followed the instructions at https://modelscope.cn/models/damo/speech_personal_sambert-hifigan_nsf_tts_zh-cn_pretrain_16k/summary, but got stuck at the train step.

**AttributeError: 'Voice' object has no attribute 'local_rank'**

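While I'm here: is there a recommended way to debug code in the notebook? It is less convenient than working locally, but I can't get the environment installed on my Mac.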
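One option for debugging inside a notebook is post-mortem inspection: catch the exception and look at the local variables of the frame where it was raised. The sketch below is a minimal illustration of that idea, not necessarily the method the poster ended up using. The helper `innermost_frame_info` and the `Toy` class are hypothetical names introduced only for this example.

```python
def innermost_frame_info(func):
    """Call func(); if it raises, describe the frame where the error occurred."""
    try:
        func()
    except Exception as exc:
        tb = exc.__traceback__
        while tb.tb_next is not None:  # walk down to the deepest frame
            tb = tb.tb_next
        frame = tb.tb_frame
        return {
            "error": f"{type(exc).__name__}: {exc}",
            "function": frame.f_code.co_name,
            "line": tb.tb_lineno,
            "locals": sorted(frame.f_locals),
        }
    return None  # func() finished without raising


class Toy:
    """Hypothetical class that fails the same way as the report above."""

    def run(self):
        step = 1
        return self.local_rank  # deliberately missing attribute


info = innermost_frame_info(Toy().run)
print(info["error"])
print(info["function"], info["locals"])
```

In an IPython or Jupyter cell, running the `%debug` magic right after a failed cell opens an interactive debugger on the same innermost frame.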

Full execution output:


2023-04-18 01:35:45,921 - modelscope - INFO - Set workdir to ./pretrain_work_dir/
2023-04-18:01:35:45, INFO [tts_trainer.py:81] Set workdir to ./pretrain_work_dir/
2023-04-18 01:35:45,942 - modelscope - INFO - load ./output_training_data/
2023-04-18:01:35:45, INFO [tts_trainer.py:104] load ./output_training_data/
2023-04-18 01:35:46,307 - modelscope - INFO - Use user-specified model revision: v1.0.4
2023-04-18:01:35:46, INFO [api.py:463] Use user-specified model revision: v1.0.4
2023-04-18 01:35:51,268 - modelscope - INFO - am_config=./pretrain_work_dir/orig_model/basemodel_16k/sambert/config.yaml voc_config=./pretrain_work_dir/orig_model/basemodel_16k/hifigan/config.yaml
2023-04-18:01:35:51, INFO [voice.py:150] am_config=./pretrain_work_dir/orig_model/basemodel_16k/sambert/config.yaml voc_config=./pretrain_work_dir/orig_model/basemodel_16k/hifigan/config.yaml
2023-04-18 01:35:51,269 - modelscope - INFO - audio_config=./pretrain_work_dir/orig_model/basemodel_16k/audio_config_se_16k.yaml
2023-04-18:01:35:51, INFO [voice.py:152] audio_config=./pretrain_work_dir/orig_model/basemodel_16k/audio_config_se_16k.yaml
2023-04-18 01:35:51,269 - modelscope - INFO - am_ckpts=OrderedDict([(3180000, './pretrain_work_dir/orig_model/basemodel_16k/sambert/ckpt/checkpoint_3180000.pth')])
2023-04-18:01:35:51, INFO [voice.py:153] am_ckpts=OrderedDict([(3180000, './pretrain_work_dir/orig_model/basemodel_16k/sambert/ckpt/checkpoint_3180000.pth')])
2023-04-18 01:35:51,270 - modelscope - INFO - voc_ckpts=OrderedDict([(2400000, './pretrain_work_dir/orig_model/basemodel_16k/hifigan/ckpt/checkpoint_2400000.pth')])
2023-04-18:01:35:51, INFO [voice.py:154] voc_ckpts=OrderedDict([(2400000, './pretrain_work_dir/orig_model/basemodel_16k/hifigan/ckpt/checkpoint_2400000.pth')])
2023-04-18 01:35:51,271 - modelscope - INFO - se_path=./pretrain_work_dir/orig_model/se.npy se_model_path=./pretrain_work_dir/orig_model/basemodel_16k/speaker_embedding/se.onnx
2023-04-18:01:35:51, INFO [voice.py:156] se_path=./pretrain_work_dir/orig_model/se.npy se_model_path=./pretrain_work_dir/orig_model/basemodel_16k/speaker_embedding/se.onnx
2023-04-18 01:35:51,271 - modelscope - INFO - mvn_path=./pretrain_work_dir/orig_model/mvn.npy
2023-04-18:01:35:51, INFO [voice.py:157] mvn_path=./pretrain_work_dir/orig_model/mvn.npy
festival_initialize() called more than once
100%|██████████| 40/40 [00:00<00:00, 16466.01it/s]
2023-04-18:01:35:57, INFO [TextScriptConvertor.py:469] TextScriptConvertor.process:
Save script to: ./pretrain_work_dir/data/Script.xml
2023-04-18:01:35:57, INFO [TextScriptConvertor.py:490] TextScriptConvertor.process:
Save metafile to: ./pretrain_work_dir/data/raw_metafile.txt
2023-04-18:01:35:57, INFO [audio_processor.py:90] [AudioProcessor] Initialize AudioProcessor.
2023-04-18:01:35:57, INFO [audio_processor.py:91] [AudioProcessor] config params:
2023-04-18:01:35:57, INFO [audio_processor.py:93] [AudioProcessor] wav_normalize: True
2023-04-18:01:35:57, INFO [audio_processor.py:93] [AudioProcessor] trim_silence: True
2023-04-18:01:35:57, INFO [audio_processor.py:93] [AudioProcessor] trim_silence_threshold_db: 60
2023-04-18:01:35:57, INFO [audio_processor.py:93] [AudioProcessor] preemphasize: False
2023-04-18:01:35:57, INFO [audio_processor.py:93] [AudioProcessor] sampling_rate: 16000
2023-04-18:01:35:57, INFO [audio_processor.py:93] [AudioProcessor] hop_length: 200
2023-04-18:01:35:57, INFO [audio_processor.py:93] [AudioProcessor] win_length: 1000
2023-04-18:01:35:57, INFO [audio_processor.py:93] [AudioProcessor] n_fft: 2048
2023-04-18:01:35:57, INFO [audio_processor.py:93] [AudioProcessor] n_mels: 80
2023-04-18:01:35:57, INFO [audio_processor.py:93] [AudioProcessor] fmin: 0.0
2023-04-18:01:35:57, INFO [audio_processor.py:93] [AudioProcessor] fmax: 8000.0
2023-04-18:01:35:57, INFO [audio_processor.py:93] [AudioProcessor] phone_level_feature: True
2023-04-18:01:35:57, INFO [audio_processor.py:93] [AudioProcessor] se_feature: True
2023-04-18:01:35:57, INFO [audio_processor.py:93] [AudioProcessor] norm_type: mean_std
2023-04-18:01:35:57, INFO [audio_processor.py:93] [AudioProcessor] max_norm: 1.0
2023-04-18:01:35:57, INFO [audio_processor.py:93] [AudioProcessor] symmetric: False
2023-04-18:01:35:57, INFO [audio_processor.py:93] [AudioProcessor] min_level_db: -100.0
2023-04-18:01:35:57, INFO [audio_processor.py:93] [AudioProcessor] ref_level_db: 20
2023-04-18:01:35:57, INFO [audio_processor.py:93] [AudioProcessor] num_workers: 16
2023-04-18:01:35:57, INFO [audio_processor.py:201] [AudioProcessor] Amplitude normalization started
2023-04-18:01:35:57, INFO [utils.py:184] Volume statistic proceeding...
100%|██████████| 20/20 [00:00<00:00, 67.79it/s]
2023-04-18:01:35:58, INFO [utils.py:170] Average amplitude RMS : 0.034393049999999994
2023-04-18:01:35:58, INFO [utils.py:186] Volume statistic done.
2023-04-18:01:35:58, INFO [utils.py:194] Volume normalization proceeding...
100%|██████████| 20/20 [00:00<00:00, 68.19it/s]
2023-04-18:01:35:58, INFO [utils.py:221] Volume normalization done.
2023-04-18:01:35:58, INFO [audio_processor.py:204] [AudioProcessor] Amplitude normalization finished
2023-04-18:01:35:58, INFO [audio_processor.py:394] [AudioProcessor] Duration generation started
  0%|          | 0/20 [00:00<?, ?it/s]2023-04-18:01:35:58, INFO [audio_processor.py:411] [AudioProcessor] Duration align with mel is proceeding...
100%|██████████| 20/20 [00:00<00:00, 29.46it/s]
2023-04-18:01:35:59, INFO [audio_processor.py:453] [AudioProcessor] Duration generate finished
2023-04-18:01:35:59, INFO [audio_processor.py:278] [AudioProcessor] Trim silence with interval started
2023-04-18:01:35:59, INFO [audio_processor.py:216] [AudioProcessor] Start to load pcm from ./pretrain_work_dir/data/wav
100%|██████████| 20/20 [00:00<00:00, 80.84it/s]
  0%|          | 0/20 [00:00<?, ?it/s]
100%|██████████| 20/20 [00:00<00:00, 99.79it/s] 
2023-04-18:01:36:00, INFO [audio_processor.py:314] [AudioProcessor] Trim silence finished
2023-04-18:01:36:00, INFO [audio_processor.py:322] [AudioProcessor] Melspec extraction started
100%|██████████| 20/20 [00:01<00:00, 14.38it/s]
2023-04-18:01:36:01, INFO [audio_processor.py:361] [AudioProcessor] Melspec extraction finished
2023-04-18:01:36:01, INFO [audio_processor.py:365] Melspec statistic proceeding...
100%|██████████| 20/20 [00:00<00:00, 32201.95it/s]
100%|██████████| 20/20 [00:00<00:00, 14391.16it/s]
2023-04-18:01:36:01, INFO [audio_processor.py:368] Melspec statistic done
2023-04-18:01:36:01, INFO [audio_processor.py:374] [AudioProcessor] melspec mean and std saved to:
./pretrain_work_dir/data/mel/mel_mean.txt,
./pretrain_work_dir/data/mel/mel_std.txt
2023-04-18:01:36:01, INFO [audio_processor.py:378] [AudioProcessor] Melspec mean std norm is proceeding...
2023-04-18:01:36:01, INFO [audio_processor.py:384] [AudioProcessor] Melspec normalization finished
2023-04-18:01:36:01, INFO [audio_processor.py:385] [AudioProcessor] Normed Melspec saved to ./pretrain_work_dir/data/mel
2023-04-18:01:36:02, INFO [audio_processor.py:467] [AudioProcessor] Pitch extraction started
  0%|          | 0/20 [00:00<?, ?it/s]2023-04-18:01:36:02, INFO [audio_processor.py:483] [AudioProcessor] Pitch align with mel is proceeding...
100%|██████████| 20/20 [00:00<00:00, 61.56it/s]
2023-04-18:01:36:02, INFO [audio_processor.py:510] [AudioProcessor] Pitch normalization is proceeding...
100%|██████████| 20/20 [00:00<00:00, 63791.70it/s]
100%|██████████| 20/20 [00:00<00:00, 50231.19it/s]
2023-04-18:01:36:02, INFO [audio_processor.py:518] [AudioProcessor] f0 mean and std saved to:
./pretrain_work_dir/data/f0/f0_mean.txt,
./pretrain_work_dir/data/f0/f0_std.txt
2023-04-18:01:36:02, INFO [audio_processor.py:521] [AudioProcessor] Pitch mean std norm is proceeding...
2023-04-18:01:36:03, INFO [audio_processor.py:548] [AudioProcessor] Pitch turn to phone-level is proceeding...
100%|██████████| 20/20 [00:00<00:00, 78.34it/s]
2023-04-18:01:36:03, INFO [audio_processor.py:580] [AudioProcessor] Pitch normalization finished
2023-04-18:01:36:03, INFO [audio_processor.py:581] [AudioProcessor] Normed f0 saved to ./pretrain_work_dir/data/f0
2023-04-18:01:36:03, INFO [audio_processor.py:582] [AudioProcessor] Pitch extraction finished
2023-04-18:01:36:03, INFO [audio_processor.py:593] [AudioProcessor] Energy extraction started
100%|██████████| 20/20 [00:00<00:00, 72.43it/s]
100%|██████████| 20/20 [00:00<00:00, 48856.19it/s]
100%|██████████| 20/20 [00:00<00:00, 49961.93it/s]
2023-04-18:01:36:04, INFO [audio_processor.py:638] [AudioProcessor] energy mean and std saved to:
./pretrain_work_dir/data/energy/energy_mean.txt,
./pretrain_work_dir/data/energy/energy_std.txt
2023-04-18:01:36:04, INFO [audio_processor.py:642] [AudioProcessor] Energy mean std norm is proceeding...
100%|██████████| 20/20 [00:00<00:00, 77.74it/s]
2023-04-18:01:36:05, INFO [audio_processor.py:690] [AudioProcessor] Energy normalization finished
2023-04-18:01:36:05, INFO [audio_processor.py:691] [AudioProcessor] Normed Energy saved to ./pretrain_work_dir/data/energy
2023-04-18:01:36:05, INFO [audio_processor.py:692] [AudioProcessor] Energy extraction finished
2023-04-18:01:36:05, INFO [audio_processor.py:774] [AudioProcessor] All features extracted successfully!
2023-04-18:01:36:05, INFO [data_process.py:192] Processing audio done.
2023-04-18:01:36:05, INFO [se_processor.py:63] [SpeakerEmbeddingProcessor] Speaker embedding extractor started
2023-04-18:01:36:09, INFO [se_processor.py:105] [SpeakerEmbeddingProcessor] Speaker embedding extracted successfully!
2023-04-18:01:36:09, INFO [data_process.py:201] Processing speaker embedding done.
2023-04-18:01:36:09, INFO [data_process.py:203] Processing done.
2023-04-18:01:36:09, INFO [data_process.py:49] Voc metafile generated.
2023-04-18:01:36:09, INFO [data_process.py:63] AM metafile generated.
2023-04-18 01:36:09,381 - modelscope - INFO - Start training....
2023-04-18:01:36:09, INFO [sambert_hifi.py:236] Start training....
2023-04-18 01:36:09,381 - modelscope - INFO - Start SAMBERT training...
2023-04-18:01:36:09, INFO [sambert_hifi.py:238] Start SAMBERT training...
2023-04-18 01:36:09,382 - modelscope - INFO - TRAIN SAMBERT....
2023-04-18:01:36:09, INFO [voice.py:358] TRAIN SAMBERT....
2023-04-18 01:36:09,400 - modelscope - INFO - TRAINING steps: 3180202
2023-04-18:01:36:09, INFO [voice.py:394] TRAINING steps: 3180202
2023-04-18 01:36:09,419 - modelscope - INFO - audio_config = {'fmax': 8000.0, 'fmin': 0.0, 'hop_length': 200, 'max_norm': 1.0, 'min_level_db': -100.0, 'n_fft': 2048, 'n_mels': 80, 'norm_type': 'mean_std', 'num_workers': 16, 'phone_level_feature': True, 'preemphasize': False, 'ref_level_db': 20, 'sampling_rate': 16000, 'symmetric': False, 'trim_silence': True, 'trim_silence_threshold_db': 60, 'wav_normalize': True, 'win_length': 1000}
2023-04-18:01:36:09, INFO [voice.py:403] audio_config = {'fmax': 8000.0, 'fmin': 0.0, 'hop_length': 200, 'max_norm': 1.0, 'min_level_db': -100.0, 'n_fft': 2048, 'n_mels': 80, 'norm_type': 'mean_std', 'num_workers': 16, 'phone_level_feature': True, 'preemphasize': False, 'ref_level_db': 20, 'sampling_rate': 16000, 'symmetric': False, 'trim_silence': True, 'trim_silence_threshold_db': 60, 'wav_normalize': True, 'win_length': 1000}
2023-04-18 01:36:09,419 - modelscope - INFO - Loss = {'MelReconLoss': {'enable': True, 'params': {'loss_type': 'mae'}}, 'ProsodyReconLoss': {'enable': True, 'params': {'loss_type': 'mae'}}}
2023-04-18:01:36:09, INFO [voice.py:403] Loss = {'MelReconLoss': {'enable': True, 'params': {'loss_type': 'mae'}}, 'ProsodyReconLoss': {'enable': True, 'params': {'loss_type': 'mae'}}}
2023-04-18 01:36:09,420 - modelscope - INFO - Model = {'KanTtsSAMBERT': {'optimizer': {'params': {'betas': [0.9, 0.98], 'eps': 1e-09, 'lr': 0.001, 'weight_decay': 0.0}, 'type': 'Adam'}, 'params': {'MAS': False, 'NSF': True, 'SE': True, 'decoder_attention_dropout': 0.1, 'decoder_dropout': 0.1, 'decoder_ffn_inner_dim': 1024, 'decoder_num_heads': 8, 'decoder_num_layers': 12, 'decoder_num_units': 128, 'decoder_prenet_units': [256, 256], 'decoder_relu_dropout': 0.1, 'dur_pred_lstm_units': 128, 'dur_pred_prenet_units': [128, 128], 'embedding_dim': 512, 'emotion_units': 32, 'encoder_attention_dropout': 0.1, 'encoder_dropout': 0.1, 'encoder_ffn_inner_dim': 1024, 'encoder_num_heads': 8, 'encoder_num_layers': 8, 'encoder_num_units': 128, 'encoder_projection_units': 32, 'encoder_relu_dropout': 0.1, 'max_len': 800, 'nsf_f0_global_maximum': 730.0, 'nsf_f0_global_minimum': 30.0, 'nsf_norm_type': 'global', 'num_mels': 82, 'outputs_per_step': 3, 'postnet_dropout': 0.1, 'postnet_ffn_inner_dim': 512, 'postnet_filter_size': 41, 'postnet_fsmn_num_layers': 4, 'postnet_lstm_units': 128, 'postnet_num_memory_units': 256, 'postnet_shift': 17, 'predictor_dropout': 0.1, 'predictor_ffn_inner_dim': 256, 'predictor_filter_size': 41, 'predictor_fsmn_num_layers': 3, 'predictor_lstm_units': 128, 'predictor_num_memory_units': 128, 'predictor_shift': 0, 'speaker_units': 512}, 'scheduler': {'params': {'warmup_steps': 4000}, 'type': 'NoamLR'}}}
2023-04-18:01:36:09, INFO [voice.py:403] Model = {'KanTtsSAMBERT': {'optimizer': {'params': {'betas': [0.9, 0.98], 'eps': 1e-09, 'lr': 0.001, 'weight_decay': 0.0}, 'type': 'Adam'}, 'params': {'MAS': False, 'NSF': True, 'SE': True, 'decoder_attention_dropout': 0.1, 'decoder_dropout': 0.1, 'decoder_ffn_inner_dim': 1024, 'decoder_num_heads': 8, 'decoder_num_layers': 12, 'decoder_num_units': 128, 'decoder_prenet_units': [256, 256], 'decoder_relu_dropout': 0.1, 'dur_pred_lstm_units': 128, 'dur_pred_prenet_units': [128, 128], 'embedding_dim': 512, 'emotion_units': 32, 'encoder_attention_dropout': 0.1, 'encoder_dropout': 0.1, 'encoder_ffn_inner_dim': 1024, 'encoder_num_heads': 8, 'encoder_num_layers': 8, 'encoder_num_units': 128, 'encoder_projection_units': 32, 'encoder_relu_dropout': 0.1, 'max_len': 800, 'nsf_f0_global_maximum': 730.0, 'nsf_f0_global_minimum': 30.0, 'nsf_norm_type': 'global', 'num_mels': 82, 'outputs_per_step': 3, 'postnet_dropout': 0.1, 'postnet_ffn_inner_dim': 512, 'postnet_filter_size': 41, 'postnet_fsmn_num_layers': 4, 'postnet_lstm_units': 128, 'postnet_num_memory_units': 256, 'postnet_shift': 17, 'predictor_dropout': 0.1, 'predictor_ffn_inner_dim': 256, 'predictor_filter_size': 41, 'predictor_fsmn_num_layers': 3, 'predictor_lstm_units': 128, 'predictor_num_memory_units': 128, 'predictor_shift': 0, 'speaker_units': 512}, 'scheduler': {'params': {'warmup_steps': 4000}, 'type': 'NoamLR'}}}
2023-04-18 01:36:09,421 - modelscope - INFO - allow_cache = False
2023-04-18:01:36:09, INFO [voice.py:403] allow_cache = False
2023-04-18 01:36:09,421 - modelscope - INFO - batch_size = 4
2023-04-18:01:36:09, INFO [voice.py:403] batch_size = 4
2023-04-18 01:36:09,422 - modelscope - INFO - create_time = 2023-04-18 01:36:09
2023-04-18:01:36:09, INFO [voice.py:403] create_time = 2023-04-18 01:36:09
2023-04-18 01:36:09,422 - modelscope - INFO - eval_interval_steps = 10000000000000000
2023-04-18:01:36:09, INFO [voice.py:403] eval_interval_steps = 10000000000000000
2023-04-18 01:36:09,423 - modelscope - INFO - git_revision_hash = d16755444c9baf23348213211a5ed9035458ecf0
2023-04-18:01:36:09, INFO [voice.py:403] git_revision_hash = d16755444c9baf23348213211a5ed9035458ecf0
2023-04-18 01:36:09,424 - modelscope - INFO - grad_norm = 1.0
2023-04-18:01:36:09, INFO [voice.py:403] grad_norm = 1.0
2023-04-18 01:36:09,424 - modelscope - INFO - linguistic_unit = {'cleaners': 'english_cleaners', 'lfeat_type_list': 'sy,tone,syllable_flag,word_segment,emo_category,speaker_category', 'speaker_list': 'F7'}
2023-04-18:01:36:09, INFO [voice.py:403] linguistic_unit = {'cleaners': 'english_cleaners', 'lfeat_type_list': 'sy,tone,syllable_flag,word_segment,emo_category,speaker_category', 'speaker_list': 'F7'}
2023-04-18 01:36:09,425 - modelscope - INFO - log_interval_steps = 1000
2023-04-18:01:36:09, INFO [voice.py:403] log_interval_steps = 1000
2023-04-18 01:36:09,425 - modelscope - INFO - model_type = sambert
2023-04-18:01:36:09, INFO [voice.py:403] model_type = sambert
2023-04-18 01:36:09,426 - modelscope - INFO - num_save_intermediate_results = 4
2023-04-18:01:36:09, INFO [voice.py:403] num_save_intermediate_results = 4
2023-04-18 01:36:09,426 - modelscope - INFO - num_workers = 1
2023-04-18:01:36:09, INFO [voice.py:403] num_workers = 1
2023-04-18 01:36:09,427 - modelscope - INFO - pin_memory = False
2023-04-18:01:36:09, INFO [voice.py:403] pin_memory = False
2023-04-18 01:36:09,428 - modelscope - INFO - remove_short_samples = False
2023-04-18:01:36:09, INFO [voice.py:403] remove_short_samples = False
2023-04-18 01:36:09,428 - modelscope - INFO - save_interval_steps = 200
2023-04-18:01:36:09, INFO [voice.py:403] save_interval_steps = 200
2023-04-18 01:36:09,429 - modelscope - INFO - train_max_steps = 3180202
2023-04-18:01:36:09, INFO [voice.py:403] train_max_steps = 3180202
2023-04-18 01:36:09,429 - modelscope - INFO - train_steps = 202
2023-04-18:01:36:09, INFO [voice.py:403] train_steps = 202
2023-04-18 01:36:09,430 - modelscope - INFO - log_interval = 10
2023-04-18:01:36:09, INFO [voice.py:403] log_interval = 10
2023-04-18 01:36:09,430 - modelscope - INFO - modelscope_version = 1.4.3
2023-04-18:01:36:09, INFO [voice.py:403] modelscope_version = 1.4.3
2023-04-18:01:36:09, INFO [dataset.py:543] Loading metafile...
100%|██████████| 18/18 [00:00<00:00, 5087.77it/s]
2023-04-18:01:36:09, INFO [dataset.py:543] Loading metafile...
100%|██████████| 2/2 [00:00<00:00, 18978.75it/s]
2023-04-18 01:36:09,449 - modelscope - INFO - The number of training files = 18.
2023-04-18:01:36:09, INFO [voice.py:431] The number of training files = 18.
2023-04-18 01:36:09,450 - modelscope - INFO - The number of validation files = 2.
2023-04-18:01:36:09, INFO [voice.py:432] The number of validation files = 2.
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_644/297683023.py in <module>
     32                         default_args=kwargs)
     33 
---> 34 trainer.train()

/opt/conda/lib/python3.7/site-packages/modelscope/trainers/audio/tts_trainer.py in train(self, *args, **kwargs)
    241         }
    242         self.model.train(self.speaker, dir_dict, self.train_type, config_dict,
--> 243                          ignore_pretrain)
    244 
    245     def evaluate(self, checkpoint_path: str, *args,

/opt/conda/lib/python3.7/site-packages/modelscope/models/audio/tts/sambert_hifi.py in train(self, voice, dirs, train_type, configs_path_dict, ignore_pretrain, create_if_not_exists, hparam)
    241             target_voice.train_sambert(work_dir, am_dir, data_dir,
    242                                        am_config_path, ignore_pretrain,
--> 243                                        hparams)
    244             totaltime = datetime.datetime.now() - totaltime
    245             logger.info('SAMBERT training spent: {:.2f} hours\n'.format(

/opt/conda/lib/python3.7/site-packages/modelscope/models/audio/tts/voice.py in train_sambert(self, work_dir, stage_dir, data_dir, config_path, ignore_pretrain, hparams)
    473         config['Model']['KanTtsSAMBERT']['params'].update(ling_unit_size)
    474         model, optimizer, scheduler = model_builder(config, self.device,
--> 475                                                     self.local_rank,
    476                                                     self.distributed)
    477 

AttributeError: 'Voice' object has no attribute 'local_rank'
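An `AttributeError` of this shape often means the attribute is assigned on only some code paths of the constructor. Whether that is what happens inside `Voice` is not confirmed anywhere in this thread. The sketch below uses a hypothetical `ToyVoice` class (not ModelScope code) to show the pattern, plus a small hypothetical helper, `missing_attributes`, that reports which expected attributes an object lacks before a long-running call is started.

```python
class ToyVoice:
    """Hypothetical stand-in, NOT the real modelscope Voice class."""

    def __init__(self, has_gpu: bool):
        self.distributed = False
        if has_gpu:
            # Assumption for illustration only: the attribute is set
            # on the GPU path and forgotten on the CPU path.
            self.local_rank = 0


def missing_attributes(obj, required):
    """Return the names in `required` that `obj` does not define."""
    return [name for name in required if not hasattr(obj, name)]


cpu_voice = ToyVoice(has_gpu=False)
gpu_voice = ToyVoice(has_gpu=True)
print(missing_attributes(cpu_voice, ["distributed", "local_rank"]))
print(missing_attributes(gpu_voice, ["distributed", "local_rank"]))
```

The same `hasattr` probe can be pointed at a real object in a notebook cell to see which attributes exist before calling into the library.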

1593316062336815 2023-04-18 16:16:08 737 0
3 answers
  • Tested it myself: only model_revision = "v1.0.4" runs through successfully, and kantts must be installed at version 0.0.1.

    kwargs = dict(
        model=pretrained_model_id,   # model to fine-tune
        model_revision="v1.0.4",     # the key setting: it only ran through after changing this to v1.0.4
        work_dir=pretrain_work_dir,  # temporary working directory
        train_dataset=dataset_id,    # dataset id
        train_type=train_info,       # training types and their parameters
    )

    2023-07-02 21:52:53
  • PTTS fine-tuning depends on a GPU; please run it in a GPU environment.

    2023-05-10 17:52:31
  • I found the way to debug: I just noticed that little bug icon...

    2023-04-18 17:12:44
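The replies above name two preconditions: kantts at version 0.0.1 and a GPU environment. A minimal pre-flight sketch that checks both before calling `trainer.train()` might look like the following. It assumes the installed distribution is named `kantts` and that PyTorch is the framework in use; the helper names (`installed_version`, `cuda_available`, `preflight`) are hypothetical and not part of ModelScope.

```python
import importlib.util

try:
    from importlib import metadata  # Python 3.8+
except ImportError:
    # Python 3.7 (as in the traceback above) needs the importlib_metadata backport.
    import importlib_metadata as metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def cuda_available() -> bool:
    """True only if PyTorch is importable and reports a usable CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch
    return torch.cuda.is_available()


def preflight(expected_kantts: str = "0.0.1"):
    """Collect human-readable warnings; an empty list means both checks passed."""
    problems = []
    found = installed_version("kantts")  # assumed distribution name
    if found != expected_kantts:
        problems.append(f"kantts is {found!r}, the reply used {expected_kantts!r}")
    if not cuda_available():
        problems.append("no CUDA device visible to PyTorch")
    return problems


for problem in preflight():
    print("WARNING:", problem)
```

Running this in the first notebook cell surfaces an environment mismatch in seconds, rather than after data preprocessing has finished.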
