二狗子 · Published on 2024-12-02

Python's augmented assignment operators unexpectedly break gradient computation

I once ran into a class of baffling errors: the reported error location made no sense, and following the error's own hint did not help either. The code looked like this:

fine_tunning_label_text_features = fine_tunning_model.encode_text(label_text)
# normalize the text features -- note the in-place /= on the encode_text output
fine_tunning_label_text_features /= fine_tunning_label_text_features.norm(
    dim=-1, keepdim=True)
fine_tunning_output_label = (fine_tunning_image_features @
    fine_tunning_label_text_features.T).softmax(dim=-1)

It failed with:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.HalfTensor [6, 1024]], which is output 0 of MmBackward, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True). 
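For reference, the hint is asking for a single switch at the top of the script; with it enabled, autograd records a traceback for every operation and tries to point at the one whose saved tensor was modified:

import torch

# Record a creation traceback for every autograd op; the eventual error
# message will then identify the offending operation.
torch.autograd.set_detect_anomaly(True)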

Following the hint and enabling detect_anomaly did not fix it either. I searched for a long time, and many answers abroad (e.g. on StackOverflow) just tell you to set ReLU's inplace argument to False. But the real culprit is Python's augmented assignment operator (/=), and the same reasoning extends to the other augmented assignments such as -=, += and *=. For plain Python numbers these operators simply compute the result of both sides and rebind it to the left-hand name, but a PyTorch tensor overloads them as in-place operations (/= dispatches to Tensor.div_, *= to Tensor.mul_, and so on), mutating the tensor's storage directly. That is where the problem appears:

a *= b     # in-place: mutates a's storage -> breaks autograd
a = a * b  # out-of-place: builds a new tensor -> fine
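Here is a minimal, self-contained repro of the failure (the tensors w, t and n are made up for illustration; norm is just one example of an op whose backward needs its input):

import torch

w = torch.randn(3, requires_grad=True)
t = w * 2           # non-leaf tensor, analogous to the encode_text output
n = t.norm()        # norm's backward needs t, so autograd saves t here
t /= n              # in-place division bumps t's version counter to 1
t.sum().backward()  # RuntimeError: ... modified by an inplace operation

Note that the error only fires at backward(), far from the offending /= line, which is exactly why the reported location looks so confusing. Replacing the in-place line with t = t / n creates a fresh tensor, leaves the saved t at version 0, and backward() succeeds.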

If a tensor's value will later be needed by autograd to compute gradients, it must not be changed in place (an inplace operation). So the fix is simply to rewrite *= (or /=, +=, -=) in the more explicit out-of-place form.
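Applied to the snippet at the top, that means:

fine_tunning_label_text_features = fine_tunning_model.encode_text(label_text)
# out-of-place division: a new tensor is created, so the encode_text output
# that autograd saved for backward stays at version 0
fine_tunning_label_text_features = (fine_tunning_label_text_features /
    fine_tunning_label_text_features.norm(dim=-1, keepdim=True))
fine_tunning_output_label = (fine_tunning_image_features @
    fine_tunning_label_text_features.T).softmax(dim=-1)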

