NLP Study Notes

Getting Started with PyTorch

Basic PyTorch Operations

Tensors

from __future__ import print_function
import torch

# Create an uninitialized matrix
x = torch.empty(5, 3)
print(x)

# Create a randomly initialized matrix
x = torch.rand(5, 3)
print(x)

# Both print a 5x3 tensor of the form tensor([[n, n, n], ...])

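Note that the 5x3 matrix created by torch.empty is not filled with zeros. empty only allocates memory and does not initialize it, so the tensor contains whatever values were already in that memory. torch.rand, by contrast, fills the tensor with random values drawn from the uniform distribution on [0, 1). For samples from the standard normal (Gaussian) distribution, use torch.randn.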
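
The difference between the two random initializers is easy to check. The following is a minimal sketch; the sample size of 10000 and the fixed seed are arbitrary choices made here for illustration.

```python
import torch

torch.manual_seed(0)  # fixed seed so the run is repeatable

# torch.rand samples from the uniform distribution on [0, 1)
u = torch.rand(10000)
# torch.randn samples from the standard normal distribution N(0, 1)
n = torch.randn(10000)

# Uniform samples never leave [0, 1)
print(u.min().item() >= 0.0, u.max().item() < 1.0)
# Normal samples include negative values
print(n.min().item() < 0.0)
# Sample means: close to 0.5 for rand, close to 0.0 for randn
print(u.mean().item(), n.mean().item())
```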

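# Create an all-zero matrix with element type long
x = torch.zeros(5, 3, dtype=torch.long)

# Create a tensor directly from data
x = torch.tensor([2.5, 3.5])

# Create new tensors based on an existing one
x = x.new_ones(5, 3, dtype=torch.double)  # the new_* methods inherit dtype and device from x unless overridden
y = torch.randn_like(x, dtype=torch.float)  # same shape as x, filled with standard normal samples

# size() returns the shape of the tensor
print(x.size())
# Prints torch.Size([5, 3]); torch.Size is a subclass of tuple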
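
Because torch.Size behaves like a tuple, it can be unpacked and compared in the usual way. A small sketch (the variable names rows and cols are only illustrative):

```python
import torch

x = torch.zeros(5, 3)
size = x.size()

# torch.Size supports all tuple operations, including unpacking
rows, cols = size
print(isinstance(size, tuple))  # True
print(rows, cols)               # 5 3

# .shape carries the same information as .size()
print(x.shape == size)          # True
# size(dim) returns the length of a single dimension
print(x.size(0))                # 5
```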

Basic Arithmetic Operations

Addition

x = torch.rand(5,3)
y = torch.rand(5,3)
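# Two equivalent ways to add
print(x + y)
print(torch.add(x, y))

# Store the result of the addition in a preallocated tensor
result = torch.empty(5, 3)
torch.add(x, y, out=result)
print(result)

# In-place addition
y.add_(x)  # equivalent to y += x
print(y)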
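
Methods whose names end in an underscore, such as add_, modify the tensor they are called on, while the version without the underscore returns a new tensor. A minimal sketch of the difference:

```python
import torch

x = torch.ones(2, 2)
y = torch.ones(2, 2)

# Out-of-place: returns a new tensor, y itself is unchanged
z = y.add(x)
print(y)  # still all ones
print(z)  # all twos

# In-place: y is overwritten with y + x
same = y.add_(x)
print(y)  # now all twos
# The returned tensor uses the same storage as y
print(same.data_ptr() == y.data_ptr())
```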

Selecting Rows and Columns

print(x[:, 1])   # all rows, column at index 1 (the second column)
print(x[:, :2])  # all rows, first two columns
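
Indexing is zero-based, as in NumPy. Using a tensor with known values instead of random ones makes it easier to see which elements each slice selects. This is only a sketch with an arbitrary 5x3 example:

```python
import torch

# 5x3 tensor holding 0..14, filled row by row
x = torch.arange(15).view(5, 3)

# Column at index 1 is the second column
print(x[:, 1])         # tensor([ 1,  4,  7, 10, 13])
# First two columns of every row
print(x[:, :2].shape)  # torch.Size([5, 2])
# Row at index 1 is the second row
print(x[1, :])         # tensor([3, 4, 5])
```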

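Reshaping Tensors

x = torch.randn(4, 4)
# view() requires the total number of elements to stay the same
y = x.view(16)
# -1 tells view() to infer that dimension from the others
z = x.view(-1, 8)  # here -1 is inferred as 2

# When a tensor holds exactly one element, item() extracts it as a Python number
x = torch.randn(1)
print(x.item())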
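
One detail worth checking is that view() does not copy data: the result shares storage with the original tensor. The sketch below also shows what happens when the requested shape does not match the element count; the shape 5 is just an arbitrary invalid example.

```python
import torch

x = torch.zeros(4, 4)
y = x.view(16)

# Writing through the view changes the original tensor as well
y[0] = 1.0
print(x[0, 0].item())  # 1.0

# -1 is inferred from the remaining dimensions: 16 / 8 = 2
z = x.view(-1, 8)
print(z.shape)         # torch.Size([2, 8])

# A shape whose element count differs from 16 is rejected
failed = False
try:
    x.view(5)
except RuntimeError as err:
    failed = True
    print("view failed:", err)
```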

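Converting Between Torch Tensors and NumPy Arrays

A CPU tensor and the NumPy array converted from it share the same underlying memory, so changing one also changes the other.

# Tensor to NumPy array: call numpy()
a = torch.ones(5)
b = a.numpy()

# Changing one changes the other
a.add_(1)
print(a)
print(b)

# NumPy array to Torch tensor
import numpy as np
a = np.ones(5)
b = torch.from_numpy(a)
np.add(a, 1, out=a)
print(a)
print(b)

All tensors on the CPU, except CharTensor, support conversion to and from NumPy arrays.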
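
Memory sharing applies to torch.from_numpy but not to torch.tensor, which copies its input. A minimal sketch contrasting the two (the names shared and copied are only illustrative):

```python
import numpy as np
import torch

a = np.ones(3)

shared = torch.from_numpy(a)  # shares memory with a
copied = torch.tensor(a)      # makes an independent copy

# Modify the NumPy array in place
a += 1

print(shared)  # follows the change: all twos
print(copied)  # unaffected: all ones
# np.ones defaults to float64, and from_numpy keeps that dtype
print(shared.dtype)
```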

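# CUDA tensors: a tensor can be moved to any device with the to() method.
if torch.cuda.is_available():
    # Define a device object; "cuda" refers to the GPU
    device = torch.device("cuda")
    # Create a tensor directly on the GPU
    y = torch.ones_like(x, device=device)
    # Move the CPU tensor to the GPU
    x = x.to(device)
    # Operands must be on the same device to be combined
    z = x + y  # z is created on the GPU automatically
    print(z)
    # Move z back to the CPU and change its element type at the same time
    print(z.to("cpu", torch.double))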
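
A common way to write code that also runs on machines without a GPU is to select the device once and fall back to the CPU. This is a sketch of that pattern, not the only way to do it:

```python
import torch

# Use the GPU when one is available, otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.randn(2, 3)
y = torch.ones_like(x, device=device)  # created directly on the chosen device
x = x.to(device)                       # no-op if x is already there

z = x + y
print(z.device)

# Bring the result back to the CPU as float64
z_cpu = z.to("cpu", torch.double)
print(z_cpu.dtype)
```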

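Autograd in PyTorch

The autograd package (automatic differentiation) is at the core of every neural network in PyTorch. It provides differentiation for all operations on Tensors.
torch.Tensor is the central class of the package. If its attribute requires_grad is set to True, autograd tracks every operation applied to that tensor. When backpropagation is needed, calling backward() computes all gradients automatically, and the gradients for this tensor are accumulated into its grad attribute.
Calling detach() on a tensor cuts it off from the computation graph so that its history is no longer tracked. Alternatively, wrapping code in a with torch.no_grad(): block disables gradient tracking for that block.
torch.autograd.Function is as important as the Tensor class. Every Tensor has a grad_fn attribute that references the Function that created it.
- If a tensor was created directly by the user rather than by an operation, its grad_fn is None.
Operations on Tensors: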
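
The accumulation into grad is easy to overlook: a second backward() call adds to the existing gradient instead of replacing it. A minimal sketch using the scalar function y = x * x, chosen here only for illustration:

```python
import torch

x = torch.tensor(2.0, requires_grad=True)

# First pass: dy/dx = 2x = 4
y = x * x
y.backward()
first = x.grad.item()
print(first)   # 4.0

# Second pass on a fresh graph: the new gradient is added to the old one
y2 = x * x
y2.backward()
second = x.grad.item()
print(second)  # 8.0

# Reset the accumulated gradient before the next pass
x.grad.zero_()
print(x.grad.item())  # 0.0
```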

import torch
x1 = torch.ones(3,3)
print(x1)
x = torch.ones(2,2,requires_grad=True)
print(x)

Output:

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
tensor([[1., 1.],
        [1., 1.]], requires_grad=True)

# Apply an operation to a tensor that has requires_grad=True
y = x + 2
print(y)

tensor([[3., 3.],
        [3., 3.]], grad_fn=<AddBackward0>)



print(x.grad_fn)
print(y.grad_fn)

None
<AddBackward0 object at 0x000001CDF6C18848>

# Apply more complex operations
z = y*y*3
out = z.mean()
print(z)
print(out)


tensor([[27., 27.],
        [27., 27.]], grad_fn=<MulBackward0>)
tensor(27., grad_fn=<MeanBackward0>)
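
To complete the picture, here is a sketch of backpropagating from out to x. Since out is the mean of 3 * (x + 2)^2 over four elements, each partial derivative is (1/4) * 6 * (x + 2), which equals 4.5 at x = 1.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = x + 2
z = y * y * 3
out = z.mean()

# out is a scalar, so backward() needs no arguments
out.backward()

# d(out)/dx: every entry is 4.5
print(x.grad)
```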

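requires_grad_() changes a Tensor's requires_grad attribute in place; if the flag is never set explicitly, it defaults to False.
Automatic differentiation can be turned on by setting requires_grad=True, and it can be turned off for a region of code by wrapping that region in a with torch.no_grad(): block.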

print(x.requires_grad)
print((x**2).requires_grad)

with torch.no_grad():
    print((x**2).requires_grad)
    
True
True
False
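
The in-place method requires_grad_() can be sketched in the same way. The tensor a below is a made-up example; it starts out untracked because the flag defaults to False.

```python
import torch

a = torch.ones(2, 2)
before = a.requires_grad
print(before)  # False, the default

# Switch tracking on in place
a.requires_grad_(True)
print(a.requires_grad)  # True

# Results of operations on a now record the function that produced them
b = (a * a).sum()
print(b.grad_fn is not None)  # True

# Inside no_grad, results are not tracked even though a is
with torch.no_grad():
    c = a * 2
print(c.requires_grad)  # False
```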

# detach() also returns a tensor with the same contents that is excluded from automatic differentiation.
print(x.requires_grad)
y = x.detach()
print(y.requires_grad)
print(x.eq(y).all())

True
False
tensor(True)
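
Note that detach() does not copy the data. The detached tensor shares storage with the original, which the following sketch checks by comparing data pointers; clone() is shown for the case where an independent copy is wanted.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)

# Same storage, no gradient tracking
d = x.detach()
print(d.requires_grad)                # False
print(d.data_ptr() == x.data_ptr())   # True

# detach() followed by clone() gives an independent, untracked copy
c = x.detach().clone()
print(c.data_ptr() == x.data_ptr())   # False
print(torch.equal(c, d))              # True, the values are identical
```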