Tensor.scaled_dot_product_attention()
Computes scaled dot product attention, treating this tensor as the query. Takes a key tensor, a value tensor, and an optional attention mask (for a boolean mask, True marks positions that may be attended to).
Usage
from tinygrad.tensor import Tensor
q = Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])  # query (also used as the key here)
v = Tensor([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]])  # value
mask = Tensor([[False, True, True], [False, False, True], [False, False, False]])  # boolean mask: True = attend
out = q.scaled_dot_product_attention(q, v, mask)
print(out.numpy())
Return value
[[0.6999908 0.79999083 0.8999908 ]
[0.7 0.8 0.9 ]
[0.7 0.8 0.9 ]]
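Conceptually, the operation computes softmax(QKᵀ/√d + M)V, where M is derived from the mask (0 where attending is allowed, -inf where it is blocked). A minimal NumPy sketch of this computation, independent of tinygrad (the function name `sdpa` and its exact mask handling are illustrative, not tinygrad's implementation):

```python
import numpy as np

def sdpa(query, key, value, attn_mask=None):
    # attention scores, scaled by the square root of the key dimension
    scores = query @ key.T / np.sqrt(key.shape[-1])
    if attn_mask is not None:
        # boolean mask: positions with False are excluded (score set to -inf)
        scores = np.where(attn_mask, scores, -np.inf)
    # numerically stable softmax over the last axis
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # weighted sum of the value rows
    return weights @ value

# with identical query rows, every position attends uniformly,
# so each output row is the mean of the value rows
q = np.zeros((2, 2))
v = np.array([[0., 2.], [4., 6.]])
print(sdpa(q, q, v))  # each row is [2., 4.]
```

Note that a fully masked row (all False, like the last row of the mask above) leaves no position to attend to, so this naive version would produce NaN there.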