A key example of the results achieved with (much larger and more complex forms of) transformers is the change from AlphaFold (1), which relied primarily on LSTM, to AlphaFold2, which is primarily based on transformers. This change pushed results in the CASP-14 protein folding competition to a level of accuracy that made protein structure prediction usable for practical purposes. It is a major scientific breakthrough, the impact of which can hardly be overstated.
The role of the attention mechanism here is key. The 3D structure of a protein can be such that it “folds back onto itself”. This means that the amino acids that constitute the protein can be far apart in the sequence but nevertheless in close spatial proximity, and hence interact with each other. LSTM has a limited ability to model this, whereas the attention algorithm places no limit on how far apart in the sequence elements can be in order to interact with each other.
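To make the point about distance concrete, here is a minimal sketch of scaled dot-product self-attention in base R. It is not the code from the package: the function names `softmax_rows` and `self_attention` and the toy data are made up for illustration, and positional encodings are deliberately left out. The toy sequence has 50 positions, of which only the first and the last carry a matching signal.

```r
# Row-wise softmax: turns each row of scores into weights that sum to 1.
softmax_rows <- function(m) {
  e <- exp(m - apply(m, 1, max))  # subtract each row's max for numerical stability
  e / rowSums(e)
}

# Scaled dot-product self-attention (illustrative sketch, no learned projections).
self_attention <- function(Q, K, V) {
  scores  <- Q %*% t(K) / sqrt(ncol(K))  # similarity of every position with every other
  weights <- softmax_rows(scores)        # one attention distribution per position
  list(weights = weights, output = weights %*% V)
}

# Hypothetical toy sequence: 50 positions, embedding dimension 4.
# Only positions 1 and 50 carry a (matching) signal; all others are zero.
n <- 50
d <- 4
X <- matrix(0, nrow = n, ncol = d)
X[1, ] <- c(3, 0, 0, 0)
X[n, ] <- c(3, 0, 0, 0)

res <- self_attention(Q = X, K = X, V = X)

# Attention that position 1 pays to itself, to its neighbour, and to position 50.
print(res$weights[1, c(1, 2, n)])
```

In this sketch, position 1 attends to position 50 exactly as strongly as it attends to itself, and more strongly than to its direct neighbour. Nothing in the computation depends on how many positions lie in between, which is the property the paragraph above describes.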
The attention package can be installed from within R by running install.packages("attention").
It has no dependencies, does not require Java, and comes with no other complications. It is written purely in R, so it should install without issue on any R version.
Following installation, you can load the package using library(attention).
The package contains two vignettes: simple_attention and complete_attention. Both vignettes implement the attention algorithm identically. However, simple_attention uses a number of helper functions included in the attention package in order to present the algorithm in an accessible form. The complete_attention vignette does not use any helper functions and simply uses base R.
The suggested way to work through this is to start with simple_attention, which you can load using vignette("simple_attention").
After having worked through the vignette, you can then dive a bit deeper into the same example with complete_attention.
Development takes place on GitHub:
You can also file any bug reports there:
The code is based to a large extent on last week’s post: Self-Attention from Scratch in R.