Deep Learning Reading Group

Steve Howard Room, Room 5206, Level 5, Melbourne Connect

More Information

mcds@unimelb.edu.au

  • Reading Group

This is the ninth session of 2024. We'll be discussing:

(Theory) Linear attention is (maybe) all you need (to understand transformer optimisation) (https://arxiv.org/abs/2310.01082):

  • From a theoretical point of view, transformers have several unique properties that distinguish them from other neural network architectures. This paper outlines many of these and shows that they can be replicated and studied using a (very) basic transformer model.
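As a rough illustration of the kind of simplified model the paper works with, the sketch below implements a single-head linear-attention layer, i.e. standard attention with the softmax removed. This is a minimal, assumed formulation for illustration only; the function and weight names are placeholders, not the paper's code.

```python
import numpy as np

def linear_attention(X, W_q, W_k, W_v):
    """Single-head linear attention: standard attention with the
    softmax dropped, so the output is (X W_q)(X W_k)^T (X W_v).

    X: (n, d) sequence of n token embeddings of dimension d.
    W_q, W_k, W_v: (d, d) projection matrices (illustrative names).
    """
    Q = X @ W_q   # queries, shape (n, d)
    K = X @ W_k   # keys, shape (n, d)
    V = X @ W_v   # values, shape (n, d)
    # No softmax: attention scores Q K^T are used directly.
    return (Q @ K.T) @ V   # output, shape (n, d)

# Example: 4 tokens with embedding dimension 3.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))
W_q, W_k, W_v = (rng.standard_normal((3, 3)) for _ in range(3))
out = linear_attention(X, W_q, W_k, W_v)
```

Dropping the softmax makes every operation a matrix product, which is what makes the model's training dynamics analytically tractable while retaining the query–key–value structure of a transformer layer.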

Material is to be read before each session so that it can be discussed in an open format. Staff and students from all backgrounds who are interested in these topics are welcome. If you are interested but haven't been to a session yet, come along: there is no need to have participated in earlier sessions.