Recent theoretical and technical progress in artificial neural networks has significantly expanded the range of tasks that can be solved by machine intelligence. In particular, the advent of powerful parallel computing architectures, coupled with the availability of "big data", has made it possible to train large-scale, multi-layer neural networks known as deep learning systems. Further breakthroughs have been enabled by advances in neural network architectures, most notably the introduction of Transformers and diffusion models. These powerful systems achieve human-like (or even super-human) performance on challenging tasks involving natural language understanding and image generation. In this seminar I will briefly review the foundations of modern deep learning systems, focusing in particular on generative AI models.
Speaker(s): Alberto Testolin
Room: Gray Hall, Bldg: D, Faculty of Electrical Engineering and Computing (FER), University of Zagreb, Zagreb, Grad Zagreb, Croatia, 10000