• In this presentation, I will cover a few recent results on the limits of the reasoning capabilities of large language models. I will start with an analysis of the hardness of problems such as syllogisms when tackled with the Transformer architecture; I will propose a metric to measure that hardness and an approach that reduces it for LLMs. I will then show that these limitations are not specific to the textual domain and also arise in other domains, such as visual tasks. Finally, I will discuss recent results on better measuring the capabilities of LLMs on math problems.

  • AI is going to bring huge benefits in terms of scientific progress, human wellbeing, economic value, and the possibility of finding solutions to major social and environmental problems. Supported by AI, we will be able to make more grounded decisions and to focus on the main values and goals of a decision process rather than on routine and repetitive tasks. However, such a powerful technology also raises some concerns, related for example to the black-box nature of some AI approaches, the possible discriminatory decisions that AI algorithms may recommend, the spread of misinformation, data privacy/handling issues, and the impact on jobs and societal structures. As AI capabilities advance, from symbolic AI to machine learning to generative AI to agentic AI, new and amplified ethics issues arise and need to be identified and addressed.
    These concerns are among the obstacles that hold AI back or that cause worry for current AI users, adopters, and policy makers. Without answers to these questions, many will not trust AI, and therefore will not fully adopt it or benefit from its positive impact.
    In this talk I will present the main issues around AI ethics and safety, some of the proposed technical and non-technical solutions, as well as practical actions and regulations being defined for AI development, deployment, and use.

  • Long Short-Term Memory (LSTM) networks have withstood the test of time, forming the foundation of many early deep learning breakthroughs—including the first generation of Large Language Models (LLMs). However, the rise of Transformers has since overshadowed LSTMs, establishing Transformers as the dominant architecture for LLMs.

    We revisit the potential of LSTMs and ask: Can LSTMs be scaled up to compete with Transformers?
    We introduce xLSTM, a significantly enhanced version of LSTM featuring exponential gating and a novel matrix memory with covariance-based updates. Our kernel implementations of xLSTM adhere to scaling laws and demonstrate faster training than Transformers. More crucially, xLSTM has linear-time inference in the number of produced tokens, in stark contrast to the quadratic complexity of attention mechanisms, making it highly efficient for deployment.

    A 7-billion-parameter xLSTM model achieves performance comparable to state-of-the-art Transformer models while offering significantly faster inference. We are currently distilling large Transformer models into xLSTM variants with accelerated inference. Additionally, we are constructing xLSTM time-series foundation models, which already outperform leading approaches such as Chronos (Amazon), TimesFM (Google), and Moirai (Salesforce).

    xLSTM is already seeing real-world adoption: companies like Spleenlab and Festo have successfully integrated it into commercial products. 
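
    The complexity contrast above (linear-time recurrent inference vs. quadratic attention) can be illustrated with a toy sketch. This is purely illustrative Python that counts operations; the state update is a placeholder, not the actual xLSTM cell or its kernels.

```python
import numpy as np

def recurrent_decode(tokens, d=8):
    """Toy recurrent decoding: a fixed-size state is updated once per
    token, so total work grows linearly with sequence length (the
    property xLSTM exploits). The update rule is a placeholder, not
    the real exponential-gating / matrix-memory cell."""
    rng = np.random.default_rng(0)
    W = rng.standard_normal((d, d)) * 0.1
    state = np.zeros(d)
    updates = 0
    for x in tokens:
        state = np.tanh(W @ state + x)  # constant cost per token
        updates += 1
    return state, updates  # updates == n  ->  O(n) total

def attention_score_count(tokens):
    """Toy causal-attention bookkeeping: token t must be compared
    against all t previous keys, so total work is quadratic."""
    keys, comparisons = [], 0
    for x in tokens:
        keys.append(x)
        comparisons += len(keys)
    return comparisons  # n*(n+1)/2  ->  O(n^2) total
```

    For 16 tokens the recurrent decoder performs 16 constant-cost updates, while the attention counter already accumulates 136 key comparisons; the gap widens quadratically with context length.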

  • The mirror mechanism is a basic neural mechanism that transforms sensory representations of others’ actions into motor representations of the same actions in the brain of the observer. In the first part of my talk I will describe the functions of the mirror mechanism located in the parieto-frontal network of monkeys and humans. I will show that this mechanism enables one to understand others in an immediate, phenomenological way, without recourse to cognitive inferential processing. In the second part of my talk I will discuss the role of the mirror mechanism in understanding vitality forms and basic emotions. The data on emotions will lead me to the last part of my talk, where I will present stereo-EEG data on action and emotion recognition. I will conclude by discussing the clinical application of the mirror mechanism (action observation therapy, AOT).

  • The focus in AI today is very much on learning from data alone, but one should not learn what one already knows. The challenge therefore is to use the available knowledge to guide and constrain the learning, and to reason with the resulting models in a trustworthy manner. This requires the integration of symbolic AI with machine learning, which is the focus of neurosymbolic AI, often touted as the next wave in AI.
    I will argue that Neurosymbolic AI = Logic + Probability + Neural Networks. This will allow me to specify a high-level recipe for incorporating background knowledge into any neural network approach. The recipe starts from neural networks, interprets them at the symbol level by viewing them as “neural predicates” (or relations), and adds logical knowledge layers on top of them. The glue is provided by a probabilistic interpretation of the logic. Probability is interpreted broadly: it provides the quantitative, differentiable component necessary to connect the symbolic and subsymbolic levels.
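
    As a minimal sketch of this recipe (hypothetical names, loosely in the spirit of systems like DeepProbLog, not an actual implementation): two neural predicates output digit distributions, and a logical rule on top receives a probabilistic, differentiable reading by marginalizing over all proofs of the rule.

```python
import numpy as np

def prob_sum_equals(p_x, p_y, z):
    """Probabilistic logic layer over two 'neural predicates':
    p_x[i] = P(digit(img1) = i),  p_y[j] = P(digit(img2) = j),
    e.g. softmax outputs of an image classifier. The rule
        sum(X, Y, Z) :- digit(X, DX), digit(Y, DY), Z is DX + DY
    is read probabilistically: P(sum = z) marginalizes over every
    pair (i, j) that proves the rule, i.e. every i + j == z.
    The expression is differentiable in p_x and p_y, so the logic
    layer can pass gradients back into the neural networks."""
    return sum(p_x[i] * p_y[j]
               for i in range(len(p_x))
               for j in range(len(p_y))
               if i + j == z)

# Hypothetical classifier outputs: fairly confident '3' and '4'.
p_x = np.full(10, 0.01); p_x[3] = 0.91
p_y = np.full(10, 0.01); p_y[4] = 0.91
```

    Here the rule probability peaks at z = 7, and because it is an ordinary differentiable expression, supervising only the sum trains the underlying digit classifiers.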

  • AI today can pass the Turing test and is in the process of transforming science, technology, society, humans, and beyond. Surprisingly, modern AI is built out of two very simple and old ideas, rebranded as deep learning: neural networks and gradient descent learning. I will describe several applications of AI to problems in biomedicine developed in my laboratory, from the molecular level to the patient level, using omic data, imaging data, clinical data, and beyond.
    Examples include the analysis of circadian rhythms in gene expression data, the identification of polyps in colonoscopies, and the prediction of post-operative outcomes. I will discuss the opportunities and challenges for developing, integrating, and deploying AI in the first AI-driven hospitals of the future and present two frameworks for addressing some of the most pressing societal issues related to AI research.

  • Tensors and tensor networks are powerful mathematical tools that extend the ideas of vectors and matrices to multiway arrays, allowing multidimensional data to be handled and manipulated efficiently. Tensors are the essential building blocks of today’s leading machine learning and deep neural network frameworks; indeed, modern machine learning methods use tensor representations to power the next generation of AI algorithms. Due to their versatility and efficiency in handling large-scale data, they play a crucial role in popular machine learning frameworks such as TensorFlow and PyTorch/TensorLy-Torch, and they are the computational kernels of GPUs and TPUs (Tensor Processing Units), which are specifically designed to parallelize and accelerate matrix/tensor operations. Moreover, tensor networks will be crucial for Artificial General Intelligence (AGI) because they offer a powerful and versatile framework for representing and processing complex, multi-modal, high-dimensional data, which is essential for building multi-intelligence AGI systems that can reason, adapt to new situations, and solve problems across various domains. Their ability to capture intricate correlations and patterns within multi-modal data, along with their potential for efficient computation, makes them a promising direction for AGI research.
    I will critically overview fundamental low-rank tensor decompositions and tensor networks, with special emphasis on tensorization and quantization. I will discuss their applications to compressing deep neural networks, optimizing multimodal attention, and generating lightweight neural networks for specific tasks such as image/video reconstruction, super resolution, time-series forecasting, and brain-computer interfaces.
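
    The storage savings that motivate low-rank decompositions can be seen in a small sketch of the CP (CANDECOMP/PARAFAC) format, one of the fundamental decompositions referred to above. The sizes are hypothetical, chosen only for illustration.

```python
import numpy as np

def cp_reconstruct(A, B, C):
    """Reconstruct a 3rd-order tensor from CP factor matrices
    A (I x R), B (J x R), C (K x R):
        T[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]"""
    return np.einsum('ir,jr,kr->ijk', A, B, C)

# Hypothetical example: a 40 x 50 x 60 tensor of CP rank 2.
rng = np.random.default_rng(0)
I, J, K, R = 40, 50, 60, 2
A = rng.standard_normal((I, R))
B = rng.standard_normal((J, R))
C = rng.standard_normal((K, R))
T = cp_reconstruct(A, B, C)

dense_params = I * J * K      # 120,000 entries stored explicitly
cp_params = (I + J + K) * R   # 300 entries in factor form
```

    Storing the factors instead of the dense tensor cuts storage by a factor of 400 in this toy case; the same principle underlies compressing the weight tensors of deep networks.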

  • Equipping AI models with the ability to imagine, reason, and act in the physical world is a crucial step toward achieving Artificial General Intelligence (AGI). Generative world models, which simulate our world with high fidelity, offer a promising avenue for training and evaluating these embodied AI agents.

    In this talk, I will present recent progress in this field. I will begin by covering the latest advancements in video generation models and how they can be controlled through various actions, marking a key milestone in developing large-scale world models. Next, I will discuss how these video models can be integrated to interact with other modalities, such as audio and text. Finally, I will address the critical limitations of current video-based world models, such as ensuring long-term consistency and interactivity, and propose potential solutions to overcome these challenges.