Poster session

Poster Presentation Schedule

March 3

1. Deep Clustering Approach via Split Federated Learning for EEG Analysis (Rikuto Kotoge)

Abstract:

While end-to-end multi-channel electroencephalogram (EEG) learning approaches have shown significant promise, their applicability is often constrained in neurological diagnostics where channel availability is limited, such as with intracranial EEG. When provided with a single-channel EEG, how can we learn representations that are robust across channels and scalable across varied tasks, such as seizure prediction? In this paper, we present SplitSEE, a structurally splittable framework designed for effective temporal-frequency representation learning in single-channel EEG. The key concept of SplitSEE is a self-supervised framework incorporating a deep clustering task. Given an EEG, we argue that the time and frequency domains are two distinct perspectives, and hence, learned representations should share the same cluster assignment. To this end, we first propose two domain-specific modules that independently learn domain-specific representations and address the temporal-frequency tradeoff issue in conventional spectrogram-based methods. Then, we introduce a novel clustering loss to measure information similarity. This encourages representations from both domains to coherently describe the same input by assigning them a consistent cluster. SplitSEE leverages a pre-training-to-fine-tuning framework within a splittable architecture and has the following properties: (a) Effectiveness: it learns representations solely from single-channel EEG yet outperforms multi-channel baselines. (b) Robustness: it shows the capacity to adapt across different channels with low performance variance. Superior performance is also achieved on our collected clinical dataset. (c) Scalability: with only a single fine-tuning epoch, SplitSEE achieves high and stable performance by leveraging partial model layers. Moreover, a primary feature of SplitSEE is its Split Federated Learning configuration, where only one layer is deployed locally in a federated manner, underscoring its potential applicability in real-world clinical settings.
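
To make the clustering idea concrete, here is a minimal sketch (not the authors' implementation) of a cross-domain clustering-consistency loss, assuming each EEG segment is encoded by a time-domain and a frequency-domain module and assigned softly to a set of learnable prototypes:

```python
import torch
import torch.nn.functional as F

# Hypothetical sketch: encourage time- and frequency-domain representations of the
# same EEG segment to share a cluster assignment (all names are illustrative).
def clustering_consistency_loss(z_time, z_freq, prototypes, temperature=0.1):
    """z_time, z_freq: (batch, dim) embeddings; prototypes: (num_clusters, dim)."""
    # Soft cluster assignments for each domain.
    p_time = F.softmax(z_time @ prototypes.T / temperature, dim=1)
    p_freq = F.softmax(z_freq @ prototypes.T / temperature, dim=1)
    # Symmetric cross-entropy: each domain should predict the other's assignment.
    loss = -0.5 * ((p_time.detach() * torch.log(p_freq + 1e-8)).sum(1).mean()
                   + (p_freq.detach() * torch.log(p_time + 1e-8)).sum(1).mean())
    return loss
```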

2. Backdoor In Federated Learning (Suchismita Moharana)

Abstract:

Federated Learning (FL) is promising for data privacy, but it is vulnerable to many backdoor attacks. These attacks seek to modify the global model so that, when it receives a triggering input, it acts maliciously in a specific way. A backdoor is a covert way of bypassing authentication or security measures in a computer system, network, or software application. Malicious actors can inject a backdoor into the model after it is shared with the participating (edge) devices, leading to a compromised global model with backdoor behavior. Misusing the clients' resources raises severe privacy and security concerns in FL. For example, a backdoored model can be activated by the input “the pasta from Italy are”, to which it will always respond “Delicious”. Backdoors can be of various types depending on the goal of the attack. Based on the level of control over the model’s output, backdoor attacks can be categorized into two main types: targeted and untargeted. Specific output manipulation and misclassification are examples of targeted and untargeted backdoor attacks, respectively. Based on the poisoning mechanism, backdoor attacks can be categorized into data poisoning and direct (model) poisoning: label flipping, feature modification, and data injection fall under data poisoning, while parameter modification and weight manipulation are direct poisoning attacks. Backdoor attacks pose significant threats to the security and reliability of deep learning models. They can lead to a range of negative consequences, including misclassification of inputs, system failures, security breaches, and erosion of trust in AI systems. These attacks can compromise the integrity of AI-powered applications, from autonomous vehicles to medical diagnosis, potentially leading to harmful and even life-threatening outcomes. Many defenses aim to identify and mitigate the impact of backdoor attacks on deep learning models, for example 3DFed, FoolsGold, DeepSight, FLAIR, and CONTRA. These defenses often involve techniques such as deep model inspection, reputation scoring, data cleaning, and adversarial training. Data cleaning focuses on identifying and removing poisoned data points from the training dataset. Adversarial training exposes the model to a variety of adversarial examples during training, making it more robust to attacks. Model inspection techniques analyze the model’s internal representations to detect anomalies and potential backdoors. By employing these strategies, the security and reliability of deep learning systems can be enhanced. This is a very important topic to understand, identify, and defend against, and I am currently working in this domain for my PhD research. In the poster presentation, I will give an overview of backdoor attacks, methodology, existing solutions, and future work. I will be very happy to join the workshop and explore various topics in ML.
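
As a concrete illustration of the data-poisoning category, the sketch below shows a hypothetical label-flipping attack that a malicious FL client could apply to its local data; the trigger pattern, poison rate, and target class are illustrative choices, not taken from any specific paper:

```python
import numpy as np

# Hypothetical sketch of a label-flipping poisoning attack by a malicious client:
# training images containing a small trigger patch are relabeled to a target class.
def poison_client_data(images, labels, target_class=0, poison_rate=0.1, seed=0):
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(poison_rate * len(images))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    # Stamp a 3x3 bright trigger in the corner and flip the label.
    images[idx, :3, :3] = images.max()
    labels[idx] = target_class
    return images, labels
```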

3. On The Expressivity of Graph Neural Differential Equations (Jyotirmaya Shivottam)

Abstract:

Graph Neural Differential Equations (GNDEs) advance Graph Neural Networks (GNNs) by modeling node and edge interactions through continuous dynamics, yielding strong results in both static and dynamic graph tasks such as biochemical modeling and traffic flow prediction. By incorporating differential equations, such as diffusion, GNDEs enhance adaptability for specific graph tasks and demonstrate resilience against over-smoothing, a limitation of traditional Message-Passing GNNs (MPNNs). Our study aims to close theoretical gaps in understanding GNDE expressivity by examining their alignment with the Weisfeiler-Lehman (WL) hierarchy used for MPNNs. Key research questions focus on GNDEs’ potential expressivity advantages, mechanisms for over-smoothing robustness, and optimal design strategies for maximally expressive GNDEs, providing insights to improve their theoretical foundation and practical applications.
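
For intuition, the following sketch integrates the simplest diffusion-type graph dynamics, dX/dt = -LX with L the symmetrically normalized Laplacian, by forward Euler; it mirrors the kind of continuous dynamics GNDEs build on rather than any specific published architecture:

```python
import numpy as np

# Minimal sketch of a graph diffusion ODE, dX/dt = -L X, integrated with forward
# Euler; L is the symmetrically normalized graph Laplacian.
def diffuse(adj, features, t=1.0, steps=50):
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(deg, 1e-12)))
    laplacian = np.eye(adj.shape[0]) - d_inv_sqrt @ adj @ d_inv_sqrt
    x, dt = features.astype(float), t / steps
    for _ in range(steps):
        x = x - dt * laplacian @ x
    return x
```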

4. Investigating privacy and expressivity in graph representation learning (Patrick Indri)

Abstract:

We investigate the trade-off between expressivity and privacy in the context of graph representation learning (GRL). In GRL, the expressive power of an algorithm is commonly defined as the capability to produce different embeddings for non-isomorphic graphs. Expressive GRL algorithms, such as graph neural networks (GNNs), have proved successful in many graph classification tasks. However, GNNs have been shown to leak information about their training data, leading to privacy concerns. Privacy-preserving techniques such as differential privacy offer formal guarantees that protect the training data, usually at the cost of classification performance. As more expressive GNNs may overfit the training data, they present increased privacy risks. On the other hand, as more private GNNs generally perform worse than their non-private counterparts, they may suffer from decreased expressivity. In this ongoing work we investigate to what degree privacy and expressivity can be reconciled and discuss GRL algorithms with quantifiable expressive power as well as provable privacy guarantees. Remark: This is joint work with Tamara Drucks, who also applied to present at the workshop.

5. Directional Evolution Strategies (Eiki Shimizu)

Abstract:

Evolution Strategies (ES) are a flexible class of optimization algorithms designed for settings where we do not have access to gradients of the objective function. These ES methods estimate gradients through random sampling of perturbations in parameter space, which has the added benefit of smoothing the loss surface, thereby reducing the effects of local optima. Most existing ES methods rely on the Gaussian distribution for sampling, but it is well known that in high-dimensional spaces, Gaussian samples tend to concentrate on a spherical surface. We find that this concentration of the samples can reduce the effectiveness of smoothing, thereby hindering optimization performance. To overcome the aforementioned reduction in smoothing with Gaussian distributions, we propose directional sampling. Our method avoids the concentration of samples into small regions in high-dimensional settings, and improves the smoothing performance. We demonstrate the effectiveness of our approach through benchmark settings involving neural networks and reinforcement learning.
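
For reference, the sketch below implements the standard Gaussian-smoothing ES gradient estimator (with antithetic sampling) that the abstract describes as the common baseline; the proposed directional sampling is not reproduced here:

```python
import numpy as np

# Sketch of the standard Gaussian-smoothing ES gradient estimator: gradients of a
# black-box objective f are estimated from random perturbations (antithetic
# sampling), with no access to df/dx required.
def es_gradient(f, x, sigma=0.1, num_samples=64, rng=None):
    rng = rng or np.random.default_rng(0)
    eps = rng.standard_normal((num_samples, x.size))
    grad = np.zeros_like(x, dtype=float)
    for e in eps:
        grad += (f(x + sigma * e) - f(x - sigma * e)) * e
    return grad / (2 * sigma * num_samples)
```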

6. Theoretical Analysis of Generalization Error over the Global Minima of Training Loss (Naoki Yoshida)

Abstract:

In this study, we theoretically analyze the generalization error over the set of global minima of arbitrary machine learning models. Traditional statistical theories indicate that upper bounds on generalization error increase with the number of model parameters; however, this theoretical result appears inconsistent with the empirical observation that highly parameterized models, such as deep learning models, often generalize well. We try to solve this problem by limiting our analysis of generalization error to the set of global minima of the training loss, focusing on the characteristic property of deep learning models that often achieve zero training loss. As a result, we demonstrate that, given a sufficiently large number of data points, the generalization error over the global minima is equal to zero almost everywhere with probability one. Our proof method utilizes a ring-theoretic dimensional analysis based on algebraic geometry.

7. Enriching Disentanglement: From Logical Definitions to Quantitative Metrics (Yivan Zhang)

Abstract:

Disentangling the explanatory factors in complex data is a promising approach for generalizable and data-efficient representation learning. While a variety of quantitative metrics for learning and evaluating disentangled representations have been proposed, it remains unclear what properties these metrics truly quantify. In this work, we establish algebraic relationships between logical definitions and quantitative metrics to derive theoretically grounded disentanglement metrics. Concretely, we introduce a compositional approach for converting a higher-order predicate into a real-valued quantity by replacing (i) equality with a strict premetric, (ii) the Heyting algebra of binary truth values with a quantale of continuous values, and (iii) quantifiers with aggregators. The metrics induced by logical definitions have strong theoretical guarantees, and some of them are easily differentiable and can be used as learning objectives directly. Finally, we empirically demonstrate the effectiveness of the proposed metrics by isolating different aspects of disentangled representations.
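
As an illustrative (not the paper's) instance of this recipe, the statement "if two inputs have equal factor values then their codes are equal" can be turned into a score by replacing equality with an absolute-difference premetric, implication with truncated subtraction (the residuation of the Lawvere quantale), and the universal quantifier with a mean aggregator:

```python
import numpy as np

# Illustrative instance of the predicate-to-metric recipe: "for all pairs (i, j),
# equal factor values imply equal code values" becomes a real-valued penalty by
# replacing equality with |.|, implication with truncated subtraction, and the
# universal quantifier with a mean aggregator. A sketch, not the paper's metrics.
def consistency_score(factors, codes):
    n = len(factors)
    penalties = []
    for i in range(n):
        for j in range(n):
            d_factor = abs(factors[i] - factors[j])
            d_code = abs(codes[i] - codes[j])
            # Quantale implication: codes should not differ more than factors do.
            penalties.append(max(d_code - d_factor, 0.0))
    return float(np.mean(penalties))
```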

8. Polyak Meets Parameter-free Clipped Gradient Descent (Yuki Takezawa)

Abstract:

Gradient descent and its variants are the de facto standard algorithms for training machine learning models. As gradient descent is sensitive to its hyperparameters, we need to tune them carefully using a grid search, which is time-consuming, especially when multiple hyperparameters exist. Recently, parameter-free methods that adjust the hyperparameters on the fly have been studied. However, existing work has only studied parameter-free methods for the stepsize, and parameter-free methods for other hyperparameters have not been explored. For instance, the gradient clipping threshold is, in addition to the stepsize, a crucial hyperparameter for preventing gradient explosion, but no existing study has investigated parameter-free methods for clipped gradient descent. In this work, we study parameter-free methods for clipped gradient descent. Specifically, we propose Inexact Polyak Stepsize, which converges to the optimal solution without any hyperparameter tuning, and whose convergence rate is asymptotically independent of L under the L-smooth and (L0, L1)-smooth assumptions on the loss function, matching that of clipped gradient descent with well-tuned hyperparameters. We numerically validated our convergence results on a synthetic function and demonstrated the effectiveness of our proposed methods using LSTM, Nano-GPT, and T5.
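
A minimal sketch of the two ingredients discussed above, the classical Polyak stepsize combined with a clipping threshold, is shown below; unlike the proposed Inexact Polyak Stepsize, this sketch assumes the optimal value f_star is known:

```python
import numpy as np

# Sketch combining the classical Polyak stepsize with gradient clipping, the two
# ingredients the abstract builds on. The paper's Inexact Polyak Stepsize removes
# the need to know f_star; here f_star is assumed known for simplicity.
def clipped_polyak_gd(f, grad_f, x, f_star=0.0, clip=1.0, iters=100):
    for _ in range(iters):
        g = grad_f(x)
        g_norm = np.linalg.norm(g)
        eta = (f(x) - f_star) / (g_norm ** 2 + 1e-12)   # Polyak stepsize
        eta = min(eta, clip / (g_norm + 1e-12))         # clipping threshold
        x = x - eta * g
    return x
```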

9. Low Local Intrinsic Dimension Causes High Likelihood of Generative Models (Genki Osada)

Abstract:

Background: It has been known that the likelihood of generative models is often untrustworthy for high-dimensional data. The likelihood, which is equivalent to the probability density of an input estimated by models such as continuous normalizing flows and score-based diffusion models, fails to detect anomalies, e.g., out-of-distribution inputs, in certain settings. Previous research has indicated that image complexity, measured either by the variance of pixel values or by the code length after lossless compression, contributes to this problem, yet the underlying mechanism remains undetermined. Meanwhile, recent studies have revisited the local intrinsic dimension (LID) from the perspective of Gaussian convolution, naturally leading to discussions about its relationship with diffusion models. In this poster presentation, we assume that image complexity is directly linked to LID and aim to clarify the mechanism behind the likelihood problem. We demonstrate that in time-dependent probability transition models, i.e., modern generative models, data points on a low-LID manifold exhibit high divergence during the time transition, which in turn causes an increase in likelihood.

10. How much regularization is needed for learning in zero-sum games? (John Lazarsfeld)

Abstract:

We study the convergence rates of Fictitious Play and FTRL algorithms with constant amounts of regularization for learning in games. For a large class of two-player zero-sum games, we prove these algorithms achieve √T total regret, which demonstrates the surprising ability of these methods to converge (in time-average) to Nash Equilibria without large amounts of regularization. This is in contrast to the existing belief that √T regret is obtainable only via a time-dependent learning rate (equivalently, a regularization factor scaling with T). Our techniques rely on leveraging geometric properties of the algorithms’ dynamics in a dual space, which builds on a growing body of work related to the non-equilibration of such learning algorithms in zero-sum games. We also explore convergence rates for alternating variants of these algorithms.
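
For concreteness, the sketch below runs FTRL with a constant entropic regularizer (i.e., a fixed learning rate, multiplicative-weights style) for both players of a zero-sum matrix game and returns the time-averaged strategies; it illustrates the setting studied, not the paper's analysis:

```python
import numpy as np

def softmax(u):
    z = np.exp(u - u.max())
    return z / z.sum()

# Sketch of FTRL with a *constant* entropic regularizer in a two-player zero-sum
# matrix game A: the row player maximizes x^T A y, the column player minimizes it.
def ftrl_zero_sum(A, T=1000, eta=0.5):
    m, n = A.shape
    cum_x, cum_y = np.zeros(m), np.zeros(n)
    avg_x, avg_y = np.zeros(m), np.zeros(n)
    for _ in range(T):
        x, y = softmax(eta * cum_x), softmax(eta * cum_y)
        cum_x += A @ y        # row player's payoff vector
        cum_y += -A.T @ x     # column player's payoff vector (minimizer)
        avg_x += x
        avg_y += y
    return avg_x / T, avg_y / T   # time-averaged strategies
```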

11. Learning Graph Structure via Laplace Based Marginal Likelihood (Anita Suryani Yang)

Abstract:

Graph neural network (GNN) architectures enable efficient learning through assumptions made about the input graph data. However, when those assumptions do not suit the data, performance worsens. Under the Bayesian framework, a GNN's assumptions can be understood as a prior on the input graph structure. Empirical Bayes methods address misalignment between the prior and the data by estimating the prior using the marginal likelihood. Instead of empirically fitting the prior to the data, we observe that we can equivalently modify the data (i.e., the graph structure) to fit the model prior. From this, we propose to use the marginal likelihood for graph structure learning in GNNs. In practice, we use a Laplace approximation of the marginal likelihood and show that a better graph structure can be learned, improving accuracy on the test set. We further show that the marginal likelihood provides a powerful tool for understanding the limitations of GNNs, such as poor performance on heterophilic graphs, oversquashing, and oversmoothing.

12. Diffusion model based conditional independence test (Yanfeng Yang)

Abstract:

Conditional independence (CI) testing is a fundamental task in modern statistics and machine learning. The conditional randomization test (CRT) was recently introduced to test whether two random variables, X and Y, are conditionally independent given a potentially high-dimensional set of random variables, Z. The CRT operates exceptionally well under the assumption that the conditional distribution X|Z is known. However, since this distribution is typically unknown in practice, accurately approximating it becomes crucial. In this poster, I propose using conditional diffusion models (CDMs) to learn the distribution of X|Z. Theoretically and empirically, it is shown that CDMs closely approximate the true conditional distribution. Furthermore, CDMs offer a more accurate approximation of X|Z than GANs, potentially leading to a CRT that performs better than GAN-based ones. Theoretical analysis shows that the proposed test achieves valid control of the type-I error. A series of experiments on synthetic data demonstrates that the new test effectively controls both type-I and type-II errors, even in high-dimensional scenarios. Real-data analysis shows that the proposed conditional independence test can select genes crucial to cancer.
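
The CRT loop itself is simple once a conditional sampler for X|Z is available; the sketch below assumes such a sampler (which in this work would be backed by a conditional diffusion model) and computes a Monte Carlo p-value:

```python
import numpy as np

# Sketch of the conditional randomization test (CRT) loop: given a sampler for the
# (learned) conditional X | Z, compare the observed test statistic with statistics
# computed on resampled copies of X. `statistic` and `sample_x_given_z` are
# placeholders for a dependence measure and a fitted conditional sampler.
def crt_pvalue(x, y, z, statistic, sample_x_given_z, num_resamples=200, rng=None):
    rng = rng or np.random.default_rng(0)
    t_obs = statistic(x, y, z)
    t_null = np.array([statistic(sample_x_given_z(z, rng), y, z)
                       for _ in range(num_resamples)])
    # One-sided Monte Carlo p-value with the +1 correction for exact validity.
    return (1.0 + np.sum(t_null >= t_obs)) / (num_resamples + 1.0)
```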

13. Generalizing Importance Weighting to A Universal Solver for Distribution Shift Problems (Tongtong Fang)

Abstract:

Distribution shift (DS) may have two levels: the distribution itself changes, and the support (i.e., the set where the probability density is non-zero) also changes. When considering the support change between the training and test distributions, there can be four cases: (i) they exactly match; (ii) the training support is wider (and thus covers the test support); (iii) the test support is wider; (iv) they partially overlap. Existing methods are good at cases (i) and (ii), while cases (iii) and (iv) are more common nowadays but still under-explored. In this paper, we generalize importance weighting (IW), a golden solver for cases (i) and (ii), to a universal solver for all cases. Specifically, we first investigate why IW might fail in cases (iii) and (iv); based on the findings, we propose generalized IW (GIW) that could handle cases (iii) and (iv) and would reduce to IW in cases (i) and (ii). In GIW, the test support is split into an in-training (IT) part and an out-of-training (OOT) part, and the expected risk is decomposed into a weighted classification term over the IT part and a standard classification term over the OOT part, which guarantees the risk consistency of GIW. Then, the implementation of GIW consists of three components: (a) the split of validation data is carried out by the one-class support vector machine, (b) the first term of the empirical risk can be handled by any IW algorithm given training data and IT validation data, and (c) the second term just involves OOT validation data. Experiments demonstrate that GIW is a universal solver for DS problems, outperforming IW methods in cases (iii) and (iv).
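
A rough sketch of the GIW risk estimate follows, assuming the importance weights for the first term have already been produced by some IW algorithm and using the validation split proportion as the mixing weight (an illustrative choice, not necessarily the paper's):

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Sketch of the GIW risk estimate described above (illustrative, not the authors'
# code): a one-class SVM fit on training features splits validation data into an
# in-training (IT) part and an out-of-training (OOT) part; the empirical risk is a
# weighted loss on training data plus a standard loss on OOT validation data.
def giw_risk(loss_train, weights, loss_val, val_features, train_features):
    oc_svm = OneClassSVM(gamma="scale", nu=0.1).fit(train_features)
    is_it = oc_svm.predict(val_features) == 1        # +1 = inside training support
    weighted_term = np.mean(weights * loss_train)    # weights from any IW algorithm
    oot_term = loss_val[~is_it].mean() if (~is_it).any() else 0.0
    pi_it = is_it.mean()                             # estimated IT mass of the test support
    return pi_it * weighted_term + (1 - pi_it) * oot_term
```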

March 4

1. Uncertainty-aware self-supervised learning for animal behavior (France Rose)

Abstract:

Studying freely moving animals is essential to understand how they behave and make decisions (e.g., when escaping predators, finding mates, or raising their young) in an undisturbed manner. Although animal behavior has been studied for decades, animal movements can only now be recorded at high throughput thanks to recent technical progress. However, these methods are not perfect and produce missing data. Since animal behavior cannot be easily scripted and additional recordings are not always possible due to constraints in experimental design, missing data is a more pressing problem in animal than in human behavior analysis. So far, few works have effectively addressed these emerging issues in animal recordings, with most relying on linear interpolation and smoothing (e.g., Kalman filtering) that are only suitable for short gaps, or lacking large-scale testing. We hypothesized that recent advances in deep learning architectures and self-supervised learning (SSL) can recover missing data by learning dynamics within and between keypoints. Specifically, masked modeling has proven successful for recent large language models (LLMs) and computer vision transformers. Mimicking missing data during training via masked modeling, we tested several neural network architectures: Gated Recurrent Unit (GRU), Temporal Convolutional Network (TCN), Spatio-Temporal Graph Convolutional Network (ST-GCN), Space-Time-Separable Graph Convolutional Network (STS-GCN), and a custom transformer encoder named DISK (Deep Imputation for Skeleton data). An optional probabilistic head adapted to DISK assesses the reliability of the imputation at inference time. We gathered seven datasets, covering five species (human, fly, mouse, rat, fish), in 2D and 3D, with one to two animals, and a variety of keypoint counts (from 3 to 38 per animal). We found that DISK outperformed the other architectures and the linear interpolation baseline (42% to 89% root mean square error improvement over linear interpolation, calculated between true and imputed coordinates on a held-out test set; one value per dataset), as well as other available post-processing libraries (keypoint-moseq, optipose). The DISK probabilistic head outputs an estimated error that is linearly correlated with the real error (Pearson correlation coefficient: 0.746 to 0.890; one value per dataset). This estimated error allows filtering out less reliable predictions and controlling the amount of noise in the imputed dataset. As SSL methods are known to learn general properties of the input data, we further explored the latent space of DISK and showed that motion sequences cluster by behavior category (e.g., attack, mount, investigation). Animal behavior experiments are expensive and complex, yet tracking errors sometimes make large portions of the experimental data unusable. DISK allows filling in the missing information and taking full advantage of the rich behavioral data. Available as a stand-alone imputation package (github.com/bozeklab/DISK.git), DISK is applicable to the results of any tracking method (cameras or motion capture) and allows for any type of downstream analysis.
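
A minimal sketch of the masked-modeling training signal described above: random spans of keypoint coordinates are hidden and would then be reconstructed by the network; the span lengths and fill value are illustrative choices:

```python
import numpy as np

# Sketch of masked modeling for keypoint sequences: random spans of coordinates are
# hidden during training to mimic missing tracking data (parameters illustrative).
def mask_keypoints(seq, mask_rate=0.15, max_span=10, rng=None):
    """seq: (time, keypoints, coords) array; returns masked copy and boolean mask."""
    rng = rng or np.random.default_rng(0)
    seq = seq.copy()
    mask = np.zeros(seq.shape[:2], dtype=bool)
    n_masked = int(mask_rate * seq.shape[0] * seq.shape[1])
    while mask.sum() < n_masked:
        t = rng.integers(0, seq.shape[0])
        k = rng.integers(0, seq.shape[1])
        span = rng.integers(1, max_span + 1)
        mask[t:t + span, k] = True
    seq[mask] = 0.0   # masked coordinates replaced by a fill value
    return seq, mask
```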

2. Investigating privacy and expressivity in graph representation learning (Tamara Drucks)

Abstract:

We investigate the trade-off between expressivity and privacy in the context of graph representation learning (GRL). In GRL, the expressive power of an algorithm is commonly defined as the capability to produce different embeddings for non-isomorphic graphs. Expressive GRL algorithms, such as graph neural networks (GNNs), have proved successful in many graph classification tasks. However, GNNs have been shown to leak information about their training data, leading to privacy concerns. Privacy-preserving techniques such as differential privacy offer formal guarantees that protect the training data, usually at the cost of classification performance. As more expressive GNNs may overfit the training data, they present increased privacy risks. On the other hand, as more private GNNs generally perform worse than their non-private counterparts, they may suffer from decreased expressivity. In this ongoing work we investigate to what degree privacy and expressivity can be reconciled and discuss GRL algorithms with quantifiable expressive power as well as provable privacy guarantees. Remark: This is joint work with Patrick Indri, who also applied to present at the workshop.

3. Learning Robust Representations for Visual Reinforcement Learning via Task-Relevant Mask Sampling (Vedant Dave)

Abstract:

Humans excel at isolating relevant information from noisy data to predict the behavior of dynamic systems, effectively disregarding non-informative, temporally-correlated noise. In contrast, existing visual reinforcement learning algorithms face challenges in generating noise-free predictions within high-dimensional, noise-saturated environments, especially when trained on world models featuring realistic background noise extracted from natural video streams. We propose Task-Relevant Masks Sampling (TRMS), a novel approach for identifying task-specific and reward-relevant masks. TRMS utilizes existing segmentation models as a masking prior, followed by a mask selector that dynamically identifies a subset of masks at each timestep, selecting those most likely to contribute to task-specific rewards. To mitigate the high computational cost associated with these masking priors, a lightweight student network is trained in parallel. This network learns to perform masking independently and replaces the SAM-based teacher network after a brief initial phase (under 10-25% of total training). TRMS significantly enhances the generalization capabilities of Soft Actor-Critic agents and achieves state-of-the-art performance on the RL-ViGen benchmark, which includes challenging variants of the DeepMind Control Suite, Dexterous Manipulation, and Quadruped Locomotion tasks.

4. Large-Scale Similarity Search with Optimal Transport (Clea Laouar)

Abstract:

The Wasserstein distance is a powerful tool for comparing probability distributions and is widely used for document classification and retrieval tasks in NLP. In particular, it is known as the word mover's distance (WMD) in the NLP community. WMD exhibits excellent performance on various NLP tasks; however, one of its limitations is its computational cost, which makes it impractical for large-scale distribution comparisons. In this study, we propose a simple and effective nearest neighbor search based on the Wasserstein distance. Specifically, we employ an L1 embedding based on the tree-based Wasserstein approximation and subsequently use nearest neighbor search to efficiently find the k nearest neighbors. Through benchmark experiments, we demonstrate that the proposed approximation has comparable performance to the vanilla Wasserstein distance and can be computed three orders of magnitude faster.
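
The key computational trick is that the tree-based Wasserstein distance between two distributions equals an L1 distance between vectors of edge-weighted subtree masses, so standard L1 nearest neighbor search applies; a minimal sketch (with an assumed parent-pointer tree representation) follows:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Sketch of the tree-Wasserstein L1 embedding: for a tree given by parent pointers
# and edge weights, the tree-Wasserstein distance between two distributions over
# nodes equals the L1 distance between vectors of (edge weight x subtree mass).
def tree_embedding(dist, parent, edge_weight):
    """dist: (n_nodes,) mass per node, nodes ordered so that parent[i] < i (root 0)."""
    subtree_mass = dist.astype(float)
    for node in range(len(parent) - 1, 0, -1):   # accumulate mass bottom-up
        subtree_mass[parent[node]] += subtree_mass[node]
    return edge_weight * subtree_mass            # set the root's edge weight to 0

def knn_search(query_dists, corpus_dists, parent, edge_weight, k=5):
    corpus = np.stack([tree_embedding(d, parent, edge_weight) for d in corpus_dists])
    queries = np.stack([tree_embedding(d, parent, edge_weight) for d in query_dists])
    nn = NearestNeighbors(n_neighbors=k, metric="manhattan").fit(corpus)
    return nn.kneighbors(queries, return_distance=False)
```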

5. Benefits of Feature Learning on the SGD Dynamics of Two Layer Neural Networks (Sota Nishiyama)

Abstract:

The success of neural networks (NN) stems from their ability to acquire useful features of data. Recent studies have analyzed a two-layer NN where features are updated through a single gradient step as a minimal model of feature learning, demonstrating that feature learning can surpass the generalization error limits of NNs that do not learn features (lazy regime). However, its influence on the learning dynamics remains elusive. In this study, we extend this line of research to analyze the impact of feature learning on SGD dynamics. We theoretically investigate the online SGD dynamics of a two-layer neural network after a single update of its features through a statistical mechanics approach. Our findings reveal that acquired features positively influence the dynamics.

6. Finite-sample, performance guaranteed inference on linear regression with structured permutation (Hirofumi Ota)

Abstract:

We study a noisy linear observation model with an unknown structured permutation, where the labels of the response and covariates are mismatched. This paper introduces a robust statistical inference framework for jointly estimating the unknown coefficient vector and the permutation matrix, leveraging artificially generated samples. Our approach provides a performance-guaranteed test statistic to detect the presence of a permutation in the data matrix and constructs confidence sets for the regression coefficients while accounting for the uncertainty of estimating the unknown permutation. Unlike existing methods, our framework offers finite-sample valid inference for discrete parameters such as permutations without imposing restrictive assumptions, by effectively utilizing the underlying algorithmic structure.

7. Learning a Single Index Model from Anisotropic Data with Vanilla Stochastic Gradient Descent (Guillaume Braun)

Abstract:

We investigate the problem of learning a Single Index Model (SIM)—a popular model for studying the ability of neural networks to learn features—from anisotropic Gaussian inputs by training a neuron using vanilla Stochastic Gradient Descent (SGD). While the isotropic case has been extensively studied, the anisotropic case has received less attention and the impact of the covariance matrix on the learning dynamics remains unclear. For instance, \cite{anisoSIM} proposed a spherical SGD that requires a separate estimation of the data covariance matrix, thereby oversimplifying the influence of covariance. In this study, we analyze the learning dynamics of vanilla SGD under the SIM with anisotropic input data, demonstrating that vanilla SGD automatically adapts to the data’s covariance structure. Leveraging these results, we derive upper and lower bounds on the sample complexity using the covariance structure, leading to dimension-free bounds on sample complexity. Finally, we validate and extend our theoretical findings through numerical simulations, demonstrating the practical effectiveness of our approach in adapting to anisotropic data, which has implications for efficient training of neural networks.
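
For concreteness, the sketch below runs vanilla online SGD on a single-index model with anisotropic Gaussian inputs; the tanh link and squared loss are illustrative choices, not assumptions of the paper:

```python
import numpy as np

# Minimal sketch of learning a single-index model y = g(<w*, x>) with vanilla
# online SGD from anisotropic Gaussian inputs x ~ N(0, Sigma); sigma_sqrt is a
# square root of Sigma, and the tanh link / squared loss are illustrative.
def online_sgd_sim(w_star, sigma_sqrt, lr=0.01, steps=10_000, rng=None):
    rng = rng or np.random.default_rng(0)
    d = w_star.size
    w = rng.standard_normal(d) / np.sqrt(d)
    for _ in range(steps):
        x = sigma_sqrt @ rng.standard_normal(d)    # anisotropic Gaussian input
        y = np.tanh(w_star @ x)                    # teacher output
        pred = np.tanh(w @ x)
        grad = (pred - y) * (1.0 - pred ** 2) * x  # chain rule for the tanh link
        w = w - lr * grad
    return w
```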

8. Dynamics of In-Context Learning in Transformers for Autoregressive Processes (Hanna Tseran)

Abstract:

In-context learning enables large language models to adapt to tasks using only input sequences without requiring parameter updates. Despite its importance, the mechanisms underlying in-context learning still need to be better understood. This work investigates how transformers implement in-context learning for autoregressive processes. We describe the functions learned by the network and establish approximation bounds, providing insights into the ability of a transformer to generalize from context.

9. Precise Asymptotics on Tuning-Free Robust Loss Minimizers (Kazuma Sawaya)

Abstract:

We study the coefficient estimator for linear regression obtained by minimizing the rank loss, equivalently Gini's mean difference (GMD) of the residuals, in the high-dimensional regime where the sample size n and dimensionality p are large and comparable. This loss minimizer is robust to heavy-tailed errors and is noteworthy for not requiring additional tuning parameters, unlike the Huber loss or trimmed least squares. The asymptotic behavior of this estimator in low dimensions is a classical result, but its behavior in high dimensions is unknown. Hence, we characterize its limiting estimation error and asymptotic relative efficiency in the high-dimensional limit. To this end, we derive a simplified form of the Moreau envelope and its convergence limit for the GMD loss. Then, the convex Gaussian min-max theorem and related techniques can be used to establish the asymptotics of the estimator.
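
As a reference point, the estimator under study can be written down directly: minimize Gini's mean difference of the residuals over the coefficients, with no tuning parameter; the sketch below uses a generic optimizer and a least-squares warm start (implementation choices, not the paper's):

```python
import numpy as np
from scipy.optimize import minimize

# Sketch of the tuning-free robust estimator: minimize Gini's mean difference of
# the residuals (the rank / R-estimation loss) over the regression coefficients.
def gmd_loss(beta, X, y):
    r = y - X @ beta
    return np.abs(r[:, None] - r[None, :]).mean()   # average pairwise |r_i - r_j|

def gmd_regression(X, y):
    beta0 = np.linalg.lstsq(X, y, rcond=None)[0]    # least-squares warm start
    return minimize(gmd_loss, beta0, args=(X, y), method="Powell").x
```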

10. Learning, forward and backward (Kevin Max)

Abstract:

The state of the art in deep learning is the error backpropagation algorithm (BP). Ongoing work in computational neuroscience investigates whether some form of BP may be realized in the brain, e.g. as a model of sensory processing in cortex. However, BP requires biologically implausible weight transport from feed-forward to feedback paths. We introduce two methods to remedy this problem in artificial and spiking neural networks: Phaseless Alignment Learning (PAL) and Spike-based Alignment Learning (SAL), respectively. Both are bio-plausible methods for learning useful top-down weights in layered cortical hierarchies. To achieve this, they exploit the noise naturally found in biophysical systems as an additional carrier of information; importantly, synapses are learnt using only information locally available to the neurons. Our methods are applicable to a wide class of models (both ANNs and SNNs) and improve on previously known biologically plausible ways of credit assignment: compared to random synaptic feedback (feedback alignment, FA), they can solve complex tasks with fewer neurons and better learn useful latent representations. We demonstrate this on various classification tasks using a cortical dendrite microcircuit model and standard machine learning benchmarks.

11. Embodied Evolution of Intrinsically Motivated Reinforcement Learning (Tojoarisoa Rakotoaritina)

Abstract:

Efficient exploration and learning are crucial for autonomous agents to achieve flexible adaptation, especially in environments with sparse extrinsic rewards and inherent stochasticity. Intrinsically motivated reinforcement learning leverages rewards, typically based on novelty, surprise, and empowerment, to encourage agents to explore new states, acquire knowledge that improves prediction, and maintain effectiveness in future states, respectively. However, a unified design principle for these rewards remains elusive, and the multiple existing formulations complicate adaptation to different features of the environment and the task. To address these challenges, I propose DEEP (Discriminative Episodic Exploration Policy), a framework that integrates novelty, surprise, and empowerment rewards via information-theoretic formulations. DEEP evolves its parameters through a combination of embodied evolution and reinforcement learning, aiming to generalize previous intrinsic reward methods. Preliminary experiments use a Proximal Policy Optimization (PPO) agent that combines these intrinsic rewards within a custom four-room sparse-reward environment. The agent is tested under both deterministic and stochastic conditions (using the MiniGrid and MuJoCo benchmarks); the MiniGrid experiments focus on environments that differ in exploration difficulty, with extrinsic rewards provided only at two goal states. Results show that adjusting intrinsic reward coefficients and PPO hyperparameters significantly impacts exploration efficiency and overall performance. In stochastic settings, the DEEP agent not only outperforms the baseline PPO but also matches or even exceeds the performance of five other state-of-the-art intrinsic reward methods, although it sometimes gets trapped in local optima when deep exploration is required. Ongoing work involves fine-tuning these hyperparameters to optimize intrinsic reward weights, with plans to extend this approach to continuous settings.
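
Purely as an illustration of combining the three intrinsic reward signals with evolvable coefficients, the sketch below uses simple stand-ins (count-based novelty, prediction-error surprise, and a crude empowerment proxy); it is not the DEEP formulation itself:

```python
import numpy as np

# Illustrative combination of the three intrinsic reward signals named above,
# mixed with evolvable coefficients; all quantities are simple stand-ins.
def intrinsic_reward(state_key, pred_next, true_next, n_reachable,
                     visit_counts, coeffs=(1.0, 1.0, 1.0)):
    visit_counts[state_key] = visit_counts.get(state_key, 0) + 1
    novelty = 1.0 / np.sqrt(visit_counts[state_key])        # count-based novelty
    surprise = float(np.sum((pred_next - true_next) ** 2))  # prediction error
    empowerment = np.log(max(n_reachable, 1))                # crude reachability proxy
    w_n, w_s, w_e = coeffs                                   # evolvable coefficients
    return w_n * novelty + w_s * surprise + w_e * empowerment
```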

12. Evolution of fear and food rewards under prey-predator dynamics (Yuji Kanagawa)

Abstract:

Animal brains have evolved to help animals survive and produce more offspring. The reward system is the most fundamental brain function for this purpose: it evaluates external stimuli and provides learning cues that reinforce positive behaviors such as eating and drinking while discouraging dangerous ones. While the reward system is the foundation of higher-order brain functions such as intelligence and consciousness, the environmental conditions under which these functions evolved remain unclear. Previously, we studied how food-intake rewards and fatigue evolve in dynamically fluctuating populations using large-scale simulations of reinforcement learning agents with genetically encoded reward weights. However, that study did not observe the evolution of negative rewards for less-nutritious or poisonous foods leading to avoidance behaviors. In this study, we introduce predators that can eat prey in the simulation environment and investigate how negative rewards for observing predators can evolve under predator-prey dynamics. In experiments, we observed that negative rewards for observing threats evolved in the predator-prey setting, while negative rewards for threats did not evolve for static harmful obstacles. This result suggests that predators are critically important for the evolution of negative rewards driving avoidance behaviors.

13. Adversarial Backdoor Attack by Naturalistic Data Poisoning on Trajectory Prediction in Autonomous Driving (Mohammad Sabokrou)

Abstract:

In autonomous driving, behavior prediction is fundamental for safe motion planning; hence the security and robustness of prediction models against adversarial attacks are of paramount importance. We propose a novel adversarial backdoor attack against trajectory prediction models as a means of studying their potential vulnerabilities. Our attack affects the victim at training time via naturalistic, hence stealthy, poisoned samples crafted using a novel two-step approach. First, the triggers are crafted by perturbing the trajectory of the attacking vehicle, and then disguised by transforming the scene using a bi-level optimization technique. The proposed attack does not depend on a particular model architecture and operates in a black-box manner, so it can be effective without any knowledge of the victim model. We conduct extensive empirical studies using state-of-the-art prediction models on two benchmark datasets, using metrics customized for trajectory prediction. We show that the proposed attack is highly effective, as it can significantly hinder the performance of prediction models while remaining unnoticed by the victims, and efficient, as it forces the victim to generate malicious behavior even under constrained conditions. Via ablative studies we analyze the impact of different attack design choices, followed by an evaluation of existing defence mechanisms against the proposed attack.