Research
I'm broadly interested in everything reinforcement learning, learning theory, representation learning, and also a bit of robotics. Currently I'm working on offline learning in Regular Decision Processes and Automata Learning, and looking for ideas to expand it to the online setting.
|
News
- New! [Jan '25] Our paper "Offline RL in Regular Decision Processes: Sample Efficiency via Language Metrics" was accepted at ICLR 2025!
- New! [Jan '25] Gave a talk at Indian Statistical Institute (ISI), Kolkata!
- New! [Nov '24] Attending Learning on Graphs Conference 2024!
- [Oct '24] Presented our paper "Tractable Offline Learning of Regular Decision Processes" at EWRL, Toulouse!
- [Oct '24] Participated in the first LeRobot Hackathon by HuggingFace!
- [Jul '24] Our poster was accepted at EWRL 2024!
- [May '24] Attended AISTATS 2024 in Valencia, Spain!
- [Mar '24] Attended the Workshop on Optimal Transport in Berlin!
- [Sep '23] Started my PhD under the supervision of Anders Jonsson at the AI&ML lab, UPF!
- [Aug '23] Presented our paper "BeAts: Bengali Speech Acts Recognition using Multimodal Attention Fusion" (oral presentation!) at INTERSPEECH 2023 in Dublin!
- [July '23] Co-organised the Reinforcement Learning Summer School 2023!
- [July '23] Attended the ESSAI summer school 2023 in Ljubljana, Slovenia!
- [July '23] Defended my Master's thesis at Universitat Pompeu Fabra!
|
|
Ahana Deb, Roberto Cipollone, Anders Jonsson, Alessandro Ronca, Mohammad Sadegh Talebi
ICLR (Poster), 2025
code /
website /
In this paper, we consider episodic Regular Decision Processes (RDPs) and show that it is possible to overcome the limitations of existing offline RL algorithms for RDPs via the introduction of two original techniques: a novel metric grounded in formal language theory and an approach based on Count-Min Sketch (CMS). We derive Probably Approximately Correct (PAC) sample complexity bounds associated with each of these techniques, and validate the approach experimentally.
|
|
Ahana Deb, Roberto Cipollone, Anders Jonsson, Alessandro Ronca, Mohammad Sadegh Talebi
EWRL (Poster), 2024
arxiv /
poster /
We study offline Reinforcement Learning (RL) in a class of non-Markovian environments called Regular Decision Processes (RDPs). We introduce two novel algorithms with improved sample complexity and derive the associated PAC sample complexity bounds.
|
|
Ahana Deb, Sayan Nag, Ayan Mahapatra, Soumitri Chattopadhyay, Aritra Marik, Pijush Kanti Gayen, Shankha Sanyal, Archi Banerjee, Samir Karmakar
INTERSPEECH (Oral), 2023
arxiv /
website /
We develop a novel multimodal approach combining two models, wav2vec2.0 for audio and MarianMT for text translation, by using multimodal attention fusion to predict speech acts in our prepared Bengali speech corpus. We also show that our model BeAts (Bengali speech acts recognition using Multimodal Attention Fusion) significantly outperforms both the unimodal baseline using only speech data and a simpler bimodal fusion using both speech and text data.
|
|
Ahana Deb, Supervisors: Anders Jonsson, Vicenç Gómez, Mario Ceresa
Universitat Pompeu Fabra, 2022
We aim to utilise the advantages in problem formulation and ease of computation of Linearly solvable Markov Decision Processes (LMDPs) in a multiple-agent, multiple-reward scenario, using non-parametric Bayesian inverse reinforcement learning. Also available here.
|
Other Projects
These include coursework, side projects and unpublished research work.
|
|
Dia Internacional de la Llengua Materna
UPF Catala
2025-02-26
slides /
Gave a short presentation on the history of International Mother Language Day.
|
|
LeRobot Hackathon by HuggingFace
HuggingFace
2024-10-26
Assembled a Moss v1 arm from scratch, and used the LeRobot library to configure, calibrate, and teleoperate the follower arm by manually controlling the leader arm. We used this setup to record a dataset, trained an imitation learning policy on it, and ran the trained policy on the real robot.
|
|
EVA: Chatbot project
UPF Natural Language Interaction
2023-12-07
code /
With Adriana Basbous and Tobias Glaninger, we built a chatbot and a web application to replace the current virtual secretary of the Department of Information and Communications Technologies at UPF, featuring automatic speech recognition and text-to-speech – coursework for Natural Language Interaction (2023).
|
|
Hate Speech analysis in the Manosphere
UPF
2023-07-21
Work with Adriana Basbous. We investigated misinformation and hate speech in alt-right and manosphere channels online, analyzing web and social media corpora from online communities associated with the manosphere. We applied sentiment analysis and topic modeling using different machine learning models, exploring their varying capacity to identify and assess hate speech and misinformation – coursework for Web Intelligence, taught by Pablo Aragón (2023).
|
|
Flappy Birds project
UPF
2022-06-14
paper /
code /
For our final project for Gergely Neu’s course on Reinforcement Learning, we try a bunch of new ideas to learn how to play Flappy Birds, and we compare what works versus what doesn’t.
|
|
Racial Bias in Facial Recognition Technology
UPF
2022-06-14
paper /
In this work we analyze recent publications on facial recognition technology (FRT) benchmarks, and raise questions about the racial composition of datasets and accuracy reports on racial subgroups. Our analysis indicates that a significant portion of the papers does not consider any kind of bias and that some racial groups are underrepresented in the datasets used. These factors need to be taken into account when analyzing facial data, since ignoring them limits the performance of FRTs.
|
|
A Deep Learning Approach for Deeply Inaccurate Wordle Solving
UPF
2019-06-14
paper /
This is a fun (read: mock) paper about (not) solving the WORDLE puzzle that I wrote with Sayan Goswami, for SIGBOVIK 2022.
|
|
State estimation of dynamic particle from a moving camera
TCS
2019-06-14
We present a solution for estimating the motion and position of a body moving with a linear velocity while the camera independently undergoes a transformation. The solution reduces the requirement to three consecutive camera frames to correctly estimate the position and velocity of the moving body with respect to the camera's coordinate system.
|
|