Harold Benoit

Github / Linkedin / CV / Google Scholar / Email: (my first name)_(my last name)@hotmail.ch

Technical Staff @ Liquid AI, building the Transformer Killer.

I also have learning notes that some people find useful.

News:

[April 2024] Finished 1st in the LLM training hackathon.

[January 2024] Diversification methods paper accepted at ICLR 2024.

Experience

[2024] LLMs @ Swiss AI

[2023-2024] Research Associate at VILAB, supervised by Amir Zamir.

[2021-2023] M.Sc. degree in Data Science (ranked 3rd in year) at

[2023] Research Intern at

Research, supervised by Mattia Rigotti.

[2022] Quantitative Research Intern at G-Research

[2018-2021] B.Sc. degree in Computer Science & Communication Systems at

What I enjoy

Good engineering, e.g., training deep neural nets and keeping GPUs busy.
Good research. Previously, I've done more "data-focused" research, exploring scalable ways to identify or synthesize high-quality data, with the aim of making models more general and adaptable to new environments.

Publications

Controlled Training Data Generation with Diffusion Models
Teresa Yeo*, Andrei Atanov*, Harold Benoit^, Aleksandr Alekseev^, Ruchira Ray, Pooya Akhoondi, Amir Zamir
In review, 2024
arXiv / Github / project page

We propose a method to generate tailored synthetic training data, i.e., specifically useful for a given supervised model and target deployment domain. We introduce two feedback mechanisms to guide the generation: 1) model-based and 2) target domain-based.

Unraveling the Key Components of OOD Generalization via Diversification
Harold Benoit*, Liangze Jiang*, Andrei Atanov*, Oğuzhan Fatih Kar, Mattia Rigotti, Amir Zamir
ICLR, 2024
arXiv / OpenReview

We distill the critical design factors of current state-of-the-art methods (multi-hypotheses/diversification methods) for settings with spurious correlations.