CV
Education
- PhD in Designing Responsible NLP, University of Edinburgh, 2025-2029
- UKRI AI Centre for Doctoral Training (CDT) in Responsible and Trustworthy in-the-world NLP
- Supervised by Dr. Emily Allaway
- Fully funded by the G-Research PhD Scholarship for outstanding academic merit
- Master of Informatics (First Class Honours), University of Edinburgh, 2019-2024
- Transcript available
- Dissertation supervised by Prof. Mirella Lapata
- Courses: Natural Language Processing, Machine Learning, Reinforcement Learning, Calculus, Linear Algebra
- Secondary School, Abbey Christian Brothers’ Grammar School, Newry, UK, 2012-2019
- AAA in A-Level Mathematics, Physics and Computer Science
Research Experience
- Sep 2025 - Present: Doctoral Researcher
- University of Edinburgh
- Supervisor: Dr. Emily Allaway
- Focus: Improving the compositionality of language models
- Aug 2024 - Aug 2025: Research Assistant
- University of Edinburgh
- Supervisor: Barry Haddow
- Co-developed the HPLT 2.0 bitexting pipeline to mine parallel data
- May 2024 - Jul 2024: Junior Research Assistant
- University of Edinburgh
- Supervisor: Pinzhen Chen
- Cleaned and processed data for EMMA-500, a 7B-parameter LLM covering over 500 languages
- Jun 2022 - Apr 2024: Junior Research Assistant
- University of Edinburgh
- Supervisor: Mirella Lapata
- Curated and evaluated mNumersense: 36k+ numeric commonsense sentences in Arabic, Chinese, and Russian
Industry Experience
- Mar 2018: Software Engineer
- Sep 2017: Engineer
- Oct 2017: Computer Technician
- Computer Hospital, Newry, UK
Teaching Experience
- Sep 2023 - Nov 2023: Demonstrator
- Accelerated Natural Language Processing (INFR11125), University of Edinburgh
- Jan 2024 - Mar 2024: Demonstrator
- Foundations of Natural Language Processing (INFR10078), University of Edinburgh
- Jan 2023 - May 2023: Tutor
- Foundations of Natural Language Processing (INFR10078), University of Edinburgh
- Sep 2022 - May 2023: Tutor
- Foundations of Data Science (INFR08030), University of Edinburgh
- Sep 2020 - Mar 2024: Private Tutor
- Self-employed, Edinburgh, UK
Leadership & Outreach
- Sep 2025 - Present: G-Research Academic Ambassador
- G-Research, Edinburgh, UK
- Mar 2025 - Present: Committee Member, Community Events Planning
- Edinburgh Bahá’í Community, Edinburgh, UK
- Jan 2024 - Feb 2024: Primary School Micro:bit Programming Workshop Facilitator
- University of Edinburgh, Bathgate, UK
Awards & Achievements
- G-Research PhD Scholarship (2025-2029)
- Outstanding Honours Project (2024 & 2023)
- Runner-up Best Coursework for Reasoning and Agents (2021)
- Exemplary Project for Foundations of Data Science (2021)
Major Contributions & Open Source
Skills
- Programming: Python, Java, Haskell, LaTeX, SQL
- Libraries & Tools: PyTorch, TensorFlow, Hugging Face, NLTK, NumPy, Slurm, Kubernetes, Docker, MTurk, pandas, scikit-learn, statsmodels, Matplotlib, Festival, HTK, Tkinter, Google Cloud VM, Weather & Maps API, Seaborn, Kivy, JUnit, Maven
- Languages: English (Native), German (Professional)
Selected Publications
- Un-Overfittable: A Dynamic Benchmark on Counterfactual Mathematics - Dayyán O’Brien, Pinzhen Chen, Barry Haddow (Under review at ACL Rolling Review, 2025)
- DocHPLT: A Massively Multilingual Document-Level Translation Dataset - Dayyán O’Brien, Bhavitvya Malik, Ona De Gibert Bonet, Pinzhen Chen, Barry Haddow, Jörg Tiedemann (Under review at WMT, 2025)
- An Expanded Massive Multilingual Dataset for High-Performance Language Technologies - Laurie Burchell, Ona de Gibert, et al. (incl. Dayyán O’Brien) (ACL 2025)
- EMMA-500: Enhancing Massively Multilingual Adaptation of Large Language Models - Shaoxiong Ji, Zihao Li, et al. (incl. Dayyán O’Brien) (Under review at DMLR, 2024)
- Prompting Numerical Commonsense Reasoning across Languages - Dayyán O’Brien (Outstanding Honours Thesis, 2024)