Alham Fikri Aji / Curriculum Vitae alham.fikri@mbzuai.ac.ae
Education
- PhD, University of Edinburgh Nov 2016 - Jun 2020
  Supervised by Kenneth Heafield and Rico Sennrich.
- MSc Artificial Intelligence, University of Edinburgh Sep 2014 - Aug 2015
- BSc Computer Science, Universitas Indonesia Aug 2010 - Jul 2014
Working Experience
- Visiting Research Scientist, Google Research Sep 2024 - Current
- Adjunct Assistant Professor, Monash Indonesia Jan 2024 - Current
- Assistant Professor, MBZUAI Jan 2023 - Current
- Applied Scientist, Amazon Alexa AI Oct 2021 - Jan 2023
- Postdoctoral Research Associate, University of Edinburgh Jun 2020 - Jul 2021
- Research Scientist, Kata.ai Nov 2019 - Sep 2021
- Engineering Intern, Google Research Jul 2017 - Nov 2017
- Language Engineer, Apple Siri Oct 2015 - Oct 2016
Awards
- Best Resource Paper Award, EACL 2024
- Best Resource Paper Award, AACL 2023
- Outstanding Paper Award, EACL 2023
- Outstanding Contribution Award, WNGT 2019
- World Finalists, ACM-ICPC 2014
- Silver Medalists, International Olympiad in Informatics (IOI) 2010
Professional Services
- Advisory Board: The ACL Special Interest Group on SEA NLP (SIGSEA)
- Reviewer and Program Committee Member
  - Conferences: ARR, ACL, COLING, ICML, ICLR, NeurIPS, LREC
  - Workshops: WNGT, TL4NLP
- Area Chair: ARR (2024+), ACL (2023), EMNLP (2023), COLM (2024)
- Local Chair: COLING (2025)
- Organizer: South-East Asia Language Processing (2023, 2025), SemEval shared task organizer (2024, 2025)
- Informatics Olympiad:
  - Problem Setter: OSN Indonesia (2013, 2014, 2015), ACM-ICPC (2014, 2015), APIO (2015), Gemastik (2016), ICPC-Asia (2025)
  - Committee: Gemastik (2016), TOKI-Open (2018), IOI (2022)
  - Training: Indonesia’s Pre-OSN Distance Training (2009, 2010), Indonesia’s National Camp (2011, 2012, 2013), University of Edinburgh ACM-ICPC preparation (2014), Saudi Arabia National Team (2020)
Selected Publications
I mainly publish at ACL conferences. You may also refer to my Google Scholar for an updated list of publications.
● denotes papers on which I am a (co-)senior author, whereas ■ denotes papers on which I am a main author.
Peer-Reviewed Conferences
- Unveiling the Influence of Amplifying Language-Specific Neurons. Inaya Rahmanisa, Lyzander Marciano Andrylie, Mahardika Krisna Ihsani, Alfan Farizki Wicaksono, Haryo Akbarianto Wibowo, Alham Fikri Aji (AACL, 2025)
- Multilingual Iterative Model Pruning: What Matters? Haryo Akbarianto Wibowo, Haiyue Song, Hideki Tanaka, Masao Utiyama, Alham Fikri Aji, Raj Dabre (AACL, 2025)
- ThaiInstruct: An Instruction-Following Dataset for Culturally-Aware, Multitask, and Multi-domain Evaluation in Thai. Peerat Limkonchotiwat, Pume Tuchinda, Lalita Lowphansirikul, Surapon Nonesung, Panuthep Tasawong, Alham Fikri Aji, Can Udomcharoenchaikit, Sarana Nutanong (EMNLP, 2025)
- LORAXBENCH: A Multitask, Multilingual Benchmark Suite for 20 Indonesian Languages. Alham Fikri Aji, Trevor Cohn (EMNLP, 2025)
- From Surveys to Narratives: Rethinking Cultural Value Adaptation in LLMs. Muhammad Farid Adilazuarda, Chen Cecilia Liu, Iryna Gurevych, Alham Fikri Aji (EMNLP, 2025)
- Balanced Multi-Factor In-Context Learning for Multilingual Large Language Models. Masahiro Kaneko, Alham Fikri Aji, Timothy Baldwin (EMNLP, 2025)
- CaMMT: Benchmarking Culturally Aware Multimodal Machine Translation. Emilio Villa-Cueva, Sholpan Bolatzhanova, Diana Turmakhan, Kareem Elzeky, Henok Biadglign Ademtew, Alham Fikri Aji, Israel Abebe Azime, Jinheon Baek, Frederico Belcavello, Fermin Cristobal, Jan Christian Blaise Cruz, Mary Dabre, Raj Dabre, Toqeer Ehsan, Naome A Etori, Fauzan Farooqui, Jiahui Geng, Guido Ivetta, Thanmay Jayakumar, Soyeong Jeong, Zheng Wei Lim, Aishik Mandal, Sofía Martinelli, Mihail Minkov Mihaylov, Daniil Orel, Aniket Pramanick, Sukannya Purkayastha, Israfel Salazar, Haiyue Song, Tiago Timponi Torrent, Debela Desalegn Yadeta, Injy Hamed, Atnafu Lambebo Tonja, Thamar Solorio (Findings of EMNLP, 2025)
- MoMentS: A Comprehensive Multimodal Benchmark for Theory of Mind. Emilio Villa-Cueva, S M Masrur Ahmed, Rendi Chevi, Jan Christian Blaise Cruz, Kareem Elzeky, Fermin Cristobal, Alham Fikri Aji, Skyler Wang, Rada Mihalcea, Thamar Solorio (Findings of EMNLP, 2025)
- Data Laundering: Artificially Boosting Benchmark Results through Knowledge Distillation. Jonibek Mansurov, Akhmed Sakip, Alham Fikri Aji (ACL, 2025)
- BRIGHTER: BRIdging the Gap in Human-Annotated Textual Emotion Recognition Datasets for 28 Languages. Shamsuddeen Hassan Muhammad, Nedjma Ousidhoum, Idris Abdulmumin, Jan Philip Wahle, Terry Ruas, Meriem Beloucif, Christine de Kock, Nirmal Surange, Daniela Teodorescu, Ibrahim Said Ahmad, David Ifeoluwa Adelani, Alham Fikri Aji, Felermino D. M. A. Ali, Ilseyar Alimova, Vladimir Araujo, Nikolay Babakov, Naomi Baes, Ana-Maria Bucur, Andiswa Bukula, Guanqun Cao, Rodrigo Tufiño, Rendi Chevi, Chiamaka Ijeoma Chukwuneke, Alexandra Ciobotaru, Daryna Dementieva, Murja Sani Gadanya, Robert Geislinger, Bela Gipp, Oumaima Hourrane, Oana Ignat, Falalu Ibrahim Lawan, Rooweither Mabuya, Rahmad Mahendra, Vukosi Marivate, Alexander Panchenko, Andrew Piper, Charles Henrique Porto Ferreira, Vitaly Protasov, Samuel Rutunda, Manish Shrivastava, Aura Cristina Udrea, Lilian Diana Awuor Wanzare, Sophie Wu, Florian Valentin Wunderlich, Hanif Muhammad Zhafran, Tianhui Zhang, Yi Zhou, Saif M. Mohammad (ACL, 2025) -- Best Resource Paper🏅
- Do Language Models Understand Honorific Systems in Javanese? Mohammad Rifqi Farhansyah, Iwan Darmawan, Adryan Kusumawardhana, Genta Indra Winata, Alham Fikri Aji, Derry Tanti Wijaya (ACL, 2025)
- NusaAksara: A Multimodal and Multilingual Benchmark for Preserving Indonesian Indigenous Scripts. Muhammad Farid Adilazuarda, Musa Izzanardi Wijanarko, Lucky Susanto, Khumaisa Nur'aini, Derry Tanti Wijaya, Alham Fikri Aji (ACL, 2025)
- KazMMLU: Evaluating Language Models on Kazakh, Russian, and Regional Knowledge of Kazakhstan. Mukhammed Togmanov, Nurdaulet Mukhituly, Diana Turmakhan, Jonibek Mansurov, Maiya Goloburda, Akhmed Sakip, Zhuohan Xie, Yuxia Wang, Bekassyl Syzdykov, Nurkhan Laiyk, Alham Fikri Aji, Ekaterina Kochmar, Preslav Nakov, Fajri Koto (ACL, 2025)
- Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia. Samuel Cahyawijaya, Holy Lovenia, Joel Ruben Antony Moniz, Tack Hwa Wong, Mohammad Rifqi Farhansyah, Thant Thiri Maung, Frederikus Hudi, David Anugraha, Muhammad Ravi Shulthan Habibi, Muhammad Reza Qorib, Amit Agarwal, Joseph Marvin Imperial, Hitesh Laxmichand Patel, Vicky Feliren, Bahrul Ilmi Nasution, Manuel Antonio Rufino, Genta Indra Winata, Rian Adam Rajagede, Carlos Rafael Catalan, Mohamed Fazli Mohamed Imam, Priyaranjan Pattnayak, Salsabila Zahirah Pranida, Kevin Pratama, Yeshil Bangera, Adisai Na-Thalang, Patricia Nicole Monderin, Yueqi Song, christian simon, Lynnette Hui Xian Ng, Richardy Lobo Sapan, Bin Wang, Supryadi, Kanyakorn Veerakanjana, Piyalitt Ittichaiwong, Matthew Theodore Roque, Karissa Vincentio, Takdanai Kreangphet, Phakphum Artkaew, Kadek Hendrawan Palgunadi, Yanzhi Yu, Rochana Prih Hastuti, William Nixon, Mithil Bangera, Adrian Xuan Wei Lim, Aye Hninn Khine, Hanif Muhammad Zhafran, Teddy Ferdinan, Audra Aurora Izzani, Ayushman Singh, Evan, Jauza Akbar Krito, Michael Anugraha, Fenal Ashokbhai Ilasariya, Haochen Li, John Amadeo Daniswara, Filbert Aurelian Tjiaranata, Eryawan Presma Yulianrifat, Can Udomcharoenchaikit, Fadil Risdian Ansori, Mahardika Krisna Ihsani, Giang Nguyen, Anab Maulana Barik, Dan John Velasco, Rifo Ahmad Genadi, Saptarshi Saha, Chengwei Wei, Isaiah Edri W. Flores, Kenneth Chen Ko Han, Anjela Gail D. Santos, Wan Shen Lim, Kaung Si Phyo, Tim Santos, Meisyarah Dwiastuti, Jiayun Luo, Jan Christian Blaise Cruz, Ming Shan Hee, Ikhlasul Akmal Hanif, M.Alif Al Hakim, Muhammad Rizky Sya'ban, Kun Kerdthaisong, Lester James Validad Miranda, Fajri Koto, Tirana Noor Fatyanosa, Alham Fikri Aji, Jostin Jerico Rosal, Jun Kevin, Robert Wijaya, Onno P. Kampman, Ruochen Zhang, Börje F. Karlsson, Peerat Limkonchotiwat (ACL, 2025)
- Statement-Tuning Enables Efficient Cross-lingual Generalization in Encoder-only Models. Ahmed Elshabrawy, Thanh-Nhi Nguyen, Yeeun Kang, Lihan Feng, Annant Jain, Faadil Abdullah Shaikh, Jonibek Mansurov, Mohamed Fazli Mohamed Imam, Jesus-German Ortiz-Barajas, Rendi Chevi, Alham Fikri Aji (ACL, 2025)
- A Multi-Labeled Dataset for Indonesian Discourse: Examining Toxicity, Polarization, and Demographics Information. Lucky Susanto, Musa Izzanardi Wijanarko, Prasetia Anugrah Pratama, Zilu Tang, Fariz Akyas, Traci Hong, Ika Karlina Idris, Alham Fikri Aji, Derry Tanti Wijaya (ACL, 2025)
- Thank You, Stingray: Multilingual Large Language Models Can Not (Yet) Disambiguate Cross-Lingual Word Senses. Samuel Cahyawijaya, Ruochen Zhang, Jan Christian Blaise Cruz, Holy Lovenia, Elisa Gilbert, Hiroki Nomoto, Alham Fikri Aji (NAACL, 2025)
- MLKV: Multi-Layer Key-Value Heads for Memory Efficient Transformer Decoding. Zayd Muhammad Kawakibi Zuhri, Muhammad Farid Adilazuarda, Ayu Purwarianti, Alham Fikri Aji (NAACL, 2025)
- Enabling Natural Zero-Shot Prompting on Encoder Models via Statement-Tuning. Ahmed Elshabrawy, Yongxin Huang, Iryna Gurevych, Alham Fikri Aji (NAACL, 2025)
- WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines. Genta Indra Winata, Frederikus Hudi, Patrick Amadeus Irawan, David Anugraha, Rifki Afina Putri, WANG YUTONG, Adam Nohejl, Ubaidillah Ariq Prathama, Nedjma Ousidhoum, Afifa Amriani, Anar Rzayev, Anirban Das, Ashmari Pramodya, Aulia Adila, Bryan Wilie, Candy Olivia Mawalim, CHENG Ching Lam, Daud Abolade, Emmanuele Chersoni, Enrico Santus, Fariz Ikhwantri, Garry Kuwanto, Hanyang Zhao, Haryo Akbarianto Wibowo, Holy Lovenia, Jan Christian Blaise Cruz, Jan Wira Gotama Putra, Junho Myung, Lucky Susanto, Maria Angelica Riera Machin, Marina Zhukova, Michael Anugraha, Muhammad Farid Adilazuarda, Natasha Christabelle Santosa, Peerat Limkonchotiwat, Raj Dabre, Rio Alexander Audino, Samuel Cahyawijaya, Shi-Xiong Zhang, Stephanie Yulia Salim, Yi Zhou, Yinxuan Gui, David Ifeoluwa Adelani, En-Shiun Annie Lee, Shogo Okada, Ayu Purwarianti, Alham Fikri Aji, Taro Watanabe, Derry Tanti Wijaya, Alice Oh, Chong-Wah Ngo (NAACL, 2025) -- Theme Paper Award🏅
- Style Over Substance: Evaluation Biases for Large Language Models. Minghao Wu, Alham Fikri Aji (COLING, 2025)
- From Multiple-Choice to Extractive QA: A Case Study for English and Arabic. Teresa Lynn, Malik H. Altakrori, Samar M. Magdy, Rocktim Jyoti Das, Chenyang Lyu, Mohamed Nasr, Younes Samih, Kirill Chirkunov, Alham Fikri Aji, Preslav Nakov, Shantanu Godbole, Salim Roukos, Radu Florian and Nizar Habash (COLING, 2025)
- CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark. David Romero, Chenyang Lyu, Haryo Akbarianto Wibowo, ... , Alham Fikri Aji (NeurIPS, 2024)
- LinguAlchemy: Fusing Typological and Geographical Elements for Unseen Language Generalization. Muhammad Farid Adilazuarda, Samuel Cahyawijaya, Genta Indra Winata, Ayu Purwarianti, Alham Fikri Aji (EMNLP, 2024)
- Cultural Conditioning or Placebo? On the Effectiveness of Socio-Demographic Prompting. Sagnik Mukherjee, Muhammad Farid Adilazuarda, Sunayana Sitaram, Kalika Bali, Alham Fikri Aji, Monojit Choudhury (EMNLP, 2024)
- Re-Evaluating Evaluation for Multilingual Summarization. Jessica Zosa Forde, Ruochen Zhang, Lintang Sutawika, Alham Fikri Aji, Samuel Cahyawijaya, Genta Indra Winata, Minghao Wu, Carsten Eickhoff, Stella Biderman, Ellie Pavlick (EMNLP, 2024)
- LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection. Mervat Abassy, Kareem Elozeiri, Alexander Aziz, Minh Ngoc Ta, Raj Vardhan Tomar, Bimarsha Adhikari, Saad El Dine Ahmed, Yuxia Wang, Osama Mohammed Afzal, Zhuohan Xie, Jonibek Mansurov, Ekaterina Artemova, Vladislav Mikhailov, Rui Xing, Jiahui Geng, Hasan Iqbal, Zain Muhammad Mujahid, Tarek Mahmoud, Akim Tsvigun, Alham Fikri Aji, Artem Shelmanov, Nizar Habash, Iryna Gurevych, Preslav Nakov (EMNLP System Demonstrations, 2024)
- Efficient and Interpretable Grammatical Error Correction with Mixture of Experts. Muhammad Reza Qorib, Alham Fikri Aji, Hwee Tou Ng (EMNLP, 2024)
- Towards Measuring and Modeling “Culture” in LLMs: A Survey. Muhammad Farid Adilazuarda, Sagnik Mukherjee, Pradhyumna Lavania, Siddhant Shivdutt Singh, Alham Fikri Aji, Jacki O’Neill, Ashutosh Modi, Monojit Choudhury (EMNLP, 2024)
- SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages. Holy Lovenia, Rahmad Mahendra, Salsabil Maulana Akbar, Lester James Validad Miranda, Jennifer Santoso, Elyanah Aco, Akhdan Fadhilah, Jonibek Mansurov, Joseph Marvin Imperial, Onno P. Kampman, Joel Ruben Antony Moniz, Muhammad Ravi Shulthan Habibi, Frederikus Hudi, Jann Railey Montalan, ... , Peerat Limkonchotiwat, Alham Fikri Aji, Sedrick Keh, Genta Indra Winata, Ruochen Zhang, Fajri Koto, Zheng Xin Yong, Samuel Cahyawijaya (EMNLP, 2024)
- Cendol: Open Instruction-tuned Generative Large Language Models for Indonesian Languages. Samuel Cahyawijaya, Holy Lovenia, Fajri Koto, Rifki Afina Putri, Emmanuel Dave, Jhonson Lee, Nuur Shadieq, Wawan Cenggoro, Salsabil Maulana Akbar, Muhammad Ihza Mahendra, Dea Annisayanti Putri, Bryan Wilie, Genta Indra Winata, Alham Fikri Aji, Ayu Purwarianti, Pascale Fung (ACL, 2024)
- M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection. Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, Akim Tsvigun, Osama Mohammed Afzal, Tarek Mahmoud, Giovanni Puccetti, Thomas Arnold, Alham Fikri Aji, Nizar Habash, Iryna Gurevych, Preslav Nakov (ACL, 2024)
- SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 14 Languages. Nedjma Ousidhoum, Shamsuddeen Hassan Muhammad, Mohamed Abdalla, Idris Abdulmumin, Ibrahim Said Ahmad, Sanchit Ahuja, Alham Fikri Aji, Vladimir Araujo, Abinew Ali Ayele, Pavan Baswani, Meriem Beloucif, Chris Biemann, Sofia Bourhim, Christine De Kock, Genet Shanko Dekebo, Oumaima Hourrane, Gopichand Kanumolu, Lokesh Madasu, Samuel Rutunda, Manish Shrivastava, Thamar Solorio, Nirmal Surange, Hailegnaw Getaneh Tilaye, Krishnapriya Vishnubhotla, Genta Winata, Seid Muhie Yimam, Saif M Mohammad (ACL, 2024)
- Copal-ID: Indonesian Language Reasoning with Local Culture and Nuances. Haryo Akbarianto Wibowo, Erland Hilman Fuadi, Made Nindyatama Nityasya, Radityo Eko Prasojo, Alham Fikri Aji (NAACL, 2024)
- M4: Multi-generator, Multi-domain, and Multi-lingual Black-box Machine-generated Text Detection. Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, Akim Tsvigun, Chenxi Whitehouse, Osama Mohammed Afzal, Tarek Mahmoud, Toru Sasaki, Thomas Arnold, Alham Fikri Aji, Nizar Habash, Iryna Gurevych, Preslav Nakov (EACL, 2024) -- Best Resource Paper🏅
- A Paradigm Shift: The Future of Machine Translation Lies with Large Language Models. Chenyang Lyu, Zefeng Du, Jitao Xu, Yitao Duan, Minghao Wu, Teresa Lynn, Alham Fikri Aji, Derek F Wong, Longyue Wang (LREC, 2024)
- LaMini-LM: A Diverse Herd of Distilled Models from Large-scale Instructions. Minghao Wu, Abdul Waheed, Chiyu Zhang, Muhammad Abdul-Mageed, Alham Fikri Aji (EACL, 2024)
- LLM-powered Data Augmentation for Enhanced Crosslingual Performance. Chenxi Whitehouse, Monojit Choudhury, Alham Fikri Aji (EMNLP, 2023)
- Multilingual Large Language Models Are Not (Yet) Code-Switchers. Ruochen Zhang, Samuel Cahyawijaya, Jan Christian Blaise Cruz, Alham Fikri Aji (EMNLP, 2023)
- GlobalBench: A Benchmark for Global Progress in Natural Language Processing. Yueqi Song, Catherine Cui, Simran Khanuja, Pengfei Liu, Fahim Faisal, Alissa Ostapenko, Genta Indra Winata, Alham Fikri Aji, Samuel Cahyawijaya, Yulia Tsvetkov, Antonios Anastasopoulos, Graham Neubig (EMNLP, 2023)
- NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages. Samuel Cahyawijaya, Holy Lovenia, Fajri Koto, Dea Adhista, Emmanuel Dave, Sarah Oktavianti, Salsabil Maulana Akbar, Jhonson Lee, Nuur Shadieq, Tjeng Wawan Cenggoro, Hanung Wahyuning Linuwih, Bryan Wilie, Galih Pradipta Muridan, Genta Indra Winata, David Moeljadi, Alham Fikri Aji, Ayu Purwarianti, Pascale Fung (AACL, 2023) -- Best Resource Paper🏅
- Crosslingual Generalization through Multitask Finetuning. Niklas Muennighoff, Thomas Wang, Lintang Sutawika, Adam Roberts, Stella Biderman, Teven Le Scao, M Saiful Bari, Sheng Shen, Zheng Xin Yong, Hailey Schoelkopf, Xiangru Tang, Dragomir Radev, Alham Fikri Aji, Khalid Almubarak, Samuel Albanie, Zaid Alyafeai, Albert Webson, Edward Raff and Colin Raffel (ACL, 2023)
- On “Scientific Debt” in NLP: A Case for More Rigour in Language Model Pre-Training Research. Made Nindyatama Nityasya, Haryo Akbarianto Wibowo, Alham Fikri Aji, Genta Indra Winata, Radityo Eko Prasojo, Phil Blunsom and Adhiguna Kuncoro (ACL, 2023)
- WebIE: Faithful and Robust Information Extraction on the Web. Chenxi Whitehouse, Clara Vania, Alham Fikri Aji, Christos Christodoulopoulos and Andrea Pierleoni (ACL, 2023)
- BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting. Zheng-Xin Yong, Hailey Schoelkopf, Niklas Muennighoff, Alham Fikri Aji, David Ifeoluwa Adelani, Khalid Almubarak, M Saiful Bari, Lintang Sutawika, Jungo Kasai, Ahmed Baruwa, Genta Indra Winata, Stella Biderman, Edward Raff, Dragomir Radev, Vassilina Nikoulina (ACL, 2023)
- The Decades Progress on Code-Switching Research in NLP: A Systematic Survey on Trends and Challenges. Genta Indra Winata, Alham Fikri Aji, Zheng Xin Yong and Thamar Solorio (ACL, 2023)
- NusaCrowd: Open Source Initiative for Indonesian NLP Resources. Samuel Cahyawijaya, Holy Lovenia, Alham Fikri Aji, Genta Indra Winata, Bryan Wilie, Fajri Koto, Rahmad Mahendra, et al. (ACL, 2023)
- Direct Fact Retrieval from Knowledge Graphs without Entity Linking. Jinheon Baek, Alham Fikri Aji, Jens Lehmann and Sung Ju Hwang (ACL, 2023)
- Multi-lingual and Multi-cultural Figurative Language Understanding. Anubha Kabra, Emmy Liu, Simran Khanuja, Alham Fikri Aji, Genta Indra Winata, Samuel Cahyawijaya, Anuoluwapo Aremu, Perez Ogayo and Graham Neubig (ACL, 2023)
- NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages. Genta Indra Winata, Alham Fikri Aji, Samuel Cahyawijaya, Rahmad Mahendra, Fajri Koto, Ade Romadhony, Kemal Kurniawan, David Moeljadi, Radityo Eko Prasojo, Pascale Fung, Timothy Baldwin, Jey Han Lau, Rico Sennrich, Sebastian Ruder (EACL, 2023) -- Outstanding Paper Award🏅
- Mintaka: A Complex, Natural, and Multilingual Dataset for End-to-End Question Answering. Priyanka Sen, Alham Fikri Aji, Amir Saffari (COLING, 2022)
- REDTab: A Relation Extraction Dataset for Knowledge Extraction from Web Tables. Siffi Singh, Alham Fikri Aji, Gaurav Singh, Christos Christodoulopoulos (COLING, 2022)
- One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia. Alham Fikri Aji, Genta Indra Winata, Fajri Koto, Samuel Cahyawijaya, Ade Romadhony, Rahmad Mahendra, Kemal Kurniawan, David Moeljadi, Radityo Eko Prasojo, Timothy Baldwin, Jey Han Lau, Sebastian Ruder (ACL, 2022)
- IndoNLI: A Natural Language Inference Dataset for Indonesian. Rahmad Mahendra, Alham Fikri Aji, Samuel Louvan, Fahrurrozi Rahman, Clara Vania (EMNLP, 2021)
- ParaCotta: Synthetic Multilingual Paraphrase Corpora from the Most Diverse Translation Sample Pair. Alham Fikri Aji, Radityo Eko Prasojo, Tirana Noor Fatyanosa, Philip Arthur, Suci Fitriany, Salma Qonitah, Nadhifa Zulfa, Tomi Santoso, Mahendra Data (PACLIC, 2021)
- IndoCollex: A Testbed for Morphological Transformation of Indonesian Word Colloquialism. Haryo Akbarianto Wibowo, Made Nindyatama Nityasya, Afra Feyza Akyürek, Suci Fitriany, Alham Fikri Aji, Radityo Eko Prasojo, Derry Tanti Wijaya (ACL-IJCNLP, 2021)
- In Neural Machine Translation, What Does Transfer Learning Transfer? Alham Fikri Aji, Nikolay Bogoychev, Kenneth Heafield, Rico Sennrich (ACL, 2020)
- Semi-Supervised Low-Resource Style Transfer of Indonesian Informal to Formal Language with Iterative Forward-Translation. Haryo Akbarianto Wibowo, Tatag Aziz Prawiro, Muhammad Ihsan, Alham Fikri Aji, Radityo Eko Prasojo, Rahmad Mahendra, Suci Fitriany (IALP, 2020)
- Combining Global Sparse Gradients with Local Gradients in Distributed Neural Network Training. Alham Fikri Aji, Kenneth Heafield, Nikolay Bogoychev (EMNLP, 2019)
- Accelerating Asynchronous Stochastic Gradient Descent for Neural Machine Translation. Nikolay Bogoychev, Marcin Junczys-Dowmunt, Kenneth Heafield, Alham Fikri Aji (EMNLP, 2018)
- Marian: Fast Neural Machine Translation in C++. Marcin Junczys-Dowmunt, Roman Grundkiewicz, Tomasz Dwojak, Hieu Hoang, Kenneth Heafield, Tom Neckermann, Frank Seide, Ulrich Germann, Alham Fikri Aji, Nikolay Bogoychev, Andre Martins, Alexandra Birch (ACL, 2018)
- Toward a Standardized and More Accurate Indonesian Part-of-Speech Tagging. Kemal Kurniawan, Alham Fikri Aji (IALP, 2018)
- Sparse Communication for Distributed Gradient Descent. Alham Fikri Aji, Kenneth Heafield (EMNLP, 2017)