Projects
Research projects I work on or have worked on:
Current
- NG-NLG – Language generation with neural & symbolic methods (ERC StG, 2022-2027)
- THEaiTRE – Automatically generating a theatre play (Czech Technical Agency, 2020-2022)
- EDU-AI – Education chatbot assistant (Czech Technical Agency, 2021-2023)
Past
- NaMuDDiS – multi-domain dialogue systems (2019-2021)
- METOD – dialogue management (industry cooperation with Agnostix co-funded by City of Prague, 2020)
- MaDrIgAL – spoken dialogue systems (and natural language generation, 2016-2018)
- Alexa Prize Challenge – chatbots (2017-2018)
- DILiGENt – natural language generation (2016-2018)
- AdaNLG – adaptive natural language generator (2014–2016)
- Vystadial – spoken dialogue systems (2012–2016)
- QTLeap – semantic machine translation (2013–2016)
- Khresmoi – medical information retrieval (working on machine translation, 2013–2014)
- FAUST – improving machine translation fluency (2011–2013)
Tools
Open-source software I (co-)built:
- RatPred – trainable NLG evaluation tool
- TGen – a statistical natural language generator
- Treex – a modular NLP toolkit
- Alex – spoken dialogue system framework
- MTMonkey – machine translation web services infrastructure
- Flect – statistical morphology generation
Students
Students I supervise:
- Vojtěch Hudeček (Ph.D. with Zdeněk Žabokrtský, since 2018)
- Zdeněk Kasner (Ph.D. since 2019)
- Sourabrata Mukherjee (Ph.D. since 2019)
- Daniel Štancl (Ph.D. since 2020)
- Ondřej Plátek (Ph.D. since 2021)
- Patrícia Schmidtová (completed BSc. with Vojtěch Hudeček, 2018–2019; MSc. since 2020)
- Jaroslav Šafář (MSc. since 2021)
- Ondřej Motlíček (MSc. since 2021)
- František Trebuňa (MSc. since 2021)
- Jiří Balhar (MSc. since 2022)
- Peter Grajcar (MSc. since 2022)
- Nalin Kumar (MSc. since 2022)
- Kristína Szabová (MSc. since 2022)
- Saad Obaid (MSc., LCT with Uni Saarbrücken, with Vera Demberg & Iza Škrjanec, since 2022)
- Jakub Růžička (BSc. with Jan Cuřín & Martin Čmejrek, since 2022)
My former students:
- Shubham Agarwal (Ph.D. at Heriot-Watt, with Verena Rieser & Ioannis Konstas, 2017-2019)
- Vojtěch John (completed BSc. 2021-2022)
- Jonáš Kulhánek (completed MSc. 2020-2021)
- Tomáš Nekvinda (completed MSc. 2019–2020; Ph.D. 2020-2022)
- Borek Požár (completed BSc. with Martin Čmejrek & Jan Cuřín, 2020-2021)
- Jan Vainer (completed MSc. 2019–2020)
- Xinnuo Xu (completed Ph.D., CDT Robotics Edinburgh, with Verena Rieser & Ioannis Konstas, 2016–2021)
If you're interested in doing a bachelor's/master's thesis or a PhD with me, please email me.
Publications
Papers
2024
Adam Wojciechowski, Mateusz Lango, Ondrej Dusek. Faithful and Plausible Natural Language Explanations for Image Classification: A Pipeline Approach, in: Findings of the Association for Computational Linguistics: EMNLP 2024. Anthology Github Poster
Simone Balloccu, Ehud Reiter, Karen Jia-Hui Li, Rafael Sargsyan, Vivek Kumar, Diego Reforgiato, Daniele Riboni, Ondrej Dusek. Ask the experts: sourcing a high-quality nutrition counseling dataset through Human-AI collaboration, in: Findings of the Association for Computational Linguistics: EMNLP 2024. Anthology Github Poster
Sourabrata Mukherjee, Atul Kr. Ojha, Akanksha Bansal, Deepak Alok, John P. McCrae, Ondrej Dusek. Multilingual Text Style Transfer: Datasets & Models for Indian Languages, in: Proceedings of the 17th International Natural Language Generation Conference. Anthology Github (code) Github (data) Poster
Sourabrata Mukherjee, Atul Kr. Ojha, Ondrej Dusek. Are Large Language Models Actually Good at Text Style Transfer?, in: Proceedings of the 17th International Natural Language Generation Conference. Anthology Slides Github
Patricia Schmidtova, Saad Mahamood, Simone Balloccu, Ondrej Dusek, Albert Gatt, Dimitra Gkatzia, David M. Howcroft, Ondrej Platek, Adarsa Sivaprasad. Automatic Metrics in Natural Language Generation: A survey of Current Evaluation Practices, in: Proceedings of the 17th International Natural Language Generation Conference. Anthology Github Slides
Jędrzej Warczyński, Mateusz Lango, Ondrej Dusek. Leveraging Large Language Models for Building Interpretable Rule-Based Data-to-Text Systems, in: Proceedings of the 17th International Natural Language Generation Conference. Anthology Github Slides
Zdeněk Kasner, Ondrej Platek, Patricia Schmidtova, Simone Balloccu, Ondrej Dusek. factgenie: A Framework for Span-based Evaluation of Generated Texts, in: Proceedings of the 17th International Natural Language Generation Conference: System Demonstrations. Anthology Github PyPI Poster
Zdeněk Kasner, Ondrej Dusek. Beyond Traditional Benchmarks: Analyzing Behaviors of Open LLMs on Data-to-Text Generation, in: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Anthology Website Github Poster Slides
Nalin Kumar, Ondrej Dusek. LEEETs-Dial: Linguistic Entrainment in End-to-End Task-oriented Dialogue systems, in: Findings of the Association for Computational Linguistics: NAACL 2024. Anthology Github Poster
Simone Balloccu, Patrícia Schmidtová, Mateusz Lango, Ondrej Dusek. Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed-Source LLMs, in: Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers). Anthology Website Poster Slides
Mateusz Lango, Patrícia Schmidtová, Simone Balloccu, Ondřej Dušek. ReproHum #0043-4: Evaluating Summarization Models: Investigating the Impact of Education and Language Proficiency on Reproducibility, in: The 4th Workshop on Human Evaluation of NLP Systems (HumEval’24). Anthology Slides Poster
2023
Zdeněk Kasner, Ioannis Konstas, Ondřej Dušek. Mind the Labels: Describing Relations in Knowledge Graphs With Pretrained Models, in: EACL. Anthology Github Poster
Zdeněk Kasner, Ekaterina Garanina, Ondřej Plátek, Ondřej Dušek. TabGenie: A Toolkit for Table-to-Text Generation, in: ACL Demo. Anthology Demo PyPi Poster
Mateusz Lango, Ondrej Dusek. Critic-Driven Decoding for Mitigating Hallucinations in Data-to-text Generation, in: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. Anthology Github Slides
Belz et al. (29 authors). Missing Information, Unresponsive Authors, Experimental Flaws: The Impossibility of Assessing the Reproducibility of Previous Human Evaluations in NLP, in: Workshop on Insights from Negative Results in NLP. Anthology
Vojtěch Hudeček, Ondrej Dusek. Are Large Language Models All You Need for Task-Oriented Dialogue?, in: Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue. Anthology Github Slides
Sourabrata Mukherjee, Ondrej Dusek. Leveraging Low-resource Parallel Data for Text Style Transfer, in: Proceedings of the 16th International Natural Language Generation Conference. Anthology Github Slides
Saad Obaid ul Islam, Iza Škrjanec, Ondrej Dusek, Vera Demberg. Tackling Hallucinations in Neural Chart Summarization, in: Proceedings of the 16th International Natural Language Generation Conference. Anthology Github Poster
Emiel van Miltenburg, Miruna Clinciu, Ondřej Dušek, Dimitra Gkatzia, Stephanie Inglis, Leo Leppänen, Saad Mahamood, Stephanie Schoch, Craig Thomson, Luou Wen. Barriers and enabling factors for error analysis in NLG research, in: Northern European Journal of Language Technology. Paper text
Ondřej Plátek, Ondrej Dusek. MooseNet: A Trainable Metric for Synthesized Speech with a PLDA Module, in: 12th ISCA Speech Synthesis Workshop (SSW2023). Paper text Slides Github
Ondrej Platek, Mateusz Lango, Ondrej Dusek. With a Little Help from the Authors: Reproducing Human Evaluation of an MT Error Detector, in: Proceedings of the 3rd Workshop on Human Evaluation of NLP Systems. Anthology Github Slides
Sourabrata Mukherjee, Akanksha Bansal, Pritha Majumdar, Atul Kr. Ojha, Ondřej Dušek. Low-Resource Text Style Transfer for Bangla: Data & Models, in: Proceedings of the First Workshop on Bangla Language Processing (BLP-2023). Anthology Slides Github (code) Github (data)
Sourabrata Mukherjee, Vojtěch Hudeček, Ondřej Dušek. Polite Chatbot: A Text Style Transfer Application, in: Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop. Anthology Github Poster Slides
Nalin Kumar, Saad Obaid Ul Islam, Ondrej Dusek. Better Translation + Split and Generate for Multilingual RDF-to-Text (WebNLG 2023), in: Proceedings of the Workshop on Multimodal, Multilingual Natural Language Generation and Multilingual WebNLG Challenge (MM-NLG 2023). Anthology Github Slides
Ondřej Plátek, Vojtech Hudecek, Patricia Schmidtova, Mateusz Lango, Ondrej Dusek. Three Ways of Using Large Language Models to Evaluate Chat, in: Proceedings of The Eleventh Dialog System Technology Challenge. Anthology Poster Github
František Trebuňa, Ondrej Dusek. VisuaLLM: Easy Web-based Visualization for Neural Language Generation, in: Proceedings of the 16th International Natural Language Generation Conference: System Demonstrations. Anthology Poster Github
2022
Zdeněk Kasner, Ondřej Dušek. Neural Pipeline for Zero-Shot Data-to-Text Generation, in: ACL Anthology Github Poster
Tomáš Nekvinda, Ondřej Dušek. AARGH! End-to-end Retrieval-Generation for Task-Oriented Dialog, in: SIGdial. arXiv video Github
Sourabrata Mukherjee, Zdeněk Kasner, Ondřej Dušek. Balancing the Style-Content Trade-Off in Sentiment Transfer Using Polarity-Aware Denoising, in: Text, Speech and Dialogue SpringerLink
Vojtěch Hudeček, Léon-Paul Schaub, Daniel Stancl, Patrick Paroubek, Ondřej Dušek. A Unifying View On Task-oriented Dialogue Annotation, in: LREC Anthology Github
Vojtěch Hudeček, Ondřej Dušek. Learning Interpretable Latent Dialogue Actions With Less Supervision, in: AACL-IJCNLP arXiv Github
Rudolf Rosa, Patrícia Schmidtová, Ondřej Dušek, Tomáš Musil, David Mareček, Saad Obaid, Marie Nováková, Klára Vosecká, Josef Doležal. GPT-2-based Human-in-the-loop Theatre Play Script Generation, in: Workshop on Narrative Understanding Anthology
Rudolf Rosa, Patrícia Schmidtová, Alisa Zakhtarenko, Ondrej Dusek, Tomáš Musil, David Mareček, Saad Obaid Ul Islam, Marie Nováková, Klára Vosecká, Daniel Hrbek, David Košťák. GPT-2-based Human-in-the-loop Theatre Play Script Generation, in: INLG Anthology Github
Rudali Huidrom, Ondřej Dušek, Zdeněk Kasner, Thiago Castro Ferreira, Anya Belz. Two Reproductions of a Human-Assessed Comparative Evaluation of a Semantic Error Detection System, in: INLG GenChal Anthology
Gabor Baranyi, Bruno Carlos Dos Santos Melício, Zsófia Gaál, Levente Hajder, András Simonyi, Dániel Sindely, Joul Skaf, Ondřej Dušek, Tomáš Nekvinda, András Lőrincz. AI Technologies for Machine Supervision and Help in a Rehabilitation Scenario, in: Multimodal Technologies and Interaction 6(7) Web
2021
- Jonáš Kulhánek, Vojtěch Hudeček, Tomáš Nekvinda, Ondřej Dušek. AuGPT: Auxiliary Tasks and Data Augmentation for End-To-End Dialogue with Pre-Trained Language Models, in: NLP4ConvAI Workshop. arXiv
- Xinnuo Xu, Ondřej Dušek, Shashi Narayan, Verena Rieser, Ioannis Konstas. MiRANews: Dataset and Benchmarks for Multi-Resource-Assisted News Summarization, In: EMNLP Findings. Anthology
- Emiel van Miltenburg, Miruna Clinciu, Ondřej Dušek, Dimitra Gkatzia, Stephanie Inglis, Leo Leppänen, Saad Mahamood, Emma Manning, Stephanie Schoch, Craig Thomson, Luou Wen. Underreporting of errors in NLG output, and what to do about it, In: INLG (Commendation for an outstanding position paper). Anthology
- Zdeněk Kasner, Simon Mille and Ondřej Dušek. Text-in-Context: Token-Level Error Detection for Table-to-Text Generation, In: INLG Anthology Poster.
- Vojtěch Hudeček, Ondřej Dušek and Zhou Yu. Discovering Dialogue Slots with Weak Supervision, In: ACL. Anthology
- Xinnuo Xu, Ondřej Dušek, Verena Rieser and Ioannis Konstas. AggGen: Ordering and Aggregating while Generating, In: ACL. Anthology
- Tomáš Nekvinda and Ondřej Dušek. Shades of BLEU, Flavours of Success: The Case of MultiWOZ, In: GEM Workshop. Anthology
- Sebastian Gehrmann et al. (50+ authors). The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics, In: GEM Workshop. Anthology
- Léon-Paul Schaub, Vojtěch Hudeček, Daniel Štancl, Ondřej Dušek and Patrick Paroubek. Defining And Detecting Inconsistent System Behavior in Task-oriented Dialogues, In: TALN-RECITAL. Anthology
2020
- Ondřej Dušek and Zdeněk Kasner. Evaluating Semantic Accuracy of Data-to-Text Generation with Natural Language Inference, In: INLG (Best Paper Award). ACL Anthology Video Code
- Zdeněk Kasner and Ondřej Dušek. Data-to-Text Generation with Iterative Text Editing, In: INLG. ACL Anthology
- Zdeněk Kasner and Ondřej Dušek. Train Hard, Finetune Easy: Multilingual Denoising for RDF-to-Text Generation, In: WebNLG+ Workshop. PDF
- Jindřich Libovický, Zdeněk Kasner, Jindřich Helcl, and Ondřej Dušek. Expand and Filter: CUNI and LMU Systems for the WNGT 2020 Duolingo Shared Task, In: WNGT Workshop. ACL Anthology
- Tomáš Nekvinda and Ondřej Dušek. One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech, In: Interspeech. ISCA Archive Code, samples and demo
- Jan Vainer and Ondřej Dušek. SpeedySpeech: Efficient Neural Speech Synthesis, In: Interspeech. ISCA Archive Code and samples
- Xinnuo Xu, Ondřej Dušek, Jingyi Li, Verena Rieser, and Ioannis Konstas. Fact-based Content Weighting for Evaluating Abstractive Summarisation, In: ACL. ACL Anthology Video Code
2019
- Ondřej Dušek, Jekaterina Novikova, and Verena Rieser. Evaluating the State-of-the-Art of End-to-End Natural Language Generation: The E2E NLG Challenge, In: Computer Speech and Language. ScienceDirect arXiv Web
- Ondřej Dušek, Karin Sevegnani, Ioannis Konstas, and Verena Rieser. Automatic Quality Estimation for Natural Language Generation: Ranting (Jointly Rating and Ranking), In: INLG, Tokyo. arXiv slides Github
- Ondřej Dušek, David M. Howcroft, and Verena Rieser. Semantic Noise Matters for Neural Natural Language Generation, In: INLG, Tokyo. PDF poster Github
- Ondřej Dušek and Filip Jurčíček. Neural Generation for Czech: Data and Baselines, In: INLG, Tokyo. arXiv slides Github (data) Github (code)
- Simon Keizer, Ondřej Dušek, Xingkun Liu, and Verena Rieser. User Evaluation of a Multi-dimensional Statistical Dialogue System, In: SIGDIAL, Stockholm. ACL arXiv Poster code
2018
- Ondřej Dušek, Jekaterina Novikova, and Verena Rieser. Findings of the E2E NLG Challenge, In: INLG, Tilburg. arXiv Web Slides
- Xinnuo Xu, Ondřej Dušek, Ioannis Konstas, and Verena Rieser. Better Conversations by Modeling, Filtering, and Optimizing for Coherence and Diversity, In: EMNLP, Brussels. arXiv Github Poster
- Jekaterina Novikova, Ondřej Dušek, and Verena Rieser. RankME: Reliable Human Ratings for Natural Language Generation, In: NAACL, New Orleans. arXiv Poster Github
- Shubham Agarwal, Ondřej Dušek, Ioannis Konstas, and Verena Rieser. Improving Context Modelling in Multimodal Dialogue Generation, In: INLG, Tilburg. arXiv Github Poster
- Shubham Agarwal, Ondřej Dušek, Ioannis Konstas, and Verena Rieser. A Knowledge-Grounded Multimodal Search-Based Conversational Agent, In: SCAI EMNLP workshop, Brussels. arXiv Github Poster
- Igor Shalyminov, Ondřej Dušek, and Oliver Lemon. Neural Response Ranking for Social Conversation: A Data-Efficient Approach, In: SCAI EMNLP workshop, Brussels. arXiv Github Slides
2017
- Ondřej Dušek, Jekaterina Novikova, and Verena Rieser. Referenceless Quality Estimation for Natural Language Generation, In: ICML Workshop on Learning to Generate Natural Language, Sydney. arXiv Poster Slides Github
- Jekaterina Novikova, Ondřej Dušek, Amanda Cercas Curry, and Verena Rieser. Why We Need New Evaluation Metrics for NLG, In: EMNLP, Copenhagen. arXiv Github
- Jekaterina Novikova, Ondřej Dušek, and Verena Rieser. The E2E Dataset: New Challenges For End-to-End Generation, In: SIGDIAL, Saarbrücken. arXiv Poster Slides Video Web
- Jekaterina Novikova, Ondřej Dušek, and Verena Rieser. Data-driven Natural Language Generation: Paving the Road to Success, In: WiNLP, Vancouver. arXiv
- Ioannis Papaioannou, Amanda Cercas Curry, Jose L. Part, Igor Shalyminov, Xinnuo Xu, Yanchao Yu, Ondřej Dušek, Verena Rieser, and Oliver Lemon. An Ensemble Model with Ranking for Social Dialogue, In: NIPS 2017 Workshop on Conversational AI, Long Beach. arXiv
- Ioannis Papaioannou, Amanda Cercas Curry, Jose L. Part, Igor Shalyminov, Xinnuo Xu, Yanchao Yu, Ondřej Dušek, Verena Rieser, and Oliver Lemon. Alana: Social Dialogue using an Ensemble Model and a Ranker trained on User Feedback, In: Alexa Prize, Las Vegas. PDF
2016
- Ondřej Dušek and Filip Jurčíček. A Context-aware Natural Language Generator for Dialogue Systems, In: SIGDIAL, Los Angeles. PDF arXiv Github
- Ondřej Dušek and Filip Jurčíček. Sequence-to-Sequence Generation for Spoken Dialogue via Deep Syntax Trees and Strings, In: ACL, Berlin. PDF arXiv Github
- Ondřej Bojar, Ondřej Dušek, Tom Kocmi, Jindřich Libovický, Michal Novák, Martin Popel, Roman Sudarikov, and Dušan Variš. CzEng 1.6: Enlarged Czech-English Parallel Corpus with Processing Tools Dockered, In: TSD, Brno. SpringerLink
- Rudolf Rosa, Martin Popel, Ondřej Bojar, David Mareček, and Ondřej Dušek. Moses & Treex Hybrid MT Systems Bestiary, In: Deep MT Workshop, Lisbon. PDF
- Roman Sudarikov, Ondřej Bojar, Ondřej Dušek, Martin Holub, and Vincent Kríž. Verb Sense Disambiguation in Machine Translation, In: HyTra-6, Osaka. PDF
- Ondřej Dušek and Filip Jurčíček. A Context-aware Natural Language Generation Dataset for Dialogue Systems, In: RE-WOCHAT, Portorož. PDF Slides
2015
- Rudolf Rosa, Ondřej Dušek, Michal Novák, and Martin Popel. Translation Model Interpolation for Domain Adaptation in TectoMT, In: Deep MT Workshop, Prague. PDF Slides
- Ondřej Dušek, Luís Gomes, Michal Novák, Martin Popel, and Rudolf Rosa. New Language Pairs in TectoMT, In: WMT, Lisbon. PDF Poster
- Ondřej Dušek and Filip Jurčíček. Training a Natural Language Generator from Unaligned Data, In: ACL-IJCNLP, Beijing. PDF Slides Poster Video Github
- Ondřej Dušek, Eva Fučíková, Jan Hajič, Martin Popel, Jana Šindlerová, and Zdeňka Urešová. Using Parallel Texts and Lexicons for Verbal Word Sense Disambiguation, In: Depling, Uppsala. PDF Slides
- Zdeňka Urešová, Ondřej Dušek, Eva Fučíková, Jan Hajič, and Jana Šindlerová. Bilingual English-Czech Valency Lexicon Linked to a Parallel Corpus, In: LAW IX, Denver. PDF
2014
- Daniela Majchráková, Ondřej Dušek, Jan Hajič, Agáta Karčová, Radovan Garabík. Semi-automatic Detection of Multiword Expressions in the Slovak Dependency Treebank, In: Computational Linguistics in Bulgaria, Sofia. PDF
- Daniel Zeman, Ondřej Dušek, David Mareček, Martin Popel, Loganathan Ramasamy, Jan Štěpánek, Zdeněk Žabokrtský and Jan Hajič. HamleDT: Harmonized multi-language dependency treebank, in: Language Resources and Evaluation (48) 4, December 2014. SpringerLink
- Ondřej Dušek, Ondřej Plátek, Lukáš Žilka, and Filip Jurčíček. Alex: Bootstrapping a Spoken Dialogue System for a New Domain by Real Users, in: SIGDIAL, Philadelphia. PDF Poster
- Ondřej Dušek, Jan Hajič, Jaroslava Hlaváčová, Michal Novák, Pavel Pecina, Rudolf Rosa, Aleš Tamchyna, Zdeňka Urešová and Daniel Zeman. Machine Translation of Medical Texts in the Khresmoi Project, in: WMT, Baltimore. PDF
- Ondřej Dušek, Jan Hajič, and Zdeňka Urešová: Verbal Valency Frame Detection and Selection in Czech and English, in: EVENTS, Baltimore. PDF Poster
- Pavel Pecina, Ondřej Dušek, Lorraine Goeuriot, Jan Hajič, Jaroslava Hlaváčová, Gareth Jones, Liadh Kelly, Johannes Leveling, David Mareček, Michal Novák, Martin Popel, Rudolf Rosa, Aleš Tamchyna, and Zdeňka Urešová: Adaptation of Machine Translation for Multilingual Information Retrieval in the Medical Domain, in: Artificial Intelligence in Medicine (61) 3. ScienceDirect
- Matěj Korvas, Ondřej Plátek, Ondřej Dušek, Lukáš Žilka, and Filip Jurčíček: Free English and Czech Telephone Speech Corpus Shared Under the CC-BY-SA 3.0 License, in: LREC, Reykjavík. PDF Slides
- Zdeňka Urešová, Ondřej Dušek, Jan Hajič, and Pavel Pecina: Multilingual Test Sets for Machine Translation of Search Queries for Cross-lingual Information Retrieval in the Medical Domain, in: LREC, Reykjavík. PDF Poster
2013
- Ondřej Dušek, Filip Jurčíček: Robust Multilingual Statistical Morphological Generation Models, in: ACL Student Research Workshop, Sofia. PDF Slides Video Github
- Ondřej Dušek: Towards a Truly Statistical Natural Language Generator for Spoken Dialogues, in: Week of Doctoral Students, Prague. PDF Slides
- Aleš Tamchyna, Ondřej Dušek, Rudolf Rosa, Pavel Pecina: MTMonkey: A Scalable Infrastructure for a Machine Translation Web Service, in: The Prague Bulletin of Mathematical Linguistics 100, 31-40. PDF Poster Github
2012
- Ondřej Dušek, Zdeněk Žabokrtský, Martin Popel, Martin Majliš, Michal Novák, David Mareček: Formemes in English-Czech Deep Syntactic MT, in: WMT, Montréal. PDF
- Rudolf Rosa, David Mareček, Ondřej Dušek: DEPFIX: A System for Automatic Correction of Czech MT Outputs, in: WMT, Montréal. PDF
- Rudolf Rosa, Ondřej Dušek, David Mareček, Martin Popel: Using Parallel Features in Parsing of Machine-Translated Sentences for Correction of Grammatical Errors, in: SSST-6, Jeju. PDF
- Ondřej Bojar, Zdeněk Žabokrtský, Ondřej Dušek, Petra Galuščáková, Martin Majliš, David Mareček, Jiří Maršík, Michal Novák, Martin Popel, Aleš Tamchyna: The Joy of Parallelism with CzEng 1.0, in: LREC, Istanbul. PDF
Theses
- Novel Methods for Natural Language Generation in Spoken Dialogue Systems. Ph.D. Thesis, Faculty of Mathematics and Physics, Charles University, Prague, 2017. PDF Summary PDF slides
- Confrontation of Czech and German valency lexicons. Master's thesis, Faculty of Arts, Charles University in Prague, 2013. (in German) PDF
- Deep automatic analysis of English. Master's thesis, Faculty of Mathematics and Physics, Charles University in Prague, 2010. PDF
- BashCommander. Bachelor thesis, Faculty of Mathematics and Physics, Charles University in Prague, 2007. PDF
Talks
- Large Language Models for Dialogue Applications. 4EU+ AI Days, Charles University. June 13, 2024. PDF slides
- Large Language Models: How they work and what they are good for. Challenges of AI in Teaching Foreign Languages, Czech University of Life Sciences Prague. May 3, 2024. PDF slides
- Looking for LLMs' Limits in Dialogue & Data-to-text. SCICHAT Workshop at EACL. Mar 21, 2024 PDF slides
- Dialogue Systems (introduction). AI in HCI, Czech Technical University. Mar 8, 2024 PDF slides
- Getting Structure in Dialogue with Large Language Models. Hora Informaticae, Czech Academy of Sciences. Jan 23, 2024 PDF
- AI/Large Language Models. Jan 23, 2024 PDF slides
- Skipping Chit-chat with ChatGPT: Large Language Models and Structured Outputs. Lecture Series: Machines That Understand? University of Vienna. Dec 7, 2023 PDF slides Video
- Getting Past Chit-chat with ChatGPT: Large Language Models and Structured Outputs. Responsible Use of AI in Universities, Charles University. Nov 23, 2023 PDF slides
- Getting Structure in Dialogue with Large Language Models. Data, AI, Znalosti meetup, University of Economics Prague. Nov 9, 2023 PDF slides
- Large Language Models for Text Generation. Den s Katedrou jazykové přípravy. Sep 21, 2023 PDF slides
- Neural Networks for Dialogue Systems. ČSOB/KBC Data Boot Camp. Jun 13, 2023 PDF slides
- Data-to-text Generation with Neural Language Models. Scandinavian Conference on Image Analysis (SCIA). Apr 20, 2023 PDF slides
- Dialogue Systems (introduction). AI in HCI, Czech Technical University. Mar 24, 2023 PDF slides
- AI in Context of Text Generation. AI in Context Seminar, Charles University. Mar 9, 2023 Seminar website PDF slides
- Robust Data-to-text Generation with Pretrained Language Models. Prague Computer Science Seminar. Feb 9, 2023 Seminar website PDF slides
- Robust Data-to-text Generation with Pretrained Language Models. Heinrich-Heine University of Düsseldorf seminar on Selected Topic in Machine Learning and Natural Language Processing. Jan 26, 2023 PDF slides
- End-to-end Neural Dialogue Systems. VOCALLS AI Afternoon, Prague. Oct 19, 2022 PDF slides
- Neural Conversational AI. MLSS^N Summer School, Kraków. Jun 30, 2022 PDF slides Live recording
- Large Neural Language Models for Data-to-text Generation. AICZECHIA Seminar, Online. Mar 22, 2022 PDF slides
- Better Supervision for End-to-end Neural Dialogue Systems. VSG Invited Talks @ FIT, Brno University of Technology. Dec 1, 2021 Web PDF slides Video
- Accuracy in Neural Text Generation. Heinrich-Heine University of Düsseldorf seminar on Selected Topic in Machine Learning and Natural Language Processing. Jul 23, 2021 PDF slides
- Dialogue Systems at Charles University. Czechbots conference. Mar 3, 2020. PDF slides
- Challenges in Neural NLG. ÚFAL Monday seminar. Dec 2, 2019. PDF slides
- Challenges in Neural NLG. Apple Cambridge. Oct 16, 2019. PDF slides
- Challenges in Response Generation and Conversational AI. ILCC/HCRC Seminar, University of Edinburgh. Sep 14, 2018. PPTX Slides (24MB)
- Can You Be Friends with a Smart Speaker Device? Pint of Science Festival, Edinburgh. May 15, 2018. PPTX Slides (63MB)
- Sequence-to-sequence Natural Language Generation. University of Sheffield. Jun 1, 2017. Slides
- Home Intelligent? Assistants. Edinburgh Science Festival. Apr 8, 2017. PPTX slides (63MB)
- Sequence-to-sequence Natural Language Generation for Spoken Dialogue Systems. ÚFAL Monday seminar. Mar 28, 2017. Slides Video
- Sequence-to-sequence Natural Language Generation. HWU Interaction Lab meeting. Nov 16, 2016. Slides
- Sequence-to-sequence Natural Language Generation. Diligent project meeting. Nov 10, 2016. Slides
- Natural Language Generation (Mostly) for Spoken Dialogue Systems. Lecture in Filip Jurčíček's Statistical Dialogue Systems Course. May 11, 2016. Slides
- Natural Language Generation for Spoken Dialogue Systems. Lecture in Filip Jurčíček's Statistical Dialogue Systems Course. May 14, 2015. Slides
- A Two-stage Syntax-based Natural Language Generator. ÚFAL Monday seminar. Mar 9, 2015. Slides Video
- Tecto to AMR and Translation (with Tim O'Gorman and others). JHU/CLSP Fred Jelinek Memorial PIRE Workshop, Aug 1, 2014. Slides Video
- Ein Vergleich der deutschen und tschechischen Valenzwörterbücher durch Korpusanalyse und Befragung unter Linguisten. The 4th PRAGESTT Students' German Philology Conference. Mar 21, 2014. (in German) Slides Handout
- Natural Language Generation (Not Only) in Dialogue Systems. Lecture in Filip Jurčíček's Statistical Dialogue Systems Course. May 22, 2013. Slides
- Learning Morphology from the Corpus. ÚFAL Monday seminar. Nov 11, 2013. Slides Video