Mihir Kulkarni will present his general exam, "Training models for NLP tasks on novel domains using transfer learning," on Wednesday, August 7, 2019 at 3pm in Friend 006. The members of his committee are as follows: Karthik Narasimhan (advisor), Sanjeev Arora, and Olga Russakovsky. Everyone is invited to attend his talk, and faculty wishing to remain for the oral exam that follows are welcome to do so. His abstract and reading list appear below.

Abstract: Recent advances in machine learning for NLP have shown that unsupervised pre-training followed by fine-tuning on specific tasks brings large benefits (BERT, GPT); in particular, BERT obtained state-of-the-art results on 11 NLP tasks using this method. However, getting these models to perform well on domains different from the one they were trained on remains a challenge. In our work, we explore methods to transfer representations learnt by BERT to other models for improved performance on NLP tasks in a different domain: medical papers. We generate a dataset from medical reviews available in the Cochrane Database of Systematic Reviews and use it to train models whose representations of technical documents match the representations learnt by BERT on plain-text summaries of those documents. We experiment with different methods of section-wise matching between documents, both for constructing the parallel corpus and for transferring representations. Performance on the domain-transfer task is reported with both intrinsic and extrinsic evaluation metrics, including named entity recognition on the EBM-NLP dataset and sentence similarity on the MedSTS dataset.

Reading list:
1. Ian Goodfellow, Yoshua Bengio, Aaron Courville. Deep Learning. MIT Press, 2016.
2. Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv preprint arXiv:1810.04805, 2018.
3. Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer. Deep Contextualized Word Representations. Proceedings of NAACL, 2018.
4. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin. Attention Is All You Need. Advances in Neural Information Processing Systems, 2017.
5. Jiaqi Mu, Suma Bhat, Pramod Viswanath. All-but-the-Top: Simple and Effective Postprocessing for Word Representations. International Conference on Learning Representations, 2018.
6. Chuanqi Tan, Fuchun Sun, Tao Kong, Wenchang Zhang, Chao Yang, Chunfang Liu. A Survey on Deep Transfer Learning. 27th International Conference on Artificial Neural Networks (ICANN 2018), Rhodes, Greece, October 4–7, 2018, Proceedings, Part III. DOI: 10.1007/978-3-030-01424-7_27.
7. Sinno Jialin Pan, Qiang Yang. A Survey on Transfer Learning. IEEE Transactions on Knowledge and Data Engineering, 2010.
8. Yifan Peng, Shankai Yan, Zhiyong Lu. Transfer Learning in Biomedical Natural Language Processing: An Evaluation of BERT and ELMo on Ten Benchmarking Datasets. Proceedings of the 2019 Workshop on Biomedical Natural Language Processing (BioNLP 2019), 2019.
9. Benjamin Nye, Junyi Jessy Li, Roma Patel, Yinfei Yang, Iain J. Marshall, Ani Nenkova, Byron C. Wallace. A Corpus with Multi-Level Annotations of Patients, Interventions and Outcomes to Support Language Processing for Medical Literature. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018.
10. Yanshan Wang, Naveed Afzal, Sunyang Fu, Liwei Wang, Feichen Shen, Majid Rastegar-Mojarad, Hongfang Liu. MedSTS: A Resource for Clinical Semantic Textual Similarity. Language Resources and Evaluation, 2018.
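The abstract above describes training models so that their representations of technical documents match the representations BERT produces for plain-text summaries of the same documents. The sketch below shows one way such a representation-matching objective could be set up; it is a minimal illustration only, assuming a Hugging Face Transformers/PyTorch stack, mean-pooled sentence representations, and an MSE loss, none of which are confirmed details of the work being presented.

    # Illustrative sketch (not the presented method): align a "student" encoder's
    # representations of technical sections with a frozen BERT "teacher"'s
    # representations of matched plain-text summary sections.
    import torch
    import torch.nn.functional as F
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    teacher = BertModel.from_pretrained("bert-base-uncased").eval()  # frozen, encodes plain-text summaries
    student = BertModel.from_pretrained("bert-base-uncased")         # trained to encode technical text
    for p in teacher.parameters():
        p.requires_grad = False

    optimizer = torch.optim.Adam(student.parameters(), lr=2e-5)

    def encode(model, texts):
        # Mean-pool the final hidden states into one vector per input text.
        batch = tokenizer(texts, return_tensors="pt", padding=True,
                          truncation=True, max_length=128)
        hidden = model(**batch).last_hidden_state
        mask = batch["attention_mask"].unsqueeze(-1).float()
        return (hidden * mask).sum(1) / mask.sum(1)

    # One (technical section, matched summary section) pair from a hypothetical parallel corpus.
    technical = ["Randomised controlled trials comparing intervention X with placebo ..."]
    summary = ["Studies that compared treatment X with a dummy treatment ..."]

    with torch.no_grad():
        target = encode(teacher, summary)   # teacher representation of the plain-text side
    pred = encode(student, technical)       # student representation of the technical side

    optimizer.zero_grad()
    loss = F.mse_loss(pred, target)         # pull the two representations together
    loss.backward()
    optimizer.step()

In a setup like this, the teacher stays frozen so the plain-text representations remain anchored in the original BERT space, while only the student is updated to map technical medical text into that space.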