Keynote lectures
  • Cross-Lingual Embeddings: a Babel Fish for Machine Learning Models
    Marko Robnik-Šikonja is Professor of Computer Science and Informatics and Head of the Artificial Intelligence Chair at the University of Ljubljana, Faculty of Computer and Information Science. His research interests span machine learning, data mining, natural language processing, network analytics, and applications of data science techniques. He is (co)author of over 150 scientific publications that have been cited more than 5,000 times. He is the author and maintainer of three open-source R data mining packages.
    Abstract
    Currently, the most successful machine learning methods are numeric, e.g., deep neural networks or SVMs. If we are to harness the power of successful numeric deep learning approaches for symbolic data such as texts or graphs, the symbolic data has to be embedded into a vector space suitable for numeric algorithms. The embeddings should preserve the information, in the form of similarities and relations contained in the original data, by encoding it into distances and directions in the numeric space. Typically, these vector representations are obtained with neural networks trained for the task of language modelling. As it turns out, the resulting numeric spaces are similar across different languages and can be mapped onto each other with approaches called cross-lingual embeddings.

    We will present the ideas behind supervised, unsupervised, and semi-supervised cross-lingual embeddings. We will focus on recent contextual embeddings, which ensure that the same word is mapped to different vectors depending on the context. We will describe how to build and fine-tune contextual embeddings such as ELMo and BERT, and present examples of training a model in a well-resourced language such as English and transferring it to a less-resourced language such as Finnish. We will describe applications of cross-lingual transfer in text classifiers and abstractive summarizers.
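The core idea of mapping one embedding space onto another can be sketched in a few lines. The toy example below is not from the talk: it uses a known 2-D rotation in place of a learned mapping, whereas in a real supervised setting the mapping matrix would be estimated from a seed dictionary, e.g. by solving an orthogonal Procrustes problem. All words and vectors here are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return dot / (nu * nv)

# Toy "English" embedding space (2-D for illustration).
en = {"cat": (1.0, 0.1), "dog": (0.9, 0.3), "car": (0.1, 1.0)}

# Toy "Finnish" space: here simply the English space rotated by 90 degrees,
# standing in for a real, independently trained target space.
def rotate(v):
    x, y = v
    return (-y, x)

fi = {"kissa": rotate(en["cat"]), "koira": rotate(en["dog"]), "auto": rotate(en["car"])}

# A supervised cross-lingual mapping is normally learned from a seed
# dictionary; here we just plug in the known rotation.
def map_to_fi(v):
    return rotate(v)

def translate(word):
    """Retrieve the nearest Finnish neighbour of the mapped English vector."""
    mapped = map_to_fi(en[word])
    return max(fi, key=lambda w: cosine(mapped, fi[w]))

print(translate("cat"))  # nearest neighbour is "kissa"
```

Because both spaces encode similarities as directions, nearest-neighbour retrieval after the mapping recovers word translations; this is the intuition behind cross-lingual embedding methods.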
Tutorials
  • Finnish NLP in the Deep Learning Age
    Filip Ginter; AI Scientist at Silo.AI and Assistant Professor at the University of Turku, TurkuNLP lab
    Abstract
    In this workshop, we will review several of the available tools and resources for Finnish NLP. In particular, we will present the Turku Neural Parser Pipeline and the recent Turku NER system as primary examples of deep learning-based NLP tools for Finnish. We will also present numerous Finnish datasets and models, such as the Finnish BERT model FinBERT, and their use. The workshop will contain several hands-on sessions during which participants can test the tools and resources on their own in the Google Colab environment.
Paper Presentations
PolSentiLex: Sentiment Detection in Socio-political Discussions on Russian Social Media
Olessia Koltsova, Svetlana Alexeeva, Sergei Pashakhin, Sergei Koltsov
Automatic Detection of Hidden Communities in the Texts of Russian Social Network Corpus
Ivan Mamaev, Olga Mitrofanova
Dialog Modelling Experiments with Finnish One-to-One Chat Data
Janne Kauttonen, Lili Aunimo
Advances of Transformer-Based Models for News Headline Generation
Alexey Bukhtiyarov, Ilya Gusev
An explanation method for black-box machine learning survival models using the Chebyshev distance
Lev Utkin, Maxim Kovalev, Ernest Kasimov
Unsupervised Neural Aspect Extraction with Related Terms
Timur Sokhin, Maria Khodorchenko, Nikolay Butakov
Predicting Eurovision Song Contest Results using Sentiment Analysis
Iiro Kumpulainen, Eemil Praks, Tenho Korhonen, Anqi Ni, Ville Rissanen, Jouko Vankka
Improving Results on Russian Sentiment Datasets
Anton Golubev, Natalia Loukachevitch
Dataset for Automatic Summarization of Russian News
Ilya Gusev
Dataset for evaluation of mathematical reasoning abilities in Russian
Mikhail Nefedov
Searching Case Law Judgments by Using Other Judgments as a Query
Sami Sarsa, Eero Hyvönen
GenPR: Generative PageRank framework for Semi-Supervised Learning on citation graphs
Mikhail Kamalov, Konstantin Avrachenkov
Finding New Multiword Expressions for Existing Thesaurus
Petr Rossyaykin, Natalia Loukachevitch
Matching LIWC with Russian Thesauri: An Exploratory Study
Polina Panicheva, Tatiana Litvinova
Poster session
  • Decentralized Learning for Text Mining
    Kendrick Cetina, Nuria García-Santa
    We address the challenge of training data scarcity in domains where preserving private and sensitive information is paramount. We present a text generation model based on decentralized learning that is used to augment data for Named Entity Recognition (NER) tasks. Our architecture is capable of learning from different data sources in a decentralized environment where there is no sensitive data sharing between clients and server. We tested our approach with two popular medical domain datasets. The text generation model can produce synthetic data samples similar in style to the original text used for training. We then use this generated text to compare the performance of two NER models, one that uses augmented data and one that does not. The NER model benefits from the generated text and achieves higher performance when augmented data is used, demonstrating the enrichment of entity recognition enabled by the decentralized text generation model.
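The abstract does not spell out the aggregation scheme; a common choice in such decentralized setups is federated averaging, where clients share only model parameters, never raw (sensitive) text. A minimal sketch with hypothetical per-client parameter vectors:

```python
def federated_average(client_weights, client_sizes):
    """Size-weighted average of per-client parameter vectors.
    Only parameters cross the client/server boundary, not data."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    avg = [0.0] * dim
    for w, n in zip(client_weights, client_sizes):
        for i in range(dim):
            avg[i] += w[i] * (n / total)
    return avg

# Two hypothetical clients with different amounts of local data.
clients = [[1.0, 2.0], [3.0, 4.0]]
sizes = [100, 300]  # client 2 holds 3x more examples
print(federated_average(clients, sizes))  # [2.5, 3.5]
```

In a real system each client would run local gradient steps between aggregation rounds; the sketch only shows the privacy-preserving aggregation step.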
  • Building Parallel Corpora Using Multilingual Sentence Embeddings
    Sergei Averkiev
    The goal of my project is to build a pipeline for constructing a parallel corpus from two texts in different languages. The alignment of texts relies on sentence embedding models. Using these models, we can transform sentences into vectors and measure the cosine similarity between them. Once candidate pairs are selected with a similarity threshold, additional heuristics can improve the quality. At the last stage, a human can validate the result and make final changes. Because the aforementioned models are multilingual (the most recent LaBSE model supports 109 languages), we can align texts in a wide variety of languages out of the box with decent quality. This pipeline can be applied in the fields of machine translation and language learning.
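The core alignment step described above can be sketched as follows, using hypothetical low-dimensional sentence vectors in place of real LaBSE embeddings (which are 768-dimensional):

```python
import math

def cosine(u, v):
    """Cosine similarity between two sentence vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def align(src_vecs, tgt_vecs, threshold=0.8):
    """Greedy alignment: pair each source sentence with its most similar
    target sentence, keeping only pairs above the similarity threshold."""
    pairs = []
    for i, sv in enumerate(src_vecs):
        sims = [cosine(sv, tv) for tv in tgt_vecs]
        j = max(range(len(sims)), key=sims.__getitem__)
        if sims[j] >= threshold:
            pairs.append((i, j))
    return pairs

# Hypothetical embeddings for 2 source and 2 target sentences.
src = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
tgt = [(0.0, 0.9, 0.1), (0.9, 0.1, 0.0)]
print(align(src, tgt))  # [(0, 1), (1, 0)]
```

The threshold filters out sentences with no counterpart; in the full pipeline, heuristics (length ratio, monotonicity) would further clean the candidate pairs before human validation.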
  • Publication Date Estimation for Scientific Papers
    Andrey Grabovoy
    The paper investigates automatic methods for estimating the publication date of scientific papers. Most prior work on document dating uses data from newspaper articles. The authors introduce a new dataset for this task that consists of scientific papers from various domains. We propose to use open-access articles from elibrary.ru in Russian and English. The dataset consists of more than 300,000 articles with publication year and topic tags according to the "Code of State Categories Scientific and Technical Information" (GRNTI). A computational experiment compares different baseline models on the proposed dataset. It shows that a simple baseline model gives better accuracy than many existing methods. The dataset is currently under development.
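The abstract does not name the baseline used; purely as an illustration, a trivial dating baseline such as "always predict the median training year", evaluated with mean absolute error, might look like this (all years below are made up):

```python
import statistics

def median_year_baseline(train_years):
    """A trivial dating baseline: always predict the median training year."""
    return statistics.median(train_years)

def mean_abs_error(pred, true_years):
    """Mean absolute error of a constant prediction, in years."""
    return sum(abs(pred - y) for y in true_years) / len(true_years)

train = [2005, 2008, 2010, 2012, 2015]
test = [2009, 2011, 2014]
pred = median_year_baseline(train)   # 2010
print(mean_abs_error(pred, test))    # (1 + 1 + 4) / 3 = 2.0
```

Baselines like this give a floor that any learned model (e.g. a text classifier over the article body) must beat to justify its complexity.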
Shared task
We are happy to announce a challenge which will be held as part of the next AINL conference. The challenge is devoted to Russian-Chinese machine translation. It will be conducted in a containerized manner, i.e., participants will send their trained models and receive only evaluation results. The evaluation is based on an unpublished parallel corpus and on the classic BLEU metric. As a training corpus, the organizers propose the UN corpus for the Ru-Zh language pair. Further details will be presented soon.
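For intuition about the evaluation metric, a minimal sentence-level variant of BLEU (clipped n-gram precision combined with a brevity penalty) can be sketched as follows; the actual challenge evaluation would use a standard corpus-level implementation such as sacrebleu:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=2):
    """Sentence-level BLEU sketch: geometric mean of clipped n-gram
    precisions, multiplied by a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        cand = Counter(ngrams(candidate, n))
        ref = Counter(ngrams(reference, n))
        clipped = sum(min(c, ref[g]) for g, c in cand.items())
        total = max(1, sum(cand.values()))
        precisions.append(clipped / total)
    if min(precisions) == 0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(log_avg)

cand = "the cat sat on the mat".split()
ref = "the cat sat on a mat".split()
print(round(bleu(cand, ref), 3))  # unigram p = 5/6, bigram p = 3/5
```

Real BLEU uses up to 4-grams and aggregates counts over the whole test corpus rather than per sentence, which is why corpus-level tooling is preferred.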

Shared task chair: Valentin Malykh, Huawei; Varvara Logacheva, Skoltech
Contest web-page: https://mlbootcamp.ru/en/round/26/tasks/
Human-AI interaction workshop
The workshop aims to gather together researchers, industry partners and users to present and debate aspects of interaction between humans and artificial agents and applications powered by artificial intelligence (AI). Recent advances in AI frame opportunities and challenges for interaction and user interface design. Principles for human-AI interaction have been discussed in the human-computer interaction community for several decades, but more study and innovation are needed in light of advances in AI and the growing use of AI technologies in human-facing applications. A topical example is chatbots, which have become mainstream in customer service for businesses and organizations.

Call for abstracts: https://ainlconf.ru/2020/human_ai_workshop
WORKSHOP TALKS
Work Disability Risk Prediction with Machine Learning
Vili Huhta-Koivisto
It is very important for society that the working-age population has uninterrupted careers for as long as possible. The integrity of careers has only become emphasized as the population ages and the dependency ratio becomes worse. Any disturbances to careers can prove to be expensive, as they not only place extra expenses on the welfare system but also reduce the income produced from work. In particular, disabilities leading to a loss of work ability are undesirable, as periods outside work can be comparatively long [1].

Since no exact and commonly used definition exists for work disability, we use the definition from the Finnish pension system [1]. The most common reason for a loss of work ability is a long-term health problem [2]. Many effective treatments and actions can be taken to prevent work disabilities [3]. Early detection is vital to enable well-targeted action in time. Currently, the best data for predicting the health of an individual comes from health and employment records. However, no simple indicator can signal risk [4]. Additionally, the volume of data is far too great for humans to go through. People tasked with screening are also prone to errors [5]. Current screening tools therefore limit the treatments and actions that could otherwise be applied more effectively.

We investigate the possibility of utilizing machine learning for early prediction of upcoming loss of work ability. In our work the aim is to develop a machine learning model for assessing the risk of losing work ability based on health and employment records. The risk assessment is needed to enable preventative measures in occupational health care. Patient texts written during visits to occupational health care doctors are an underutilized source for disability predictions. Therefore, the predictive capability of these texts is of special interest in this work.
Assessment of disability risk is done with data from a Finnish health care company. As the research only focuses on the working population, it mostly ignores problems developing before working age. Additionally, the data used is from a limited geographical area and is mainly from people working in office conditions.

The potential of machine learning has increased due to developments in new methods such as natural language processing (NLP) and neural networks. We created an NLP machine learning model using ULMFiT [6] capable of processing Finnish texts and assessing the risk related to them. The model was capable of separating people with disability risk from those without it with an accuracy of 72%. This separation ability is on par with current screening methods. Further development is concerned with researching the transferability of the model. Additionally, improving the results by adding structural data is of great interest.

[1] M. Laaksonen, J. Rantala, N. Järnefelt, and J. Kannisto, Työkyvyttömyyden vuoksi menetetty työura. Helsinki: Finnish Centre for Pensions, 2016., ISBN 978-951-691-247-2.
[2] M. Laaksonen, J. Blomgren, and R. Gould, Työkyvyttömyyseläkkeelle siirtyneiden sairauspäiväraha-, kuntoutus- ja työttömyyshistoria. Helsinki: Finnish Centre for Pensions, 2014., ISBN 978-951-691-200-7.
[3] G. Pomaki, R. Franche, E. Murray, N. Khushrushahi, and T. M. Lampinen, Workplace-Based Work Disability Prevention Interventions for Workers with Common Mental Health Conditions: A Review of the Literature. in Journal of Occupational Rehabilitation, vol. 22, 2012, pp. 182-195, DOI 10.1007/s10926-011-9338-9.
[4] J. Airaksinen, M. Jokela, M. Virtanen, T. Oksanen, J. Pentti, J. Vaahtera, M. Koskenvuo, I. Kawachi, D. G. Batty, and M. Kivimäki, Development and validation of a risk prediction model for work disability: multicohort study. in Scientific Reports, vol. 7, no. 13578, 2017, DOI 10.1038/s41598-017-13892-1.
[5] S. B. Kotsiantis, I. Zaharakis and P. Pintelas, Supervised Machine Learning: A Review of Classification Techniques. in Frontiers in Artificial Intelligence and Applications, 2007. ISBN 978-1-58603-780-2.
[6] J. Howard and S. Ruder, Universal language model fine-tuning for text classification. arXiv preprint arXiv:1801.06146, 2018.
Influence of Interactional Style on Affective Acceptance in Human-Chatbot Interaction. A Literature Review
David Dobrowsky, Lili Aunimo, Ilona Pezenka, Gerald Janous and Teresa Weber
The use of chatbots on websites or in apps has become a standard in recent years. Chatbots serve to make relevant information accessible to users with as low a threshold as possible. Chatbots can be used to automate the interaction between users and organisations or to improve the usability of digital services. To fulfil this purpose, the technical functionality must be ensured, and the communicative competence of the chatbots must be developed in such a way that the experience of a conversation is possible. In order to gain insights into the prerequisites and requirements for the communicative competencies of chatbots, the influence of interactional style on users' affective acceptance in the interactional situation should be examined. Affective acceptance can be operationalized by measuring the psychophysiological reactions of users during interaction with chatbots. The following review provides a thematic and a methodological classification of approaches and studies dedicated to this field of research or taking a research perspective relevant to this field. The different perspectives and methods are critically discussed.

Five thematic clusters of research are identified in the scientific literature on the effect of different communicative strategies and styles of chatbots to the quality of human-chatbot interaction. These are:
1) Studies that examine the interrelation between the interactional style of chatbots and affective acceptance by users.
2) Studies that examine the effects of interaction style on user experience and system satisfaction.
3) Studies that examine how to generate trust in interaction with chatbots.
4) Studies that examine the effect of variations in communicative strategies on user compliance to requests posed by AI-based chatbots.
5) Studies that examine cultural differences in technology acceptance in general and especially in the context of communicative strategies and styles implemented in chatbots.

The majority of studies on the effects of different communicative strategies and styles employ experimental methodologies. In experimental settings, users interact with chatbots that differ in interaction style. Some researchers solely use biometric measurement of psychophysiological reactions to evaluate the user experience, while others combine biometric measurement data with a user questionnaire. There are also studies that rely only on data collected via a questionnaire. One research setting involves both sentiment analysis on customers' utterances and data collection through a questionnaire. In some studies, the actions taken by the user as a result of the conversation with the chatbot are analysed. While all of the above include an experimental setting, there is also research that is based solely on expert interviews.
Building a Chatbot: Architectures and Vectorization Methods
Anna Chizhik, Yulia Zherebtsova
Nowadays, the field of dialogue systems is one of the rapidly growing research areas in artificial intelligence applications. Business and industry are showing increasing interest in integrating intelligent conversational agents into their products. In our research, we review the recent progress in chatbot development and its current architectures (rule-based, retrieval-based, and generative models) and discuss the main advantages and disadvantages of the approaches. Additionally, we conduct a comparative analysis of state-of-the-art text data vectorization methods (i.e. word/sentence embeddings), which we apply in the implementation of a retrieval-based chatbot as an experiment. The results of the experiment are presented on a response selection task using various Recall@k measures. The authors also discuss the issues of assessing the quality of chatbot responses, in particular, emphasizing the importance of choosing the proper evaluation method.
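Recall@k, used above to report response selection quality, simply measures how often the true response appears among the top-k ranked candidates. A minimal sketch with hypothetical rankings:

```python
def recall_at_k(ranked_lists, correct_indices, k):
    """Fraction of queries whose correct response appears in the
    top-k positions of the ranked candidate list."""
    hits = sum(1 for ranking, gold in zip(ranked_lists, correct_indices)
               if gold in ranking[:k])
    return hits / len(ranked_lists)

# Hypothetical rankings of 4 candidate responses for 3 queries
# (each inner list holds candidate ids, best-scored first).
rankings = [[2, 0, 1, 3], [1, 3, 0, 2], [0, 2, 3, 1]]
gold = [2, 0, 1]  # id of the true response per query

print(recall_at_k(rankings, gold, 1))  # 1 of 3 queries hit at k=1
print(recall_at_k(rankings, gold, 3))  # query 3's gold is ranked last: 2 of 3
```

Reporting several cut-offs (Recall@1, Recall@3, ...) shows how forgiving the retrieval model needs to be to surface the correct response.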
Supporting the Inclusion of People with Asperger Syndrome: Building a Customizable Chatbot with Transfer Learning
Victoria Firsanova
The study focuses on building an informational Russian-language chatbot, which aims to answer various possible questions about the inclusion of people with Asperger syndrome. The idea for our project came from the lack of awareness about the process of inclusive education and work. We believe that this problem might cause communication difficulties or even conflicts between pupils, university and college students, or co-workers. A chatbot, which can be customized according to the age of the user, is an alternative informal way to provide information. This format should be suitable for children, teenagers, and young adults, who are not likely to read monotone articles to find the needed information.

In the study, we implement, evaluate, and then compare two transfer learning models to find the most efficient approach to build our chatbot. The goal is to make the chatbot customizable. To achieve this, we have decided to use transformer neural network architectures: BERT and GPT-2. Our customization involves three modes according to the age of potential users: younger schoolchildren, teenagers, and young adults. A user can choose his or her category manually or use automatic detection. The detection will be based on a supervised learning method: multi-class CNN classification.

OpenAI's GPT-2 and Google's BERT are both unsupervised transformer models. BERT is a bidirectional encoder-based pre-trained transformer that was trained with masked language modeling (MLM). MLM might cause difficulties in text generation because the masked tokens used for training are conditionally independent of the sentence structure. As a result, the input data distribution in the model might not correspond to real-world data. GPT-2 is a decoder-based transformer that uses a self-attention mechanism. The model is considered to be quite efficient in question answering and text generation; however, our study focuses on building a Russian language model, which might become challenging even for such a powerful architecture. We hypothesize that GPT-2 will show better results; however, it is still important for us to analyze the mistakes and disadvantages of both models to gain a greater understanding of the capabilities of transfer learning techniques. To fine-tune both models, we will use a database provided to us by a project that supports people with Asperger syndrome and autism in Russia and maintains an informational website.

The challenge, and the major problem of our study, lies in the linguistic features of conversations with people of different ages. To implement our study, we need to analyze the linguistic features of questions asked by people of different ages; identify and classify their syntactic, lexical, and discourse features; and, by means of surveys and explorations, find the best ways of answering according to the age of the asker. To evaluate the chatbot, we plan to ask three focus groups, corresponding to the three age categories of our customizable model, to test the chatbot, ask it some questions, and evaluate its quality during a survey.
Adopting AI-enhanced Chat for Personalizing Student Services in Higher Education
Asko Mononen, Ari Alamäki, Janne Kauttonen, Aarne Klemetti & Erkki Räsänen
Higher education institutes, along with service companies, are rapidly adopting artificial intelligence-enhanced chats for their students. The ultimate aim of AI-enhanced chats is to decrease the need for personal human assistance, where the staff of higher education institutes answer the students' conventional information needs. Simultaneously, advanced conversational AI is able to personalize its responses to students by identifying contextual information and the students' information needs, which far surpasses classic FAQ sections and non-contextual chatbots.

Personalized or adaptive services are an integral part of AI-enhanced higher education service processes. AI-enhanced personalization provides new practices to create value for students, teachers and administration in education (Chassignol et al., 2018; Renz et al., 2020). The personalized services tailor guidance or instruction based on the students' preferences, profiles, information needs or progress in the conversation. In addition to easily defined tailoring parameters, the more advanced chats interpret students' context or affective experiences to tailor the conversation based on the students' undefined information needs. Although technology-enabled personalized learning is a popular research stream, the majority of current scientific research on personalized learning focuses on traditional computers or devices (Xie et al., 2019). Thus, AI provides new research opportunities from the viewpoint of personalized learning in automating student services in higher education.

A literature review (Xie et al., 2019) shows that research on the adoption of AI for personalized learning is very scant, creating a justifiable research gap.

In this study, we describe a case in which an AI-enhanced chat was adopted in a higher education institute. The study provides a case experiment of using the Front.ai platform to create personalized student services. The platform utilizes natural language processing and machine learning to enable personalized conversation, and combines semantic neurocomputing and learning algorithms to create conversations that adapt to the students' personal information needs. Since the aim of this study was to develop a new understanding of adopting AI-enhanced chats for student services, we chose the case study approach (Eisenhardt & Graebner, 2007).

The experiences of this case experiment are promising, yet the build phase has been slower than expected. The use cases were defined, and the team had earlier experience with several large-scale IT system builds. Still, adopting new technology skills and building a demo utilizing external APIs with part-time developers has been relatively slow. The team has been leaning heavily on supplier-side support (Front.ai and Headai.com). Therefore, adequate resourcing of skilled developers with dedicated time and partnering would be recommended for future deployments.

References:

Chassignol, M., Khoroshavin, A., Klimova, A., & Bilyatdinova, A. (2018). Artificial Intelligence trends in education: a narrative overview. Procedia Computer Science, 136, 16-24.
Eisenhardt, K. M. & Graebner, M. (2007). Theory Building from Cases: Opportunities and Challenges. Academy of Management Journal 50(1), 25-32.
Renz, A., Krishnaraja, S., & Gronau, E. (2020). Demystification of Artificial Intelligence in Education–How much AI is really in the Educational Technology? International Journal of Learning Analytics and Artificial Intelligence for Education (iJAI), 2(1).
Xie, H., Chu, H. C., Hwang, G. J., & Wang, C. C. (2019). Trends and development in technology-enhanced adaptive/personalized learning: A systematic review of journal publications from 2007 to 2017. Computers & Education, 140, 103599.
Feel free to contact us at ainlevent@gmail.com