Metaverse Understanding

In recent years, the Metaverse has sparked increasing interest across the globe and is projected to reach a market size of more than $1000B by 2030. This is due to its many potential applications in highly heterogeneous fields, such as entertainment, multimedia consumption, training, and industry. This new technology raises many research challenges since, as opposed to more traditional scene understanding, metaverse scenarios contain additional multimedia content, such as movies in virtual cinemas and operas in digital theaters, which greatly influences the relevance of a metaverse to a user query. For instance, if a user is looking for Impressionist exhibitions in a virtual museum, only the museums showcasing exhibitions that feature Impressionist painters should be considered relevant. We introduce the novel problem of text-to-metaverse retrieval, which supports users in finding the most suitable metaverse for a given textual query. It is a challenging task, since the multimedia content present in the metaverse greatly influences […]
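At its core, a text-to-X retrieval system ranks candidates by the similarity between a query embedding and candidate embeddings. The sketch below illustrates this with cosine similarity over hand-made toy vectors; the identifiers and embedding values are hypothetical, and a real system would produce the vectors with learned text and scene encoders.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def rank_metaverses(query_vec, metaverse_vecs):
    """Rank metaverse ids by cosine similarity to the query embedding."""
    scored = [(mid, cosine(query_vec, vec)) for mid, vec in metaverse_vecs.items()]
    return sorted(scored, key=lambda p: p[1], reverse=True)

# Toy embeddings (in practice produced by learned encoders).
query = [1.0, 0.0, 1.0]  # e.g. "Impressionist exhibitions in a virtual museum"
metaverses = {
    "impressionist_museum": [0.9, 0.1, 0.8],
    "virtual_cinema":       [0.1, 1.0, 0.0],
}
print(rank_metaverses(query, metaverses)[0][0])  # → impressionist_museum
```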

EQAI 2023 – European Summer School on Quantum AI

AILAB-Udine is proud to be one of the organizers of the 2nd European Summer School on Quantum AI, to be held in Udine (Italy) on May 29 – June 1, 2023. The school can also be followed remotely. Find the latest information on the official website: The main topic of this edition is “Quantum Machine and Deep Learning”. Application deadlines: ▶ on-site: April 29, 2023, ▶ remote: May 15, 2023. All speakers will be present in person to make the experience more immersive and interactive. The program includes lectures, tutorials, and dissemination opportunities. Participants may also present their own research work during a dedicated poster session, with the opportunity to interact and discuss with their peers. More information can be found on the dedicated website: Feel free to share this invitation with anyone interested in Quantum Computing, Quantum […]

AI for Forestry Applications

This research project focuses on the application of machine and deep learning methods to forestry. The main focus is forest growing stock prediction in the Friuli Venezia Giulia region (Italy), but the developed methods can be applied to estimate biophysical forest attributes over any large territory. The study takes into account different sources of data, such as forest inventory data from national surveys, multispectral satellite images, climatic data, and various environmental features collected through different services. Several methods will be applied to produce a forest growing stock volume map, which will be useful for creating management plans for forestry areas in the region. Traditionally, the growing stock is considered an important indicator of forest health and productivity. It is estimated through forest inventories, in which both qualitative and quantitative parameters are recorded to assess the overall health of growing forests. In this way, we will produce results that can be considered a basis […]
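One of the simplest ways to link remote-sensing features to inventory measurements is a linear regression fitted on the inventory plots. The sketch below is a minimal illustration with made-up feature names and values (NDVI, temperature, elevation, and the volume targets are all hypothetical), not the project's actual model.

```python
import numpy as np

# Hypothetical feature matrix: one row per inventory plot, columns =
# [mean NDVI from satellite imagery, mean annual temperature (°C), elevation (m)].
X = np.array([
    [0.62, 11.0, 300.0],
    [0.75,  9.5, 650.0],
    [0.55, 12.0, 150.0],
    [0.80,  8.0, 900.0],
])
# Growing stock volume (m^3/ha) measured at the same plots (toy values).
y = np.array([210.0, 290.0, 170.0, 330.0])

# Fit a linear model with an intercept via ordinary least squares.
A = np.hstack([X, np.ones((X.shape[0], 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

def predict_volume(features):
    """Predict growing stock volume (m^3/ha) for one plot's feature vector."""
    return float(np.dot(np.append(features, 1.0), coef))
```

Applied wall-to-wall over a satellite image grid, such a model yields the growing stock volume map mentioned above; the project's actual methods are more sophisticated.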

Digital Humanities

Inscriptions are a testimony to the past, but their poor condition, caused by the deterioration of the material on which they are engraved, often makes them partially or completely illegible. The process of restoring these inscriptions is time-consuming and requires the involvement of an expert epigraphist. It is possible to speed up this process by adopting a semi-automatic assisting tool based on deep neural networks. This project aims to develop a complete methodology, from the acquisition of the inscriptions to the description of four possible approaches for predicting the missing text in a Latin inscription, which our research team plans to implement in the near future as part of an interdisciplinary research project. Research group: Alessandro Locaputo (AILAB-Udine), Beatrice Portelli (AILAB-Udine), Emanuela Colombi (DIUM-Udine), Giuseppe Serra (AILAB-Udine)
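To give an intuition of missing-text prediction, the toy sketch below fills a gap using bigram counts from a tiny corpus of formulaic Latin phrases. This is only the simplest conceivable baseline, not one of the four deep-neural-network approaches the project describes, and the corpus is invented for illustration.

```python
from collections import Counter

def train_bigrams(corpus):
    """Count word bigrams from a (toy) corpus of Latin phrases."""
    counts = Counter()
    for sentence in corpus:
        words = sentence.lower().split()
        for a, b in zip(words, words[1:]):
            counts[(a, b)] += 1
    return counts

def predict_gap(left_word, bigrams):
    """Predict the most frequent word following `left_word`."""
    candidates = [(b, n) for (a, b), n in bigrams.items() if a == left_word]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p[1])[0]

# Formulaic epigraphic phrases (hypothetical mini-corpus).
corpus = [
    "dis manibus sacrum",
    "dis manibus",
    "bene merenti fecit",
]
bigrams = train_bigrams(corpus)
print(predict_gap("dis", bigrams))  # → manibus
```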

Semantic Text-video Retrieval

The text-to-video retrieval task requires ranking all the videos in a database based on how semantically close they are to an input query. To do so, both the visual and the textual contents need to be carefully analyzed and understood, meaning that a wide range of Computer Vision and Natural Language Processing techniques is required. Despite its intrinsic difficulty, the problem is a fundamental one: nowadays, hundreds of hours of video content are uploaded to the Internet every minute, so effective solutions are needed to search this content and retrieve exactly the videos the user is looking for. Moreover, considering the need for multi-modal content understanding, advancements in this field may lead to improvements in many other problems, including Captioning and Question Answering. Related publications: Alex Falcon, Giuseppe Serra, Oswald Lanz: “Learning Video Retrieval Models with Relevance-Aware Online Mining”, International Conference on Image Analysis and Processing (ICIAP ’21), 2022 […]
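Retrieval models of this kind are commonly trained with a margin-based ranking objective: the matching caption-video pair should score higher than mismatched pairs by at least a margin. The sketch below shows a generic hinge ranking loss over toy similarity scores; it is a standard formulation, not necessarily the exact loss used in the cited paper, and all values are invented.

```python
def hinge_ranking_loss(pos_sim, neg_sims, margin=0.2):
    """Triplet-style hinge loss: the positive video should score higher
    than every negative by at least `margin`; violations are penalized."""
    return sum(max(0.0, margin - pos_sim + s) for s in neg_sims)

# Toy similarity scores between one caption and candidate videos.
# Negatives at 0.3 satisfy the margin; 0.85 and 0.95 violate it.
loss = hinge_ranking_loss(pos_sim=0.9, neg_sims=[0.3, 0.85, 0.95])
print(round(loss, 2))  # → 0.4
```

Only the "hard" negatives (those scoring close to or above the positive) contribute to the loss, which is why negative mining strategies matter for training such models.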

Video Question Answering

Video Question Answering (VideoQA) is a task that requires analyzing and jointly reasoning on both the given video data and a question related to its visual content, in order to produce a meaningful and coherent answer. Solving this task would bring models close to human-level capability in dealing with complex video data and the related textual data, since it requires learning to isolate and pinpoint objects of interest in the video, and to identify and reason about their interactions in both the spatial and temporal domains, while finding the essential bindings with the given question. Thus, VideoQA represents a challenging task at the interface between Computer Vision and Natural Language Processing (NLP). Modern approaches to this task involve a wide selection of techniques, such as: temporal and spatial attention, to learn which frames, and which regions in each frame, are more important for solving the task; and, given the multimodal nature of the data, cross-modality fusion mechanisms, question-answer-aware […]
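The temporal attention mentioned above can be sketched very simply: per-frame relevance scores are normalized with a softmax and used to pool the frame features into a single question-aware video representation. The example below is a minimal, dependency-free illustration with toy scores and 2-dimensional frame features; real models compute the scores from the question and use high-dimensional features.

```python
from math import exp

def temporal_attention(frame_scores, frame_features):
    """Softmax the per-frame relevance scores and return (weights,
    attention-weighted average of the frame feature vectors)."""
    m = max(frame_scores)                      # subtract max for stability
    expd = [exp(s - m) for s in frame_scores]
    total = sum(expd)
    weights = [e / total for e in expd]
    dim = len(frame_features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, frame_features))
              for d in range(dim)]
    return weights, pooled

# Toy example: frame 1 is far more relevant to the question than frames 0 and 2.
weights, pooled = temporal_attention(
    [0.1, 3.0, 0.2],
    [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]],
)
```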

Adverse Drug Events (ADE) Extraction

Regulators, such as the Food and Drug Administration (FDA) and the European Medicines Agency (EMA), approve dozens of drugs every year, after verifying their safety and therapeutic effectiveness in clinical trials. Sometimes, however, clinical trials are not sufficient to discover all potential Adverse Drug Events (ADE). Pharmacovigilance therefore monitors drugs on the market to ensure that unexpected effects are immediately identified and actions are taken to minimize their harm. This process relies on formal reporting methods, such as physician notes. However, a constantly growing number of patients prefer to describe side effects on social media platforms, health forums, and similar outlets, often using informal language. Given the need to monitor these sources for pharmacovigilance purposes, systems for the automatic extraction of ADE are becoming an important research topic in the NLP community. Recent shared tasks on the topic of ADE extraction have […]
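ADE extraction is often framed as token classification with BIO tags (B-ADE marks the start of an event mention, I-ADE its continuation, O everything else). Assuming a tagger has already labelled the tokens, the sketch below shows the decoding step that turns tags into ADE text spans; the example sentence and tags are invented.

```python
def extract_ade_spans(tokens, tags):
    """Collect text spans labelled as Adverse Drug Events from BIO tags."""
    spans, current = [], []
    for tok, tag in zip(tokens, tags):
        if tag == "B-ADE":                 # a new span starts here
            if current:
                spans.append(" ".join(current))
            current = [tok]
        elif tag == "I-ADE" and current:   # continue the open span
            current.append(tok)
        else:                              # O tag: close any open span
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans

tokens = "this pill gave me a terrible headache and nausea".split()
tags = ["O", "O", "O", "O", "O", "B-ADE", "I-ADE", "O", "B-ADE"]
print(extract_ade_spans(tokens, tags))  # → ['terrible headache', 'nausea']
```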

Predictive Maintenance

The remaining useful life (RUL) estimation of a component is an interesting problem within the Prognostics and Health Management (PHM) field: it consists of estimating the number of time steps between the current time step and the end of the component's life. Being able to reliably estimate this value can improve maintenance scheduling and reduce the associated costs. Data-driven approaches are often used in the literature and are preferred over model-based approaches: not only are they easier to build, but the data on which they are built can be gathered easily in many industrial applications. Over the last few years, neural networks such as Long Short-Term Memory (LSTM) networks and Convolutional Neural Networks (CNN) have found many applications in this area, thanks to their suitability for uncovering hidden patterns within the sensor data. In recent years, a greater availability of high-quality sensors and ease of data […]
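Before a network can be trained, each sensor sequence needs RUL labels. A common convention in the PHM literature (for instance on turbofan degradation benchmarks) is a piecewise-linear target: the RUL is capped at a constant early in life, when no degradation is observable, and then decreases linearly to zero at failure. The sketch below illustrates this labelling scheme; the cap of 125 cycles is a typical but assumed choice, not one taken from this project.

```python
def rul_targets(total_cycles, max_rul=125):
    """Piecewise-linear RUL labels for a run-to-failure sequence:
    constant (max_rul) early in life, then decreasing linearly to 0."""
    return [min(max_rul, total_cycles - t) for t in range(1, total_cycles + 1)]

# A toy engine that fails after 200 cycles.
targets = rul_targets(200)
```

These targets then serve as the regression labels for an LSTM or CNN that maps windows of sensor readings to the remaining life.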

Pathology Classification based on Limbs Kinematics

With the advancement of portable sensor technology, it is now possible to gather data on the arm movements of people with shoulder pathologies in a non-invasive way. Such data can then be used to train a classifier to establish whether a person who reports problems with a limb is actually affected by a limb pathology or not. AILab Udine is cooperating with the NCS Lab of the NCS Company of Carpi to develop a neural network-based classifier able to detect shoulder problems from the limb movements (abduction and adduction) performed by the patient. The project is based on the Showmotion technology of the NCS Company.

Generalized Born radii computation using linear models and neural networks

Implicit solvent models play an important role in describing the thermodynamics and the dynamics of biomolecular systems. Key to an efficient use of these models is the computation of Generalized Born (GB) radii, which is accomplished by algorithms based on the electrostatics of inhomogeneous dielectric media. The speed and accuracy of such computations are still an issue, especially for their intensive use in classical molecular dynamics. Here, we propose an alternative approach that encodes the physics of the phenomena and the chemical structure of the molecules in model parameters which are learned from examples. In our project, GB radii have been computed using i) a linear model and ii) a neural network. The input is the atom's element together with a histogram of the counts of neighbouring atoms within 16 Å, separated by element. Linear models are ca. 8 times faster than the most widely used reference method, and the accuracy is higher, with a correlation coefficient with the inverse of “perfect” GB radii […]
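The per-atom input described above, a per-element count of neighbours within 16 Å, can be computed with a straightforward distance scan. The sketch below shows this feature-construction step on a made-up four-atom toy molecule (coordinates in Å); it illustrates the input encoding only, not the project's learned models.

```python
from collections import Counter
from math import dist

def neighbour_histogram(atom_index, elements, coords, cutoff=16.0):
    """Count neighbouring atoms within `cutoff` Å, separated by element."""
    hist = Counter()
    for j, (el, xyz) in enumerate(zip(elements, coords)):
        if j != atom_index and dist(coords[atom_index], xyz) <= cutoff:
            hist[el] += 1
    return hist

# Toy molecule: the last carbon is 30 Å away, outside the cutoff.
elements = ["C", "N", "O", "C"]
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
print(neighbour_histogram(0, elements, coords))  # → Counter({'N': 1, 'O': 1})
```

Flattening such a histogram (one slot per element, concatenated with an encoding of the central atom's element) yields the fixed-length vector fed to the linear model or neural network.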

Fake News Detection (AILAB-Udine – MIT Boston)

In the last few years, we have witnessed an explosion of news sharing and commenting on social networks. While this practice has positive aspects, as it stimulates debate, it has been polluted by the diffusion of unreliable news, generally referred to as Fake News. Since these contents are often produced with malicious intent and have a tremendous real-world political and social impact, the Natural Language Processing (NLP) community has been called upon to propose algorithms for their identification. Most existing works are so far based on the stylistic and linguistic peculiarities of Fake News texts (such as excessive use of emphasis and hyperbolic expressions). As time passes, however, Fake News tends to become stylistically and linguistically more similar to Real News, so that Fact Checking remains the only reliable approach to isolate it. In this project, we employ Artificial Intelligence to assess the reliability of news on the basis of not only intrinsic criteria […]

Visual Saliency Prediction

When human observers look at an image, attentive mechanisms drive their gazes towards salient regions. Emulating this ability has been studied for more than 80 years by neuroscientists and computer vision researchers, but only recently, thanks to the widespread adoption of deep learning, have saliency prediction models achieved considerable improvement. Data-driven saliency has recently gained a lot of attention thanks to the use of Convolutional Neural Networks for predicting gaze fixations. In this project we go beyond standard approaches to saliency prediction, in which gaze maps are computed with a feed-forward network, and present a novel model which can predict accurate saliency maps by incorporating neural attentive mechanisms. The core of our solution is a Convolutional LSTM that focuses on the most salient regions of the input image to iteratively refine the predicted saliency map. Additionally, to tackle the center bias present in human eye fixations, our model can learn a set of prior maps generated with […]
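The center bias mentioned above is commonly modelled with Gaussian prior maps peaked at the image center. The sketch below generates one such prior map; in the project the prior parameters are learned, whereas here the relative sigma of 0.3 is simply an assumed illustrative value.

```python
from math import exp

def center_prior(h, w, sigma=0.3):
    """2D Gaussian prior map peaked at the image center; `sigma` is
    relative to the image size. Models the center bias of fixations."""
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    sy, sx = sigma * h, sigma * w
    return [[exp(-((y - cy) ** 2 / (2 * sy ** 2) + (x - cx) ** 2 / (2 * sx ** 2)))
             for x in range(w)] for y in range(h)]

prior = center_prior(5, 5)  # 5x5 map, value 1.0 at the center pixel
```

Multiplying (or otherwise combining) the network's predicted saliency map with such priors boosts central regions, matching the statistics of human eye fixations.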

Ambient Assisted Living ChatBot

A chatbot is a computer program or Artificial Intelligence agent which conducts a conversation via auditory or textual methods. Such programs are often designed to convincingly simulate how a human would behave as a conversational partner. Chatbots are typically used in dialog systems for various practical purposes, including customer service and information acquisition. They may use sophisticated natural language processing systems and are accessed via virtual assistants, messaging apps, or individual organizations' apps and websites. The aim of our project is to study and analyze a series of innovative technologies which, integrated together in a prototype managed by a virtual assistant, allow us to renovate the concept of domotics as we know it. Also, since chatbots are used more and more in working environments, our aim includes developing a virtual assistant integrated with a chatbot to support maintenance activities in an industrial setting. The project is supported by the POR FESR FVG Project and conducted as […]

Egocentric Vision for Cultural Heritage

Augmented Reality presents the opportunity for more customization of the museum experience, such as new varieties of self-guided tours or real-time translation of interpretive materials. At the end of this year, several companies (such as Google or Vuzix) will release wearable computers with head-mounted displays. We would like to investigate the usage of these devices for Cultural Heritage applications. Augmented reality is a real-time direct or indirect view of a physical real-world environment that has been enhanced/augmented by adding virtual computer-generated information to it. Augmented Reality aims at simplifying the user's life by bringing virtual information not only to the user's immediate surroundings, but also to any indirect view of the real-world environment, such as a live video stream. AR enhances the user's perception of, and interaction with, the real world. While Virtual Reality technology (or Virtual Environment, as Milgram calls it) completely immerses users in a synthetic world without seeing the real world, AR technology augments the sense of reality by superimposing […]

Arabic Keyphrase Extraction

Arabic keyphrase extraction is a crucial task due to the significant and growing amount of Arabic text on the web, generated by a huge population. It is becoming a challenge for the Arabic natural language processing community because of the severe shortage of resources and published processing systems. In this work we propose a deep learning based approach for Arabic keyphrase extraction that achieves better performance than related competitive approaches. We also provide the community with an annotated large-scale dataset of about 6000 scientific abstracts, which can be used for training, validating, and evaluating deep learning approaches for Arabic keyphrase extraction. Related publications: Helmy M., Vigneshram R. M., Serra G., Tasso C.: “Applying Deep Learning for Arabic Keyphrase Extraction”, Proc. of the 4th International Conference on Arabic Computational Linguistics (ACLing 2018), November 17–19, 2018, Dubai, United Arab Emirates. Resources: Arabic Abstracts Dataset
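For intuition, keyphrase extraction can be contrasted with the simplest unsupervised baseline: enumerate stopword-free n-gram candidates and score them by frequency. The sketch below shows that baseline (it is not the paper's deep learning model), using a toy English string for readability; the same candidate-generation logic applies to tokenized Arabic text.

```python
from collections import Counter

def candidate_keyphrases(text, stopwords, max_len=2):
    """Score n-gram candidates (up to `max_len` words, containing no
    stopwords) by raw frequency — a simple unsupervised baseline."""
    words = text.lower().split()
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if not any(w in stopwords for w in gram):
                counts[" ".join(gram)] += 1
    return counts.most_common()

text = "keyphrase extraction helps indexing ; keyphrase extraction helps search"
top = candidate_keyphrases(text, stopwords={";", "helps"})
```

Supervised deep learning approaches instead learn which candidates are true keyphrases from annotated corpora such as the Arabic Abstracts Dataset mentioned above.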