Collecting and analysing large amounts of data in the Cloud-to-Edge computing continuum raises novel challenges. Processing all this data centrally in cloud data centres is no longer feasible: transferring large volumes of data to the cloud is time-consuming, expensive, degrades performance and may raise security concerns. Therefore, novel distributed computing paradigms, such as edge and fog computing, have emerged to support processing data closer to its origin. However, such hyper-distributed systems require fundamentally new methods. To overcome the limitations of current centralised application management approaches, Swarmchestrate will develop a completely novel decentralised application-level orchestrator, based on the notion of self-organised, interdependent Swarms. Application microservices are managed in a dynamic Orchestration Space by decentralised Orchestration Agents, governed by distributed intelligence that provides matchmaking between application requirements and resources and supports the dynamic self-organisation of Swarms. Knowledge and trust, essential for the operation of the Orchestration Space, will be managed through blockchain-based trusted solutions using methods of Self-Sovereign Identity (SSI) and Decentralised Identifiers (DIDs). End-to-end security of the overall system will be assured by utilising state-of-the-art cryptographic algorithms and privacy-preserving data analytics.
Due to the inherent complexity of the decentralised system, novel simulation approaches will be developed to test and optimise system behaviour (e.g., energy efficiency) in the early stages of development. Additionally, the simulator will be further extended into a digital twin that runs in parallel to the physical system and improves its behaviour with predictive feedback. The Swarmchestrate concept will be prototyped on four real-life demonstrators from the areas of flood prevention, parking space management, video analytics and a digital twin of a natural habitat.
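As an illustrative sketch of the matchmaking idea mentioned above — pairing a microservice's requirements with available edge or cloud resources — consider the following minimal example. All names, fields and the scoring rule here are hypothetical assumptions for illustration, not Swarmchestrate's actual design:

```python
from dataclasses import dataclass

@dataclass
class Resource:
    name: str
    cpu_cores: int
    mem_gb: int
    location: str  # "edge" or "cloud" (assumed, simplified attribute)

@dataclass
class Requirement:
    service: str
    min_cpu: int
    min_mem_gb: int
    prefer: str  # preferred deployment location

def match(req: Requirement, resources: list[Resource]) -> list[Resource]:
    """Return resources satisfying the hard constraints, best-first:
    preferred location wins, then least spare CPU capacity."""
    feasible = [r for r in resources
                if r.cpu_cores >= req.min_cpu and r.mem_gb >= req.min_mem_gb]
    return sorted(feasible,
                  key=lambda r: (r.location != req.prefer,
                                 r.cpu_cores - req.min_cpu))

pool = [Resource("edge-1", 2, 4, "edge"),
        Resource("edge-2", 8, 16, "edge"),
        Resource("cloud-1", 32, 128, "cloud")]
req = Requirement("video-analytics", min_cpu=4, min_mem_gb=8, prefer="edge")
best = match(req, pool)[0]
print(best.name)  # edge-2: feasible and at the preferred edge location
```

In a decentralised setting, each Orchestration Agent would run such a matching step only over the resources it can see, rather than over a global pool.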


ELOQUENCE focuses on the research and development of innovative technologies for collaborative voice/chat bots. Voice-assistant-powered dialogue engines have previously been deployed in a number of commercial and governmental technological pipelines, with diverse levels of complexity. In our concept, such complexity can be understood as a problem of analysing unstructured dialogues. ELOQUENCE’s key objective is to better comprehend those unstructured dialogues and translate them into explainable, safe, knowledge-grounded, trustworthy and bias-controlled language models. We envision developing a technology capable of (i) learning on its own; (ii) being adaptable from limited corpora across languages, use cases or business logic; (iii) being sustainable w.r.t. new computational frameworks and new green-power architectures; and, in essence, (iv) serving as guidance for all European citizens whilst being respectful and reflecting the best of our European values, specifically supporting safety-critical applications involving humans in the loop.


In a world of increasing preoccupation with artefacts, interacting with our fellow human beings remains one of our most enjoyable, and practically most critical, activities. We derive inspiration from each other, solve problems and chart our future together. Yet our interaction with fellow humans is far from seamless or frictionless: despite much greater worldwide reach, we suffer (perhaps more than ever) from isolation, barriers and separation due to language, culture, physical distance, time zones, scheduling conflicts and distractions to our attention. With greater freedom, reach and flexibility, our isolation and complexities also appear to increase. In our proposed project “Meetween”, we aim to find solutions to these problems. Rather than letting artificial intelligence (AI) get in the way of the human experience, we harness its power to make human-human interaction more seamless and natural, eliminate language barriers, replace techno-clutter with support, and permit participation even across great distances. Meetween is a meeting space of the future, where AI serves to bring us closer.


MARVEL delivers a disruptive Edge-to-Fog-to-Cloud ubiquitous computing framework that enables multi-modal perception and intelligence for audio-visual scene recognition and event detection in a smart city environment.

Previous Projects

  • AGEVOLA – The goal of the project was the development and validation of advanced algorithms, with special attention to Machine Learning-based solutions, for the following tasks: voice activity detection, speech enhancement, and speaker diarization. In particular, the application case study focused on call-centre communications. The project was partially funded by Fondazione CariTro and involved PerVoice SpA and Università Politecnica delle Marche as partners.
  • AUDIO VISUAL SCENE ANALYSIS – This project is part of a wider cooperation between FBK-ICT and the Centre for Intelligent Sensing at Queen Mary University of London. In particular, the research activities focus on advanced solutions for audio-visual processing using heterogeneous devices in challenging and unconstrained environments. Specifically, the two institutions have allocated two joint PhD grants on these tasks.
  • CITY SENSING – One of the goals of the Smart Cities and Communities high-impact initiative is to help administrators and citizens understand their city and how it evolves. Therefore, this research line is developing pervasive, collaborative, multi-source, multi-level monitoring of the city. In particular, at SpeechTek we are working on neural solutions for the detection and classification of acoustic events in public open spaces.
  • SMARTERP – The goal of SmarTerp is to reduce inefficiencies in interpreting by developing a set of AI-powered tools embedded in a Remote Simultaneous Interpreting system that automates the human task of extracting information in real time, preventing the mistakes and loss of quality derived from the adoption of remote technologies.
  • EIT CONVERSATIONAL BANKING – The project aims to develop conversational agents that interact, by voice and text, with users asking for financial information. To this end, SpeechTek will develop ASR systems, in English and Hungarian, capable of dynamically activating appropriate language models for human-machine interaction.
  • IPRASE – The goal is to automatically estimate the English and German language proficiency of native Italian-speaking students in Trentino.
  • PERVOICE-SD – This project investigates speaker diarization solutions based on DNN embeddings of speaker identities. PerVoice has partially funded this project with a post-doc grant.
  • Smart Subtitling and Dubbing System (SSDS) – The goal of the project is the development of solutions for the automatic translation and dubbing of TV products into different languages. In tight collaboration with the HLT-MT research unit, SpeechTek’s efforts will focus on extracting audio features that can improve the quality of both the translation and the dubbing. The project, partially funded by Regione Lazio, is led by the Italian companies Translated and Sedif.
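The embedding-based diarization idea behind the PERVOICE-SD entry can be sketched as follows: each speech segment is represented by a fixed-length embedding, and segments are grouped by embedding similarity. The toy 2-D vectors, the greedy clustering and the threshold below are illustrative assumptions; real systems use DNN-extracted speaker embeddings (e.g., x-vectors) and more sophisticated clustering:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def diarize(segments, threshold=0.8):
    """Greedy clustering: assign each segment embedding to the most
    similar existing speaker centroid, or start a new speaker."""
    centroids, labels = [], []
    for emb in segments:
        best, best_sim = None, threshold
        for i, c in enumerate(centroids):
            sim = cosine(emb, c)
            if sim >= best_sim:
                best, best_sim = i, sim
        if best is None:
            centroids.append(list(emb))
            labels.append(len(centroids) - 1)
        else:
            # update centroid as a simple elementwise average
            centroids[best] = [(x + y) / 2 for x, y in zip(centroids[best], emb)]
            labels.append(best)
    return labels

# Four toy segment embeddings: two near [1, 0], two near [0, 1]
embs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.95]]
print(diarize(embs))  # [0, 0, 1, 1]: two speakers detected
```

The output is a speaker label per segment, which is exactly the "who spoke when" structure a diarization system produces once segment boundaries are known.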