Bringing AI to the very EDGE: sound event detection

Recently, the Digital Society research line of FBK launched a flagship project for the development of city sensing technologies for Smart Cities. One of the research areas is called city sensing in the small, where multiple low-power, low-cost IoT devices monitor urban spaces to support the decision process of city managers. Sound Event Detection (SED) in urban spaces is one of the key components for achieving the project goals.

Each device can potentially generate a large amount of data to be sent via wireless transmission, which affects the energy autonomy and lifetime of the devices. One successful approach is to distribute the computation at the edge, i.e. to perform local pre-processing, but also advanced processing (e.g. machine learning, classification), directly on the wireless node, at “the thing” level.

Recently, thanks to the use of neural paradigms, SED algorithms have advanced considerably in terms of accuracy and robustness. However, these improvements are achieved by using large neural networks, which are increasingly hungry for computational power and memory. This prevents the development of applications for distributed monitoring in public spaces, which require a pervasive network of energy-neutral devices built on cheap, low-power, low-complexity platforms.

We developed an approach for dimensionality and complexity reduction that covers the full pipeline, with the aim of transferring AI from the cloud to low-power embedded platforms.

Knowledge Distillation

Starting from the publicly available VGG-ish feature extractor combined with a single-layer GRU classifier (70M parameters in total), we use the student-teacher approach to compress the network to 20K parameters, while the classification accuracy on UrbanSound8k drops only from 75% to 70%.
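To make the student-teacher idea more concrete, here is a minimal sketch of a distillation loss in plain Python. The post does not say which loss we are describing, so this uses the classic soft-target formulation as an assumption: the student is trained to match the teacher's temperature-softened output distribution and, at the same time, the ground-truth label. The names `softmax` and `distillation_loss` and the values of `temperature` and `alpha` are illustrative, not taken from our code.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term.

    temperature and alpha are hypothetical hyperparameters chosen
    for illustration only.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # Soft term: KL divergence from the teacher's softened distribution
    # to the student's softened distribution.
    soft = sum(pt * (math.log(pt) - math.log(ps))
               for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Hard term: cross-entropy of the student against the true label.
    hard = -math.log(softmax(student_logits)[label])
    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    return alpha * temperature ** 2 * soft + (1 - alpha) * hard
```

With `alpha=1.0` the loss only measures how far the student is from the teacher, so it is zero when both produce the same logits; lowering `alpha` gives more weight to the labels.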


The model fits on the STM32L476RG board, which features 128 KB of RAM and 80 MIPS and needs just 26 mW. We implemented the system on the board using the CMSIS library, quantizing the model weights to 8 bits by minimizing the overflow probability in each layer.
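As a rough illustration of per-layer 8-bit quantization, the sketch below picks a fixed-point format for one layer and converts its weights to signed 8-bit integers. It is a simplification and not our actual procedure: instead of estimating an overflow probability, it takes the largest number of fractional bits (capped at 7) for which the largest weight in the layer does not saturate. The function names `choose_frac_bits`, `quantize` and `dequantize` are hypothetical.

```python
def choose_frac_bits(weights, total_bits=8):
    """Largest number of fractional bits (at most total_bits - 1) such
    that the largest-magnitude weight still fits in the signed range."""
    limit = 2 ** (total_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    frac_bits = total_bits - 1
    # Give up one bit of precision at a time until nothing overflows.
    while round(max_abs * 2 ** frac_bits) > limit:
        frac_bits -= 1
    return frac_bits


def quantize(weights, frac_bits, total_bits=8):
    """Scale by 2^frac_bits, round, and clamp to the signed integer range."""
    lo = -(2 ** (total_bits - 1))
    hi = 2 ** (total_bits - 1) - 1
    scale = 2 ** frac_bits
    return [min(hi, max(lo, round(w * scale))) for w in weights]


def dequantize(q_weights, frac_bits):
    """Map the integers back to real values to inspect the rounding error."""
    return [q / 2 ** frac_bits for q in q_weights]


# Example: a layer whose weights exceed 1.0 needs fewer fractional bits.
layer = [1.5, -3.0, 0.25]
bits = choose_frac_bits(layer)
q_layer = quantize(layer, bits)
```

Each layer gets its own format, so a layer with small weights keeps more fractional precision than one with large weights. At 8 bits per weight, a model of about 20K parameters takes roughly 20 KB of storage.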


Paper and code

Check out our paper “Compact recurrent neural networks for acoustic event detection on low-energy low-complexity platforms”, published in the IEEE Journal of Selected Topics in Signal Processing [ArXiv]

Our code is also available:

