Deep Learning in Edge – Convolutional Neural Network for Sound Classification on ESP32

Audio Scene Classification (ASC) is one of the basic tasks in the field of computer acoustics. The goal is to assign a piece of recorded sound the correct environment label, such as a dog barking or rain. This project deploys ASC on the ESP32, an embedded platform that runs at low power consumption, so that it can be useful in daily life. This post introduces the process behind the project, including audio acquisition, feature extraction, the neural network, and deployment.