MOSEL: a multilingual dataset collection
The MOSEL corpus is a multilingual dataset collection including up to 950K hours of open-source speech recordings covering the 24 official languages of the European Union. We collect data by surveying labeled and unlabeled...
Read More