[1] HUANG Hongkeng, LI Ying. Identifying low-SNR animal sounds based on Bark spectral projection[J]. CAAI Transactions on Intelligent Systems, 2018, 13(4): 610-618. [doi:10.11992/tis.201703008]
Original Chinese citation: 黄鸿铿, 李应. 用Bark频谱投影识别低信噪比动物声音[J]. 智能系统学报, 2018, 13(4): 610-618.
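CAAI Transactions on Intelligent Systems (《智能系统学报》) [ISSN 1673-4785/CN 23-1538/TP] Volume:
13
Issue:
2018, No. 4
Pages:
610-618
Column:
Academic Papers: Intelligent Systems
Publication date:
2018-07-05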
- Title:
-
Identifying low-SNR animal sounds based on Bark spectral projection
- Author(s):
-
HUANG Hongkeng (黄鸿铿), LI Ying (李应)
-
College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350116, Fujian, China
- Keywords:
-
sound signal; automatic recognition; wavelet packet transform; random forests; environmental sound
- CLC number:
-
TP391
- DOI:
-
10.11992/tis.201703008
- Abstract:
-
Complex environmental sounds degrade the automatic recognition of animal sounds with low signal-to-noise ratios (SNRs). To address this problem, we propose a method for identifying low-SNR animal sounds in different acoustic scenes. The sound signal is first decomposed with a Bark-scale wavelet packet transform. The decomposition coefficients are then used to generate the spectrum of the reconstructed signal, and this spectrum is projected to produce the Bark spectral projection (BSP) feature. A random forest (RF) classifier then identifies the animal sounds. Recognition experiments were carried out on 40 kinds of animal sounds at different SNRs in four noise environments: flowing water, highway, wind, and noisy speech. The results show that combining short-time spectrum estimation, the BSP feature, and RF can reach an average recognition rate of 80.5% across the environments and SNRs tested, and that the average recognition rate remains above 60% even at an SNR of -10 dB.
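The abstract reports results at several SNRs, down to -10 dB, but does not say how the noisy test mixtures were built. As a minimal sketch of what an SNR of -10 dB means, the following uses the standard definition SNR = 10 log10(P_signal / P_noise) to scale a noise recording before adding it to a clean signal. The function names (`mean_power`, `mix_at_snr`, `measured_snr_db`) and the synthetic tone and Gaussian noise are illustrative assumptions, not taken from the paper.

```python
import math
import random


def mean_power(samples):
    """Average power (mean of squared amplitudes) of a sequence of samples."""
    return sum(x * x for x in samples) / len(samples)


def mix_at_snr(signal, noise, target_snr_db):
    """Scale `noise` so that P_signal / P_noise == 10 ** (target_snr_db / 10),
    then add it to `signal` sample by sample.

    Returns (mixed, scaled_noise). Hypothetical helper for illustration only.
    """
    if len(signal) != len(noise):
        raise ValueError("signal and noise must have the same length")
    p_signal = mean_power(signal)
    p_noise = mean_power(noise)
    if p_signal == 0 or p_noise == 0:
        raise ValueError("signal and noise must both have non-zero power")
    # Noise power required to hit the requested SNR.
    target_noise_power = p_signal / (10 ** (target_snr_db / 10))
    gain = math.sqrt(target_noise_power / p_noise)
    scaled_noise = [gain * n for n in noise]
    mixed = [s + n for s, n in zip(signal, scaled_noise)]
    return mixed, scaled_noise


def measured_snr_db(signal, noise):
    """SNR in dB computed from the signal and noise components."""
    return 10 * math.log10(mean_power(signal) / mean_power(noise))


if __name__ == "__main__":
    rng = random.Random(0)
    sample_rate = 8000  # assumed sample rate for the synthetic example
    # Stand-ins for an animal call and a background recording.
    call = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
    background = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]
    for target in (10, 0, -10):
        _, scaled = mix_at_snr(call, background, target)
        print(target, round(measured_snr_db(call, scaled), 6))
```

At -10 dB the scaled noise carries ten times the power of the animal sound, which is why an average recognition rate above 60% at that level is a demanding condition.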
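Memo
Received: 2017-03-08.
Funding: National Natural Science Foundation of China (61075022); Natural Science Foundation of Fujian Province (2018J01793).
About the authors: HUANG Hongkeng, male, born in 1993, master's student; his main research interests are sound event detection and information security. LI Ying, male, born in 1964, professor, Ph.D.; his main research interests are multimedia data retrieval, sound event detection, and information security. He holds 10 granted invention patents and has published more than 20 academic papers.
Corresponding author: LI Ying. E-mail: fj_liying@fzu.edu.cn.
Last Update: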
2018-08-25