10.24425/aoa.2025.154812
Multi-label Bird Species Classification Using Transfer Learning Network
References
Abdul Kareem N., Rajan R. (2023), Multi-label bird species classification using sequential aggregation strategy from audio recordings, Computing and Informatics, 42(5): 1255–1280, https://doi.org/10.31577/cai_2023_5_1255.
Bravo Sanchez F.J., Hossain M.R., English N.B., Moore S.T. (2021), Bioacoustic classification of avian calls from raw sound waveforms with an open-source deep learning architecture, Scientific Reports, 11: 15733, https://doi.org/10.1038/s41598-021-95076-6.
Briggs F. et al. (2012), Acoustic classification of multiple simultaneous bird species: A multi-instance multilabel approach, The Journal of the Acoustical Society of America, 131(6): 4640–4650, https://doi.org/10.1121/1.4707424.
Cheng Y., Ma M., Li X., Zhou Y. (2021), Multi-label classification of fundus images based on graph convolutional network, BMC Medical Informatics and Decision Making, 21: 82, https://doi.org/10.1186/s12911-021-01424-x.
Deng J., Dong W., Socher R., Li L.J., Li K., Li F.F. (2009), ImageNet: A large-scale hierarchical image database, [in:] 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255, https://doi.org/10.1109/CVPR.2009.5206848.
Fagerlund S. (2004), Automatic recognition of bird species by their sounds, MSc. Thesis, Helsinki University of Technology.
Godbole S., Sarawagi S. (2004), Discriminative methods for multi-labeled classification, [in:] Advances in Knowledge Discovery and Data Mining. PAKDD 2004. Lecture Notes in Computer Science, Dai H., Srikant R., Zhang C. [Eds.], 3056: 22–30, https://doi.org/10.1007/978-3-540-24775-3_5.
Gomez-Gomez J., Vidana-Vila E., Sevillano X. (2023), Western Mediterranean Wetland Birds dataset: A new annotated dataset for acoustic bird species classification, Ecological Informatics, 75: 102014, https://doi.org/10.1016/j.ecoinf.2023.102014.
Gunawan K.W., Hidayat A.A., Cenggoro T.W., Pardamean B. (2021), A transfer learning strategy for owl sound classification by using image classification model with audio spectrogram, International Journal on Electrical Engineering and Informatics, 13(3): 546–553, https://doi.org/10.15676/ijeei.2021.13.3.3.
He K., Zhang X., Ren S., Sun J. (2016), Deep residual learning for image recognition, [in:] 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778, https://doi.org/10.1109/CVPR.2016.90.
Huang Y.-P., Basanta H. (2021), Recognition of endemic bird species using deep learning models, IEEE Access, 9: 102975–102984, https://doi.org/10.1109/ACCESS.2021.3098532.
Leng Y.R., Dat Tran H. (2014), Multi-label bird classification using an ensemble classifier with simple features, Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific, pp. 1–5, https://doi.org/10.1109/APSIPA.2014.7041649.
Li G., Ji Z.F., Chang Y.L., Li S., Qu X.D., Cao D.P. (2021), ML-ANet: A transfer learning approach using adaptation network for multi-label image classification in autonomous driving, Chinese Journal of Mechanical Engineering, 34: 78, https://doi.org/10.1186/s10033-021-00598-9.
Liu A. et al. (2021), Residual recurrent CRNN for end-to-end optical music recognition on monophonic scores, arXiv, http://arxiv.org/abs/2010.13418.
Liu H.T. (2016), A study on multi-label transfer learning algorithm and application in the bird sounds recognition, MSc. Thesis, Nanjing Forestry University.
Michaud F., Sueur J., Le Cesne M., Haupert S. (2023), Unsupervised classification to improve the quality of a bird song recording dataset, Ecological Informatics, 74: 101952, https://doi.org/10.1016/j.ecoinf.2022.101952.
Nishikimi R., Nakamura E., Goto M., Yoshii K. (2021), Audio-to-score singing transcription based on a CRNN-HSMM hybrid model, APSIPA Transactions on Signal and Information Processing, 10(1): e7, https://doi.org/10.1017/ATSIP.2021.4.
Noumida A., Rajan R. (2022), Multi-label bird species classification from audio recordings using attention framework, Applied Acoustics, 197: 108901, https://doi.org/10.1016/j.apacoust.2022.108901.
Paniri M., Dowlatshahi M.B., Nezamabadi-pour H. (2020), MLACO: A multi-label feature selection algorithm based on ant colony optimization, Knowledge-Based Systems, 192: 105285, https://doi.org/10.1016/j.knosys.2019.105285.
Sainath T.N., Vinyals O., Senior A., Sak H. (2015), Convolutional, long short-term memory, fully connected deep neural networks, [in:] 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4580–4584, https://doi.org/10.1109/ICASSP.2015.7178838.
Sevilla A., Glotin H. (2017), Audio bird classification with Inception-v4 extended with time and time-frequency attention mechanisms, Working Notes of CLEF 2017 – Conference and Labs of the Evaluation Forum, Cappellato L., Ferro N., Goeuriot L., Mandl T. [Eds.], 1866, https://ceur-ws.org/Vol-1866/paper_177.pdf.
Simonyan K., Zisserman A. (2014), Very deep convolutional networks for large-scale image recognition, arXiv, http://arxiv.org/abs/1409.1556.
Sorower M.S. (2010), A literature survey on algorithms for multi-label learning.
Sprengel E., Jaggi M., Kilcher Y., Hofmann T. (2016), Audio based bird species identification using deep learning techniques, Working Notes of CLEF 2016 – Conference and Labs of the Evaluation Forum, Balog K., Cappellato L., Ferro N., Macdonald C. [Eds.], 1609, https://ceur-ws.org/Vol-1609/16090547.pdf.
Szegedy C., Ioffe S., Vanhoucke V., Alemi A. (2017), Inception-v4, Inception-ResNet and the impact of residual connections on learning, [in:] Proceedings of the AAAI Conference on Artificial Intelligence, 31(1), https://doi.org/10.1609/aaai.v31i1.11231.
Szegedy C., Vanhoucke V., Ioffe S., Shlens J., Wojna Z. (2016), Rethinking the Inception architecture for computer vision, [in:] 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2818–2826, https://doi.org/10.1109/CVPR.2016.308.
Tao J., Fang X. (2020), Toward multi-label sentiment analysis: a transfer learning based approach, Journal of Big Data, 7: 1, https://doi.org/10.1186/s40537-019-0278-0.
Weiss K., Khoshgoftaar T.M., Wang D. (2016), A survey of transfer learning, Journal of Big Data, 3: 9, https://doi.org/10.1186/s40537-016-0043-6.
Zhang L., Towsey M., Xie J., Zhang J., Roe P. (2016), Using multi-label classification for acoustic pattern detection and assisting bird species surveys, Applied Acoustics, 110: 91–98, https://doi.org/10.1016/j.apacoust.2016.03.027.