10.24425/aoa.2025.153662
CAPSE-ViT: A Lightweight Framework for Underwater Acoustic Vessel Classification Using Coherent Spectral Estimation and Modified Vision Transformer
computational complexity.