10.1515/aoa-2016-0064
Laughter Classification Using Deep Rectifier Neural Networks with a Minimal Feature Subset
References
Bachorowski, J.-A., Smoski, M. J., and Owren, M. J. (2001). The acoustic features of human laughter. Journal of the Acoustical Society of America, 110(3), 1581-1597.
Bengio, Y., Lamblin, P., Popovici, D., and Larochelle, H. (2007). Greedy layer-wise training of deep networks. Advances in Neural Information Processing Systems, 19, 153-160.
Acoustic analysis of laughter. In Proceedings of ICSLP, pages 927-930, Banff, Canada.
Blomberg, M. and Elenius, K. (1992). Speech recognition using artificial neural networks and dynamic programming. In Proceedings of Fonetik, page 57, Göteborg, Sweden.
Connectionist Speech Recognition: A Hybrid Approach. Kluwer Academic, Norwell.
Brendel, M., Zaccarelli, R., and Devillers, L. (2010). A quick sequential forward floating feature selection algorithm for emotion detection from speech. In Proceedings of Interspeech, pages 1157-1160, Makuhari, Japan.
Hierarchical neural networks and enhanced class posteriors for social signal classification. In Proceedings of ASRU, pages 362-367.
The animal nature of spontaneous human laughter. Evolution and Human Behavior, 35(4), 327-335.
Busso, C., Mariooryad, S., Metallinou, A., and Narayanan, S. (2013). Iterative feature normalization scheme for automatic emotion detection from speech. IEEE Transactions on Affective Computing, 4(4), 386-397.
High-light sound effects detection in audio stream. In Proceedings of ICME, pages 37-40.
On the use of nonverbal speech sounds in human communication. In Proceedings of COST Action 2102: Verbal and Nonverbal Communication Behaviours, pages 117-128, Vietri sul Mare, Italy.
Campbell, N., Kashioka, H., and Ohara, R. (2005). No laughing matter. In Proceedings of Interspeech, pages 465-468, Lisbon, Portugal.
Chandrashekar, G. and Sahin, F. (2014). A survey on feature selection methods. Computers & Electrical Engineering, 40(1), 16-28.
Pattern Recognition, a Statistical Approach. Prentice Hall.
Laughter in interaction. Cambridge University Press, Cambridge, UK.
Deep sparse rectifier networks. In Proceedings of AISTATS, pages 315-323.
Goldstein, J. H. and McGhee, P. E. (1972). The psychology of humor: Theoretical perspectives and empirical issues. Academic Press, New York, USA.
BEA: A multifunctional Hungarian spoken language database. The Phonetician, 105(106), 50-61.
Conflict intensity estimation from speech using greedy forward-backward feature selection. In Proceedings of Interspeech, pages 1339-1343, Dresden, Germany.
On evaluation metrics for social signal detection. In Proceedings of Interspeech, pages 2504-2508, Dresden, Germany.
Gosztolya, G., Busa-Fekete, R., and Tóth, L. (2013). Detecting autism, emotions and social signals using AdaBoost. In Proceedings of Interspeech, pages 220-224, Lyon, France.
Gosztolya, G., Grósz, T., Busa-Fekete, R., and Tóth, L. (2014). Detecting the intensity of cognitive and physical load using AdaBoost and Deep Rectifier Neural Networks. In Proceedings of Interspeech, pages 452-456, Singapore.
A comparison of Deep Neural Network training methods for Large Vocabulary Speech Recognition. In Proceedings of TSD, pages 36-43, Pilsen, Czech Republic.
Grósz, T., Busa-Fekete, R., Gosztolya, G., and Tóth, L. (2015). Assessing the degree of nativeness and Parkinson's condition using Gaussian Processes and Deep Rectifier Neural Networks. In Proceedings of Interspeech, pages 1339-1343.
What's in a laugh? Humour, jokes, and laughter in the conversational corpus of the BNC. Ph.D. thesis, Universität Freiburg.
Gupta, R., Audhkhasi, K., Lee, S., and Narayanan, S. S. (2013). Speech paralinguistic event detection using probabilistic time-series smoothing and masking. In Proceedings of Interspeech, pages 173-177.
Nevetés a társalgásban. In K. Laczkó and S. Tátrai, editors, Elmélet és módszer, pages 105-129. ELTE Eötvös József Collegium, Budapest, Hungary.
A fast learning algorithm for deep belief nets. Neural Computation, 18(7), 1527-1554.
Having a laugh at work: How humour contributes to workplace culture. Journal of Pragmatics, 34(12), 1683-1710.
Hudenko, W., Stone, W., and Bachorowski, J.-A. (2009). Laughter differs in children with autism: An acoustic analysis of laughs produced by children with and without the disorder. Journal of Autism and Developmental Disorders, 39(10), 1392-1400.
Laughter detection in meetings. In Proceedings of the NIST Meeting Recognition Workshop at ICASSP, pages 118-121, Montreal, Canada.
Automatic laughter detection using neural networks. In Proceedings of Interspeech, pages 2973-2976, Antwerp, Belgium.
Kovács, Gy. and Tóth, L. (2015). Joint optimization of spectro-temporal features and Deep Neural Nets for robust automatic speech recognition. Acta Cybernetica, 22(1), 117-134.
LAFCam leveraging affective feedback camcorder. In Proceedings of CHI EA, pages 574-575, Minneapolis, MN, USA.
A characterization of the Gamma distribution. Annals of Mathematical Statistics, 26(2), 319-324.
The psychology of humor: An integrative approach. Elsevier, Amsterdam, NL.
Neuberger, T. and Beke, A. (2013a). Automatic laughter detection in Hungarian spontaneous speech using GMM/ANN hybrid method. In Proceedings of SJUSK Conference on Contemporary Speech Habits, pages 1-13.
Automatic laughter detection in spontaneous speech using GMM-SVM method. In Proceedings of TSD, pages 113-120.
Neuberger, T., Beke, A., and Gósy, M. (2014). Acoustic analysis and automatic detection of laughter in Hungarian spontaneous speech. In Proceedings of ISSP, pages 281-284.
Nwokah, E. E., Davies, P., Islam, A., Hsu, H.-C., and Fogel, A. (1993). Vocal affect in three-year-olds: a quantitative acoustic analysis of child laughter. Journal of the Acoustical Society of America, 94(6), 3076-3090.
Rothgänger, H., Hauser, G., Cappellini, A. C., and Guidotti, A. (1998). Analysis of laughter and speech sounds in Italian and German students. Naturwissenschaften, 85(8), 394-402.
Salamin, H., Polychroniou, A., and Vinciarelli, A. (2013). Automatic detection of laughter and fillers in spontaneous mobile phone conversations. In Proceedings of SMC, pages 4282-4287.
Improved boosting algorithms using confidence-rated predictions. Machine Learning, 37(3), 297-336.
Schölkopf, B., Platt, J., Shawe-Taylor, J., Smola, A., and Williamson, R. (2001). Estimating the support of a high-dimensional distribution. Neural Computation, 13(7), 1443-1471.
Schuller, B., Steidl, S., Batliner, A., Vinciarelli, A., Scherer, K., Ringeval, F., Chetouani, M., Weninger, F., Eyben, F., Marchi, E., Salamin, H., Polychroniou, A., Valente, F., and Kim, S. (2013). The Interspeech 2013 Computational Paralinguistics Challenge: Social signals, Conflict, Emotion, Autism. In Proceedings of Interspeech.
Feature engineering in context-dependent deep neural networks for conversational speech transcription. In Proceedings of ASRU, pages 24-29.
Building a multimodal laughter database for emotion recognition. In Proceedings of LREC, pages 2347-2350.
Acoustic features of four types of laughter in natural conversational speech. In Proceedings of ICPhS, pages 1958-1961.
Phone recognition with Deep Sparse Rectifier Neural Networks. In Proceedings of ICASSP, pages 6985-6989.
Combining time- and frequency-domain convolution in convolutional neural network-based phone recognition. In Proceedings of ICASSP, pages 190-194.
Convolutional deep maxout networks for phone recognition. In Proceedings of Interspeech, pages 1078-1082.
Phone recognition with hierarchical Convolutional Deep Maxout Networks. EURASIP Journal on Audio, Speech, and Music Processing, 2015(25), 1-13.
Tóth, L., Gosztolya, G., Vincze, V., Hoffmann, I., Szatlóczki, G., Biró, E., Zsura, F., Pákáski, M., and Kálmán, J. (2015). Automatic detection of mild cognitive impairment from spontaneous speech using ASR. In Proceedings of Interspeech, pages 2694-2698, Dresden, Germany.
Paralanguage: A first approximation. Studies in Linguistics, 13, 1-12.
The typology of paralanguage. Anthropological Linguistics, 3(1), 17-21.
Conventional, biological and environmental factors in speech communication: a modulation theory. Phonetica, 51(1-3), 170-183.
Evidence for demodulation in speech perception. In Proceedings of ICSLP, pages 790-793, Beijing, China.
Truong, K. P. and van Leeuwen, D. A. (2005). Automatic detection of laughter. In Proceedings of Interspeech, pages 485-488, Lisbon, Portugal.
Truong, K. P. and van Leeuwen, D. A. (2007). Automatic discrimination between laughter and speech. Speech Communication, 49(2), 144-158.
Nem verbális hangjelenségek spontán társalgásban. Beszédkutatás, 2011, 134-148.
Vicsi, K., Sztahó, D., and Kiss, G. (2012). Examination of the sensitivity of acoustic-phonetic parameters of speech to depression. In Proceedings of CogInfoCom, pages 511-515, Kosice, Slovakia.