A Highly Efficient Distributed Deep Learning System for Automatic Speech Recognition
Wei Zhang, Xiaodong Cui, Ulrich Finkler, George Saon, Abdullah Kayi, Alper Buyuktosunoglu, Brian Kingsbury, David Kung, Michael Picheny
https://www.isca-speech.org/archive/Interspeech_2019/pdfs/2700.pdf
Cool merge graphs
Detection and Recovery of OOVs for Improved English Broadcast News Captioning
Samuel Thomas (IBM Research AI), Kartik Audhkhasi (IBM Research AI), Zoltan Tuske (IBM Research AI), Yinghui Huang (IBM Research AI), Michael Picheny (IBM Research AI)
https://www.isca-speech.org/archive/Interspeech_2019/pdfs/2793.pdf
Nothing new but still important
Disfluencies and Human Speech Transcription Errors
Vicky Zayats (University of Washington), Trang Tran (University of Washington), Courtney Mansfield (University of Washington), Richard Wright (University of Washington), Mari Ostendorf (University of Washington)
https://www.isca-speech.org/archive/Interspeech_2019/pdfs/3134.pdf
Robust Sound Recognition: A Neuromorphic Approach
Jibin Wu (National University of Singapore), Zihan Pan, Malu Zhang, Rohan Kumar Das, Yansong Chua, Haizhou Li
https://www.isca-speech.org/archive/Interspeech_2019/pdfs/8032.pdf
Spiking neural networks
Neural Named Entity Recognition from Subword Units
Abdalghani Abujabal (Max Planck Institute for Informatics), Judith Gaspers (Amazon)
https://www.isca-speech.org/archive/Interspeech_2019/pdfs/1305.pdf
Name recognition is still important
Unsupervised Acoustic Segmentation and Clustering using Siamese Network Embeddings
Saurabhchand Bhati (The Johns Hopkins University), Shekhar Nayak (Indian Institute of Technology Hyderabad), Sri Rama Murty Kodukula (IIT Hyderabad), Najim Dehak (Johns Hopkins University)
https://www.isca-speech.org/archive/Interspeech_2019/pdfs/2981.pdf
Acoustic Model Bootstrapping Using Semi-Supervised Learning
Langzhou Chen (Amazon Cambridge office), Volker Leutnant (Amazon Aachen office)
https://www.isca-speech.org/archive/Interspeech_2019/pdfs/2818.pdf
Bandwidth Embeddings for Mixed-bandwidth Speech Recognition
Gautam Mantena (Apple Inc.), Ozlem Kalinli (Apple Inc.), Ossama Abdel-Hamid (Apple Inc.), Don McAllaster (Apple Inc.)
https://www.isca-speech.org/archive/Interspeech_2019/pdfs/2589.pdf
Towards Debugging Deep Neural Networks by Generating Speech Utterances
Bilal Soomro (University of Eastern Finland), Anssi Kanervisto (University of Eastern Finland), Trung Ngo Trong (University of Eastern Finland), Ville Hautamaki (University of Eastern Finland)
https://www.isca-speech.org/archive/Interspeech_2019/pdfs/2339.pdf
Debugging is a very nice idea
A Study for Improving Device-Directed Speech Detection toward Frictionless Human-Machine Interaction
Che-Wei Huang (Amazon), Roland Maas (Amazon.com), Sri Harish Mallidi (Amazon, USA), Bjorn Hoffmeister (Amazon.com)
https://www.isca-speech.org/archive/Interspeech_2019/pdfs/2840.pdf
Nice idea, we covered that before
Deep Learning for Orca Call Type Identification — A Fully Unsupervised Approach
Christian Bergler, Manuel Schmitt, Rachael Xi Cheng, Andreas Maier, Volker Barth, Elmar Nöth
https://www.isca-speech.org/archive/Interspeech_2019/pdfs/1857.pdf
Kinda cool
The STC ASR System for the VOiCES from a Distance Challenge 2019
Ivan Medennikov (STC-innovations Ltd), Yuri Khokhlov (STC-innovations Ltd), Aleksei Romanenko (ITMO University), Ivan Sorokin (STC), Anton Mitrofanov (STC-innovations Ltd), Vladimir Bataev (Speech Technology Center Ltd), Andrei Andrusenko (STC-innovations Ltd), Tatiana Prisyach (STC-innovations Ltd), Mariya Korenevskaya (STC-innovations Ltd), Oleg Petrov (ITMO University), Alexander Zatvornitskiy (Speech Technology Center)
https://www.isca-speech.org/archive/Interspeech_2019/pdfs/1574.pdf
Kaggle-style system with cool tricks (character-based LM), congrats to STC
Continuous Emotion Recognition in Speech – Do We Need Recurrence?
Maximilian Schmitt (ZD.B Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg), Nicholas Cummins (University of Augsburg), Björn Schuller (University of Augsburg / Imperial College London)
https://www.isca-speech.org/archive/Interspeech_2019/pdfs/2710.pdf
Self-supervised speaker embeddings
Themos Stafylakis (Omilia - Conversational Intelligence), Johan Rohdin (Brno University of Technology), Oldrich Plchot (Brno University of Technology), Petr Mizera (Czech Technical University in Prague), Lukas Burget (Brno University of Technology)
https://www.isca-speech.org/archive/Interspeech_2019/pdfs/2842.pdf
"Self-supervised" is the word of the year
Better morphology prediction for better speech systems
Dravyansh Sharma (Carnegie Mellon University), Melissa Wilson (Google LLC), Antoine Bruguier (Google LLC)
https://www.isca-speech.org/archive/Interspeech_2019/pdfs/3207.pdf
Connecting and Comparing Language Model Interpolation Techniques
Ernest Pusateri, Christophe Van Gysel, Rami Botros, Sameer Badaskar, Mirko Hannemann, Youssef Oualil, Ilya Oparin
https://www.isca-speech.org/archive/Interspeech_2019/pdfs/1822.pdf
Worth a reminder
Articulation rate as a metric in spoken language assessment
Calbert Graham (University of Cambridge), Francis Nolan (University of Cambridge)
https://www.isca-speech.org/archive/Interspeech_2019/pdfs/2098.pdf