Publications

Compiled from isle_pubs.bib.

Bashima Islam, Nancy L McElwain, Jialu Li, Maria Davila, Yannan Hu, Kexin Hu, Jordan M Bodway, Ashutosh M Dhekne, Romit Roy Choudhury, & Mark Hasegawa-Johnson. Preliminary Technical Validation of LittleBeats™: A Multimodal Sensing Platform to Capture Cardiac Physiology, Motion, and Vocalizations. Preprints, no. 2024010906, Jan, 2024

Liming Wang, Mark Hasegawa-Johnson, & Chang Yoo. Unsupervised Speech Recognition with N-Skipgram and Positional Unigram Matching. Proc. ICASSP, no. 4604, in press, 2024

Heting Gao, Mark Hasegawa-Johnson, & Chang D. Yoo. G2PU: Grapheme-to-Phoneme Transducer with Speech Units. Proc. ICASSP, no. 1746, in press, 2024

Kai Chieh Chang, Mark Hasegawa-Johnson, Nancy L. McElwain, & Bashima Islam. Classification of Infant Sleep/Wake States: Cross-Attention Among Large Scale Pretrained Transformer Networks Using Audio, ECG, and IMU Data. APSIPA ASC, Nov, 2023

Liming Wang, Mark Hasegawa-Johnson, & Chang D. Yoo. A Theory of Unsupervised Speech Recognition. ACL, Jul, 2023

Liming Wang, Junrui Ni, Heting Gao, Jialu Li, Kai Chieh Chang, Xulin Fan, Junkai Wu, Mark Hasegawa-Johnson, & Chang D. Yoo. Speak and Decipher and Sign: Toward Unsupervised Speech-to-Sign Language Recognition. Findings of ACL, Jul, 2023

Nancy McElwain, Bashima Islam, Meghan Fisher, Camille Nebeker, Jordan Marie Bodway, & Mark Hasegawa-Johnson. Evaluating Users’ Experiences of a Child Multimodal Wearable Device: A Mixed Methods Approach. JMIR Human Factors, in press, 2023

Jialu Li, Mark Hasegawa-Johnson, & Nancy McElwain. Towards Robust Family-Infant Audio Analysis Based on Unsupervised Pretraining of Wav2vec 2.0 on Large-Scale Unlabeled Family Audio. Proc. Interspeech, 2023

Wanyue Zhai, & Mark Hasegawa-Johnson. Wav2ToBI: a new approach to automatic ToBI transcription. Proc. Interspeech, 2023

Hee Suk Yoon, Eunseop Yoon, John Harvill, Sunjae Yoon, Mark Hasegawa-Johnson, & Chang D. Yoo. SMSMix: Sense Maintained Sentence Mixup for Word Sense Disambiguation. EMNLP, pp. 1493–1502, Dec, 2022

Eunseop Yoon, Hee Suk Yoon, Dhananjaya Gowda, SooHwan Eom, Daehyeok Kim, John Harvill, Heting Gao, Mark Hasegawa-Johnson, Chanwoo Kim, & Chang D. Yoo. Mitigating the Exposure Bias in Sentence-Level Grapheme-to-Phoneme (G2P) Transduction. Proc. Interspeech, 2023

Zhongweiyang Xu, Xulin Fan, & Mark Hasegawa-Johnson. Dual-Path Cross-Modal Attention for Better Audio-Visual Speech Extraction. Proc. ICASSP, recognized as one of the top 3% of papers at the conference, 2023

Wonjune Kang, Mark Hasegawa-Johnson, & Deb Roy. End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions. Proc. Interspeech, 2023

Piotr Żelasko, Siyuan Feng, Laureano Moro-Velázquez, Ali Abavisani, Saurabhchand Bhati, Odette Scharenborg, Mark Hasegawa-Johnson, & Najim Dehak. Discovering Phonetic Inventories with Crosslingual Automatic Speech Recognition. Computer Speech and Language, vol. 74, pp. 101358:1-54, Jul, 2022

Haeyong Kang, Rusty John Lloyd Mina, Sultan Rizky Hikmawan Madjid, Jaehong Yoon, Mark Hasegawa-Johnson, Sung Ju Hwang, & Chang D Yoo. Forget-free continual learning with winning subnetworks. Proc. International Conference on Machine Learning (ICML), vol. 162, pp. 10734-10750, Jun, 2022

Jialu Li, & Mark Hasegawa-Johnson. Autosegmental Neural Nets 2.0: An Extensive Study of Training Synchronous and Asynchronous Phones and Tones for Under-Resourced Tonal Languages. IEEE Transactions on Audio, Speech and Language, vol. 30, pp. 1918-1926, May, 2022

Liming Wang, Siyuan Feng, Mark A. Hasegawa-Johnson, & Chang D. Yoo. Self-supervised Semantic-driven Phoneme Discovery for Zero-resource Speech Recognition. ACL, pp. 8027–8047, May, 2022

John Harvill, Mark Hasegawa-Johnson, & Chang D. Yoo. Frame-Level Stutter Detection. Proc. Interspeech 2022, pp. 2843-2847, 2022

John Harvill, Yash Wani, Narendra Ahuja, Mark Hasegawa-Johnson, David Chestek, Mustafa Alam, & David Beiser. Estimation of Respiratory Rate from Breathing Audio. 44th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2022

Heting Gao, Junrui Ni, Yang Zhang, Kaizhi Qian, Shiyu Chang, & Mark Hasegawa-Johnson. Domain Generalization for Language-Independent Automatic Speech Recognition. Frontiers in Artificial Intelligence, vol. 5, pp. 806274, 2022

Raymond Yeh, Mark Hasegawa-Johnson, & Alexander Schwing. Equivariance Discovery by Learned Parameter-Sharing. AISTATS, 2022

John Harvill, Roxana Girju, & Mark Hasegawa-Johnson. Syn2Vec: Synset Colexification Graphs for Lexical Semantic Similarity. Proc. NAACL, pp. 5259–5270, 2022

Heting Gao, Xiaoxuan Wang, Sunghun Kang, Rusty Mina, Dias Issa, John Harvill, Leda Sarı, Mark Hasegawa-Johnson, & Chang D. Yoo. Seamless Equal Accuracy Ratio for Inclusive CTC Speech Recognition. Speech Communication, vol. 136, pp. 76-83, 2022

Leda Sarı, Mark Hasegawa-Johnson, & Samuel Thomas. Auxiliary Networks for Joint Speaker Adaptation and Speaker Change Detection. IEEE Transactions on Audio, Speech, and Language, vol. 29, pp. 324-333, 2022

Mahir Morshed, & Mark Hasegawa-Johnson. Cross-lingual articulatory feature information transfer for speech recognition using recurrent progressive neural networks. Proc. Interspeech 2022, pp. 2298-2302, 2022

Heting Gao, Junrui Ni, Kaizhi Qian, Yang Zhang, Shiyu Chang, & Mark Hasegawa-Johnson. WavPrompt: Towards Few-Shot Spoken Language Understanding with Frozen Language Models. Proc. Interspeech 2022, pp. 2738-2742, 2022

Junrui Ni, Liming Wang, Heting Gao, Kaizhi Qian, Yang Zhang, Shiyu Chang, & Mark Hasegawa-Johnson. Unsupervised Text-to-Speech Synthesis by Unsupervised Automatic Speech Recognition. Proc. Interspeech 2022, pp. 461-465, 2022

Chak Ho Chan, Kaizhi Qian, Yang Zhang, & Mark Hasegawa-Johnson. SpeechSplit2.0: Unsupervised Speech Disentanglement for Voice Conversion without Tuning Autoencoder Bottlenecks. ICASSP, pp. 6332-6336, 2022

Odette Scharenborg, & Mark Hasegawa-Johnson. Position Paper: Brain Signal-based Dialogue Systems. Lecture Notes in Electrical Engineering, vol. 714, Marchi, E., Siniscalchi, S.M., Cumani, S., Salerno, V.M., Li, H., eds., Mar, 2021

Jialu Li, Mark Hasegawa-Johnson, & Nancy McElwain. Analysis of Acoustic and Voice Quality Features for the Classification of Infant and Mother Vocalizations. Speech Communication, vol. 133, pp. 41-61, 2021

Andrew Rosenberg, & Mark Hasegawa-Johnson. Automatic Prosody Labeling and Assessment. Oxford Handbook of Language Prosody, Carlos Gussenhoven and Aoju Chen, eds., Oxford University Press, pp. 646-656, 2021

Junzhe Zhu, Mark Hasegawa-Johnson, & Nancy McElwain. A Comparison Study on Infant-Parent Voice Diarization. Proc. ICASSP, pp. 7178-7182, 2021

John Harvill, Yash R. Wani, Mark Hasegawa-Johnson, Narendra Ahuja, David Beiser, & David Chestek. Classification of COVID-19 from Cough Using Autoregressive Predictive Coding Pretraining and Spectral Data Augmentation. Proc. Interspeech, pp. 926-930, 2021

Junrui Ni. Enforcing constraints for multi-lingual and cross-lingual speech-to-text systems. Master’s Thesis, University of Illinois, 2021

Hui Shi, Yang Zhang, Hao Wu, Shiyu Chang, Kaizhi Qian, Mark Hasegawa-Johnson, & Jishen Zhao. Continuous CNN for Nonuniform Time Series. Proc. ICASSP, 2021

Kiran Ramnath, Leda Sarı, Mark Hasegawa-Johnson, & Chang Yoo. Worldly Wise (WoW) – Cross-Lingual Knowledge Fusion for Fact-based Visual Spoken-Question Answering. Proc. NAACL, pp. 1908–1919, 2021

Zhonghao Wang, Mo Yu, Kai Wang, Jinjun Xiong, Wen-mei Hwu, Mark Hasegawa-Johnson, & Humphrey Shi. Interpretable Visual Reasoning via Induced Symbolic Space. ICCV, pp. 1878-1887, 2021

Leda Sarı, Mark Hasegawa-Johnson, & Chang D. Yoo. Counterfactually Fair Automatic Speech Recognition. IEEE Transactions on Audio, Speech, and Language, vol. 29, pp. 3515-3525, 2021

Heting Gao, Junrui Ni, Yang Zhang, Kaizhi Qian, Shiyu Chang, & Mark Hasegawa-Johnson. Zero-shot Cross-Lingual Phonetic Recognition with External Language Embedding. Proc. Interspeech, pp. 1304-1308, 2021

Siyuan Feng, Piotr Żelasko, Laureano Moro-Velázquez, Ali Abavisani, Mark Hasegawa-Johnson, Odette Scharenborg, & Najim Dehak. How Phonotactics Affect Multilingual and Zero-shot ASR Performance. Proc. ICASSP, pp. 7238-7242, 2021

Liming Wang, Xinsheng Wang, Mark Hasegawa-Johnson, Odette Scharenborg, & Najim Dehak. Align or Attend? Toward More Efficient and Accurate Spoken Word Discovery Using Speech-to-Image Retrieval. Proc. ICASSP, 2021

John Harvill, Dias Issa, Mark Hasegawa-Johnson, & Chang D. Yoo. Synthesis of New Words for Improved Dysarthric Speech Recognition on an Expanded Vocabulary. Proc. ICASSP, pp. 6428-6432, 2021

Heting Gao. Improving multilingual speech recognition systems. Master’s Thesis, University of Illinois, 2021

Kaizhi Qian, Yang Zhang, Shiyu Chang, Chuang Gan, David D. Cox, Mark Hasegawa-Johnson, & Jinjun Xiong. Global Rhythm Style Transfer Without Text Transcriptions. ICML, 2021

Xinsheng Wang, Siyuan Feng, Jihua Zhu, Mark Hasegawa-Johnson, & Odette Scharenborg. Show and Speak: Directly Synthesize Spoken Description of Images. Proc. ICASSP, 2021

Junzhe Zhu, Raymond Yeh, & Mark Hasegawa-Johnson. Multi-Decoder DPRNN: Source Separation for Variable Number of Speakers. Proc. ICASSP, pp. 3420-3424, 2021

Junzhe Zhu, Mark Hasegawa-Johnson, & Leda Sari. Identify Speakers in Cocktail Parties with End-to-End Attention. Proc. Interspeech, pp. 3092-3096, 2020

Ali Abavisani, & Mark Hasegawa-Johnson. Automatic Estimation of Intelligibility Measure for Consonants in Speech. Proc. Interspeech, pp. 1161-1165, 2020

Mark Hasegawa-Johnson, Leanne Rolston, Camille Goudeseune, Gina-Anne Levow, & Katrin Kirchhoff. Grapheme-to-Phoneme Transduction for Cross-Language ASR. Lecture Notes in Computer Science, vol. 12379, pp. 3-19, 2020

Jialu Li, & Mark Hasegawa-Johnson. Autosegmental Neural Nets: Should Phones and Tones be Synchronous or Asynchronous? Proc. Interspeech, pp. 1027-1031, 2020

Piotr Żelasko, Laureano Moro-Velázquez, Mark Hasegawa-Johnson, Odette Scharenborg, & Najim Dehak. That Sounds Familiar: An Analysis of Phonetic Representations Transfer Across Languages. Proc. Interspeech 2020, pp. 3705-3709, 2020

Justin van der Hout, Mark Hasegawa-Johnson, & Odette Scharenborg. Evaluating Automatically Generated Phoneme Captions for Images. Proc. Interspeech, pp. 2317-2321, 2020

Liming Wang, & Mark Hasegawa-Johnson. A DNN-HMM-DNN Hybrid Model for Discovering Word-Like Units from Spoken Captions and Image Regions. Proc. Interspeech, pp. 1456-1460, 2020

Liming Wang, & Mark Hasegawa-Johnson. Multimodal word discovery and retrieval with spoken descriptions and visual concepts. IEEE Transactions on Audio, Speech and Language, vol. 28, pp. 1560-1573, 2020

Mark Hasegawa-Johnson. Multimodal Distant Supervision. NeurIPS Workshop on Self-Supervised Learning for Speech and Audio, 2020

Leda Sarı, Samuel Thomas, & Mark Hasegawa-Johnson. Training Spoken Language Understanding Systems with Non-Parallel Speech and Text. Proc. ICASSP, pp. 8109-8113, 2020

Leda Sarı, & Mark Hasegawa-Johnson. Deep F-Measure Maximization for End-to-End Speech Understanding. Proc. Interspeech, pp. 1580-1584, 2020

Kaizhi Qian, Yang Zhang, Shiyu Chang, Mark Hasegawa-Johnson, & David Cox. Unsupervised Speech Decomposition via Triple Information Bottleneck. Proc. International Conference on Machine Learning (ICML), vol. 119, pp. 7836-7846, 2020

Kaizhi Qian, Zeyu Jin, Mark Hasegawa-Johnson, & Gautham Mysore. F0-Consistent Many-to-Many Non-Parallel Voice Conversion via Conditional Autoencoder. Proc. ICASSP, pp. 6284-6288, 2020

Tarek Sakakini, Jong Yoon Lee, Aditya Srinivasa, Renato Azevedo, Victor Sadauskas, Kuangxiao Gu, Suma Bhat, Dan Morrow, James Graumlich, Saqib Walayat, Mark Hasegawa-Johnson, Donald Wilpern, & Ann Willemsen-Dunlap. Automatic Text Simplification of Health Materials in Low-Resource Domains. LOUHI: 11th International Workshop on Health Text Mining and Information Analysis, 2020

Daniel Morrow, Renato F.L. Azevedo, Leda Sari, Kuangxiao Gu, Tarek Sakakini, Mark Hasegawa-Johnson, Suma Bhat, James Graumlich, Thomas Huang, Andrew Hariharan, Yunxin Shao, & Elizabeth Cox. Closing the Loop in Computer Agent/Patient Communication. Proceedings of the 2020 Human Factors and Ergonomics Society Annual Meeting, Chicago, IL, 2020

Junrui Ni, Mark Hasegawa-Johnson, & Odette Scharenborg. The Time-Course of Phoneme Category Adaptation in Deep Neural Networks. Lecture Notes in Artificial Intelligence, vol. 11816, pp. 3-18, Oct, 2019

Yijia Xu. Acoustic Event, Spoken Keyword and Emotional Outburst Detection. Master’s Thesis, University of Illinois, 2019

Leda Sari, Samuel Thomas, Mark Hasegawa-Johnson, & Michael Picheny. Pre-Training of Speaker Embeddings for Low-Latency Speaker Change Detection in Broadcast News. Proc. ICASSP, pp. 3093:1-5, 2019

Odette Scharenborg, Jiska Koemans, Cybelle Smith, Mark A. Hasegawa-Johnson, & Kara D. Federmeier. The Neural Correlates Underlying Lexically-Guided Perceptual Learning. Proc. Interspeech, pp. 1223-1227, 2019

Mary Pietrowicz, Carla Agurto, Jonah Casebeer, Mark Hasegawa-Johnson, Karrie Karahalios, & Guillermo Cecchi. Dimensional Analysis of Laughter in Female Conversational Speech. Proc. ICASSP, pp. 6600-6604, 2019

Leda Sarı, Samuel Thomas, & Mark A. Hasegawa-Johnson. Learning Speaker Aware Offsets for Speaker Adaptation of Neural Networks. Proc. Interspeech 2019, pp. 769-773, 2019

Di He, Xuesong Yang, Boon Pang Lim, Yi Liang, Mark Hasegawa-Johnson, & Deming Chen. When CTC Training Meets Acoustic Landmarks. ICASSP, pp. 5996-6000, 2019

Liming Wang, & Mark A. Hasegawa-Johnson. Multimodal Word Discovery and Retrieval with Phone Sequence and Image Concepts. Proc. Interspeech, pp. 2683-2687, 2019

Mark Hasegawa-Johnson, Najim Dehak, & Odette Scharenborg. Position Paper: Indirect Supervision for Dialog Systems in Unwritten Languages. International Workshop on Spoken Dialog Systems, 2019

Laureano Moro-Velazquez, JaeJin Cho, Shinji Watanabe, Mark A. Hasegawa-Johnson, Odette Scharenborg, Heejin Kim, & Najim Dehak. Study of the Performance of Automatic Speech Recognition Systems in Speakers with Parkinson’s Disease. Proc. Interspeech 2019, pp. 3875-3879, 2019

Kaizhi Qian, Yang Zhang, Shiyu Chang, Xuesong Yang, & Mark Hasegawa-Johnson. AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss. Proceedings of Machine Learning Research, vol. 97, pp. 5210-5219, 2019

Daniel Morrow, Renato Azevedo, Leitão Ferreira, Rocio Garcia-Retamero, Mark Hasegawa-Johnson, Thomas Huang, William Schuh, Kuangxiao Gu, & Yang Zhang. Contextualizing numeric clinical test results for gist comprehension: Implications for EHR patient portals. Journal of Experimental Psychology: Applied, vol. 25, no. 1, pp. 41-61, 2019

Renato F.L. Azevedo, Dan Morrow, Kuangxiao Gu, Thomas Huang, Mark Hasegawa-Johnson, P. Soni, S. Tang, Tarek Sakakini, Suma Bhat, Ann Willemsen-Dunlap, & James Graumlich. The Influence of Computer Agent Characteristics on User Preferences in Health Contexts. Proceedings of the 2019 Human Factors and Ergonomics Society Health Care Symposium, 2019

Van Hai Do, Nancy F. Chen, Boon Pang Lim, & Mark Hasegawa-Johnson. Multitask Learning for Phone Recognition of Underresourced Languages Using Mismatched Transcription. IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), vol. 26, no. 3, pp. 501-514, Mar, 2018

Yijia Xu, Mark Hasegawa-Johnson, & Nancy L. McElwain. Infant emotional outbursts detection in infant-parent spoken interactions. Proc. Interspeech, pp. 242-246, 2018

Mark Hasegawa-Johnson. Unwritten Languages as a Test Case for the Theory of Phonetic Universals. Plenary talk delivered at the International Symposium on Chinese Spoken Language Processing, 2018

Raymond A. Yeh, Teck Yian Lim, Chen Chen, Alexander G. Schwing, Mark Hasegawa-Johnson, & Minh N. Do. Image Restoration with Deep Generative Models. Proc. IEEE ICASSP, pp. 6772-6772, 2018

Jialu Li, & Mark Hasegawa-Johnson. A Comparable Phone Set for the TIMIT Dataset Discovered in Clustering of Listen, Attend and Spell. NeurIPS Workshop on Interpretability and Robustness in Audio, Speech, and Language, 2018

Odette Scharenborg, Sebastian Tiesmeyer, Mark Hasegawa-Johnson, & Najim Dehak. Visualizing Phoneme Category Adaptation in Deep Neural Networks. Proc. Interspeech, pp. 1482-1486, 2018

Leda Sari, & Mark Hasegawa-Johnson. Speaker Adaptation with an Auxiliary Network. MLSLP (ISCA Workshop on Machine Learning for Speech and Language Processing), 2018

Di He, Boon Pang Lim, Xuesong Yang, Mark Hasegawa-Johnson, & Deming Chen. Improved ASR for under-resourced languages through Multi-task Learning with Acoustic Landmarks. Proc. Interspeech, pp. 2618-2622, 2018

Amit Das. Speech Recognition with Probabilistic Transcriptions and End-to-End Systems Using Deep Learning. Master’s Thesis, University of Illinois, 2018

Amit Das, & Mark Hasegawa-Johnson. Improving DNNs Trained With Non-Native Transcriptions Using Knowledge Distillation and Target Interpolation. Proc. Interspeech, pp. 2434-2438, 2018

Lucas Ondel, Pierre Godard, Laurent Besacier, Elin Larsen, Mark Hasegawa-Johnson, Odette Scharenborg, Emmanuel Dupoux, Lukas Burget, François Yvon, & Sanjeev Khudanpur. Bayesian Models for Unit Discovery on a Very Low Resource Language. Proc. ICASSP, pp. 5939-5943, 2018

Wenda Chen, Mark Hasegawa-Johnson, & Nancy Chen. Recognizing Zero-resourced Languages based on Mismatched Machine Transcriptions. Proc. ICASSP, pp. 5979-5983, 2018

Xuesong Yang, Kartik Audhkhasi, Andrew Rosenberg, Samuel Thomas, Bhuvana Ramabhadran, & Mark Hasegawa-Johnson. Joint Modeling of Accents and Acoustics for Multi-Accent Speech Recognition. Proc. ICASSP, pp. 5989-5993, 2018

Di He, Boon Pang Lim, Xuesong Yang, Mark Hasegawa-Johnson, & Deming Chen. Acoustic landmarks contain more information about the phone string than other frames for automatic speech recognition with deep neural network acoustic model. Journal of the Acoustical Society of America, vol. 143, no. 6, pp. 3207-3219, 2018

Leda Sari, Mark Hasegawa-Johnson, S. Kumaran, Georg Stemmer, & N. Nair Krishnakumar. Speaker Adaptive Audio-Visual Fusion for the Open-Vocabulary Section of AVICAR. Proc. Interspeech, pp. 3524-3528, 2018

Odette Scharenborg, Patrick Ebel, Francesco Ciannella, Mark Hasegawa-Johnson, & Najim Dehak. Building an ASR System for Mboshi Using a Cross-language Definition of Acoustic Units Approach. Proc. SLTU (Speech and Language Technology for Under-resourced languages), pp. 167-171, 2018

Odette Scharenborg, Laurent Besacier, Alan Black, Mark Hasegawa-Johnson, Florian Metze, Graham Neubig, Sebastian Stüker, Pierre Godard, Markus Müller, Lucas Ondel, Shruti Palaskar, Philip Arthur, Francesco Ciannella, Mingxing Du, Elin Larsen, Danny Merkx, Rachid Riad, Liming Wang, & Emmanuel Dupoux. Linguistic Unit Discovery from Multi-Modal Inputs in Unwritten Languages: Summary of the Speaking Rosetta JSALT 2017 Workshop. Proc. ICASSP, 2018

Wenda Chen, Mark Hasegawa-Johnson, & Nancy F.Y. Chen. Topic and Keyword Identification for Low-resourced Speech Using Cross-Language Transfer Learning. Proc. Interspeech, pp. 2047-2051, 2018

Mark Hasegawa-Johnson, Alan Black, Lucas Ondel, Odette Scharenborg, & Francesco Ciannella. Image2speech: Automatically generating audio descriptions of images. Journal of the International Science and General Applications (ISGA), vol. 1, no. 1, 2018

Teck Yian Lim, Raymond Yeh, Yijia Xu, Minh Do, & Mark Hasegawa-Johnson. Time-Frequency Networks for Audio Super-Resolution. Proc. ICASSP, 2018

Kaizhi Qian, Yang Zhang, Shiyu Chang, Xuesong Yang, Dinei Florencio, & Mark Hasegawa-Johnson. Deep Learning Based Speech Beamforming. Proc. ICASSP, pp. 5389-5393, 2018

Renato F. L. Azevedo, Dan Morrow, James Graumlich, Ann Willemsen-Dunlap, Mark Hasegawa-Johnson, Thomas S. Huang, Kuangxiao Gu, Suma Bhat, Tarek Sakakini, Victor Sadauskas, & Donald J. Halpin. Using conversational agents to explain medication instructions to older adults. AMIA Annu Symp Proc., pp. 185–194, 2018

Renato Azevedo, Daniel G. Morrow, Kuangxiao Gu, Thomas Huang, Mark Allan Hasegawa-Johnson, James Graumlich, Victor Sadauskas, Tarek J. Sakakini, Suma Pallathadka Bhat, Ann M. Willemsen-Dunlap, & Donald J. Halpin. Computer Agents and Patient Memory for Medication Information. APA Annual Meeting, 2018

Xiang Kong, Xuesong Yang, Jeung-Yoon Choi, Mark Hasegawa-Johnson, & Stefanie Shattuck-Hufnagel. Landmark-based consonant voicing detection on multilingual corpora. Acoustics 17, Boston, Jun, 2017

Di He, Boon Pang Lim, Xuesong Yang, Mark Hasegawa-Johnson, & Deming Chen. Selecting frames for automatic speech recognition based on acoustic landmarks. Acoustics 17, Boston, Jun, 2017

Daniel Morrow, Mark Hasegawa-Johnson, Thomas Huang, William Schuh, Renato Azevedo, Kuangxiao Gu, Yang Zhang, Bidisha Roy, & Rocio Garcia-Retamero. A Multidisciplinary Approach to Designing and Evaluating Electronic Medical Record Portal Messages that Support Patient Self-Care. Journal of Biomedical Informatics, vol. 69, pp. 63-74, May, 2017

Di He, Zuofu Cheng, Mark Hasegawa-Johnson, & Deming Chen. Using Approximated Auditory Roughness as a Pre-filtering Feature for Human Screaming and Affective Speech AED. Proc. Interspeech, pp. 1914-1918, 2017

Mary Pietrowicz. Exposing the Hidden Vocal Channel: Analysis of Vocal Expression. Master’s Thesis, University of Illinois, 2017

Mary Pietrowicz, Mark Hasegawa-Johnson, & Karrie Karahalios. Discovering Dimensions of Perceived Vocal Expression in Semi-Structured, Unscripted Oral History Accounts. Proc. ICASSP, pp. 2901:1-4, 2017

Roger Serwy. Hilbert Phase Methods for Glottal Activity Detection. Master’s Thesis, University of Illinois, 2017

Mark Hasegawa-Johnson, Preethi Jyothi, Wenda Chen, & Van Hai Do. Mismatched Crowdsourcing: Mining Latent Skills to Acquire Speech Transcriptions. Proceedings of Asilomar, 2017

Shiyu Chang, Yang Zhang, Wei Han, Mo Yu, Xiaoxiao Guo, Wei Tan, Xiaodong Cui, Michael Witbrock, Mark Hasegawa-Johnson, & Thomas Huang. Dilated Recurrent Neural Networks. NIPS, 2017

Shiyu Chang, Yang Zhang, Jiliang Tang, Dawei Yin, Yi Chang, Mark Hasegawa-Johnson, & Thomas Huang. Streaming Recommender Systems. WWW 2017, pp. 381-389, 2017

Raymond Yeh, Chen Chen, Teck Yian Lim, Alexander G. Schwing, Mark Hasegawa-Johnson, & Minh N. Do. Semantic Image Inpainting with Deep Generative Networks. CVPR, pp. 5485-5493, 2017

Van Hai Do, Nancy F. Chen, Boon Pang Lim, & Mark Hasegawa-Johnson. Multi-Task Learning Using Mismatched Transcription for Under-Resourced Speech Recognition. Proc. Interspeech 2017, pp. 734-738, 2017

Odette Scharenborg, Francesco Ciannella, Shruti Palaskar, Alan Black, Florian Metze, Lucas Ondel, & Mark Hasegawa-Johnson. Building an ASR System for a Low-Resource Language Through the Adaptation of a High-Resource Language ASR System: Preliminary Results. Proc. Internat. Conference on Natural Language, Signal and Speech Processing (ICNLSSP), Casablanca, Morocco, 2017

Wenda Chen, Mark Hasegawa-Johnson, Nancy F. Chen, & Boon Pang Lim. Mismatched Crowdsourcing from Multiple Annotator Languages For Recognizing Zero-resourced Languages: A Nullspace Clustering Approach. Proc. Interspeech, pp. 2789-2793, 2017

Pavlos Papadopoulos, Ruchir Travadi, Colin Vaz, Nikolaos Malandrakis, Ulf Hermjakob, Nima Pourdamghani, Michael Pust, Boliang Zhang, Xiaoman Pan, Di Lu, Ying Lin, Ondrej Glembek, Murali Karthick B, Martin Karafiat, Lukas Burget, Mark Hasegawa-Johnson, Heng Ji, Jonathan May, Kevin Knight, & Shrikanth Narayanan. Team ELISA System for DARPA LORELEI Speech Evaluation 2016. Proc. Interspeech, pp. 2053-2057, 2017

Amit Das, Mark Hasegawa-Johnson, & Karel Vesely. Deep Autoencoder Based Multi-task Learning Using Probabilistic Transcription. Proc. Interspeech, pp. 2073-2077, 2017

Yang Zhang. Generative Models for Speech and Time Domain Signals. Master’s Thesis, University of Illinois, 2017

Preethi Jyothi, & Mark Hasegawa-Johnson. Low-Resource Grapheme-to-Phoneme Conversion using Recurrent Neural Networks. Proc. ICASSP, pp. 5030-5034, 2017

Mark Hasegawa-Johnson, Alan Black, Lucas Ondel, Odette Scharenborg, & Francesco Ciannella. Image2speech: Automatically generating audio descriptions of images. Proc. Internat. Conference on Natural Language, Signal and Speech Processing (ICNLSSP), Casablanca, Morocco, 2017

Mark Hasegawa-Johnson, Preethi Jyothi, Daniel McCloy, Majid Mirbagheri, Giovanni di Liberto, Amit Das, Bradley Ekin, Chunxi Liu, Vimal Manohar, Hao Tang, Edmund C. Lalor, Nancy Chen, Paul Hager, Tyler Kekona, Rose Sloan, & Adrian KC Lee. ASR for Under-Resourced Languages from Probabilistic Transcription. IEEE/ACM Trans. Audio, Speech and Language, vol. 25, no. 1, pp. 46-59, 2017

Yang Zhang, Dinei Florêncio, & Mark Hasegawa-Johnson. Glottal Model Based Speech Beamforming for ad-hoc Microphone Arrays. Proc. Interspeech 2017, pp. 2675-2679, 2017

Kaizhi Qian, Yang Zhang, Shiyu Chang, Xuesong Yang, Dinei Florencio, & Mark Hasegawa-Johnson. Speech Enhancement Using Bayesian Wavenet. Proc. Interspeech, pp. 2013-2017, 2017

Van Hai Do, Nancy F. Chen, Boon Pang Lim, & Mark Hasegawa-Johnson. Speech recognition of under-resourced languages using mismatched transcriptions. International Conference on Asian Language Processing IALP, Tainan, Taiwan, Nov, 2016

Van Hai Do, Nancy F. Chen, Boon Pang Lim, & Mark Hasegawa-Johnson. A many-to-one phone mapping approach for cross-lingual speech recognition. 12th IEEE-RIVF International Conference on Computing and Communication Technologies, Hanoi, Vietnam, pp. 120-124, Nov, 2016

Daniel Morrow, Mark Hasegawa-Johnson, Thomas Huang, William Schuh, Rocio Garcia-Retamero, Renato Azevedo, Kuangxiao Gu, Yang Zhang, & Bidisha Roy. Multimedia formats can improve older adult comprehension of clinical test results: Implications for Designing Patient Portals. 28th APS Annual Convention (Association for Psychological Science), May, 2016

Mark Hasegawa-Johnson. Speech Production, Speech Perception, and Phonology. Lecture given at the Winter School on Speech and Audio Processing, Chennai, India, Jan, 2016

Mark Hasegawa-Johnson. Prosody. Lecture given at the Winter School on Speech and Audio Processing, Chennai, India, Jan, 2016

Mark Hasegawa-Johnson. Multivariate-State Models for Speech Recognition. Lecture given at the Winter School on Speech and Audio Processing, Chennai, India, Jan, 2016

Mark Hasegawa-Johnson. Limited Data Settings. Lecture given at the Winter School on Speech and Audio Processing, Chennai, India, Jan, 2016

Xuesong Yang, Xiang Kong, Mark Hasegawa-Johnson, & Yanlu Xie. Landmark-based Pronunciation Error Identification on L2 Mandarin Chinese. Speech Prosody, pp. 247-251, 2016

Karen Livescu, Frank Rudzicz, Eric Fosler-Lussier, Mark Hasegawa-Johnson, & Jeff Bilmes. Speech Production in Speech Technologies: Introduction to the CSL Special Issue. Computer Speech and Language, vol. 36, pp. 165-172, 2016

Wenda Chen, Mark Hasegawa-Johnson, & Nancy F. Chen. Mismatched Crowdsourcing based Language Perception for Under-resourced Languages. Procedia Computer Science, vol. 81, pp. 23-29, 2016

Yanlu Xie, Mark Hasegawa-Johnson, Leyuan Qu, & Jinsong Zhang. Landmark of Mandarin Nasal Codas and its Application in Pronunciation Error Detection. Proc. ICASSP, 2016

Yang Zhang, Gautham Mysore, Florian Berthouzoz, & Mark Hasegawa-Johnson. Analysis of Prosody Increment Induced by Pitch Accents for Automatic Emphasis Correction. Speech Prosody, pp. 79-83, 2016

Van Hai Do, Nancy F. Chen, Boon Pang Lim, & Mark Hasegawa-Johnson. Analysis of Mismatched Transcriptions Generated by Humans and Machines for Under-Resourced Languages. Proc. Interspeech, pp. 3863-3867, 2016

Xiang Kong, Preethi Jyothi, & Mark Hasegawa-Johnson. Performance Improvement of Probabilistic Transcriptions with Language-specific Constraints. Procedia Computer Science, vol. 81, pp. 30-36, 2016

Lav Varshney, Preethi Jyothi, & Mark Hasegawa-Johnson. Language Coverage for Mismatched Crowdsourcing. Workshop on Information Theory and Applications, 2016

Shiyu Chang, Yang Zhang, Jiliang Tang, Dawei Yin, Yi Chang, Mark Hasegawa-Johnson, & Thomas Huang. Positive-Unlabeled Learning in Streaming Networks. KDD, pp. 755-764, 2016

Raymond Yeh, Mark Hasegawa-Johnson, & Minh Do. Stable and Symmetric Filter Convolutional Neural Network. Proc. ICASSP, pp. 2652-2656, 2016

Amit Das, & Mark Hasegawa-Johnson. An investigation on training deep neural networks using probabilistic transcription. Proc. Interspeech, pp. 3858-3862, 2016

Amit Das, Preethi Jyothi, & Mark Hasegawa-Johnson. Automatic speech recognition using probabilistic transcriptions in Swahili, Amharic and Dinka. Proc. Interspeech, pp. 3524-3528, 2016

Chunxi Liu, Preethi Jyothi, Hao Tang, Vimal Manohar, Rose Sloan, Tyler Kekona, Mark Hasegawa-Johnson, & Sanjeev Khudanpur. Adapting ASR for Under-Resourced Languages Using Mismatched Transcriptions. Proc. ICASSP, pp. 5840-5844, 2016

Kaizhi Qian, Yang Zhang, & Mark Hasegawa-Johnson. Application of Local Binary Patterns for SVM based Stop Consonant Detection. Speech Prosody, pp. 1114-1118, 2016

Wenda Chen, Mark Hasegawa-Johnson, Nancy Chen, Preethi Jyothi, & Lav Varshney. Clustering-based Phonetic Projection in Mismatched Crowdsourcing Channels for Low-resourced ASR. WSSANLP (Workshop on South and Southeast Asian Natural Language Processing), pp. 133-141, 2016

Ruobai Wang, Yang Zhang, Zhijian Ou, & Mark Hasegawa-Johnson. Use of Particle Filtering and MCMC for Inference in Probabilistic Acoustic Tube Model. IEEE Workshop on Statistical Signal Processing, 2016

Tom Le Paine, Pooya Khorrami, Shiyu Chang, Yang Zhang, Prajit Ramachandran, Mark A. Hasegawa-Johnson, & Thomas S. Huang. Fast Wavenet Generation Algorithm. arXiv preprint arXiv:1611.09482, 2016

Mary Pietrowicz, Mark Hasegawa-Johnson, & Karrie Karahalios. Acoustic Correlates for Perceived Effort Levels in Expressive Speech. Proc. Interspeech, pp. 3720-3724, 2015

Yang Zhang, Nasser Nasrabadi, & Mark Hasegawa-Johnson. Multichannel Transient Acoustic Signal Classification Using Task-Driven Dictionary with Joint Sparsity and Beamforming. Proc. ICASSP, pp. 2591:1-5, 2015

Mahmoud Abunasser. Computational Measures of Linguistic Variation: A Study of Arabic Varieties. Master’s Thesis, University of Illinois, 2015

Amit Das, & Mark Hasegawa-Johnson. Cross-lingual transfer learning during supervised training in low resource scenarios. Proc. Interspeech, pp. 3531-3535, 2015

Preethi Jyothi, & Mark Hasegawa-Johnson. Transcribing continuous speech using mismatched crowdsourcing. Proc. Interspeech 2015, pp. 2774-2778, 2015

Preethi Jyothi, & Mark Hasegawa-Johnson. Improving Hindi Broadcast ASR by Adapting the Language Model and Pronunciation Model Using A Priori Syntactic and Morphophonemic Knowledge. Proc. Interspeech, pp. 3164-3168, 2015

Lee Estelle, Lim Zhi Yi Vanessa, Ang Hui Shan, & Lim Boon Pang. Singapore Hokkien Speech Recognition and Applications. A*STAR research symposium, 2015

Mark Hasegawa-Johnson, Ed Lalor, KC Lee, Preethi Jyothi, Majid Mirbagheri, Amit Das, Giovanni Di Liberto, Brad Ekin, Chunxi Liu, Vimal Manohar, Hao Tang, Paul Hager, Tyler Kekona, & Rose Sloan. Probabilistic Transcription. WS15 Group Final Presentation, 2015

Mark Hasegawa-Johnson, Jennifer Cole, Preethi Jyothi, & Lav Varshney. Models of Dataset Size, Question Design, and Cross-Language Speech Perception for Speech Crowdsourcing Applications. Journal of Laboratory Phonology, vol. 6, no. 3-4, pp. 381-431, 2015

Preethi Jyothi, & Mark Hasegawa-Johnson. Acquiring Speech Transcriptions Using Mismatched Crowdsourcing. Proc. AAAI, pp. 1263-1269, 2015

Jia-Chen Ren, Lawrence Angrave, & Mark Hasegawa-Johnson. ClassTranscribe: A New Tool with New Educational Opportunities for Student Crowdsourced College Lecture Transcriptions. SLaTE (the Workshop on Speech and Language Technology in Education), 2015

Jia-Chen Ren, Mark Hasegawa-Johnson, & Lawrence Angrave. ClassTranscribe. ICER Conference, 2015

Yang Zhang, Zhijian Ou, & Mark Hasegawa-Johnson. Incorporating AM-FM effect in voiced speech for probabilistic acoustic tube model. Proc. WASPAA, 2015

Po-Sen Huang, Minje Kim, Mark Hasegawa-Johnson, & Paris Smaragdis. Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation. IEEE Trans. Audio, Speech and Language Processing, vol. 23, no. 12, pp. 2136-2147, 2015

Renato F. L. Azevedo, Daniel Morrow, Mark Hasegawa-Johnson, Kuangxiao Gu, Dan Soberal, Thomas Huang, William Schuh, & Rocio Garcia-Retamero. Improving Patient Comprehension of Numeric Health Information. Human Factors Conference, 2015

Mark Hasegawa-Johnson. Probabilistic Segmental Model For Doppler Ultrasound Heart Rate Monitoring. United States Patent Number 8727991, May, 2014

Kaizhi Qian. Regularized Estimation of Gaussian Mixture Models for SVM Based Speaker Recognition. B.S. Thesis, University of Illinois, 2014

Austin Chen, & Mark Hasegawa-Johnson. Mixed Stereo Audio Classification Using a Stereo-Input Mixed-to-Panned Level Feature. IEEE Trans. Speech and Audio Processing, vol. 22, no. 12, pp. 2025-2033, 2014

Austin Chen. Automatic Classification of Electronic Music and Speech/Music Audio Content. Master’s Thesis, University of Illinois, 2014

Preethi Jyothi, Jennifer Cole, Mark Hasegawa-Johnson, & Vandana Puri. An Investigation of Prosody in Hindi Narrative Speech. Proc. Speech Prosody, pp. 623-627, 2014

Sujeeth Bharadwaj, & Mark Hasegawa-Johnson. A PAC-Bayesian Approach to Minimum Perplexity Language Modeling. Proceedings of CoLing, pp. 130-140, 2014

Kai-Hsiang Lin, Pooya Khorrami, Jiangping Wang, Mark Hasegawa-Johnson, & Thomas S. Huang. Foreground Object Detection in Highly Dynamic Scenes Using Saliency. Proceedings of ICIP, pp. 1125-1129, 2014

Zhaowen Wang, Zhangyang Wang, Mark Moll, Po-Sen Huang, Devin Grady, Nasser Nasrabadi, Thomas Huang, Lydia Kavraki, & Mark Hasegawa-Johnson. Active Planning, Sensing and Recognition Using a Resource-Constrained Discriminant POMDP. CVPR Multi-Sensor Fusion Workshop, pp. 740-747, 2014

Xiayu Chen, Yang Zhang, & Mark Hasegawa-Johnson. An iterative approach to decision tree training for context dependent speech synthesis. Proc. Interspeech, pp. 2327-2331, 2014

Alina Khasanova, Jennifer Cole, & Mark Hasegawa-Johnson. Detecting articulatory compensation in acoustic data through linear regression modeling. Proc. Interspeech 2014, pp. 925-929, 2014

Mohamed Elmahdy, Mark Hasegawa-Johnson, & Eiman Mustafawi. Automatic Long Audio Alignment and Confidence Scoring for Conversational Arabic Speech. Proc. LREC 2014, Reykjavik, Iceland, 2014

Mohamed Elmahdy, Mark Hasegawa-Johnson, & Eiman Mustafawi. Development of a TV Broadcasts Speech Recognition System for Qatari Arabic. Proc. LREC 2014, Reykjavik, Iceland, pp. 3057-3061, 2014

Raymond Yeh. Divergence Guided Two Beams Viterbi Algorithm on Factorial HMMs. B.S. Thesis, University of Illinois, 2014

Yang Zhang, Zhijian Ou, & Mark Hasegawa-Johnson. Improvement of Probabilistic Acoustic Tube Model for Speech Decomposition. ICASSP, 2014

Po-Sen Huang, Minje Kim, Paris Smaragdis, & Mark Hasegawa-Johnson. Deep Learning for Monaural Speech Separation. ICASSP, 2014

Po-Sen Huang, Minje Kim, Mark Hasegawa-Johnson, & Paris Smaragdis. Singing-Voice Separation From Monaural Recordings Using Deep Recurrent Neural Networks. Proceedings of ISMIR, 2014

Harsh Vardhan Sharma, & Mark Hasegawa-Johnson. Acoustic Model Adaptation using in-domain Background Models for Dysarthric Speech Recognition. Computer Speech and Language, vol. 27, no. 6, pp. 1147–1162, Sep, 2013

Elabbas Benmamoun, & Mark Hasegawa-Johnson. How Different are Arabic Dialects from Each Other and from Classical Arabic. 6th Annual Arabic Linguistics Symposium, Ifrane, Morocco, Jun, 2013

Robert Mertens, Po-Sen Huang, Luke Gottlieb, Gerald Friedland, Ajay Divakaran, & Mark Hasegawa-Johnson. On the Application of Speaker Diarization to Audio Indexing of Non-Speech and Mixed Non-Speech/Speech Video Soundtracks. International Journal of Multimedia Data Engineering and Management (IJDEM), vol. 3, no. 3, pp. 1-19, Apr, 2013

Kai-Hsiang Lin, Xiaodan Zhuang, Camille Goudeseune, Sarah King, Mark Hasegawa-Johnson, & Thomas S. Huang. Saliency-Maximized Audio Visualization and Efficient Audio-Visual Browsing for Faster-than-Real-Time Human Acoustic Event Detection. ACM Transactions on Applied Perception, 2013

Galen Andrew, Raman Arora, Sujeeth Bharadwaj, Jeff Bilmes, Mark Hasegawa-Johnson, & Karen Livescu. Using articulatory measurements to learn better acoustic features. Proc. Workshop on Speech Production in Automatic Speech Recognition, Lyon, France, 2013

Amit Juneja, & Mark Hasegawa-Johnson. Experiments on context-awareness and phone error propagation in human and machine speech recognition. Proc. Workshop on Speech Production in Automatic Speech Recognition, Lyon, France, 2013

Kyungtae Kim, Kai-Hsiang Lin, Dirk B. Walther, Mark A. Hasegawa-Johnson, & Thomas S. Huang. Automatic Detection of Auditory Salience with Optimized Linear Filters Derived from Human Annotation. Pattern Recognition Letters, vol. 38, no. 1, pp. 78-85, 2013

Mohamed Elmahdy, Mark Hasegawa-Johnson, & Eiman Mustafawi. A Transfer Learning Approach for Under-Resourced Arabic Dialects Speech Recognition. Workshop on Less Resourced Languages, new technologies, new challenges and opportunities (LTC 2013), pp. 60-64, 2013

Mohamed Elmahdy, Mark Hasegawa-Johnson, & Eiman Mustafawi. Automatic Long Audio Alignment for Conversational Arabic Speech. Qatar Foundation Annual Research Conference, 2013

Mohamed Elmahdy, Mark Hasegawa-Johnson, & Eiman Mustafawi. Development of a Spontaneous Large Vocabulary Speech Recognition System for Qatari Arabic. Qatar Foundation Annual Research Conference, 2013

Mohamed Elmahdy, Mark Hasegawa-Johnson, & Eiman Mustafawi. A Framework for Conversational Arabic Speech Long Audio Alignment. Proc. 6th Language and Technology Conference (LTC 2013), pp. 290-293, 2013

Sujeeth Bharadwaj, Mark Hasegawa-Johnson, Jitendra Ajmera, Om Deshmukh, & Ashish Verma. Sparse Hidden Markov Models for Purer Clusters. Proc. ICASSP, 2013

Sarah King, & Mark Hasegawa-Johnson. Accurate Speech Segmentation by Mimicking Human Auditory Processing. Proc. ICASSP, 2013

Po-Sen Huang, Li Deng, Mark Hasegawa-Johnson, & Xiaodong He. Random Features for Kernel Deep Convex Network. Proc. ICASSP, pp. 8096-8900, 2013

Mahmoud Abunasser, Abbas Benmamoun, & Mark Hasegawa-Johnson. Pronunciation Variation Metric for Four Dialects of Arabic. AIDA 10 (Association Internationale de Dialectologie Arabe), Qatar University, 2013

Panying Rong, Torrey Loucks, Heejin Kim, & Mark Hasegawa-Johnson. Relationship between kinematics, F2 slope and speech intelligibility in dysarthria due to cerebral palsy. Clinical Linguistics & Phonetics, vol. 26, no. 9, pp. 806-822, Sep, 2012

Mark Hasegawa-Johnson, Xiaodan Zhuang, Xi Zhou, Camille Goudeseune, Hao Tang, Kai-Hsiang Lin, Mohamed Omar, & Thomas Huang. Toward Better Real-world Acoustic Event Detection. Unpublished presentation given at Seoul National University, May, 2012

Shobhit Mathur, Marshall Scott Poole, Feniosky Peña-Mora, Mark Hasegawa-Johnson, & Noshir Contractor. Detecting interaction links in a collaborating group using manually annotated data. Social Networks, 2012

Hao Tang, Stephen Chu, Mark Hasegawa-Johnson, & Thomas Huang. Partially Supervised Speaker Clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 5, pp. 959-971, 2012

Ali Sakr, & Mark Hasegawa-Johnson. Topic Modeling of Phonetic Latin-Spelled Arabic for the Relative Analysis of Genre-Dependent and Dialect-Dependent Variation. CITALA, pp. 153-158, 2012

Po-Sen Huang, Mark Hasegawa-Johnson, Wotao Yin, & Tom Huang. Opportunistic Sensing: Unattended Acoustic Sensor Selection Using Crowdsourcing Models. IEEE Workshop on Machine Learning in Signal Processing, 2012

Po-Sen Huang, Jianchao Yang, Mark Hasegawa-Johnson, Feng Liang, & Thomas S. Huang. Pooling Robust Shift-Invariant Sparse Representations of Acoustic Signals. Proc. Interspeech, pp. 2518-2521, 2012

Po-Sen Huang, Robert Mertens, Ajay Divakaran, Gerald Friedland, & Mark Hasegawa-Johnson. How to Put it into Words—Using Random Forests to Extract Symbol Level Descriptions from Audio Content for Concept Detection. ICASSP, 2012

Camille Goudeseune. Effective browsing of long audio recordings. ACM International Workshop on Interactive Multimedia on Mobile and Portable Devices, 2012

Kai-Hsiang Lin, Xiaodan Zhuang, Camille Goudeseune, Sarah King, Mark Hasegawa-Johnson, & Thomas Huang. Improving Faster-than-Real-Time Human Acoustic Event Detection by Saliency-Maximized Audio Visualization. ICASSP, pp. 2277-2280, 2012

Hosung Nam, Vikramjit Mitra, Mark Tiede, Mark Hasegawa-Johnson, Carol Espy-Wilson, Elliot Saltzman, & Louis Goldstein. A procedure for estimating gestural scores from speech acoustics. J. Acoustical Society of America, vol. 132, no. 6, pp. 3980-3989, 2012

Tim Mahrt, Jennifer Cole, Margaret Fleck, & Mark Hasegawa-Johnson. Accounting for Speaker Variation in the Production of Prominence using the Bayesian Information Criterion. Speech Prosody, 2012

Tim Mahrt, Jennifer Cole, Margaret Fleck, & Mark Hasegawa-Johnson. F0 and the perception of prominence. Proc. Interspeech 2012, pp. 2422-2425, 2012

Mark Hasegawa-Johnson, Elabbas Benmamoun, Eiman Mustafawi, Mohamed Elmahdy, & Rehab Duwairi. On the definition of the word “segmental”. Proc. Speech Prosody 2012, pp. 159-162, 2012

Mohamed Elmahdy, Mark Hasegawa-Johnson, & Eiman Mustafawi. A Baseline Speech Recognition System for Levantine Colloquial Arabic. Proceedings of ESOLEC, 2012

Po-Sen Huang, & Mark Hasegawa-Johnson. Cross-Dialectal Data Transferring for Gaussian Mixture Model Training in Arabic Speech Recognition. International Conference on Arabic Language Processing CITALA, pp. 119-122, 2012

Jui-Ting Huang. Semi-Supervised Learning for Acoustic and Prosodic Modeling in Speech Applications. Master’s Thesis, University of Illinois, 2012

Sarah King, & Mark Hasegawa-Johnson. Detection of Acoustic-Phonetic Landmarks in Mismatched Conditions Using a Biomimetic Model of Human Auditory Processing. CoLing, pp. 589-598, 2012

Mohamed Elmahdy, Mark Hasegawa-Johnson, & Eiman Mustafawi. Hybrid Phonemic and Graphemic Modeling for Arabic Speech Recognition. International Journal of Computational Linguistics, vol. 3, no. 1, pp. 88-96, 2012

Mohamed Elmahdy, Mark Hasegawa-Johnson, & Eiman Mustafawi. Hybrid Pronunciation Modeling for Arabic Large Vocabulary Speech Recognition. Qatar Foundation Annual Research Forum, 2012

Sujeeth Bharadwaj, Raman Arora, Karen Livescu, & Mark Hasegawa-Johnson. Multi-View Acoustic Feature Learning Using Articulatory Measurements. IWSML (Internat. Worksh. on Statistical Machine Learning for Sign. Process.), 2012

Mark Hasegawa-Johnson, David Harwath, Harsh Vardhan Sharma, & Po-Sen Huang. Transfer Learning for Multi-Person and Multi-Dialect Spoken Language Interface. Presentation given at the 2012 Urbana Neuroengineering Conference, 2012

Harsh Vardhan Sharma. Acoustic Model Adaptation for Recognition of Dysarthric Speech. Master’s Thesis, University of Illinois, 2012

Po-Sen Huang, Scott Deeann Chen, Paris Smaragdis, & Mark Hasegawa-Johnson. Singing-Voice Separation from Monaural Recordings using Robust Principal Component Analysis. ICASSP, 2012

Xiaodan Zhuang. Modeling Audio and Visual Cues for Real-world Event Detection. Master’s Thesis, University of Illinois, Apr, 2011

Tim Mahrt, Jui-Ting Huang, Yoonsook Mo, Jennifer Cole, Mark Hasegawa-Johnson, & Margaret Fleck. Feature Sets for the Automatic Detection of Prosodic Prominence. New Tools and Methods for Very Large Scale Phonetics Research, University of Pennsylvania, Jan, 2011

Hosung Nam, Vikramjit Mitra, Mark Tiede, Mark Hasegawa-Johnson, Carol Espy-Wilson, Elliot Saltzman, & Louis Goldstein. Automatic gestural annotation of the U. Wisconsin X-ray Microbeam corpus. Workshop on New Tools and Methods for Very Large Scale Phonetics Research, University of Pennsylvania, Jan, 2011

Jui-Ting Huang, Mark Hasegawa-Johnson, & Jennifer Cole. How Unlabeled Data Change the Acoustic Models For Phonetic Classification. Workshop on New Tools and Methods for Very Large Scale Phonetics Research, University of Pennsylvania, Jan, 2011

Arthur Kantor, & Mark Hasegawa-Johnson. HMM-based Pronunciation Dictionary Generation. Workshop on New Tools and Methods for Very Large Scale Phonetics Research, University of Pennsylvania, Jan, 2011

Rania Al-Sabbagh, Roxana Girju, Mark Hasegawa-Johnson, Elabbas Benmamoun, Rehab Duwairi, & Eiman Mustafawi. Using Web Mining Techniques to Build a Multi-Dialect Lexicon of Arabic. Talk delivered at the Linguistics in the Gulf Conference, 2011

Robert Mertens, Po-Sen Huang, Luke Gottlieb, Gerald Friedland, & Ajay Divakaran. On the Application of Speaker Diarization to Audio Concept Detection for Multimedia Retrieval. IEEE International Symposium on Multimedia, pp. 446-451, 2011

Po-Sen Huang, Mark Hasegawa-Johnson, & Thyagaraju Damarla. Exemplar Selection Methods to Distinguish Human from Animal Footsteps. Second Annual Human and Light Vehicle Detection Workshop, Maryland, pp. 14:1-10, 2011

Po-Sen Huang, Thyagaraju Damarla, & Mark Hasegawa-Johnson. Multi-sensory features for Personnel Detection at Border Crossings. Fusion, 2011

Po-Sen Huang, Xiaodan Zhuang, & Mark Hasegawa-Johnson. Improving Acoustic Event Detection using Generalizable Visual Features and Multi-modality Modeling. ICASSP, pp. 349-352, 2011

Heejin Kim, Mark Hasegawa-Johnson, & Adrienne Perlman. Temporal and spectral characteristics of fricatives in dysarthria. Journal of the Acoustical Society of America, vol. 130, pp. 2446, 2011

Heejin Kim, Mark Hasegawa-Johnson, & Adrienne Perlman. Vowel Contrast and Speech Intelligibility in Dysarthria. Folia Phoniatrica et Logopaedica, vol. 63, no. 4, pp. 187-194, 2011

Jeremy Tidemann. Characterization of the Head-Related Transfer Function using Chirp and Maximum Length Sequence Excitation Signals. Master’s Thesis, University of Illinois, 2011

Tim Mahrt, Jui-Ting Huang, Yoonsook Mo, Margaret Fleck, Mark Hasegawa-Johnson, & Jennifer Cole. Optimal models of prosodic prominence using the Bayesian information criterion. Proc. Interspeech, pp. 2037-2040, 2011

İ. Yücel Özbek, Mark Hasegawa-Johnson, & Mübeccel Demirekler. On Improving Dynamic State Space Approaches to Articulatory Inversion with MAP based Parameter Estimation. IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 67-81, 2011

İ. Yücel Özbek, Mark Hasegawa-Johnson, & Mübeccel Demirekler. Estimation of Articulatory Trajectories Based on Gaussian Mixture Model (GMM) with Audio-Visual Information Fusion and Dynamic Kalman Smoothing. IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 5, pp. 1180-1195, 2011

Mohamed Elmahdy, Mark Hasegawa-Johnson, Eiman Mustafawi, Rehab Duwairi, & Wolfgang Minker. Challenges and Techniques for Dialectal Arabic Speech Recognition and Machine Translation. Qatar Foundation Annual Research Forum, pp. 244, 2011

Mark Hasegawa-Johnson, Jui-Ting Huang, Roxana Girju, Rehab Duwairi, Eiman Mustafawi, & Elabbas Benmamoun. Learning to Recognize Speech from a Small Number of Labeled Examples. Qatar Foundation Annual Research Forum, pp. 269, 2011

Mark Hasegawa-Johnson, Jui-Ting Huang, Sarah King, & Xi Zhou. Normalized recognition of speech and audio events. Journal of the Acoustical Society of America, vol. 130, pp. 2524, 2011

Mark Hasegawa-Johnson, Jui-Ting Huang, & Xiaodan Zhuang. Semi-supervised learning for speech and audio processing. Journal of the Acoustical Society of America, vol. 130, pp. 2408, 2011

Boon Pang Lim. Computational Differences between Whispered and Non-whispered Speech. Master’s Thesis, University of Illinois, 2011

Mark Hasegawa-Johnson, Camille Goudeseune, Jennifer Cole, Hank Kaczmarski, Heejin Kim, Sarah King, Timothy Mahrt, Jui-Ting Huang, Xiaodan Zhuang, Kai-Hsiang Lin, Harsh Vardhan Sharma, Zhen Li, & Thomas S. Huang. Multimodal Speech and Audio User Interfaces for K-12 Outreach. APSIPA, pp. 256:1-8, 2011

Xiaodan Zhuang, Xi Zhou, Mark A. Hasegawa-Johnson, & Thomas S. Huang. Real-world Acoustic Event Detection. Pattern Recognition Letters, vol. 31, no. 2, pp. 1543-1551, Sep, 2010

Lae-Hoon Kim. Statistical Model Based Multi-Microphone Speech Processing: Toward Overcoming Mismatch Problem. Master’s Thesis, University of Illinois, Aug, 2010

Xi Zhou, Xiaodan Zhuang, Hao Tang, Mark A. Hasegawa-Johnson, & Thomas S. Huang. Novel Gaussianized Vector Representation for Improved Natural Scene Categorization. Pattern Recognition Letters, vol. 31, no. 8, pp. 702-708, Jun, 2010

David Harwath, & Mark Hasegawa-Johnson. Phonetic Landmark Detection for Automatic Language Identification. Speech Prosody, pp. 100231:1-4, 2010

Suma Bhat, Mark Hasegawa-Johnson, & Richard Sproat. Automatic Fluency Assessment by Signal-Level Measurement of Spontaneous Speech. INTERSPEECH Satellite Workshop on Second Language Studies: Acquisition, Learning, Education and Technology, 2010

Su-Youn Yoon, Mark Hasegawa-Johnson, & Richard Sproat. Landmark-based Automated Pronunciation Error Detection. Proceedings of Interspeech, pp. 614-617, 2010

Suma Bhat, Richard Sproat, Mark Hasegawa-Johnson, & Fred Davidson. Automatic fluency assessment using thin-slices of spontaneous speech. Language Testing Research Colloquium, Denver, CO, 2010

Heejin Kim, Katie Martin, Mark Hasegawa-Johnson, & Adrienne Perlman. Frequency of consonant articulation errors in dysarthric speech. Clinical Linguistics & Phonetics, vol. 24, no. 10, pp. 759-770, 2010

Heejin Kim, Mark Hasegawa-Johnson, & Adrienne Perlman. Acoustic Cues to Lexical Stress in Spastic Dysarthria. Speech Prosody, pp. 100891:1-4, 2010

Heejin Kim, Panying Rong, Torrey M. Loucks, & Mark Hasegawa-Johnson. Kinematic Analysis of Tongue Movement Control in Spastic Dysarthria. Proceedings of Interspeech, pp. 2578-2581, 2010

Bryce E Lobdell, Jont B Allen, & Mark A Hasegawa-Johnson. Intelligibility predictors and neural representation of speech. Speech Communication, 2010

Yoonsook Mo, Jennifer Cole, & Mark Hasegawa-Johnson. Prosodic effects on temporal structure of monosyllabic CVC words in American English. Speech Prosody, pp. 100208:1-4, 2010

Hao Tang, Mark Hasegawa-Johnson, & Thomas S. Huang. Non-Frontal View Facial Expression Recognition. ICME, pp. 1202-1207, 2010

Jui-Ting Huang, Po-Sen Huang, Yoonsook Mo, Mark Hasegawa-Johnson, & Jennifer Cole. Prosody-Dependent Acoustic Modeling Using Variable-Parameter Hidden Markov Models. Speech Prosody, pp. 100623:1-4, 2010

Hao Tang, Mark Hasegawa-Johnson, & Thomas S. Huang. Toward Robust Learning of the Gaussian Mixture State Emission Densities for Hidden Markov Models. ICASSP, 2010

Arthur Kantor. Pronunciation modeling for large vocabulary speech recognition. Master’s Thesis, University of Illinois, 2010

Chi Hu. FSM-Based Pronunciation Modeling using Articulatory Phonological Code. Master’s Thesis, University of Illinois, 2010

Chi Hu, Xiaodan Zhuang, & Mark Hasegawa-Johnson. FSM-Based Pronunciation Modeling using Articulatory Phonological Code. Proceedings of Interspeech, pp. 2274-2277, 2010

Hosung Nam, Vikramjit Mitra, Mark Tiede, Elliot Saltzman, Louis Goldstein, Carol Espy-Wilson, & Mark Hasegawa-Johnson. A procedure for estimating gestural scores from natural speech. Proceedings of Interspeech, pp. 30-33, 2010

Jui-Ting Huang, & Mark Hasegawa-Johnson. Semi-Supervised Training of Gaussian Mixture Models by Conditional Entropy Minimization. Proceedings of Interspeech, pp. 1353-1356, 2010

Harsh Vardhan Sharma, & Mark Hasegawa-Johnson. State Transition Interpolation and MAP Adaptation for HMM-based Dysarthric Speech Recognition. HLT/NAACL Workshop on Speech and Language Processing for Assistive Technology (SLPAT), pp. 72-79, 2010

Xiaodan Zhuang, Lijuan Wang, Frank Soong, & Mark Hasegawa-Johnson. A Minimum Converted Trajectory Error (MCTE) Approach to High Quality Speech-to-Lips Conversion. Proceedings of Interspeech, pp. 1736-1739, 2010

Lae-Hoon Kim, & Mark Hasegawa-Johnson. Toward Overcoming Fundamental Limitation in Frequency-Domain Blind Source Separation for Reverberant Speech Mixtures. Proceedings of Asilomar, 2010

Lae-Hoon Kim, Kyung-Tae Kim, & Mark Hasegawa-Johnson. Robust Automatic Speech Recognition with Decoder Oriented Ideal Binary Mask Estimation. Proceedings of Interspeech, pp. 2066-2069, 2010

Lae-Hoon Kim, Kyungtae Kim, & Mark Hasegawa-Johnson. Speech enhancement beyond minimum mean squared error with perceptual noise shaping. J. Acoust. Soc. Am., vol. 127, no. 3, pp. 1817, 2010

Lae-Hoon Kim, Mark Hasegawa-Johnson, Gerasimos Potamianos, & Vit Libal. Joint Estimation of DOA and Speech Based on EM Beamforming. ICASSP, 2010

Su-Youn Yoon, Mark Hasegawa-Johnson, & Richard Sproat. Automated Pronunciation Scoring using Confidence Scoring and Landmark-based SVM. Proc. Interspeech, Brighton, pp. 1903-1906, Sep, 2009

I. Yücel Özbek, Mark Hasegawa-Johnson, & Mübeccel Demirekler. Formant Trajectories for Acoustic-to-Articulatory Inversion. Proc. Interspeech, Brighton, pp. 2807-2810, Sep, 2009

Yoonsook Mo, Jennifer Cole, & Mark Hasegawa-Johnson. Prosodic effects on vowel production: evidence from formant structure. Proc. Interspeech, Brighton, pp. 2535-2538, Sep, 2009

Xiaodan Zhuang, Hosung Nam, Mark Hasegawa-Johnson, Louis Goldstein, & Elliot Saltzman. Articulatory Phonological Code for Word Recognition. Proc. Interspeech, Brighton, pp. 2763-2766, Sep, 2009

Harsh Vardhan Sharma, Mark Hasegawa-Johnson, Jon Gunderson, & Adrienne Perlman. Universal Access: Speech Recognition for Talkers with Spastic Dysarthria. Proc. Interspeech, Brighton, pp. 1451-1454, Sep, 2009

Bowon Lee, & Mark Hasegawa-Johnson. A Phonemic Restoration Approach for Automatic Speech Recognition with Highly Nonstationary Background Noise. DSP in Cars workshop, Dallas, Jul, 2009

Thomas S. Huang, Mark A. Hasegawa-Johnson, Stephen M. Chu, Zhihong Zeng, & Hao Tang. Sensitive Talking Heads. IEEE Signal Processing Magazine, vol. 26, no. 4, pp. 67-72, Jul, 2009

Lae-Hoon Kim, & Mark Hasegawa-Johnson. Optimal Multi-Microphone Speech Enhancement in Cars. DSP in Cars workshop, Dallas, Jul, 2009

Hao Tang, Stephen M. Chu, Mark Hasegawa-Johnson, & Thomas S. Huang. Emotion Recognition from Speech via Boosted Gaussian Mixture Models. International Conference on Multimedia & Expo (ICME’09), pp. 294-297, 2009

Mark Hasegawa-Johnson, Camille Goudeseune, Kai-Hsiang Lin, David Cohen, Xi Zhou, Xiaodan Zhuang, Kyungtae Kim, Hank Kaczmarski, & Thomas Huang. Visual Analytics for Audio. NIPS Workshop on Visual Analytics, 2009

Mark Hasegawa-Johnson. Pattern Recognition in Acoustic Signal Processing. Unpublished presentation at the Machine Learning Summer School, University of Chicago, 2009

Mark Hasegawa-Johnson. Tutorial: Pattern Recognition in Signal Processing. J. Acoust. Soc. Am., vol. 125, pp. 2698, 2009

Mark Hasegawa-Johnson, Xiaodan Zhuang, Xi Zhou, Camille Goudeseune, & Thomas S. Huang. Adaptation of tandem HMMs for non-speech audio event detection. J. Acoust. Soc. Am., vol. 125, pp. 2730, 2009

Xiaodan Zhuang, Jing Huang, Gerasimos Potamianos, & Mark Hasegawa-Johnson. Acoustic Fall Detection using Gaussian Mixture Models and GMM Supervectors. ICASSP, pp. 69-72, 2009

Su-Youn Yoon, Mark Hasegawa-Johnson, & Richard Sproat. Automated Pronunciation Scoring for L2 English Learners. CALICO workshop, 2009

David Cohen, Camille Goudeseune, & Mark Hasegawa-Johnson. Efficient Simultaneous Multi-Scale Computation of FFTs. no. GT-FODAVA-09-01, Georgia Institute of Technology, 2009

Su-Youn Yoon, Lisa Pierce, Amanda Huensch, Eric Juul, Samantha Perkins, Richard Sproat, & Mark Hasegawa-Johnson. Construction of a rated speech corpus of L2 learners’ speech. CALICO Journal, 2009

Bryce Lobdell. Models of Human Phone Transcription in Noise Based on Intelligibility Predictors. Master’s Thesis, University of Illinois, 2009

Yoonsook Mo, Jennifer Cole, & Mark Hasegawa-Johnson. How do ordinary listeners perceive prosodic prominence? Syntagmatic vs. Paradigmatic comparison. J. Acoust. Soc. Am., vol. 125, no. 4, pp. 2572, 2009

Xiaodan Zhuang, Xi Zhou, Mark A. Hasegawa-Johnson, & Thomas S. Huang. Efficient Object Localization with Gaussianized Vector Representation. IMCE, pp. 89-96, 2009

Jui-Ting Huang, Xi Zhou, Mark Hasegawa-Johnson, & Thomas Huang. Kernel Metric Learning for Phonetic Classification. ASRU, pp. 141-145, 2009

Jui-Ting Huang, & Mark Hasegawa-Johnson. On semi-supervised learning of Gaussian mixture models for phonetic classification. NAACL HLT Workshop on Semi-Supervised Learning, pp. 75-83, 2009

Xiaodan Zhuang, Jui-Ting Huang, & Mark Hasegawa-Johnson. Speech Retrieval in Unknown Languages: a Pilot Study. NAACL HLT Cross-Lingual Information Access Workshop (CLIAWS), pp. 3-11, 2009

Jui-Ting Huang, & Mark Hasegawa-Johnson. Unsupervised Prosodic Break Detection in Mandarin Speech. Speech Prosody, pp. 165-168, 2008

Xiaodan Zhuang, & Mark Hasegawa-Johnson. Towards Interpretation of Creakiness in Switchboard. Speech Prosody, pp. 37-40, 2008

Taejin Yoon, Jennifer Cole, & Mark Hasegawa-Johnson. Detecting Non-Modal Phonation in Telephone Speech. Speech Prosody, pp. 33-36, 2008

Xiaodan Zhuang, Xi Zhou, Thomas S. Huang, & Mark Hasegawa-Johnson. Feature Analysis and Selection for Acoustic Event Detection. ICASSP, pp. 17-20, 2008

Xi Zhou, Xiaodan Zhuang, Ming Liu, Hao Tang, Mark Hasegawa-Johnson, & Thomas Huang. HMM-Based Acoustic Event Detection with AdaBoost Feature Selection. Lecture Notes in Computer Science, vol. 4625, pp. 345-353, 2008

Bryce Lobdell, Mark Hasegawa-Johnson, & Jont B. Allen. Human Speech Perception and Feature Extraction. Proc. Interspeech, pp. 1797-1800, 2008

Yoonsook Mo, Jennifer Cole, & Mark Hasegawa-Johnson. Frequency and repetition effects outweigh phonetic detail in prominence perception. LabPhon 11, pp. 29-30, 2008

Xiaodan Zhuang, Hosung Nam, Mark Hasegawa-Johnson, Louis Goldstein, & Elliot Saltzman. The Entropy of Articulatory Phonological Code: Recognizing Gestures from Tract Variables. Proc. Interspeech, pp. 1489-1492, 2008

Sarah Borys. Lovable Indestructible Grad Student of Chaos. Cartoons published online, 2008

Yang Li. Incremental Training and Growth of Artificial Neural Networks. Master’s Thesis, University of Illinois, 2008

Xiaodan Zhuang, Xi Zhou, Mark Hasegawa-Johnson, & Thomas Huang. Face Age Estimation Using Patch-based Hidden Markov Model Supervectors. ICPR, pp. 1-4, 2008

Xi Zhou, Xiaodan Zhuang, Hao Tang, Mark Hasegawa-Johnson, & Thomas Huang. A Novel Gaussianized Vector Representation for Natural Scene Categorization. ICPR, pp. 1-4, 2008

Xi Zhou, Xiaodan Zhuang, Shuicheng Yan, Shih-Fu Chang, Mark Hasegawa-Johnson, & Thomas S. Huang. SIFT-Bag Kernel for Video Event Analysis. ACM Multimedia, pp. 229-238, 2008

Shuicheng Yan, Xi Zhou, Ming Liu, Mark Hasegawa-Johnson, & Thomas S. Huang. Regression from Patch Kernel. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1-8, 2008

Jui-Ting Huang, & Mark Hasegawa-Johnson. Maximum Mutual Information Estimation with Unlabeled Data for Phonetic Classification. Proc. Interspeech, 2008

Arthur Kantor, & Mark Hasegawa-Johnson. Stream Weight Tuning in Dynamic Bayesian Networks. Proc. ICASSP, pp. 4525-4528, 2008

Sarah Borys. An SVM Front End Landmark Speech Recognition System. Master’s Thesis, University of Illinois, 2008

Mark Hasegawa-Johnson, Jennifer Cole, Ken Chen, Partha Lal, Amit Juneja, Taejin Yoon, Sarah Borys, & Xiaodan Zhuang. Prosodically Organized Automatic Speech Recognition. Language and Linguistics Monograph Series, vol. A25, Academia Sinica, Taiwan, pp. 101-128, 2008

Taejin Yoon, Xiaodan Zhuang, Jennifer Cole, & Mark Hasegawa-Johnson. Voice Quality Dependent Speech Recognition. Language and Linguistics Monograph Series, vol. A25, Academia Sinica, Taiwan, pp. 77-100, 2008

Harsh Vardhan Sharma. Universal Access: Experiments in Automatic Recognition of Dysarthric Speech. Master’s Thesis, University of Illinois, 2008

Heejin Kim, Mark Hasegawa-Johnson, Adrienne Perlman, Jon Gunderson, Thomas Huang, Kenneth Watkin, & Simone Frame. Dysarthric Speech Database for Universal Access Research. Proc. Interspeech, pp. 1741-1744, 2008

Hao Tang, Yun Fu, Jilin Tu, Mark Hasegawa-Johnson, & Thomas S. Huang. Humanoid Audio-Visual Avatar with Emotive Text-to-Speech Synthesis. IEEE Trans. Multimedia, vol. 10, no. 6, pp. 969-981, 2008

Hao Tang, Yuxiao Hu, Yun Fu, Mark Hasegawa-Johnson, & Thomas S. Huang. Real-time conversion from a single 2D face image to a 3D text-driven emotive audio-visual avatar. IEEE International Conference on Multimedia and Expo (ICME), pp. 1205-1208, 2008

Hao Tang, Xi Zhou, Matthias Odisio, Mark Hasegawa-Johnson, & Thomas Huang. Two-Stage Prosody Prediction for Emotional Text-to-Speech Synthesis. Proc. Interspeech, pp. 2138-2141, 2008

Hao Tang, Yun Fu, Jilin Tu, Thomas Huang, & Mark Hasegawa-Johnson. EAVA: A 3D Emotive Audio-Visual Avatar. IEEE Workshop on Applications of Computer Vision (IEEE WACV), pp. 1-6, 2008

Lae-Hoon Kim, Mark Hasegawa-Johnson, Jun-Seok Lim, & Koeng-Mo Sung. Acoustic model for robustness analysis of optimal multipoint room equalization. J. Acoust. Soc. Am., vol. 123, no. 4, pp. 2043-2053, 2008

Lae-Hoon Kim, & Mark Hasegawa-Johnson. Optimal Speech Estimator Considering Room Response as well as Additive Noise: Different Approaches in Low and High Frequency Range. ICASSP, pp. 4573-4576, 2008

Taejin Yoon, Jennifer Cole, & Mark Hasegawa-Johnson. On the edge: Acoustic cues to layered prosodic domains. Proc. International Congress on Phonetic Sciences (ICPhS), Saarbrücken, pp. 1264:1017-1020, Aug, 2007

Ming Liu, Xi Zhou, Mark Hasegawa-Johnson, Thomas S. Huang, & Zhengyou Zhang. Frequency Domain Correspondence for Speaker Normalization. Proc. Interspeech, Antwerp, pp. 274-277, Aug, 2007

Mark Hasegawa-Johnson, Karen Livescu, Partha Lal, & Kate Saenko. Audiovisual Speech Recognition with Articulator Positions as Hidden Variables. Proc. International Congress on Phonetic Sciences (ICPhS), Saarbrücken, pp. 1719:297-302, Aug, 2007

Mark Hasegawa-Johnson. Audio-Visual Speech Recognition: Audio Noise, Video Noise, and Pronunciation Variability. Talk given to the Signal Processing Society, IEEE Japan, Jun, 2007

Bowon Lee, & Mark Hasegawa-Johnson. Minimum Mean Squared Error A Posteriori Estimation of High Variance Vehicular Noise. 2007 Biennial on DSP for In-Vehicle and Mobile Systems, Istanbul, Jun, 2007

Karen Livescu, Ozgur Cetin, Mark Hasegawa-Johnson, Simon King, Chris Bartels, Nash Borges, Arthur Kantor, Partha Lal, Lisa Yung, Ari Bezman, Stephen Dawson-Haggerty, Bronwyn Woods, Joe Frankel, Matthew Magimai-Doss, & Kate Saenko. Articulatory Feature-Based Methods for Acoustic and Audio-Visual Speech Recognition: Summary from the 2006 JHU Summer Workshop. ICASSP, pp. 621-624, May, 2007

Taejin Yoon, Jennifer Cole, & Mark Hasegawa-Johnson. On the edge: Acoustic cues to layered prosodic domains. 81st Annual Meeting of the Linguistic Society of America, Anaheim, CA, Jan, 2007

Taejin Yoon. A Predictive Model of Prosody Through Grammatical Interface: A Computational Approach. Master’s Thesis, University of Illinois, 2007

Tong Zhang, Mark Hasegawa-Johnson, & Stephen E. Levinson. Extraction of Pragmatic and Semantic Salience from Spontaneous Spoken English. Speech Communication, 2007

Xi Zhou, Yun Fu, Ming Liu, Mark Hasegawa-Johnson, & Thomas Huang. Robust Analysis and Weighting on MFCC Components for Speech Recognition and Speaker Identification. International Conference on Multimedia and Expo, pp. 188-191, 2007

Ming Liu, Zhengyou Zhang, Mark Hasegawa-Johnson, & Thomas Huang. Exploring Discriminative Learning for Text-Independent Speaker Recognition. ICME, pp. 56-59, 2007

Soo-Eun Chang, Nicoline Ambrose, Kirk Erickson, & Mark Hasegawa-Johnson. Brain Anatomy Differences in Childhood Stuttering. NeuroImage, 2007

Jennifer Cole, Yoonsook Mo, & Mark Hasegawa-Johnson. Signal-based and expectation-based factors in the perception of prosodic prominence. Journal of Laboratory Phonology, 2007

Jennifer Cole, Heejin Kim, Hansook Choi, & Mark Hasegawa-Johnson. Prosodic effects on acoustic cues to stop voicing and place of articulation: Evidence from Radio News speech. J Phonetics, vol. 35, pp. 180-209, 2007

Mark Hasegawa-Johnson. Multi-Stream Approach to Audiovisual Automatic Speech Recognition. IEEE 9th Workshop on Multimedia Signal Processing (MMSP), pp. 328-331, 2007

Yun Fu, Xi Zhou, Ming Liu, Mark Hasegawa-Johnson, & Thomas S. Huang. Lipreading by Locality Discriminant Graph. IEEE International Conference on Image Processing (ICIP), pp. III:325-8, 2007

Karen Livescu, Özgür Çetin, Mark Hasegawa-Johnson, Simon King, Chris Bartels, Nash Borges, Arthur Kantor, Partha Lal, Lisa Yung, Ari Bezman, Stephen Dawson-Haggerty, Bronwyn Woods, Joe Frankel, Matthew Magimai-Doss, & Kate Saenko. Articulatory-Feature-Based Methods for Acoustic and Audio-Visual Speech Recognition: 2006 JHU Summer Workshop Final Report. no. WS06, Johns Hopkins Center for Language and Speech Processing, 2007

Ken Chen, Mark Hasegawa-Johnson, & Jennifer Cole. A Factored Language Model for Prosody-Dependent Speech Recognition. Robust Speech Recognition and Understanding, Michael Grimm and Kristian Kroschel, eds., INTECH Publishing, pp. 319-332, 2007

Weimo Zhu, Mark Hasegawa-Johnson, Karen Chapman-Novakofski, & Arthur Kantor. Cellphone-Based Nutrition E-Diary. National Nutrient Database Conference, 2007

Weimo Zhu, Mark Hasegawa-Johnson, Arthur Kantor, Dan Roth, Yong Gao, Youngsik Park, & Lin Yang. E-coder for Automatic Scoring Physical Activity Diary Data: Development and Validation. ACSM, 2007

Mark Hasegawa-Johnson. Phonology and the Art of Automatic Speech Recognition. Director’s Seminar Series, Beckman Institute, University of Illinois at Urbana-Champaign, Nov, 2006

Rahul Chitturi, & Mark Hasegawa-Johnson. Novel Time-Domain Multi-class SVMs for Landmark Detection. Proc. Interspeech, paper 1904-Thu1CaP.14, Sep, 2006

Mark Hasegawa-Johnson. Object Tracking and Asynchrony in Audio-Visual Speech Recognition. talk given to the Artificial Intelligence, Vision, and Robotics seminar series, Aug, 2006

Mark Hasegawa-Johnson. Dealing with Acoustic Noise. Part III: Video. Tutorial presentation given at WS06, Center for Language and Speech Processing, Jul, 2006

Mark Hasegawa-Johnson. Dealing with Acoustic Noise. Part II: Beamforming. Tutorial presentation given at WS06, Center for Language and Speech Processing, Jul, 2006

Mark Hasegawa-Johnson. Dealing with Acoustic Noise. Part I: Spectral Estimation. Tutorial presentation given at WS06, Center for Language and Speech Processing, Jul, 2006

Mark Hasegawa-Johnson, Jonathan Gunderson, Adrienne Perlman, & Thomas Huang. HMM-Based and SVM-Based Recognition of the Speech of Talkers with Spastic Dysarthria. ICASSP, pp. III:1060-3, May, 2006

Lae-Hoon Kim, Mark Hasegawa-Johnson, & Koeng-Mo Sung. Generalized Optimal Multi-Microphone Speech Enhancement Using Sequential Minimum Variance Distortionless Response (MVDR) Beamforming and Postfiltering. ICASSP, pp. III:65-8, May, 2006

Tong Zhang, Mark Hasegawa-Johnson, & Stephen E. Levinson. Cognitive State Classification in a Spoken Tutorial Dialogue System. Speech Communication, vol. 48, no. 6, 2006

Rajiv Reddy, & Mark Hasegawa-Johnson. Analysis of Pitch Contours in Repetition-Disfluency using Stem-ML. Midwest Computational Linguistics Colloquium, 2006

Soo-Eun Chang, Kirk I. Erickson, Nicoline G. Ambrose, Mark Hasegawa-Johnson, & C.L. Ludlow. Deficient white matter development in left hemisphere speech-language regions in children who stutter. Society for Neuroscience, Atlanta, GA, 2006

Rahul Chitturi, & Mark Hasegawa-Johnson. Novel entropy based moving average refiners for HMM landmarks. Proc. Interspeech 2006, paper 1911-Wed1FoP.8, 2006

Heejin Kim, Taejin Yoon, Jennifer Cole, & Mark Hasegawa-Johnson. Acoustic differentiation of L- and L-L% in Switchboard and Radio News speech. Proceedings of Speech Prosody, Dresden, 2006

Rajiv Reddy. Analysis of Pitch Contours in Repetition-Disfluency Using Stem-ML. B.S. Thesis, University of Illinois, 2006

Bowon Lee. Robust Speech Recognition in a Car Using a Microphone Array. Master’s Thesis, University of Illinois, 2006

Camille Goudeseune, & Bowon Lee. AVICAR: Audio-Visual Speech Recognition in a Car Environment. Promotional Film, 2006

Ken Chen, Mark Hasegawa-Johnson, Aaron Cohen, Sarah Borys, Sung-Suk Kim, Jennifer Cole, & Jeung-Yoon Choi. Prosody Dependent Speech Recognition on Radio News Corpus of American English. IEEE Transactions on Speech and Audio Processing, vol. 14, no. 1, pp. 232-245, 2006

Sarah Borys, & Mark Hasegawa-Johnson. Distinctive Feature Based SVM Discriminant Features for Improvements to Phone Recognition on Telephone Band Speech. ISCA Interspeech, Oct, 2005

Lae-Hoon Kim, & Mark Hasegawa-Johnson. Generalized multi-microphone spectral amplitude estimation based on correlated noise model. 119th Convention of the Audio Engineering Society, New York, Oct, 2005

Mark Hasegawa-Johnson, James Baker, Sarah Borys, Ken Chen, Emily Coogan, Steven Greenberg, Amit Juneja, Katrin Kirchhoff, Karen Livescu, Srividya Mohan, Jennifer Muller, Kemal Sönmez, & Tianyu Wang. Landmark-Based Speech Recognition: Report of the 2004 Johns Hopkins Summer Workshop. ICASSP, pp. 1213-1216, Mar, 2005

Weimo Zhu, Mark Hasegawa-Johnson, & Mital Arun Gandhi. Accuracy of Voice-Recognition Technology in Collecting Behavior Diary Data. Association of Test Publishers (ATP): Innovations in Testing, Mar, 2005

Tae-Jin Yoon, Jennifer Cole, Mark Hasegawa-Johnson, & Chilin Shih. Detecting Non-modal Phonation in Telephone Speech. unpublished manuscript, 2005

Tae-Jin Yoon, Jennifer Cole, Mark Hasegawa-Johnson, & Chilin Shih. Acoustic correlates of non-modal phonation in telephone speech. The Journal of the Acoustical Society of America, vol. 117, no. 4, pp. 2621, 2005

Christopher Co. Room Reconstruction and Navigation Using Acoustically Obtained Room Impulse Responses and a Mobile Robot Platform. Master’s Thesis, University of Illinois, 2005

Taejin Yoon. Mapping Syntax and Prosody. Presentation at the Midwest Computational Linguistics Colloquium, Columbus, OH, 2005

Jeung-Yoon Choi, Mark Hasegawa-Johnson, & Jennifer Cole. Finding Intonational Boundaries Using Acoustic Cues Related to the Voice Source. Journal of the Acoustical Society of America, vol. 118, no. 4, pp. 2579-88, 2005

Jennifer Cole, Mark Hasegawa-Johnson, Chilin Shih, Eun-Kyung Lee, Heejin Kim, H. Lu, Yoonsook Mo, & Tae-Jin Yoon. Prosodic Parallelism as a Cue to Repetition and Hesitation Disfluency. Disfluency In Spontaneous Speech (DISS’05), Aix-en-Provence, France, pp. 53-58, 2005

Yeojin Kim, & Mark Hasegawa-Johnson. Phonetic Segment Rescoring Using SVMs. Midwest Computational Linguistics Colloquium, Columbus, OH, 2005

Mark Hasegawa-Johnson, James Baker, Steven Greenberg, Katrin Kirchhoff, Jennifer Muller, Kemal Sönmez, Sarah Borys, Ken Chen, Amit Juneja, Karen Livescu, Srividya Mohan, Emily Coogan, & Tianyu Wang. Landmark-Based Speech Recognition: Report of the 2004 Johns Hopkins Summer Workshop. no. WS04, Johns Hopkins Center for Language and Speech Processing, 2005

Yanli Zheng. Acoustic Modeling and Feature Extraction for Speech Recognition. Master’s Thesis, University of Illinois, 2005

Mark Hasegawa-Johnson, Ken Chen, Jennifer Cole, Sarah Borys, Sung-Suk Kim, Aaron Cohen, Tong Zhang, Jeung-Yoon Choi, Heejin Kim, Taejin Yoon, & Sandra Chavarria. Simultaneous Recognition of Words and Prosody in the Boston University Radio Speech Corpus. Speech Communication, vol. 46, no. 3-4, pp. 418-439, 2005

Tong Zhang, Mark Hasegawa-Johnson, & Stephen E. Levinson. A Hybrid Model for Spontaneous Speech Understanding. Proceedings of the National Conference on Artificial Intelligence, pp. 1-8, 2005

Arthur Kantor, Weimo Zhu, & Mark Hasegawa-Johnson. Restricted domain speech classification using automatic transcription and SVMs. Midwest Computational Linguistics Colloquium, 2005

Soo-Eun Chang, Nicoline Ambrose, & Mark Hasegawa-Johnson. An MRI (DTI) study on children with persistent developmental stuttering. ASHA Convention, Nov, 2004

Sarah Borys, Mark Hasegawa-Johnson, Ken Chen, & Aaron Cohen. Modeling and Recognition of Phonetic and Prosodic Factors for Improvements to Acoustic Speech Recognition Models. Proc. Interspeech, pp. 3013-3016, Oct, 2004

Mark Hasegawa-Johnson, Stephen E. Levinson, & Tong Zhang. Children’s Emotion Recognition in an Intelligent Tutoring Scenario. Proc. Interspeech, pp. 1441-1444, Oct, 2004

Yanli Zheng, Mark Hasegawa-Johnson, & Sarah Borys. Stop Consonant Classification by Dynamic Formant Trajectory. Proc. Interspeech, pp. 396-399, Oct, 2004

Tae-Jin Yoon, Sandra Chavarria, Jennifer Cole, & Mark Hasegawa-Johnson. Intertranscriber Reliability of Prosodic Labeling on Telephone Conversation Using ToBI. Proc. Interspeech, pp. 2729-2732, Oct, 2004

Mark Hasegawa-Johnson. Landmark-Based Speech Recognition: The Marriage of High-Dimensional Machine Learning Techniques with Modern Linguistic Representations. talk given at Tsinghua University, Oct, 2004

Mark Hasegawa-Johnson, & Ameya Deoras. A Factorial HMM Approach to Robust Isolated Digit Recognition in Background Music. Proc. Interspeech, pp. 2093-2096, Oct, 2004

Ken Chen, & Mark Hasegawa-Johnson. Modeling pronunciation variation using artificial neural networks for English spontaneous speech. Proc. Interspeech, pp. 400-403, Oct, 2004

Bowon Lee, Mark Hasegawa-Johnson, Camille Goudeseune, Suketu Kamdar, Sarah Borys, Ming Liu, & Thomas Huang. AVICAR: Audio-Visual Speech Corpus in a Car Environment. Proc. Interspeech, pp. 380-383, Oct, 2004

Mark Hasegawa-Johnson. Speech Recognition Models of the Interdependence Among Syntax, Prosody, and Segmental Acoustics. talk given at Tsinghua University, Oct, 2004

Mital Gandhi, & Mark Hasegawa-Johnson. Source Separation using Particle Filters. Proc. Interspeech, pp. 2673-2676, Oct, 2004

Mark Hasegawa-Johnson, Sarah Borys, & Ken Chen. Experiments in Landmark-Based Speech Recognition. Sound to Sense: Workshop in Honor of Kenneth N. Stevens, Jun, 2004

Mark Hasegawa-Johnson, Jennifer Cole, Chilin Shih, Ken Chen, Aaron Cohen, Sandra Chavarria, Heejin Kim, Taejin Yoon, Sarah Borys, & Jeung-Yoon Choi. Speech Recognition Models of the Interdependence Among Syntax, Prosody, and Segmental Acoustics. HLT/NAACL Workshop on Higher-Level Knowledge in Automatic Speech Recognition and Understanding, pp. 56-63, May, 2004

Ken Chen, Mark Hasegawa-Johnson, Aaron Cohen, & Jennifer Cole. A Maximum Likelihood Prosody Recognizer. Speech Prosody, Nara, Japan, pp. 509-512, Mar, 2004

Yuexi Ren, Sung-Suk Kim, Mark Hasegawa-Johnson, & Jennifer Cole. Speaker-Independent Automatic Detection of Pitch Accent. Speech Prosody, Nara, Japan, pp. 521-524, Mar, 2004

Heejin Kim, Jennifer Cole, Hansook Choi, & Mark Hasegawa-Johnson. The Effect of Accent on Acoustic Cues to Stop Voicing and Place of Articulation in Radio News Speech. Speech Prosody, Nara, Japan, pp. 29-32, Mar, 2004

Sandra Chavarria, Taejin Yoon, Jennifer Cole, & Mark Hasegawa-Johnson. Acoustic differentiation of ip and IP boundary levels: Comparison of L- and L-L% in the Switchboard corpus. Speech Prosody, Nara, Japan, pp. 333-336, Mar, 2004

Ken Chen, & Mark Hasegawa-Johnson. How Prosody Improves Word Recognition. Speech Prosody, Nara, Japan, pp. 583-586, Mar, 2004

Mark Hasegawa-Johnson, Stephen Levinson, & Tong Zhang. Automatic detection of contrast for speech understanding. Proc. Interspeech 2004, pp. 581-584, 2004

Aaron Cohen. A Survey of Machine Learning Methods for Predicting Prosody in Radio Speech. Master’s Thesis, University of Illinois, 2004

Ken Chen, & Mark Hasegawa-Johnson. An Automatic Prosody Labeling System Using ANN-Based Syntactic-Prosodic Model and GMM-Based Acoustic-Prosodic Model. ICASSP, 2004

Sung-Suk Kim, Mark Hasegawa-Johnson, & Ken Chen. Automatic Recognition of Pitch Movements Using Multilayer Perceptron and Time-Delay Recursive Neural Network. IEEE Signal Processing Letters, vol. 11, no. 7, pp. 645-648, 2004

Yanli Zheng, & Mark Hasegawa-Johnson. Formant Tracking by Mixture State Particle Filter. ICASSP, 2004

Tae-Jin Yoon, Heejin Kim, & Sandra Chavarria. Local Acoustic Cues Distinguishing Two Levels of Prosodic Phrasing: Speech Corpus Evidence. Labphon 9, University of Illinois at Urbana-Champaign, 2004

Mohammad Kamal Omar, & Mark Hasegawa-Johnson. Model Enforcement: A Unified Feature Transformation Framework for Classification and Recognition. IEEE Transactions on Signal Processing, vol. 52, no. 10, pp. 2701-2710, 2004

Ameya Deoras, & Mark Hasegawa-Johnson. A Factorial HMM Approach to Simultaneous Recognition of Isolated Digits Spoken by Multiple Talkers on One Audio Channel. ICASSP, 2004

Stefan Geirhofer. Feature Reduction with Linear Discriminant Analysis and its Performance on Phoneme Recognition. Undergraduate research project, 2004

Yuexi Ren, Mark Hasegawa-Johnson, & Stephen E. Levinson. Semantic analysis for a speech user interface in an intelligent-tutoring system. Intl. Conf. on Intelligent User Interfaces, Madeira, Portugal, 2004

Ken Chen. Prosody Dependent Speech Recognition on American Radio News Speech. Master’s Thesis, University of Illinois, 2004

Yanli Zheng, & Mark Hasegawa-Johnson. Particle Filtering Approach to Bayesian Formant Tracking. IEEE Workshop on Statistical Signal Processing, pp. 581-584, Sep, 2003

Mohammed Kamal Omar, & Mark Hasegawa-Johnson. Maximum Conditional Mutual Information Projection For Speech Recognition. Proc. Interspeech, pp. 505-508, Sep, 2003

Mohammed Kamal Omar, & Mark Hasegawa-Johnson. Non-Linear Maximum Likelihood Feature Transformation For Speech Recognition. Proc. Interspeech, pp. 2497-2500, Sep, 2003

Ken Chen, Mark Hasegawa-Johnson, Aaron Cohen, Sarah Borys, & Jennifer Cole. Prosody Dependent Speech Recognition with Explicit Duration Modelling at Intonational Phrase Boundaries. Proc. Interspeech, pp. 393-396, Sep, 2003

Tong Zhang, Mark Hasegawa-Johnson, & Stephen E. Levinson. Mental State Detection of Dialogue System Users via Spoken Language. ISCA/IEEE Workshop on Spontaneous Speech Processing and Recognition (SSPR), pp. MAP17.1-4, Apr, 2003

Yanli Zheng, & Mark Hasegawa-Johnson. Acoustic segmentation using switching state Kalman Filter. ICASSP, pp. I:752-755, Apr, 2003

Yanli Zheng, Mark Hasegawa-Johnson, & Shamala Pizza. Analysis of the three-dimensional tongue shape using a three-index factor analysis model. Journal of the Acoustical Society of America, vol. 113, no. 1, pp. 478-486, Jan, 2003

Tong Zhang, Mark Hasegawa-Johnson, & Stephen E. Levinson. An empathic-tutoring system using spoken language. Australian Conference on Computer-Human Interaction (OZCHI), pp. 498-501, 2003

Ken Chen, Mark Hasegawa-Johnson, & Sung-Suk Kim. An Intonational Phrase Boundary and Pitch Accent Dependent Speech Recognizer. International Conference on Systems, Cybernetics, and Intelligence, 2003

Ken Chen, & Mark Hasegawa-Johnson. Improving the robustness of prosody dependent language modeling based on prosody syntax cross-correlation. ASRU, 2003

Mark Hasegawa-Johnson, Shamala Pizza, Abeer Alwan, Jul Cha, & Katherine Haker. Vowel Category Dependence of the Relationship Between Palate Height, Tongue Height, and Oral Area. Journal of Speech, Language, and Hearing Research, vol. 46, no. 3, pp. 738-753, 2003

Mark Hasegawa-Johnson. Bayesian Learning for Models of Human Speech Perception. IEEE Workshop on Statistical Signal Processing, St. Louis, MO, pp. 393-396, 2003

Jennifer Cole, Hansook Choi, Heejin Kim, & Mark Hasegawa-Johnson. The effect of accent on the acoustic cues to stop voicing in Radio News speech. ICPhS, pp. 2665-2668, 2003

Ameya Deoras. A Factorial HMM Approach to Robust Isolated Digit Recognition in Non-Stationary Noise. B.S. Thesis, University of Illinois, 2003

Mohamed Kamal Mahmoud Omar. Acoustic Feature Design for Speech Recognition: A Statistical Information-Theoretic Approach. Master’s Thesis, University of Illinois, 2003

Mohammed Kamal Omar, & Mark Hasegawa-Johnson. Approximately Independent Factors of Speech Using Nonlinear Symplectic Transformation. IEEE Transactions on Speech and Audio Processing, vol. 11, no. 6, pp. 660-671, 2003

Mohammed Kamal Omar, & Mark Hasegawa-Johnson. Non-Linear Independent Component Analysis for Speech Recognition. International Conference on Computer, Communication and Control Technologies (CCCT ’03), 2003

Mohammed Kamal Omar, & Mark Hasegawa-Johnson. Strong-Sense Class-Dependent Features for Statistical Recognition. IEEE Workshop on Statistical Signal Processing, St. Louis, MO, pp. 473-476, 2003

Ken Chen, Mark Hasegawa-Johnson, & Jennifer Cole. Prosody Dependent Speech Recognition on Radio News. IEEE Workshop on Statistical Signal Processing, St. Louis, MO, 2003

Sarah Borys. Recognition of Prosodic Factors and Detection of Landmarks for Improvements to Continuous Speech Recognition Systems. B.S. Thesis, University of Illinois, 2003

Sarah Borys, Mark Hasegawa-Johnson, & Jennifer Cole. The Importance of Prosodic Factors in Phoneme Modeling with Applications to Speech Recognition. ACL Student Session, 2003

Sarah Borys, Mark Hasegawa-Johnson, & Jennifer Cole. Prosody as a Conditioning Variable in Speech Recognition. Illinois Journal of Undergraduate Research, 2003

Bowon Lee, Mark Hasegawa-Johnson, & Camille Goudeseune. Open Loop Multichannel Inversion of Room Impulse Response. J. Acoust. Soc. Am., vol. 113, no. 4, pp. 2202-2203, 2003

Mark Hasegawa-Johnson, & Abeer Alwan. Speech Coding: Fundamentals and Applications. Wiley Encyclopedia of Telecommunications and Signal Processing, J. Proakis, ed., Wiley and Sons, NY, Dec, 2002

Mohammed Kamal Omar, Ken Chen, Mark Hasegawa-Johnson, & Yigal Brandman. An Evaluation of using Mutual Information for Selection of Acoustic-Features Representation of Phonemes for Speech Recognition. Proc. Interspeech, Denver, CO, pp. 2129-2132, Sep, 2002

Stephen E. Levinson, Thomas S. Huang, Mark A. Hasegawa-Johnson, Ken Chen, Stephen Chu, Ashutosh Garg, Zhinian Jing, Danfeng Li, J. Lin, Mohammed Kamal Omar, & Z. Wen. Multimodal Dialog Systems Research at Illinois. ARPA Workshop on Multimodal Speech Recognition and SPINE, Jun, 2002

Zhinian Jing, & Mark Hasegawa-Johnson. Auditory-Modeling Inspired Methods of Feature Extraction for Robust Automatic Speech Recognition. ICASSP, pp. IV:4176, May, 2002

Mohammed Kamal Omar, & Mark Hasegawa-Johnson. Maximum Mutual Information Based Acoustic Features Representation of Phonological Features for Speech Recognition. ICASSP, pp. I:81-84, May, 2002

David Petruncio. Evaluation of Various Features for Music Genre Classification with Hidden Markov Models. B.S. Thesis, University of Illinois, 2002

Mark Hasegawa-Johnson. Finding the Best Acoustic Measurements for Landmark-Based Speech Recognition. Accumu Magazine, vol. 11, Kyoto Computer Gakuin, Kyoto, Japan, pp. 45-47, 2002

Zhinian Jing. Voice Index and Frame Index for Recognition of Digits in Speech Background. Master’s Thesis, University of Illinois, 2002

James Beauchamp, Heinrich Taube, Sever Tipei, Scott Wyatt, Lippold Haken, & Mark Hasegawa-Johnson. Acoustics, Audio, and Music Technology Education at the University of Illinois. J. Acoust. Soc. Am., vol. 110, no. 5, pp. 2961, 2001

Mark Hasegawa-Johnson. Preliminary Work and Proposed Continuation: Imaging of Speech Anatomy and Behavior. Unpublished presentation at Universities of Illinois Inter-campus Biomedical Imaging Forum, 2001

Mohammed K. Omar, Mark Hasegawa-Johnson, & Stephen E. Levinson. Gaussian Mixture Models of Phonetic Boundaries for Speech Recognition. ASRU, 2001

Wira Gunawan, & Mark Hasegawa-Johnson. PLP Coefficients can be Quantized at 400 bps. ICASSP, Salt Lake City, UT, pp. 2.2.1-4, 2001

Mark Hasegawa-Johnson. Line Spectral Frequencies are the Poles and Zeros of a Discrete Matched-Impedance Vocal Tract Model. Journal of the Acoustical Society of America, vol. 108, no. 1, pp. 457-460, 2000

Yanli Zheng, & Mark Hasegawa-Johnson. Three Dimensional Tongue Shape Factor Analysis. ASHA Leader, vol. 5, no. 16, pp. 144, 2000

Mark Hasegawa-Johnson. Time-frequency distribution of partial phonetic information measured using mutual information. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 4, pp. 133-136, 2000

Mark Hasegawa-Johnson. Multivariate-State Hidden Markov Models for Simultaneous Transcription of Phones and Formants. ICASSP, Istanbul, pp. 1323-1326, 2000

Wira Gunawan. Distributed Speech Recognition. Master’s Thesis, University of Illinois, 2000

Jul Setsu Cha. Articulatory Speech Synthesis of Female and Male Talkers. Master’s Thesis, UCLA, 2000

Jun Huang, Stephen Levinson, & Mark Hasegawa-Johnson. Signal approximation in Hilbert space and its application on articulatory speech synthesis. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 2, pp. 775-778, 2000

Mark Hasegawa-Johnson, Jul Cha, Shamala Pizza, & Katherine Haker. CTMRedit: A case study in human-computer interface design. International Conference On Public Participation and Information Technology, Lisbon, pp. 575-584, 1999

Mark Hasegawa-Johnson, Jul Cha, & Katherine Haker. CTMRedit: A Matlab-based tool for segmenting and interpolating MRI and CT images in three orthogonal planes. 21st Annual International Conference of the IEEE/EMBS Society, pp. 1170, 1999

Mark Hasegawa-Johnson. Combining magnetic resonance image planes in the Fourier domain for improved spatial resolution. International Conference On Signal Processing Applications and Technology, Orlando, FL, pp. 81.1-5, 1999

Tomohiko Taniguchi, & Mark Johnson. Speech coding and decoding system. United States Patent Number 5799131, Aug, 1998

Mark Hasegawa-Johnson. Electromagnetic Exposure Safety of the Carstens Articulograph AG100. Journal of the Acoustical Society of America, vol. 104, pp. 2529-2532, 1998

Sumiko Takayanagi, Mark Hasegawa-Johnson, Laurie S. Eisner, & Amy Schaefer-Martinez. Information theory and variance estimation techniques in the analysis of category rating data and paired comparisons. J. Acoust. Soc. Am., vol. 102, pp. 3091, 1997

Mark A. Hasegawa-Johnson. Formant and Burst Spectral Measurements with Quantitative Error Models for Speech Sound Classification. Ph.D. Thesis, MIT, 1996

Mark A. Hasegawa-Johnson. Burst spectral measures and formant frequencies can be used to accurately discriminate stop place of articulation. J. Acoust. Soc. Am., vol. 98, pp. 2890, 1995

Tomohiko Taniguchi, Mark Johnson, Yasuji Ohta, Hideki Kurihara, Yoshinori Tanaka, & Yoshihito Sakai. Speech coding system having codebook storing differential vectors between each two adjoining code vectors. United States Patent Number 5323486, Jun, 1994

Mark A. Johnson. A mapping between trainable generalized properties and the acoustic correlates of distinctive features. MIT Speech Communication Group Working Papers, vol. 9, pp. 94-105, 1994

Mark Johnson. Automatic context-sensitive measurement of the acoustic correlates of distinctive features. ICSLP, Yokohama, pp. 1639-1643, 1994

Tomohiko Taniguchi, & Mark Johnson. Speech coding system. United States Patent Number 5245662, Sep, 1993

Tomohiko Taniguchi, Mark Johnson, Hideki Kurihara, Yoshinori Tanaka, & Yasuji Ohta. Speech coding and decoding system. United States Patent Number 5199076, Mar, 1993

Mark A. Johnson. A mapping between trainable generalized properties and the acoustic correlates of distinctive features. J. Acoust. Soc. Am., vol. 94, pp. 1865, 1993

Mark A. Johnson. Using beam elements to model the vocal fold length in breathy voicing. J. Acoust. Soc. Am., vol. 91, pp. 2420-2421, 1992

Mark A. Johnson. Analysis of durational rhythms in two poems by Robert Frost. MIT Speech Communication Group Working Papers, vol. 8, pp. 29-42, 1992

Mark Johnson, & Tomohiko Taniguchi. On-line and off-line computational reduction techniques using backward filtering in CELP speech coders. IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 40, pp. 2090-2093, 1992

Mark A. Johnson, & Tomohiko Taniguchi. Low-complexity multi-mode VXC using multi-stage optimization and mode selection. ICASSP, Toronto, Canada, pp. 221-224, 1991

Tomohiko Taniguchi, Mark A. Johnson, & Yasuji Ohta. Pitch sharpening for perceptually improved CELP, and the sparse-delta codebook for reduced computation. ICASSP, Toronto, Canada, pp. 241-244, 1991

Tomohiko Taniguchi, Fumio Amano, & Mark A. Johnson. Improving the performance of CELP-based speech coding at low bit rates. International Symposium on Circuits and Systems, Singapore, 1991

Mark A. Johnson, & Tomohiko Taniguchi. Computational reduction in sparse-codebook CELP using backward-weighting of the input. Institute of Electr. and Information and Comm. Eng. Symposium DSP 90-15, Hakata, pp. 61-66, 1990

Tomohiko Taniguchi, Mark A. Johnson, & Yasuji Ohta. Multi-vector pitch-orthogonal LPC: quality speech with low complexity at rates between 4 and 8 kbps. ICSLP, Kobe, pp. 113-116, 1990

Mark A. Johnson, & Tomohiko Taniguchi. Pitch-orthogonal code-excited LPC. IEEE Global Telecommunications Conference (GLOBECOM), San Diego, CA, pp. 542-546, 1990