Expanding Accessibility in Neuroscience Through Speech-to-Text Technologies
Campuses worldwide are seeing the first few waves of students with hearing loss who were diagnosed at or just after birth and subsequently benefited from early intervention with digital hearing aids and/or cochlear implants. Now is the time for the global STEM community to leverage speech-to-text (S2T) technologies in new ways that can benefit everyone, including those with hearing loss.
By World Health Organization estimates, over 5% of the world's population has disabling hearing loss. That number is expected to rise through 2050, with significant implications for the representation of people with hearing loss in STEM, which currently stands at less than one percent according to a report of an NSF workshop.
People with hearing loss face additional challenges in classrooms, laboratories, and other spaces, such as ignorance, invisibility, isolation, and impostor syndrome. For example, a student with single-sided hearing loss may be doing well academically but still struggle to follow a lecture, seminar, or lab meeting, which can lead to listening fatigue that cumulatively degrades learning. Time-honored accommodations such as note-taking and human captioning (speech transcribed by humans) have proven useful, but lack of resources limits their availability.
However, the advent of cloud computing and machine learning in recent years has produced a dramatic improvement in the performance of S2T technologies, bringing the word error rate down from roughly 15% to 5% with speaker and vocabulary training, which is on par with human captioning and potentially makes STEM more widely accessible to people with hearing loss. What's more, the COVID-19 pandemic has changed the landscape of higher education as well as labs, increasing the urgency of leveraging S2T technologies.
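For readers unfamiliar with the metric, word error rate is the standard measure of S2T accuracy: the number of word substitutions, deletions, and insertions needed to turn the system's output into a reference transcript, divided by the number of words in the reference. A minimal sketch of that calculation (the function name is illustrative, not from any particular toolkit):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate = (substitutions + deletions + insertions) / reference word count,
    computed as a word-level Levenshtein edit distance via dynamic programming."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # d[i][j] = minimum edits to turn the first i reference words
    # into the first j hypothesis words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i                      # all deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j                      # all insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

# One missed word out of a six-word reference gives a WER of about 17%.
print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

On this scale, the improvement described above means an S2T system now mistranscribes about one word in twenty rather than one in seven.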
Speech-to-Text Technology Options
S2T apps use automatic speech recognition (ASR) algorithms on different electronic devices. Major ones that are freely available include Microsoft Translator, Google’s Live Transcribe, and Web Captioner. Others, which charge a fee for use above a threshold, include Otter.ai, verbit.ai and Live Caption.
Many can be integrated either manually or digitally into online conferencing apps such as Zoom, Google Meet, Microsoft Teams, WebEx, and GoToMeeting. Furthermore, online presentations can be captioned via Presentation Translator and Google Slides.
These apps now make it possible for people with hearing loss to participate in classrooms, seminars, lab meetings, one-on-one meetings, and conferences. They can also mitigate the unforeseen impact of facemasks, which make lipreading impossible. With any of these options, the clarity and audibility of the source speech are central to accurate output, whether transcribing live conversation, podcasts, videos, or other audio.
Benefits to the Scientific Community
A few institutions have begun to use S2T apps for the whole class in several courses. The Rochester Institute of Technology piloted the use of Presentation Translator in a general biology class of 250 students, and the university has partnered with Microsoft to further support students with hearing loss.
Not only is captioning useful to students with hearing loss, but it can provide a cognitive benefit to anyone, allowing them to check what they heard or catch up if they were momentarily distracted. It should now be possible to embed captioning with shared notetaking to enable the whole class and instructor to see the different interpretations of the same material.
In the lab, S2T apps on smartphones, aided by Bluetooth microphones or directional USB microphones, can help alleviate the communication barriers imposed by the current need to wear facemasks. Demonstrations of laboratory and surgical procedures could likewise be facilitated with S2T apps running on movable flat panels.
Conferences, meanwhile, can leverage existing audio-visual technologies to feed audio into an S2T app and provide transcription in each attendee's native language. This means that any facility used for meetings, seminars, or conferences should implement high-fidelity microphones, in the same way that wheelchair ramps are now mandated for buildings. While S2T apps do not yet meet the minimum requirements of mandated accommodations, there is no reason why, in the next few years, they should not become competitive with human captioning and thereby make STEM globally accessible.
Preparation of this article was supported by NIH grant R25 NS107167.