OpenAI introduces 'Whisper': Speech Recognition System

 

OpenAI has released Whisper, a new automatic speech recognition (ASR) system, as open-source software. According to the company, Whisper enables robust transcription in multiple languages as well as translation from those languages into English.

Countless businesses have developed speech recognition systems, which are at the core of software and services from tech behemoths like Google, Amazon, and Meta. According to OpenAI, what makes Whisper unique is that it was trained on 680,000 hours of multilingual and "multitask" data gathered from the web, which led to better recognition of distinctive accents, background noise, and technical jargon.

Whisper has flaws, especially regarding text prediction. According to OpenAI, because the system was trained on a large amount of "noisy" data, Whisper may include words in its transcriptions that weren't actually spoken. This is possibly because Whisper is simultaneously trying to predict the next word in the audio and to transcribe the audio itself. Whisper also doesn't perform equally well across languages, exhibiting a higher error rate for speakers of languages underrepresented in the training set.

"The primary intended users of [the Whisper] models are AI researchers studying robustness, generalization, capabilities, biases and constraints of the current model. However, Whisper is also potentially quite useful as an automatic speech recognition solution for developers, especially for English speech recognition," OpenAI wrote in the GitHub repo for Whisper, from where several versions of the system can be downloaded. "[The models] show strong ASR results in ~10 languages. They may exhibit additional capabilities … if fine-tuned on certain tasks like voice activity detection, speaker classification or speaker diarization but have not been robustly evaluated in these areas."

Unfortunately, uneven performance across groups of speakers is nothing new in the world of speech recognition. Even the finest systems have biases; a 2020 Stanford study found that systems from Amazon, Apple, Google, IBM, and Microsoft made far fewer mistakes with white users (an average error rate of roughly 19%) than with Black users (roughly 35%). Despite this, OpenAI expects that Whisper's transcription capabilities could be used to improve existing accessibility tools.
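
Error rates in comparisons like these are usually expressed as word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn a system's transcript into a reference transcript, divided by the number of words in the reference. The following is a minimal Python sketch of that calculation. It is not code from OpenAI or from the Stanford study; the function name `word_error_rate` and the sample sentences are made up for illustration, and real evaluations typically normalize punctuation and spelling before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper (hypothetical name); only lowercases and splits on
    whitespace, with no punctuation or spelling normalization.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word was dropped
                curr[j - 1] + 1,     # insertion: extra word in the transcript
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substitution, one deletion, one insertion.
reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown fox jumped over lazy dog today"
print(f"WER: {word_error_rate(reference, hypothesis):.2f}")
```

Words that appear in a transcript but were never spoken count as insertions under this measure, so they raise the error rate even when every spoken word is transcribed correctly.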

Whisper's release does not necessarily indicate what OpenAI has in store for the future. While focusing increasingly on commercial projects like DALL-E 2 and GPT-3, the company is also pursuing several purely theoretical lines of research, such as artificial intelligence systems that learn by watching videos.

About OpenAI

OpenAI is an artificial intelligence (AI) research organization, founded as a non-profit, that aims to develop and guide AI in ways that benefit all of humanity. It was founded in 2015 by a group that included Elon Musk and Sam Altman, and it is headquartered in San Francisco, California. OpenAI was created in part because of its founders' existential worries about the potential for disaster from the irresponsible use or abuse of general-purpose AI. The organization takes a long-term view, focusing on fundamental advances in AI and its capabilities. Its founders and other backers launched it with $1 billion in pledged funding. Musk left the organization's board in February 2018, citing potential conflicts with his work at Tesla, the electric vehicle maker he leads.

The company's stated goal of developing safe artificial general intelligence for the benefit of humanity is reflected in its intention to collaborate openly with other research institutions and individuals. Except where they might have a negative impact on safety, the company's research and patents are meant to be accessible to the general public.

Peter Daniels
Peter Daniels is the lead journalist for InsiderApps.com

