Oracle APEX: Transcribing Audio to Text Without REST, Directly in the Browser in only 5 Minutes


Whenever I’m researching something to solve a specific need, I usually explore Google and GitHub, downloading apps shared by other devs to learn new techniques. While looking for ways to implement drag and drop in an interactive grid, I stumbled upon an audio-to-text transcription test by Paulo Kunzel [at this link]. Since I already had the app installed, I decided to give it a try.
Here’s my take: I found this approach fantastic because it’s incredibly fast and needs nothing but browser-side JavaScript: no need to sign up for any external service, call REST endpoints yourself, or use AI APIs to process the transcription. I asked ChatGPT to explain how it works, and here’s the answer:
“Speech recognition in the browser using the SpeechRecognition API works natively, without needing to manually send the audio to a server. However, the audio is processed in the cloud — usually by Google’s servers — automatically and transparently. The browser captures sound from the microphone, sends it to the speech recognition servers, and returns the converted text via JavaScript events, all without requiring manual API calls or external libraries.”
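Since the recognition is provided by the browser itself, support varies from one browser to another, so a quick feature check can save some head-scratching. The snippet below is only a sketch: getSpeechRecognitionCtor is a hypothetical helper name of my own, and it assumes the API is exposed either as SpeechRecognition or under the webkit prefix (which is how Chromium-based browsers and Safari expose it, as far as I know).

```javascript
// Hypothetical helper: returns the speech recognition constructor exposed by
// the given global object, or null when the API is not available.
function getSpeechRecognitionCtor(globalObj) {
  return globalObj.SpeechRecognition || globalObj.webkitSpeechRecognition || null;
}

// In an APEX page the global object is `window`. The guard keeps this
// snippet harmless outside a browser.
if (typeof window !== 'undefined') {
  const Ctor = getSpeechRecognitionCtor(window);
  if (!Ctor) {
    console.warn('Speech recognition is not supported in this browser.');
  }
}
```

If the helper returns null, one option is simply to hide or disable the dictation button for that user.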
After exploring the app a bit, I made a few small tweaks to handle longer audio and decided to share this simple technique here.
Create a button
Give it any label and static ID. In this case, I’ll call the button DITAR and set its static ID to st_ditar. The button type should be one that triggers a Dynamic Action.
Create a text field
This will display the transcribed text in real time. In this example, it’s P31_TEXTO_RAPIDO.
Add the following code
Insert it in the “Execute when Page Loads” section of your APEX page, adjusting the IDs to match your setup:
// Chrome and Safari expose these constructors under the webkit prefix.
const RecognitionCtor = window.SpeechRecognition || window.webkitSpeechRecognition;
const GrammarListCtor = window.SpeechGrammarList || window.webkitSpeechGrammarList;
const recognition = new RecognitionCtor();
if (GrammarListCtor) { recognition.grammars = new GrammarListCtor(); }
recognition.continuous = true; // keep listening through pauses
recognition.lang = 'pt-BR';
recognition.interimResults = true;
recognition.maxAlternatives = 1;
/* recognition.addEventListener('speechend', () => { recognition.stop(); }); */
// Hold the button to dictate, release to stop.
document.querySelector('#st_ditar').addEventListener('mousedown', () => { recognition.start(); });
document.querySelector('#st_ditar').addEventListener('mouseup', () => { recognition.stop(); });
recognition.addEventListener('error', (event) => { console.error('Speech recognition error: ', event.error); });
recognition.onresult = function (event) {
  console.log('-----------------------------');
  // Take the best alternative of the most recent result.
  let last = event.results.length - 1;
  let texto = event.results[last][0].transcript;
  apex.item('P31_TEXTO_RAPIDO').setValue(texto);
};
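One thing to keep in mind with continuous recognition: the handler above reads only the most recent entry in event.results, so on longer dictations the field may end up showing just the latest phrase. If you would rather keep everything spoken during the session, a small helper can join all the results. This is a sketch of my own, not part of the original app; joinTranscripts is a hypothetical name, and it assumes each result's first alternative is the one you want.

```javascript
// Hypothetical helper: concatenates the first alternative of every result
// in a SpeechRecognitionResultList-like object (anything with `length`
// and index access) into a single string.
function joinTranscripts(results) {
  const pieces = [];
  for (let i = 0; i < results.length; i++) {
    const piece = results[i][0].transcript.trim();
    if (piece) {
      pieces.push(piece);
    }
  }
  return pieces.join(' ');
}

// Possible use inside the page (assumes the same item name as the example):
// recognition.onresult = (event) => {
//   apex.item('P31_TEXTO_RAPIDO').setValue(joinTranscripts(event.results));
// };
```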
Quick Explanation
This script enables speech recognition in the browser using the SpeechRecognition API. When the user presses and holds the button with id="st_ditar", recognition starts (mousedown), and when they release (mouseup), it stops. The audio is sent to the browser’s speech recognition service (usually Google or Microsoft), which returns the text. That text is then assigned to the P31_TEXTO_RAPIDO item in Oracle APEX.
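The press-and-hold behavior described above relies on mouse events. If you also want it to respond to touch or pen input, pointer events are one alternative. The following is a sketch under my own assumptions: bindPushToTalk is a hypothetical helper, and the listening flag is there because calling start() on a recognition object that is already running raises an error.

```javascript
// Hypothetical helper: wires press-and-hold dictation to a button using
// pointer events, which cover mouse, touch, and pen input.
function bindPushToTalk(button, recognition) {
  let listening = false;

  const start = () => {
    if (listening) return; // avoid calling start() while already running
    listening = true;
    recognition.start();
  };

  const stop = () => {
    if (!listening) return; // nothing to stop
    listening = false;
    recognition.stop();
  };

  button.addEventListener('pointerdown', start);
  button.addEventListener('pointerup', stop);
  // Also stop if the pointer leaves the button or the gesture is cancelled.
  button.addEventListener('pointerleave', stop);
  button.addEventListener('pointercancel', stop);
}

// Possible use inside the page (assumes the same static ID as the example):
// bindPushToTalk(document.querySelector('#st_ditar'), recognition);
```

Note that the recognition service may take a moment to shut down after stop(), so a very quick release and press could still fail; listening for the end event would be a more careful approach.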
The settings specify:
Language: Brazilian Portuguese (pt-BR). You can also change this setting to recognize other languages.
interimResults: true for real-time feedback
maxAlternatives: 1 to return just the best match
Errors are logged to the console using console.error.
About the Two Key Parameters:
interimResults
true: Returns partial results as you speak, ideal for real-time transcription.
false: Only returns the final result after a pause, with fewer event triggers but no live feedback.
maxAlternatives
1: Returns only the most confident interpretation.
>1 (e.g., 3): Returns multiple options ranked by confidence, which is useful if you want to let the user pick or apply fuzzy logic.
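To make the two parameters a bit more concrete, here is a sketch of how their output could be handled. Both function names are hypothetical and mine; the code only assumes what the API exposes on each result: an isFinal flag, a length, and alternatives with transcript and confidence properties.

```javascript
// Hypothetical helper for interimResults = true: separates text the service
// has already committed (isFinal) from text that may still change.
function splitFinalAndInterim(results) {
  let finalText = '';
  let interimText = '';
  for (let i = 0; i < results.length; i++) {
    const piece = results[i][0].transcript;
    if (results[i].isFinal) {
      finalText += piece;
    } else {
      interimText += piece;
    }
  }
  return { finalText, interimText };
}

// Hypothetical helper for maxAlternatives > 1: lists every alternative of a
// single result, highest confidence first.
function rankAlternatives(result) {
  const alternatives = [];
  for (let i = 0; i < result.length; i++) {
    alternatives.push({
      transcript: result[i].transcript,
      confidence: result[i].confidence,
    });
  }
  return alternatives.sort((a, b) => b.confidence - a.confidence);
}
```

With something like this you could, for example, show the interim text in a lighter color, or offer the ranked alternatives in a select list for the user to pick from.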
My Example Screen Looked Like This:
You can replicate this in your app in under 5 minutes. Try it out and share your feedback in the comments!
Written by

Valter Zanchetti Filho
Oracle APEX Certified Developer, passionate about the Oracle Database ecosystem. A dedicated Oracle APEX evangelist with experience in enterprise solutions, API integrations, performance tuning, and data modeling. On my blog, I share practical, real-world solutions I use daily — always focused on simplicity and efficiency.