How to use '', a high-precision transcription app that makes the transcription AI 'Whisper' easy for anyone to use (open source, can run locally)

Transcription is needed in many situations, such as writing meeting minutes or producing videos, but transcribing by hand is tedious. OpenAI's transcription AI 'Whisper' can do the job, but its initial setup is difficult. The free transcription service '' is said to make Whisper easy to use and to deliver highly accurate transcription quickly and with little effort, so I tried it out.

– Transcribe and translate any audio file

When you open the link above, the following screen is displayed. Click 'Transcribes for free' to start transcribing.

Then you will be asked to sign in with your GitHub account. If you do not have a GitHub account, click 'Create an account'.

When the account creation screen is displayed, enter 'user name', 'e-mail address', 'password', prove that you are not a robot, and click 'Create account'.

Then, a screen to enter an 8-digit authentication code like the following will be displayed.

The verification code will be sent to the email address you entered on the account creation screen.

Enter the 8-digit code provided in the email.

You are then taken to the following screen. Click 'Authorize Beyondcode'.

You can now use the service.

The procedure is as follows. First, click 'Browse' to select the file you want to transcribe.

When the file selection screen appears, select the desired file. The selectable file formats are 'mp3', 'mp4', 'mpeg', 'mpga', 'm4a', 'wav', and 'webm', and the maximum file size is 25MB. This time, I chose an mp3 file extracted from a video of the stage greeting talk for the movie 'Inuoh'.
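If you want to check a file against these limits before uploading, a small script can do it. The following is a minimal sketch, not part of the service itself: the function name `check_upload` is made up for illustration, and it assumes that "25MB" means 25 × 1024 × 1024 bytes, which the page does not spell out.

```python
from pathlib import Path

# Formats and size limit as listed on the file selection screen.
ALLOWED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}
MAX_BYTES = 25 * 1024 * 1024  # assumption: "25MB" is read as 25 MiB


def check_upload(name: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    # Compare the extension case-insensitively and without the leading dot.
    ext = Path(name).suffix.lower().lstrip(".")
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    return problems
```

For a real file, you could pass `Path(p).name` and `Path(p).stat().st_size`. A file that fails the size check would need to be shortened or re-encoded at a lower bitrate first.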

After selecting the file, click 'Transcribe'.

Then, a notification saying 'Transcription in progress. The page will be automatically updated when the transcription is completed' will be displayed, so wait for a while.

This time, the transcription was completed in about 2 minutes and the page was updated. After the page refreshes, scroll down to see the transcription results.

At the bottom of the page, the transcription results are displayed in a long list. It seems to struggle with personal names and proper nouns, rendering ``Abu-chan'' as ``Amu-chan'' and ``Director Masaaki Yuasa'' as ``Director Ruasan Masaki'', but overall the transcription is highly accurate. It is impressive that an audio file longer than 20 minutes can be transcribed this accurately in about 2 minutes.

By clicking the play button at the top of the transcription result, you can check the transcription result of the corresponding part while listening to the audio.

If you want to download the transcription result, click 'Download transcript'.

Then, I was able to download the transcription result in vtt format.

The contents of the downloaded vtt file look like this. Since the transcription results are recorded with time information, you can easily create movies with subtitles using compatible software.
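To illustrate how that time information can be used, here is a minimal sketch that reads the cues from VTT text using only the Python standard library. The `SAMPLE` text is made up for illustration and is not the actual downloaded file. The parser handles only plain timing lines of the form `start --> end` and ignores other WebVTT features such as cue settings or styling.

```python
import re

# Matches "HH:MM:SS.mmm --> HH:MM:SS.mmm"; the hours part is optional in WebVTT.
CUE_TIMING = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)

# Made-up sample for illustration only.
SAMPLE = """WEBVTT

00:00:00.000 --> 00:00:04.500
Hello everyone.

00:00:04.500 --> 00:01:02.250
Thank you for coming.
"""


def _seconds(h, m, s, ms):
    """Convert timestamp parts (strings; hours may be None) to seconds."""
    return int(h or 0) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_vtt(text):
    """Return a list of (start_seconds, end_seconds, caption_text) tuples."""
    cues = []
    # Blocks in a WebVTT file are separated by blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = CUE_TIMING.match(line.strip())
            if match:
                g = match.groups()
                caption = "\n".join(lines[i + 1:])
                cues.append((_seconds(*g[:4]), _seconds(*g[4:]), caption))
                break
    return cues
```

Calling `parse_vtt(SAMPLE)` yields one tuple per cue, which you could then search, re-time, or convert to another subtitle format.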

The source code is published in the following GitHub repository. Also, if you obtain an OpenAI API key yourself, you can build it in your local environment.
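For reference, this is roughly what requesting a VTT transcript directly from OpenAI's transcription API looks like with the official `openai` Python package (version 1 or later). It is a sketch under assumptions, not the application's own code: the article does not show how the app calls the API, and the helper name `transcribe_to_vtt` is made up for illustration.

```python
def transcribe_to_vtt(audio_path, client, model="whisper-1"):
    """Send one audio file to the transcription endpoint and return VTT text.

    `client` is expected to be an `openai.OpenAI` instance (openai-python v1+).
    """
    with open(audio_path, "rb") as audio_file:
        return client.audio.transcriptions.create(
            model=model,
            file=audio_file,
            response_format="vtt",  # ask for timestamped WebVTT output
        )


# Usage (assumes `pip install openai` and an OPENAI_API_KEY environment variable):
#   from openai import OpenAI
#   vtt_text = transcribe_to_vtt("talk.mp3", OpenAI())
#   with open("talk.vtt", "w", encoding="utf-8") as out:
#       out.write(vtt_text)
```

Passing the client in as a parameter keeps the API key handling outside the function and makes it easy to substitute a stub when testing without network access.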

GitHub - beyondcode/ Transcribe and translate your audio files - for free

in Review, Web Application, Posted by log1o_hf