Being able to turn audio and video into text quickly and accurately is a powerful tool for all kinds of professionals and content creators. Whether it's to document meetings, turn webinars into blog posts, add captions to videos, or analyze interviews, transcription saves time and unlocks new ways to use your content.
This article details the main ways to transcribe your audio or video files using Tess AI.
The most straightforward way to get a transcription in Tess AI is through the dedicated Transcription area in AI Copilot. This feature is ideal if you want something simple and fast, without needing to set up complicated prompts.
How to Access and Use:
On the left side menu of the Tess AI platform, find and click on AI Copilot.
Inside AI Copilot's options, choose the Transcription tab.
The transcription interface is split into two sides:
Left Side: Area to upload your audio or video file.
Right Side: Here’s where the transcribed text shows up.
Click "Choose file" in the left section and pick your audio or video file from your computer.
Supported Formats: Tess supports a bunch of formats, like MP3, MP4, MPEG, MPGA, M4A, and more. You can click the info icon (a circled "i") to see the full list.
File Size Limit: Your file can’t be bigger than 200 MB.
After uploading your file, hit the "Tess generate for me" button.
Wait for it to process. The time depends on your file size.
The transcribed text will show up in the right section of the screen.
Additional Features in the Transcription Area:
Editing: You can directly edit the transcribed text, fix words, add punctuation, or format with bold, italics, and underline.
Cost: Transcription in this area has a fixed cost of 5 credits per run, plus a variable cost of 0.03 credits per second of audio/video. The total cost will be shown.
Output Options:
Copy: Copies all transcribed text to the clipboard.
Download (TXT): Downloads the transcription as a plain text file (.txt).
View as HTML: Lets you see the HTML code of the transcription.
Delete: Removes the generated transcription.
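The cost rule described above (a fixed 5 credits per run plus 0.03 credits per second) can be sketched as a small calculation. This is only an illustration of the arithmetic: the function name is made up for this example, and how Tess AI rounds partial seconds is not stated in this article, so the rounding to two decimals here is an assumption. The total shown in the interface is the authoritative figure.

```python
FIXED_CREDITS = 5.0        # fixed cost per run, as stated in this article
CREDITS_PER_SECOND = 0.03  # variable cost per second of audio/video


def estimate_transcription_credits(duration_seconds: float) -> float:
    """Estimate the credits for one run (hypothetical helper, not a Tess AI API)."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    # Assumption: round to two decimals; the platform may round differently.
    return round(FIXED_CREDITS + CREDITS_PER_SECOND * duration_seconds, 2)


# A 10-minute recording is 600 seconds.
print(estimate_transcription_credits(10 * 60))
```

For a 10-minute recording, the estimate is 5 + 0.03 × 600 = 23 credits.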
Tip: Always check your file’s format and size before you upload to make sure it’s compatible.
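If you prepare many files, the check suggested in the tip above could be scripted before uploading. The sketch below is a local helper only, not part of Tess AI. It makes two assumptions: the extension set contains just the formats named in this article (the full list is behind the info icon), and "200 MB" is treated as 200 × 1024 × 1024 bytes.

```python
from pathlib import Path
from typing import List

# Assumption: 1 MB = 1024 * 1024 bytes; the platform may count differently.
MAX_BYTES = 200 * 1024 * 1024
# Partial list: only the formats named in this article.
KNOWN_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a"}


def check_upload(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the file looks fine."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in KNOWN_EXTENSIONS:
        problems.append(f"{extension or 'no extension'} is not in the known format list")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, above the {MAX_BYTES}-byte limit")
    return problems


print(check_upload("meeting.mp3", 10_000_000))
print(check_upload("notes.docx", 300 * 1024 * 1024))
```

For a file on disk, the size can be read with `Path(path).stat().st_size` and passed in together with the file name.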
Another flexible way to transcribe files is by using the AI Copilot chat feature together with the Knowledge Base. This method lets you not only transcribe but also interact with the audio’s content, ask for summaries, analyses, or answers to specific questions based on the file.
How to Access and Use:
In the side menu, go to AI Copilot and pick the For Chat option.
In the chat interface, click the attachment icon (usually a paperclip) near the text input box.
Select "Add knowledge base".
In the window that pops up, pick "Audio" as the type of item to add.
Click "Choose file" and pick the audio file from your computer (200 MB limit).
Transcription Settings:
Transcription Model: Pick the transcription engine you want:
Deepgram: Known for being fast.
AssemblyAI: Focused on higher quality.
OpenAI: Offers a good balance between speed and quality.
Rev.ai: Perfect for transcriptions that need timestamping (marking the time for each part).
Language: Pick the audio language (e.g., Portuguese).
Context Mode:
RAG: Recommended for bigger files. The AI splits the file and only checks what it needs to answer your question.
Deep Learning: Recommended for smaller files. The AI analyzes the entire content.
Click "Save". The file will be processed and added to your knowledge base for the current chat session.
Now you can interact with the audio. To get the full transcription, type a command like: "Transcribe the attached file" or "Transcribe the audio I sent".
Besides transcription, you can ask for summaries, find key points, and more. Example: "Make a bullet-point summary of the attached file".
Tip: This method is great when you need more than just the raw text, letting you dig deeper and interactively analyze your audio content.
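To make the difference between the two Context Mode options more concrete, here is a toy sketch of the idea. It is not how Tess AI implements either mode: the keyword-overlap scoring, the chunk size, and all function names are assumptions chosen to keep the example short. The "rag" branch splits the transcript and keeps only the chunks most related to the question, while the "deep_learning" branch passes the whole transcript through.

```python
import re
from typing import List, Set


def split_into_chunks(text: str, words_per_chunk: int) -> List[str]:
    """Split text into consecutive chunks of at most words_per_chunk words."""
    words = text.split()
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]


def tokens(text: str) -> Set[str]:
    """Lowercase word set used for a very rough relevance score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_context(transcript: str, question: str, mode: str,
                  top_k: int = 2, words_per_chunk: int = 50) -> str:
    """Return the text an assistant would read before answering."""
    if mode == "deep_learning":
        return transcript  # whole content, suited to smaller files
    if mode != "rag":
        raise ValueError(f"unknown mode: {mode}")
    chunks = split_into_chunks(transcript, words_per_chunk)
    question_tokens = tokens(question)
    # Toy relevance: number of words shared between question and chunk.
    ranked = sorted(chunks, key=lambda c: len(question_tokens & tokens(c)),
                    reverse=True)
    return "\n".join(ranked[:top_k])


transcript = ("alpha beta gamma delta epsilon "
              "the budget was approved today "
              "zeta eta theta iota kappa")
print(build_context(transcript, "Was the budget approved?", "rag",
                    top_k=1, words_per_chunk=5))
```

With a large file, the selective approach keeps the amount of text per question small, which is consistent with the recommendation to use RAG for bigger files.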
For users who need to add transcription to more complex workflows or want to create AIs that are great at analyzing audio content, AI Studio lets you build your own custom agents. (This feature is available for Individual or Business plan users.)
How to Create a Transcription Agent:
Go to AI Studio in the sidebar menu.
Click on "Add new agent".
Agent Initial Settings:
AI Application Type: Select "Chat" (or "Text", depending on your final goal).
AI Model: You can pick a specific model (e.g., GPT-4o mini) or leave "All LLM Models" so the end user can pick.
Prompt: Set your agent’s persona, goal, and rules. Example for an agent that analyzes classes:
Persona: You are a specialist in pedagogy and educational content analysis.
Goal: Your job is to transcribe the given class and then give a summary of the main topics and three suggestions for the presenter to improve.
Rules: Be clear, concise, and give constructive feedback.
User Input (Class Upload):
Click on "Add a user input" below the system prompt.
Input Type: Select "File upload".
Input Name: Give it a descriptive name, like class recording.
Transcription Step (AI Step):
Click on "Add an AI step".
Step Category: Select "AI Audio Transcription".
Step Type: Choose the transcription model (e.g., Deepgram Audio Transcription).
Step Name: Give it a name, like class transcription.
Media File: Click the link icon and select the variable for the user input you created earlier (e.g., **class-recording**). This ensures the file uploaded by the user is used for transcription.
Language: Set the audio language.
Integrating the Transcription into the Main Prompt:
Go back to the agent’s System Prompt.
At the spot where you want the transcription to show up for analysis, insert the variable of the AI Step output. Example: Your job is to analyze the following transcribed class: **class-transcript** and then ...
Save and Preview:
Give your agent a name and save it.
Click on "Preview" to test. You’ll be able to upload the audio file, and the agent will follow the prompt instructions, using the transcription generated by the AI step.
Tip: Creating agents is powerful for automating repetitive tasks and building custom AI solutions for your audio analysis needs.
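The data flow of the agent built above (user input, then transcription step, then system prompt) can be pictured with a short sketch. Everything in it is hypothetical: Tess AI agents are configured in the AI Studio interface, not through this code, and the `{{...}}` placeholder syntax, the class name, and the fake transcriber are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class AgentSketch:
    """Hypothetical model of the agent: one file input, one transcription step."""
    system_prompt: str
    input_name: str                      # the user input variable
    step_name: str                       # the AI step output variable
    transcribe: Callable[[bytes], str]   # stand-in for the transcription step

    def build_prompt(self, uploads: Dict[str, bytes]) -> str:
        media = uploads[self.input_name]        # file uploaded by the user
        transcript = self.transcribe(media)     # output of the AI step
        placeholder = "{{" + self.step_name + "}}"
        # Insert the transcript where the variable appears in the prompt.
        return self.system_prompt.replace(placeholder, transcript)


def fake_transcribe(media: bytes) -> str:
    """Placeholder transcriber so the sketch runs without any service."""
    return f"(transcript of {len(media)} bytes of audio)"


agent = AgentSketch(
    system_prompt=("Your job is to analyze the following transcribed class: "
                   "{{class-transcript}} and then summarize it."),
    input_name="class-recording",
    step_name="class-transcript",
    transcribe=fake_transcribe,
)
print(agent.build_prompt({"class-recording": b"\x00" * 1024}))
```

The point of the sketch is the ordering: the transcription step runs on the uploaded file first, and only its text output reaches the main prompt.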
Audio Quality: The better the original audio quality (less background noise, clear speakers), the more accurate the transcription will be.
File Limit: Remember the 200 MB per file limit for all transcription types in Tess AI.
Formats: Make sure your file format is supported before trying to upload.
Timestamping: If you need the exact time at which each passage is spoken, use the Rev.ai model when transcribing through chat (Knowledge Base).
Multiple Speakers: For audio with multiple speakers, recording clarity is even more crucial. Some models may have a harder time telling apart overlapping voices.
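The model descriptions given earlier for chat transcription can be condensed into a simple rule of thumb. The helper below is a hypothetical sketch, not a Tess AI feature: it only encodes the one-line characterizations from this article (Rev.ai for timestamps, Deepgram for speed, AssemblyAI for quality, OpenAI for balance), and the parameter names are assumptions.

```python
def suggest_model(needs_timestamps: bool = False, priority: str = "balanced") -> str:
    """Suggest a transcription model name based on this article's descriptions."""
    if needs_timestamps:
        return "Rev.ai"  # described as the choice for timestamping
    suggestions = {
        "speed": "Deepgram",      # described as fast
        "quality": "AssemblyAI",  # described as focused on higher quality
        "balanced": "OpenAI",     # described as a balance of speed and quality
    }
    if priority not in suggestions:
        raise ValueError(f"priority must be one of {sorted(suggestions)}")
    return suggestions[priority]


print(suggest_model(needs_timestamps=True))
print(suggest_model(priority="speed"))
```

Treat the result as a starting point; actual accuracy depends on the recording, as the tips above note.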
Tess AI gives you a robust and flexible set of tools for audio and video transcription, covering everything from simple, straightforward needs to complex automated workflows. By mastering the AI Copilot Transcription area, chat transcription with Knowledge Base, and agent creation in AI Studio, you can turn your audio and video content into text efficiently, saving time and getting the most out of your recordings.