If you have recordings from a source that isn’t directly integrated with CallVault — or you just have a file on your computer — you can upload it directly. CallVault will transcribe it and generate an AI summary just like any other call.
Uploading a file
- Open the workspace or folder where you want the call to live
- Click Upload in the top-right corner
- Select your file from your computer (or drag and drop it onto the upload area)
- Give the call a title if you’d like — otherwise CallVault uses the filename
- Click Upload
Transcription starts automatically. Depending on the file length, it usually takes 1–5 minutes, and up to about 15 minutes for recordings longer than two hours.
Supported formats
| Format | Extension | Notes |
|---|---|---|
| MP4 | .mp4 | Most common video format — works great |
| M4A | .m4a | Audio exported from Apple devices |
| MP3 | .mp3 | Widely supported audio format |
| WAV | .wav | Uncompressed audio — high quality |
| WebM | .webm | Common browser-recorded format |
| OGG | .ogg | Open audio format |
| MOV | .mov | QuickTime video files |
| MKV | .mkv | Matroska video container |
If your file is in a less common format and the upload fails, convert it to MP4 or MP3 with a free tool such as HandBrake or VLC before uploading.
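If you prepare uploads with a script, a quick local check against the table above can catch unsupported files before you try to upload them. The following is a minimal sketch, not part of CallVault: the function name `is_supported_format` is hypothetical, and it only looks at the file extension, not the actual contents of the file.

```python
from pathlib import Path

# Extensions copied from the supported-formats table above.
SUPPORTED_EXTENSIONS = {
    ".mp4", ".m4a", ".mp3", ".wav", ".webm", ".ogg", ".mov", ".mkv",
}


def is_supported_format(filename: str) -> bool:
    """Return True if the file's extension appears in the table above.

    Hypothetical helper: it checks the extension only, case-insensitively,
    and does not inspect the file's contents.
    """
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ["standup.MP4", "interview.flac"]:
        verdict = "ok to upload" if is_supported_format(name) else "convert first"
        print(f"{name}: {verdict}")
```

A file that fails this check is a candidate for conversion to MP4 or MP3 before uploading.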
File size and length
- Maximum file size: 2 GB per file
- Maximum recording length: No hard cap on duration, but very long recordings (3+ hours) take longer to transcribe
For multi-hour recordings like all-day workshops, consider splitting the file into segments for faster processing and easier navigation.
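To plan a split ahead of time, the arithmetic can be sketched as below. This is an illustration under stated assumptions rather than CallVault's own logic: `MAX_UPLOAD_BYTES` reads "2 GB" as decimal gigabytes (the more conservative reading), the one-hour default segment length is an arbitrary choice, and both function names are hypothetical. The sketch only computes cut points; the actual cutting happens in whatever editing tool you use.

```python
import math

# Assumption: "2 GB" is read as 2 * 10^9 bytes, the more conservative reading.
MAX_UPLOAD_BYTES = 2 * 1000**3


def fits_size_limit(size_bytes: int) -> bool:
    """Return True if a file of this size is within the assumed 2 GB limit."""
    return size_bytes <= MAX_UPLOAD_BYTES


def plan_segments(duration_s: float, max_segment_s: float = 3600.0) -> list[tuple[float, float]]:
    """Split a recording into equal segments no longer than max_segment_s.

    Returns (start, end) offsets in seconds. The one-hour default is an
    illustrative choice, not a CallVault recommendation.
    """
    if duration_s <= 0 or max_segment_s <= 0:
        raise ValueError("duration_s and max_segment_s must be positive")
    count = math.ceil(duration_s / max_segment_s)
    length = duration_s / count
    return [(i * length, min((i + 1) * length, duration_s)) for i in range(count)]


if __name__ == "__main__":
    # A 2.5-hour workshop split into segments of at most one hour each.
    for start, end in plan_segments(2.5 * 3600):
        print(f"{start / 60:.0f} min to {end / 60:.0f} min")
```

Equal-length segments avoid ending up with a very short final piece, which keeps each resulting call record a comparable size.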
Transcription time
Transcription typically runs faster than real time:
| Recording length | Approximate transcription time |
|---|---|
| Under 30 minutes | 1–2 minutes |
| 30–60 minutes | 2–4 minutes |
| 1–2 hours | 4–8 minutes |
| 2+ hours | 8–15 minutes |
You’ll receive a notification when transcription is complete and the call is ready to review.
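If you want to show an expected wait in your own tooling, the estimates in the table above can be encoded as a simple lookup. This is a rough sketch: the function name `estimate_transcription_minutes` is hypothetical, the values are the approximate ranges from the table rather than guarantees, and the handling of exact boundaries (60 and 120 minutes fall into the shorter band) is an assumption, since the table does not specify it.

```python
def estimate_transcription_minutes(recording_minutes: float) -> tuple[int, int]:
    """Return the approximate (min, max) transcription time in minutes.

    Values mirror the table above. Boundary handling is an assumption:
    exactly 60 or 120 minutes is placed in the shorter band.
    """
    if recording_minutes < 0:
        raise ValueError("recording_minutes must not be negative")
    if recording_minutes < 30:
        return (1, 2)
    if recording_minutes <= 60:
        return (2, 4)
    if recording_minutes <= 120:
        return (4, 8)
    return (8, 15)


if __name__ == "__main__":
    low, high = estimate_transcription_minutes(45)
    print(f"A 45-minute call should take roughly {low} to {high} minutes")
```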
Bulk upload
You can upload multiple files at once by selecting more than one file in the upload dialog. Each file becomes a separate call record and is queued for transcription individually.
After upload
Once transcription completes, the call record will have:
- A full text transcript with speaker labels
- An AI-generated summary with action items and key topics
- All the same features as calls imported from Fathom or Zoom
Speaker labels from uploaded files are inferred by CallVault’s transcription engine and may be less accurate than integrations where participant data is available. You can manually correct speaker names in the transcript view.