How to Get Started
There are 3 steps to getting started:
- Choose a transcription company/model
  - Check out the section on Which Model to Choose (below) and make a selection
  - I’d recommend Speechmatics for most use cases
- Generate and save an API key
  - Head to the website for the model you’ve chosen and create an account
  - I use Google to sign in because it’s easier
  - Then make sure you add a card to your account
  - Finally, find the API Keys section and generate a new key
  - For more information, check out the section below on Saving Your API Key
- Create a workflow
  - Your workflow should start with the “Call Status” trigger
  - Add a “Wait” step right after that (ideally 30 seconds or more)
  - After the “Wait” step, add the Generate [model choice] Call Transcript action (Speechmatics or Deepgram)
  - Now you can save that transcript to the notes section, or feed it to ChatGPT to generate a summary and add that to the notes
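If it helps to see the order of the workflow steps written out, here is a minimal Python sketch of the same sequence. It is only an illustration: the workflow itself is built in the visual workflow editor, and every name here (`handle_completed_call`, `transcribe`, `summarize`, `save_note`) is a hypothetical stand-in, not part of the app or of any real API.

```python
import time
from typing import Callable, Optional


def handle_completed_call(
    call_id: str,
    transcribe: Callable[[str], str],
    save_note: Callable[[str, str], None],
    summarize: Optional[Callable[[str], str]] = None,
    wait_seconds: float = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Mirror the workflow order: trigger -> wait -> transcribe -> (summarize) -> save.

    All callables are hypothetical placeholders for the workflow steps.
    """
    # "Wait" step: pause before requesting the transcript (30 seconds or more).
    sleep(wait_seconds)

    # "Generate ... Call Transcript" step.
    transcript = transcribe(call_id)

    # Optional ChatGPT step: turn the transcript into a summary.
    note = summarize(transcript) if summarize is not None else transcript

    # Final step: store the transcript or summary in the notes section.
    save_note(call_id, note)
    return note
```

Passing the steps in as functions keeps the sketch independent of any particular service, so the same order applies whether you picked Speechmatics or Deepgram.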
Which Model to Choose?
This app gives you the choice of two services and three models.
These are:
- Speechmatics Enhanced
- Deepgram Nova 2 Phone Call
- Deepgram Nova 3
Generally, we’re concerned about two main metrics:
- Word Error Rate (WER): How accurate the model is at figuring out which words were said
- Diarization: Differentiating between the speakers
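To make WER concrete: it is usually computed as the number of word substitutions, deletions, and insertions needed to turn the model's transcript into the correct one, divided by the number of words in the correct transcript. The sketch below is a minimal illustration of that standard definition, not the scoring method used by Speechmatics or Deepgram; the function name and the example sentences are made up for this page.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from the transcript
                curr[j - 1] + 1,     # insertion: extra word in the transcript
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

For example, if the caller said "please call me back tomorrow" and the transcript reads "please call be back", that is one substitution and one deletion out of five words, so the WER is 0.4 (40%). Lower is better.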
Here are my recommendations for you:
- Speechmatics Enhanced is currently the best model for both diarization and WER, which results in the best transcripts for summarizing and analyzing calls
  - Currently priced at $1.04/hour (versus $2.40/hour for native transcripts)
  - Use this model if you’re not overly concerned about price and just want the most reliable summaries and call analysis
- Deepgram is cheaper and allows for redacting sensitive data such as credit card numbers, social security numbers, etc.
  - Currently priced at ~$0.26/hour (about 1/4 the price of Speechmatics and about 1/9 the price of native transcripts)
  - Use this model if you’re more concerned about price, or if you need to be able to redact sensitive data
  - Read more about the Deepgram Transcript action below to see which Deepgram model you should be using
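To see what those hourly prices mean for your own call volume, here is a small Python sketch that multiplies them out. The prices are the ones quoted above and may change; the 50 hours per month in the example is a made-up volume, so swap in your own number.

```python
# Per-hour prices in USD as quoted above; treat them as a snapshot, not a guarantee.
PRICE_PER_HOUR = {
    "Speechmatics Enhanced": 1.04,
    "Deepgram": 0.26,
    "Native transcripts": 2.40,
}


def monthly_cost(hours_per_month: float) -> dict[str, float]:
    """Estimated monthly cost in USD for each option, rounded to cents."""
    return {
        name: round(price * hours_per_month, 2)
        for name, price in PRICE_PER_HOUR.items()
    }


if __name__ == "__main__":
    # 50 hours/month is a hypothetical call volume for illustration.
    for name, cost in monthly_cost(50).items():
        print(f"{name}: ${cost:.2f}")
```

At 50 hours of calls per month, that works out to $52.00 for Speechmatics Enhanced, $13.00 for Deepgram, and $120.00 for native transcripts.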
Workflow Actions
The workflow actions below are entirely interchangeable. Which one you choose simply depends on your priorities. To learn more about choosing between them, read the section above.
Generate Speechmatics Transcript
Speechmatics currently has the most accurate speech-to-text (STT) model available.
Whereas with Deepgram you may need to compromise on either word error rate or diarization, Speechmatics provides exceptional performance for both.
I’d recommend using this workflow action unless you require redactions.
Generate Deepgram Transcript
This action was included at the request of a user who needed to be able to redact certain sensitive information from call transcripts such as bank account details, credit card details, personally identifiable information, etc.
It also happens to be cheaper, so use this if you are mostly concerned about cost, or if you just need redactions.
Choice of Deepgram Models
Model Name | Speaker Diarization | Word Error Rate |
Nova 2 Phone Call | ✅ Good | ❌ Poor |
Nova 3 | ❌ Poor | ✅ Good |
Generally, Nova 2 Phone Call should provide better results since the improved diarization should allow for a more accurate analysis by ChatGPT.
However, with either model, you or ChatGPT should usually be able to infer the intended words and speakers from the transcript, even where it contains mistakes.
I’d suggest starting out with Nova 2 Phone Call, then testing Nova 3 if you’re not satisfied.
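The recommendations on this page boil down to a short decision rule. The Python sketch below writes it out, assuming only the guidance given here: Speechmatics Enhanced unless you need redactions or are mostly concerned about cost, and Nova 2 Phone Call before Nova 3 on the Deepgram side. The function and parameter names are made up for illustration and are not settings in the app.

```python
def choose_model(
    needs_redaction: bool,
    cost_sensitive: bool,
    nova_2_unsatisfactory: bool = False,
) -> str:
    """Return the model name suggested by the recommendations on this page."""
    if needs_redaction or cost_sensitive:
        # Deepgram is the cheaper option and the one that supports redaction.
        # Start with Nova 2 Phone Call; try Nova 3 only if its results disappoint.
        if nova_2_unsatisfactory:
            return "Deepgram Nova 3"
        return "Deepgram Nova 2 Phone Call"
    # Otherwise, go for the best diarization and word error rate.
    return "Speechmatics Enhanced"
```

For instance, `choose_model(needs_redaction=True, cost_sensitive=False)` suggests Deepgram Nova 2 Phone Call, while `choose_model(False, False)` suggests Speechmatics Enhanced.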
Saving Your API Key
Above is a video showing you how to generate an API key for Speechmatics and Deepgram. Any other transcription services should be very similar.
Generally, you want to keep API keys private. For these applications, the keys aren’t especially sensitive, since someone would need to run a large workload to do real damage, but anyone who gets hold of your key could run up additional charges on your account.
There are 4 methods of storing this API key:
- Least secure: Just paste your API key into the box for the workflow configuration. Best if you’re using a private sub-account where you aren’t worried about other people seeing it.
- Slightly more secure: Create a custom value in your sub-account settings with your API key, then use this custom value in the workflow action field.
- Even more secure: Store your API key in a Google Sheet, then retrieve it with the Google Sheets action and pass it to the transcript workflow action.
- Most secure: Email or text me your API key, and I can securely store it in the database. This is also the easiest if you have multiple sub-accounts.
Frequent Questions
Do you offer custom development work?
Should I install your app at the agency level or location level?
What’s your support like?
How do I install your app?
Do you offer a free trial?
Can I request a feature or product?
Are there any usage limits on your apps?
Can I use your app on multiple accounts?
How do I uninstall the app?
Need to Get in Touch?
If you have any questions, concerns, or ideas, I’d love to hear them!
Visit the page below to book a call or get in touch right away.