Added example Podcast_and_Audio_Transcription #665

Open
wants to merge 1 commit into
base: main

SonjeVilas

Adds automated audio transcription using Gemini 2.0 with:

✅ Speaker identification (labeled or as Speaker A/B)
✅ Precision timestamps ([HH:MM:SS])
✅ Music/sound effect detection (e.g., [Jingle] or [Song Name])
✅ Clean text output with [END] marker

  • Testing: Verified with podcasts & call recordings.
  • Deps: jinja2, Gemini API client.

Useful for podcasts, interviews, and call analysis.
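
The output conventions listed above lend themselves to simple post-processing. The following is a minimal sketch, not the notebook's code: it assumes each transcript line looks like `[HH:MM:SS] Speaker A: text`, that sound cues appear as bracketed text after the timestamp, and that a line containing only `[END]` closes the transcript. The actual layout produced by the notebook is not shown in this PR, and `parse_transcript` is a hypothetical helper name.

```python
import re

# Assumed line shape: "[HH:MM:SS] Speaker: text" or "[HH:MM:SS] [Sound cue]".
LINE_RE = re.compile(
    r"^\[(\d{2}):(\d{2}):(\d{2})\]\s*(?:([^:\[\]]+):\s*)?(.*)$"
)


def parse_transcript(text):
    """Parse transcript text into a list of dicts, stopping at [END]."""
    entries = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line == "[END]":
            break  # everything after the end marker is ignored
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not follow the assumed shape
        hours, minutes, seconds, speaker, content = match.groups()
        entries.append(
            {
                "seconds": int(hours) * 3600 + int(minutes) * 60 + int(seconds),
                "speaker": speaker.strip() if speaker else None,
                "text": content.strip(),
            }
        )
    return entries


sample = """[00:00:00] [Jingle]
[00:00:05] Speaker A: Welcome to the show.
[00:01:10] Speaker B: Thanks for having me.
[END]
"""
entries = parse_transcript(sample)
```

Lines without a speaker label, such as a sound cue, come back with `speaker` set to `None`, so callers can filter them out or keep them as markers.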

@github-actions github-actions bot added the status:awaiting review PR awaiting review from a maintainer label Apr 4, 2025
@Giom-V
Collaborator

Giom-V commented Apr 4, 2025

Thanks @SonjeVilas, that's an interesting example. I won't have time to review it today but I'll try to do it next week.

@@ -0,0 +1,289 @@
{
Contributor

@andycandy andycandy Apr 6, 2025

Line #1.    %pip install google-genai jinja2

Since google-genai is already installed, it might be cleaner to move the jinja2 installation to the top, next to the google-genai install.

Author

OK, I can remove that from the code snippet.

@@ -0,0 +1,289 @@
{
Contributor

@andycandy andycandy Apr 6, 2025

Small nitpick: it'd be good to follow the string formatting conventions used in the cookbook repo, keeping lines concise and using proper indentation for multi-line strings. For example, the formatting could look like:

Template("""
  Generate a transcript of the episode. Include timestamps and identify speakers.
  
  
  Speakers are:
  {% for speaker in speakers %}- {{ speaker }}{% if not loop.last %}\n{% endif %}{% endfor %}
  
  ...
""")
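
One practical wrinkle with indented multi-line strings is that the indentation becomes part of the prompt text. The sketch below shows one way to keep the source indented while producing a prompt with no leading whitespace. It is a standard-library stand-in that uses `textwrap.dedent` and `str.join` instead of the Jinja template from the notebook, and `build_prompt` is a hypothetical helper name.

```python
import textwrap


def build_prompt(speakers):
    """Build the transcription prompt with one '- name' line per speaker."""
    # dedent() strips the common leading indentation, so the string can be
    # indented in the source without that whitespace reaching the model.
    header = textwrap.dedent(
        """\
        Generate a transcript of the episode. Include timestamps and identify speakers.

        Speakers are:
        """
    )
    speaker_lines = "\n".join(f"- {name}" for name in speakers)
    return header + speaker_lines


prompt = build_prompt(["Alice", "Bob"])
```

The same `textwrap.dedent` call can wrap the template source before it is handed to a template engine, provided the string contains no escape sequence such as `\n` that would start an unindented line.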
  
  

Author

OK, I can update it.

@@ -0,0 +1,289 @@
{
Collaborator

@Giom-V Giom-V Apr 7, 2025

Nit: You forgot to close the code snippet.


Author

Okay, I'll make that change.

@@ -0,0 +1,289 @@
{
Collaborator

@Giom-V Giom-V Apr 7, 2025

It would be better to add an example that readers can use right away. It has to be open sourced.

Worst case, you can use the Apollo 11 audio recording we're already using in other notebooks.

EDIT: better idea, make one using NotebookLM!


Author

OK, I can make the notebook more readable for everyone.

@@ -0,0 +1,289 @@
{
Collaborator

@Giom-V Giom-V Apr 7, 2025

Line #47.        model="gemini-2.0-flash",

Can you add a "MODEL_ID" variable like this?

MODEL_ID="gemini-2.0-flash" # @param ["gemini-2.0-flash-lite","gemini-2.0-flash","gemini-2.5-pro-exp-03-25"] {"allow-input":true, isTemplate: true}

This would make the notebook easier to maintain in the future.
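
To illustrate the suggestion, here is a minimal sketch of how a single `MODEL_ID` constant could feed the generation call, so that switching models means editing one line. It assumes `client` is a `genai.Client` from the google-genai package and that `audio_file` was uploaded beforehand; `transcribe` is a hypothetical helper, not a function from the notebook.

```python
# Single place to change the model used throughout the notebook.
MODEL_ID = "gemini-2.0-flash"


def transcribe(client, prompt, audio_file):
    """Request a transcript and return the response text.

    Assumes `client` exposes models.generate_content(model=..., contents=...)
    as the google-genai client does; `audio_file` is an already uploaded file.
    """
    response = client.models.generate_content(
        model=MODEL_ID,
        contents=[prompt, audio_file],
    )
    return response.text
```

Because the client is passed in rather than created inside the function, the helper can be exercised with a stub object when no API key is available.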


Author

Yeah! I can definitely make the changes.

@@ -0,0 +1,289 @@
{
Collaborator

@Giom-V Giom-V Apr 7, 2025

Author

OK, thank you for raising this. I can resolve the issue.

Collaborator

@Giom-V Giom-V left a comment

Hello @SonjeVilas,

That's a nice example. On top of what @andycandy already reported, and the minor stuff I pointed out, can you:

Thanks again!

@Giom-V Giom-V self-assigned this Apr 7, 2025
Labels
status:awaiting review PR awaiting review from a maintainer