This commit brings several improvements to the application:
- Updates all Python dependencies in requirements.txt to their latest versions.
- Enhances file handling in capture.py by writing to a temporary file before renaming, preventing partial reads.
- Strengthens error handling for API calls (OpenAI, ElevenLabs) and file operations in both capture.py and narrator.py.
- Makes the ElevenLabs Voice ID configurable via an ELEVEN_VOICE_ID environment variable in narrator.py, with a sensible default.
- Aligns the narrator's persona in narrator.py with a "David Attenborough" style by updating the system prompt.
- Updates the README.md to remove outdated information, clarify API key usage, and include new configuration options.
- Confirms that the current audio saving mechanism is suitable for archival/logging.
- Upgrades the OpenAI model to gpt-4-turbo in narrator.py.
- Reduces console noise by making the "Say cheese!" message in capture.py print only once.
I did not add comprehensive docstrings and comments in this pass.
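The temporary-file-then-rename approach mentioned for capture.py can be sketched roughly as follows. This is a minimal illustration, not the code in capture.py: the helper name `write_atomically` and the `.tmp` suffix are assumptions made for the example.

```python
import os
import tempfile


def write_atomically(path: str, data: bytes) -> None:
    """Write data to path so that readers never see a half-written file.

    Hypothetical helper for illustration; capture.py may do this differently.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace swaps the finished file into place in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A reader polling the target path (as the narrator presumably does) then sees either the previous complete frame or the new complete frame, never a partial one.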
README.md
David Attenborough narrates your life.
https://twitter.com/charliebholtz/status/1724815159590293764
Want to make your own AI app?
Check out Replicate. We make it easy to run machine learning models with an API.
Setup
Clone this repo, then set up and activate a virtualenv:
python3 -m pip install virtualenv
python3 -m virtualenv venv
source venv/bin/activate
Then, install the dependencies:
pip install -r requirements.txt
Next, make accounts with OpenAI and ElevenLabs and set your API key environment variables. The Python libraries used in this project will automatically detect and use these environment variables for authentication.
export OPENAI_API_KEY=<your-openai-api-key>
export ELEVENLABS_API_KEY=<your-elevenlabs-api-key>
export ELEVEN_VOICE_ID=<your-elevenlabs-voice-id> # Optional, see note below
Note on API Keys and Voice ID:
- OPENAI_API_KEY: Your API key from OpenAI, used for the vision and language model.
- ELEVENLABS_API_KEY: Your API key from ElevenLabs, used for text-to-speech.
- ELEVEN_VOICE_ID: This environment variable allows you to specify a custom voice from your ElevenLabs account. If this variable is not set, the application will default to using the voice ID "21m00Tcm4TlvDq8ikWAM". You can find your available voice IDs using the ElevenLabs voices API or by checking your account on their website. To use a custom voice, make a new voice in your ElevenLabs account and get its voice ID.
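The lookup described in the note can be sketched in Python as below. This is only an illustration of the fallback behaviour, not the code in narrator.py: the function name `load_settings` and the returned dictionary keys are assumptions; the variable names and the default voice ID come from the note above.

```python
import os

# Default voice ID stated in the note above.
DEFAULT_VOICE_ID = "21m00Tcm4TlvDq8ikWAM"


def load_settings(env=None):
    """Collect the API keys and voice ID from environment variables.

    Hypothetical helper for illustration; `env` defaults to os.environ and
    can be replaced with a plain dict when experimenting.
    """
    env = os.environ if env is None else env
    required = ("OPENAI_API_KEY", "ELEVENLABS_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {
        "openai_api_key": env["OPENAI_API_KEY"],
        "elevenlabs_api_key": env["ELEVENLABS_API_KEY"],
        # Fall back to the default when ELEVEN_VOICE_ID is unset or empty.
        "voice_id": env.get("ELEVEN_VOICE_ID") or DEFAULT_VOICE_ID,
    }
```

Failing early on a missing key, as sketched here, is a design choice for the example; the libraries themselves may report a missing key only when the first API call is made.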
Run it!
In one terminal, start the capture script:
python capture.py
In another terminal, run the narrator:
python narrator.py