ElevenLabs: Speech to text - ComfyUI Workflow

This ComfyUI workflow uses an ElevenLabs model to transcribe speech from audio or video files into text. It begins with a 'LoadAudio' or 'RecordAudio' node, which lets users either upload an existing audio file or record new audio directly in the interface. The 'ElevenLabsSpeechToText' node then processes the audio and converts the spoken words into text. The result is passed to the 'PreviewAny' node, where users can review the transcription and refine it as needed. The workflow is useful for creating transcripts of interviews, lectures, or other spoken content, and suits content creators, researchers, and professionals who need to convert speech to text efficiently.
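
The three-node chain described above can be sketched as a ComfyUI API-format graph, where each node has a class type and its inputs either hold a literal value or a [source_node_id, output_index] link. This is a minimal illustration, not the workflow file itself: the node IDs, the file name "interview.mp3", and the input names "audio" and "source" are assumptions, and the real 'ElevenLabsSpeechToText' node may require further inputs (such as a model or language setting) that are not shown here.

```python
import json


def build_workflow(audio_filename: str) -> dict:
    """Return a hypothetical API-format graph for the three-node chain."""
    return {
        # Input names below are assumptions made for illustration.
        "1": {"class_type": "LoadAudio", "inputs": {"audio": audio_filename}},
        # ["1", 0] means: take output 0 of node "1".
        "2": {"class_type": "ElevenLabsSpeechToText", "inputs": {"audio": ["1", 0]}},
        "3": {"class_type": "PreviewAny", "inputs": {"source": ["2", 0]}},
    }


def execution_order(workflow: dict) -> list[str]:
    """List node class types so that every node follows the nodes it reads from."""
    ordered: list[str] = []
    visited: set[str] = set()

    def visit(node_id: str) -> None:
        if node_id in visited:
            return
        visited.add(node_id)
        for value in workflow[node_id]["inputs"].values():
            # A two-element list whose first item is a known node ID is a link.
            if isinstance(value, list) and len(value) == 2 and value[0] in workflow:
                visit(value[0])
        ordered.append(workflow[node_id]["class_type"])

    for node_id in workflow:
        visit(node_id)
    return ordered


workflow = build_workflow("interview.mp3")
print(execution_order(workflow))
print(json.dumps(workflow, indent=2))
```

Swapping 'LoadAudio' for 'RecordAudio' would change only node "1"; the downstream links stay the same because the transcription node reads whatever audio output node "1" produces.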