Diarization-01 Processing Pipeline

The diarization-01 pipeline transcribes audio streams into text and splits the result into individual speaker utterances.

The audio data for the pipeline can be collected with any recording device. The pipeline is accessible via the following link:

✉️ Contributors: David Rug (david.rug98@icloud.com), Elias Mueller (elias.mueller@kit.edu), Ivo Benke (ivo.benke@kit.edu)

💻 How To Use?

To use the pipeline, prepare your data in the input format described below. Then open the pipeline endpoint descriptions (see the link above), where you will find the pipeline endpoints to which you can send the data. There are two options for sending the data to the pipeline:

Graphical User Interface

Upload your data through the graphical user interface. The link to it is on the website of the pipeline.

Standard API Call

Send a request to the endpoint with an API client of your choice (Postman, curl, or any other tool), then wait for the response. That’s it 😊
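As an illustration of the standard API call, here is a minimal Python sketch that uploads one audio file using only the standard library. The endpoint URL, the multipart upload, and the form field name "file" are assumptions made for this example; take the actual values from the pipeline endpoint descriptions.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path

# Hypothetical endpoint: replace with the URL from the pipeline endpoint descriptions.
ENDPOINT_URL = "https://example.org/diarization-01"


def build_upload_request(audio_path, endpoint_url=ENDPOINT_URL, field_name="file"):
    """Build a multipart/form-data POST request carrying one audio file.

    The field name "file" is an assumption, not taken from the pipeline docs.
    """
    path = Path(audio_path)
    # The pipeline accepts .wav and .mp3 input, so reject anything else early.
    if path.suffix.lower() not in {".wav", ".mp3"}:
        raise ValueError(f"unsupported audio format: {path.suffix}")

    boundary = uuid.uuid4().hex
    content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{path.name}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    body = head + path.read_bytes() + tail

    request = urllib.request.Request(endpoint_url, data=body, method="POST")
    request.add_header("Content-Type", f"multipart/form-data; boundary={boundary}")
    return request


def send(request, timeout=300):
    """Send the request and return the raw response body once the pipeline answers."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Calling send(build_upload_request("meeting.wav")) would then block until the pipeline responds. The generous timeout is a guess, on the assumption that longer recordings take a while to process.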

⚙️ Specifications

Input Data

Audio file

Input Data Format

.wav, .mp3

Output Data

Table with columns: id, speaker_tag, start_time, end_time, utterance
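To show how such a table might be consumed, the following Python sketch models one row per utterance and sums the speaking time per speaker. The column types, the assumption that start_time and end_time are seconds, and the sample rows are all hypothetical; only the column names come from the specification above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Utterance:
    id: int
    speaker_tag: str
    start_time: float  # assumed: seconds from the start of the recording
    end_time: float    # assumed: seconds from the start of the recording
    utterance: str


def speaking_time(rows):
    """Return the total speaking duration per speaker_tag."""
    totals = defaultdict(float)
    for row in rows:
        totals[row.speaker_tag] += row.end_time - row.start_time
    return dict(totals)


# Made-up sample rows, for illustration only.
rows = [
    Utterance(1, "speaker_0", 0.0, 2.5, "Hello, shall we start?"),
    Utterance(2, "speaker_1", 2.5, 4.0, "Yes, go ahead."),
    Utterance(3, "speaker_0", 4.0, 7.0, "Great, here is the first item."),
]

print(speaking_time(rows))  # {'speaker_0': 5.5, 'speaker_1': 1.5}
```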

Output Data Format



Any audio recording device

🛠️ Pipeline Procedure

💬 Get In Touch

Do you have any comments or ideas for improvements for this specific pipeline?

Please let us know via the contact form below.

Feedback - Pipeline