Upload audio files and get comprehensive speech analysis: transcription with word/phoneme alignment, stress patterns and syntactic pause analysis.

No files stored · Local processing · Free service · Based on the original PLSPP scripts
Detailed Analysis Capabilities
  • Speech Transcription - Automatic speech recognition with word-level timing
  • Phonetic Alignment - Precise phoneme-level segmentation using WebMAUS
  • Syntactic Structure - Part-of-speech tagging and constituency parsing
  • Pause Analysis - Identify structurant and disfluent pauses based on syntactic structure
  • Lexical Stress Patterns - Identify stressed and unstressed syllables
  • Syllable Detection - Automatic syllable nuclei identification
Privacy & Security
  • No audio files stored permanently
  • Local server processing only
  • Completely free service
  • GDPR-compliant processing
Technology Stack
  • WhisperX & Crisper Whisper (ASR)
  • WebMAUS (Phoneme alignment)
  • spaCy & Berkeley Neural Parser
  • CMU Pronunciation Dictionary
🚀 Getting Started: Basic Usage
  1. Upload your audio files using the file uploader below.
  2. Select a predefined scenario or customise the pipeline settings as needed.
  3. Accept the terms of use by checking the box.
  4. Click Run Pipeline to start processing.
  5. Monitor progress with the progress bar and dynamic logs.
  6. As processing advances, visualisations appear in the Visualisation tab, and output files can be downloaded from the Output tab.

NB. When you leave, your data is automatically deleted.

🍏 Generated Outputs

  • TextGrid Files - aligned transcription with stress and pause annotations
  • Pauses Table - per-word pause annotations
  • Stress Table - per-word stress annotations
  • Visualisation Table - combined per-word annotations
  • Statistics Tables - various metrics per speaker and per file
  • Word Confidence - word-level ASR confidence scores (CSV)
  • Word Alignment - timestamps per word (JSON)
  • Raw Transcript - speech recognition transcript (TXT)

📊 Visualisation of Pause Annotations
[Figure: sample pause visualisation]
Pauses longer than a given duration threshold are shown between words. Colours indicate how each pause might affect speech understanding: red marks potentially disfluent pauses (occurring at a lower syntactic boundary), while green marks syntactically structurant pauses. You can also display the more classical between-clause, between-phrase, and between-word pause categories.
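The pause colouring described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline's actual code: the `Word` class, the `boundary_after` labels, the 0.25 s default threshold, and the rule that only clause boundaries count as structurant are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float          # word onset, in seconds
    end: float            # word offset, in seconds
    boundary_after: str   # hypothetical label: "clause", "phrase" or "word"


def find_pauses(words, min_duration=0.25):
    """Return pauses between consecutive words that last at least min_duration.

    Assumption for this sketch: a pause at a clause boundary is treated as
    structurant (green); a pause at any lower boundary is treated as
    potentially disfluent (red).
    """
    pauses = []
    for prev, nxt in zip(words, words[1:]):
        gap = nxt.start - prev.end
        if gap >= min_duration:
            label = "structurant" if prev.boundary_after == "clause" else "disfluent"
            pauses.append({
                "before": prev.text,
                "after": nxt.text,
                "duration": round(gap, 3),
                "boundary": prev.boundary_after,
                "label": label,
            })
    return pauses


words = [
    Word("when", 0.00, 0.30, "word"),
    Word("I", 0.80, 0.90, "word"),
    Word("left", 0.90, 1.20, "clause"),
    Word("it", 1.70, 1.80, "word"),
    Word("rained", 1.85, 2.30, "clause"),
]
for pause in find_pauses(words):
    print(pause)
```

In this toy input, the pause after "when" falls inside a clause and is flagged as potentially disfluent, whereas the pause after "left" coincides with a clause boundary and is flagged as structurant.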
📊 Visualisation of Stress Annotations
[Figure: sample stress visualisation]
Each circle represents a syllable, and its size indicates acoustic prominence (by default, the mean of speaker-normalised pitch, intensity, and duration). Red indicates a stress pattern that differs from the one expected in the reference dictionary. Click on a word to listen to it.
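As a rough illustration of the default prominence measure, the sketch below averages three speaker-normalised features per syllable. The page does not say how the normalisation is done, so z-scoring over all of a speaker's syllables is an assumption here, as are the function names and the dictionary keys.

```python
from statistics import mean, pstdev


def zscores(values):
    """Z-score a list of values; returns zeros if there is no variation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def prominence(syllables):
    """Mean of z-scored pitch, intensity and duration for each syllable.

    `syllables` is a list of dicts for ONE speaker, each with the
    (hypothetical) keys 'pitch', 'intensity' and 'duration'.
    """
    features = ("pitch", "intensity", "duration")
    normalised = {f: zscores([s[f] for s in syllables]) for f in features}
    return [
        mean(normalised[f][i] for f in features)
        for i in range(len(syllables))
    ]


syllables = [
    {"pitch": 100.0, "intensity": 60.0, "duration": 0.10},
    {"pitch": 200.0, "intensity": 70.0, "duration": 0.30},
]
scores = prominence(syllables)
print(scores)
```

With this kind of measure, a syllable that is higher, louder, and longer than the speaker's average gets a positive score and would be drawn as a larger circle.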

This module performs phoneme transcription and forced alignment. It takes as input a TextGrid file with word-level annotations.

Drop your audio files here, then click Upload All Files.
If you have already run this pipeline, you can also upload the visualisation.csv file to view the visualisations.
You can also upload TextGrid files with the same name as your audio files if you want to skip the preprocessing modules.

📁 Drag & drop files here or click to select

Files with the same name will be automatically grouped together
💡 For large uploads (>100 files), use folder upload for better performance
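To prepare matching audio and TextGrid files before uploading, the grouping by name can be mimicked locally. The sketch below assumes that "same name" means the same file name without its extension; the function name and example file names are hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def group_by_stem(filenames):
    """Group file names that share the same stem (name without extension)."""
    groups = defaultdict(list)
    for name in filenames:
        groups[Path(name).stem].append(name)
    return dict(groups)


uploads = ["spk1.wav", "spk1.TextGrid", "spk2.wav"]
print(group_by_stem(uploads))
```

A stem that ends up with only an audio file would go through the full pipeline, while a stem with both an audio file and a TextGrid could skip the preprocessing modules.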