screen-recorder-analyzer extracts a chronological list of user actions from any screen recording. It combines Whisper audio transcription, Tesseract OCR on keyframes, and GPT action extraction.
GitHub: nometria/screen-recorder-analyzer
PyPI: screen-recorder-analyzer
Install
Usage
Output format
Pipeline
| Step | Engine | What it does |
|---|---|---|
| Frame extraction | ffmpeg | Extracts keyframes at configurable intervals |
| OCR | Tesseract → EasyOCR → Google Cloud Vision | Reads on-screen text from each frame |
| Transcription | Whisper | Transcribes audio narration |
| Action extraction | GPT-4 | Combines OCR + audio → structured actions |
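The four steps in the table can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the package's actual code: every name here (`Event`, `ffmpeg_frame_cmd`, `first_successful_ocr`, `merge_timeline`, `build_prompt`) is hypothetical, the arrows in the OCR row are read as a fallback order, and the OCR engines and the model call are left as plain callables so the sketch runs without Tesseract, Whisper, or an API key.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class Event:
    t: float      # seconds from the start of the recording
    source: str   # "ocr" or "audio"
    text: str


def ffmpeg_frame_cmd(video: str, out_dir: str, interval_s: float = 5.0) -> List[str]:
    """Build an ffmpeg command that writes one frame every `interval_s` seconds."""
    return ["ffmpeg", "-i", video, "-vf", f"fps=1/{interval_s:g}",
            f"{out_dir}/frame_%05d.png"]


def first_successful_ocr(
    frame_path: str,
    engines: Sequence[Callable[[str], Optional[str]]],
) -> str:
    """Try each OCR engine in order; return the first non-empty result.

    Assumption: the engine chain is a fallback order (hypothetical reading).
    """
    for engine in engines:
        try:
            text = engine(frame_path)
        except Exception:
            continue  # engine missing or failed; try the next one
        if text and text.strip():
            return text.strip()
    return ""


def merge_timeline(
    ocr: Sequence[Tuple[float, str]],
    speech: Sequence[Tuple[float, str]],
) -> List[Event]:
    """Interleave OCR text and transcript segments by timestamp."""
    events = [Event(t, "ocr", s) for t, s in ocr]
    events += [Event(t, "audio", s) for t, s in speech]
    return sorted(events, key=lambda e: e.t)


def build_prompt(events: Sequence[Event]) -> str:
    """Render the merged timeline as text for a language model to summarize."""
    lines = [f"[{e.t:.1f}s] {e.source}: {e.text}" for e in events]
    return ("List the user's actions in chronological order as JSON.\n"
            + "\n".join(lines))
```

Keeping the engines as injected callables means an optional backend can be added or dropped without touching the merge step, which only depends on `(timestamp, text)` pairs.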
Optional engines
Use cases
- User research: Understand what testers are doing in session recordings
- QA automation: Extract action sequences from recorded test runs
- Onboarding analysis: See where users get stuck in screen recordings
- Compliance logging: Generate structured audit trails from recorded sessions