Audio synchronization is one of the most technically demanding aspects of eLearning localization. When a course includes narration, the translated audio almost never matches the original duration — some languages are inherently faster or slower to speak, sentence structures differ, and natural pacing varies by language. This means every slide's timeline must be adjusted to maintain the relationship between narration and on-screen events.
Our audio synchronization process begins during the translation phase. We provide translators with the source audio files alongside the scripts so they understand the pacing and emphasis of the original narration. We also flag any timing-critical segments — such as text that appears word-by-word in sync with narration, or animations triggered at specific audio cue points — so translators can keep segment length close to the source where possible without sacrificing meaning.
Once voiceover recording is complete (either through our network of professional voice talent or using client-provided recordings), our audio engineers normalize levels, remove artifacts, and match the audio profile of the source files. We deliver audio in the format required by the authoring tool, typically MP3 or WAV at the project's original sample rate.
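To make the normalization step concrete, here is a simplified sketch of peak normalization over 16-bit PCM samples. It is illustrative only: the function name and target level are invented for this example, and real audio engineering normalizes perceived loudness (e.g. to a broadcast standard such as EBU R 128) rather than raw peaks.

```python
import array

def peak_normalize(samples, target_peak=0.9):
    """Scale 16-bit PCM samples so the loudest sample reaches
    target_peak of full scale (32767). A simplified stand-in for
    real loudness normalization."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return array.array("h", samples)  # pure silence: nothing to scale
    scale = target_peak * 32767 / peak
    clamp = lambda v: max(-32768, min(32767, int(v)))
    return array.array("h", (clamp(s * scale) for s in samples))

# A quiet clip rescaled so its loudest sample sits at ~90% of full scale,
# leaving headroom before encoding to the delivery format.
quiet = array.array("h", [1000, -2000, 500])
loud = peak_normalize(quiet)
```

The same idea extends to matching the source files' sample rate and channel layout, which in practice is done with dedicated audio tooling rather than hand-written code.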
The synchronization work happens in the authoring tool. Our DTP operators replace audio files on each slide, then adjust cue points, animation timings, and slide durations to match the new narration. For Articulate Storyline, this means modifying the timeline for every slide and layer that contains narrated audio. For Captivate, we adjust slide duration and re-map audio events. For tools that use subtitle tracks, we regenerate the timing data from the new audio.
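The re-timing step above can be sketched as a linear rescale. Assuming cue points are stored as millisecond offsets (a simplifying assumption — Storyline and Captivate each have their own project formats, and this is not their actual data model), each event keeps its relative position within the narration:

```python
def rescale_cues(cues_ms, src_duration_ms, new_duration_ms):
    """Re-map cue-point timestamps so each event keeps its relative
    position after the narration audio is replaced."""
    if src_duration_ms <= 0:
        raise ValueError("source duration must be positive")
    ratio = new_duration_ms / src_duration_ms
    return [round(t * ratio) for t in cues_ms]

# A 10 s source slide re-timed for 12 s of translated narration:
# events at 0 s, 2 s, and 5 s shift to 0 s, 2.4 s, and 6 s.
print(rescale_cues([0, 2000, 5000], 10_000, 12_000))  # → [0, 2400, 6000]
```

A uniform rescale is only a starting point: translated sentences rarely stretch evenly across a slide, so operators still verify each cue against the new narration by ear.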
We offer two service levels for audio: full re-record with professional talent (our standard recommendation), and text-to-speech using premium neural voices for budget-sensitive projects or rapid prototyping. Both options include full timeline synchronization in the authoring tool.