Instead of chunking, we improve the waveform precision and the clock accuracy. Clock accuracy is improved by flagging the "load time" of the clip in the context, which can be about 500ms (or more) after the context's own start. The precision is just a number in the PlaybackWaveform component.
This tries to prioritize actual voice to decide the waveform, and clamps noise to zero to ensure the waveform doesn't have a perceptually noisy base.
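As a rough illustration of the noise clamping described above, here is a minimal sketch. The function name `clampNoise` and the `noiseFloor` threshold of 0.1 are assumptions made for this example, not the values or names used in the change.

```typescript
// Hypothetical helper: normalises samples against the loudest one and
// zeroes anything below an assumed noise floor, so quiet background
// noise does not show up as a raised baseline in the waveform.
function clampNoise(samples: number[], noiseFloor: number = 0.1): number[] {
    // The trailing 0 keeps Math.max well-defined for an empty input.
    const peak = Math.max(...samples.map((s) => Math.abs(s)), 0);
    if (peak === 0) return samples.map(() => 0);

    return samples.map((s) => {
        const level = Math.abs(s) / peak; // 0..1 relative to the peak
        return level < noiseFloor ? 0 : level;
    });
}

const bars = clampNoise([0.01, 0.5, -1, 0.05]);
console.log(bars);
```

Normalising against the peak means the threshold is relative to the loudest part of the clip, which is one way to favour voice over background noise.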
In theory this better matches the overall voice message content.
This all started with a bug where the clock wouldn't update appropriately, and ended with a whole refactoring to support later playback in the timeline.
Playback and recording instances are now independent, and this applies to the <Playback* /> components as well. Instead of those playback components taking a recording, they take a playback instance which has all the information the components need.
The clock was incredibly difficult to get right because of how the audio context tracks time and because the source cannot report where it is in the buffer or in time. This means we have to track when we started playing the clip so we can capture the audio context's current time at that moment, which may already be a few seconds in by the first time the user hits play. We also track stops so we know when to reset that flag.
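The start/stop tracking can be sketched roughly as follows. `PlaybackClock`, `markStarted`, `markStopped` and the `TimeSource` interface are hypothetical names for this example; the only real API assumed is that an `AudioContext` exposes a `currentTime` in seconds, which is why a structural interface stands in for it here.

```typescript
// Anything with a currentTime in seconds, e.g. a Web Audio AudioContext.
interface TimeSource {
    readonly currentTime: number;
}

// Hypothetical clock: remembers the context time at which playback
// started, because the source node cannot report its own position.
class PlaybackClock {
    private readonly context: TimeSource;
    private startedAt: number | null = null;

    constructor(context: TimeSource) {
        this.context = context;
    }

    // Call when the clip starts playing.
    markStarted(): void {
        this.startedAt = this.context.currentTime;
    }

    // Call when the clip stops, so the next play starts from zero.
    markStopped(): void {
        this.startedAt = null;
    }

    // Seconds of the clip played so far; 0 when not playing.
    get timeSeconds(): number {
        if (this.startedAt === null) return 0;
        return this.context.currentTime - this.startedAt;
    }
}
```

With a real `AudioContext`, the context may have been running for several seconds before the first `markStarted()` call, which is exactly the offset this subtraction removes.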
Waveform calculations have also been moved into the base component, deduplicating the math a bit.
This fixes a bug where we couldn't upload voice messages because reading the audio buffer changed the position of its cursor. When this happened, the upload function would claim that the buffer was empty and could not be read.
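To illustrate the class of bug only: the sketch below uses a hypothetical `CursorBuffer` as a stand-in for a buffer with a read position. It is not the actual buffer type involved in the change, and `snapshot()` is just one possible way to read without consuming.

```typescript
// Hypothetical buffer with a read cursor, for illustration only.
class CursorBuffer {
    private readonly data: Uint8Array;
    private position = 0;

    constructor(data: Uint8Array) {
        this.data = data;
    }

    // Consuming read: returns everything after the cursor and moves
    // the cursor to the end, so a second read() sees nothing.
    read(): Uint8Array {
        const out = this.data.slice(this.position);
        this.position = this.data.length;
        return out;
    }

    // Non-consuming read: always returns a full copy of the contents.
    snapshot(): Uint8Array {
        return this.data.slice();
    }
}

const buffer = new CursorBuffer(new Uint8Array([1, 2, 3]));
const forWaveform = buffer.read();  // some earlier consumer reads the data
const forUpload = buffer.read();    // a later consumer finds it "empty"
const safeUpload = buffer.snapshot();
console.log(forWaveform.length, forUpload.length, safeUpload.length);
```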
This makes it easier to keep track of which pieces the client has already dispatched or executed, reducing the number of class members needed.
Critically, this makes it so the 'stop' button (which is currently a send button) actually works even after the automatic stop has happened.
UI is still pending for stopping recording early. This is not covered by this change.
See diff for details. Note that this introduces an "Uploading" state which is not currently used.
At the moment, if a user hits the maximum time then their recording will be broken. This is expected to be fixed in a future PR.
This leads to more reliable frequency/timing information, and involves a whole lot less decoding.
We still maintain ongoing encoded frames to avoid having to do one giant encode at the end, as that could take long enough to be disruptive.
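The idea of encoding as we go, rather than all at once at the end, might look something like this. `IncrementalEncoder` and the `encodeFrame` callback are hypothetical; the real encoder and frame format are not shown here.

```typescript
// Hypothetical signature: turns one raw frame into encoded bytes.
type FrameEncoder = (frame: Uint8Array) => Uint8Array;

// Encodes each frame as it arrives and keeps the encoded chunks, so
// finishing the recording only needs a cheap concatenation.
class IncrementalEncoder {
    private readonly encodeFrame: FrameEncoder;
    private readonly chunks: Uint8Array[] = [];

    constructor(encodeFrame: FrameEncoder) {
        this.encodeFrame = encodeFrame;
    }

    // Called for every frame while recording is in progress.
    pushFrame(frame: Uint8Array): void {
        this.chunks.push(this.encodeFrame(frame));
    }

    // Joins the already-encoded chunks into a single buffer.
    finish(): Uint8Array {
        const total = this.chunks.reduce((sum, c) => sum + c.length, 0);
        const out = new Uint8Array(total);
        let offset = 0;
        for (const chunk of this.chunks) {
            out.set(chunk, offset);
            offset += chunk.length;
        }
        return out;
    }
}
```

The cost of encoding is spread across the recording, so the work left at the end is proportional to copying bytes rather than encoding the entire clip.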