Low Latency Mode causes a misleading experience for both performers and viewers (jarring pitch bends)
I'm certain y'all know about this problem already, but I wanted to tack on some details in case they help with finding a solution.
Since COVID-19, DJs have flocked to streaming, many of them on Twitch. Low Latency Mode is creating jarring pitch bends, and neither most DJs nor their listeners know where they come from. Savvy listeners think the DJ is just doing a bad job. Even non-savvy listeners can tell something's not right. But DJs have no idea that this is happening; even if they enable past broadcasts, their VOD won't have this problem. When listeners point the problem out, DJs often have no idea what the listeners are talking about.
I'm sure you know the cause but I'll elaborate for everyone's sake.
Low Latency Mode, if I understand correctly, tries to catch up viewers who've fallen behind (e.g. due to a temporary blip in their Internet), to the latest point in the stream, by ever-so-slightly increasing the speed of playback. This doesn't matter too much for speech: people notice a change, but it doesn't affect their understanding. It also doesn't matter too much for video game music: people notice a change, but it's the ambience / energy that matters, not so much the pitch.
When it comes to music performance though, it can be a big deal, particularly for DJ streams, which have exploded since the virus outbreak. DJs aim to provide a smooth, uninterrupted feel over extended periods of time. But Low Latency Mode disrupts the vibe and even makes people stop dancing. Look, I know how annoying musicians can be, but I'm not exaggerating: I've been in Zoom dance parties, or hanging out with a few friends, where even the non-musical people stopped dancing because they felt something "off" had happened.
Even non-trained ears can hear pitch differences with surprising accuracy. I can make a recommendation that would make the pitch difference much more tolerable.
By my ears, the pitch shift I hear is around 50 cents (i.e. about a quarter tone) for music playing at 123 bpm. I suspect the catch-up speed factor actually varies, but I don't know your algorithm. Anyway, pitch is logarithmic in playback speed, so around 123 bpm each 1 bpm of speed-up shifts pitch by ~14 cents. So for me to be hearing 50 cents, your algo must be pushing 123 bpm to 126-127 bpm. In other words, Low Latency Mode is increasing speed by as much as 3%. It's understandable that 3% sounded totally acceptable from a general playback product perspective that wasn't geared toward DJs. But this probably has to be reviewed if you want to continue supporting DJ streams. (Which, if you don't care, that's fair :-)
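For anyone who wants to check my arithmetic, here's a quick sketch of the conversion. It assumes the player speeds audio up without any pitch correction, so pitch follows playback speed directly (that's my assumption, I haven't seen the player code); the function names are just mine.

```python
import math

def speed_to_cents(speed_ratio: float) -> float:
    """Pitch shift in cents for a playback speed ratio (1.0 = normal speed).

    Assumes no pitch correction, so pitch scales with speed.
    """
    return 1200.0 * math.log2(speed_ratio)

def cents_to_speed(cents: float) -> float:
    """Playback speed ratio that produces the given pitch shift in cents."""
    return 2.0 ** (cents / 1200.0)

base_bpm = 123.0

# Pitch shift from a 1 bpm speed-up at 123 bpm.
cents_per_bpm = speed_to_cents((base_bpm + 1.0) / base_bpm)

# Speed ratio and resulting tempo for the ~50 cent shift I hear.
ratio_for_50_cents = cents_to_speed(50.0)
shifted_bpm = base_bpm * ratio_for_50_cents

print(f"+1 bpm at {base_bpm:.0f} bpm is about {cents_per_bpm:.1f} cents")
print(f"50 cents is a speed ratio of {ratio_for_50_cents:.4f}, "
      f"i.e. about {shifted_bpm:.1f} bpm")
```

That works out to roughly 14 cents per bpm, and a 50 cent shift lands between 126 and 127 bpm, a bit under a 3% speed-up.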
So what if we take a practical approach? "Instrument Timbres and Pitch Estimation in Polyphonic Music," Loeffler, B. D. (2006) says that the smallest pitch difference human ears can detect is 6 cents. But it's not like you have to fool professional musicians. I'm sure even lowering the shift from 50 to around 15 cents will make a world of difference. A 15 cent shift corresponds to a speed increase of a bit under 1% (a full +1% is about 17 cents), so you'd have to cap the speed increases at roughly +1% at most.
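Here's a small sketch of how a pitch budget in cents maps to a speed cap, again assuming uncorrected playback where pitch tracks speed. The function name and the list of budgets are only for illustration.

```python
import math

def max_speedup_for_cents(cents_budget: float) -> float:
    """Largest fractional speed increase that keeps the pitch shift
    within cents_budget (e.g. 0.01 means +1%)."""
    return 2.0 ** (cents_budget / 1200.0) - 1.0

def cents_for_speedup(speedup: float) -> float:
    """Pitch shift in cents caused by a fractional speed increase."""
    return 1200.0 * math.log2(1.0 + speedup)

# 6 cents: detection threshold cited above; 15: my suggested target;
# 50: what I currently hear.
for budget in (6.0, 15.0, 50.0):
    cap = max_speedup_for_cents(budget)
    print(f"{budget:>4.0f} cents -> cap speed-up at +{cap * 100:.2f}%")

print(f"+1% speed-up -> {cents_for_speedup(0.01):.1f} cents")
```

So staying at or under 15 cents needs a cap just under +0.9%, and a round +1% cap comes out a little over 17 cents, which I'd still expect to be a huge improvement.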
- Is it viable to cap low latency mode's speed increases at +1%?
- If no, is it viable to do so just for Music & Performance Arts channels?
- If no, is it viable to make it an opt-in setting for the broadcaster?
- Or maybe some compromise where the cap applies when the viewer is only slightly behind, but when they're far behind, it does whatever it does today?
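To make that last compromise concrete, here's a rough sketch. I don't know how the player actually picks its catch-up rate, so every name and number here is made up for illustration: `target_lag_s`, `far_behind_s`, the 1% gentle cap, and the 3% "today" rate (which is just my estimate by ear) are all assumptions.

```python
def playback_rate(
    lag_s: float,
    target_lag_s: float = 2.0,   # hypothetical: lag considered "caught up"
    far_behind_s: float = 5.0,   # hypothetical: lag considered "far behind"
    gentle_rate: float = 1.01,   # proposed cap, roughly 17 cents of shift
    legacy_rate: float = 1.03,   # my guess at today's rate, roughly 50 cents
) -> float:
    """Pick a playback speed ratio from how far the viewer is behind live.

    This is an illustrative policy, not the real player logic.
    """
    if lag_s <= target_lag_s:
        return 1.0           # close enough to live: play at normal speed
    if lag_s < far_behind_s:
        return gentle_rate   # slightly behind: catch up slowly, small shift
    return legacy_rate       # far behind: fall back to current behavior

for lag in (1.0, 3.0, 10.0):
    print(f"lag {lag:>4.1f}s -> rate x{playback_rate(lag):.2f}")
```

The trade-off is that the gentle rate takes longer to catch up: at +1%, closing a 3 second gap takes about 5 minutes of playback, versus under 2 minutes at +3%.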
I trust y'all to make the right call. I also trust that you may have bigger fish to fry.
Appreciate all your work and wishing the best to everyone there.