Slow server @talkyard
It's getting increasingly frustrating that we're being met with "Server is slow right now" messages on a regular basis. Is there any fix for this coming up or will this continue to happen as often as it does now?
Customers are complaining about it, it's hindering us from effectively communicating with them and frankly it makes the experience really bad for everyone involved. We're trying to make Talkyard the primary communication channel to/from customers because it allows us to gather all of our information in one place. But that also means downtime is really really bad.
- Christian Scheuer @chrscheuer
I'm also getting my replies deleted. It seems to happen when the server is slow to respond to an "at" mention: when it finally replies, my entire response is deleted (because I continued to type after the popup).
- In reply to chrscheuer: Christian Scheuer @chrscheuer
It's also frustrating that you can be in the middle of typing a long reply and then the UI turns into grey mode making me unable to continue the work.
- In reply to chrscheuer: Christian Scheuer @chrscheuer
Fwiw this has been going on most of today...
- In reply to chrscheuer: Christian Scheuer @chrscheuer
This is now 8+ hours with sustained 50% downtime. It's hit or miss if we are connected. Is anyone monitoring this at all? @KajMagnus?
- In reply to chrscheuer: Christian Scheuer @chrscheuer
Fwiw the drafts feature seems to really be messing up everything when the server is slow. My text input field randomly resets to earlier versions of drafts of other posts while I'm in the middle of typing. Please please please roll back these changes, it is completely devastating to have your post deleted or overwritten in the middle of typing an important response to someone.
- In reply to chrscheuer: Christian Scheuer @chrscheuer
I've created separate threads for the various issues we're experiencing, hopefully in a more constructive tone. This thread I'm leaving here to discuss the (still existing) downtime and what can be done to mitigate it and make sure it doesn't happen in the future.
- In reply to chrscheuer: KajMagnus @KajMagnus 2018-09-09 04:42:33.367Z
I think that with drafts disabled (as mentioned in another topic) this will no longer happen.
And I should try to reproduce this, so I can verify it won't happen when I enable drafts again some time later. (Maybe drafts could be a per-site feature flag, for a start, so they can be disabled easily if needed.)
Thanks for posting separate topics about the other things.
- Christian Scheuer @chrscheuer
Cool sounds good. The main forum here at talkyard (the one we're writing in now) didn't seem to be affected by the slowness, it was only happening on our own. But that might have been luck of course - we did have much more traffic to our own site while the issues were there.
- In reply to chrscheuer: KajMagnus @KajMagnus 2018-09-20 11:22:33.945Z
Ok so, as per the chat, apparently the problem was too many open connections. This seems to be a Chrome restriction, not the Nginx server config: Chrome allows at most something like 10 open connections against a single host (whilst the Nginx server is configured to allow 60). And since each tab starts its own long polling request, and might do other requests in parallel (e.g. asking for the forum topic list), after 8, 9 or 10 tabs, additional requests block until one of the earlier long polling requests finishes, maybe after 10 seconds, maybe after 20 or 30. This gives the impression that the server is slow.
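The arithmetic behind that explanation can be sketched in a few lines of TypeScript. This is only a rough model, assuming each tab holds exactly one long polling request open and the browser caps concurrent connections per host. `connectionUsage` and its field names are made up for illustration, and the limit is a plain parameter (the "something like 10" figure is the one from this post, not a verified browser constant).

```typescript
// Hypothetical model, not Talkyard code: how a per-host connection cap is
// used up when every open tab keeps its own long polling request open.
interface ConnectionUsage {
  heldByLongPolls: number;      // connections occupied by long polling requests
  freeForOtherRequests: number; // connections left for e.g. topic list requests
  queuedLongPolls: number;      // long polling requests waiting for a free slot
}

function connectionUsage(openTabs: number, maxPerHost: number): ConnectionUsage {
  const heldByLongPolls = Math.min(openTabs, maxPerHost);
  return {
    heldByLongPolls,
    freeForOtherRequests: maxPerHost - heldByLongPolls,
    queuedLongPolls: Math.max(0, openTabs - maxPerHost),
  };
}

// Assumed limit of 10, as guessed in the post above.
const withEightTabs = connectionUsage(8, 10);
const withTwelveTabs = connectionUsage(12, 10);
```

With 8 tabs, two connections remain for ordinary requests. With 12 tabs, none remain and two long polls are already queued, so every other request has to wait for a long poll to finish.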
I believe I can fix this by adding a service worker and having it send just one long polling request at a time, on behalf of all open tabs. It will then broadcast the long polling replies to those tabs.
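A minimal sketch of that idea, with the browser APIs left out so the logic can be read (and run) on its own. `Tab`, `PollTransport` and `LongPollHub` are hypothetical names, not Talkyard's actual code. In a real service worker the transport would presumably be a `fetch` call, and the broadcast would go through `clients.matchAll()` and `client.postMessage()`.

```typescript
// Hypothetical sketch: one long polling request shared by all tabs.
type Tab = { id: string; receive: (message: string) => void };

// Starts one long polling request and calls `onReply` when the server answers.
type PollTransport = (onReply: (message: string) => void) => void;

class LongPollHub {
  private tabs = new Map<string, Tab>();
  private pollInFlight = false;
  private startPoll: PollTransport;

  constructor(startPoll: PollTransport) {
    this.startPoll = startPoll;
  }

  addTab(tab: Tab): void {
    this.tabs.set(tab.id, tab);
    this.ensurePolling();
  }

  removeTab(id: string): void {
    this.tabs.delete(id);
  }

  private ensurePolling(): void {
    // At most one request in flight, however many tabs are registered.
    if (this.pollInFlight || this.tabs.size === 0) return;
    this.pollInFlight = true;
    this.startPoll((message) => {
      this.pollInFlight = false;
      // Broadcast the reply to every registered tab, then poll again.
      this.tabs.forEach((tab) => tab.receive(message));
      this.ensurePolling();
    });
  }
}
```

The point of the design is that the number of open connections no longer grows with the number of tabs: adding a tab only adds an entry to the map.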
- Christian Scheuer @chrscheuer
This sounds like exactly what was happening. Nice research.
I'm curious if any of the standard websocket libraries out there would have this built in already so you wouldn't have to invent your own solution?
- Progress with handling this problem
- KajMagnus @KajMagnus 2018-10-07 08:34:10.481Z
Adding a service worker that sends long polling requests, one at a time, on behalf of all open tabs, does fix this issue. However, the changes I had to make feel a bit risky (lots of changes, and message passing between browser tabs and the service worker), so this will have to wait for a while.
When continuing with this, I'm thinking I'll also look into adding a custom PWA (progressive web app) manifest. ... The first steps towards creating a PWA mobile app. (Later steps: having the service worker offline-cache page JSON content.)
- KajMagnus @KajMagnus 2019-05-02 08:35:12.711Z
Now I've started with the "real" fix, namely a single service worker that keeps just one connection open to the server (instead of one per browser tab).
I think I will have deployed a version with a fix to this "test" community (i.e. Talkyard .io) in about a week. Then I'll see how this works for about a week, and if all is fine, I'll deploy to "everyone", with a feature switch so it can be toggled off.
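One way such a feature switch could look, purely as an illustration: the flag name `useServiceWorkerPolling` and the function `choosePollingMode` are invented here, and the fallback to per-tab polling when service workers are unavailable is an assumption rather than something stated in this thread.

```typescript
// Hypothetical sketch of a per-site switch between the two polling strategies.
type PollingMode = "shared-service-worker" | "per-tab";

interface SiteFeatures {
  useServiceWorkerPolling: boolean; // assumed flag name, for illustration only
}

function choosePollingMode(
  features: SiteFeatures,
  browserHasServiceWorker: boolean,
): PollingMode {
  // Use the shared connection only if the site has opted in and the
  // browser can run a service worker; otherwise keep the old behaviour.
  return features.useServiceWorkerPolling && browserHasServiceWorker
    ? "shared-service-worker"
    : "per-tab";
}
```

Keeping the old per-tab code path behind the switch means the new behaviour can be turned off for one site without a redeploy.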
- KajMagnus @KajMagnus 2019-05-10 14:33:12.735Z
Just deployed a new version, which does long polling via a single service worker (instead of per tab). Now I could open 30+ tabs, without any issues. Will keep this new code running on this Talkyard .io server only, for about a week, to see how this works.
- KajMagnus @KajMagnus 2019-05-26 11:05:44.266Z
@chrscheuer a few hours ago I enabled the service worker fix, on your forum. If you want to, you can open 20+ browser tabs and see if / verify-that "The server is slow" error is now gone (whilst being logged in).
- Christian Scheuer @chrscheuer
FANTASTIC!!!!! <3 <3 <3 THIS WORKS. I'm SO happy :)
- KajMagnus @KajMagnus 2019-05-26 11:19:29.780Z replies to chrscheuer:
Ok :- ) (be careful so you won't crash your browser if hereafter 100+ tabs open o. O )