
Front page loads slowly when a lot of categories are present

By Christian Scheuer @chrscheuer
    2022-07-10 21:24:00.293Z

    Our forum front page takes several seconds to render, which is considered bad by modern standards (and also is just quite annoying). This seems to happen because the forum renders all sub categories - of which we have more than 500.
    It seems that this problem will only grow worse over time.

    Would it be possible to limit the display on the front page so that only the first X sub categories are rendered and the rest hidden / available via a manual click? This would hopefully make the front page load faster even with bigger sites.
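
One way the "first X sub categories" idea could look, as a minimal sketch only: the function name `slice_sub_categories`, the `limit` parameter, and the sample category names are hypothetical and are not taken from Talkyard. It returns the part to render immediately plus a count for a "show more" link.

```python
from typing import TypedDict


class FrontPageSlice(TypedDict):
    visible: list[str]   # sub categories rendered right away
    hidden_count: int    # how many sit behind a manual "show more" click


def slice_sub_categories(names: list[str], limit: int) -> FrontPageSlice:
    """Return the first `limit` names plus a count of the remaining ones."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    return {"visible": names[:limit], "hidden_count": max(0, len(names) - limit)}


# Hypothetical data: 500 sub categories, of which only 20 are rendered up front.
sub_categories = [f"Package {i}" for i in range(1, 501)]
page = slice_sub_categories(sub_categories, limit=20)
print(len(page["visible"]), page["hidden_count"])  # 20 480
```

With this shape the rendering cost of the first view depends on `limit`, not on the total number of sub categories.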

    In an ideal world, the front page would load instantly with constant time, no matter the size of the forum. Perhaps it's also worth checking if it has dependencies on things that aren't cached in redis causing it to roundtrip the database for every request?

    • 8 replies
    1. In reply to chrscheuer:

      Looking in Chrome Dev Tools, I see these two issues:

      • Not all the JSON needed to render the category page is included in the first response,
        so Talkyard sends another request to get all categories and the recent topics per category.
      • That second request (to /-/list-categories-topics) takes 2–3 seconds, which is surprisingly long.

      Perhaps it's also worth checking if it has dependencies on things that aren't cached in redis causing it to roundtrip the database

      Yes that could be it. (Everything is supposed to be in-memory-cached.)

      Would it be possible to limit the display on the front page so that only the first X sub categories are rendered

      I think that's a good idea. At the same time, this doesn't seem to be the main problem. Dev Tools shows that handling the categories JSON and rendering the categories takes about 400 ms, which is quite a lot. But the request to fetch the categories takes 2–3 seconds, and that is the main issue.

      ***

      Another minor issue is that Talkyard includes empty fields in the JSON. If those were all excluded, the categories JSON would shrink by about 50%.

            {
              "id": 90,
              "parentId": 13,
              "name": "Sree - Dolby Atmos",
              "slug": "sree-dolby-atmos",
              "defaultTopicType": 10,
              "newTopicTypes": [10],
              "doItVotesPopFirst": null,    <—— could exclude
              "unlistCategory": false,        all these 'null', 'false', 0 ...
              "unlistTopics": false,
              "includeInSummaries": 0,
              "position": 50,
              "description": "Forum for Sree - Dolby Atmos package",
              "thumbnailUrl": null
            },
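
The idea of excluding empty fields can be sketched as follows. This is an illustration, not Talkyard's serializer: `drop_empty_fields` is a hypothetical helper, and it assumes the client treats a missing field as `null`, `false`, or `0`, which the thread does not confirm.

```python
import json


def drop_empty_fields(category: dict) -> dict:
    """Omit fields whose value is None, False, or 0 (assumed to be the defaults)."""
    return {
        key: value
        for key, value in category.items()
        # `value != 0` also filters out False, because False == 0 in Python.
        if value is not None and value != 0
    }


# The category object quoted above, written as a Python dict.
category = {
    "id": 90,
    "parentId": 13,
    "name": "Sree - Dolby Atmos",
    "slug": "sree-dolby-atmos",
    "defaultTopicType": 10,
    "newTopicTypes": [10],
    "doItVotesPopFirst": None,
    "unlistCategory": False,
    "unlistTopics": False,
    "includeInSummaries": 0,
    "position": 50,
    "description": "Forum for Sree - Dolby Atmos package",
    "thumbnailUrl": None,
}

compact = drop_empty_fields(category)
full_size = len(json.dumps(category))
compact_size = len(json.dumps(compact))
print(f"{full_size} -> {compact_size} characters")
```

The saving for a given category depends on how long its name and description are, so the printed ratio for this one object need not match the roughly 50% estimated for the whole response.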
      
      1. Christian Scheuer @chrscheuer
          2022-07-18 13:27:46.947Z

          Yea sounds like you're making good progress figuring out what causes this and it probably needs several iterations to be super quick.

          I would consider any database calls during the loading of a front page as extraneous / something that could be solved better. The way we'd approach it in our infrastructure would be to have an asynchronous job that ran in the background whenever the front page would need updating (for anything that changes what's displayed there) and that job would then cache its results in redis or something similar.

          You could even have that job render static pregenerated HTML, so the async JSON fetch wouldn't need to happen when the front page is the first page loaded, which also makes it better for SEO. This step is probably more work, though. You could start with the job caching just the JSON needed to display the front page.
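
The caching approach described above could be sketched like this. Everything named here is an assumption for illustration: `FrontPageCache` is hypothetical, a plain dict stands in for Redis, and `refresh()` is called directly where a real setup would run it in a background worker.

```python
import json
from typing import Callable


class FrontPageCache:
    """Serves pre-built front-page JSON and rebuilds it only when content changes."""

    def __init__(self, build: Callable[[], dict]) -> None:
        self._build = build                # expensive step, e.g. database queries
        self._store: dict[str, str] = {}   # stand-in for Redis (assumption)
        self.rebuilds = 0                  # counter, only here to show the effect

    def refresh(self) -> None:
        """Meant to run in a background job after any change that affects the page."""
        self._store["front-page"] = json.dumps(self._build())
        self.rebuilds += 1

    def get(self) -> str:
        """Request path: a cache read only, except on a cold start."""
        if "front-page" not in self._store:
            self.refresh()
        return self._store["front-page"]


topics = ["Welcome"]
cache = FrontPageCache(lambda: {"recentTopics": list(topics)})

cache.get()
cache.get()                    # two page loads, but only one build
topics.append("New release")   # content changes ...
cache.refresh()                # ... so the job rebuilds once
print(cache.rebuilds)          # 2
```

The point of the design is that the number of rebuilds follows the number of content changes, not the number of page views.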

          I would also consider any delivery of JSON that goes beyond the page wrap (beyond the first page that you can see) as something that could be potentially broken up, all in favor of making the first visible part of the section load faster (or even instantly).
          Lighthouse from Google Chrome is great to troubleshoot and measure loading performance.

          1. Christian Scheuer @chrscheuer
              2022-07-18 13:31:35.202Z

              These are the results I get from https://forum.soundflow.org/categories (admittedly, on a slow internet connection):

              1. Christian Scheuer @chrscheuer
                  2022-07-18 13:35:42.246Z

                  For reference, on the same poor internet connection, we get 96 in perf on our own website. (And we need to fix some SEO tags I see)

                  1. Ooops Ty's 48 doesn't look good

                • In reply to chrscheuer:
                  KajMagnus @KajMagnus
                      2022-07-26 08:04:33.415Z, 2022-07-26 08:17:09.843Z

                  Update: Lots of database queries! Talkyard doesn't currently cache recent topics per category — instead, to render the categories page (which also includes the most recent topics, in each category), Ty looks up recent topics, twice per category, including sub categories. (Twice? One extra query, to find pinned topics.)

                  That's 1 000 database queries, to render the categories page, when there're 500 categories. And that's why the HTTP request to /-/list-categories-topics takes so long.

                  I'll rewrite this to just 1 or 2 queries (using SQL window functions: ... over (partition by ...)). I'd think that'll be enough to make the page acceptably fast, since those queries should take just a few millis. Some time later, the result will be cached as well.
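
A minimal sketch of the "one query instead of one per category" idea, using a window function. It is not the actual Talkyard query: the `topics` table, its columns, and the sample rows are invented, and SQLite (3.25 or later) stands in for Postgres so the example runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE topics (
        id INTEGER PRIMARY KEY,
        category_id INTEGER,
        title TEXT,
        bumped_at INTEGER
    );
    INSERT INTO topics (category_id, title, bumped_at) VALUES
        (1, 'a1', 10), (1, 'a2', 30), (1, 'a3', 20),
        (2, 'b1', 5),  (2, 'b2', 15);
""")

# Number the topics within each category, newest first, then keep the top N.
# One round trip covers every category, however many there are.
RECENT_PER_CATEGORY = """
    SELECT category_id, title FROM (
        SELECT category_id, title,
               ROW_NUMBER() OVER (
                   PARTITION BY category_id ORDER BY bumped_at DESC
               ) AS rn
        FROM topics
    )
    WHERE rn <= ?
    ORDER BY category_id, rn
"""

rows = conn.execute(RECENT_PER_CATEGORY, (2,)).fetchall()
print(rows)  # [(1, 'a2'), (1, 'a3'), (2, 'b2'), (2, 'b1')]
```

The per-category loop would issue one such lookup for each category; here the partitioning does that grouping inside the database.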

                  (Why does Ty currently do one SQL query per category? Probably I thought, long ago: "there's typically just 5 or 10 categories, so this'll be fast enough for now". But now there's 500+ :- ))

                  All JSON will be included in the first response, so the extra HTTP fetch request to get the JSON, will disappear too.

                  I would consider any database calls during the loading of a front page as extraneous / something that could be solved better

                  Me too :- )   (At least for users who aren't logged in. — If one is logged in, it could make sense to query the database for some user specific data.)

                  have an asynchronous job that ran in the background whenever the front page would need updating ... cache its results in redis or something similar

                  Yes, there is such a job (for all pages, not just the front page), and the results are cached in Postgres and in in-process memory in the application server.

                  render static pregenerated html so the async JSON fetch wouldn't need to happen in the case where the front page is the first page loaded

                  This is cached :- ) (i.e. static pregenerated html), and normally the JSON is included in an HTML tag directly in the HTTP response, so there's no separate JSON fetch request. However, the category list page was an exception.
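
Including the JSON in an HTML tag, as described above, is commonly done with an inert script element. The sketch below shows that general technique, not Talkyard's own code: `embed_json` and the `page-data` element id are hypothetical.

```python
import json


def embed_json(data: dict, element_id: str = "page-data") -> str:
    """Render `data` inside a <script type="application/json"> element."""
    payload = json.dumps(data)
    # Escape characters that could close the script element early or open an
    # HTML comment. The \uXXXX forms decode back to the same characters in JSON.
    payload = (
        payload.replace("<", "\\u003c")
        .replace(">", "\\u003e")
        .replace("&", "\\u0026")
    )
    return f'<script type="application/json" id="{element_id}">{payload}</script>'


# A category name containing "</script>" must not break out of the element.
html = embed_json({"categories": [{"id": 90, "name": "</script> test"}]})
print(html)
```

On the client, the page script can read the element's text content and parse it as JSON, so no second HTTP request is needed for the initial data.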

                  which also makes it better for SEO

                  You have in mind that Google gives some boost to fast loading pages? (Or sth else?)

                  1. Christian Scheuer @chrscheuer
                      2022-08-03 14:05:01.579Z

                      You have in mind that Google gives some boost to fast loading pages? (Or sth else?)

                      Yes, it's been widely reported that Google uses actual live data from the performance as experienced by real users visiting a site as one of the elements in its ranking system.

                      That being said, improving performance for real people, not just the web crawlers, is of course important as well, as research shows that even slight delays in load times drastically affect whether (new) users abandon a site visit. Thus, this would directly impact the bottom line.

                • In reply to chrscheuer:

                  Update: This is now fixed (in my work-in-progress branch), and I will deploy it hopefully at the end of ... hmm, next week. Some code review left to do & some more auto tests.

                  The 1 000 queries will now be around 20 instead — namely two per base category (not per sub category).

                  (Later, the queries can be cached, although I think that'd mostly be good for reducing server load, but wouldn't be so noticeable, latency wise.)

                  1. Progress
                    with handling this problem
                  2. @KajMagnus marked this topic as Planned 2022-07-12 04:05:34.047Z.
                  3. @KajMagnus marked this topic as Started 2022-07-18 10:03:16.772Z.