Solving the problem that the topmost comments get all upvotes

By KajMagnus @user_145
    2013-02-23 09:32:31.097Z2016-05-03 07:34:18.869Z
    The first comment gets much attention, many votes.

    This assumes a discussion system that sorts comments by number of upvotes.

    [[Update: As of August 2015, I have partly implemented the approach in the first example below (but not the second example). ]]

    A fairly prevalent problem in online discussions: some comments that are
    posted early are fairly useful, and get fairly many upvotes but no
    downvotes (since they are fairly useful). Later on, some really
    interesting comments are posted, but by that time the early comments have already
    accumulated many upvotes.

    The early comments are therefore shown first, and continue to
    gather more and more upvotes, because they receive all the attention.
    The really interesting comments, however, remain forgotten somewhere below,
    because too few people take the time to scroll down, find them and read them.

    A Solution?

    This should solve the above-mentioned problem:

    The computer counts how many people have read each comment, and takes this into
    account when it sorts the comments. So we don't sort by upvotes only, but
    by upvotes divided by attention (that is, how many people have read a comment).

    The fairly useful comments mentioned above got many upvotes, but much more
    attention than upvotes. Therefore the sorting algorithm would move them downwards,
    and the really interesting comments would take their place
    — because the really interesting comments have been upvoted very much,
    compared to how many people have read them.

    The result should be that really interesting comments surface to the top of
    the page, even if they are posted much later than the early comments.
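
As a rough illustration of the idea above, here is a minimal Python sketch that compares sorting by raw upvotes with sorting by upvotes divided by reads. The `Comment` class, the field names and the numbers are made up for the example; this is not the actual implementation used by this discussion system.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    text: str
    upvotes: int
    reads: int  # assumption: how many people the computer thinks have read it


def attention_adjusted_score(comment: Comment) -> float:
    """Upvotes divided by attention; unread comments score 0 to avoid dividing by zero."""
    if comment.reads == 0:
        return 0.0
    return comment.upvotes / comment.reads


comments = [
    Comment("early, fairly useful", upvotes=40, reads=1000),
    Comment("late, really interesting", upvotes=15, reads=50),
]

# Sorting by upvotes only keeps the early comment on top.
by_upvotes = sorted(comments, key=lambda c: c.upvotes, reverse=True)

# Sorting by upvotes per read lets the later comment surface.
by_ratio = sorted(comments, key=attention_adjusted_score, reverse=True)
```

A plain ratio like this is very noisy when only a handful of people have read a comment, so on its own it is probably not enough.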

    Broken Solutions

    Please note that dividing by downvotes won't work,
    because early posted fairly useful comments get virtually no downvotes, only upvotes.

    Promoting new comments, in some manner, until they have received a few ratings, won't work well, I think. Wouldn't the result be that the top of the page gets covered with lots of new, mediocre comments?

    Two Examples

    How does the computer know which comments you've read? Here follow two examples
    of how the computer could deduce which comments you've read.

    First example. In the picture below, if you upvote the
    comment outlined in orange, the computer would assume you've read all
    comments leading up to that comment, plus any siblings placed before it.
    Those comments are outlined in blue.

    The computer thinks you've read the blue comments, if you upvote the orange comment.
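
The first example could be sketched roughly as follows in Python. The `Node` class and the function name are hypothetical. The sketch also assumes that "siblings placed before it" means only the earlier siblings of the upvoted comment itself, not the earlier siblings of its ancestors; the text above does not say which.

```python
class Node:
    """A comment in a thread; children are kept in display order."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []

    def add(self, name):
        child = Node(name, parent=self)
        self.children.append(child)
        return child


def assumed_read_on_upvote(upvoted):
    """Return the names of comments assumed read when `upvoted` is upvoted."""
    read = []
    # Siblings shown before the upvoted comment.
    if upvoted.parent is not None:
        for sibling in upvoted.parent.children:
            if sibling is upvoted:
                break
            read.append(sibling.name)
    # All comments leading up to the upvoted comment (its ancestors).
    node = upvoted.parent
    while node is not None:
        read.append(node.name)
        node = node.parent
    return read


top = Node("top")
reply1 = top.add("reply 1")
reply2 = top.add("reply 2")
reply2a = reply2.add("reply 2a")
reply2b = reply2.add("reply 2b")
reply3 = top.add("reply 3")

blue = assumed_read_on_upvote(reply2b)
```

Here `reply2b` plays the role of the orange comment, and `blue` holds the comments that would be outlined in blue.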

    Second example. The computer could assume that you read comments that are shown on screen.
    Have a look at this demo page [[sorry, the demo is broken nowadays (year 2015) because of lots of changes in how the software works]]. In the demo, a red square indicates that the computer thinks you have not yet read the comment. A blue square indicates that the computer thinks you've read it. Look at the demo page for a while, and watch the squares change color to blue.

    In the two examples above: When you upvote a comment, the computer thinks that the other comments you have read (the blue ones) but did not upvote, are not terribly interesting.

    This problem elsewhere, and solved?

    Search engines have a similar problem. They estimate the usefulness of a search result link by counting how many people click the link. However, people tend to click the topmost links only. Here are some articles about how search engines approach this problem (I have not yet read them in full):

    • Cascade Model (one of the original click models)
    • Dynamic bayesian network model (a more generalizable Cascade Model).
      As far as I can tell, example 1 above resembles the simplified model described
      in section 5. A SIMPLIFIED MODEL.
    • DBN model with scroll and hover interactions (resembles example 2 above)

    (Thanks to whathappenedto at Hacker News for posting these links.)

    Your Thoughts?

    Is it a good idea to take into account how many people have read a comment, rather than considering only upvotes and downvotes? Are there problems I've overlooked?

    (This article is being discussed at Hacker News
    and a tiny bit at Reddit)

    There are 61 replies. Estimated reading time: 23 minutes

    1. A
        2013-03-24 01:08:58.425Z

        good thoughts, thanks for this!

        1. A2
          In reply to user_145:
            2013-03-24 01:23:06.808Z

            I think it's a good idea, and I don't see any problems with it, no obvious ones at least.

            One improvement I could see is giving less weight to a comment as more people see and upvote it. But that would require experimentation.

            1. A3
              In reply to user_145:
                2013-03-24 01:32:29.497Z

                you're a genius.

                1. A3
                  In reply to user_145:
                    2013-03-24 01:50:48.131Z

                    I was an early poster on a political discussion and got a single downvote. That pushed my comment down to the lowest embedded level. So unless you were showing level 7 comments it didn't even show up. Totally unfair that one disagreeing thumbs down buried my comments.

                    1. Anonymous
                        2013-03-24 01:54:29.640Z

                        This is unequivocally the most interesting comment in the world, and yet here it lies at the bottom...hmmm, call me a skeptic.

                        1. In reply to Anonymous:
                          KajMagnus @user_145
                            2013-03-24 09:42:11.487Z

                            That can be fixed, and is already fixed on this website :-)

                            Details: I think the approach suggested on this page should be combined with algorithms that take into account that one doesn't know how interesting a comment is, after only a few votes. Only after many votes, it's possible to make a good estimate.

                            One way to do this is to sort by the lower bound of a confidence interval for the true score of the comment. This is actually how the discussion system on this website already works :-) So the problem you're referring to should already have been largely solved on sites that use this discussion system.

                            Also Reddit has solved the problem you're mentioning (we've chosen the same solution, independently of each other).

                            1. In reply to Anonymous:
                                2013-03-24 03:44:32.785Z

                                Sometimes "highly" disagreed with comments can be interesting to read as well (to see what kind of opinion many people find untrue), so perhaps both could be pushed to the top -- comments with a high upvote to view ratio, and comments with a high downvote to view ratio.

                                In your case, your comment might remain high with only a single down vote, until either (1) time passes and more people view it without voting on it, or (2) up and down votes even out.

                                1. KajMagnus @user_145
                                    2013-03-25 10:47:23.714Z

                                    I think it'd be interesting to help people find strongly controversial comments. I'm not sure about comments that virtually everyone votes down, though. In general, I'd guess such comments tend to waste people's time (although there are exceptions, I suppose).

                                    Finding highly upvoted comments that differ from the current reader's point of view would be really interesting, I think :-) But how does the computer know what the current reader thinks, and what a comment "thinks"?

                                  • In reply to Anonymous:
                                    Sweet Anonymous
                                      2013-03-24 02:17:37.265Z

                                      aawww sad story. better luck for next time

                                    • A3
                                      In reply to user_145:
                                        2013-03-24 01:51:02.286Z

                                        So True

                                        1. In reply to Anonymous:
                                            2013-03-24 02:43:06.349Z

                                            Like it

                                          • I
                                            In reply to user_145:
                                              2013-03-24 01:57:48.377Z

                                              Let's see if it works. Actually, Reddit sorts by "best", for instance: the score is proportional to the number of upvotes and the date it was posted.

                                              1. A3
                                                In reply to user_145:
                                                  2013-03-24 02:01:42.578Z

                                                  (empty post)

                                                  1. A3
                                                    In reply to user_145:
                                                      2013-03-25 16:00:48.953Z

                                                      I think it's good, but I think that the replies to comments should be shown to the right of the original comment, rather than scrolling down to see them.

                                                      The reason for that is because it is still more convenient to scroll down than to scroll to the side, even with your fancy system in place. This is because of the page down and page up keys.

                                                      When a lot of low-content comments are posted, as is currently the case, it's important to have a great way to sift through them quickly. Threads, on the other hand, tend to need less space, and therefore can occupy the less searchable horizontal axis.

                                                      1. H
                                                        In reply to user_145:
                                                        Hopefully helpful
                                                          2013-03-26 02:20:18.580Z

                                                          The solution is simple - the weight of a vote should decay over time. It should affect up votes and down votes identically. This is more difficult to implement, but it is essentially what you're trying to approximate with your approach (number of reads being an approximation of the passage of time) and doesn't require trying to guess what's been read.
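
One way the vote decay suggested in this comment might look is sketched below in Python. The exponential shape, the 30-day half-life and the function names are all assumptions made for illustration; the comment does not specify how the weight should decay.

```python
def decayed_vote_total(vote_ages_days, half_life_days=30.0):
    """Sum of vote weights, where each vote's weight halves every `half_life_days`."""
    return sum(0.5 ** (age / half_life_days) for age in vote_ages_days)


def decayed_net_score(upvote_ages_days, downvote_ages_days, half_life_days=30.0):
    """Upvotes and downvotes decay identically, as the comment suggests."""
    ups = decayed_vote_total(upvote_ages_days, half_life_days)
    downs = decayed_vote_total(downvote_ages_days, half_life_days)
    return ups - downs


# One fresh upvote plus one upvote that is exactly one half-life old.
example_total = decayed_vote_total([0, 30])
```

Note that this sketch only looks at elapsed time; it does not know which comments people have actually read.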

                                                          1. KajMagnus @user_145
                                                              2013-03-26 03:58:51.532Z

                                                              That assumes people read all comments, but they don't. Imagine a page with 1000 comments: most people read the topmost comments, but I'd guess that fewer than 1 in 1000 read all comments. Taking only elapsed time (or page views) into account would unfairly favor the topmost comment by a factor of more than 1000, and would not work well.

                                                              One needs to take both page views and position on screen into account.

                                                              It might, however, be a good idea to let a vote decay somewhat with time. Comments might become obsolete after some years (for example). So it might be a good idea to favor newly posted and popular comments, even if they're not quite as popular as a comment that is many years old. I'd guess, however, that letting comments and articles be wikis is good enough. (Then one can fix links that get broken, et cetera.)

                                                            • A4
                                                              In reply to user_145:
                                                                2013-03-24 06:42:15.925Z

                                                                Doesn't this confuse long post with interesting post? Now a short, earth-shattering response will get no visibility vs long-winded replies...

                                                                1. KajMagnus @user_145
                                                                    2013-03-25 11:14:47.524Z

                                                                    I'm not sure. Long posts do take longer to read, so fewer people tend to upvote them. I was thinking that, on the whole, the effect you're mentioning and the effect I'm mentioning cancel each other out.

                                                                    However, my actual implementation might not work very well. Perhaps measuring which comment a visitor is reading works well mainly on mobile phones (where only one comment is shown at a time).

                                                                    I read an article by Paul Graham that discussed "good" and "bad" comments (the "Comments" section in the article), and he wrote that: "There is a strong correlation between comment quality and length", and "Whatever the cause, stupid comments tend to be short."

                                                                  • B
                                                                    In reply to user_145:
                                                                      2013-03-24 14:29:24.948Z

                                                                      well - i see two problems here:

                                                                      1. if you have some collusion of people who upvote their comments in groups it would be very easy to dominate the comments section with new comments ... you have to factor in absolute values somehow because 1000ups/1000views is far better than 1up/1view

                                                                      2. sorting this way will result in a very fluctuating comment section which probably isn't very usable - also most people dont take the time to equally consider all comments (not scrolling down e.g.) when giving upvotes ... maybe the solution should be to increase / decrease the font size of good / bad comments to make good stories more visible while still maintaining some order.

                                                                      btw) going from left to right like on this side is equally unsuitable -- but the reddit-style comment system with threading and hiding bad comments is the best i know

                                                                      1. KajMagnus @user_145
                                                                          2013-03-25 11:45:10.339Z

                                                                          Re 1: One can use mathematics, to take into account how many people have read a comment. If only a few people have read it and upvoted it, then we really don't know for sure if it is interesting, and we won't give it a score of 100% interesting (1up/1view).

                                                                          The mathematics stuff is called "using the lower bound of a binomial proportion confidence interval" — here's an article about it: How not to sort by average rating.

                                                                          (This discussion system already works in that way actually :-))

                                                                          Re 2: With problem 1 solved, I think this won't be that much of a problem.

                                                                          Re btw: Fairly many people seem to prefer vertical layout rather than horizontal. I'm thinking about making this configurable per website.

                                                                        • F
                                                                          In reply to user_145:
                                                                            2013-03-24 07:32:44.059Z

                                                                            Make reading comments a 2D navigation problem? Could be interesting

                                                                            1. KajMagnus @user_145
                                                                                2013-03-24 09:46:22.379Z

                                                                                It's a DAG graph actually I think :-) And a directed graph with cycles if one can edit one's comment and refer to what people said, later on.

                                                                                I've been thinking about rendering a graph instead of a tree, but that feels... a little bit too crazy (and hard to implement)

                                                                              • A3
                                                                                In reply to user_145:
                                                                                  2013-03-24 02:09:59.557Z

                                                                                  (empty post)

                                                                                  1. A3
                                                                                    In reply to user_145:
                                                                                      2013-03-24 02:22:47.778Z

                                                                                      only 1 person sees your thread and upvotes = 100% so top comment for next person.

                                                                                      1. KajMagnus @user_145
                                                                                          2013-03-25 10:59:38.780Z

                                                                                          Actually that's not how I'd implement it. Instead, one can use mathematics, to take into account how many people have read the comment: if only 1 person has read it, then we really don't know for sure if it is interesting, and we won't give it a score of 100%.

                                                                                          The mathematics stuff is called "using the lower bound of a binomial proportion confidence interval" — here's an article about it: How not to sort by average rating.
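
For readers who want to see the idea in code: the Wilson score interval is one commonly used binomial proportion confidence interval, and its lower bound can be computed as in the Python sketch below. Treating reads as trials and upvotes as successes, the function name, and the 95% confidence level (`z = 1.96`) are assumptions for this example, not a description of this site's actual code.

```python
import math


def wilson_lower_bound(upvotes: int, reads: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for the upvote-per-read rate.

    Assumes 0 <= upvotes <= reads. z = 1.96 corresponds to roughly 95% confidence.
    """
    if reads == 0:
        return 0.0
    p = upvotes / reads
    denominator = 1 + z * z / reads
    centre = p + z * z / (2 * reads)
    margin = z * math.sqrt((p * (1 - p) + z * z / (4 * reads)) / reads)
    return (centre - margin) / denominator


one_of_one = wilson_lower_bound(1, 1)
thousand_of_thousand = wilson_lower_bound(1000, 1000)
```

With these inputs, a comment with 1 upvote from 1 reader scores far below a comment with 1000 upvotes from 1000 readers, even though both have an observed ratio of 100%.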

                                                                                        • G
                                                                                          In reply to user_145:
                                                                                            2013-03-24 02:14:42.084Z


                                                                                            1. S2
                                                                                              In reply to user_145:
                                                                                                2013-03-24 10:00:27.046Z

                                                                                                Well, interesting idea. But I would propose that even people reading a comment, but doing nothing, would count as divisor for "upvoterate".

                                                                                                I know this would be difficult to achieve, and it would probably have to use some form of client-side JS to determine whether a comment had been visible in the browser. But seeing a comment and doing nothing is a strong signal in itself. So you would get a picture more rooted in reality that way.

                                                                                                And yes, what you wrote is quite a problem for online-discussions, as the first-come first-win situation, does not push the best content to the top.

                                                                                                1. KajMagnus @user_145
                                                                                                    2013-03-25 11:39:46.334Z

                                                                                                    Initially I was thinking about considering all visitors, like you suggested. Instead of considering only visitors who voted on something. But I'm afraid this would use up too much disk storage space (since the computer would have to store information on each and every visitor).

                                                                                                    (The red-and-blue-boxes stuff in example 2 already uses client-side JavaScript.)

                                                                                                  • A3
                                                                                                    In reply to user_145:
                                                                                                      2013-03-24 07:55:33.161Z

                                                                                                      Cool idea. I'm curious whether the comments that are further below would tend to naturally have a higher "vote / see" ratio because users who go that deep tend to be more engaged anyways.

                                                                                                      1. A3
                                                                                                        In reply to user_145:
                                                                                                          2013-03-24 02:11:13.646Z

                                                                                                          Wow this page is really broken for people who like to highlight text with this mouse...

                                                                                                          1. A3
                                                                                                            In reply to user_145:
                                                                                                              2013-03-24 03:00:52.010Z

                                                                                                              Can you post your code for this screen viewport trick?

                                                                                                              1. KajMagnus @user_145
                                                                                                                  2013-03-25 11:06:52.695Z
                                                                                                                • U
                                                                                                                  In reply to user_145:
                                                                                                                  Vijay @user_162
                                                                                                                    2013-03-24 02:37:30.523Z

                                                                                                                    This actually seems like a useful application of a multi-armed bandit algorithm. I definitely recommend John Myles White's O'Reilly book on MAB problems - it's brief and has some great discussion. A general concept is exploration vs. exploitation - using a view to explore an infrequently viewed comment, or exploiting a known good comment - or something in-between.

                                                                                                                    1. KajMagnus @user_145
                                                                                                                        2013-03-25 11:54:14.024Z

                                                                                                                        Okay, I noticed it's only 88 pages and entitled "Bandit Algorithms for Website Optimization" — then I'd consider reading it soon :-)

                                                                                                                        Thanks for the suggestion

                                                                                                                      • P
                                                                                                                        In reply to user_145:
                                                                                                                          2013-03-24 06:51:23.175Z

                                                                                                                          This seems like an interesting solution, but what do you say to those who prefer a more "forum-based" format of reading? Would there be a more vertical version of this model?

                                                                                                                          1. KajMagnus @user_145
                                                                                                                              2013-03-25 11:23:30.405Z

                                                                                                                              I think there ought to be — many people apparently like that format. But it'd probably take a while before I get/take the time to implement it. Perhaps someone else might want to do it, if/when Debiki becomes open source. (Under the AGPL license probably.)

                                                                                                                            • G2
                                                                                                                              In reply to user_145:
                                                                                                                              Gary Culliss
                                                                                                                                2013-03-24 03:04:09.210Z

                                                                                                                                I developed a similar technology for search results back in the late 1990s that considered how many times a result was selected by users relative to how many times a result was shown to users. We used similar methods of counting the links above a selected link as being shown to the user, as well as pagination breaks, time and other metrics. You can then use ratios of the selections over the views for ranking, but the resulting number will be too volatile at low counts, so you need to build in some padding to account for that fact. Also, in search where results are more persistent, we took account of time by keeping track of the age of clicks and views, and expiring them after certain rolling periods of time. Anyhow, it's neat to see you come up with the idea for news. We applied it everywhere, including things like category listings and even to determine if something might contain inappropriate content.
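
The "padding" mentioned in this comment could, for example, take the form of additive smoothing, as in the Python sketch below. The function name, the prior of 20 imaginary views and the prior rate of 5% are hypothetical values picked for illustration; the comment does not say how the padding was actually done.

```python
def padded_ratio(selections, views, prior_views=20.0, prior_rate=0.05):
    """Selections per view, padded with imaginary prior views.

    The padding pulls the ratio towards `prior_rate` while real counts are
    low, so a single lucky click does not produce a top score.
    """
    return (selections + prior_rate * prior_views) / (views + prior_views)


low_count = padded_ratio(1, 1)      # raw ratio would be 1.0
high_count = padded_ratio(30, 100)  # raw ratio would be 0.3
```

With these example numbers, the result with 30 selections out of 100 views ranks above the result with 1 selection out of 1 view, which is the opposite of what the raw ratios would give.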

                                                                                                                                1. KajMagnus @user_145
                                                                                                                                    2013-03-25 15:16:08.202Z

                                                                                                                                    That's really cool :-) I googled your name and read about Direct Hit Technologies. I noticed you've studied engineering, law and finance — this seems almost a bit crazy :-) (in a good sense of course).

                                                                                                                                    I found this article from 1999. I noticed you were using information about the user to decide which search results to show. It would be interesting to use info about the user to decide which comments to prioritize or highlight. For example, if the user is a rich middle-aged man who votes for this or that party, then show highly upvoted comments that disagree with people from that group.

                                                                                                                                    The intention would be to contribute to a more tolerant world, where people better understand others with different opinions.

                                                                                                                                    (Perhaps this might even be doable, if one could gather information via Facebook or something. Hmm but there'd be lots of privacy issues of course. — Or people could contribute info about their own anonymous profiles, voluntarily, if they wanted to participate in "the-understand-others-better" project)

                                                                                                                                  • U
                                                                                                                                    In reply to user_145:
                                                                                                                                    Piotr @user_161
                                                                                                                                      2013-03-24 02:29:27.645Z

                                                                                                                                      The idea seams interesting. However, only practice or a fair amount of real statistics (e.g. re-evaluating comments from other post by this rule) will whether it makes sense).

                                                                                                                                      1. D
                                                                                                                                        In reply to user_145:
                                                                                                                                          2013-03-24 03:26:21.447Z

                                                                                                                                          I think it's a great idea. I've had similar thoughts before too, but I'm happy someone actually implemented (a nicely working, at first glance) solution.

                                                                                                                                          1. Dmitri
                                                                                                                                              2013-03-24 03:27:47.550Z

                                                                                                                                              Just want to add, I usually only upvote exceptional comments, or comments I really agree with.

                                                                                                                                              That means there are many good comments that I don't upvote. But on average, it should still work, I would imagine.

                                                                                                                                            • J
                                                                                                                                              In reply to user_145:
                                                                                                                                                2013-03-24 03:17:36.125Z

                                                                                                                                                This is a great idea. I think factoring in the number of people that have read a comment does solve the problem: it equalizes the field.

                                                                                                                                                1. A3
                                                                                                                                                  In reply to user_145:
                                                                                                                                                    2013-03-24 03:11:36.892Z

                                                                                                                                                    People don't tend to read sideways though. They follow the big F shape.

                                                                                                                                                    1. V
                                                                                                                                                      In reply to user_145:
                                                                                                                                                        2013-03-24 02:54:35.650Z

                                                                                                                                                        Upvotes/Downvotes are the bane of all discussion threads. What is this, Junior High? I can read and decide for myself if the comment is brilliant, valid, or just another jackass on the web. Just post the comments in the order they were made, and let the reader decide what to do with them. There's not a single website that uses upvotes/downvotes that hasn't turned into a huge circle jerk session of mutual back patting and popularity contests.

                                                                                                                                                        1. UKajMagnus @user_145
                                                                                                                                                            2013-03-25 11:52:51.610Z

                                                                                                                                                            What if there are 100 comments, or a forum topic with 100 pages filled with comments? Then people won't read them all and the interesting information is lost forever, somewhere in the middle...

                                                                                                                                                            Reddit, SlashDot, StackOverflow (!) + 100 StackExchange sites, and HackerNews are some sites that use up/downvotes and work well. People do try to get more reputation points, and I think this tends to make them behave well and be more constructive and respectful towards each other.

                                                                                                                                                          • A5
                                                                                                                                                            In reply to user_145:
                                                                                                                                                              2013-03-24 07:00:54.035Z

                                                                                                                                                              F_ck you, it is not the solution.

                                                                                                                                                              1. A3
                                                                                                                                                                In reply to user_145:
                                                                                                                                                                  2013-03-24 09:48:30.900Z

                                                                                                                                                                  I'm guessing you could also play with javascript to detect comments thst have had screentime. That way you might even get rid of voting - just prefer comments that have more screentime.

                                                                                                                                                                  And as another commenter said, you also need to solve the problem with one malicious downvote.

                                                                                                                                                                  1. UKajMagnus @user_145
                                                                                                                                                                      2013-03-25 11:33:34.002Z

                                                                                                                                                                      Measuring the screen time is a really interesting idea. Or rather, measuring how many people read a comment to the end, versus how many people only start reading it. But really hard to implement reliably? Except for mobile phones with a small viewport. Or if there was eye-tracking available :-)
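A minimal sketch of the "read to the end versus only started reading" idea above. How the two counts would actually be measured is left open, since that is the hard part the comment points out. The names `completion_rate` and `rank_by_completion`, and the `(started, finished)` input format, are hypothetical and only for illustration.

```python
def completion_rate(started: int, finished: int) -> float:
    """Fraction of readers who started a comment and also reached its end."""
    if started == 0:
        # Nobody has started reading yet, so there is nothing to measure.
        return 0.0
    return finished / started


def rank_by_completion(stats: dict) -> list:
    """Return comment ids, highest completion rate first.

    stats maps a comment id to a (started, finished) pair of reader counts.
    """
    return sorted(
        stats,
        key=lambda comment_id: completion_rate(*stats[comment_id]),
        reverse=True,
    )
```

For example, a comment that 10 people started and 8 finished would rank above one that 100 people started and only 20 finished.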

                                                                                                                                                                    • A3
                                                                                                                                                                      In reply to user_145:
                                                                                                                                                                        2013-03-24 02:17:50.611Z

                                                                                                                                                                        I think adding a random factor would allow early comments to jump to the top occasionally to give them the opportunity get upvoted.
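A rough sketch of how such a random factor might be combined with the upvotes-per-read score from the original post. This is only one possible reading of the suggestion; the `Comment` class, `jittered_order`, and the `jitter` parameter are hypothetical names, not part of any existing system.

```python
import random
from dataclasses import dataclass


@dataclass
class Comment:
    id: str
    upvotes: int
    reads: int


def base_score(c: Comment) -> float:
    # Upvotes per read; the +1 avoids division by zero for unread comments.
    return c.upvotes / (c.reads + 1)


def jittered_order(comments, jitter=0.1, rng=None):
    """Sort by base score plus a random bonus between 0 and `jitter`.

    A larger `jitter` lets low-scored comments reach the top more often.
    """
    rng = rng or random.Random()
    scored = [(base_score(c) + rng.uniform(0, jitter), c) for c in comments]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored]
```

With `jitter=0` this reduces to plain sorting by upvotes per read, so the amount of randomness can be tuned from none upwards.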

                                                                                                                                                                        1. A3
                                                                                                                                                                          In reply to user_145:
                                                                                                                                                                            2013-03-25 13:58:14.151Z

                                                                                                                                                                            i didn't even know there was a horizontal scrollbar! lolz

                                                                                                                                                                            1. U
                                                                                                                                                                              In reply to user_145:
                                                                                                                                                                              Deepank @user_163
                                                                                                                                                                                2013-03-24 03:16:22.984Z

                                                                                                                                                                                I like the tree structure of the comments. You could even explore more by using a bubbles structure for the comments and increasing size of the ones which need attention.

                                                                                                                                                                                1. A3
                                                                                                                                                                                  In reply to user_145:
                                                                                                                                                                                    2013-03-24 08:12:39.094Z

                                                                                                                                                                                    Too wide representation of comments. Super bad on mobile.

                                                                                                                                                                                    1. UKajMagnus @user_145
                                                                                                                                                                                        2013-03-25 11:26:46.258Z

                                                                                                                                                                                        Oops, my intention was that they fit inside the viewport. I'll have to test on my mobile phone again — I made them wider a while ago, because people were complaining that they became too narrow, when they were deeply indented.

                                                                                                                                                                                      • U
                                                                                                                                                                                        In reply to user_145:
                                                                                                                                                                                        Matt @user_164
                                                                                                                                                                                          2013-03-24 04:59:35.808Z

                                                                                                                                                                                          I have thought of this much myself. I hope that commenting and voting continues to improve the way we digest discussions.

                                                                                                                                                                                          1. UKajMagnus @user_145
                                                                                                                                                                                              2013-03-24 11:38:42.544Z

                                                                                                                                                                                              If you'd like to think about it for many more days:

                                                                                                                                                                                              At Hacker News, whathappenedto posted links to long articles about how search engines handle a similar problem. — Perhaps many search engine techniques are actually relevant when it comes to sorting comments on a discussion site — if one views the comment sorting problem as a search for the most interesting comments.

                                                                                                                                                                                            • M
                                                                                                                                                                                              In reply to user_145:
                                                                                                                                                                                                2013-03-24 05:49:06.072Z

                                                                                                                                                                                                Very smart solution. In fact, I hadn't recognized it was a problem until I read the title of your post... then it clicked immediately. Great idea & would definitely surface more interesting comments.

                                                                                                                                                                                                1. UKajMagnus @user_145
                                                                                                                                                                                                    2013-03-24 10:05:10.052Z

                                                                                                                                                                                                    Oh, lucky choice of title then :-) I was wondering if it was too long

                                                                                                                                                                                                    The problem discussed on this page might actually be a fairly recurring problem: People at Hacker News mentioned that search engines have a similar problem (that they've solved): they use link clicks (instead of upvotes) to estimate which search results are useful, but only the topmost search results tend to be clicked.

                                                                                                                                                                                                    Edit: Also see Gary Culliss' comment to the left.
