
Feature request: Allowing meta-information to topics, and then sorting by those

By @karu
    2023-07-18 12:56:58.883Z

    Hi, I have a feature request with a specific use case in mind, but it is probably generalizable and useful in other cases. I am building a forum that would allow people to talk about research papers. Each research paper would become a "topic". I would like to associate at least two properties with each paper/topic, such as year of publication and number of citations. Basically, I want a set of <key, value> pairs I can associate with a topic. Once these fields have been populated, I would like a view that sorts by a given key and shows all the topics in a category.

    I then want something like this: https://../top/category-name/key-name. I am planning to eventually host this on a server I am running. So I could help in adding this feature to the source, if people have some guidance on the easiest way to implement this.


    1. probably generalizable and useful in other cases

      Yes I can think of other use cases. In fact, I wrote (a lot) about that a while ago: Adding Tags to Talkyard. If you scroll down, you'll find a design sketch showing how key-values, or tags-with-values, could look. And just below that image, there's this text: "Tags have a name (label) ... and, values?" with thoughts about other use cases for key-values / tags-with-values.

      I then want something like this: https://../top/category-name/key-name

      That would find pages in category Category-Name, that have the property Key-Name?

      And sort by key value, e.g. sort by number of citations, if the key is Num-Citations, or by, say, publication date? (I'd think that'd be better expressed in a query param, e.g. ?sortBy=Publication-Year,-Num-Citations if you'd like to sort by two keys)
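To make the query-param idea concrete, here is a minimal Python sketch (not Talkyard code). It assumes a hypothetical sortBy parameter where a leading "-" means descending order, and hypothetical topic dicts with a "props" field. The URL, property names and sample papers are made up for illustration.

```python
from urllib.parse import urlparse, parse_qs


def parse_sort_by(url):
    """Return [(key, descending), ...] from a hypothetical ?sortBy=A,-B param."""
    query = parse_qs(urlparse(url).query)
    raw = query.get("sortBy", [""])[0]
    keys = []
    for part in raw.split(","):
        part = part.strip()
        if not part:
            continue
        descending = part.startswith("-")
        keys.append((part[1:] if descending else part, descending))
    return keys


def sort_topics(topics, sort_keys):
    """Sort by several keys: apply stable sorts from the last key to the first."""
    result = list(topics)
    for key, descending in reversed(sort_keys):
        result.sort(key=lambda t: t["props"][key], reverse=descending)
    return result


# Made-up sample data.
topics = [
    {"title": "Paper A", "props": {"Publication-Year": 1999, "Num-Citations": 120}},
    {"title": "Paper B", "props": {"Publication-Year": 1995, "Num-Citations": 40}},
    {"title": "Paper C", "props": {"Publication-Year": 1999, "Num-Citations": 300}},
]

url = "https://forum.example.com/top/papers?sortBy=Publication-Year,-Num-Citations"
sort_keys = parse_sort_by(url)
for topic in sort_topics(topics, sort_keys):
    print(topic["title"])  # Paper B, Paper C, Paper A
```

Papers are ordered by year first, and papers from the same year are ordered by citation count, highest first.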

      allow people to talk about research papers

      That sounds nice :- )   Can I ask, what motivates you to do that? Do you have in mind to host it yourself (as a single person "hobby project") or host it at the university somehow? (If you want to reply privately, you could send a PM, click my name.)

      What are the research papers about, b.t.w.? Any specific field?

      So I could help in adding this feature to the source, if people have some guidance

      I think this is not a good first thing to contribute to (instead, it requires pretty deep knowledge about the software, or so I think). But hearing about how you would want key-values to work, and maybe UI design ideas, would be helpful.
      (If nevertheless you would want to try, I'd suggest starting initially with some smaller tasks instead.)

      1. @karu
          2023-07-19 19:50:04.875Z

          Hi Kaj(?), The goal is to host it at the university level, and it would serve as a watering hole for the computer architecture research area (SIGARCH) (https://www.sigarch.org/).
          I am a fairly well-known professor in that field. We would self-host it to start with, simply because it's a pain to pay per user and get the right funding lined up, etc.
          To begin with, we are thinking we will populate it with the top 100 most influential papers from the conference in the past 25 years, and use Talkyard as the framework for discussion, online chat, etc. centered around these.

          Depending on the traction, it's possible other research fields would also want to clone this, and it can take on a life of its own.

          Hope this helps.
          -- Karu

          1. Ok, (Kaj or Magnus :- )) I like that project.

            Hmm I think it'll take less time for you, to add those 100 papers, than for <key, value> meta information to appear in Talkyard.

            I wonder what you think about starting without <key, value> support.

            One idea could be to manually create two sorted lists of those 100 papers (one sorted by year, another by citations?), and to post those lists on two pages in the forum. They could be linked from the forum intro text (the text appearing just below the forum title) so people can find them.

            And thereafter see how things go (do the discussions seem to get some traction?).

            Tagging the papers, and listing papers by tag, that seems more urgent to me. I guess that's something to prioritize (?)

        • In reply to karu:

          I've thought a bit about this, and I might be wrong but I think it should be doable pretty soon to add these types of properties, or tags with values, together with adding API support for inserting tags & listing tags. And then, in a 2nd step later, sorting by property/tag values.

          I'm adding database columns for remembering any tag values in the next version. Here's how the new columns for storing property/tag values look right now (if you have any feedback?):

          alter table tags_t
              add column  val_type_c   i16_gz_d,
              add column  val_i32_c    i32_d,
              add column  val_f64_c    f64_d,
              add column  val_str_c    text_nonempty_ste250_trimmed_d,
              add column  val_url_c    http_url_ste_250_d,
              add column  val_jsonb_c  jsonb_ste1000_d,
              add constraint  tags_c_valtype_has_val  check (
                  (val_type_c is null)
                    = (num_nonnulls(val_i32_c, val_f64_c, val_str_c, val_url_c, val_jsonb_c) = 0)),
              add constraint  tags_c_max_one_val_for_now  check (
                  num_nonnulls(val_i32_c, val_f64_c, val_str_c, val_url_c, val_jsonb_c) <= 1),
              add constraint  tags_c_val_is_i32_or_txt_for_now  check (
                  num_nonnulls(val_url_c, val_jsonb_c) = 0),
              add constraint  tags_c_valtype_simple_1_for_now  check (
                  (val_type_c = 1) or (val_type_c is null));
          
          
          comment on column  tags_t.val_type_c  is $_$
          
          1 (one) means it's a "simple" value, meaning, it's just what's stored:
          if val_i32_c is not null, the value is an integer, if val_f64_c is
          not null, it's a decimal value and so on.
          
          Later there might be more complex values, e.g. val_f64_c might be
          used to instead store a date (Unix time), or val_f64_c and val_i32_c
          to store the start and duration (seconds) of an event.
          
          The url in val_url_c can be combined with any other value, and makes
          it a link?  However, disabled for now.
          
          Or if val_type_c is html, then val_str_c would be interpreted as
          unsanitized unsafe html. But if val_type_c is simple (1), then it's plain text.
          
          Future plugins can use jsonb to store whatever they want.
          Could allow jsonb together with other values too? Could display
          a '{}' after any numeric or text value, to indicate that there's json.
          
          For now, urls and jsonb aren't allowed — only numbers and plain text.
          $_$; -- '
          

          Any thoughts, things to do differently?
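One way to read the four check constraints in the schema above is as a plain function. The Python sketch below mirrors them one by one. It is an illustration only: it ignores the custom domain types (i16_gz_d and so on), and the function and parameter names are made up.

```python
def num_nonnulls(*values):
    """Count arguments that are not None, like PostgreSQL's num_nonnulls()."""
    return sum(v is not None for v in values)


def check_tag_value(val_type=None, val_i32=None, val_f64=None,
                    val_str=None, val_url=None, val_jsonb=None):
    """Return the names of the constraints that this row would violate."""
    violated = []
    n = num_nonnulls(val_i32, val_f64, val_str, val_url, val_jsonb)
    # A value type is set if and only if there is some value.
    if (val_type is None) != (n == 0):
        violated.append("tags_c_valtype_has_val")
    # At most one value column may be set, for now.
    if n > 1:
        violated.append("tags_c_max_one_val_for_now")
    # URLs and jsonb are disabled, for now.
    if num_nonnulls(val_url, val_jsonb) != 0:
        violated.append("tags_c_val_is_i32_or_txt_for_now")
    # Only the "simple" value type (1) is allowed, for now.
    if not (val_type == 1 or val_type is None):
        violated.append("tags_c_valtype_simple_1_for_now")
    return violated


print(check_tag_value())                              # []  (a tag without a value)
print(check_tag_value(val_type=1, val_i32=1990))      # []  (e.g. Published: 1990)
print(check_tag_value(val_i32=1990))                  # ['tags_c_valtype_has_val']
print(check_tag_value(val_type=1, val_i32=1, val_str="x"))
# ['tags_c_max_one_val_for_now']
```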

          1. @karu
              2023-07-24 15:23:21.533Z

              Just a quick comment Magnus. I'll look this over in the next few hours here and get back to you. Thanks for thinking this through and offering the opportunity to comment.

              1. In reply to KajMagnus:
                @karu
                  2023-07-25 14:04:35.245Z

                  This schema looks good. And with the URL and JSON columns, it's designed with a substantial amount of future support. BTW, are you thinking tags and properties will become the same thing?

                  A tag that associates a string (like a hashtag) would have its value be a string (and the key would be some numeric or internal id).
                  A tag like number of citations for a paper would be something like: name of tag is citation; and the value is the number of citations?

                  I am not completely understanding where the "name" of the tag itself is stored.

                  I am not sure my comments are helping. I think the design you have looks good overall.

                  1. KajMagnus @KajMagnus 2023-07-26 04:12:58.615Z 2023-07-26 04:36:31.508Z

                    where the "name" of the tag itself is stored

                    Oh, I should have mentioned, there is already a tags_t table, and each row points to a tag type in a tag_types_t table. The tag type stores the tag name (and how the tag looks, e.g. colors, although most of that hasn't been implemented).

                    substantial amount of future support

                    Yes I think so (or hope so :- )). That's helpful to know (that at least one other person thought it seemed like that as well).

                    are you thinking tags and properties will become the same thing?

                    I think there will be both, and that they'll be different. Tags have tag types, which tell Talkyard how to render a tag, and which hold access control settings (who may use or see tags of a specific type; some tags might be for moderators only), descriptions of the tags, and other things.

                    Whilst I imagine custom properties of a post, would be maybe just a jsonb value. Talkyard wouldn't know what the stuff therein means. If there's a { publishedYear: 1990, somethingElse: "abcd" } property on a page, then, what's Talkyard going to do with that? Rendering those key-values like two tags, would not always be the right thing to do. — I suspect that custom properties are instead more useful to future plugins. Then they decide how to interpret the properties. Whilst Talkyard natively understands what tags are and how & when to show them.

                    A tag like number of citations for a paper would be something like: name of tag is citation; and the value is the number of citations?

                    Yes

                    A tag that associates a string (like a hashtag) would have its value be a string (and the key would be some numeric or internal id)

                    Hmm not sure I understand. I think that would just be tags without values, maybe with their name like: #tag-name (the name stored in tag_types_t). Maybe this was unclear since I didn't mention that a tag name is stored in tag_types_t (?)
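A rough Python sketch of the relationship described above: each tag row points to a tag type, and the tag type holds the name. The class and field names are hypothetical stand-ins, not the real columns in tags_t or tag_types_t, and the rendering format is only a guess at how a tag might be displayed.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class TagType:
    """Stand-in for a row in tag_types_t: holds the tag's name."""
    id: int
    name: str


@dataclass
class Tag:
    """Stand-in for a row in tags_t: points to a tag type, may carry a value."""
    tag_type_id: int
    value: Optional[Union[int, float, str]] = None


def render_tag(tag, tag_types):
    """Look up the name via the tag type; show the value if there is one."""
    name = tag_types[tag.tag_type_id].name
    if tag.value is None:
        return f"#{name}"
    return f"{name}: {tag.value}"


tag_types = {1: TagType(1, "comp-sci"), 2: TagType(2, "Citations")}

print(render_tag(Tag(tag_type_id=1), tag_types))             # #comp-sci
print(render_tag(Tag(tag_type_id=2, value=250), tag_types))  # Citations: 250
```

So a hashtag-like tag is simply a tag without a value, and "number of citations" is a tag whose type is named Citations and whose value is the count.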

                    1. @karu
                        2023-07-28 17:29:15.664Z

                        Hi Magnus, We are thinking of a small deployment in a couple of weeks. Do you think this feature could be done in that timeline? Or should we go ahead and deploy with what's already there?

                        1. Hi Karu, I think this'll have been mostly implemented by then. Right now I'm adding tag values (e.g. Published: 1990) and listing posts with a certain tag (e.g. Comp-Sci or Citations). I suspect, though, that sorting by tag value, won't work yet, by then.

                          If I were you, I guess I'd wait two weeks. Doesn't matter much what you do (the software auto upgrades itself), but maybe it's more fun to install, once most things are in place and you can start using the API to add tags & values.

                  2. In reply to karu:
                    KajMagnus @KajMagnus 2023-07-30 08:07:53.060Z 2023-07-30 08:14:13.759Z

                    Once these fields have been populated, I would like a view that sorts by a given key and shows all the topics in a category.

                    Could you give some concrete examples of how you would use this?

                    One example maybe could be: In the Computer Science category, I want to see all papers sorted by publication year? Another could be: Show me all Comp Sci papers sorted by number of citations?

                    Or am I misunderstanding something :- )   What did you have in mind?

                    What about queries like: Show all Comp Sci papers from the 1990's, sorted by number of citations?


                    In this: (from the orig post)

                    https://../top/category-name/key-name
                    

                    Could you give some examples of what key-name could be? (And category-name too :- ))

                    1. In reply to karu:

                      I've made some progress with this. Now it is (in the development version on my laptop) possible to add values to tags, and list tagged topics and filter by value.

                      For example, this search query: tags:published>=1990,citations>=2,comp-sci some text searches for the text "some text", but only in papers that were published in 1990 or later, have at least two citations, and have been tagged with comp-sci.
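To spell out how such a query might be read, here is a small Python sketch. It is not how Talkyard or ElasticSearch evaluates the query. It assumes pages are plain dicts with a "tags" mapping (tag name to value, or None for a tag without a value), and it uses a simple substring match in place of real full-text search. All names and sample pages are made up.

```python
import operator
import re

OPS = {">=": operator.ge, "<=": operator.le, ">": operator.gt,
       "<": operator.lt, "=": operator.eq}
# A tag name, a comparison operator, then a number, e.g. published>=1990
CONDITION = re.compile(r"([\w-]+)(>=|<=|>|<|=)(-?\d+(?:\.\d+)?)")


def parse_query(query):
    """Split a query into value conditions, required tags, and free text."""
    conditions, required_tags, words = [], [], []
    for token in query.split():
        if token.startswith("tags:"):
            for part in token[len("tags:"):].split(","):
                m = CONDITION.fullmatch(part)
                if m:
                    conditions.append((m.group(1), m.group(2), float(m.group(3))))
                elif part:
                    required_tags.append(part)
        else:
            words.append(token)
    return conditions, required_tags, " ".join(words)


def matches(page, conditions, required_tags, text):
    tags = page["tags"]
    for name, op, limit in conditions:
        value = tags.get(name)
        if value is None or not OPS[op](value, limit):
            return False
    if any(name not in tags for name in required_tags):
        return False
    return text.lower() in page["body"].lower()


pages = [
    {"title": "Paper A", "body": "Some text about caches",
     "tags": {"published": 1995, "citations": 10, "comp-sci": None}},
    {"title": "Paper B", "body": "Some text about pipelines",
     "tags": {"published": 1985, "citations": 50, "comp-sci": None}},
    {"title": "Paper C", "body": "Some text about prefetching",
     "tags": {"published": 2001, "citations": 1, "comp-sci": None}},
    {"title": "Paper D", "body": "Some text",
     "tags": {"published": 2001, "citations": 5}},
]

parsed = parse_query("tags:published>=1990,citations>=2,comp-sci some text")
hits = [p["title"] for p in pages if matches(p, *parsed)]
print(hits)  # ['Paper A']
```

Paper B is too old, Paper C has too few citations, and Paper D lacks the comp-sci tag, so only Paper A is found.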

                      Later, you'll be able to sort, maybe something like this:

                      tag:published>1990,citations>=1  sort:tagval:published:asc
                      

                      So, sort by "tagval" i.e. tag value, of the tag named published, in ascending order.

                      Using ElasticSearch for all this :- )   (PostgreSQL is nice but doesn't have all advanced search features needed for this)

                      1. @karu
                          2023-08-09 16:16:41.904Z

                          Nice - sorry, I was travelling and missed the previous one. This progress looks great. We are spending time this week and next week populating the test website more.

                          1. Ok :- )   no hurry, the API will change, to support tags with values. Now I just found this nice (or so I think) syntax: tags:Published-Year:desc<=1990,... — both sort order and upper limit, in the same brief expression. So that (or something like that) will work as well. (That example shows papers starting from 1990 and backwards: 1989, 1988 ... while :asc>=1990 would show papers from 1990, 1991, ...)
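The combined "sort order plus limit" syntax can be illustrated with a short Python sketch. This is only an approximation of the idea, not the real parser: the regular expression, the function names and the sample papers are assumptions made for the example, and it handles a single tag term with integer values only.

```python
import operator
import re

COMPARE = {">=": operator.ge, "<=": operator.le,
           ">": operator.gt, "<": operator.lt}
# tag name, optional :asc or :desc, optional comparison with an integer
TERM = re.compile(r"([\w-]+)(?::(asc|desc))?(?:(>=|<=|>|<)(-?\d+))?")


def parse_term(term):
    """Parse e.g. 'Published-Year:desc<=1990' into its four parts."""
    m = TERM.fullmatch(term)
    if m is None:
        raise ValueError(f"cannot parse tag term: {term!r}")
    name, direction, op, limit = m.groups()
    return name, direction, op, (None if limit is None else int(limit))


def search(term, pages):
    """Keep pages that have the tag, apply the limit, then sort if asked to."""
    name, direction, op, limit = parse_term(term)
    hits = [p for p in pages if p.get(name) is not None]
    if op is not None:
        hits = [p for p in hits if COMPARE[op](p[name], limit)]
    if direction is not None:
        hits.sort(key=lambda p: p[name], reverse=(direction == "desc"))
    return hits


papers = [{"Published-Year": y} for y in (1991, 1988, 1990, 1992, 1989)]

older = [p["Published-Year"] for p in search("Published-Year:desc<=1990", papers)]
newer = [p["Published-Year"] for p in search("Published-Year:asc>=1990", papers)]
print(older)  # [1990, 1989, 1988]
print(newer)  # [1990, 1991, 1992]
```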

                        • In reply to karu:

                          Ok now everything works, in my work-in-progress branch, including creating pages-with-tags-and-values via the API. ... Now it's time to write some automatic tests and do code review, which will take about a week. I'll publish some API docs too, maybe tomorrow, if you have any feedback.

                          I suspect that Talkyard will be the first discussion software in the world that supports tags with values :- )

                          Almost no other software does. One of the few that does is Kubernetes, with its label key-value pairs (that's not discussion software, though). But those are text values only: one can't sort in numerical order or do >=, <= etc. comparisons.
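A tiny Python illustration of why text-only values are limiting: when numbers are stored as strings, sorting and comparisons go character by character, which gives the wrong order for things like citation counts. The sample values are made up.

```python
citations_as_text = ["9", "120", "35"]

text_order = sorted(citations_as_text)              # compares character by character
numeric_order = sorted(citations_as_text, key=int)  # compares as numbers

print(text_order)     # ['120', '35', '9']
print(numeric_order)  # ['9', '35', '120']

# The same problem affects comparisons such as >= :
print("9" >= "120")   # True  (as text)
print(9 >= 120)       # False (as numbers)
```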

                          1. KajMagnus @KajMagnus 2023-09-15 13:09:17.424Z 2023-10-12 06:54:34.840Z

                            Now writing automatic tests. Things took longer, because of some stuff I discovered during code review.

                            Edit: Plus there were more things to fix, related to ElasticSearch mappings and reindexing. (Now done, October 12)

                          2. In reply to karu:

                            I'm going to add a Priority: NNN tag-with-value to this forum, so I can sort by priority, and remember what to do next. Just recently there was something I had forgotten for about 2 years :- /

                            1. In reply to karu:

                              Now this has finally been implemented, & tested etc. Building a new server, to deploy to Prod in a few days, barring any bugs.

                              1. In reply to karu:

                                Now this has been implemented (since the end of October). If you search for: tags:priority:desc>3 you'll find topics tagged with "priority" whose priority is greater than 3, in descending order. This would also work for things like tags:publish-year:desc<1995,citations>=2, which finds papers published before 1995 with at least two citations, newest first.

                                ( @karu what did you end up doing? I'm thinking / guessing that in the end you went with some other software, since this took long to implement? (Properties-values / tags-with-values, was in the things-planned-to-do list in any case))

                                ***

                                It'd be nice if the search results page would show tags & values too. Right now it shows just the topic titles and the text that got "hit". And something like: is:open is needed, if one wants to ignore topics that have been closed / done.

                                1. Progress with doing this idea
                                2. @KajMagnus marked this topic as Planned 2023-07-19 19:42:59.719Z.
                                3. @KajMagnus marked this topic as Started 2023-08-08 05:20:09.094Z.
                                4. @KajMagnus marked this topic as Done 2023-12-25 05:41:05.761Z.