Skip to content

More powerful pin filtering of Nearby map via allowing users to modify SPARQL query #3410

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
misaochan opened this issue Feb 14, 2020 · 25 comments · Fixed by #4714
Closed

More powerful pin filtering of Nearby map via allowing users to modify SPARQL query #3410

misaochan opened this issue Feb 14, 2020 · 25 comments · Fixed by #4714

Comments

@misaochan
Copy link
Member

As requested by @VojtechDostal , power users familiar with SPARQL would find it useful to be able to partially modify the SPARQL query themselves to filter results. We can probably implement this in a similar manner to https://tools.wmflabs.org/wikishootme#lat=50.0720077&lng=14.424068199999999&zoom=15

@VojtechDostal
Copy link
Collaborator

Adding a screenshot from WikiShootMe. Thank you Josephine for starting this issue :)

sparql filter

@misaochan
Copy link
Member Author

My suggestions on how to implement this:

  • In the "filters" dropdown menu for "place type", we have an additional option: "Advanced options". This should probably stand out from the other place types through the use of different font/color/etc.
  • When the user selects that, they are brought to a dialog that lets them modify the SPARQL. As we are already looking at a map with an existing query rather than a blank slate, I would suggest that we display the default SPARQL query, rather than a blank text field like WikiShootMe.
  • The buttons should probably be "Close", "Reset" and "Apply". If the user chooses "Reset", all their changes are cleared and the default SPARQL is displayed again.
  • If the query is invalid, either we need to check this in the dialog step and inform the user that they need to change it, or display a toast that says "Invalid SPARQL, resetting to default" and do just that.

What do you think @neslihanturan @VojtechDostal @nicolas-raoul ?

@nicolas-raoul
Copy link
Member

nicolas-raoul commented May 4, 2021

Sounds perfect!
About the last step though: If a custom request fails or times out (50% of my SPARQL errors are mistyped variable names, which unfortunately result in hard-to-debug timeouts), I would suggest telling the user to fix or reset their query, so that they don't have to retype it again.
I don't think we have a way to check the validity of a query without actually executing it.

@neslihanturan
Copy link
Collaborator

I love the use case @misaochan . Now I am more clear about this issue to create a UI. I agree @nicolas-raoul for the last step. Also we can have reset to default or save my query buttons for further usages maybe? So that they can bring their query back instead of typing all over again.

@nicolas-raoul
Copy link
Member

nicolas-raoul commented May 4, 2021

Custom SPARQL is rather niche, so personally I don't really think a "save my query" feature would be worth the complexity&maintenance. People who write SPARQL queries should be able to save their queries in a memo app, a text file or even an email draft :-)

By the way, the advantage of letting users modify the whole query (rather than just a fraction like WikiShootMe) is that we can do much deeper customization, for instance here is how I would possibly use this feature:

  1. Exclude schools
  2. Search for labels in all of the languages I know, rather than just one language, before fallback
  3. Artificially increase radius in the areas where I have already taken all pictures
  4. Increase the meaning of what is considered as "destroyed"
  5. Sometimes misuse the "destroyed" feature to simply differentiate between two kinds of items. Example: Black for items that have an article only in the Cebuano Wikipedia
  6. Remove some stuff that I typically do not use, to make queries faster

This is all doable by editing the SPARQL query.

@misaochan
Copy link
Member Author

I agree that we probably don't need a "save my query" function for the time being, and also the full query should be editable.

About the last step though: If a custom request fails or times out (50% of my SPARQL errors are mistyped variable names, which unfortunately result in hard-to-debug timeouts), I would suggest telling the user to fix or reset their query, so that they don't have to retype it again.
I don't think we have a way to check the validity of a query without actually executing it.

Ah, when I said "invalid" I was actually thinking about the request failing - are they different? I agree with both of you, auto-resetting is probably a bad idea now that I think about it. :)

@neslihanturan
Copy link
Collaborator

Hi everyone, there are some points needs to be clarified appeared during my mockup attempts. Currently we use this query:

SELECT
     (SAMPLE(?location) as ?location)
     ?item
     (SAMPLE(COALESCE(?itemLabelPreferredLanguage, ?itemLabelAnyLanguage)) as ?label)
     (SAMPLE(COALESCE(?itemDescriptionPreferredLanguage, ?itemDescriptionAnyLanguage, "?")) as ?description)
     (SAMPLE(?classId) as ?class)
     (SAMPLE(COALESCE(?classLabelPreferredLanguage, ?classLabelAnyLanguage, "?")) as ?classLabel)
     (SAMPLE(COALESCE(?icon0, ?icon1)) as ?icon)
     ?wikipediaArticle
     ?commonsArticle
     (SAMPLE(?commonsCategory) as ?commonsCategory)
     (SAMPLE(?pic) as ?pic)
     (SAMPLE(?destroyed) as ?destroyed)
   WHERE {
     # Around given location...
     SERVICE wikibase:around {
       ?item wdt:P625 ?location.
       bd:serviceParam wikibase:center "Point(${LONG} ${LAT})"^^geo:wktLiteral.
       bd:serviceParam wikibase:radius "${RAD}" . # Radius in kilometers.
     }

     # Get the label in the preferred language of the user, or any other language if no label is available in that language.
     OPTIONAL {?item rdfs:label ?itemLabelPreferredLanguage. FILTER (lang(?itemLabelPreferredLanguage) = "${LANG}")}
     OPTIONAL {?item rdfs:label ?itemLabelAnyLanguage}

     # Get the description in the preferred language of the user, or any other language if no description is available in that language.
     OPTIONAL {?item schema:description ?itemDescriptionPreferredLanguage. FILTER (lang(?itemDescriptionPreferredLanguage) = "${LANG}")}
     OPTIONAL {?item schema:description ?itemDescriptionAnyLanguage }

     # Get Commons category (P373)
     OPTIONAL { ?item wdt:P373 ?commonsCategory. }

     # Get (P18)
     OPTIONAL { ?item wdt:P18 ?pic. }

     # Get (P576)
     OPTIONAL { ?item wdt:P576 ?destroyed. }

     # Get the class label in the preferred language of the user, or any other language if no label is available in that language.
     OPTIONAL {
       ?item p:P31/ps:P31 ?classId.
       OPTIONAL {?classId rdfs:label ?classLabelPreferredLanguage. FILTER (lang(?classLabelPreferredLanguage) = "${LANG}")}
       OPTIONAL {?classId rdfs:label ?classLabelAnyLanguage}

       OPTIONAL {
           ?wikipediaArticle   schema:about ?item ;
                               schema:isPartOf <https://${LANG}.wikipedia.org/> .
         }
       OPTIONAL {
           ?wikipediaArticle   schema:about ?item ;
                               schema:isPartOf <https://en.wikipedia.org/> .
           SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
         }

         OPTIONAL {
           ?commonsArticle   schema:about ?item ;
                               schema:isPartOf <https://commons.wikimedia.org/> .
           SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
         }
     }
   }
   GROUP BY ?item ?wikipediaArticle ?commonsArticle

Where location, label, description, class, classLabel, icon, wikipediaArticle, commonsArticle, commonsCategory, pic, destroyed are variables that we reach and use their values in NearbYResultItem.kt class. On the other hand ${LONG}, ${LAT}, ${RAD} and ${LANG} are the inputs that we send to the query. So:

1- We should notify the user about they can use these inputs (${LONG}, ${LAT}, ${RAD},${LANG}) in their query as well.
2- We should make some fields (i.e. location, label, description, class...) uneditable, from my understanding of SPARQL, we may let the WHERE block can be edited. If you confirm me I will continue to the mockup accordingly. What I mean by WHERE block is:

WHERE {
     # Around given location...
     SERVICE wikibase:around {
       ?item wdt:P625 ?location.
       bd:serviceParam wikibase:center "Point(${LONG} ${LAT})"^^geo:wktLiteral.
       bd:serviceParam wikibase:radius "${RAD}" . # Radius in kilometers.
     }

     # Get the label in the preferred language of the user, or any other language if no label is available in that language.
     OPTIONAL {?item rdfs:label ?itemLabelPreferredLanguage. FILTER (lang(?itemLabelPreferredLanguage) = "${LANG}")}
     OPTIONAL {?item rdfs:label ?itemLabelAnyLanguage}

     # Get the description in the preferred language of the user, or any other language if no description is available in that language.
     OPTIONAL {?item schema:description ?itemDescriptionPreferredLanguage. FILTER (lang(?itemDescriptionPreferredLanguage) = "${LANG}")}
     OPTIONAL {?item schema:description ?itemDescriptionAnyLanguage }

     # Get Commons category (P373)
     OPTIONAL { ?item wdt:P373 ?commonsCategory. }

     # Get (P18)
     OPTIONAL { ?item wdt:P18 ?pic. }

     # Get (P576)
     OPTIONAL { ?item wdt:P576 ?destroyed. }

     # Get the class label in the preferred language of the user, or any other language if no label is available in that language.
     OPTIONAL {
       ?item p:P31/ps:P31 ?classId.
       OPTIONAL {?classId rdfs:label ?classLabelPreferredLanguage. FILTER (lang(?classLabelPreferredLanguage) = "${LANG}")}
       OPTIONAL {?classId rdfs:label ?classLabelAnyLanguage}

       OPTIONAL {
           ?wikipediaArticle   schema:about ?item ;
                               schema:isPartOf <https://${LANG}.wikipedia.org/> .
         }
       OPTIONAL {
           ?wikipediaArticle   schema:about ?item ;
                               schema:isPartOf <https://en.wikipedia.org/> .
           SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
         }

         OPTIONAL {
           ?commonsArticle   schema:about ?item ;
                               schema:isPartOf <https://commons.wikimedia.org/> .
           SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
         }
     }
   }

@nicolas-raoul
Copy link
Member

For my use case number 2 above, I need to modify the COALESCE, which is outside of WHERE.
I am in favor of having just a big text area, and a tooltip that contains a "read more" link to the wiki (such tooltips with "read more" weblinks can be seen in Achievements). I can write that wiki page, it can even contain a collection of purpose-specific custom queries.
This feature is for power users who are knowledgeable in SPARQL.
So I would be in favor of putting on the user the responsibility of having valid SPARQL with all needed inputs/outputs.

@neslihanturan
Copy link
Collaborator

Hmm, we can let people edit COALESCE without editing "as ?label", "as ?description" parts. Them might be superusers when it comes to SPARQL but it is not only about the SPARQL, also about to know about the code in our side and we can not expect them to know the variable names that our code preferred.

@nicolas-raoul
Copy link
Member

We can list the variables in the tooltip or "read more".
Users start from the default query, it is not like they have a blank page.
In SPARQL queries, output variables are always at the same place, and people know to not modify the output variable names.
If they mess up, resetting to default is easy enough.

Having several text fields separated by as ?label/etc areas would make copy/pasting (from the wiki for instance) very cumbersome and error-prone.

@neslihanturan
Copy link
Collaborator

"Having several text fields separated by as ?label/etc areas would make copy/pasting (from the wiki for instance) very cumbersome and error-prone."
hmm very true, I didn't think about this user flow. So I will continue as you said with tooltips, thanks for the inputs @nicolas-raoul !

@neslihanturan
Copy link
Collaborator

Here is my suggestion, I am happy so hear your feedbacks.
attempt1

@nicolas-raoul
Copy link
Member

Looks good to me!

A few minor things:

  • Grey background implies that the text is not editable.
  • - If you are a super user you can edit nearby query below -> You can customize the Nearby query. If you get errors, reset and apply.
  • Put the (i) tooltip after that text, rather than after Advanced options.
  • I suggest this tooltip content:
    • You can customize the Nearby query to show things that interest you, show labels in your preferred languages, etc. The Nearby query is written in SPARQL, which is very powerful but also hard to debug. We suggest you start from the default query and modify things little by little. The good news is that you can always go back to the default: just reset and apply. (or shorter if that does not fit)
    • Read more linking to https://commons-app.github.io/docs.html#nearby-query

@misaochan
Copy link
Member Author

Here is my suggestion, I am happy so hear your feedbacks.

Looks good to me @neslihanturan . :) @nicolas-raoul 's changes sound good as well.

Where location, label, description, class, classLabel, icon, wikipediaArticle, commonsArticle, commonsCategory, pic, destroyed are variables that we reach and use their values in NearbYResultItem.kt class

I think that it might be beneficial to check that these variables all exist when the user taps "Apply" - if any of them don't exist, show an error dialog and do not proceed. Otherwise, for instance if a user removes commonsCategory, they could get a nice map with all the pins shown, but the app will probably crash at the final stage of a Nearby upload since we require a check for that variable. Given that the events are so far apart, they may not connect the crash to their SPARQL modification, and file a crash report that we will be scratching our heads over for months (especially if they don't mention in the crash report that they made a modification).

@nicolas-raoul
Copy link
Member

Checking that the request's String contains the String ?wikipediaArticle (etc) before the WHERE can be worth checking indeed, even if that does not guarantee that wikipediaArticle will actually be an output, even less that it is in the right format (for instance it could contain a number).

I would recommend against checking for the String as ?wikipediaArticle, because someone who reads only English might decide to make their SPARQL faster by removing the whole localization fallback mechanism, removing the need for SAMPLE [...] AS part.

@misaochan
Copy link
Member Author

Makes sense. I wonder if it's possible to make a note in ACRA if a modified query is being used in Nearby?

@neslihanturan
Copy link
Collaborator

Here are mockups after reviews, I hope all good:
attempt2

@nicolas-raoul
Copy link
Member

Looks perfect to me!

Maybe "Advanced" is enough, instead of "Advanced options"?

@misaochan
Copy link
Member Author

Looks great to me. :) I think "Advanced" would look nicer with the UI, but "Advanced Options" seems clearer. I'm OK with either one.

@neslihanturan
Copy link
Collaborator

I also think that Advanced Options is clearer, but I a okay whichever you wish.

@nicolas-raoul
Copy link
Member

Sorry for nitpicking, let's go with "Advanced Options" since you both think it is clearer :-)

@misaochan misaochan added this to the v3.2.0 beta milestone Nov 1, 2021
@ashishkumar468
Copy link
Collaborator

ashishkumar468 commented Nov 8, 2021

Hi @misaochan , question around this

  1. Will the radius expander we apply implicitly around the nearby query be applicable even with the advanced query?

@nicolas-raoul
Copy link
Member

Hi @ashishkumar468

I think most people will not modify the ${RAD} part, in which case no special code is needed.

But if we want to take care of those who modify it, I can see two cases that could be considered as special:

  • The custom SPARQL contains no ${RAD}. In that case, maybe just ignore (do not perform the radius string replacement) and proceed as usual. Maybe no source code change is even necessary depending on the replacement function that was used.
  • The radius expander while loop never ends. To handle that case, I would suggest adding a RADIUS_LOOP_MAX_ITERATIONS (I suggest setting it to 16 meaning 4900 kilometers which works even in the most remote places, I just tried) and throwing an error if the while loop reaches it.

@ashishkumar468
Copy link
Collaborator

Thanks for the input @nicolas-raoul , this should work : )

@misaochan
Copy link
Member Author

misaochan commented Nov 8, 2021

Agreed, handling those cases should be sufficient. :) IMO the basic premise of this feature is that it's only for advanced users, and anything the user changes in that text field can come with a potential of rendering Nearby unusable for them. (Which is why I think, as I mentioned at #3410 (comment) , that we should add a note to the ACRA crash report and Send Logs that a custom SPARQL query was used, so that it doesn't raise false alarms if such errors are reported.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants