Improve data quality of Nearby List #271

Closed
tobias47n9e opened this issue Sep 12, 2016 · 3 comments

Comments

@tobias47n9e
Member

I was testing the improvements made in #250 today using GPS spoofing. I was quite surprised how many items have question marks as icons. But the problem seems to be the data we use:

https://tools.wmflabs.org/wiki-needs-pictures/data/data.csv

The list classifies 3600 things as "adm3rd", which is probably an artefact of some vandalism. 150 000 items are undefined. That is 75 % of the 200 000 items in the list.

Another issue is that the list gives a string for the target item of the p31 statement. I think it would be better to use the Q-ID both in the data and in the program code, rather than comparing against the English label, which could change over time.

Probably the easiest way of solving this would be to switch to a query.wikidata.org request. The response includes the p31 statement along with some other information, and the request can take the locale of the user so that the right labels are fetched. With the p31 target we can then assign the icons like this:

https://github.com/misaochan/apps-android-commons/blob/4b01f6e95f79cc507dba1ea0ff8335eef9b11521/app/src/main/java/fr/free/nrw/commons/nearby/NearbyListFragment.java#L192

// Both Q-IDs below are assigned the same river icon.
switch (place.qid) {
    case "Q4022":
        icon.setImageResource(R.drawable.icon_river);
        break;
    case "Q355304":
        icon.setImageResource(R.drawable.icon_river);
        break;
}
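As a rough sketch of what the query.wikidata.org request mentioned above could look like, the following builds a SPARQL query for items around a coordinate, together with their p31 target and a label in the user's language. The class name `NearbyQuery`, its method names, and the chosen result variables are assumptions made for illustration; none of this is taken from the app's code. It relies on the `wikibase:around` and `wikibase:label` services of the Wikidata Query Service.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Locale;

// Hypothetical helper, not part of the app: builds a Wikidata Query Service URL.
final class NearbyQuery {
    static final String ENDPOINT = "https://query.wikidata.org/sparql";

    // Returns a SPARQL query for items within radiusKm of (lat, lon).
    // ?class holds the P31 target as an entity URI, ?itemLabel the localized label.
    static String buildSparql(double lat, double lon, double radiusKm, String language) {
        return String.format(Locale.ROOT,
              "SELECT ?item ?itemLabel ?class ?location WHERE {\n"
            + "  SERVICE wikibase:around {\n"
            + "    ?item wdt:P625 ?location .\n"
            // WKT points are written as longitude first, then latitude.
            + "    bd:serviceParam wikibase:center \"Point(%f %f)\"^^geo:wktLiteral .\n"
            + "    bd:serviceParam wikibase:radius \"%f\" .\n"
            + "  }\n"
            + "  OPTIONAL { ?item wdt:P31 ?class . }\n"
            + "  SERVICE wikibase:label { bd:serviceParam wikibase:language \"%s\" . }\n"
            + "}",
            lon, lat, radiusKm, language);
    }

    // Returns the full GET URL asking for JSON results.
    static String buildUrl(double lat, double lon, double radiusKm, String language) {
        try {
            String query = buildSparql(lat, lon, radiusKm, language);
            return ENDPOINT + "?format=json&query=" + URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available, so this should not happen in practice.
            throw new IllegalStateException(e);
        }
    }
}
```

For example, `NearbyQuery.buildUrl(54.5, 18.5, 1.0, "pl")` would ask for items within 1 km of that point with Polish labels. The `?class` value comes back as an entity URI, so the Q-ID would still have to be taken from the end of that URI before it is used in the switch.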

Items:

Assigning the symbols is not trivial, but for a start we can just collect the most used items in p31 statements.
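Collecting the most used p31 targets could be done with a simple tally. The sketch below assumes the Q-IDs have already been extracted into a list of strings; the class name `P31Tally` and the method `mostUsed` are hypothetical and only meant to illustrate the idea.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: counts how often each Q-ID occurs as a P31 target.
final class P31Tally {
    // Returns up to `limit` entries (Q-ID, count), most frequent first.
    // Ties are broken by Q-ID so the order is deterministic.
    static List<Map.Entry<String, Integer>> mostUsed(List<String> qids, int limit) {
        Map<String, Integer> counts = new HashMap<>();
        for (String qid : qids) {
            if (qid == null || qid.isEmpty()) {
                continue; // items without a P31 statement are skipped
            }
            Integer old = counts.get(qid);
            counts.put(qid, old == null ? 1 : old + 1);
        }
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(counts.entrySet());
        Collections.sort(sorted, (a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return new ArrayList<>(sorted.subList(0, Math.min(limit, sorted.size())));
    }
}
```

The Q-IDs at the top of such a list would be the first candidates for getting their own icon; everything else could keep the question mark for now.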

@Eccenux

Eccenux commented Sep 12, 2016

Just checked things around Gdynia and those nearby places don't make any sense. There is neither a name nor an address in the application, just those Q1234 ids, which had no meaning to me until I read this report.

There is surely structured data, and even a name, that could be provided. E.g. there is a wiki article here:
https://m.wikidata.org/wiki/Q9294048
And here:
https://m.wikidata.org/wiki/Q10929372

For both, the app shows no meaningful data (just the id and a link to the map).

BTW, both linked articles already have pictures of the subject.

@misaochan
Member

Thanks for the feedback, guys. I agree, we will need to migrate to a different data source, especially due to the prevalence of QID items. However, there already seems to be a discussion about this at #260. Do you think this is a separate matter, or can the two be merged for clarity?

@tobias47n9e
Member Author

I guess it is two issues with the same underlying cause. I am okay with closing this one in favour of the other issue.
