Dear all,
thanks for all your help with answering questions and giving feedback over the last couple of months. I'm happy to say that we're finally at a stage where we've hashed 22,452,638 images from Wikimedia Commons and launched Elog.io in public beta: http://elog.io/
Elog.io is an open API as well as a set of browser plugins that can query and retrieve information about images using a perceptual hash that's quick and easy to calculate in a browser.
What the browser extensions allow you to do is match an image you find "in the wild" against Wikimedia Commons. If it can be matched against an image from Commons, the extension will show you the title, author, and license, and give you links back to Wikimedia and the license, as well as a quick and handy "Copy as HTML" that copies the image and attribution as an HTML snippet for pasting into Word, LibreOffice, WordPress, etc.
Our API provides lookup functions to find information using a URL (the Commons' page name URL) or using the perceptual hash. You get information back as JSON in the W3C Media Annotations format. Of course, the information you get back is no better than what the Commons API provides, so if you already have a page name URL, you may as well query it directly and rely on our API only for searching by perceptual hashes.
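As a minimal sketch of what a URI lookup could look like from Python: the endpoint and parameter name below follow the Apiary documentation linked later in this thread, so treat them as assumptions rather than a guaranteed-stable interface.

import json
import urllib.parse
import urllib.request

# Assumed endpoint, per the Apiary docs referenced in this thread.
CATALOG_LOOKUP = "https://catalog.elog.io/lookup/uri"

def lookup_by_uri(commons_page_url):
    """Ask the catalog what it knows about a Commons page URL."""
    query = urllib.parse.urlencode({"uri": commons_page_url})
    with urllib.request.urlopen(CATALOG_LOOKUP + "?" + query) as response:
        # The body is JSON in W3C Media Annotations format.
        return json.load(response)

# Hypothetical example file:
# print(lookup_by_uri("https://commons.wikimedia.org/wiki/File:Example.jpg"))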
The algorithm we use for calculating perceptual hashes, which you'll need in order to query our API, is described at http://blockhash.io/
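To give a rough feel for the idea, here is a much-simplified block-hash sketch in Python using Pillow: downscale the image to a small grid and set one bit per block depending on whether it is brighter than the median. This is not the reference Blockhash implementation, so its output will not match the hashes in our catalog; use the code at blockhash.io for real queries.

from statistics import median
from PIL import Image  # pip install Pillow

def simple_blockhash(path, grid=16):
    """Much-simplified block-mean hash; illustrative only."""
    # Downscale to a grid x grid grayscale image; each pixel stands in
    # for the average brightness of one block of the original image.
    img = Image.open(path).convert("L").resize((grid, grid))
    values = list(img.getdata())
    threshold = median(values)
    # One bit per block: 1 if the block is brighter than the median.
    bits = "".join("1" if v > threshold else "0" for v in values)
    return "%0*x" % (grid * grid // 4, int(bits, 2))

# print(simple_blockhash("example.jpg"))  # hypothetical local file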
Sincerely, Jonas
Wow, what a nice and interesting browser extension. Congrats!
Just a question: as far as I can see, the tool doesn't give the complete and correct licensing information, as the source is missing. Or am I mistaken?
Best Cornelius
Dear Jonas,
On 2014-12-10, Jonas Öberg wrote:
thanks for all your help with answering questions and giving feedback over the last couple of months. I'm happy to say that we're finally at a stage where we've hashed 22,452,638 images from Wikimedia Commons and launched Elog.io in public beta: http://elog.io/
This is awesome :) I’m already a keen user!
I do have a question. Typically, people are really happy when an image they uploaded is reused somewhere on the Internet (or angry, depending on the quality of the attribution ;-).
We probably cannot scan the Internetz and run every single image out there against elog.io; but do you by any chance log the successful hits in elog.io? It would be awesome if we could retrieve the information that elog.io identified File:X at http://example.com/
Our API provides lookup functions to find information using a URL (the Commons' page name URL)
I must be doing something wrong, but I have not succeeded in using this :-/ Is this the URL to query, per [1]: <https://catalog.elog.io/lookup/uri>? Could you give a working example?
[1] http://docs.cmcatalog.apiary.io/#get-%2Flookup%2Furi%7B%3Furi%2Ccontext%2Cpa...
Hey Jonas,
You may have missed my email from December; I would still be interested in the answer though. Any chance you might have a look? :)
Cheers,
Hi Jean-Frédéric,
yes - you're right, I missed that. So sorry!!
We probably cannot scan the Internetz and run every single image out there against elog.io
No; for that purpose, I think Google Reverse Image Search or TinEye does a fair enough job. It involves a bit of manual digging, but scanning the Internet is indeed something that requires quite a bit more resources than we could muster :-)
but do you by any chance log the successful hits in elog.io ?
Yes, we do. At the moment we don't do anything with it, though. What we would love to do is take the information we gather - essentially pairs of images - and run it through PyBossa or a similar crowdsourcing platform, so that people can contribute to matching images by answering the question "Are these two images the same?"
If we get more than a few people saying that they are, we can make that connection in the Elog.io database, which could not only trigger some action on "re-use" but would also make the information about that image pop up automatically when someone browses it, without needing to separately click the "Query" button.
We do not want to add these matches directly without some validation since that would open up the possibility of abuse. But we think the crowdsourcing could work well, and that in itself opens up the possibility of using the same crowd to gather even more information.
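As a very rough sketch of that validation step (the vote threshold, names, and data structures below are placeholders, not how our pipeline actually works):

from collections import defaultdict

VOTE_THRESHOLD = 5  # "more than a few people" -- an assumed value

class MatchValidator:
    """Tally crowd answers to the "same image?" question."""

    def __init__(self):
        # (catalog_image, candidate_url) -> number of "yes" votes
        self.yes_votes = defaultdict(int)
        self.confirmed = set()

    def record_vote(self, catalog_image, candidate_url, same):
        """Record one answer; return True when the pair becomes confirmed."""
        key = (catalog_image, candidate_url)
        if same:
            self.yes_votes[key] += 1
        if self.yes_votes[key] >= VOTE_THRESHOLD and key not in self.confirmed:
            self.confirmed.add(key)
            # Here the real system would record the match in the catalog,
            # so the information pops up without clicking "Query".
            return True
        return False

# validator = MatchValidator()
# validator.record_vote("File:Example.jpg", "http://example.com/photo.jpg", same=True)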
Sincerely, Jonas
I don't know if this campaign has been posted to this list, but I think it's relevant: https://www.indiegogo.com/projects/elog-io-building-provenance-for-digital-w...
Aubrey