Skip to content

Image EXIF/XMP metadata removal routine #2863

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 25 commits into from
Jun 4, 2019

Conversation

VitalyVPinchuk
Copy link
Contributor

@VitalyVPinchuk VitalyVPinchuk commented Apr 5, 2019

Description (required)
Updating pull request #1712 to recent app version.

Fixes #1712 [WIP] Implements of in-app EXIF and location anonymization
In response to #2671.

What changes did you make and why?
Uses @ilgazer's code of branch master.

Method redactExifTags() is called from FileProcessor.processFileCoordinates(), which removes selected EXIF tags from file being uploaded.

Tests performed (required)
Tested betaDebug on emulator with API level 28.
Instrumental tests are broken for me now.

Screenshots showing what changed (optional - for UI changes)

@codecov-io
Copy link

codecov-io commented Apr 5, 2019

Codecov Report

Merging #2863 into master will increase coverage by <.01%.
The diff coverage is 3.84%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #2863      +/-   ##
=========================================
+ Coverage    3.69%   3.69%   +<.01%     
=========================================
  Files         247     249       +2     
  Lines       12217   12268      +51     
  Branches     1083    1088       +5     
=========================================
+ Hits          451     453       +2     
- Misses      11732   11780      +48     
- Partials       34      35       +1
Impacted Files Coverage Δ
.../main/java/fr/free/nrw/commons/settings/Prefs.java 0% <ø> (ø) ⬆️
...java/fr/free/nrw/commons/upload/FileProcessor.java 3.96% <0%> (-1.38%) ⬇️
...fr/free/nrw/commons/settings/SettingsFragment.java 0% <0%> (ø) ⬆️
...n/java/fr/free/nrw/commons/upload/UploadModel.java 34.25% <0%> (ø) ⬆️
...references/LongTitleMultiSelectListPreference.java 0% <0%> (ø)
.../fr/free/nrw/commons/upload/FileMetadataUtils.java 18.18% <18.18%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5d827e4...2fbecab. Read the comment docs.

Copy link
Collaborator

@neslihanturan neslihanturan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a clean PR

@neslihanturan
Copy link
Collaborator

I aprroved it by reading the code, however it needs some tests. Does anyone have some tests photos which has location EXIF?

@ilgazer
Copy link
Contributor

ilgazer commented Apr 10, 2019 via email

@misaochan
Copy link
Member

misaochan commented Apr 10, 2019

Hello @VitalyVPinchuk , thanks for taking over this PR. It seems that the location accuracy feature has been carried over - unfortunately this is very contentious, as per the comments starting #1712 (comment) . Until we can reach a consensus on how to do this without adding erroneous information to Commons, I would strongly suggest removing that part from the PR entirely, and keeping the rest of the anonymization. Then we can go ahead with merging this PR if the tests succeed.

Copy link
Member

@misaochan misaochan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pending removal of feature that submits deliberately imprecise coordinates

@VitalyVPinchuk
Copy link
Contributor Author

Hello @misaochan
I've got no objections over your remark. I should've read the thread more carefully.

Should I manage lat/long tags as the rest of the other exif tags? That is, just provide a user an option to keep them or remove them?

@misaochan
Copy link
Member

Should I manage lat/long tags as the rest of the other exif tags? That is, just provide a user an option to keep them or remove them?

Yes, this sounds good, thanks! :)

@VitalyVPinchuk
Copy link
Contributor Author

Pending removal of feature that submits deliberately imprecise coordinates

Updated.

@neslihanturan
Copy link
Collaborator

Hi @VitalyVPinchuk I have tested for camera model and it works. @ilgazer it would be nice using your test images if they include geotag etc. @misaochan which aspects of this PR did you test?

@misaochan
Copy link
Member

@neslihanturan I have not tested this, sorry. I have only looked at the code to check that the location accuracy feature was removed. :)

@maskaravivek
Copy link
Member

@VitalyVPinchuk Can you rename the PR title to make it more descriptive.

@VitalyVPinchuk VitalyVPinchuk changed the title Continuing PR #1712 Image EXIF/XMP metadata removal routine Apr 18, 2019
@maskaravivek
Copy link
Member

@VitalyVPinchuk Can you also add unit test cases for the changes made in this PR.


import static androidx.exifinterface.media.ExifInterface.*;

public class FileMetadataUtils {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add unit test cases.

@VitalyVPinchuk
Copy link
Contributor Author

Thanks for the thorough analysis, I'll try to fix all of your remarks.

@ilgazer
Copy link
Contributor

ilgazer commented Apr 19, 2019

THese are my test images, I hope they are useful to you as well:
IMG_7154
There is also this image from commons for both EXIF and XMP https://commons.wikimedia.org/wiki/File:Metadata_test_file_-_includes_data_in_IIM,_XMP,_and_Exif.jpg , and these for EXIF data https://github.com/ianare/exif-samples

By the way, the XMP code vas very much a work in progress and wasn't working if I remember correctly. Has anyone tested it? Looking back, starting work on XMP support feels like scope creep. Maybe we should add in the EXIF stuff and put the XMP stuff in a seperate PR?

Copy link
Collaborator

@neslihanturan neslihanturan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all reviewers should be happy by now. Thanks for the great job both @VitalyVPinchuk and @ilgazer .

Copy link
Collaborator

@neslihanturan neslihanturan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh no, I thought this PR is ready to be merged but Travis fails...

@VitalyVPinchuk
Copy link
Contributor Author

Oh no, I thought this PR is ready to be merged but Travis fails...

I can't understand what causes the issue. Can it be due to adding a folder with sample image?

val filePathTmp: String? = "" + System.getProperty("java.io.tmpdir") + "exif_redact_sample_tmp.jpg"

val inStream = FileInputStream(filePathRef)
val outStream = FileOutputStream(filePathTmp)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the details of the test results, it says:

fr.free.nrw.commons.upload.FileProcessorTest > redactExifTags FAILED
    java.io.FileNotFoundException at FileProcessorTest.kt:53

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I removed it. Help needed. Sorry.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure @VitalyVPinchuk , let me investigate. Thanks for your fast responses.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @neslihanturan
Any luck investigating the problem?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @VitalyVPinchuk sorry for the delay. Currently I have another task, I am planning to finalize this one at May 18. Thanks for keeping the pr up to date.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @VitalyVPinchuk , I am still busy with my current refactor task. Didn't forget about this one though:)

Copy link
Collaborator

@neslihanturan neslihanturan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this issue is sent before our test requirement and there is so much effort on this thread, we decided to merge this. Thanks @VitalyVPinchuk , I will create an issue to add tests for this PR.

@neslihanturan neslihanturan merged commit cc0b059 into commons-app:master Jun 4, 2019
@VitalyVPinchuk VitalyVPinchuk deleted the pr-1712 branch June 4, 2019 14:04
@ashishkumar468
Copy link
Collaborator

ashishkumar468 commented Jun 27, 2019

Hi @VitalyVPinchuk Sorry to bother you after the PR is closed, but this merge has introduced a new bug which prunes out GPS information from the Exif data.
Here is the relevant function

`public static void redactExifTags(ExifInterface exifInterface, Set<String> redactTags) {
        if(redactTags.isEmpty()) return;

         Disposable disposable = Observable.fromIterable(redactTags)
                 .flatMap(tag -> Observable.fromArray(FileMetadataUtils.getTagsFromPref(tag)))
                 .forEach(tag -> {
                     Timber.d("Checking for tag: %s", tag);
                     String oldValue = exifInterface.getAttribute(tag);
                     if (oldValue != null && !oldValue.isEmpty()) {
                         Timber.d("Exif tag %s with value %s redacted.", tag, oldValue);
                         exifInterface.setAttribute(tag, null);
                     }
                 });
         CompositeDisposable disposables = new CompositeDisposable();
         disposables.add(disposable);
         disposables.clear();

         try {
             exifInterface.saveAttributes();
        } catch (IOException e) {
            Timber.w("EXIF redaction failed: %s", e.toString());
        }
    }`

Line Number 125, FileProcessor.Java, current master. I could not exactly understand the utility of this function. Can you explain me what exactly is the purpose of this function?

@VitalyVPinchuk
Copy link
Contributor Author

Hello @ashishkumar468 !
This function is intended to remove EXIF tags (that could be specified via settings activity) such as GPS Latitude/Longitude, Camera/Lens model and so on. As far as I understand this feature is implemented for privacy concerns.

@ashishkumar468
Copy link
Collaborator

Thanks for quick response @VitalyVPinchuk . Does it mean that the attributes which are unticked in Privacy->Manage Exif Tags will not be sent/used ?

@VitalyVPinchuk
Copy link
Contributor Author

Yes it does. Though it doesn't seem for me really straightforward but the community wants it this way. I figured that out from the discussion thread of the initial author of PR that I used.

@ashishkumar468
Copy link
Collaborator

ashishkumar468 commented Jun 27, 2019

IMO, It should have been like by default all the attributes should have been checked, and pruned only when the user explicitly unchecks it. It is very unlikely that the user would go and enable these attributes from the settings on their own and as a result of which we are not using certain exif attributes which we actually need to (for example the GPS coordinated for showing categories), even though the user has not himself actually asked for it?

@misaochan
Copy link
Member

misaochan commented Jun 27, 2019

I apologize for not reading the code thoroughly before this was merged. I was assuming that the user would be given the option of stripping their EXIF information (as per the wording of #70 ), not that it would be stripped by default. FTR, even the web-based Upload Wizard (the default uploader) does not silently strip EXIF, although it allows users the option of removing some of them (i.e. location) during the upload. Check https://commons.wikimedia.org/wiki/File:Sprungli_Confiserie.jpg for instance, which I just uploaded using the default uploader.

I personally feel like silently stripping EXIF by default would be a bad move for us. Users will not be aware of it and will believe that the app is malfunctioning. In order to release with this feature, I think we should at the very least make the default option work the same way as the app always has - with the tags present.

A beneficial addition would be to have a separate upload field that shows the user which EXIF tags are being taken from the image, with a link to the Settings to disable them - but this will likely be a future enhancement.

@neslihanturan @maskaravivek @nicolas-raoul @VojtechDostal thoughts are most welcome.

@neslihanturan
Copy link
Collaborator

I am okay with the either way, but in either options we should emphasize user can change settings "now" and link them to settings page. Otherwise this feature may be forgotten or unrecognized even by privacy concerned people.

@misaochan
Copy link
Member

in either options we should emphasize user can change settings "now" and link them to settings page.

Can we do this in time for this release? Otherwise we may have to revert and bump this feature for the next release.

I personally think releasing without the emphasis is fine, we can always add the emphasis in the next release. It is still an improvement and would not require a revert.

@neslihanturan
Copy link
Collaborator

Sure, not necessary for this release considering we have short time.

@VitalyVPinchuk
Copy link
Contributor Author

So setting all the attributes checked by default seems to a be a good decision.
If I got it right then I could fix it.

@misaochan
Copy link
Member

Thank you @VitalyVPinchuk . :) That's my thoughts on it, yes, but let's give @nicolas-raoul a bit of time to weigh in since he wrote the original issue.

@nicolas-raoul
Copy link
Member

Yes it should be off by default. Thanks!

@maskaravivek
Copy link
Member

Yes it should be off by default. Thanks!

@nicolas-raoul Just to clarify, you mean the EXIF removal should be off by default, right?

@nicolas-raoul
Copy link
Member

@maskaravivek Yes exactly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants