Skip to content

Increase chunk size to 1MB #4011

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Nov 9, 2020
Merged

Conversation

maskaravivek
Copy link
Member

Fixes #3992

@maskaravivek
Copy link
Member Author

@nicolas-raoul I tried reproducing the upload error that you were facing but I wasn't able to reproduce it on my device. I tried the following:

  • used a fast internet connection initially and then tried multiple uploads with slow internet
  • tried single image uploads of 200kb and 4.5mb.
  • tried multi-image uploads with a set of 4 images. each image ~4.5 MB.
  • tried multi-image upload again with a set of 4 images(~4.5 MB each). this time I paused uploads for 2 images to test pause/resume functionality. Once the first two images got uploaded, I resumed the uploads for the remaining images.

However, while testing I noticed a couple of things and have made changes accordingly.

  • initially, the file chunk size was 256 kb ie every file was split into chunks of 256kb each and then each of those chunks was uploaded sequentially. this was done to facilitate pause and resume of uploads but it was leading to a lot of network calls. So I have increased the chunk size to 1 MB.
  • while uploading multiple images at once, some chunks were taking up to 50 seconds to get uploaded. Our network timeout was set to 60 seconds. So I can imagine that some images might fail to upload on lower bandwidths. Therefore I have doubled the timeouts.

cc @misaochan

@maskaravivek
Copy link
Member Author

Also, @nicolas-raoul it would be great if you could test this PR can report any upload failures.

Copy link
Member

@nicolas-raoul nicolas-raoul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just uploaded 3 pictures at the same time (not a set, one by one, in slow mode my Internet is so slow that I have time to enter metadata for 3 pictures while the first is still uploading).
All succeeded, which is great!

@nicolas-raoul
Copy link
Member

Ah, I misunderstood the purpose of the PR.
For solving single picture upload problems, it works great.
Sets still do not work, unfortunately. I have sent a log.

@codecov-io
Copy link

codecov-io commented Nov 1, 2020

Codecov Report

Merging #4011 into master will decrease coverage by 0.00%.
The diff coverage is 0.00%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #4011      +/-   ##
============================================
- Coverage     10.50%   10.49%   -0.01%     
  Complexity      466      466              
============================================
  Files           335      335              
  Lines         12663    12674      +11     
  Branches        991      991              
============================================
  Hits           1330     1330              
- Misses        11266    11277      +11     
  Partials         67       67              
Impacted Files Coverage Δ Complexity Δ
...a/fr/free/nrw/commons/OkHttpConnectionFactory.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
.../commons/contributions/ContributionViewHolder.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
.../java/fr/free/nrw/commons/di/NetworkingModule.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
.../java/fr/free/nrw/commons/upload/UploadClient.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
...java/fr/free/nrw/commons/upload/UploadService.java 0.84% <0.00%> (-0.01%) 1.00 <0.00> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c793827...b7f4dd7. Read the comment docs.

@misaochan
Copy link
Member

Hi @nicolas-raoul , just confirming that sets work for you now? I did see your comments on the issue thread, but those seemed to be describing single uploads?

@misaochan
Copy link
Member

misaochan commented Nov 5, 2020

Hi @maskaravivek , I just tested this on a Samsung Galaxy S20FE running Android 10. Prior to this PR, pausing and resuming worked quite smoothly for me, but currently it seems to be delayed/buggy (I don't think it is a device issue as the specs are pretty high). I made a GIF of the process (sorry for the bad framerate and resolution, had to work around github's upload limit, I can email you the original if you want):

ezgif-4-e2e33b237a7c

  • The first time I pause, when I restart the upload it starts from 0 again even though it was a fifth way through
  • There seems to be a latency of 2 seconds or so when I tap pause. The icon flickers between "pause" and "resume", and the upload just keeps running for 2 seconds, as if it didn't receive the input
  • Whenever I pause, I get a leak reported in LeakCanary. Whenever I resume, I also get a leak. The 2 main leaks seem to be from ReportFragment and UploadActivity
  • If you try this too many times, eventually the icon keeps flickering between pause and resume even when you don't tap it, until you change the state again

On the bright side the uploads were successful. :) I did get one failed upload at first but I was tapping pause/resume far too much, it was probably an edge case.

Also, I guess once Nicolas confirms no more failures with sets, we should probably turn HTTP logging back off?

@nicolas-raoul
Copy link
Member

I uploaded a 5-set on a slow Internet connection, and all pics succeeded except one (I sent a log), which is much better than what I was experiencing on the handleChunkFailure branch :-)

@maskaravivek
Copy link
Member Author

Thank you @misaochan and @nicolas-raoul for testing the PR.

@nicolas-raoul I see that the server returned a timeout for one of the API calls. The upload should succeed on retrying.

Received error in chunk upload
java.net.SocketTimeoutException: timeout
	at okhttp3.internal.http2.Http2Stream$StreamTimeout.newTimeoutException(Http2Stream.java:656)
	at okhttp3.internal.http2.Http2Stream$StreamTimeout.exitAndThrowIfTimedOut(Http2Stream.java:664)
	at okhttp3.internal.http2.Http2Stream$FramingSink.emitFrame(Http2Stream.java:572)
	at okhttp3.internal.http2.Http2Stream$FramingSink.write(Http2Stream.java:543)
	at okio.ForwardingSink.write(ForwardingSink.kt:29)

@misaochan

The first time I pause, when I restart the upload it starts from 0 again even though it was a fifth way through
If I receive an error in uploading the last chunk before pausing, the upload is automatically attempted from the beginning. In general, this shouldn't happen. I just tried a couple of uploads with pause/resume and it continued from the point where it was paused. One quick way to verify is to check for Chunk: Sending Chunk number: 1, offset: 0 log in logcat and verify if it always starts from chunk number 1 after pausing.

There seems to be a latency of 2 seconds or so when I tap pause. The icon flickers between "pause" and "resume", and the upload just keeps running for 2 seconds, as if it didn't receive the input

When the pause is clicked, any chunk after the current chunk is not uploaded. The ongoing chunk will continue to upload. For eg if the file was split into 11 parts and the 5th part is getting upload currently, then on clicking pause part 6 onwards won't be uploaded.

Earlier it was working smoothly as the file was split into chunks of 256kb each but the issue with such a small size is that it leads to more number of chunks for a single file. So the chances of failure are higher. In this PR I had increased chunk size to 1 MB. So for eg, a file of 4.5 MB will be split into 5 parts instead of 18 parts.

As a compromise, I have set the chunk size to 512 KB for now. There would be a latency but it would be lesser than the current latency.

If you try this too many times, eventually the icon keeps flickering between pause and resume even when you don't tap it, until you change the state again

This is because of what I explained above. Also, I am assuming that in general people might be pausing/resuming very frequently.

For the above 2 points, I can add a toast to reassure the user that the upload is being paused.

@maskaravivek
Copy link
Member Author

Whenever I pause, I get a leak reported in LeakCanary. Whenever I resume, I also get a leak. The 2 main leaks seem to be from ReportFragment and UploadActivity

Yes, I also notice the leaks but most of the leaks seem to be from ReportFragment. The app doesn't show any leaks from UploadActivity in the last 4 days. Anyways can you create a separate issue for it?

@misaochan
Copy link
Member

Thanks @nicolas-raoul and @maskaravivek ! The chunk size sounds like a dilemma indeed. Yes, let's try 512 KB for now - if there are upload failures again we can increase to 1 MB again I guess, since that takes precedence over the latency.

For the above 2 points, I can add a toast to reassure the user that the upload is being paused.

That would be great. Let's get one for "Upload paused" and another for "Upload resumed", and then I can merge. :)

@maskaravivek
Copy link
Member Author

@misaochan I have made the changes. I have used "Pausing uploads..." and "Resuming uploads..." so that the user gets an indication that the action might take a couple of seconds.

@misaochan
Copy link
Member

Makes sense. Thanks @maskaravivek ! :)

@misaochan misaochan merged commit 6fa8cc6 into commons-app:master Nov 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Upload failures on master
4 participants