-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Improve Unique File Name Search #5877
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve Unique File Name Search #5877
Conversation
…git alphanumeric hash to append to a file name to make it unique. This improves speed over using an incrementing number to append as there are fewer collisions.
Modified findUniqueFileName() in UploadWorker.kt to use a random 3-di…
// Generate a random 5-character alphanumeric string | ||
val randomHash = (1..3) | ||
.map { chars[random.nextInt(chars.length)] } | ||
.joinToString("") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow you are fast!
Would you mind keeping the number sequence strategy?
We are open to a hash strategy, but could you create a GitHub issue to discuss its pros and cons?
Thanks a lot! 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue I’m addressing #2506 does have discussion of the different strategies for speeding up this search and from what I could see a hash was voted for as the fastest and easiest to implement. I could revert to the number sequence but the code would remain virtually unchanged. The other option discussed was binary search to maintain the number sequence format of titles but improve search efficiency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah understood thanks! :-)
Sorry for the confusion I was on mobile and did not check the issue properly, I incorrectly assumed it was about #5876 that I had created an hour before.
app/src/main/java/fr/free/nrw/commons/upload/worker/UploadWorker.kt
Outdated
Show resolved
Hide resolved
…git numeric hash rather than the previous 3-digit alphanumeric hash
Modified findUniqueFileName() in UploadWorker.kt to use a random 5-di…
It would be nice to add a unit test to verify that the new code works as expected. Can you do that? (I think you will need to mock If not that's okay - let's leave it as a new issue for later (or for someone else). Is .idea/inspectionProfiles/Project_Default.xml accidantally committed into this patch? |
Sorry, I am hitting exam period so I won't have time to write tests for this. I will open a new issue for it. Also yes, .idea/inspectionProfiles/Project_Default.xml wasn't meant to be changed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested with two multi-uploads all using the same caption. It works great!
The #
character is not allowed in filenames and thus gets transformed into -
but that's not an issue as it will be modified anyway in the context of #5876
Thanks a lot! :-)
Changed from increment to alphanumeric hash to append to a file name when file name already exists. Implemented in findUniqueFileName() under UploadWorker.kt.
Fixes #2506
Read comments under issue #2506 for a more in-depth discussion of why this change is necessary and why this is a good solution. findUniqueFileName() was using an incremeanting counter to find unique file names. For example if "Test " already existed as a file title then it would return the title "Test 1" or if "Test 13" exited it would return "Test 14". This made uploading files with popular names very slow as it would sometimes result in many collisions. This implementation has been changed to a random hash, the file name is now appended with a 5-digit numeric hash which will generally result in only 1 collision, making uploading much faster. The updated title now follows the format <existing_title>_#XXX.
Tests performed (required)
Tested testBetaDebugUnitTest on Pixel 9 Pro XL with API level 35.