-
Notifications
You must be signed in to change notification settings - Fork 1.7k
[LANG-1772] Restrict size of cache to prevent overflow errors #1379
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
[LANG-1772] Restrict size of cache to prevent overflow errors #1379
Conversation
The problem is caused by an integer overflow of
A simpler solution would be to:
|
The test as is blows up GitHub builds so let's use something like |
…ngs method in all other profiles.
I considered that, changing |
Hi @ppkarwasz You've proposed an alternative solution. Would you shows in a PR? |
I'll submit a PR by the end of the week. |
…nside the CachedRandomBits constructor - also checking if the padding produces overflow. No longer using an arbitrary value but being more precise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jcwinters
Thank you for your update.
I think the test should be more of a white box test and test just below and above the overflow. WDYT? @ppkarwasz ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks OK to me.
I don’t think we need to preemptively generate more than 256 MiB of random data. The goal of generating data in bulk is to take advantage of the fact that random number generators are typically more efficient when producing large chunks of data rather than individual bytes. However, I suspect the optimal amount is significantly less than 256 MiB—we should run some benchmarks to determine the best value.
@jcwinters |
|
* The maximum size of the cache. | ||
* | ||
* <p> | ||
* This is dictated by the {@code if (bitIndex >> 3 >= cache.length)} in the {@link #nextBits(int)} method. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the 3
in this expression MUST match the 3
in the expression building cacheSize
in the random(...)
method, then it should be refactored from a magic number in to a constant IMO.
This makes me wonder about the other magic numbers 5
and 10
which beg for documentation if only to help with maintenance.
WDYT?
* <p> | ||
* This is dictated by the {@code if (bitIndex >> 3 >= cache.length)} in the {@link #nextBits(int)} method. | ||
*/ | ||
private static final int MAX_CACHE_SIZE = 0x7FFF_FFFF >> 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, this looks better than the 0x1FFF_FFFF
I proposed! 💯
Nit: following in the same direction you proposed, maybe Integer.MAX_VALUE >> 3
would be even easier to understand?
final CachedRandomBits arb = new CachedRandomBits((count * gapBits + 3) / 5 + 10, random); | ||
// For huge strings the padding required can cause an overflow | ||
// 63_913_201 is the highest x such that (21x + 3) / 5 + 10 ? 0x0FFF_FFFF. | ||
final int cacheSize = (count * gapBits + 3) > 0 ? (count * gapBits + 3) / 5 + 10 : 63_913_201; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
63_913_201
is the maximum allowed value for count
, not cacheSize
.
- If
count
is63_913_201
, thencacheSize
is close toMAX_CACHE_SIZE
. - By increasing
count
further, the value of the expression(count * gapBits + 3) / 5 + 10
increases until it almost reachesInteger.MAX_VALUE / 5 + 10
.
If we want cacheSize
to be a non-decreasing function of count
, we should use Integer.MAX_VALUE / 5 + 10 instead of
63_913_201`.
Added a length restriction to
RandomStringutils
, limiting the cache to 60M entries. Because of rejections the bitIndex in the underling cache can overflow when right shifting. Also added a test to verify the fix.This test takes quite a while to run, so if necessary I can create a profile for slow tests to exclude the test from the normal build.