Conversation
- Restore the non-threshold implementation of the Levenshtein distance algorithm from commons-lang3's StringUtils. This version avoids copying arrays. compare(left, right) no longer delegates to compare(left, right, threshold). - Fix the threshold implementation's table striping ASCII art. - Add "threshold" field to LevenshteinDistance. If this is null, then the unlimited version of the algorithm is used. Else, the limited version is used. - Add default and threshold versions of the constructor. - Add static getDefaultInstance() which returns the unlimited version.
- Add "locale" field to FuzzyDistance. Add default and Locale constructor. Add static getDefaultInstance(). A null locale will cause Locale.getLocale() to be called each time compare(left, right) is called. - Fix method name type in FuzzyScoreTest. - Remove tests that no longer fail because of a null Locale. The algorithm is no longer public. If the field is null, Locale.getLocale() is used.
- Fix "s" and "t" -> "left" and "right" variable in Levenshtein distance algorithms.
- Remove unused org.junit.BeforeClass imports from FuzzyScoreTest and LevenshteinDistancetest.
There was a problem hiding this comment.
Better make this check in the constructor and always assign a value to the locale field. This way all instance will be in a valid state after construction.
There was a problem hiding this comment.
I'm not sure. What if Locale.setDefault(Locale) is called afterward? Shouldn't it switch to using the new Locale?
Please confirm. I would also remove the DEFAULT_INSTANCE in that case.
Thanks!
There was a problem hiding this comment.
I providing a default for the locale field a good idea at all? I don't think so. I think the locale parameter has to be mandatory.
- Remove the Locale.getDefault() logic in FuzzyScore. There is no longer a DEFAULT_INSTANCE and the Locale may not be null.
There was a problem hiding this comment.
Even simpler:
private static final ENGLISH_SCORE = new FuzzyScore(Locale.ENGLISH);
Is that OK, or are there reasons to use @BeforeClass (more JUnit-y) or @Before (not static)?
Thanks.
- Replace instances of FuzzyScore(Locale.ENGLISH) in FuzzyScoreTest with a static final ENGLISH_SCORE constant. - Replace instances of LevenshteinDistance() in LevenshteinDistanceTest with a static final UNLIMITED_DISTANCE constant. - Convert LevenshteinDistanceTest.testGetLevenshteinDistance_StringStringInt() into ParameterizedLevenshteinDistanceTest.
Add missing ".similarity" part of package path to @link's in the unit test classes.
Remove unused Map and TreeMap imports from ParameterizedLevenshteinDistanceTest.
|
Added a PR for SANDBOX-493 which I'd like to merge into this branch after/if 493 makes it into master. |
|
Merged, thanks!!! |
This reverts commit 41ad405.
Please review these changes to LevenshteinDistance and FuzzyScore.