Skip to content

CODEC-236: MurmurHash2 and MurmurHash3 implementations.#22

Merged
garydgregory merged 3 commits into
apache:masterfrom
melloware:CODEC-236
Jul 12, 2019
Merged

CODEC-236: MurmurHash2 and MurmurHash3 implementations.#22
garydgregory merged 3 commits into
apache:masterfrom
melloware:CODEC-236

Conversation

@melloware

Copy link
Copy Markdown
Contributor

All MurmurHash implementations are public domain.

Credit was given in Javadoc where both implementations came from.

@coveralls

coveralls commented Jul 10, 2019

Copy link
Copy Markdown

Coverage Status

Coverage increased (+0.4%) to 92.771% when pulling 67fb54d on melloware:CODEC-236 into 45a195e on apache:master.

@garydgregory garydgregory left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@melloware ,
Thank you for your PR.
Code coverage has decreased, which is not what we want when adding new code. Please run mvn -V clean install -Pjacoco site to discover where you are missing code coverage in the new code.

Comment thread src/main/java/org/apache/commons/codec/digest/MurmurHash2.java Outdated
Comment thread src/main/java/org/apache/commons/codec/digest/MurmurHash2.java
Comment thread src/main/java/org/apache/commons/codec/digest/MurmurHash3.java
@garydgregory

Copy link
Copy Markdown
Member

Hm, XXHash32 and others implements java.util.zip.Checksum. Other classes in org.apache.commons.codec.digest do not implement Codec interfaces where some could, like UnixCrypt. Food for thought. But for the new code in this PR, the fit with the Codec interfaces might not be best.

The Travid build (https://travis-ci.org/apache/commons-codec/jobs/556975625) reveals that the Javadoc for the new code is incomplete:

[ERROR] /home/travis/build/apache/commons-codec/src/main/java/org/apache/commons/codec/digest/MurmurHash3.java:84: warning: no @param for l0
[ERROR] 	public static int hash32(long l0, int seed) {
[ERROR] 	                  ^
[ERROR] /home/travis/build/apache/commons-codec/src/main/java/org/apache/commons/codec/digest/MurmurHash3.java:84: warning: no @param for seed
[ERROR] 	public static int hash32(long l0, int seed) {
[ERROR] 	                  ^
[ERROR] /home/travis/build/apache/commons-codec/src/main/java/org/apache/commons/codec/digest/MurmurHash3.java:84: warning: no @return
[ERROR] 	public static int hash32(long l0, int seed) {
[ERROR] 	                  ^
[ERROR] /home/travis/build/apache/commons-codec/src/main/java/org/apache/commons/codec/digest/MurmurHash3.java:97: warning: no @param for l0
[ERROR] 	public static int hash32(long l0, long l1, int seed) {
[ERROR] 	                  ^
[ERROR] /home/travis/build/apache/commons-codec/src/main/java/org/apache/commons/codec/digest/MurmurHash3.java:97: warning: no @param for l1
[ERROR] 	public static int hash32(long l0, long l1, int seed) {
[ERROR] 	                  ^
[ERROR] /home/travis/build/apache/commons-codec/src/main/java/org/apache/commons/codec/digest/MurmurHash3.java:97: warning: no @param for seed
[ERROR] 	public static int hash32(long l0, long l1, int seed) {
[ERROR] 	                  ^
[ERROR] /home/travis/build/apache/commons-codec/src/main/java/org/apache/commons/codec/digest/MurmurHash3.java:97: warning: no @return
[ERROR] 	public static int hash32(long l0, long l1, int seed) {
[ERROR] 	                  ^
[ERROR] /home/travis/build/apache/commons-codec/src/main/java/org/apache/commons/codec/digest/MurmurHash3.java:275: warning: no @param for offset
[ERROR] 	public static long hash64(byte[] data, int offset, int length, int seed) {
[ERROR] 	                   ^

It would be best if you could address this as well as the decrease in code coverage.

@melloware

Copy link
Copy Markdown
Contributor Author

@garydgregory Thanks for all your feedback. I went back and Javadoc'ed every method and param. I also was able to get Coveralls unit test coverage up 0.4%! When I grabbed the unit tests from Hive it looks like it had incomplete tests for 128 hashing so I created those tests.

@garydgregory

Copy link
Copy Markdown
Member

One more comment: Did you consider providing a MurmurHash1 implementation?

@melloware

Copy link
Copy Markdown
Contributor Author

I did consider MurmurHash1 but from what i can tell it was more of a prototype that led to the creation of Murmur2. From Wikipedia about Murmur1:

"The original MurmurHash was created as an attempt to make a faster function than Lookup3. Although successful, it hadn't been tested thoroughly and wasn't capable of providing 64-bit hashes as in Lookup3. It had a rather elegant design, that would be later built upon in MurmurHash2, combining a multiplicative hash (similar to Fowler–Noll–Vo hash function) with a Xorshift."

@garydgregory garydgregory merged commit 2bcdfd8 into apache:master Jul 12, 2019
@melloware melloware deleted the CODEC-236 branch July 12, 2019 16:51
omosteven pushed a commit to omosteven/commons-codec-lab-work that referenced this pull request Jan 8, 2025
* CODEC-236: MurmurHash2 and MurmurHash3 implementations.

* CODEC-236: Removed author tag.

* CODEC-236: Added javadoc and increased unit test coverage.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants