-
Notifications
You must be signed in to change notification settings - Fork 703
feat: add length unit support in FileSystem limits #781
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Different filesystems and operating systems measure file and path lengths in different units:
* macOS and Windows filesystems typically count **UTF-16 code units**.
* Linux and other UNIX filesystems typically count **bytes**.
This change introduces explicit unit support so these limits can be interpreted consistently.
### Key changes
* **New API**
* Added a `LengthUnit` enum and `FileSystem.getLengthUnit()` to expose the unit of measure used by `getMaxFileNameLength()` and `getMaxPathLength()`.
* Added new overloads for `isLegalFileName` and `toLegalFileName` that accept a `Charset`, making conversions between bytes and UTF-16 explicit.
* **Adjusted defaults**
* Reduced the `GENERIC` filesystem defaults:
* File name length → **1020 bytes** (covers 255 UTF-16 characters encoded as up to 3 UTF-8 bytes).
* Path length → **1 MiB** (covers 32,767 UTF-16 code units, again at 3 UTF-8 bytes each).
* **Testing**
* Added unit tests to validate the new API and updated limits.
garydgregory
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* Refactors comparison and truncation logic into `LengthUnit`, renamed to `NameLengthStrategy`. * Makes the `NameLengthStrategy` value internal-only. * Improves Javadoc for `getMaxFileNameLength` and `getMaxPathLength` to clarify that staying within the reported limit is necessary but not sufficient for a name or path to be valid on all filesystems.
ecki
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few comments, feel free to ignore (they are more towards the overall concept of this API not your improvement)
garydgregory
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @ppkarwasz
I have one small comment. Have @ecki's issues been resolved?
|
BTW: here is the place in the JDK I was refering about, but not sure about the conditions it applies I think it only applies to new Path APIs. |
|
Hi @ppkarwasz |
I think I had trouble implementing a similar function compatible with the OS, that’s the base of my comments but I think it might be a good attempt. Not sure it is a good candidate for a tool (or actually it would be a good candidate if it works), but the new code seems to improve the current api, so why not, |
I merged with |
|
Hi @ppkarwasz |
Yes, look at this comment: #781 (comment) Basically I am testing that the limit in UTF-8 bytes is sharp on macOS and you can not create files with names longer than
If I can confirm that the APFS limit is strict, I can disable the test on macOS or remove it entirely. |
|
Uh wait if the purpose of this utility is to truely report the filenamenlength on the systems then the test should not be skipped but the utility be corrected to handle that case? |
|
You’re right: the test itself shouldn’t be disabled, but only the part that makes assumptions which don’t hold across all macOS filesystems. After looking more closely at macOS name length limits:
In 087654e I updated the test logic:
This way the test still enforces the guaranteed minimum (255 bytes), but tolerates differences between APFS and HFS+. |
Different filesystems and operating systems measure file and path lengths in different units:
macOS andWindows filesystems typically count UTF-16 code units.This change introduces explicit unit support so these limits can be interpreted consistently.
Key changes
New API
Added a.LengthUnitenum andFileSystem.getLengthUnit()to expose the unit of measure used bygetMaxFileNameLength()andgetMaxPathLength()isLegalFileNameandtoLegalFileNamethat accept aCharset, making conversions between bytes and UTF-16 explicit.Adjusted defaults
Reduced the
GENERICfilesystem defaults:Testing