Skip to content

Fixes regression on copyURLToFile with partial regression tests#319

Closed
chadlwilson wants to merge 1 commit intoapache:masterfrom
chadlwilson:copyUrlToFileRegression
Closed

Fixes regression on copyURLToFile with partial regression tests#319
chadlwilson wants to merge 1 commit intoapache:masterfrom
chadlwilson:copyUrlToFileRegression

Conversation

@chadlwilson
Copy link
Contributor

@chadlwilson chadlwilson commented Jan 25, 2022

The change in f22b426#r63142538 seemed to accidentally break the contract on copyURLToFile in the Javadoc

* Copies bytes from the URL {@code source} to a file
* {@code destination}. The directories up to {@code destination}
* will be created if they don't already exist. {@code destination}
* will be overwritten if it already exists.

This

  • reverts the change at fault to restore master to a state compatible with the contract
  • adds a couple of tests which passed prior to this commit (with some minor tweaks to deal with test tidy up done between then and now).

I'm not sure of the intent of the original change, so this does not attempt to address that. Feel free to update this PR to do so if it makes sense.

@chadlwilson
Copy link
Contributor Author

chadlwilson commented Jan 25, 2022

If the intent is to re-use NIO logic, I suppose the Files.copy could be preceded by Files.createDirectories(destination.getParentFile().toPath());

The usage of Files.copy probably also needs CopyOptions.REPLACE_EXISTING. Haven't thought through all the other cases either, as there are some missing exception cases here in the tests.

@chadlwilson chadlwilson reopened this Jan 25, 2022
@chadlwilson chadlwilson force-pushed the copyUrlToFileRegression branch from 41408e3 to b1d6806 Compare January 25, 2022 14:07
@chadlwilson chadlwilson changed the title Add (disabled) test for regression on copyURLToFile Fixes regression on copyURLToFile with partial regression tests Jan 25, 2022
…r impl

- when parent dirs need creation
- when target file already exists
- reverts part of f22b426
43bd7ee8d957e13a
@chadlwilson chadlwilson force-pushed the copyUrlToFileRegression branch from b1d6806 to f077bb9 Compare January 25, 2022 14:43
@garydgregory
Copy link
Member

If the intent is to re-use NIO logic, I suppose the Files.copy could be preceded by Files.createDirectories(destination.getParentFile().toPath());

The usage of Files.copy probably also needs CopyOptions.REPLACE_EXISTING. Haven't thought through all the other cases either, as there are some missing exception cases here in the tests.

Hi @chadlwilson
Yes, the idea is to use NIO. Would you adjust the PR?

@chadlwilson
Copy link
Contributor Author

If the intent is to re-use NIO logic, I suppose the Files.copy could be preceded by Files.createDirectories(destination.getParentFile().toPath());
The usage of Files.copy probably also needs CopyOptions.REPLACE_EXISTING. Haven't thought through all the other cases either, as there are some missing exception cases here in the tests.

Hi @chadlwilson Yes, the idea is to use NIO. Would you adjust the PR?

I don't really understand the original "NIO" change, because

  • there doesn't seem to be a linked JIRA issue explaining what is trying to be fixed/addressed?
  • it wasn't consistent as it was - FileUtils.copyURLToFile(url, file, connTimeout, readTimeout) wasn't using the NIO mechanism, so it made the two overloads have different semantics.
  • the comment was Use NIO internally to avoid using finalizable FileInputStream. but I don't understand what this meant. In both cases the input stream was opened on the URL with url.openStream() in the calling method, and I don't understand what FileInputStream has to do with the change, as the type of stream depends on the URL type?

From my perspective, right now master is broken and probably not releasable - and I felt it would be better to take it back to a working/releasable state and then re-introduce NIO changes when the intent is clearer and there is sufficient time to ensure there aren't regressions.

asfgit pushed a commit that referenced this pull request Jan 30, 2022
overwrite target file #319.

- Based on PR #39 but different.
- New assert method leaks input stream; this bug was copied from
existing code into a new method.
- Use NIO.
- Tests don't need to schedule files for deletion on exit when JUnit
already manages the directory.
@garydgregory
Copy link
Member

Hello @chadlwilson
I adapted the new tests and provided a different fix.
Please test git master for your use case.
TY!

@chadlwilson
Copy link
Contributor Author

chadlwilson commented Jan 30, 2022

Okay thanks, will give it a go when the snapshot is rebuilt.

@chadlwilson
Copy link
Contributor Author

Does something need to be done to trigger a snapshot build off master? I thought they were dailies if there were any new commits, but I don't see new snapshot builds yet at https://repository.apache.org/content/repositories/snapshots/commons-io/commons-io/2.12.0-SNAPSHOT/

@garydgregory
Copy link
Member

Someone has to set it up on https://ci2.apache.org/#/ IIRC

@chadlwilson
Copy link
Contributor Author

Hmm, if it hasn't been set up on ci2/buildbot I'm a bit confused as to how those previous snapshots got there as https://builds.apache.org/job/Commons/job/commons-io/ seems to imply the build there has been failing for a year. Eek.

@garydgregory
Copy link
Member

I just published a snapshot and reconfigured the Jenkins job to only use Java 8.

@chadlwilson
Copy link
Contributor Author

Thanks! I have got past this particular error and the bunch of tests that were triggering it, but now there seems to be another problem due to 1e73dd9 or perhaps related commits elsewhere which affects dynamic creation of files/paths which I am having trouble tracking down.

There appears to be some change which is leading a file to be created for something that should be a (parent) directory, which then leads to issues with FileUtils.writeStringToFile ending up with

jvm 1    | Caused by: java.nio.file.FileAlreadyExistsException: config
jvm 1    | 	at java.base/sun.nio.fs.UnixException.translateToIOException(Unknown Source)
jvm 1    | 	at java.base/sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source)
jvm 1    | 	at java.base/sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source)
jvm 1    | 	at java.base/sun.nio.fs.UnixFileSystemProvider.createDirectory(Unknown Source)
jvm 1    | 	at java.base/java.nio.file.Files.createDirectory(Unknown Source)
jvm 1    | 	at java.base/java.nio.file.Files.createAndCheckIsDirectory(Unknown Source)
jvm 1    | 	at java.base/java.nio.file.Files.createDirectories(Unknown Source)
jvm 1    | 	at org.apache.commons.io.file.PathUtils.createParentDirectories(PathUtils.java:339)
jvm 1    | 	at org.apache.commons.io.file.PathUtils.newOutputStream(PathUtils.java:1066)
jvm 1    | 	at org.apache.commons.io.FileUtils.newOutputStream(FileUtils.java:2454)
jvm 1    | 	at org.apache.commons.io.FileUtils.writeStringToFile(FileUtils.java:3543)
jvm 1    | 	at org.apache.commons.io.FileUtils.writeStringToFile(FileUtils.java:3528)

Will see if I can track down what changed.

@chadlwilson
Copy link
Contributor Author

OK, the problem is that FileUtils.writeStringToFile (via PathUtils.newOutputStream) does not behave consistently with the old FileUtils.openOutputStream implementation when trying to create a file within a symlinked dir.

The below works under the old impl, but now fails.

    @Test
    public void testWriteStringToFileInsideSymlinkedDir() throws Exception {
        Path targetDir = tempDirPath.resolve("target-dir");
        Path symlinkDir = tempDirPath.resolve("symlink-dir");
        Files.createDirectory(targetDir);
        Files.createSymbolicLink(symlinkDir, targetDir);

        File targetFile = symlinkDir.resolve("file").toFile();
        FileUtils.writeStringToFile(targetFile, "data", StandardCharsets.UTF_8);

        assertEquals(FileUtils.readFileToString(targetFile), "data");
    }

I guess the root cause is that File.mkdirs() treats links differently than Path.createDirectories. The former checks that the the thing "looks like" a directory, whereas the latter checks that it really IS a directory (not a symlink to a directory).

I guess this regression is different to the one discussed in this PR so probably should raise a new PR with failing test(s).

@garydgregory
Copy link
Member

Hi @chadlwilson
Creating a new PR would be great. TY.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants